Codebase for my homepage.
git clone git://vcs.sapka.me/michal-sapka-me

commit 6132c61730ad53ab940bf052c3cf75a9e64cffb4
parent 90e8acffcbcc01679f84517d0bbc5a67e63b117b
Author: Michał M. Sapka <michal@sapka.me>
Date:   Fri, 28 Apr 2023 22:43:02 +0200

feat: article for 2023-04-28

A content/2023/git-objects.md | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+), 0 deletions(-)

diff --git a/content/2023/git-objects.md b/content/2023/git-objects.md
@@ -0,0 +1,30 @@
+---
+title: "Git Objects"
+category: "software"
+abstract: How does Git store its database?
+date: 2023-04-28T22:37:57+02:00
+year: 2023
+draft: false
+tags:
+- Git
+- tutorial
+- engineering
+---
+Every Git repository has a hidden `.git` folder. Open it, and all of Git's internals are at your disposal. Today, something I should have learned a long time ago: objects.
+
+First: a commit is an object. You can see it via `git cat-file -p <SHA of commit>`. The first two lines of the output will look like this:
+```
+tree b4653c20c7486d8b9e4eb10a882b79a3a9f3cfdf
+parent 5eb01813d3e6b1f2ac1c7f432d5d994a7fee9ec1
+```
+
+The `parent` line holds the SHA of the parent commit, but that's unimportant today. Instead, let's focus on the tree. You can check what's inside using the same `git cat-file -p <SHA>`, and you will see a listing of the top-level folder in the Git repository. You can also `cat-file` any of those entries. The entries in that listing are objects of two types:
+
+- tree - a directory: a listing that points to other objects
+- blob - the content of a file (compressed)
+
+What does this mean? A commit is a reference to the state of the entire repository at a given moment in time. The state consists of whole files (blobs) and directories (trees) that reference other nodes. Neat.
+
+This is why you don't want to store big binary files in Git: each version is stored as a new blob holding the whole file. Not very space-efficient.
+
+You can see each of those objects in `.git/objects`, but since they are compressed, it's much easier to use `git cat-file`. Note that blob objects don't have any filename attached - just the content. Instead, the filename comes from the tree object that references the blob. This has a benefit: the same blob is reused when identical content exists under multiple names.
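
The article's last point, that a blob holds only content and is shared between identically named or differently named copies, can be checked without touching a repository. Below is a minimal sketch, assuming a repository that uses the default SHA-1 object format. The helper names (`encode_blob`, `blob_id`, `loose_object_path`) and the two file names are made up for illustration and are not part of Git or of this codebase.

```python
import hashlib
import zlib


def encode_blob(content: bytes) -> bytes:
    """Return the bytes Git hashes for a blob: 'blob <size>', a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return header + content


def blob_id(content: bytes) -> str:
    """Object ID of a blob, assuming the SHA-1 object format."""
    return hashlib.sha1(encode_blob(content)).hexdigest()


def loose_object_path(object_id: str) -> str:
    """Where a loose object with this ID would live, relative to the repository root."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


# Hypothetical working tree: two file names with identical content.
files = {"README.md": b"hello world\n", "docs/copy.md": b"hello world\n"}
ids = {name: blob_id(data) for name, data in files.items()}

# Loose objects are zlib-compressed on disk; round-trip one to show the layout.
compressed = zlib.compress(encode_blob(files["README.md"]))
kind, _, rest = zlib.decompress(compressed).partition(b" ")
size, _, body = rest.partition(b"\x00")

for name, object_id in ids.items():
    # The file name never enters the hash, so both names map to one object.
    print(name, object_id, loose_object_path(object_id))
```

Both names print the same ID and the same path, because the file name is not part of what gets hashed. In a real repository, `git hash-object <file>` should report the same ID for a file with that content.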