
GIT Internals

Pedro Melo <{mailto,xmpp}:melo@simplicidade.org>

A short GIT History


2002 - Apr 2005: The BitKeeper Wars
Apr 2005: Episode IV - A New Hope
July 2005: Hamano is the new maintainer
Late 2008: GitHub hits the spotlight

I'm an egotistical bastard, and I name all my projects after myself. First Linux, now git.
Linus


GIT rules (without !!)


Track content, not changes
Simple repository, complex software
It's easier to update the software, complex to update all the repos so far
Git Mantra: http://bit.ly/git-phylosophy

In other words, I'm right. I'm always right, but sometimes I'm more right than other times. And dammit, when I say "files don't matter", I'm really really Right(tm).
Linus

Strong Points
Non-Linear development
Distributed Development
  Centralized development is a subcase
Efficiency
Toolkit Design

Objects
Git repositories store objects
Stored in the Object Database
  Inside the Git directory
  .git at the root of your project
Four major object types
Objects are compressed for storage (zlib)
ID: SHA-1 of header + content

The Blob
Files are stored as blobs
Only content, no metadata

Meet the blob


blob [content_size]\0
Your content goes here after the header
I like pizza with apples
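As a minimal sketch of the layout above, the following Python computes an object ID as the SHA-1 of header plus content, and produces the zlib-compressed bytes. The function names `git_object_id` and `encode_object` are made up for illustration; they are not part of Git.

```python
import hashlib
import zlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 hex digest of '<type> <size>\\0' followed by content."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def encode_object(obj_type: str, content: bytes) -> bytes:
    """Return header + content compressed with zlib, as kept in storage."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return zlib.compress(header + content)


if __name__ == "__main__":
    content = b"I like pizza with apples\n"
    print("blob id:", git_object_id("blob", content))
    print("stored size:", len(encode_object("blob", content)), "bytes")
```

Because the ID depends only on type, size and content, two files with the same bytes share one blob, whatever their names.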

The tree
Trees store directories
Mode, type, pointer and name
Recursive, trees can contain trees
Stored as a simple text file

Meet the tree


tree [content_size]\0
100644 blob b5f21a README
100644 blob afe433 Makefile.PL
040000 tree a42cd0 lib
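The listing above is the human-readable view of a tree. As a rough sketch, the stored payload can be built as below: each entry is the mode, a space, the name, a NUL byte, then the raw 20-byte object ID, and directories use mode `40000` without the leading zero. The helper names `tree_payload` and `tree_id` are illustrative only, and the IDs in the example are the IDs of the empty blob and the empty tree, used as stand-ins for the abbreviated ones on the slide.

```python
import hashlib

EMPTY_BLOB = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"  # ID of a zero-byte blob
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # ID of a tree with no entries


def tree_payload(entries):
    """Serialize (mode, name, hex_id) tuples into a tree object payload."""
    def sort_key(entry):
        mode, name, _ = entry
        # Directory names are ordered as if they ended with "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    out = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        out += mode.encode("ascii") + b" " + name.encode("utf-8") + b"\0"
        out += bytes.fromhex(hex_id)  # raw 20 bytes, not hex text
    return out


def tree_id(entries):
    """Return the object ID of the tree built from entries."""
    payload = tree_payload(entries)
    header = b"tree " + str(len(payload)).encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


if __name__ == "__main__":
    entries = [
        ("100644", "README", EMPTY_BLOB),
        ("100644", "Makefile.PL", EMPTY_BLOB),
        ("40000", "lib", EMPTY_TREE),
    ]
    print("tree id:", tree_id(entries))
```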

The commit
The object that makes history
Pointer to a tree and the parent(s) commits, if any
Author, committer and commit message

Meet the commit...


commit [content_size]\0
tree 23edfc
author Pedro Melo <melo@mini.me> 1243036800
committer Pedro Melo <melo@mini.me> 1243036800

commit without a parent
usually called first commit

...and its child the other commit


commit [content_size]\0
tree fde45c
parent 3454df
author Pedro Melo <melo@mini.me> 1243036932
committer Pedro Melo <melo@mini.me> 1243036932

and we fixed that nasty bug
after all, they do tend to crop up
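To make the parent link concrete, here is a small sketch that builds two commit payloads in the layout shown above and hashes them. `commit_payload` and `commit_id` are hypothetical helper names. The `+0000` timezone suffix is an assumption added because Git records an offset after the timestamp, which the slides leave out, and the empty tree ID stands in for a real tree.

```python
import hashlib

EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # placeholder tree ID


def commit_payload(tree, parents, author, committer, message):
    """Build the text of a commit: header lines, blank line, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # zero, one or several parents
    lines.append(f"author {author}")
    lines.append(f"committer {committer}")
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")


def commit_id(payload):
    """Return the object ID: SHA-1 of 'commit <size>\\0' + payload."""
    header = b"commit " + str(len(payload)).encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


if __name__ == "__main__":
    who = "Pedro Melo <melo@mini.me> 1243036800 +0000"  # timezone is assumed
    first = commit_payload(EMPTY_TREE, [], who, who, "first commit\n")
    child = commit_payload(EMPTY_TREE, [commit_id(first)], who, who,
                           "and we fixed that nasty bug\n")
    print("first:", commit_id(first))
    print("child:", commit_id(child))
```

Since the parent ID is part of the hashed text, changing any ancestor changes the ID of every commit that descends from it.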

The tag
A name for a particular commit
Can contain a message
Optionally GPG signed
Allows for cryptographically secure releases

Meet the tag


tag [content_size]\0
object 123fec
type commit
tag v1
tagger Pedro Melo <melo@mini.me> 1243037423

made it to 1.0!
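Commits and tags share the same shape: header lines, a blank line, then a free-form message. The sketch below splits such a payload into its parts. `parse_object_text` is an illustrative name, and the parser deliberately ignores multi-line headers (such as embedded signatures), so treat it as a reading aid rather than a complete parser. The full-length object ID and the `+0000` offset in the sample are made-up values.

```python
def parse_object_text(payload: bytes):
    """Split a commit or tag payload into ([(key, value), ...], message)."""
    text = payload.decode("utf-8")
    head, _, message = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        key, _, value = line.partition(" ")  # key is the first word of the line
        headers.append((key, value))
    return headers, message


if __name__ == "__main__":
    sample = (
        b"object " + b"1" * 40 + b"\n"  # hypothetical full-length commit ID
        b"type commit\n"
        b"tag v1\n"
        b"tagger Pedro Melo <melo@mini.me> 1243037423 +0000\n"
        b"\n"
        b"made it to 1.0!\n"
    )
    headers, message = parse_object_text(sample)
    for key, value in headers:
        print(f"{key:>8}: {value}")
    print(" message:", message.strip())
```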

Git Data Model Recap


Immutable objects
A file per object
Repacked into object packs for efficiency
Organized as a directed acyclic graph
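The "file per object" point can be sketched as follows: an unpacked object lives at `objects/<first two hex digits>/<remaining 38>` inside the Git directory. The functions `write_loose_object` and `read_loose_object` are hypothetical helpers, the demo writes into a temporary directory rather than a real repository, and packs are not handled.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(git_dir, obj_type, content):
    """Store an object as a zlib-compressed file named after its ID."""
    store = f"{obj_type} {len(content)}".encode("ascii") + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    if not os.path.exists(path):  # immutable: same ID always means same bytes
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as fh:
            fh.write(zlib.compress(store))
    return oid


def read_loose_object(git_dir, oid):
    """Return (type, content) for an object stored by write_loose_object."""
    path = os.path.join(git_dir, "objects", oid[:2], oid[2:])
    with open(path, "rb") as fh:
        store = zlib.decompress(fh.read())
    header, _, content = store.partition(b"\0")
    obj_type, _, size = header.partition(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as git_dir:  # stand-in for .git
        oid = write_loose_object(git_dir, "blob", b"I like pizza with apples\n")
        print(oid, read_loose_object(git_dir, oid))
```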

[Figure sequence: step-by-step build-up of the object graph for an example project, proj/ containing Makefile.PL and lib/Cool.pm, with trees (Makefile.PL, lib/) and blobs (Cool.pm) added as the history grows.]

References
Names for commits
Mutable, they point to a specific commit
A branch is a reference, a name for a commit, and it moves to the new commit after each commit
Special HEAD reference: points to a reference
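A rough sketch of how these names resolve: HEAD usually holds a line such as `ref: refs/heads/master`, and the branch file holds a commit ID. The helpers `read_ref` and `resolve` are illustrative names; the sketch reads plain files only and assumes the references are not stored in packed form. The demo uses a temporary directory and made-up commit IDs.

```python
import os
import tempfile


def read_ref(git_dir, name):
    """Return the stripped text stored in a reference file."""
    with open(os.path.join(git_dir, name), "r", encoding="ascii") as fh:
        return fh.read().strip()


def resolve(git_dir, name="HEAD"):
    """Follow 'ref: ...' indirections; return (final name, commit ID)."""
    value = read_ref(git_dir, name)
    while value.startswith("ref: "):
        name = value[len("ref: "):]
        value = read_ref(git_dir, name)
    return name, value


def write_file(path, text):
    with open(path, "w", encoding="ascii") as fh:
        fh.write(text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as git_dir:  # stand-in for .git
        os.makedirs(os.path.join(git_dir, "refs", "heads"))
        write_file(os.path.join(git_dir, "HEAD"), "ref: refs/heads/master\n")
        branch = os.path.join(git_dir, "refs", "heads", "master")
        write_file(branch, "1" * 40 + "\n")  # made-up commit ID
        print(resolve(git_dir))
        write_file(branch, "2" * 40 + "\n")  # a new commit moves the branch
        print(resolve(git_dir))  # HEAD itself did not change
```

Only the branch file is rewritten when a commit is made; HEAD keeps pointing at the branch name.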

[Figure sequence: the object graph for proj/ with the references master, test and HEAD drawn on top of the commits, shown at successive steps.]

Merge

[Figure: the master and test branches in a merge.]

Rebase

[Figure: the master and test branches in a rebase.]

Rebase + Merge

[Figure: master and test after a rebase followed by a merge.]

Non-SCM uses for Git


Leverage strengths
  immutable
  over the network, pulls only missing objects
  fast checkout (compared to a copy, less to read)
  easy rollback
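The "pulls only missing objects" point can be pictured with a toy model: walk the object graph from the remote tip and subtract what is already present locally. This is only an illustration of the idea, not how the transfer protocol negotiates; `reachable`, `missing_objects` and the short IDs are all made up.

```python
def reachable(graph, start):
    """Return every object ID reachable from start in {id: [ids it points to]}."""
    seen = set()
    stack = [start]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(graph.get(oid, []))
    return seen


def missing_objects(remote_graph, remote_tip, local_ids):
    """Return the objects reachable from remote_tip that are absent locally."""
    return reachable(remote_graph, remote_tip) - set(local_ids)


if __name__ == "__main__":
    # Toy graph: commit c2 -> parent c1 and tree t2; trees point to blobs.
    graph = {
        "c2": ["c1", "t2"],
        "c1": ["t1"],
        "t1": ["b1"],
        "t2": ["b1", "b2"],  # b1 is unchanged and shared with t1
    }
    local = reachable(graph, "c1")  # we already have the first commit
    print(sorted(missing_objects(graph, "c2", local)))
```

The unchanged blob `b1` is shared by both trees, so it is not transferred again.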

Beware of weak points


Always stores a full copy of files
  not good for backups of DB dumps
Full history
  more disk space
  this might change as shallow clones gain functionality...

Content distribution
Updates done in a master, central repository
Hierarchy of slave repositories
Fast sync between repositories, fast checkout
Can be automated with hooks
Useful if you have lots of static files, faster than rsync

Read-only lesystem
Design a web server that fetches objects directly from the object database
Compact storage, efficient retrieval
Packs of objects are also very VM friendly, mmap ready
Some solutions already available as OSS

Wiki/Ticketing backend
Use a git repository as storage for wiki or ticketing systems
Good match for distributed development
Several solutions already available as OSS
... but similar to SCM usages

That's all folks!

I'll be around #codebits, feel free to ask me stuff
If you want a git as an SCM demo, let's get organized and I'll do an impromptu presentation, or even private lapdan^H^H^H^H^Hdemos

After #codebits
<{mailto,xmpp}:melo@simplicidade.org>

About Git
http://git-scm.com/
Git Internals: http://peepcode.com/products/git-internals-pdf
Git book: http://progit.org/

About Me
http://simplicidade.org/notes/
@pedromelo
{mailto,xmpp}:melo@simplicidade.org
skype:melopt
http://github.com/melo
http://www.slideshare.net/melopt
