As you’re probably aware by now, I really like Git. It took some time, but things finally started clicking. One of the things I wanted to do was make it easier to interact with Git from Python / Django projects.
I searched around for a Python Git module and didn’t find anything that looked complete to me, although I didn’t look too hard. Not being the creative type, I noticed that Ruby has the very nice grit library, created by Tom Preston-Werner and Chris Wanstrath. I decided to port it because I can use it for some cool stuff, and because I figured it would help me learn a lot about Python. So here it is.
About
GitPython is a Python library used to interact with Git repositories.
GitPython is a port of the grit library in Ruby created by Tom Preston-Werner and Chris Wanstrath.
The method_missing stuff was taken from this blog post.
REQUIREMENTS
- Git (tested with 1.5.3.7)
- Python Nose – used for running the tests
- Mock by Michael Foord – used for the tests
INSTALL
You can download the code from the CheeseShop or, alternatively, pull the source. Then run:
python setup.py install
SOURCE
GitPython’s git repo is available on Gitorious, which can be browsed at:
http://gitorious.org/projects/git-python/
and cloned from:
git://gitorious.org/git-python/mainline.git
USAGE
GitPython provides object model access to your git repository. Once you have created a repository object, you can traverse it to find parent commit(s), trees, blobs, etc.
Initialize a Repo object
The first step is to create a Repo object to represent your repository.
>>> from git_python import *
>>> repo = Repo("/Users/mtrier/Development/git-python")
In the above example, the directory /Users/mtrier/Development/git-python
is my working repository and contains the .git directory. You can also
initialize GitPython with a bare repository.
>>> repo = Repo.init_bare("/var/git/git-python.git")
Getting a list of commits
From the Repo object, you can get a list of Commit
objects.
>>> repo.commits()
[<GitPython.Commit "207c0c4418115df0d30820ab1a9acd2ea4bf4431">,
<GitPython.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">,
<GitPython.Commit "e17c7e11aed9e94d2159e549a99b966912ce1091">,
<GitPython.Commit "bd795df2d0e07d10e0298670005c0e9d9a5ed867">]
Called without arguments, Repo.commits returns a list of up to ten commits
reachable by the master branch (starting at the latest commit). You can ask
for commits beginning at a different branch, commit, tag, etc.
>>> repo.commits('mybranch')
>>> repo.commits('40d3057d09a7a4d61059bca9dca5ae698de58cbe')
>>> repo.commits('v0.1')
You can specify the maximum number of commits to return.
>>> repo.commits('master', 100)
If you need paging, you can specify a number of commits to skip.
>>> repo.commits('master', 10, 20)
The above will return commits 21-30 from the commit list.
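The paging arithmetic can be sketched in plain Python. This is only an illustration of how a maximum count and a skip value combine, assuming they behave like a slice over the commit list; page_of is a hypothetical helper, not part of GitPython.

```python
def page_of(items, max_count=10, skip=0):
    """Return up to max_count items after skipping the first `skip` items."""
    return items[skip:skip + max_count]


# Stand-in for a commit list, numbered 1 (newest) through 50 (oldest).
history = list(range(1, 51))

third_page = page_of(history, max_count=10, skip=20)
print(third_page[0], third_page[-1])  # 21 30
```

Skipping 20 and taking 10 yields items 21 through 30, which matches the example above.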
The Commit object
Commit objects contain information about a specific commit.
>>> head = repo.commits()[0]
>>> head.id
'207c0c4418115df0d30820ab1a9acd2ea4bf4431'
>>> head.parents
[<GitPython.Commit "a91c45eee0b41bf3cdaad3418ca3850664c4a4b4">]
>>> head.tree
<GitPython.Tree "563413aedbeda425d8d9dcbb744247d0c3e8a0ac">
>>> head.author
<GitPython.Actor "Michael Trier <mtrier@gmail.com>">
>>> head.authored_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.committer
<GitPython.Actor "Michael Trier <mtrier@gmail.com>">
>>> head.committed_date
(2008, 5, 7, 5, 0, 56, 2, 128, 0)
>>> head.message
'cleaned up a lot of test information. Fixed escaping so it works with subprocess.'
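The date values shown above are nine-field tuples in the same order as Python's time.struct_time, so they can be passed to the standard time module. A small sketch, with the tuple copied from the output above; whether the value is in UTC or local time is not stated here, so treat the formatted result as a display value only.

```python
import time

# Fields: year, month, day, hour, minute, second, weekday, day of year, DST flag
authored_date = (2008, 5, 7, 5, 0, 56, 2, 128, 0)

formatted = time.strftime("%Y-%m-%d %H:%M:%S", authored_date)
print(formatted)  # 2008-05-07 05:00:56
```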
You can traverse a commit’s ancestry by chaining calls to parents.
>>> repo.commits()[0].parents[0].parents[0].parents[0]
The above corresponds to master^^^ or master~3 in git parlance.
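If you need to go back an arbitrary number of generations, chaining by hand gets tedious. Here is a minimal sketch of a helper that follows the first parent n times. FakeCommit and nth_ancestor are hypothetical names used for illustration; the only assumption about real commits is that they expose a parents list, as shown above.

```python
class FakeCommit:
    """Hypothetical stand-in exposing only the attributes used here."""

    def __init__(self, id, parents=()):
        self.id = id
        self.parents = list(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like <rev>~n in git."""
    for _ in range(n):
        if not commit.parents:
            raise ValueError("reached a commit with no parent")
        commit = commit.parents[0]
    return commit


# A four-commit linear history: c3 -> c2 -> c1 -> c0
root = FakeCommit("c0")
c1 = FakeCommit("c1", [root])
c2 = FakeCommit("c2", [c1])
head = FakeCommit("c3", [c2])

print(nth_ancestor(head, 3).id)  # c0
```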
The Tree object
A tree records pointers to the contents of a directory. Let’s say you want the root tree of the latest commit on the master branch.
>>> tree = repo.commits()[0].tree
>>> tree
<GitPython.Tree "a006b5b1a8115185a228b7514cdcd46fed90dc92">
>>> tree.id
'a006b5b1a8115185a228b7514cdcd46fed90dc92'
Once you have a tree, you can get the contents.
>>> contents = tree.contents
>>> contents
[<GitPython.Blob "6a91a439ea968bf2f5ce8bb1cd8ddf5bf2cad6c7">,
<GitPython.Blob "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391">,
<GitPython.Tree "eaa0090ec96b054e425603480519e7cf587adfc3">,
<GitPython.Blob "980e72ae16b5378009ba5dfd6772b59fe7ccd2df">]
This tree contains three Blob objects and one Tree object. The trees are
subdirectories and the blobs are files. Trees below the root have additional
attributes.
>>> contents = tree.contents[-2]
>>> contents
<GitPython.Tree "e5445b9db4a9f08d5b4de4e29e61dffda2f386ba">
>>> contents.name
'test'
>>> contents.mode
'040000'
There is a convenience method that allows you to get a named sub-object from a tree.
>>> tree/"lib"
<GitPython.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
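The slash syntax relies on Python's operator overloading. The sketch below shows the general technique with a toy class; Node is hypothetical and is not GitPython's Tree. In Python 3 the hook is __truediv__, while Python 2 calls __div__, so the sketch defines both.

```python
class Node:
    """Hypothetical tree entry: a name plus optional named children."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = {child.name: child for child in children}

    def __truediv__(self, name):
        # Called for `node / "name"`; raises KeyError if there is no such entry.
        return self.children[name]

    __div__ = __truediv__  # Python 2 spelling of the same hook


root = Node("", [Node("lib", [Node("git_python")]), Node("README")])

print((root / "lib" / "git_python").name)  # git_python
```

Because each lookup returns another node, the operator chains naturally and reads like a file path.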
You can also get a tree directly from the repository if you know its name.
>>> repo.tree()
<GitPython.Tree "master">
>>> repo.tree("c1c7214dde86f76bc3e18806ac1f47c38b2b7a30")
<GitPython.Tree "c1c7214dde86f76bc3e18806ac1f47c38b2b7a30">
The Blob object
A blob represents a file. Trees often contain blobs.
>>> blob = tree.contents[-1]
>>> blob
<GitPython.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
A blob has certain attributes.
>>> blob.name
'urls.py'
>>> blob.mode
'100644'
>>> blob.mime_type
'text/x-python'
>>> len(blob)
415
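One way a MIME type can be derived from a file name is the standard mimetypes module. The sketch below is an assumption about the general approach, not a description of how GitPython computes mime_type; guess_mime and the 'text/plain' fallback are hypothetical choices.

```python
import mimetypes


def guess_mime(filename, default="text/plain"):
    """Guess a MIME type from the file name, falling back to a default."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


print(guess_mime("urls.py"))
print(guess_mime("notes.no-such-ext-xyz"))  # falls back to the default
```

The result for known extensions depends on the MIME tables available on the system, so it is best treated as a hint rather than a guarantee.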
You can get the data of a blob as a string.
>>> blob.data
"from django.conf.urls.defaults import *\nfrom django.conf..."
You can also get a blob directly from the repo if you know its name.
>>> repo.blob("b19574431a073333ea09346eafd64e7b1908ef49")
<GitPython.Blob "b19574431a073333ea09346eafd64e7b1908ef49">
What Else?
There is more stuff in there, like the ability to tar or gzip repos, stats, blame, and probably a few other things. Additionally, calls to the git instance are handled through a method_missing construct, which makes any git command available directly, with a nice conversion of Python dicts to command-line parameters.
Check the unit tests; they’re pretty exhaustive.
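To give a feel for the method_missing idea in Python terms, here is a minimal sketch built on __getattr__. It only builds argument lists and never runs git. The Git class and to_flags function below are hypothetical and simplified; the real conversion rules in the library may differ.

```python
def to_flags(options):
    """Convert a dict of options into a list of command-line flags."""
    flags = []
    for key, value in sorted(options.items()):
        if value is None or value is False:
            continue  # unset options produce no flag
        dashed = key.replace("_", "-")
        if len(key) == 1:
            # Short option: -n or -n5
            flags.append("-" + key if value is True else "-%s%s" % (key, value))
        else:
            # Long option: --no-merges or --max-count=10
            flags.append("--" + dashed if value is True else "--%s=%s" % (dashed, value))
    return flags


class Git:
    """Hypothetical stand-in: builds argument lists, never executes git."""

    def __getattr__(self, name):
        # __getattr__ is only consulted when normal lookup fails, which makes
        # it a rough Python counterpart to Ruby's method_missing.
        if name.startswith("_"):
            raise AttributeError(name)

        def command(*args, **options):
            return ["git", name.replace("_", "-")] + to_flags(options) + list(args)

        return command


git = Git()
print(git.rev_list("master", max_count=10, skip=20))
# ['git', 'rev-list', '--max-count=10', '--skip=20', 'master']
```

Any attribute name becomes a subcommand, so new git commands need no wrapper code; the trade-off is that typos are only caught when git itself rejects the command.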
What is Next?
There are a couple of tests that don’t pass due to an inability to mock them properly, so I’m going to get those fixed up.
I also plan to restructure some of the object relationships. A few of them feel a little dirty to me.
LICENSE
New BSD License. See the LICENSE file.


That’s very awesome, empty! I had an inkling of the same thought when I was looking for a git python library but never bothered trying to create something. I especially like how you used the div method to simulate directory traversal… pretty slick.
Is there a gitorious written in Django in your future? :)
Wow, this looks really cool! I was wondering why there were no Python libraries for interfacing with Git. This must have taken you quite a while.
Michael, this is great stuff, you rock!
You have definitely been faster than me. /me doing rm -rf ~/Code/gitpython
Nice work! Thank you very much for this.
looks great! congrats on the launch. thanks a bunch :)
Cool stuff, Michael! Especially now that I’m moving all my projects to Git.
Awesome! I looked for a Python Git module a while ago and turned up nothing, and then today I looked again and found this. Nice work =)
Thanx for the release!
I’ve been thinking about using git from Python. I was aware of Stacked Git (stgit), which is a quilt-like implementation on top of git written in Python.
Have you looked at Stacked Git?
Eddy: I have not looked at it but will.
Thanks for all the encouraging comments everyone.
Excellent. I was looking around for something similar to this a few weeks ago after getting turned on to git. I’d found the ruby implementation, and am glad to see a native python version.
I’ve been using git hooks for deploying a small django app. It’ll be nice to be able to directly work with git through python :-)
Am I right that currently the library is for read-only access to a repository? Would be nice to be able to add files, and commit changes too. Might that be within scope for future enhancement?
Am really intrigued by the notion of using git as a data store for a simple CMS, a la the Ruby-based git-wiki, hence the question.
First, sorry for the contrarian point of view.
I applaud you for one more effort to make a good tool usable in the Python world; however, why use a good tool when there is a great other one? ;-)
In your posts I cannot find your reasons for choosing Git; anyway, I would like to plug the jewel that Mercurial is. Apart from being written in Python, I find it more pythonic in simplicity of both usage and implementation.
I summarized its many virtues here:
http://lwn.net/Articles/274823/
Again, not intending to rain on your parade, and still curious about your reasons for going with git.
For anyone for whom git hasn’t clicked yet, check out this fine reading: http://www.newartisans.com/blog_files/git.from.bottom.up.php
What good work!
You might want to check out git-issues, a bug tracker for git written in python. It’s at github: http://github.com/ktf/git-issues/tree/master
I’m thinking about making a django frontend…