Who's that Git?
Author: Drew McCormack
Web Site: www.maccoremac.com
If you do any sort of programming, be it in traditional languages like C or Fortran, scripting in Bash or Python, or web development, you should be using some form of Source Control Management (SCM) system. (You can even use SCM for your LaTeX documents.) For many years, the Concurrent Versions System, or CVS, ruled the roost, but things have moved on of late: Subversion has largely supplanted CVS, ‘fixing’ many of the legacy issues that CVS suffered, while retaining the same fundamental approach.
Some now argue that while Subversion is undoubtedly much nicer to work with than CVS, it doesn’t go nearly far enough, and still suffers many of the same issues. Two of these issues stand out:
- You need access to the repository for basic operations like branching and merging, which usually means these actions are not possible offline, and are relatively slow.
- Branching and merging is difficult, requiring manual accounting of revision numbers and the like. This means that many developers simply avoid branching, and work on the trunk, which can lead to its own problems.
To be honest, I hadn’t thought much about these issues myself. I like Subversion, and I just assumed that the pain of branching and merging must be intrinsic to the act. But now I know otherwise…enter Git.
I’ve been seeing references to Git online now for around 6 months, but haven’t taken much notice. People were claiming that it was much better than Subversion, but I didn’t really see how anything could be simpler than Subversion. That was until last week, when a screencast appeared on PeepCode.com, and I decided it was time to find out what the fuss was all about.
The screencast helped me understand the ways in which Git improves on Subversion, and also that you don’t have to give up Subversion entirely to reap the benefits of Git — Git will happily work with a Subversion repository. If you had to sum up what Git does well in one word, it would probably be ‘branching’. Git makes it so easy to create new branches, switch between them, and merge them together, that it is almost more difficult not to work on a branch. There are really two things that make this possible:
- Each user works with a complete repository, including all branches, so actions only involve the local machine, not a remote server. This makes things very fast.
- Git keeps track of the relationships between branches. For example, if you merge one branch into another, Git stores that information, and the next time you merge, it will only merge the changes since the first merge.
In Subversion, you have to keep track of which branches have been merged yourself, by adding revision numbers to log messages, for example. In Git, you rarely need to include a revision number in a command at all. For example, if you are on the master branch (the Git equivalent of the trunk), you merge changes made in a branch called mybranch like this:
git merge mybranch
Compare that with Subversion, where you would first need to determine the revision number of your last merge by examining the log history, and then add the revision numbers to the merge command. Git takes care of this for you.
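To make the repeated-merge point concrete, here is a minimal sketch you can run in a scratch directory. The file names and commit messages are invented for illustration; only the branch names master and mybranch come from the text above. It merges mybranch into master twice, and neither merge mentions a revision.

```shell
#!/bin/sh
# Sketch: merge the same branch twice without ever naming a revision.
# File names and contents are made up for illustration.
set -e

repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is called master
git config user.name "Example User"        # placeholder identity for the commits
git config user.email "user@example.com"

echo "trunk work" > trunk.txt
git add trunk.txt
git commit -q -m "Initial commit on master"

# Do some work on a branch.
git checkout -q -b mybranch
echo "first change" > feature.txt
git add feature.txt
git commit -q -m "First change on mybranch"

# Meanwhile master moves on, then takes the first merge.
git checkout -q master
echo "more trunk work" >> trunk.txt
git commit -q -am "More work on master"
git merge -q --no-edit mybranch

# More work on the branch after that merge.
git checkout -q mybranch
echo "second change" >> feature.txt
git commit -q -am "Second change on mybranch"

# Second merge: same command, and Git works out what is new.
git checkout -q master
git merge -q --no-edit mybranch

git --no-pager log --oneline --graph       # shows both merge commits
```

After the second merge, feature.txt on master contains both changes, even though the command was identical both times.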
The fact that every user has their own copy of the repository is a bit scary to Subversion users. Most expect such a repository to be huge, but Git is amazing at saving space, and Git repositories are often only slightly bigger than Subversion working copies.
Another question that arises from the ‘local repository’ issue is how you share your changes with other developers. Well, even though you work with your own repository, you can have certain branches track branches on remote machines. When you are ready, you can push changes in your local repository up to the remote server, or fetch changes that others have made on the remote server. So, ultimately, Git allows you to collaborate just as easily as Subversion, only it makes the development process itself more satisfying, because you can easily create and move between private branches offline.
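Here is a rough sketch of that push and fetch cycle. To keep it self-contained, a bare repository in a temporary directory stands in for the remote server, and the two clones named alice and bob are hypothetical developers, not anything from a real project.

```shell
#!/bin/sh
# Sketch: two developers sharing work through a "server" repository.
# The server is just a local bare repository standing in for a remote machine.
set -e

work=$(mktemp -d)
git init -q --bare "$work/server.git"
git --git-dir="$work/server.git" symbolic-ref HEAD refs/heads/master

# Alice clones the (still empty) server and makes the first commit.
git clone -q "$work/server.git" "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "line one" > README
git add README
git commit -q -m "Initial commit"
git push -q -u origin master       # -u makes local master track origin/master

# Bob clones and gets Alice's work.
git clone -q "$work/server.git" "$work/bob"

# Alice commits again, offline, and pushes when ready.
echo "line two" >> README
git commit -q -am "Add a second line"
git push -q origin master

# Bob fetches what changed on the server and merges it into his branch.
cd "$work/bob"
git fetch -q origin
git merge -q origin/master
cat README
```

Everything between the pushes happens locally; only git clone, git push and git fetch talk to the server.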
Git and Subversion Sittin’ in the Bath…
My initial reluctance to checkout Git — pun fully intended — stemmed somewhat from the fact that I work on projects that are married to Subversion, and I thought that even if I liked Git, in practice, I couldn’t use it on any project. This was not true.
Git includes a command called git-svn, which allows you to check out code from a Subversion repository, and then work with it locally with Git. When you are ready, you can upload your changes to the Subversion repository, which doesn’t even need to know you are using Git. Neither do your co-workers. So you get all of the advantages of Git branching, without having to convince everyone to switch.
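A round trip might look roughly like the sketch below. Treat it as an outline under assumptions rather than a recipe: the function name svn_round_trip, the SVN_URL variable, the project directory and the bugfix branch are all placeholders I made up, and the --stdlayout option assumes the Subversion repository uses the conventional trunk/branches/tags layout. Nothing runs unless you set SVN_URL yourself.

```shell
#!/bin/sh
# Sketch of a git-svn round trip. svn_round_trip, SVN_URL, "project" and
# "bugfix" are hypothetical names used only for illustration.

svn_round_trip() {
    svn_url=$1

    # Import the Subversion history into a local Git repository.
    # --stdlayout assumes the usual trunk/branches/tags directories.
    git svn clone --stdlayout "$svn_url" project || return 1
    cd project || return 1
    trunk_branch=$(git symbolic-ref --short HEAD)   # local branch that follows trunk

    # Work on a private Git branch; Subversion never sees it.
    git checkout -b bugfix
    # ... edit files and run "git commit" as often as you like ...

    # Bring the work back onto the branch that follows trunk.
    git checkout "$trunk_branch"
    git merge bugfix

    # Pick up colleagues' Subversion commits, then send yours.
    git svn rebase
    git svn dcommit
}

# Only run against a real repository when the caller provides one.
if [ -n "${SVN_URL:-}" ]; then
    svn_round_trip "$SVN_URL"
fi
```

The git svn rebase step plays the role of svn update, and git svn dcommit plays the role of svn commit, sending each local Git commit to Subversion as its own revision.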
The Git Workflow
I’ve only been using Git for a short time, but I already have a few insights into how it will change the way I work. With Subversion, I would typically carry out small changes, like bug fixes, in a working copy of my source code checked out from the trunk. This would allow me to keep up with the latest changes in the repository, and easily commit my own fixes, but it had one big disadvantage: I couldn’t commit changes until I had something working, or the trunk would stop working. This meant I wasn’t getting any of the advantages of regular commits, and if a particular bug turned out to require considerable changes, I didn’t really have the safety net that an SCM system should offer.
For more major changes, I would usually create a separate branch, and check out the source into a separate working copy. So creating a new branch would mean a full checkout of maybe a million lines of code, and then a full build. It also meant wasted disk space, because I had a full working copy for each branch, and one for the trunk.
With Git, the workflow is different: You check out one ‘working copy’, which becomes your local repository. You then create as many branches as you like, each of which takes up virtually no space at all, and just as little time. You then switch between different branches as needed; switching is very fast, much faster than the equivalent operation in Subversion. Because you only have one ‘checkout’ of the source code, you do not duplicate build products like object files and libraries. This saves time and space.
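As a small illustration of working this way, the sketch below keeps a single directory and flips it between two branches. The file app.txt and the branch name experiment are invented for the example.

```shell
#!/bin/sh
# Sketch: one working directory, several branches, switched in place.
# app.txt and the "experiment" branch are made-up names.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Creating a branch copies no files; it only records a new name.
git checkout -q -b experiment
echo "experimental" > app.txt
git commit -q -am "Half-finished experiment"   # safe to commit, master is untouched

# Switch back: the same directory now holds the master version again.
git checkout -q master
cat app.txt        # prints "stable"

# And forward again to carry on with the experiment.
git checkout -q experiment
cat app.txt        # prints "experimental"

git branch         # lists both branches, with the current one marked
```

Because the half-finished commit lives only on experiment, you get the safety net of regular commits without breaking the branch everyone else builds from.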
Go Git It
I’m still very new to Git, but I am starting to see the light. I git it, and so should you. If you need more convincing, I recommend the PeepCode screencast, and — for Subversion users — there is a good site that compares Git commands to their Subversion equivalents. Very helpful.
Git is easy to build and install, working straight out of the box for me on Mac OS X 10.5. The Git web site provides all the answers.