Sunday, April 13, 2008

Quick Start: using git (for Windows) with Google Code Project Hosting

Ideally everyone uses git, because just as Linus once wrote: Centralized works. It is just inferior.

As I have said before, the project hosting feature of Google Code is a nice choice (much better than SourceForge) for collaborative work on open source projects. You get a project page, a wiki, a download space, an issue tracker, integrated statistics and mailing-list features, and of course a repository. However, up to now only a Subversion repository is offered, likely because of the 80% folks reason.

That is surely not the end of the world. You can polish your growing git skills with that Subversion repository by using git-svn. The idea is that you use git and then, once in a while, you place the code in the Subversion repository. This has several advantages over using git alone. First, Google backs up your repository. In addition, other developers who only know how to use friendly GUI tools, e.g. TortoiseSVN, can get the code easily. And if they are project members and contribute some code (through Subversion, because that is what those poor guys are using), you can still merge their changes into your master working branch without pain. Also, your code is searchable with Google Code Search, which gives it more exposure and promotes code sharing and reuse.

This short instruction is mainly targeted at "average" Windows developers who want to work offline using git and are not afraid of the command line, not at the usual hard-core Linux geeks (who can easily recall the manpage of git even if you suddenly wake them in the middle of the night). For the latter, there are many posts that already cover it. Just so that you are aware, what is written below is not a replacement for the git-svn manual. In addition, suitably adjusted, it should also work with Subversion repositories other than Google Code Project Hosting's, e.g. your company's internal repository.

To get git, visit the msysgit project (which is, unsurprisingly, hosted on Google Code). Download the latest version that also includes git-svn, e.g. Git-preview20080301.exe at the time of writing. After that, double-click the .exe file to install it. You will be presented with the installation wizard; just accept the defaults and keep clicking the Next button until you finish. Usually this means you get git in C:\Program Files\Git, and you can right-click in a Windows Explorer window to start doing some git magic from that folder.

To initialize the git repository, right-click on a folder in the Windows Explorer folder tree and choose the menu item Git Bash here. A console will open, where you have to type in the following:

git svn clone -s --username=joe.sixpack git

Of course, substitute joe.sixpack and coolproject with your Google Code user name and project name. The following message will show up:

Initialized empty Git repository in .git/
Authentication realm:  Google Code Subversion Repository
Password for 'joe.sixpack':

and then just enter the generated Google Code password for the Subversion repository. If everything is fine, all the revisions in the remote Subversion repository of your project are pulled into your local git repository. This may take a while, especially if your repository has years of history.

Note: On Windows, git-svn performance is just so-so. Maybe this will be improved in a future version. If you suddenly see git-svn fail at some point with the following error message:

 Cannot commit config file!
 config svn-remote.svn.branches-maxRev 116: command returned error: 4

then just retype your last clone command; git-svn will happily continue from where it stopped (i.e. no need to pull from revision 1 again).
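
Since re-running the same command is all it takes, the retyping can be left to a small loop. What follows is only a sketch: retry is a helper name I made up (it is not something shipped with git), and it assumes that git-svn exits with a non-zero status when it hits the error above.

```shell
# retry MAX CMD [ARGS...]
# Hypothetical helper (not part of git): run CMD until it succeeds,
# giving up after MAX attempts.
retry() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}
```

In Git Bash you would put retry 5 in front of the clone command, exactly as you typed it before. Each new attempt then picks up from the last fetched revision, as described in the note.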

Now what you get is a subfolder called git (the last argument in the clone command that you typed before) which is your working folder and which contains the git repository (evidenced by a subfolder .git inside it). Switch to this git folder and you are ready for the subsequent steps.

To compact your git repository, run:

git gc

In principle, you need to do this only once in a while. Here it is a good idea to do it because you have just imported your whole Subversion repository.

Tip: Because a git repository is very compact, don't worry about disk space when you do the initial cloning step. This is a point your nearest git evangelist will repeat again and again. To give an example, after cloning over 1300 revisions of the SpeedCrunch source code, my git repository is only around 7 MB. On the other hand, a single Subversion checkout (including all the branches) is already 51 MB. Mind you, that 7 MB contains all the changes committed during the history of the project, whereas the 51 MB covers only a few Subversion revisions (trunk and several branches).
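
If you want to check the numbers for your own project, something along these lines should do. It is just a sketch: repo_size_kb is a name I picked for illustration, and it assumes a du that understands the -s and -k options, as any POSIX du does.

```shell
# Print the size, in kilobytes, of the .git folder inside a working folder
# (defaults to the current folder). The function name is made up.
repo_size_kb() {
    du -sk "${1:-.}/.git" | cut -f1
}
```

Run repo_size_kb inside your working folder before and after git gc to see how much the compaction saved.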

To start using git for your brand new cool feature, usually it is best to create a new branch for that particular feature. First, check all the branches that you might have using:

git branch -a

which gives something like:

* master

where the star sign (*) indicates the current branch. The branch called master is created for you automatically; it is basically the local git branch that tracks the trunk of the remote Subversion repository.

Now create a new branch, e.g. joe/feature1, from master with the command:

git checkout -b joe/feature1 master

The name joe/feature1 is arbitrary. It won't be visible to the outside world because it is your local branch. So feel free to use a naming scheme that suits you.

Time to have fun! In this branch, implement the feature that you love and do some coding there. Once you are happy, just commit it (to this branch) using:

git commit -a

Continue hacking on it until you are satisfied. Make as many commits as you like. The sky is the limit.

Finally, when you think the feature is rock-solid, to merge back your changes to master, do the following:

git checkout master
git merge joe/feature1

which means switching back to the master branch and then applying all the changes that have been made in the joe/feature1 branch. Simple and lovely, isn't it?
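
If you would rather rehearse the whole cycle before doing it on real code, the steps above can be replayed in a throwaway repository. This is only a sketch: the file name notes.txt, the commit messages and the e-mail address are made up, and it needs nothing but git itself (no Subversion involved).

```shell
# Rehearse the branch-and-merge cycle in a scratch repository.
# Nothing here touches Subversion; all names are made up.
scratch=$(mktemp -d)
cd "$scratch"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Joe Sixpack"
git config user.email "joe.sixpack@example.com"

# First commit on master.
echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Create the feature branch, hack on it and commit there.
git checkout -q -b joe/feature1 master
echo "cool feature" >> notes.txt
git commit -q -a -m "Add cool feature"

# Switch back to master and merge the feature in.
git checkout -q master
git merge -q joe/feature1
git log --oneline
```

The final git log should list both commits on master. Note that the script leaves you inside the scratch folder, which you can simply delete afterwards.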

Now, what about the poor souls who still need to check out or update the code through the Subversion repository in Google Code? Well, you can still push your changes there. To synchronize the remote Subversion repository with your git master, you need to do:

git svn dcommit

The other way around is perfectly possible as well. If your collaborators change something in the remote subversion repository, to keep your git master up to date just use:

git svn rebase
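
In practice the two commands tend to go together: rebase first so that you are up to date, then dcommit. A minimal sketch of that routine follows. The function names run and sync_with_svn and the DRY_RUN switch are my own invention for illustration; only git checkout, git svn rebase and git svn dcommit are real commands.

```shell
# Hypothetical helper: with DRY_RUN=1 a command is printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: bring master up to date with Subversion, then send
# the local commits back. Stops at the first command that fails.
sync_with_svn() {
    run git checkout master &&
    run git svn rebase &&
    run git svn dcommit
}
```

Setting DRY_RUN=1 before calling sync_with_svn prints the three commands without running them, which is handy for a first try. Keep in mind that git svn rebase wants a clean working folder, so commit your changes first.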

That is all. For more advanced techniques, refer to the git-svn manual. Don't forget to polish your git skills by reading the manual and tutorials as well as (of course) getting a lot of practice.

Tip: An alternative to the very first stage of cloning is to populate the new git repository with only the latest revisions of the remote Subversion repository. Use the following commands:

git svn init -s --username=joe.sixpack
git svn fetch -r 1300:HEAD
git gc

which will give you just the changes from revision 1300 up to the last one (HEAD). This is of course not really recommended because you won't get access to the whole history of your Subversion repository, but it is still useful if your project is very large and you do not care about all those changes made in its prehistoric times. If you use this init+fetch method instead of clone, no subfolder is created. Your working folder (and git repository) is where you did the init and fetch. Now you can continue with the usual steps of creating a local branch, implementing your feature, synchronizing, and so on.

Don't you just love git?


Unknown said...

git rocks, but git-svn has the awkward side effect that my repository is *much* bigger than the SVN version.

For example, I checked out the rsibreak source from KDE SVN with git-svn, after some minor commits, the repository is about 73 MB in size. Pretty much if you ask me.

I wonder where that comes from.

Unknown said...

I use Bazaar, hosting my projects at .

There I have free Bugtracker, Translation Tool, Question & Answer Module and a Blueprint Manager to plan future features.

Ariya Hidayat said...

@Bram: did you do 'git gc --prune --aggressive'?

Unknown said...

What do you mean ideally everyone uses git? I think it should be Mercurial :-p

git might be attracting a lot of attention now because of Linus/Linux, but that doesn't mean it's the only option and it sure doesn't mean it's the best.

Although in the end I'm sure git quirks could be ironed out. I don't mind, as long as we start using centralized VCSes ;)

Unknown said...

@Ariya: I hadn't run it, but I still see no improvement.

There are files in the .git repository (.git/svn/git-svn/.rev_db.xxxxx) which contain only millions of zeroes, nothing useful in there.

Anonymous said...

hi ariya, from my experience git svn is super slow, just use github, it's dead easy.

Ariya Hidayat said...

@ariekeren: of course ideally use git only (I also use github), the whole idea behind this git-svn stuff is only for interacting with those that still stick with subversion.