How I Learned to Stop Worrying and Love Git
November 12, 2011 14:59 Filed in: Development
Subversion Curmudgeon
I’m a big proponent of source control, and have been for almost as long as I’ve been developing software. In my Windows days I used Visual SourceSafe…and actually liked it.
In 2008, when I went Windows-sober and dove into iOS full time, I embraced Subversion as the de facto SCM system for the platform.
I grew quite comfortable with it, and made sure to expose my fellow developers to it as well. Subversion has performed admirably and I’ve never been bitten by it.
Around a year and a half ago, some of my more forward-thinking developer friends started advocating Git as a replacement, and encouraged me to eschew SVN.
I was, to put it mildly, hesitant. I’ve been in the business for long enough to have a bit of healthy caution when new technologies come on the scene. Don’t get me wrong - I love new/shiny technology and play with everything I can. But I’m a bit slower to adopt when we’re talking about mission critical workflow tools.
I need a really good reason to fix something that’s not broken.
And so, I resisted. As time progressed, I ended up the lone SVN holdout in my circle of developer buddies…and that concerned me.
Was I being too cautious? I didn’t want to be “that guy” - the sad old developer who clings to ancient technology because it’s what he knows, while the rest of the world marches forward.
Once I had a safe point in my release workflow to do so, I converted some of my SVN repos to Git and started experimenting. Once I got over the initial learning curve problems, I found that I fell in love with it.
Since I held out for so long against Git, I now feel honor-bound to do my part proselytizing it.
The purpose of this blog entry is to give people some real-world benefits of making the switch, and to pull together links on how to make the migration.
Why Migrate? Real-World Benefits of Git
Before I switched, the main arguments people used to try and sell me on Git were:
- It’s distributed. I can check in code without being online.
- It has really great branching.
The branching benefit didn’t sell me either, as I’ve found branching to be an activity that is best done sparingly, and with large quantities of pain.
Now that I have switched, I offer some real-world benefits that would have helped me get over the hump sooner.
Real Branches and Tags
Most SVN repos have a root-level structure like this:
/branches
/tags
/trunk
“Trunk” is the latest and greatest, “tags” holds a copy for each released version, and “branches” is used to store experiments until they get merged back into the trunk.
The problem is, this folder structure is not enforced by Subversion - it’s just 3 folders of files as far as SVN is concerned. They could just as well be called “foo” “bar” and “stuff.”
Because SVN doesn’t know the meaning of these folders, it doesn’t treat them any differently.
“So what?” you say. “I know what they are for and I’m using them correctly.”
Maybe so, but let’s assume a long-lived SVN project like this:
/branches
    /newversion_testcode
/tags
    /version_1.0
    /version_1.1
    /version_1.2
    /version_1.3
    /version_1.4
/trunk
When you pull this down from your SVN repository, it’s going to run for a good long while…and you’re going to end up with 7 full copies of the project on your disk. “So what, disk is cheap” may be a valid argument, but when you consider that most people are moving to small, fast SSDs, space becomes precious again.
Git treats branches and tags as first-class citizens, and the fact that it does so brings some nice benefits.
Tags are simply names applied to a particular point in the commit timeline. Adding them won’t increase your download time or consume more of your local workstation disk.
Similarly, branches are stored intelligently by Git. Adding one won’t increase your disk consumption, as deltas are used to keep track of what’s what internally.
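To make that concrete, here is a minimal sketch run against a throwaway repository. The file, tag, and branch names are purely illustrative, and the commands assume a reasonably recent Git on your path. It shows that a tag and a branch are just names pointing at an existing commit, not extra copies of the project.

```shell
# Create a throwaway repository (all names here are illustrative)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# A tag is a name attached to one commit; no files are copied
git tag version_1.0

# A branch is likewise just a pointer to a commit
git branch newversion_testcode

# All three names resolve to the same commit
head_commit=$(git rev-parse HEAD)
tag_commit=$(git rev-parse version_1.0)
branch_commit=$(git rev-parse newversion_testcode)
echo "HEAD:   $head_commit"
echo "tag:    $tag_commit"
echo "branch: $branch_commit"

# The working folder still holds a single copy of the file
file_count=$(find . -name README.txt -not -path "./.git/*" | wc -l | tr -d ' ')
echo "copies of README.txt in the working folder: $file_count"
```

Compare that with the SVN layout above, where every tag and branch shows up as another full folder in your checkout.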
All of this leads to my next point:
Better Performance
Because branches and tags don’t come down as redundant data, performance is substantially better over the wire.
To benchmark just how much better, I timed a full checkout of a large, long-lived SVN project containing dozens of tags and a couple of branches.
The original SVN repo was converted to Git and hosted on the same server for comparison purposes.
The results speak for themselves:
SVN: The repository took 30:09 minutes to download, and consumed 3.86GB of disk.
Git: The repository took 2:22 minutes to download, and consumed 187.6MB of disk.
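For a sense of scale, the ratios can be worked out from the two results above. This is just arithmetic on the quoted figures; treating 1GB as 1000MB is my assumption, so read the disk ratio as approximate.

```shell
# Figures quoted above; 1 GB is taken as 1000 MB (an assumption)
svn_seconds=$((30 * 60 + 9))   # 30:09
git_seconds=$((2 * 60 + 22))   # 2:22

time_ratio=$(awk -v s="$svn_seconds" -v g="$git_seconds" 'BEGIN { printf "%.1f", s / g }')
disk_ratio=$(awk 'BEGIN { printf "%.1f", (3.86 * 1000) / 187.6 }')

echo "Download time: SVN took about ${time_ratio}x as long as Git"
echo "Disk usage:    SVN used about ${disk_ratio}x as much as Git"
```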
Git sends much less data over the wire than SVN and consumes less disk on your workstation.
This is even more impressive when you consider my next point:
Local Commit History
SVN only gives you a snapshot in time. If your server went away for some reason, the checkin history would be lost; you’d only have the latest copy of trunk and each tag on your local workstation.
Git instead gives you the entire repository, with all checkin history back to the initial creation of the repo. This is a “nice-to-have” feature for redundancy, but it also becomes another performance benefit when diffing files, viewing checkin history, etc. All these operations can run locally rather than over the wire.
Despite consuming less disk, Git puts more information on your workstation and gives you the entire repository commit history.
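Here is a small sketch of what that means in practice. It uses a local folder as a stand-in for the server (an assumption made so the example is self-contained), clones it, deletes the "server," and then reads history and diffs from the clone alone.

```shell
work=$(mktemp -d)

# A local repository standing in for the "server" (illustrative only)
git init -q "$work/server"
cd "$work/server"
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "one" > notes.txt
git add notes.txt
git commit -q -m "First checkin"
echo "two" >> notes.txt
git commit -q -am "Second checkin"

# Clone to the "workstation", then make the server go away
git clone -q "$work/server" "$work/workstation"
cd "$work/workstation"
rm -rf "$work/server"

# History and diffs still work, because they are read from the local clone
commit_count=$(git rev-list --count HEAD)
echo "commits available locally: $commit_count"
git log --oneline
git diff HEAD~1 HEAD -- notes.txt
```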
Better Renames
Renaming files on SVN is a somewhat dicey operation that ends up committing as two operations: a delete, and an add. Git is able to identify that the file itself did not change, and it registers this as what it is: a rename.
No .svn Droppings
SVN makes a hidden folder called “.svn” in every folder of your project. These folders can become troublesome when copying folders between projects. For example, say you want to pull a copy of JSONKit from project1 into project2 - the /JSONKit folder will contain its own .svn folder which must be deleted, or SVN will become confused at commit time.
This is a bigger and more pervasive issue than it sounds like. Any serious SVN user knows this command for deleting all hidden .svn folders and starting from scratch:
find . -type d -name ".svn" -prune -exec rm -rf {} \;
By comparison, Git stores only one .git folder and it’s in the root directory of the project. If you ever want to de-gittify a project, just delete one .git folder and you have a clean copy ready to import somewhere else.
Any SVN users who have struggled with issues related to .svn folders will appreciate this.
Better Branching
As I mentioned previously, branching in SVN was an activity I tried to avoid as much as possible.
Creating a branch in SVN is a fairly nasty operation:
- Copy your trunk folder to the branches folder (which makes a full copy of the project)
- Do your experimental work in the branch
- Merge it back to trunk and delete the experimental branch
With Git, branching is fast, lightweight, and effortless. You can create as many branches as you like, and switching between them is like changing channels on a TV - it just happens, and your working folder is suddenly updated. It’s so fast and effortless that it encourages branching as part of your development workflow.
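A minimal sketch of that workflow, again in a throwaway repository with made-up file and branch names, looks something like this:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Stable version"

# The default branch may be "master" or "main", depending on configuration
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Create an experimental branch and switch to it in one step
git checkout -q -b experiment
echo "experimental" >> app.txt
git commit -q -am "Try an experiment"

# Change channels: the working folder goes back to the stable content
git checkout -q "$main_branch"
lines_before_merge=$(wc -l < app.txt | tr -d ' ')

# Merge the experiment back in and remove the branch
git merge -q experiment
lines_after_merge=$(wc -l < app.txt | tr -d ' ')
git branch -q -d experiment

echo "lines in app.txt before merge: $lines_before_merge, after merge: $lines_after_merge"
```

Nothing here copies the project; switching branches only rewrites the files that differ between them.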
To get a full understanding of the power of branching and why you’ll want to do more of it in Git, I recommend reading Pro Git - it’s a great explanation.
Ok, I’m Sold - How Do I Convert?
Step 1: Prepare the SVN Repos
If you have one giant repo for all your projects, you should first split it into individual per-project repos.
If you’re already using separate repos for each project, skip this step.
Splitting requires svnadmin and svndumpfilter, and for me it was the biggest and most time-consuming part of switching.
The basic process is:
- Dump your repo using svnadmin dump
- Create a new repo
- Load the new repo with svnadmin load, piping the dump through svndumpfilter so that only the folder(s) you care about end up in the new project-centric repo.
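The steps above can be sketched as a small shell function. This is a rough outline rather than the exact commands I ran: the function name and its three arguments are hypothetical placeholders, and the two svndumpfilter options are my own choice for dropping revisions that no longer touch anything.

```shell
# split_project OLD_REPO NEW_REPO PROJECT_PATH
# All three arguments are hypothetical placeholders for your own paths.
split_project() {
    if [ "$#" -ne 3 ]; then
        echo "usage: split_project OLD_REPO NEW_REPO PROJECT_PATH" >&2
        return 2
    fi
    old_repo=$1
    new_repo=$2
    project_path=$3

    # Create the empty per-project repository
    svnadmin create "$new_repo" || return 1

    # Dump the full history, keep only the chosen folder, and load the result
    svnadmin dump -q "$old_repo" \
        | svndumpfilter include "$project_path" --drop-empty-revs --renumber-revs \
        | svnadmin load -q "$new_repo"
}

# Example call (paths are made up):
# split_project /srv/svn/everything /srv/svn/project1 project1
```

The filtered history keeps its original path prefix inside the new repo, so expect to do some tidying afterwards. Try it on a copy of your repository first.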
Step 2: Convert the Repos
Once you have nice, separated per-project SVN repos, you can easily convert each one to a nice fresh Git repo.
The best information I found about this comes from John Albin.
If you just have one repository to convert, follow his 7-step process here:
http://www.albin.net/git/convert-subversion-to-git
If you have a lot of repositories to convert, or you just prefer automation, look at his pre-built scripts that automate the process here:
http://www.albin.net/git/git-svn-migrate
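To give a feel for what the conversion involves, here is a hedged sketch of the central step using git svn. It is not a reproduction of the guides linked above. The function name and arguments are hypothetical, and it assumes your SVN repo uses the standard trunk/branches/tags layout and that you have prepared an authors file mapping each SVN username to a Git identity (lines of the form `svnuser = Full Name <email>`).

```shell
# convert_svn_repo SVN_URL AUTHORS_FILE TARGET_DIR
# All three arguments are hypothetical placeholders.
convert_svn_repo() {
    if [ "$#" -ne 3 ]; then
        echo "usage: convert_svn_repo SVN_URL AUTHORS_FILE TARGET_DIR" >&2
        return 2
    fi
    svn_url=$1
    authors_file=$2
    target_dir=$3

    # --stdlayout:    treat /trunk, /branches and /tags as what they are
    # --no-metadata:  leave git-svn bookkeeping out of the commit messages
    # --authors-file: map SVN usernames to Git names and emails
    git svn clone --stdlayout --no-metadata \
        --authors-file="$authors_file" "$svn_url" "$target_dir"
}

# Example call (URL and paths are made up):
# convert_svn_repo file:///srv/svn/project1 authors.txt ~/project1-git
```

The linked guides cover the remaining cleanup, so treat this only as a starting point.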
Conclusion
In summary, most of my trepidation about switching to Git was unfounded, and the benefits of making the move have made me a believer. I hope that you’ll consider giving Git a try too!