Getting (g)it

Posted on ‐ Tagged #git #source control #scm

Chances are, if you’re reading this article and you’re a programmer, you’re working with Git, or at least some form of version control. But do you really get (g)it? This article will dive into what makes Git, or any other comparable distributed version control system, your new best friend.

Before we go on, I must warn you that this article will be of relatively little interest to you if you’re already comfortable working with Git. However, I know quite a few people who work with Git on a near-daily basis, but struggle to grasp even the basic concepts of what makes it great. This article is for them.

You can do it!

While you might not feel this way right now, I promise you Git is nothing like voodoo magic and does not, in fact, require years of study to wield properly. In order to effectively use Git, one must possess only two skills:

  • An understanding of the fundamental concepts of distributed version control (I will teach you this).
  • A willingness to go out and read the manual pages of Git, or look up commands for yourself some other way (you get to do this one on your own).

Yes, I can hear you wondering: why bother reading the rest of this article if you won’t teach me any commands? Well, trust me, if you understand the concepts, the rest will follow naturally.

I should point out that I do expect you to have a very basic understanding of Git at this point. Specifically, I expect you to know what I’m talking about when I mention commits, branches, cloning, pulling and pushing. If you are completely new to Git, I suggest you read through chapters 1 & 2 on the official Git website, then return here. (Or continue on with chapter 3 and never come back here at all!)

Diving in

Now, before we talk about Git itself, let’s take a step back and reflect on why we are using Git in the first place. Remember that time in high school when you had to work with three classmates on a project or essay, and you were slinging Microsoft Word documents around via email? You had no idea who..

  • Had the correct, latest version?
  • Changed what, and where?
  • Or why?

And you..

  • Realized the next day that it was a mistake to delete those five paragraphs you had previously been sweating over, and there was no way to get them back because you didn’t have a backup?

Those are the kinds of issues we’re trying to solve here. (And yes, you could indeed use Git to collaborate on stuff other than coding, too.)

That which makes version control great

So here’s why you should be using version control, and why you should be using it right. Otherwise, you might as well save yourself the time and not use it at all.

Version control, if done right, lets you answer all those questions about your project, at any given time, about any given change, whether it’s just your project, that of you and two coworkers, or the combined effort of a few thousand people. It also, as an added benefit, lets you easily share your project with all of your peers.

In software development, this allows you to easily collaborate with your team, see who changed what, when and why, merge changes that were done by different people back together, and undo (revert) changes if you decide they weren’t a good idea.
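To make that a little more concrete, here is a minimal sketch of asking Git who changed what, when and why. The throwaway repository, the file name, the two authors and the commit messages are all made up for illustration; git log and git blame are the real tools doing the work.

```shell
set -e

# Throwaway repository in a temporary directory (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Alice Example"        # placeholder identity
git config user.email "alice@example.com"

# Alice writes the introduction.
printf 'Introduction\n' > essay.txt
git add essay.txt
git commit -q -m "Add introduction"

# Bob (a second, equally hypothetical author) adds the conclusion.
printf 'Conclusion\n' >> essay.txt
git commit -q -a --author="Bob Example <bob@example.com>" -m "Add conclusion"

# Who changed what, when, and why: one line per commit.
history=$(git log --date=short --format='%h %an %ad %s')
echo "$history"

# Who last touched each individual line of the file.
git blame essay.txt
```

The commit message is where the "why" lives, which is one more reason to write a decent one.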

Isolating changes

Isolating changes down to their lowest common denominator is absolutely key to using version control effectively. Do not change the color scheme of your website while simultaneously adding that spiffy comments box to your blog. What if you decide you don’t like that color after all? What if spam bots find you and start blasting you with advertisements, and you decide you’re better off without a comments section after all?

Are you going to manually undo the changes you made? Open up all those files you changed, find the modified lines and change them back to what they were previously? No! At least, I sure hope not.

Isolate those changes. Isolate, isolate, isolate. Each one goes in its own separate commit. Don’t like that color scheme after all? Maybe weeks later you don’t want that comments section anymore? Find the hash of the commit in which you added it, then run git revert <commithash> and voilà, you’re done in 60 seconds. You can’t do that if you don’t isolate, because then your only choices are to roll back more than that one specific item, or not roll back at all.
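Here is a rough sketch of what that looks like in practice. The file names, contents and commit messages are invented for the example, and it runs in a throwaway repository so it cannot touch any real work. The only command that matters is git revert; the --no-edit flag simply accepts the default commit message.

```shell
set -e

# Throwaway repository (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

# Starting point: a tiny, hypothetical website.
printf 'color: blue\n' > style.css
printf '<h1>My blog</h1>\n' > index.html
git add style.css index.html
git commit -q -m "Initial site"

# Change 1: the color scheme, and nothing else.
printf 'color: green\n' > style.css
git commit -q -a -m "Change color scheme to green"
color_commit=$(git rev-parse HEAD)          # remember this commit's hash

# Change 2: the comments box, and nothing else.
printf '<form id="comments"></form>\n' >> index.html
git commit -q -a -m "Add comments box"

# Weeks later: undo only the color change. The comments box stays.
git revert --no-edit "$color_commit"

cat style.css
cat index.html
```

Note that git revert does not erase anything. It adds a new commit that undoes the old one, so the history still shows that you tried green and changed your mind.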

Branching - where some people lose hope

Alright. So. Committing things separately. Seems simple enough, right? Good. Bear with me for a bit, we’re going to talk about branches now. Sound scary? It’s not, trust me.

Cheap (local) branching is what sets Git, or any other Distributed Version Control System (DVCS) for that matter, apart from its predecessors like CVS or Subversion.

Say you’re a highly successful software company. You have this amazing product, a team of developers, and everything is running along nicely. Sooner or later (and I can assure you, definitely sooner rather than later), you’re going to want to try something new. Something you don’t think you’re going to get right with just one simple change. You need to make multiple changes, and that means multiple commits. Because you’re doing version control right, remember?

In fact, you may realize you’ll need a lot more than just one day to get it right. Actually, you worry whether you’ll get it right at all. But you think this idea is so great, it should be pursued, and if you can’t get it to work, you’ll just scrap it and pretend it never happened.

Fork in the road

Branches are exactly what you need to facilitate this process. Instead of everyone working on the same thing, like having all your developers working on the master branch, you’re going to separate your various lines of development out into multiple branches. Think of it like a fork in the road on a hiking trail.

Person A goes left because it’s shorter and easier to walk, while person B, enjoying a good challenge, takes the steep, uphill path to the right. But eventually, they’re going to meet up again at the end of the trail.

That, or one of them gets eaten by a bear, never to be heard from again. Poor fella.

Branching is really no different. Up until a specific point in time, you have a shared history, a shared path. But at some point, you split, because you want to take things in a different direction. You work in isolation* until you’re finished, at which point you converge back onto the original path again. Or, you decide your idea was crap, and you drop that line of development entirely.

Like that poor guy who got eaten. Your idea was bad, but the moment you decide to drop it, that’s it. There’s nothing you have to undo as your code was in total isolation from any other lines of development, so you can drop it without issue.

The beauty of Git is you can create, merge or destroy branches willy-nilly, at virtually no cost. It takes less than a second to create a new branch (for Git, that is; you may need a little more time to type the command that makes Git do your bidding), so really, why wouldn’t you do this each time you start working on something new? After all, you never know if that thing you’re working on is really a simple one-line fix, or if it turns out to be incredibly complicated after all, do you?
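A small sketch of the fork in the road, with both hikers. The branch names good-idea and bad-idea, the files and the commit messages are hypothetical, and everything happens in a throwaway repository. One branch converges back onto master; the other gets eaten by the bear.

```shell
set -e

# Throwaway repository (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # call the first branch "master", whatever the local default is
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf 'v1\n' > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Fork in the road, path one: an idea that works out.
git checkout -q -b good-idea
printf 'good feature\n' > feature.txt
git add feature.txt
git commit -q -m "Add good feature"

# Path two, starting from the same shared history: an idea that does not.
git checkout -q master
git checkout -q -b bad-idea
printf 'bad feature\n' > regret.txt
git add regret.txt
git commit -q -m "Try a bad idea"

# Converge: merge the good idea back, then clean up its branch.
git checkout -q master
git merge -q --no-edit good-idea
git branch -d good-idea

# Drop the bad idea. -D is needed because the branch was never merged.
git branch -D bad-idea

git branch --list
```

After this, master contains feature.txt and has never heard of regret.txt. There was nothing to undo, because the bad idea never left its own branch.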

Not to mention, what if your manager shows up at your desk, demanding that you fix this just-reported critical issue right this damn second? What are you going to do then? Make a new clone of the repository in another directory? Or worse, mix it in with your current, unrelated bugfix? Why not simply switch branches, fix it, test it, ship it, then switch back to your original branch and continue where you left off?

Doesn’t that just make your life all sorts of easy? (Oh, in fact, don’t forget to also merge that hotfix you just released back into that first branch too. That way you don’t run into the same issue there anymore, either.)
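The manager scenario could go roughly like this. The branch names new-feature and hotfix, the files and the "bug" are all invented, and the sketch assumes your feature work is already committed when the interruption arrives (uncommitted work would first need to be committed or set aside). "Shipping" here just means merging the fix into master.

```shell
set -e

# Throwaway repository (illustration only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master     # call the first branch "master", whatever the local default is
git config user.name "Example Author"       # placeholder identity
git config user.email "author@example.com"

printf 'total = price + tax\n' > billing.txt
git add billing.txt
git commit -q -m "Initial release"

# You are happily working on something unrelated.
git checkout -q -b new-feature
printf 'export to PDF\n' > export.txt
git add export.txt
git commit -q -m "Start PDF export"

# The manager shows up: switch branches, fix it, ship it.
git checkout -q master
git checkout -q -b hotfix
printf 'total = price + tax - discount\n' > billing.txt
git commit -q -a -m "Fix missing discount in total"
git checkout -q master
git merge -q --no-edit hotfix

# Switch back and continue where you left off, taking the hotfix with you.
git checkout -q new-feature
git merge -q --no-edit hotfix

cat billing.txt
```

At no point did the half-finished PDF export end up in the release, and the feature branch now has the fix as well.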

Wrapping up

Okay, whew. That turned out to be a relatively long post, and now that I’ve written it all, I’m not at all sure it succeeds in making all this totally clear for you. Hopefully, though, it has given you some new insights and made things at least a little clearer.

Easy branching makes for such a radical change in how we (can) use version control that getting it working right for you is guaranteed to boost your productivity immensely.

* This isn’t entirely true. It’s good practice to merge back changes, called reintegration merges, frequently. But that’s a topic for another time.