GIT For Basics

Editor's Note: Recently, I asked my colleague, Dilip, to give me an article describing his own experience learning a CM tool that I was not familiar with myself. The best part of writing for CM Crossroads is that we get reports from people who actually know and understand how these products work in the real world. Please take a look at Dilip's excellent article and get ready to share your best practices and experience next!

I have always been curious about open source tools. Recently, I was googling for CM tools and found a source code revision tool created by Linus Torvalds called 'GIT'.


Initially, this seemed like just one more SCM tool on the market! I began to wonder how GIT differs from other CM tools, and how I could benefit from using it (instead of the others that I was more familiar with).

GIT is basically a lightweight source code revision (version) tracking tool (VCS). It can be used to track anything that you store on your hard disk.

I am not claiming to be an expert in every CM tool on the market, but, in my opinion, GIT is worth a try. Its design is strong compared with many other tools. The real question that I was trying to answer was, "Will this tool suit our needs?"

GIT worked with all the file names and file types I tried, kept the executable permission that I set on files, and showed me the differences between versions on request. Branching in GIT is very efficient. Let's take a look at some basic functions.

What's interesting in GIT is its design. GIT efficiently tracks the contents of files (and not the files themselves)! (It took me some time to think in this way.) What I needed to understand, initially, is that GIT does not build each version from a chain of stored deltas between two elements; every commit refers to a complete snapshot of the content. It also records very little about permissions (only whether a file is executable). (Of course, it supports parallel development, as many other tools do.)

One more thing that I liked is that GIT is designed to be distributed. That means that you get all the benefits of a distributed source code revision tool. The advantage of distributed development is that you don't need to stay in sync with a central server in order to work. GIT keeps a complete copy of the project's files and history on your local disk. If a colleague has a fix that you need to verify or apply to your feature, you can just pull it from their repo.

Another benefit is that the product is not dependent upon a main server (which would be a single point of failure). You do use the main repo to "clone" from, but its absence does not stop you from working or committing changes. You continue working (committing), and when the server is finally up, you can push your changes. Even if the server doesn't come up at all, you can set up a new one from the cloned copy of your repo. This is a great advantage of using distributed development systems!
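
As a rough illustration of that last point, the sketch below uses a bare repository in a temporary directory to stand in for the main server, deletes it, keeps committing, and then rebuilds a server from the clone. All of the paths and names (server.git, alice, new-server.git, notes.txt) are made up for this example, and it assumes git is installed.

```shell
#!/bin/sh
# Sketch: keep committing while the "server" is gone, then rebuild it from a clone.
# Every path and name here (server.git, alice, new-server.git) is hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# A bare repository in a temp directory stands in for the main server.
git init --quiet --bare server.git

# Clone it, commit, and publish the first change.
git clone --quiet server.git alice
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes"
git push --quiet origin HEAD

# Simulate losing the server entirely.
rm -rf "$work/server.git"

# Committing still works, because the history lives in the local clone.
echo "second line" >> notes.txt
git commit --quiet -am "Work while the server is down"

# Rebuild a server from the clone and point the clone at it.
git clone --quiet --bare "$work/alice" "$work/new-server.git"
git remote set-url origin "$work/new-server.git"
git push --quiet origin HEAD

# Count the commits that the replacement server holds.
commits=$(git --git-dir="$work/new-server.git" rev-list --count HEAD)
echo "commits on new server: $commits"
```

The replacement server ends up with both commits, including the one made while no server existed.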

Quick basics
Given a file, GIT generates a hash reference code based on the contents of the file. It uses the SHA-1 cryptographic hash algorithm (a plain hash, with no secret key involved) to generate the reference code. This code serves as the key to the contents of the file, something like an index referencing an element of an array. If you update the contents of a file or directory, GIT will generate a different SHA-1 reference code.

Another interesting thing is that check-ins (called "commits" in GIT) are local to your cloned copy of the repo. Local check-ins have their own pros and cons. The advantage is that committing is much faster. The disadvantage is that if you don't submit (integrate) changes often to the central server (where the repo is hosted), you'll have a tough time during integration. But that can be managed easily: pull the latest code onto a local branch, merge the changes there, resolve any conflicts locally, and then just push!
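
The content-based reference code is easy to try out. The sketch below asks GIT for the ID of a small piece of content, recomputes it by hand, and then changes one character to show that the ID changes. It assumes that git and the coreutils sha1sum command are installed; the sample content is made up for this example.

```shell
#!/bin/sh
# Sketch: the ID GIT gives a file's contents is a SHA-1 over a short header
# ("blob <size in bytes>", then a NUL byte) followed by the contents themselves.
# Assumes git and the coreutils sha1sum command are installed.
set -e

# Ask GIT for the ID of the 6-byte content "hello\n" (no repository is needed).
id_git=$(printf 'hello\n' | git hash-object --stdin)

# Compute the same ID by hand from the header plus the content.
id_manual=$(printf 'blob 6\0hello\n' | sha1sum | cut -d ' ' -f 1)

# Change one character and the ID changes too.
id_changed=$(printf 'hello!\n' | git hash-object --stdin)

echo "git:     $id_git"
echo "manual:  $id_manual"
echo "changed: $id_changed"
```

The first two lines of output should match each other, and the third should differ, because the ID depends only on the content and not on any file name.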

Learning GIT
Learning GIT was not very easy for me until I stopped comparing it with other SCM tools. The screencasts all over the web are very impressive! This is a new tool, and there are many people out there using it and helping others to learn it (through online screencasts and docs). I think that this product is really cool. I suggest that you look at resources such as 'The GIT community book' and Gitcasts.

You can find many interesting references to screencasts, cheat sheets, and other material on the page maintained by Scott Chacon.

As I said earlier, GIT is a distributed source code revision tool. Whoever clones the code (from the main repo) will have an entire copy of the project history. This distributed model best suits the open source world, but GIT can be set up as a centralized system too. Any organization that wants to take advantage of GIT can set up a repo on a central server and ask its development team to pull from it. As I mentioned earlier, since the commits are local, we can still continue to work even if the central server is down for maintenance :)
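
A centralized setup like this can be sketched with two clones of one shared repo. In the example below, a bare repository in a temporary directory plays the central server, and two made-up developers (Alice and Bob) commit locally before integrating. The names, files, and paths are all hypothetical, and it assumes git is installed.

```shell
#!/bin/sh
# Sketch: two developers share one central repo but commit locally.
# All names (central.git, alice, bob, and the file names) are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# The "central server" is just a bare repository in a temp directory.
git init --quiet --bare central.git

# Alice clones, commits, and publishes the starting point.
git clone --quiet central.git alice
git -C alice config user.name "Alice Example"
git -C alice config user.email "alice@example.com"
echo "shared project" > alice/README.txt
git -C alice add README.txt
git -C alice commit --quiet -m "Initial commit"
git -C alice push --quiet origin HEAD

# Bob clones the published history.
git clone --quiet central.git bob
git -C bob config user.name "Bob Example"
git -C bob config user.email "bob@example.com"
branch=$(git -C bob symbolic-ref --short HEAD)

# Both commit locally; no server contact is needed for this step.
echo "alice's work" > alice/a.txt
git -C alice add a.txt
git -C alice commit --quiet -m "Alice's change"
echo "bob's work" > bob/b.txt
git -C bob add b.txt
git -C bob commit --quiet -m "Bob's change"

# Alice pushes first, so Bob must integrate before he can push.
git -C alice push --quiet origin HEAD
git -C bob fetch --quiet origin
git -C bob merge --quiet --no-edit "origin/$branch"
git -C bob push --quiet origin HEAD

# Count the commits that reached the central repo.
commits=$(git --git-dir=central.git rev-list --count HEAD)
echo "commits on central: $commits"
```

Bob's merge happens entirely in his own clone; the central repo only receives the finished result, which here is the initial commit, one commit from each developer, and the merge.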

The best thing I would suggest is to create a free public repo on GitHub and get some hands-on experience with your own live repositories. You need to know only a single command to start working with it: git clone.

Example: To copy/clone any GIT repo, use the command below. This will put the entire repo onto your local disk.

git clone git://git.kernel.org/pub/scm/git/git.git

If you are behind a firewall or accessing the web via a proxy server, you can tunnel the connection over HTTPS. Look up how to access repos at GitHub through an HTTP proxy.

Okay, you have a copy now. You have spent days on it and made some interesting changes, and now you want someone to take a look at it. Just send a pull request from your repo.

Once your repo is published to the net, the entire world can collaborate on your project.

What next?
Because GIT is open source, many open source projects are adopting it. Many projects are being moved from SVN to GIT. GIT works on many operating systems, such as Windows, Mac, Red Hat, and Ubuntu. This is an added advantage too. I won't be surprised if GIT easily replaces a few of the other SCM tools out there.

Now, there are a few things that GIT won't do. It doesn't record renames explicitly (it infers them from similar content), and the documentation is not so friendly. Many times it is confusing. Overall, I loved learning this new product and I am looking forward to using it in my work!

CMCrossroads is a TechWell community.
