gorenje/svngit.textile

## svngit.textile

      
Introduction

This post attempts to give Subversion developers a new perspective on Git and on how Git differs from Subversion, without leaning on the usual “Git is distributed development” or “Git is peer-to-peer version management” descriptions, which tend not to provide an argument for an existing Subversion project to switch to Git.

Instead, I will attempt to provide a historical background to the development of the first versioning tools and how these led to the development of Git. It is more than probable that certain historical events mentioned here are completely and utterly wrong; this is not intended. Corrections and improvements are very welcome!

TL;DR: this entire post can be summed up by my personal mantra: “Git is a patch management tool”, not a software versioning system (although it can do that too).

Subversion v. Git

SVN and Git do the same things; they have the same purpose in life. Their purpose is to manage software changes so that it is easy to see who did what, what changed when, and which change potentially broke something. SVN and Git differ in how they do this. They differ because they have different views on how to achieve the goal, but also because they came from two different development processes.

Subversion takes the point of view that each change is somehow related to the previous change, i.e. revision 23 has an implicit dependency on the existence of revision 22, and so on. The reason for this is clear: a change can only be made to a particular state of the code base.

Git takes the view that each change is isolated and stands on its own two feet. Hence the revision identifiers in Git are not linear; rather, they appear to be random (which they’re not: each one is a hash computed from the content it names). More precisely, Git is naming the patches so that they can be referenced later. So all that Git is doing is managing a bunch of patches that can be applied to anything.
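
To make the “random but not really” point concrete, here is a minimal sketch of how Git derives a name for a piece of content. It assumes the default SHA-1 object format; the function name git_object_id is my own and not part of any Git tooling.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Return the SHA-1 name Git gives to an object of this kind and body."""
    # Git hashes a short header ("<kind> <size>" plus a NUL byte) followed
    # by the raw body, so the name depends only on the content itself.
    header = f"{kind} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


if __name__ == "__main__":
    first = git_object_id("blob", b"hello\n")
    again = git_object_id("blob", b"hello\n")
    changed = git_object_id("blob", b"hello!\n")
    print(first)
    print(first == again)    # identical content, identical name
    print(first == changed)  # one extra character, a different name
```

The same scheme is used for commits (with kind "commit"), which is why a commit identifier looks random yet can be recomputed by anyone holding the same content.
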
Subversion does not explicitly take this view; however, each revision is, essentially, also a patch. Unfortunately, SVN does not provide tools that allow the extra flexibility of patches. Git does, with explicit support for patching (e.g. git cherry-pick, git am, git apply).

What are patches?

Back in the early days of the open source community, when a developer found a bug in a piece of open source software, they would create a fix, make a patch and email that patch to the maintainer. If the maintainer had not done any further development, they could apply the patch to the code base and the world was rosy again.

If, on the other hand, the maintainer had done more development and made changes that made applying the patch non-trivial, the original contributor had to update their code base and redo their patch (unless the maintainer was feeling particularly benevolent and did that for the contributor).
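
The two situations above can be sketched with Python's difflib, which produces the same unified diff format that diff -u does. This is only an illustration: the file name greet.py, the sample code and the helper names (make_patch, expected_old_lines, applies_cleanly) are made up, and the check is far more naive than what the real patch command does (it handles a single hunk and no fuzzy matching).

```python
import difflib


def make_patch(old_lines, new_lines, filename="greet.py"):
    """Return a unified diff, the text a contributor would email, as one string."""
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=f"a/{filename}", tofile=f"b/{filename}"
    )
    return "".join(diff)


def expected_old_lines(patch_text):
    """Collect the lines a single-hunk patch expects to find in the target file."""
    expected = []
    for line in patch_text.splitlines(keepends=True)[2:]:  # skip the file headers
        if line.startswith("@@"):
            continue  # hunk header, carries only line numbers
        if line.startswith((" ", "-")):
            expected.append(line[1:])  # context or removed line, marker stripped
    return expected


def applies_cleanly(patch_text, target_lines):
    """Naive check: do the expected old lines appear as one contiguous run?"""
    expected = expected_old_lines(patch_text)
    span = len(expected)
    return any(
        target_lines[i:i + span] == expected
        for i in range(len(target_lines) - span + 1)
    )


original = ["def greet(name):\n", "    print('Hello ' + name)\n"]
contributor_fix = ["def greet(name):\n", "    print('Hello, ' + name + '!')\n"]
maintainer_moved_on = [
    "def greet(name, greeting='Hello'):\n",
    "    print(greeting + ' ' + name)\n",
]

patch_text = make_patch(original, contributor_fix)
print(patch_text)
print(applies_cleanly(patch_text, original))             # rosy world
print(applies_cleanly(patch_text, maintainer_moved_on))  # contributor must redo it
```

The patch carries its own context (the unchanged and removed lines), which is exactly why it stops applying once the maintainer has rewritten those lines.
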
Patches became the number one way of getting your changes, fixes and improvements into an open source project, and they became a type of change management. However, they became impractical for larger projects, particularly ones with many contributors and few maintainers. Out of this, RCS (Revision Control System) was born.

RCS had the intention of “marking” the state of a code base so that patches could be applied more easily; if a patch was made from a particular revision of a file, then it could be applied to that exact revision. Originally, RCS was just a bunch of scripts around the patch command to make patch management painless. RCS was also document based, meaning it managed documents individually and not an entire project.

What went wrong?

(For the sake of brevity, branching and merging have been ignored.)

Eventually CVS (Concurrent Versions System) was born. CVS is the original server-client architecture of software versioning and allowed multiple developers to work on one code base without breaking too much! CVS maintained a central repository of the code base and the incremental changes made to it. It also began the management of a project in its entirety (with the use of tags) and not as individual files. This led to the idea of a revision representing the state of the entire project, and of each individual file in that project, at a particular point in time.

Eventually Subversion (and many other VCSs) came along, which made the concept of a project-wide revision concrete: even if just one file changed, the entire project was bumped up a revision. This made a revision very static and created an implicit dependency on the previous state of the entire project.
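
The difference between per-file and project-wide revisions can be shown with a toy model. Both classes below are hypothetical bookkeeping sketches of my own, not how RCS, CVS or Subversion actually store anything (real CVS revision numbers look like 1.1, 1.2 and so on).

```python
class PerFileHistory:
    """RCS/CVS-style bookkeeping: every file carries its own revision counter."""

    def __init__(self):
        self.revisions = {}

    def commit(self, path):
        # Only the file that changed moves forward.
        self.revisions[path] = self.revisions.get(path, 0) + 1
        return self.revisions[path]


class ProjectWideHistory:
    """Subversion-style bookkeeping: one counter for the whole project."""

    def __init__(self):
        self.revision = 0
        self.last_changed = {}

    def commit(self, *paths):
        # Any commit, however small, bumps the single project revision.
        self.revision += 1
        for path in paths:
            self.last_changed[path] = self.revision
        return self.revision


per_file = PerFileHistory()
project = ProjectWideHistory()
for path in ["main.c", "util.c", "main.c"]:
    per_file.commit(path)
    project.commit(path)

print(per_file.revisions)    # each file counts its own changes
print(project.revision)      # the project counts every change
print(project.last_changed)  # which project revision last touched each file
```

In the project-wide model the third commit only touches main.c, yet util.c now also lives in “revision 3” of the project, which is the implicit dependency described above.
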
Subversion also maintained the centralized, one-world view of software versioning. Of course, this type of architecture is very important if each revision depends on the previous one: there is basically no easy way of maintaining a linear list of software revisions across multiple servers.

All this led to centralized and linear thinking in the software development process. However, it did open the way for community development and a larger contributor base for open source projects.

CDD – Community Driven Development

Open source projects began using Subversion for managing their code base, and that was great. It allowed a group of developers to work concurrently and independently, without the risk of breaking or overwriting existing changes.

Eventually, large open source projects still ended up maintaining changes and fixes via patches, because SVN did not easily allow contributors to provide changes. Although branching was a possibility, it required the contributor to have commit privileges to the main repository, including “trunk” (trunk generally being the branch that would end up being released).

So now a group of maintainers managed the stream of patches coming in from contributors without commit privileges. This state of affairs introduced the concept of forking projects: taking the code base and creating a new repo with the changes made by the contributor. This was a particular issue for projects where the maintainers did not have enough time to apply patches or simply ignored them. One big driving force behind this movement was (and still is) SourceForge.net, the first (noteworthy) Subversion repository hoster.

Thinking patches

Git came out of the requirements for software maintenance of open source projects with core maintainers, reviewers and contributors. For example, the Linux kernel has a bunch of core maintainers who can commit, individual teams for specific parts of the kernel (these act, in part, as reviewers who submit patches to the core committers) and contributors who have fixes, improvements and features that they would like to see in the kernel.

Again, this would all be patch based. Since the Linux kernel is a modular piece of software, a patch for a particular driver could be applied regardless of what happened to other parts of the kernel. Hence there is no particular need to maintain one central revision off which patches need to be made.
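
The idea that a driver patch applies regardless of changes elsewhere can be sketched with a toy model. Everything here is an assumption made for illustration: the Patch class, the apply_patch function and the file names are hypothetical, and a real patch matches on surrounding context lines rather than on whole-file content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """A toy patch: replace the exact content `old` of one file with `new`."""
    path: str
    old: str
    new: str


def apply_patch(tree, patch):
    """Return a copy of `tree` with the patch applied; fail if the file differs."""
    if tree.get(patch.path) != patch.old:
        raise ValueError(f"{patch.path} no longer matches what the patch expects")
    updated = dict(tree)
    updated[patch.path] = patch.new
    return updated


base = {"drivers/net.c": "net v1", "drivers/usb.c": "usb v1"}
net_fix = Patch("drivers/net.c", old="net v1", new="net v1 + fix")

# The mainline moved on, but only in an unrelated driver: the patch still fits.
mainline = {**base, "drivers/usb.c": "usb v2"}
patched = apply_patch(mainline, net_fix)
print(patched)

# Someone else changed the same driver: now the patch has to be redone.
conflicting = {**base, "drivers/net.c": "net v2"}
try:
    apply_patch(conflicting, net_fix)
except ValueError as error:
    print("rejected:", error)
```

The patch only cares about the file it touches, so there is no need for the contributor and the maintainer to agree on one global revision number first.
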
Out of this, Git was born. Git is basically a distributed patch management service and not a software versioning system. Hence thinking of each commit as a patch makes working with Git easier. Branches are just collections of patches, patches may be merged into a single patch, a branch may be checked out, and everything can be undone (in the simplest case, by returning to the main line with git checkout master).

Git even explicitly supports patches by allowing for their creation (git format-patch) and application (git am and git apply). In fact, Git does little to interfere with any existing development process one might have: it still supports a centralized development process (although it does not provide an easy way of having a linear versioning of the code).

Conclusion

Some things in life never change, and patching code has been part of the software development process since the dawn of the epoch. Diff and patch were the basis for many a good piece of software, and both are still with us.

Git builds on this and provides a tool that has become essential to community driven development, providing versioning, patching, branching and merging … and undo!