One of the advantages that Git has over Subversion and CVS is its use of the
index as a staging area, which turns out to be a much more flexible model than
Subversion’s. One of the things that always annoyed me about Subversion was
that there seemed to be no elegant way to commit only some of your changes to a
particular tracked file. Subversion deals only in files in the working copy,
and if you want to commit changes to a file, you have to commit all the changes
in that file, even if they’re not related.
Where Subversion falls short
As an example, suppose you’re making changes to a working copy of a Subversion
repository called myproject, and you’ve made a few changes to the main file,
myproject.php. On one line, you’ve fixed a bug caused by getting the parameters
for htmlentities() in the wrong order. On another, near the head of the file,
you’ve changed a php.ini setting to allow the script to run for a long time.
Here’s what the output of svn status and svn diff might look like in this case:
$ svn status
M myproject.php
$ svn diff
Index: myproject.php
================================================================
--- myproject.php (revision 2)
+++ myproject.php (working copy)
@@ -1,5 +1,7 @@
<?php
+ini_set("max_execution_time", 300);
+
/**
* Open main class.
*/
@@ -120,7 +122,7 @@
public function dumpvalue($value)
{
- print htmlentities($value, "UTF-8", ENT_COMPAT);
+ print htmlentities($value, ENT_COMPAT, "UTF-8");
}
Under Subversion, unless you move files around, you can’t commit only one of
these changes; you need to commit both. This isn’t really the end of the world,
since you could include a commit message describing both things you changed:
$ svn commit -m "Allowed longer runtime, fixed parameter order bug"
Transmitting file data .
Committed revision 3.
But if you’re finicky like me, and you’d prefer to think of commits as grouping
semantically related changes as much as possible, it would be much better to be
able to commit these two changes separately, and this is where Git’s use of an
index shines.
Git’s method
Let’s work with the same project again, but this time as a Git repository.
We’ll make the same changes again, and view the output of git status and
git diff:
$ git status
# On branch master
# Changes not staged for commit:
#
# modified: myproject.php
#
no changes added to commit
$ git diff
diff --git a/myproject.php b/myproject.php
index 7c20f21..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
<?php
+ini_set("max_execution_time", 300);
+
/**
* Open main class.
*/
@@ -120,7 +122,7 @@ class MyProject
public function dumpvalue($value)
{
- print htmlentities($value, "UTF-8", ENT_COMPAT);
+ print htmlentities($value, ENT_COMPAT, "UTF-8");
}
So far, so good. Now when we run git add myproject.php to stage the changes in
the index ready for commit, by default it works at the same granularity as
Subversion, putting all of the changes in that file into the staging area.
That’s probably fine in most cases, but today we want to commit one change, and
then the other. The most basic way to do this is using Git’s --patch option.
The --patch option can be added to git add, and to some other Git commands
concerned with manipulating the index as well. It prompts you explicitly about
whether or not to stage each changed section of the file, which it terms a
hunk. In our case, the process of including only the first change would look
something like this:
$ git add --patch myproject.php
diff --git a/myproject.php b/myproject.php
index 7c20f21..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
<?php
+ini_set("max_execution_time", 300);
+
/**
* Open main class.
*/
Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]? y
@@ -120,7 +122,7 @@ class MyProject
public function dumpvalue($value)
{
- print htmlentities($value, "UTF-8", ENT_COMPAT);
+ print htmlentities($value, ENT_COMPAT, "UTF-8");
}
Stage this hunk [y,n,q,a,d,/,K,g,e,?]? n
With the first hunk staged, if you compare the output of git diff --staged and
git diff, you’ll notice that there are changes staged ready for commit in the
file, and also changes that are not staged that we can commit separately later:
$ git diff --staged
diff --git a/myproject.php b/myproject.php
index 7c20f21..4bb2362 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
<?php
+ini_set("max_execution_time", 300);
+
/**
* Open main class.
*/
$ git diff
diff --git a/myproject.php b/myproject.php
index 4bb2362..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -122,7 +122,7 @@ class MyProject
public function dumpvalue($value)
{
- print htmlentities($value, "UTF-8", ENT_COMPAT);
+ print htmlentities($value, ENT_COMPAT, "UTF-8");
}
So your staging area is all ready with just that one change in it, and all you
need to do is type git commit with an appropriate message:
$ git commit -m "Allowed longer runtime"
[master 19d9068] Allowed longer runtime
1 files changed, 2 insertions(+), 0 deletions(-)
And the other change you made is still there, waiting to be staged and
committed whenever you see fit:
$ git diff
diff --git a/myproject.php b/myproject.php
index 4bb2362..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -122,7 +122,7 @@ class MyProject
public function dumpvalue($value)
{
- print htmlentities($value, "UTF-8", ENT_COMPAT);
+ print htmlentities($value, ENT_COMPAT, "UTF-8");
}
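Finishing the job is then just an ordinary add and commit. As a rough sketch,
assuming a scratch repository and a cut-down stand-in for myproject.php (both
invented for illustration, as is the second commit message), it might go like
this:

```shell
#!/bin/sh
set -eu

# Throwaway repository standing in for the project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# State after the first commit: ini_set() is in, the bug is not yet fixed.
cat > myproject.php <<'EOF'
<?php
ini_set("max_execution_time", 300);

print htmlentities($value, "UTF-8", ENT_COMPAT);
EOF
git add myproject.php
git commit -q -m "Allowed longer runtime"

# The leftover working-copy change: corrected parameter order.
cat > myproject.php <<'EOF'
<?php
ini_set("max_execution_time", 300);

print htmlentities($value, ENT_COMPAT, "UTF-8");
EOF

# Only one change remains, so a plain git add stages exactly that.
git add myproject.php
git commit -q -m "Fixed parameter order bug"

# Two commits, each carrying one logical change.
git --no-pager log --oneline
```

The history ends up with one commit per change, which is exactly what the
single Subversion commit earlier couldn’t give us.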
Other methods
Because Git’s index can be manipulated very easily with its lower-level tools,
you can treat the differences between your changes and the index like any other
diff task. This means more advanced tools like Fugitive for Vim can be even
better for seeing changesets in individual files as you stage them for commit.
Check out Drew Neil’s Vimcast series on Fugitive if you’re interested in doing
this; it’s quite an in-depth series of videos, but very much worth watching if
you’re a Vim user who wants to understand and use Git to its fullest, and you
really value precision and clarity in your commits.
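To make the “any other diff task” idea a little more concrete, here is one way
it could look without an editor plugin: save the output of git diff as a patch,
trim it with ordinary text tools, and apply the result to the index alone with
git apply --cached. This is my own sketch of that approach and not something
Fugitive does for you; the scratch repository, the write_file helper, and the
awk one-liner that keeps only the first hunk are all assumptions made for the
example.

```shell
#!/bin/sh
set -eu

# Throwaway repository; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper that writes a stand-in for myproject.php.
#   $1: optional line placed after "<?php"
#   $2: the htmlentities() call at the end of the file
write_file() {
    {
        printf '<?php\n'
        if [ -n "$1" ]; then printf '%s\n\n' "$1"; fi
        printf '/**\n * Open main class.\n */\n'
        # Filler keeps the two edits far enough apart to form separate hunks.
        i=1
        while [ "$i" -le 20 ]; do
            printf '// filler line %s\n' "$i"
            i=$((i + 1))
        done
        printf '%s\n' "$2"
    } > myproject.php
}

write_file '' 'print htmlentities($value, "UTF-8", ENT_COMPAT);'
git add myproject.php
git commit -q -m "Initial import"

write_file 'ini_set("max_execution_time", 300);' \
    'print htmlentities($value, ENT_COMPAT, "UTF-8");'

# Save the working-copy changes as an ordinary patch, outside the repository.
full_patch=$(mktemp)
first_hunk=$(mktemp)
git diff --no-color > "$full_patch"

# Keep the file header and the first hunk; drop everything from the
# second "@@" line onwards.
awk '/^@@/ { n++ } n < 2' "$full_patch" > "$first_hunk"

# Apply the trimmed patch to the index only; the working copy is untouched.
git apply --cached "$first_hunk"

git --no-pager diff --staged
git --no-pager diff
```

The end state is the same as answering y and n at the --patch prompts: the
ini_set() change is staged and the htmlentities() fix is still waiting in the
working copy.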