Committing part of a file

One of the advantages that Git has over Subversion and CVS is the use of its index as a staging area, which turns out to be a much more flexible model than Subversion. One of the things that always annoyed me about Subversion was that there seemed to be no elegant way to only commit only some of your changes to a particular tracked file. Subversion deals only in files in the working copy, and if you want to commit changes to a file, you have to commit all the changes in that file, even if they’re not related.

Where Subversion falls short

As an example, suppose you’re making changes to a working copy of a Subversion repository called myproject, and you’ve made a few changes to the main file, myproject.php; on one line, you’ve fixed a bug caused by getting the parameters for htmlentities() in the wrong order. On another, near the head of the file, you’ve changed a php.ini setting to allow the script to run for a long time. Here’s what the output of svn status and svn diff might look like in this case:

$ svn status
M myproject.php

$ svn diff
Index: myproject.php
================================================================
--- myproject.php (revision 2)
+++ myproject.php (working copy)
@@ -1,5 +1,7 @@
 <?php
+ini_set("max_execution_time", 300);
+
 /**
  * Open main class.
  */
 @@ -120,7 +122,7 @@
 public function dumpvalue($value)
 {
-    print htmlentities($value, "UTF-8", ENT_COMPAT);
+    print htmlentities($value, ENT_COMPAT, "UTF-8");
 }

Under Subversion, unless you move files around, you can’t commit only one of these changes; you need to commit both. This isn’t really the end of the world, since you could include a commit message describing both things you changed:

$ svn commit -m "Allowed longer runtime, fixed parameter order bug"
Transmitting file data .
Committed revision 3.

But if you’re finicky like me, and you’d prefer to think of commits as grouping semantically related changes as much as possible, it would be much better to be able to commit these two changes separately, and this is where Git’s use of an index shines.

Git’s method

Let’s work with the same project again, but this time as a Git repository. We’ll make the same changes again, and view the output of git status and git diff:

$ git status
# On branch master
# Changes not staged for commit:
#
# modified: myproject.php
#
no changes added to commit

$ git diff
diff --git a/myproject.php b/myproject.php
index 7c20f21..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
 <?php
+ini_set("max_execution_time", 300);
+
 /**
 * Open main class.
 */
 @@ -120,7 +122,7 @@ class MyProject
 public function dumpvalue($value)
 {
-    print htmlentities($value, "UTF-8", ENT_COMPAT);
+    print htmlentities($value, ENT_COMPAT, "UTF-8");
 }

So far, so good. Now when we run git add myproject.php to stage the changes in the index ready for commit, by default it does the same thing Subversion does, putting all of the changes in that file into the staging area. That’s probably fine in most cases, but today we want to commit one change, and then the other. The most basic way to do this is using Git’s --patch option.

The --patch option can be added to git add, and to some other Git commands concerned with manipulating the index as well, to explicitly prompt you about staging or not staging different sections of the file, that it terms hunks. In our case, the process of including only the first change would look something like this:

$ git add --patch myproject.php
diff --git a/myproject.php b/myproject.php
index 7c20f21..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
 <?php
+ini_set("max_execution_time", 300);
+
 /**
 * Open main class.
 */
 Stage this hunk [y,n,q,a,d,/,j,J,g,e,?]? y
 @@ -120,7 +122,7 @@ class MyProject
 public function dumpvalue($value)
 {
-    print htmlentities($value, "UTF-8", ENT_COMPAT);
+    print htmlentities($value, ENT_COMPAT, "UTF-8");
 }
Stage this hunk [y,n,q,a,d,/,K,g,e,?]? n

This done, if you compare the output of git diff --staged and git diff, you’ll notice that there are changes staged ready for commit in the file, and also changes that are not staged that we can commit separately later:

$ git diff --staged
diff --git a/myproject.php b/myproject.php
index 7c20f21..4bb2362 100644
--- a/myproject.php
+++ b/myproject.php
@@ -1,5 +1,7 @@
 <?php
+ini_set("max_execution_time", 300);
+
 /**
 * Open main class.
 */

$ git diff
diff --git a/myproject.php b/myproject.php
index 4bb2362..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -122,7 +122,7 @@ class MyProject
 public function dumpvalue($value)
 {
-    print htmlentities($value, "UTF-8", ENT_COMPAT);
+    print htmlentities($value, ENT_COMPAT, "UTF-8");
 }

So your staging area is all ready with just that one change in it, and all you need to do is type git commit with an appropriate message:

$ git commit -m "Allowed longer runtime"
[master 19d9068] Allowed longer runtime
1 files changed, 2 insertions(+), 0 deletions(-)

And the other change you made is still there, waiting to be staged and committed whenever you see fit:

$ git diff
diff --git a/myproject.php b/myproject.php
index 4bb2362..c149190 100644
--- a/myproject.php
+++ b/myproject.php
@@ -122,7 +122,7 @@ class MyProject
 public function dumpvalue($value)
 {
-    print htmlentities($value, "UTF-8", ENT_COMPAT);
+    print htmlentities($value, ENT_COMPAT, "UTF-8");
 }

Other methods

Because Git’s index can be manipulated with its lower-level tools very easily, you can treat the differences between your changes and the index like any other diff task. This means more advanced tools like Fugitive for Vim can be even better for seeing changesets in individual files as you stage them for commit. Check out Drew Neil’s Vimcast series on Fugitive if you’re interested in doing this; it’s quite an in-depth series of videos, but very much worth watching if you’re a Vim user who wants to understand and use Git to its fullest, and you really value precision and clarity in your commits.