Debug with Git

Testing shows the presence, not the absence of bugs.
Dijkstra

A software regression is one of the nastier situations in the development process. It usually means that the latest delivery broke something that used to work. To find the cause, the whole release has to be analyzed: a developer writes a test, rolls back the changes, runs the test, and … the bug is still there. One more step back in the VCS history, and the error is still reproducible. At this point the bug earns an additional label: “legacy”.

It turns out that the functionality has not been used for a while, so the bug may have been introduced not with the last commit or two but quite some time ago. If the codebase is big enough, finding the exact change that introduced the bug can take a significant amount of time.

In practice, this search can be automated. Below is an example of how to do it in a Git repository.

Bug

The following example illustrates how easily an error can slip into the software.

Some financial software is in development. At some point a new task arrives. The requirements are:

implement a net salary calculation service;

service has two inputs: gross salary and tax;

gross salary – may be quite big;

tax – is in the range between 1 and 100;

service has one output: net salary;

net salary – must be calculated to two decimal places.

The full implementation history is available in the linked repository;¹ the key stages are described below.

Stage #1

To cover all the requirements, the service method signature may look like this:

public BigDecimal getNetSalaryInCurrency(BigDecimal grossSalaryInCurrency, 
                                         int taxInPercent) {
    return null;
}

So far it has no real body; this is done on purpose, to follow TDD.

Stage #2

The implementation is fairly straightforward:

public BigDecimal getNetSalaryInCurrency(BigDecimal grossSalaryInCurrency, 
                                         int taxInPercent) {

    BigDecimal taxInCurrency = grossSalaryInCurrency
            .multiply(BigDecimal.valueOf(taxInPercent))
            .divide(BigDecimal.valueOf(100), RoundingMode.HALF_UP);

    return grossSalaryInCurrency
            .subtract(taxInCurrency)
            .setScale(2, RoundingMode.HALF_UP);
}

Operations on big numbers may require some extra effort to get the proper result, but in this case all that is needed is proper rounding. The mode chosen is “HALF_UP”, which looks suitable in this situation.²

The tests cover both simple and more complex cases:
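
One detail is worth spelling out: where the rounding actually happens. BigDecimal.divide(BigDecimal, RoundingMode) returns a result with the scale of the dividend, so for a gross salary given with two decimal places the tax amount is already rounded to cents before the subtraction. The sketch below only illustrates that behavior; the class name DivideScaleDemo and the helper taxInCurrency are made up for this example and are not part of the sample project.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class DivideScaleDemo {

    // Same expression as in the service method: the quotient keeps the
    // scale of the dividend (gross * percent), and the rounding mode
    // decides how the digits beyond that scale are dropped.
    static BigDecimal taxInCurrency(BigDecimal gross, int taxInPercent) {
        return gross
                .multiply(BigDecimal.valueOf(taxInPercent))
                .divide(BigDecimal.valueOf(100), RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // 123.45 * 5 = 617.25 (scale 2); 617.25 / 100 = 6.1725,
        // which is rounded back to scale 2.
        BigDecimal tax = taxInCurrency(new BigDecimal("123.45"), 5);
        System.out.println(tax);         // 6.17
        System.out.println(tax.scale()); // 2
    }
}
```

The later setScale(2, ...) call therefore changes nothing for such inputs: the decisive rounding is the one inside divide.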

@ParameterizedTest
@CsvSource({
        "100.00, 10, 90.00",
        "100.00, 3, 97.00",
        "123.45, 5, 117.28"

})
public void testNetSalaryCalculation(
        BigDecimal grossSalaryInCurrency, 
        int taxInPercent, 
        BigDecimal expectedNetSalaryInCurrency) {

    BigDecimal actualNetSalaryInCurrency = taxCalculationService
            .getNetSalaryInCurrency(grossSalaryInCurrency, taxInPercent);

    Assertions.assertEquals(
            expectedNetSalaryInCurrency, actualNetSalaryInCurrency);
}

Stage #3

Now the code is working, so it is time to add some documentation, refactor, and release it.

Stage #4

At some point the customer realizes that real-world financial calculations quite often use another rounding mode: banker's rounding. In Java this mode is called “HALF_EVEN”, and it looks more suitable for this scenario.³

The change is small: only the rounding mode. After the change all the tests are still green. That may look suspicious, but nothing appears to be wrong.

Spoiler: this is actually where the bug was introduced, so in the commit history it is marked with the [Silent BUG] label.
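
Why the tests stayed green can be checked with a small side-by-side comparison. The sketch below is an illustration, not code from the sample project: the class RoundingModeComparison and the helper netSalary are hypothetical, and it assumes the change replaced the rounding mode in both places of the Stage #2 formula. The two modes differ only when the discarded part is exactly half of the last kept digit, and none of the three test inputs produces such a tie.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class RoundingModeComparison {

    // Hypothetical helper: the Stage #2 formula with the rounding mode
    // passed in, so that both variants can be compared.
    static BigDecimal netSalary(BigDecimal gross, int taxInPercent, RoundingMode mode) {
        BigDecimal tax = gross
                .multiply(BigDecimal.valueOf(taxInPercent))
                .divide(BigDecimal.valueOf(100), mode);
        return gross.subtract(tax).setScale(2, mode);
    }

    public static void main(String[] args) {
        // The modes disagree only on exact ties.
        System.out.println(new BigDecimal("0.125").setScale(2, RoundingMode.HALF_UP));   // 0.13
        System.out.println(new BigDecimal("0.125").setScale(2, RoundingMode.HALF_EVEN)); // 0.12

        // Inputs of the existing parameterized test: no ties, same results.
        String[][] cases = {{"100.00", "10"}, {"100.00", "3"}, {"123.45", "5"}};
        for (String[] c : cases) {
            BigDecimal gross = new BigDecimal(c[0]);
            int tax = Integer.parseInt(c[1]);
            System.out.println(c[0] + ", " + c[1] + " -> "
                    + netSalary(gross, tax, RoundingMode.HALF_UP) + " / "
                    + netSalary(gross, tax, RoundingMode.HALF_EVEN));
        }
    }
}
```

A test input whose tax amount lands exactly on half a cent would have caught the change right away.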

Stage #5

Another round of minor changes, refactoring, and a release.

Stage #6

A critical bug appears in production.

In some cases the result of the calculation is wrong. The deviation is small (around one cent), but it exists and may lead to a loss of money.

Debug

In this case it seems obvious what happened: a software regression. The bug appears to have been introduced with the latest release, since it did not show up before. The most common way to find the cause is to go back through the commit history looking for the change that introduced the error.

The first helper that is very useful here is a test for the particular scenario from the bug report:

@ParameterizedTest
@CsvSource({"655.50, 7, 609.61"})
public void testNetSalaryCalculation(
        BigDecimal grossSalaryInCurrency, 
        int taxInPercent, 
        BigDecimal expectedNetSalaryInCurrency) {

    BigDecimal actualNetSalaryInCurrency = taxCalculationService
            .getNetSalaryInCurrency(grossSalaryInCurrency, taxInPercent);

    Assertions.assertEquals(
            expectedNetSalaryInCurrency, actualNetSalaryInCurrency);
}

The test fails on the current commit, so it is time to trace the history. But after checking out the commit before the release, the test still fails. Now the debug session will take longer than expected: it requires a kind of binary search through the commit history, jumping back and forth to find the commit that introduced the bug.

Luckily, Git has a very useful tool for this: git-bisect.⁴

Theory

Git-bisect is a tool that helps to traverse the commit history in search of a particular change (a code change, a behavior change, etc.). It works within a range between a “good” commit (one where the error does not exist yet) and a “bad” commit (one where the error already exists). Both must be provided as input to the git-bisect session.
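
The idea behind the search can be sketched as an ordinary binary search. This is a simplification under stated assumptions, not how Git implements it: the history is modeled as a flat list ordered from oldest to newest, whereas the real tool works on the commit graph. The class BisectSketch, the method firstBad and the commit names are invented for the illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class BisectSketch {

    // Assumes commits are ordered oldest -> newest, the first element is
    // known to be good and the last one is known to be bad.
    // Returns the index of the first bad commit.
    static <T> int firstBad(List<T> commits, Predicate<T> isBad) {
        int good = 0;                  // newest index known to be good
        int bad = commits.size() - 1;  // oldest index known to be bad
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (isBad.test(commits.get(mid))) {
                bad = mid;             // the bug is at mid or earlier
            } else {
                good = mid;            // the bug came in after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList("c0", "c1", "c2", "c3", "c4", "c5");
        // Hypothetical check: pretend the bug was introduced in c3.
        int index = firstBad(history, commit -> commit.compareTo("c3") >= 0);
        System.out.println(history.get(index)); // c3
    }
}
```

Each check halves the remaining range, so the number of commits to test grows only logarithmically with the length of the history.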

Git-bisect can work in two modes:

Manual. In this mode git-bisect checks out a commit and waits to be told, explicitly in the terminal, whether it is “good” or “bad”. It keeps going until it finds the first “bad” commit, that is, a “bad” commit whose predecessor is “good”.
Automatic. Git-bisect can also classify commits on its own. This is done via the “run” subcommand: it accepts a command to execute in the terminal, and the decision about the commit is made from the command's exit code. If the exit code is 0, the commit is “good”; if it is in the range between 1 and 127, except 125, the commit is “bad”; 125 means that the commit cannot be tested and is skipped. Because of this, git-bisect integrates seamlessly with Maven test execution.
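
The exit-code convention of the automatic mode can be written down as a small lookup. The sketch below restates the rules from the git-bisect documentation in Java purely for illustration; the class BisectExitCodes and the Verdict enum are made-up names and have nothing to do with Git's own implementation.

```java
class BisectExitCodes {

    enum Verdict { GOOD, BAD, SKIP, ABORT }

    // Maps the exit code of the command given to "git bisect run"
    // to the decision git-bisect takes for the current commit.
    static Verdict classify(int exitCode) {
        if (exitCode == 0) {
            return Verdict.GOOD;
        }
        if (exitCode == 125) {
            return Verdict.SKIP;   // the commit cannot be tested
        }
        if (exitCode >= 1 && exitCode <= 127) {
            return Verdict.BAD;
        }
        return Verdict.ABORT;      // any other code stops the session
    }

    public static void main(String[] args) {
        System.out.println(classify(0));   // GOOD
        System.out.println(classify(1));   // BAD
        System.out.println(classify(125)); // SKIP
        System.out.println(classify(128)); // ABORT
    }
}
```

One consequence to keep in mind: a commit that fails for an unrelated reason, for example because it does not compile, also produces a non-zero exit code and would be counted as “bad” unless the command translates that case into 125.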

Practice

Having a test that identifies the problematic commit makes it possible to run git-bisect in automatic mode. To make the process easier to follow, it is divided into logical steps below.

Step #0 – Git Bisect Session Configuration

Based on the commit history, an educated guess can be made about when the problem appeared. The last commit in the history can be marked as “bad”, and the one around the release can be assumed to be “good”.
The following commands prepare the git-bisect session:

$ git bisect start
$ git bisect bad
$ git bisect good 85621facb6d23f294785cec4b7987172315e8a57

Git-bisect decides to start with the commit right before the one with the bug:

Bisecting: 2 revisions left to test after this (roughly 1 step)
[cd0b7caaa5252f960be15bad3a323b8e521ac08d] Refactoring: new method to get percent part

Step #1 – First Iteration

Now the session can be started in automatic mode:

$ git bisect run mvn clean test

After the first run, all the tests pass:

[INFO] ---------------------------------------------------------------------
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
[INFO] ---------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ---------------------------------------------------------------------

The current commit is marked as “good”.

Step #2 – Second Iteration

The next iteration starts automatically. Git-bisect picks another commit, this time the one right after the buggy one:

Bisecting: 0 revisions left to test after this (roughly 1 step)
[48281274ac5decf123dd4bd745cbe69fe4422d47] Refactoring: new method for currency formatting

And now one test fails:

[INFO] ---------------------------------------------------------------------
[ERROR] Failures: 
[ERROR]   BUG_XXX_TaxCalculationServiceTest.testNetSalaryCalculation:21 expected: <609.61> but was: <609.62>
[INFO] 
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0
[INFO] ---------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ---------------------------------------------------------------------

This commit is marked as “bad”.

Step #3 – Third Iteration

As before, git-bisect picks a new commit:

Bisecting: 0 revisions left to test after this (roughly 0 steps)
[4677a9bdbf5ca9bf9ce37a5e58e32a3323de60a7] [Silent BUG] Switching to "Bankers Rounding"

And runs the tests:

[INFO] ---------------------------------------------------------------------
[ERROR] Failures: 
[ERROR]   BUG_XXX_TaxCalculationServiceTest.testNetSalaryCalculation:21 expected: <609.61> but was: <609.62>
[INFO] 
[ERROR] Tests run: 4, Failures: 1, Errors: 0, Skipped: 0
[INFO] ---------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ---------------------------------------------------------------------

This one fails as well.

Step #4 – Finalizing

The process is now over: there are no more commits to check, and it is clear when the bug was introduced.

At the end of the session git-bisect gives a short report:

4677a9bdbf5ca9bf9ce37a5e58e32a3323de60a7 is the first bad commit
commit 4677a9bdbf5ca9bf9ce37a5e58e32a3323de60a7

    [Silent BUG] Switching to "Bankers Rounding"

bisect run success

This conclusion matches reality: this is exactly the commit where the bug was introduced.
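
The arithmetic behind the failing scenario confirms it. For a gross salary of 655.50 and a tax of 7 percent, the tax amount is 45.8850, which sits exactly halfway between 45.88 and 45.89, so the two rounding modes pick different neighbors. The snippet below is a standalone illustration; the class TieCaseDemo and the helper tax are hypothetical names, not code from the sample project.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

class TieCaseDemo {

    // Tax amount for the scenario from the bug, with the rounding
    // mode as a parameter.
    static BigDecimal tax(RoundingMode mode) {
        return new BigDecimal("655.50")
                .multiply(BigDecimal.valueOf(7))         // 4588.50
                .divide(BigDecimal.valueOf(100), mode);  // 45.8850 -> 2 places
    }

    public static void main(String[] args) {
        BigDecimal gross = new BigDecimal("655.50");
        // HALF_UP rounds the tie away from zero: tax 45.89
        System.out.println(gross.subtract(tax(RoundingMode.HALF_UP)));   // 609.61
        // HALF_EVEN rounds the tie to the even neighbor: tax 45.88
        System.out.println(gross.subtract(tax(RoundingMode.HALF_EVEN))); // 609.62
    }
}
```

The two results are the expected and the actual values reported by the failing test.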

Conclusion

Git-bisect can be a very helpful tool for traversing the commit history in search of particular changes, not necessarily bugs. As the argument of “run” it accepts any application or script that can decide on the state of the checked-out commit.

Last but not least, the terms “good” and “bad” can be redefined for convenience, for example as “old” and “new”.

Links

1. Sample Project (repository on GitHub)

2. HALF_UP Rounding Mode (Java SE 7 spec)

3. HALF_EVEN Rounding Mode (Java SE 7 spec)

4. Git Bisect
