# Debugging Step by Step – Delta Debugging

A Cambridge University study states that software bugs
cost the global economy $312 billion per year. Bug hunting is a difficult task. As Brian
W. Kernighan wrote: “*Debugging is twice
as hard as writing the code in the first place. Therefore, if you write the
code as cleverly as possible, you are, by definition, not smart enough to debug
it.*” It's unfortunately true. Several times even talented developers are
blocked to find bugs. In the following articles "**Debugging Step by Step** ..." we give some pieces of advice to
ease this task.

Let’s assume we have a bug. What is the first step? It depends... Let’s see an example. We implemented a Quicksort algorithm and test it with the numbers:

4, 12, 9, 14, 3, 10, 17, 11, 8, 7, 4, 1, 6, 19, 5, 21, 2, 3

1, 3, 2, 5, 3, 4, 4, 6, 7, 8, 9, 10, 11, 12, 14, 17, 19, 21

Thus the test failed. What to do now? One of the possibilities is to start debugging. However, it is not a good idea. Why? Because the execution contains too many steps which is difficult to follow. The number of execution steps is 670, and we should analyze a lot of them until we find the bug.

Delta debugging is an input reduction method resulting in two input sets:

- one input reveals the bug
- the other doesn't

for which the difference is minimal, i.e. in the second input only one number is missing from the first one.

There are quite a few manual methods for minimizing the input. If our input is homogeneous, then simple halving may work. The method is simple, after arranging the input into two possibly same size lists, we execute the code, and if one of them is a failure-induced input, then we select it. We keep halving until the method works. When it doesn't, we can reduce the remaining input removing elements one by one. Let's do it now.

## Example

Step 1. In our example the halved input is: 4, 12, 9, 14, 3, 10, 17, 11, 8 - getting the output: 3, 12, 4, 8, 9, 10, 11, 14, 17, i.e. clearly faulty.

Step 2. The next input is: 4, 12, 9, 14 - for which the output is correct: 4, 9, 12, 14.

Step 3. We try the other half, which is 3, 10, 17, 11, 8, obtaining the output: 3, 8, 10, 11, 17 - again correct.

Step 4. Our next output is the reduction of our last failure-induced input by cutting the last two numbers: 4, 12, 9, 14, 3, 10, 17. The result is also erroneous: 4, 10, 9, 12, 3, 14, 17.

Cutting any elements the output remains correct, thus our minimized input is : 4, 12, 9, 14, 3, 10, 17. For this test the number of execution steps is 192, less than 30% of the original.

However, the input we obtained is a local minimum, and in some cases we are able to minimize the failure-induced input just thinking a little bit. Consider the output for the first test again:

1**,
3, 2, 5, 3, 4,** 4, 6, 7, 8, 9, 10, 11, 12, 14, 17, 19, 21

We can see that apart from the sub-result: 3, 2, 5, 3, 4, the result is correct. Thus we select these numbers from the input set: 4, 3, 5, 2, 3 and it works since the result still contains the bug: 3, 3, 4, 2, 5. No further reduction can be achieved by this way of thinking, though it's not a global minimum either. Nevertheless, the number of steps is reduced to 139, i.e. 20% of the original test.

Returning to our original systematic method and removing the first number 4, our input: 3, 5, 2, 3 - we obtain the faulty 3, 3, 2, 5 - which is the minimum failure-induced input resulting in 111 execution steps and 83% reduction.

And how big the time saving of this minimization is? We don't know. However, if you help us by fixing the bug, we can give the answer. The faulty version of Quicksort is below. If your birthday is in months 1-6, use the original, otherwise the minimized input. Just drop me (forgacs@4dsoft.hu) a mail with the fixed line(s), the input used and the time spent on fixing (in minutes). I publish the results on our website http://4dsoft.eu/blog and a subsequent article here. Thank you for your cooperation.

## Conclusion

Delta debugging is a divide and conquer method by which our original debugging task is simplified by minimizing the input. Debugging is then performed on the simpler code trace.

Our experiences proved that the total bug finding time is largely reduced by applying delta debugging. Our input is sometimes huge 100,000+ lines of Java code, delta debugging is absolutely essential to find bugs in this environment.

## Quicksort with bug

import java.io.*; import java.util.*; public class QuickSort { public static void quicksort(int numbers[], int low, int high) { int i = low, j = high; // Get the pivot element from the middle of the list int pivot = numbers[low + (high-low)/2]; // Divide into two lists while (i <= j) { while (numbers[i] < pivot) { i++; } while (numbers[j] > pivot) { j--; } if (i <= j) { exchange(numbers, i, j); } i++; j--; } // Recursion if (low < j) quicksort(numbers, low, j); //low if (i < high) quicksort(numbers, i, high); } private static void exchange(int numbers[], int i, int j) { int temp = numbers[i]; numbers[i] = numbers[j]; numbers[j] = temp; int k = 1; } public static void main(String argv[]) { int A[] = {4, 12, 9, 14, 3, 10, 17, 11, 8, 7, 4, 1, 6, 19, 5, 21, 2, 3}; // minimized failure-induced input // int A[] = {3, 5, 2, 3}; quicksort(A, 0, A.length - 1); for (int i=0 ; i < A.length ; i++) System.out.print(A[i] + " "); System.out.println(); }

*(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)*

## Comments

## Yaro Nosa replied on Fri, 2014/02/21 - 11:24am

Looking at Java implementation you've provide and comparing it to let say Haskell, I realized that you can avoid debugging bugs at all when code represent you intentions clearly.

## Istvan Forgacs replied on Sat, 2014/02/22 - 12:41pm in response to: Yaro Nosa

You are right w.r.t. this code. However, in the case of complex and large code debugging is sometimes inevitable. In those cases the number of execution steps may exceed 10 millions for large input but can be reduced to some thousands by minimization. Actually, in several cases we are not able to find bugs without minimization.

In complex input such as source code for static analysers, minimization is also good for developers to filter out lots of potential problems.