


5  Finding and Fixing Runtime Errors

Reactis for C immediately stops execution when a runtime error occurs, making the error easy to find and fix.

Whenever Reactis for C is simulating C code in Simulator or generating tests in Tester, it is also performing a multitude of checks for runtime errors. The result is a powerful tool to find, diagnose, and fix a variety of runtime errors in your C code. The runtime errors detected by Reactis for C include:

  • Overflow: Numeric calculations which produce a result too large to represent.
  • Divide by Zero: Dividing a numeric value by zero.
  • Invalid Shift: Shifting an integer value by an amount which produces an undefined result according to the C standard.
  • Memory Errors: Accessing an invalid memory region in a way which produces an undefined result, such as accessing an array outside its bounds or accessing heap-allocated memory after the memory has been freed.
  • Uninitialized Data Access: Accessing memory before the memory has been initialized, so that the result of the access is undefined under C semantics.
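To make two of these categories concrete, the following minimal sketch shows guards that reject a divide by zero and an invalid shift before the operation is performed. The function names checked_div and checked_shl are hypothetical and are not part of Reactis for C; the sketch only illustrates the conditions under which the C standard leaves the result undefined.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: divides num by den, refusing the two cases in which
   integer division is undefined in C. Returns false if no result is stored. */
bool checked_div(int num, int den, int *out)
{
    if (den == 0) {
        return false;                 /* divide by zero */
    }
    if (num == INT_MIN && den == -1) {
        return false;                 /* quotient does not fit in int */
    }
    *out = num / den;
    return true;
}

/* Hypothetical helper: shifts value left by amount, refusing shift amounts
   that are greater than or equal to the width of the operand type. */
bool checked_shl(unsigned value, unsigned amount, unsigned *out)
{
    if (amount >= sizeof(unsigned) * CHAR_BIT) {
        return false;                 /* invalid shift amount */
    }
    *out = value << amount;
    return true;
}
```

Hand-written guards of this kind only cover the call sites where they are used, whereas the checks described here are applied to every operation during simulation and test generation.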

In a typical C environment, most of the above errors do not stop program execution, but instead produce an unintended result. This result is then used for subsequent program calculations and may not result in an observable program malfunction (such as an incorrect output) until much later, making the source of the error difficult to track down. In Reactis for C, all of these errors can be immediately detected, allowing the source of the error to be quickly determined. Furthermore, the inputs which lead to the error are recorded, allowing the execution sequence to be replayed up to the point where the error occurs, making it easy to observe prior calculations which could be the ultimate root cause of the runtime error.

A C program is shown which calculates one thousand squared using
16-bit arithmetic. When compiled with GCC, the program executes
and terminates normally but produces an output of 16960 instead
of one million.
Figure 8: A program containing an overflow and its output.

Figure 8 shows what happens when an integer overflow occurs in a C program. In this case, the program uses 16-bit arithmetic to calculate 1000². The program compiles without any errors and, when executed, generates output and terminates normally. However, instead of the expected value of one million, the value output is 16960. This is because when the result of an integer calculation is too large to fit in the container type, the most significant bits which do not fit are discarded. The result is a value which wraps around from a very large value to a much smaller value or vice versa. Reactis for C can be configured to immediately interrupt program execution whenever wrapping would occur, making it easy to find and fix such bugs.
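The arithmetic behind the value 16960 can be checked directly: one million modulo 65536 (2 to the power 16) is 16960. The sketch below is not the program from Figure 8; the function names square16_wrapping and square16_checked are hypothetical. It reproduces the wrapped result and shows one way a program could detect the overflow itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Squares x and keeps only the low 16 bits of the product, mimicking a
   16-bit container. Converting a value above INT16_MAX from uint16_t to
   int16_t is implementation-defined; for x == 1000 the truncated value
   (16960) is in range, so the result is the same on every platform. */
int16_t square16_wrapping(int16_t x)
{
    int32_t wide = (int32_t)x * (int32_t)x;   /* exact product */
    return (int16_t)(uint16_t)wide;           /* discard the high bits */
}

/* Squares x and reports failure instead of wrapping when the exact product
   does not fit in 16 bits. A square is never negative, so only the upper
   bound needs to be tested. */
bool square16_checked(int16_t x, int16_t *out)
{
    int32_t wide = (int32_t)x * (int32_t)x;
    if (wide > INT16_MAX) {
        return false;                         /* overflow detected */
    }
    *out = (int16_t)wide;
    return true;
}
```

Calling square16_wrapping(1000) yields 16960, while square16_checked(1000, &r) returns false and leaves r untouched.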

5.1  Memory Errors

Memory errors are particularly easy to make in C and can be very hard to debug. Reactis for C automatically detects memory errors. A memory error occurs whenever a program reads from or writes to an invalid address. Memory errors are particularly common in C programs because the C programming language gives the programmer direct access to the program’s memory, which can boost performance but also allows software defects to access arbitrary memory locations. Typical memory errors include out-of-bounds array indexes, buffer overruns, dangling heap pointers (accessing a region of heap-allocated memory after the memory has been freed), dangling stack pointers (accessing a pointer to a local variable of a function after the function has returned) and the use of pointers cast from incorrect numeric values.

A function is shown in which data is copied from one buffer to another until
the first negative value is transferred.
Figure 9: A function containing a potential memory error.

Memory errors can be very difficult to debug using a traditional debugger because there is often a long delay between the point where the memory error occurs and the point where the program crashes or produces an invalid output. With Reactis for C, memory errors are detected immediately as they occur, allowing the cause of the error to be quickly identified and fixed.

A function containing a typical memory error vulnerability is shown in Figure 9. The function copy_dbuf copies values of type double from one array to another until a negative value is encountered. If the number of values copied from src exceeds the length of dst, then the memory after dst will be overwritten. In a typical C environment, this type of error does not result in an immediate error. Instead, the values stored after the array pointed to by dst are overwritten. The corrupted values do not have any harmful effects on the program behavior until they are used in a subsequent calculation. Hence, there is a significant gap between the point in the program execution where the error actually occurs and the point where the error produces an observable effect. This gap in time makes the diagnosis of memory errors very difficult.
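As an illustration, the following sketch shows a function that matches the description of copy_dbuf, together with a bounded variant. It is a reconstruction from the prose, not the code shown in Figure 9; the parameter list and the name copy_dbuf_bounded are assumptions.

```c
#include <stddef.h>

/* Sketch of the unbounded pattern: copies values from src to dst up to and
   including the first negative value. Nothing limits how far dst is
   written, so a short dst buffer is silently overrun. */
void copy_dbuf(double *dst, const double *src)
{
    do {
        *dst++ = *src;
    } while (*src++ >= 0.0);
}

/* Hypothetical bounded variant: never writes more than dst_len elements.
   Returns the number of elements copied, so the caller can tell whether
   the copy stopped at a negative value or ran out of room. */
size_t copy_dbuf_bounded(double *dst, size_t dst_len, const double *src)
{
    size_t i;
    for (i = 0; i < dst_len; i++) {
        dst[i] = src[i];
        if (src[i] < 0.0) {
            return i + 1;   /* the negative terminator was copied */
        }
    }
    return i;               /* destination is full */
}
```

Passing the destination length explicitly moves the bounds decision to the caller, which is the usual remedy for this class of defect.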

A screenshot of
Reactis for C is shown in which an error dialog appears. The error dialog explains
that a spatial memory error has occurred. In the Reactis for C main window, the
location of the error in the source code is highlighted.
Figure 10: Memory error detected by Reactis for C.

In Reactis for C, memory errors are detected immediately (either when running a program in Reactis Simulator or generating tests). When the function copy_dbuf() from Figure 9 is called and the dst buffer is smaller than the src buffer, an error occurs at the point of the first write beyond the bounds of dst. Program execution is suspended and an error dialog appears, as shown in Figure 10. When the highlight button in the error dialog is clicked, the source line where the error occurred flashes yellow, as shown in Figure 11.

Clicking on the highlight button in an error causes the location of the error
in the source code to be highlighted.
Figure 11: Highlighting the location of a memory error.

A typical memory error summary and description is shown in Figure 12. (Note that for the sake of brevity, the stack trace which appears after the description text has been omitted.) The error message includes the source location of the error, the kind of error, the memory address that was being accessed at the time of the error, the allowed numeric access range and the allowed symbolic access range. The latter is particularly helpful in many cases because, when a variable is accessed via pointer, the symbolic information will include the name and source code location of the variable pointed-to. In a traditional debugger, only the numeric address contained within the pointer is available, and this address no longer corresponds to the original target of the pointer. This is one of the factors which makes memory error diagnosis difficult. In Reactis for C, the target of the pointer is immediately available. In this case the variable is buf2.

An error message is shown with emphasis on the location of the error
(line 24 of dbuf_error.c), the invalid pointer (dst), the target of dst (buf2)
and the location where the invalid pointer was created (line 38 of dbuf_error.c).
Figure 12: Closeup of error message from Figure 10.

Memory errors can be divided into two categories, temporal and spatial. Spatial memory errors are cases where an address access occurs outside the bounds of the intended target. Temporal memory errors occur when memory is accessed after it has been recycled, so that the intended target may have been overwritten with new data.

Spatial memory errors include the following:

  • Invalid array index: Accessing A[i] when i is outside the bounds of A.
  • Buffer overrun: Accessing *p when the value of p has been incremented to point past the end of its target.
  • Invalid pointer: Accessing *p when p has been overwritten with a non-pointer value (this can happen when using a union construct).
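The union case is the least obvious of the three. One common defensive pattern, sketched below under the assumption that the program controls the union's definition, is to store a tag next to the union so that the pointer member is only dereferenced when a pointer was the last value stored. All names in the sketch are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Records which union member was written most recently. */
enum slot_kind { SLOT_NUMBER, SLOT_POINTER };

struct slot {
    enum slot_kind kind;
    union {
        long number;
        const int *pointer;
    } as;
};

struct slot slot_from_number(long n)
{
    struct slot s;
    s.kind = SLOT_NUMBER;
    s.as.number = n;
    return s;
}

struct slot slot_from_pointer(const int *p)
{
    struct slot s;
    s.kind = SLOT_POINTER;
    s.as.pointer = p;
    return s;
}

/* Reads through the pointer only if the slot really holds one. Without the
   tag check, a slot written as a number would be dereferenced as if the
   number were an address. */
bool slot_read_target(const struct slot *s, int *out)
{
    if (s->kind != SLOT_POINTER || s->as.pointer == NULL) {
        return false;
    }
    *out = *s->as.pointer;
    return true;
}
```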

A temporal memory error occurs when a pointer is used to access heap or stack memory which has been deallocated or reallocated for some other purpose. Temporal errors can be divided into two categories:

  • Heap error: Accessing *p when p points to a chunk of heap-allocated memory which has been previously deallocated via the free() function.
  • Stack error: Accessing *p when p points to a local variable of a function f() after f() has returned.

Temporal memory errors are usually more complex than spatial memory errors and are hence also more difficult to diagnose and fix.
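A stack error typically arises when a function returns the address of one of its own local variables. The sketch below shows the pattern in a comment only, because dereferencing the returned pointer is undefined, followed by two common ways of avoiding it. The function names are hypothetical.

```c
#include <stdlib.h>

/*
 * The dangling pattern, for reference only:
 *
 *     int *make_value(void) { int local = 25; return &local; }
 *
 * The storage for local is released when make_value() returns, so the
 * caller receives a pointer to memory it must not access.
 */

/* Alternative 1: the caller owns the storage and passes its address in. */
void init_value(int *out)
{
    *out = 25;
}

/* Alternative 2: the value lives on the heap and outlives the call.
   The caller is responsible for calling free() on the result.
   Returns NULL if the allocation fails. */
int *new_value(void)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) {
        *p = 25;
    }
    return p;
}
```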

A function is shown in which (1) heap memory is allocated,
(2) the value 25 is stored in the newly-allocated memory,
(3) the memory is deallocated,
and (4) the value of the deallocated memory is loaded via pointer.
Figure 13: A function which reads from recycled heap memory.

Figure 13 shows a function which reads from heap memory after the memory has been freed. This function will compile and run without any obvious error on almost any C execution platform. However, the value returned may not be 25. This type of execution error leads to insidiously intermittent malfunctions which can be a nightmare to diagnose.

Fortunately, Reactis for C detects temporal memory errors and interrupts program execution at the point where the invalid memory access occurs. Figure 14 shows the result of executing read_after_free() in Reactis Simulator. The memory error is immediately caught and its location (the assignment x = *p) is highlighted.
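The four steps of read_after_free() can be sketched as follows, alongside a corrected ordering. This is a reconstruction from the description, not the code from Figure 13; the allocation failure handling and the name read_before_free are assumptions.

```c
#include <stdlib.h>

/* Sketch of the faulty pattern. Do not call this function: the read on the
   marked line accesses memory that has already been returned to the heap. */
int read_after_free(void)
{
    int *p = malloc(sizeof *p);   /* (1) allocate */
    int x;
    if (p == NULL) {
        return -1;                /* assumption: report allocation failure */
    }
    *p = 25;                      /* (2) store */
    free(p);                      /* (3) deallocate */
    x = *p;                       /* (4) invalid read of freed memory */
    return x;
}

/* Corrected ordering: the value is read while the allocation is still
   live, and the memory is freed afterwards. */
int read_before_free(void)
{
    int *p = malloc(sizeof *p);
    int x;
    if (p == NULL) {
        return -1;
    }
    *p = 25;
    x = *p;                       /* read first */
    free(p);                      /* then release */
    return x;
}
```

The only difference between the two functions is the order of the last two statements, which is why this defect is easy to introduce during refactoring.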

A screenshot
of Reactis for C shows an error dialog being presented
explaining that a heap memory error has occurred.
Figure 14: Reactis for C detects the error in the function of Figure 13.

5.2  Uninitialized Memory

Another class of error which is also difficult to debug in C programs is reading from uninitialized memory. There are two ways uninitialized memory reads can occur in a C program:

  • Uninitialized heap memory: Heap memory is allocated via malloc() and some of this memory is not initialized before it is read.
  • Uninitialized local variable: A local variable of a function is not initialized before it is read.

In both cases, whatever value happens to be stored in the allocated memory is used. As is the case with other memory errors, there is often a delay between the point where the uninitialized memory read occurs and the point where observable erroneous behavior occurs. An example of this is the function sum() in Figure 15.

A function is shown which uses a variable x to hold the sum of an array. The variable
x is not initialized to zero before computing the sum of the array values.
Figure 15: A function with an uninitialized local variable x.

In the body of function sum() the summation variable x is not initialized. This code will compile and execute on almost any C platform. The value returned by sum() will be equal to the sum of the first n values stored in A plus whatever value happens to be stored in the memory allocated for variable x when sum() is called.
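A minimal sketch of this defect and its fix is shown below. It is not the code from Figure 15: the element type, the parameter types and the name sum_uninitialized are assumptions made for illustration.

```c
#include <stddef.h>

/* Sketch of the faulty pattern. Do not rely on the result: x starts with
   whatever value happens to be in its storage, so the return value is the
   array total plus that leftover value. */
int sum_uninitialized(const int *A, size_t n)
{
    int x;                        /* never initialized */
    for (size_t i = 0; i < n; i++) {
        x += A[i];
    }
    return x;
}

/* Corrected version: the accumulator starts at zero. */
int sum(const int *A, size_t n)
{
    int x = 0;
    for (size_t i = 0; i < n; i++) {
        x += A[i];
    }
    return x;
}
```

With the initializer in place, sum() returns 0 for an empty array and the exact total otherwise.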

A screenshot of
Reactis shows an error dialog which states that the
variable x was not initialized before being read.
Figure 16: Reactis for C detects the error in the function of Figure 15.

When using Reactis for C, uninitialized memory reads trigger an immediate suspension of program execution and an error message that gives the location where the error occurred and the program variables involved. Figure 16 shows the result of executing the function sum() with Reactis for C.

Spatial memory errors, temporal memory errors and uninitialized memory reads often have subtly corrupting effects on program execution. These errors essentially inject random data into the program, causing the program to intermittently malfunction. It is also common for memory errors to only occur in rare circumstances, such as when a very large buffer size is requested or a complex boolean expression becomes true. A major strength of Reactis for C is its ability to immediately catch memory errors as they occur and to generate test inputs which are likely to trigger memory errors.