Because this is a book about embedded Linux development, we use a version of GDB that has been compiled as a cross-debugger. That is, the debugger itself runs on your development host, but it understands binary executables in the architecture for which it was configured at compile time. In the next few examples, we use GDB compiled for a Red Hat Linux-compatible development host, and an XScale (ARM) target processor. Although we use the short name gdb, we are presenting examples based on the XScale-enabled cross-gdb from the Monta Vista embedded Linux distribution for ARM XScale. The binary is called xscale_be-gdb. It is still GDB, simply configured for a cross-development environment.
The GDB debugger is a complex program with many configuration options during the build process. It is not our intention to provide guidance on building gdb that has been covered in other literature. For the purposes of this chapter, we assume that you have obtained a working GDB configured for the architecture and host development environment you will be using.
13.1.1. Debugging a Core Dump
One of the most common reasons to drag GDB out of the toolbox is to evaluate a core dump . It is quick and easy, and often leads to immediate identification of the offending code. A core dump results when an application program generates a fault, such as accessing a memory location that it does not own. Many conditions can trigger a core dump, [80] See SIG_KERNEL_COREDUMP_MASK in .../kernel/signal.c for a definition of which signals generate a core dump.
but SIGSEGV (segmentation fault) is by far the most common. A SIGSEGV is a Linux kernel signal that is generated on illegal memory accesses by a user process. When this signal is generated, the kernel terminates the process. The kernel then dumps a core image, if so enabled.
To enable generation of a core dump, your process must have the resource limits to enable a core dump. This is achieved by setting the process's resource limits using the setrlimit() function call, or from a BASH or BusyBox shell command prompt, using ulimit. It is not uncommon to find the following line in the initialization scripts of an embedded system to enable the generation of core dumps on process errors:
$ ulimit -c unlimited
This BASH built-in command is used to set the size limit of a core dump. In the previous instance, the size is set to unlimited.
When an application program generates a segmentation fault (for example, by writing to a memory address outside its permissible range), Linux terminates the process and generates a core dump, if so enabled. The core dump is a snapshot of the running process at the time the segmentation fault occurred.
It helps to have debugging symbols enabled in your binary. GDB produces much more useful output with debugging symbols (gcc -g) enabled during the build. However, it is still possible to determine the sequence of events leading to the segmentation fault, even if the binary was compiled without debugging symbols. You might need to do a bit more investigative work without the aid of debugging symbols. You must manually correlate virtual addresses to locations within your program.
Listing 13-1 shows the results of a core dump analysis session using GDB. The output has been reformatted slightly to fit the page. We have used some demonstration software to intentionally produce a segmentation fault. Here is the output of the process (called webs) that generated the segmentation fault:
root@coyote:/workspace/websdemo# ./webs
Segmentation fault (core dumped)
Listing 13-1. Core Dump Analysis Using GDB
$ xscale_be-gdb webs core
GNU gdb 6.3 (MontaVista 6.3-20.0.22.0501131 2005-07-23)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public
License, and you are welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu -target=armv5teb-montavista-linuxeabi"...
Core was generated by './webs'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /opt/montavista/pro/.../libc.so.6...done.
Loaded symbols for /opt/montavista/pro/.../libc.so.6
Reading symbols from /opt/montavista/pro/.../ld-linux.so.3...done.
Loaded symbols for /opt/montavista/pro/.../ld-linux.so.3
#0 0x00012ac4 in ClearBlock (RealBigBlockPtr=0x0, l=100000000) at led.c:43
43 *ptr = 0;
(gdb) l
38
39 static int ClearBlock(char * BlockPtr, int l)
40 {
41 char * ptr;
42 for (ptr = BlockPtr; (ptr - BlockPtr) < l; ptr++)
43 *ptr = 0;
44 return 0;
45 }
46 static int InitBlock(char * ptr, int n)
47 {
(gdb) p ptr
$1 = 0x0
(gdb)
The first line of Listing 13-1 shows how GDB was invoked from the command line. Because we are doing cross-debugging, we need the cross-version of GDB that has been compiled for our host and target system. We invoke our version of cross-gdb as shown and pass xscale_be-gdb the name of the binary followed by the name of the core dump filein this case, simply core. After GDB prints several banner lines describing its configuration and other information, it prints the reason for the termination: signal 11, the indication of a segmentation fault. [81] Signals and their associated numbers are defined in .../asm-/signal.h in your Linux kernel source tree.
Several lines follow as GDB loads the binary, the libraries it depends on, and the core file. The last line printed upon GDB startup is the current location of the program when the fault occurred. The line preceded by the #0 string indicates the stack frame (stack frame zero in a function called ClearBlock() at virtual address 0x00012ac4). The following line preceded by 43 is the line number of the offending source line from a file called led.c. From there, GDB displays its command prompt and waits for input.
To provide some context, we enter the gdb list command, using its abbreviated form l. GDB recognizes command abbreviations where there is no ambiguity. Here the program error begins to present itself. The offending line, according to GDB's analysis of the core dump is:
43 *ptr = 0;
Next we issue the gdb print command on the ptr variable, again abbreviated as p. As you can see from Listing 13-1, the value of the pointer ptr is 0. So we conclude that the reason for the segmentation fault is the dereference of a null pointer, a common programming error. From here, we can elect to use the backtrace command to see the call chain leading to this error, which might lead us back to the actual source of the error. Listing 13-2 displays these results.
Listing 13-2. Backtrace Command
(gdb) bt
#0 0x00012ac4 in ClearBlock (RealBigBlockPtr=0x0, l=100000000) at led.c:43
#1 0x00012b08 in InitBlock (ptr=0x0, n=100000000) at led.c:48
#2 0x00012b50 in ErrorInHandler (wp=0x325c8, urlPrefix=0x2f648 "/Error",
webDir=0x2f660 "", arg=0, url=0x34f30 "/Error", path=0x34d68 "/Error",
Читать дальше