Running Linux, 5th Edition - Matthias Kalle Dalheimer [394]
The core file is written to disk by the operating system whenever certain failures occur. The most frequent reason for a crash and the subsequent core dump is a memory violation—that is, trying to read or write memory to which your program does not have access. For example, attempting to write data using a null pointer can cause a segmentation fault , which is essentially a fancy way of saying, "you screwed up." Segmentation faults are a common error and occur when you try to access (read from or write to) a memory address that does not belong to your process's address space. This includes the address 0, as often happens with uninitialized pointers. Segmentation faults are often caused by trying to access an array item outside the declared size of the array, and are commonly a result of an off-by-one error. They can also be caused by a failure to allocate memory for a data structure.
Other errors that result in core files are so-called bus errors and floating-point exceptions. Bus errors result from using incorrectly aligned data and are therefore rare on the Intel architecture, which does not pose the strong alignment conditions that other architectures do. Floating-point exceptions point to a severe problem in a floating-point calculation like an overflow, but the most usual case is a division by zero.
However, not all such memory errors will cause immediate crashes. For example, you may overwrite memory in some way, but the program continues to run, not knowing the difference between actual data and instructions or garbage. Subtle memory violations can cause programs to behave erratically. One of the authors once witnessed a bug that caused the program to jump randomly around, but without tracing it with gdb, it still appeared to work normally. The only evidence of a bug was that the program returned output that meant, roughly, that two and two did not add up to four. Sure enough, the bug was an attempt to write one too many characters into a block of allocated memory. That single-byte error caused hours of grief.
You can prevent these kinds of memory problems (even the best programmers make these mistakes!) using the Valgrind package , a set of memory-management routines that replaces the commonly used malloc() and free() functions as well as their C++ counterparts, the operators new and delete. We talk about Valgrind in "Using Valgrind," later in this chapter.
However, if your program does cause a memory fault, it will crash and dump core. Under Linux, core files are named, appropriately, core. The core file appears in the current working directory of the running process, which is usually the working directory of the shell that started the program; on occasion, however, programs may change their own working directory.
Some shells provide facilities for controlling whether core files are written. Under bash, for example, the default behavior is not to write core files. To enable core file output, you should use the command:
ulimit -c unlimited
probably in your .bashrc initialization file. You can specify a maximum size for core files other than unlimited, but truncated core files may not be of use when debugging applications.
Also, in order for a core file to be useful, the program must be compiled with debugging code enabled, as described in the previous section. Most binaries on your system will not contain debugging code, so the core file will be of limited value.
Our example for using gdb with a core file is yet another mythical program called cross. Like trymh in the previous section, cross takes an image file as input, does some calculations on it, and outputs another image file. However, when running cross, we get a segmentation fault:
papaya$ cross < image30.pfm > image30.pbm
Segmentation fault (core dumped)
papaya$
To invoke gdb for use with a core file, you must specify