A power tool to understand memory layout

Core Analyzer

What is Core Analyzer

Many program bugs, especially in C/C++ programs, are memory related. When a program fails, whether by crashing or by reporting an error, we often end up with a suspicious memory address or an invalid data object that holds the key to the problem and demands further investigation. This is rarely easy, given the ever-increasing number of data objects and the growing complexity of the execution context. It is therefore essential to understand how a piece of memory is allocated, owned, and accessed from several distinct angles, from high level to low level and from micro to macro. In other words, the application, compiler, memory manager, and kernel each have their own view of memory because they belong to different layers of a program. The compiler shows its idea of an object’s type and memory layout through debug symbols and through the way the generated code accesses data. For a heap object, the memory manager’s heap metadata indicates the size of a given memory block and whether it is free or in use. An object may be located in the text, data, heap, or stack segment, each of which is managed by the kernel’s virtual memory manager and protected by permission bits. All of this information is important and can serve as the foundation for building a theory, or for proving or disproving an assumption about the cause of the program failure.

A nontrivial program has many data objects, which can be as simple as primitives such as char, integer, and float, or as complex as aggregates like a C++ object with multiple inheritance. Data objects are related by design through direct or indirect references. One object may be shared or referenced by multiple other objects, and the application code usually goes through several indirect references to reach a memory target. This makes it difficult to find the root cause when something goes wrong. In a typical debugging session, we may have one or more suspected data objects at hand. The challenge is to find out which other objects hold references to the suspects and may be accessing them incorrectly. This is more or less reverse engineering and is understandably very difficult. For a program with thousands or even millions of variables, using a debugger to inspect every variable is a daunting and error-prone task, if not an impossible one. On top of that, heap data objects, unlike global and local variables, have no debug symbols to describe their types or locations, yet they are often the target of investigation. Take a memory overrun as an example: the key to tracking down this type of bug is to figure out which memory object precedes the victim, who owns it, and how it is read and written.
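To make the memory overrun case concrete, here is a minimal sketch of the lookup involved: given a sorted list of heap blocks, find the block that contains a suspicious address and the block immediately before it. The `Block` record, the `locate` function, and the sample addresses are all hypothetical and exist only for illustration; a real tool would derive the block list from the memory manager's metadata.

```python
from bisect import bisect_right
from typing import NamedTuple


class Block(NamedTuple):
    start: int     # first byte of the block
    size: int      # block size in bytes
    in_use: bool   # True if allocated, False if free


def locate(blocks, addr):
    """Return (containing_block, preceding_block); either may be None.

    `blocks` must be sorted by start address and must not overlap.
    """
    starts = [b.start for b in blocks]
    i = bisect_right(starts, addr) - 1   # last block starting at or before addr
    if i < 0:
        return None, None                # addr lies below the first block
    block = blocks[i]
    if addr >= block.start + block.size:
        # addr falls in a gap; blocks[i] is its nearest lower neighbour
        return None, block
    prev = blocks[i - 1] if i > 0 else None
    return block, prev


# Made-up heap layout with a gap between the first two blocks.
heap = [Block(0x1000, 32, True), Block(0x1030, 64, True), Block(0x1080, 16, False)]
victim, suspect = locate(heap, 0x1034)
```

Here `victim` is the block holding the corrupted address and `suspect` is its lower neighbour, the usual first candidate for the writer that ran past the end of its own block.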

Core Analyzer is a powerful tool that helps answer the questions above. Although some debuggers provide part of this functionality (for example, WinDbg has an extension command, !heap, to analyze heap memory), none of them uses the heap information to uncover the complex relationships among numerous data objects. Besides, many programs use a customized memory manager for various reasons, in which case the debugger knows nothing about the heap at all. Core Analyzer understands core dump file formats on different platforms, e.g., ELF core on Linux and minidump on Windows. With built-in knowledge of the heap data structures of the program’s underlying memory manager, it scans the process’s heap to check its consistency and points out corrupted spots, if any. By searching the process address space for all direct and indirect references to a suspicious object, the tool helps to reveal the object’s type and usage in a systematic way.
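The reference search described above can be sketched in a few lines. This is not Core Analyzer's implementation, only an illustration of the idea under stated assumptions: memory is modelled as a dictionary of segment base address to raw bytes, pointers are 64-bit little-endian and naturally aligned, and each holder of a reference is treated as a single pointer-sized slot when following further levels of indirection. The names `find_refs` and `find_refs_levels` are hypothetical.

```python
import struct

PTR = struct.Struct("<Q")  # assumption: 64-bit little-endian pointers


def find_refs(segments, lo, hi):
    """Return sorted addresses of aligned words whose value lies in [lo, hi)."""
    hits = []
    for base, data in segments.items():
        for off in range(0, len(data) - PTR.size + 1, PTR.size):
            (value,) = PTR.unpack_from(data, off)
            if lo <= value < hi:
                hits.append(base + off)
    return sorted(hits)


def find_refs_levels(segments, lo, hi, levels):
    """Follow references backwards; result[k] holds the level k+1 holders."""
    result = []
    targets = [(lo, hi)]
    for _ in range(levels):
        found = set()
        for t_lo, t_hi in targets:
            found.update(find_refs(segments, t_lo, t_hi))
        level = sorted(found)
        result.append(level)
        # Simplification: the next targets are the pointer slots themselves.
        targets = [(addr, addr + PTR.size) for addr in level]
    return result


# Made-up image: a heap segment and a data segment.
segments = {
    0x2000: PTR.pack(0x2018) + PTR.pack(0) + bytes(16),  # slot 0x2000 points into the target
    0x6000: PTR.pack(0x2000) + PTR.pack(0),              # slot 0x6000 points at that slot
}
refs = find_refs_levels(segments, 0x2010, 0x2020, levels=2)
```

In this made-up image the suspicious object occupies 0x2010 to 0x2020. The first level finds the heap slot that points into it, and the second level finds the data-segment variable that points at that slot. A real tool would also map each holder back to a symbol or an enclosing heap block.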

The following table lists Core Analyzer’s main features.

Heap

· Scan heap and report memory corruption and memory usage statistics

· Display the layout of surrounding memory blocks

· Display the memory block status for a given address

· Show the largest memory blocks

Reference

· Find the size, type, and symbol of the memory object associated with a given address

· Display the complete list of references to a given object, with optional levels of indirection

Others

· Find all instances of a given C++ object type

· Display objects shared by selected threads

· Display disassembled instructions annotated with data object context

· Show data patterns within a given range of memory

· Process map including all segments and their attributes
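As an illustration of the heap scan feature, the sketch below walks a toy heap and stops at the first inconsistent block header. The layout is invented for this example (an 8-byte header holding the payload size and a status tag, followed by the payload) and does not correspond to any real allocator or to Core Analyzer's own code.

```python
import struct

HDR = struct.Struct("<II")      # hypothetical header: payload size, status tag
IN_USE, FREE = 0xA11C, 0xF4EE   # made-up tag values


def build_heap(blocks):
    """Lay out (size, tag) blocks back to back, each preceded by a header."""
    heap = bytearray()
    for size, tag in blocks:
        heap += HDR.pack(size, tag) + bytes(size)
    return heap


def walk_heap(heap):
    """Return (blocks, corrupt_offset).

    blocks is a list of (payload_offset, size, in_use) for every valid block
    seen; corrupt_offset is the offset of the first bad header, or None.
    """
    blocks, off = [], 0
    while off < len(heap):
        if off + HDR.size > len(heap):
            return blocks, off               # truncated header
        size, tag = HDR.unpack_from(heap, off)
        end = off + HDR.size + size
        if tag not in (IN_USE, FREE) or end > len(heap):
            return blocks, off               # bad tag or impossible size
        blocks.append((off + HDR.size, size, tag == IN_USE))
        off = end
    return blocks, None


heap = build_heap([(16, IN_USE), (8, FREE), (24, IN_USE)])
clean_blocks, clean_bad = walk_heap(heap)

# Simulate an overrun: 20 bytes written into the first 16-byte payload,
# clobbering the size field of the next header.
heap[8:28] = b"A" * 20
blocks, bad = walk_heap(heap)
```

After the simulated overrun the walk reports the header at offset 24 as corrupted, and the last valid block before it (the 16-byte payload at offset 8) is the natural suspect.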

 

Currently, Core Analyzer understands heap data managed by the system allocator on Linux and Windows, as well as SmartHeap by MicroQuill. However, it is designed to be extensible. You may well be using a customized memory manager in your product. By plugging in a few functions required by the infrastructure, you can let the tool understand your heap data structures and take advantage of all of its features. I have been using this tool extensively and find it indispensable for debugging any serious issue.
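The extensibility idea can be pictured as follows. This sketch is purely illustrative: the `HeapPlugin` class, its `walk` method, and the `PoolHeap` allocator are hypothetical names chosen for this example, and Core Analyzer's actual extension interface is different and should be taken from its source code. The point is only that generic features can be written once against a small set of allocator-specific functions.

```python
from abc import ABC, abstractmethod


class HeapPlugin(ABC):
    """Hypothetical contract a custom memory manager would implement."""

    @abstractmethod
    def walk(self):
        """Yield (start, size, in_use) for every block, in address order."""


class PoolHeap(HeapPlugin):
    """Toy fixed-size pool allocator described by a list of used flags."""

    def __init__(self, base, slot_size, used_flags):
        self.base = base
        self.slot_size = slot_size
        self.used_flags = used_flags

    def walk(self):
        for i, used in enumerate(self.used_flags):
            yield (self.base + i * self.slot_size, self.slot_size, used)


# Generic features, written once against the plugin contract.
def block_at(plugin, addr):
    """Return the (start, size, in_use) block containing addr, or None."""
    for start, size, in_use in plugin.walk():
        if start <= addr < start + size:
            return (start, size, in_use)
    return None


def usage_stats(plugin):
    """Count blocks and bytes, split into in-use and free."""
    stats = {"in_use_blocks": 0, "in_use_bytes": 0, "free_blocks": 0, "free_bytes": 0}
    for _start, size, in_use in plugin.walk():
        key = "in_use" if in_use else "free"
        stats[key + "_blocks"] += 1
        stats[key + "_bytes"] += size
    return stats


pool = PoolHeap(0x5000, 64, [True, False, True])
hit = block_at(pool, 0x5085)
stats = usage_stats(pool)
```

Once `walk` exists for an allocator, features such as block lookup and usage statistics work without any further allocator-specific code.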

Download Core Analyzer

The project is currently hosted on SourceForge. You can download binaries and source code there, or leave comments, which are very much appreciated.

Licensing Info

The use and distribution of Core Analyzer is governed by the GNU Lesser General Public License (LGPL) as published by the Free Software Foundation.