Spark Troubleshooting Guide: Profiling Spark: How to collect a heap dump using the jmap utility

Document created by hdevanath (Employee) on Jun 19, 2017. Last modified by hdevanath (Employee) on Jun 19, 2017.
Version 2

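A heap dump is a useful tool for troubleshooting memory-related issues. It is a snapshot of the memory of a Java™ process and contains information about the Java objects and classes in the heap at the moment the snapshot is triggered. Before running any of the commands below, identify the process ID (PID) of the Java process you want to inspect.

Case 1) To print a summary of the Java heap, run jmap with the -heap option. Note that on JDK 9 and later this option is no longer part of jmap; the equivalent there is jhsdb jmap --heap --pid <PID>.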

jmap -heap <PID>

Case 2) The jmap command with the -histo option prints a per-class histogram of the heap. Depending on the parameter specified, it can print the histogram for a running process or for a core file. When the command is executed on a running process, the tool prints the number of objects, the memory size in bytes, and the fully qualified class name for each class. Internal classes in the Java HotSpot VM are enclosed in angle brackets. With the live suboption, as in the command below, only live objects are counted. The histogram is useful for understanding how the heap is used.

jmap -histo:live <PID>
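The histogram is often long, so it can help to total it per package. The following is a minimal sketch, not part of jmap itself: the function name histo_sum is hypothetical, and the parsing assumes each data row has the layout "rank: instances bytes class-name". Check that assumption against the output of your JDK before relying on the numbers.

```shell
# histo_sum PREFIX: read `jmap -histo` output on stdin and total the instance
# and byte counts of every class whose name starts with PREFIX.
# Assumed row layout: "   1:   12345   678900  java.lang.String"
histo_sum() {
  prefix="$1"
  awk -v p="$prefix" '
    # Data rows start with a rank such as "1:"; header and Total rows do not.
    $1 ~ /^[0-9]+:$/ && index($4, p) == 1 { inst += $2; bytes += $3 }
    END { printf "%.0f instances, %.0f bytes\n", inst, bytes }
  '
}
```

For example, jmap -histo:live <PID> | histo_sum org.apache.spark. would report how much of the live heap is held by classes under that package prefix.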

Case 3) The -dump option writes the Java heap in hprof binary format to the given file. The live suboption is optional; if specified, only the live objects in the heap are dumped.

jmap -dump:live,format=b,file=/tmp/dump.hprof <PID>
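Writing every dump to the same /tmp/dump.hprof path makes it easy to confuse or clobber an earlier dump. As a hedged sketch, the wrapper below validates the PID and writes to a timestamped file instead. The function name heap_dump, the file naming scheme, and the JMAP environment variable (used only to point at a different jmap binary) are assumptions made for illustration, not part of the JDK.

```shell
# heap_dump PID [DIR]: write a live-object heap dump for PID into DIR
# (default /tmp) and print the path of the dump file on stdout.
# JMAP is a hypothetical override for the jmap binary; it defaults to "jmap".
heap_dump() {
  pid="$1"
  dir="${2:-/tmp}"
  # Reject an empty or non-numeric PID before invoking jmap.
  case "$pid" in
    ''|*[!0-9]*) echo "usage: heap_dump PID [DIR]" >&2; return 2 ;;
  esac
  file="$dir/heap-$pid-$(date +%Y%m%d-%H%M%S).hprof"
  # Send jmap's own messages to stderr so stdout carries only the file path.
  "${JMAP:-jmap}" -dump:live,format=b,file="$file" "$pid" >&2 || return 1
  echo "$file"
}
```

A call such as heap_dump 8177 /var/tmp would then leave a uniquely named .hprof file in /var/tmp. jmap generally needs to run as the same operating system user that owns the target JVM, and the dump can be roughly as large as the used heap, so pick a directory with enough free space.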

Note: VisualVM can be used to browse the heap dump. It is open source and available as a free download.

The following example shows how to analyze a heap dump for the NodeManager (NM) process.
Step 1) Identify the NM process

jps | grep -i nodemanager
8177 NodeManager
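When the PID is needed inside a script, grep can also match other JVMs whose class names merely contain the search string. A small sketch of a stricter lookup follows; the function name pid_of is hypothetical, and it assumes the default jps output of one "<pid> <main class>" pair per line.

```shell
# pid_of NAME: read `jps` output ("<pid> <main class>") on stdin and print the
# PID of every JVM whose main-class name equals NAME, ignoring case.
pid_of() {
  awk -v name="$1" 'tolower($2) == tolower(name) { print $1 }'
}
```

With the process shown above, jps | pid_of NodeManager would print 8177, which can then be passed to jmap.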

Step 2) Dump the Java heap (refer to Case 3)

jmap -dump:live,format=b,file=/tmp/dump.hprof 8177

Step 3) Open the heap dump in VisualVM
What to expect:

[User-added image]