Profiling memory usage of Python code

In a previous post, I explained how to use the Python profiler.  The profiler is great for finding out which parts of the code run the slowest or are called most often.  However, it doesn’t give any information about how much RAM is being consumed, or where it’s being consumed.  If your program needs so much memory that it starts swapping to disk, it can slow down by orders of magnitude.  Conversely, your code may run much faster if its working data fits entirely in the processor cache.  In this post, I will introduce two tools that can help you understand the RAM usage of your Python code.

Heapy

Heapy is a Python heap profiler that’s part of the guppy development suite.  To be honest, I don’t understand what the other parts of guppy are good for.  Heapy itself comes with very little documentation, and I didn’t really understand it until I found this heapy example page.  I’m not going to explain much more about how Heapy works, because it didn’t help me much.  The reason is that I write simulation code, and most of the memory usage consists of NumPy arrays.  NumPy arrays store their data as C arrays, which unfortunately don’t show up in the Heapy profiles.  Also, much of the array processing is done in compiled external libraries, whose allocations don’t show up in Heapy either.  Because of this, I had to find a tool that would profile the code at the binary level.
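To get a feel for how much memory sits in those C arrays, the size of an array’s data buffer can be estimated from its shape and element size.  The sketch below is only an illustration: the helper name array_buffer_bytes and the grid shape are made up for this example, and it uses plain Python arithmetic rather than NumPy itself.

```python
import math


def array_buffer_bytes(shape, itemsize=8):
    """Estimate the size in bytes of the raw data buffer behind an array.

    shape    -- tuple of dimension lengths, e.g. (1000, 1000)
    itemsize -- bytes per element (8 for float64, 4 for float32)
    """
    return math.prod(shape) * itemsize


# Hypothetical example: a 12800 x 1024 grid of float64 values.
n_bytes = array_buffer_bytes((12800, 1024), itemsize=8)
print(f"{n_bytes / 2**20:.1f} MiB")  # prints 100.0 MiB
```

For an array that already exists, NumPy reports the same quantity as the array’s nbytes attribute.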

Valgrind (massif)

Massif is a heap profiler that comes with the Valgrind suite of profiling tools.  It’s not a perfect tool for Python profiling, but it’s helped me understand a lot about my Python programs.  Executing massif is easy, although it does slow your code down by a factor of 10-30.  On the command line:

valgrind --tool=massif python test_flux_calculation.py

This produces a file named “massif.out.?????”, where the suffix is the process ID of the profiled run.  It is a text file, but not in a very readable format.  To get a more human-readable report, use ms_print:

ms_print massif.out.12076 > profile.txt
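The raw massif.out file can also be read from Python, which is handy for comparing the peak of several runs.  The sketch below assumes that each snapshot in the raw file is recorded with a line of the form time=... and a line of the form mem_heap_B=..., so check that against your own file.  The function names and the sample text are invented for illustration and are not real profiler output.

```python
import re

# Made-up sample in the assumed layout of a raw massif.out file.
SAMPLE = """\
desc: (none)
cmd: python test_flux_calculation.py
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=1000000
mem_heap_B=104857600
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=2
#-----------
time=2000000
mem_heap_B=52428800
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=empty
"""


def read_massif_snapshots(text):
    """Return a list of (time, heap_bytes) pairs, one per snapshot."""
    times = re.findall(r"^time=(\d+)$", text, re.MULTILINE)
    heaps = re.findall(r"^mem_heap_B=(\d+)$", text, re.MULTILINE)
    return [(int(t), int(h)) for t, h in zip(times, heaps)]


def peak_heap_mib(snapshots):
    """Largest heap size over all snapshots, in MiB."""
    return max(heap for _, heap in snapshots) / 2**20


snapshots = read_massif_snapshots(SAMPLE)
print(len(snapshots), "snapshots, peak", peak_heap_mib(snapshots), "MiB")
```

To use it on a real run, pass the contents of the output file (for example, the text read from massif.out.12076) instead of SAMPLE.  Note that this counts only the mem_heap_B field, so the peak may come out slightly lower than the one ms_print reports.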

You can read the Massif docs to understand the output, so I will just give some comments on its application to Python.  Massif takes “snapshots” of heap usage, some of which contain a detailed allocation graph that shows where the memory was allocated.  This is not very useful for a Python script, because most of the graph just shows calls inside the Python library.  You don’t want to profile a large application this way.  I find it better to write a little test script that captures the functionality of the most critical, RAM-eating sections of code.  To me, the most useful output is the plot of memory used over time (you can also plot memory currently used vs. total memory allocated).  By default, Massif measures “time” in instructions executed, which is why the horizontal axis is labeled Gi (billions of instructions).  Here is an example from a test script:

    MB
558.5^                                                 #.               ..  .
     |                                                 #:               ::  :
     |                                                 #:               ::  :
     |                                              @::#::       :::    :: @::
     |                                              @::#::       :::    :: @::
     |                                       ,.     @.:#::.      :::    ::.@::
     |                                       @:     @::#:::      :::    :::@::
     |                                       @:     @::#:::      :::    :::@::
     |                               ,       @:     @::#:::      :::    :::@::
     |                               @       @:     @::#:::      :::    :::@::
     |                               @       @:     @::#:::      :::    :::@::
     |                              @@       @:     @::#:::      :::    :::@::
     |                              @@       @:     @::#:::      :::    :::@::
     |                              @@       @:     @::#:::      :::    :::@::
     |                          ::::@@       @:     @::#:::      :::    :::@::
     |                          ::::@@       @:     @::#:::      :::    :::@::
     |                         ,::::@@       @:     @::#:::      :::    :::@::
     |                         @::::@@       @:     @::#:::      :::    :::@::
     |                         @::::@@       @:     @::#:::      :::    :::@::
     |                 ...,....@::::@@       @:     @::#:::      :::    :::@:.
   0 +----------------------------------------------------------------------->Gi
     0                                                                   7.121

You can see several large array allocations that push the peak above 550 MB of RAM.  I reduced the amount of memory required by allocating some arrays “on the fly” instead of storing them as instance variables.  Here is the profile after optimization:

    MB
406.1^                                               , ..                ,.::
     |                                               # ::                @:::
     |                                               # ::                @:::
     |                                               # ::                @:::
     |                                       @::     #::::       :::     @::::
     |                                       @::     #::::       :::     @::::
     |                                       @::     #::::       :::     @::::
     |                                       @::     #::::       :::     @::::
     |                               @:      @::     #:::::      :::     @::::
     |                               @:      @::     #:::::      :::     @::::
     |                               @:      @::     #:::::      :::     @::::
     |                              .@:      @::     #:::::      :::     @::::
     |                              :@:      @::     #:::::      :::     @::::
     |                              :@:      @::     #:::::      :::     @::::
     |                              :@:      @::     #:::::      :::     @::::
     |                          ,...:@:      @::     #:::::      :::     @::::
     |                          @::::@:      @::     #:::::      :::     @::::
     |                          @::::@:      @::     #:::::      :::     @::::
     |                          @::::@:      @::     #:::::      :::     @::::
     |              ,....,,@ @::@::::@:      @::     #:::::      :::     @::::
   0 +----------------------------------------------------------------------->Gi
     0                                                                   6.866

It’s not going to get much better than that, since I know from other simple test scripts that each NumPy array occupies about 100 MB of RAM, and the code needs three such arrays to function.  I find it helpful to progressively un-comment lines in the test script and see how the memory graph changes.
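The difference between storing a work array as an instance variable and allocating it “on the fly” can be sketched as follows.  This is not the simulation code from the profiles above: the class names are hypothetical, a bytearray stands in for a NumPy array, and the standard-library tracemalloc module stands in for Massif so that the example stays self-contained.

```python
import tracemalloc

SIZE = 10 * 2**20  # 10 MiB stand-in buffer; an arbitrary size for illustration


class StoredScratch:
    """Hypothetical solver that keeps its work buffer as an instance variable."""

    def __init__(self):
        self.scratch = bytearray(SIZE)  # lives as long as the object does

    def step(self):
        self.scratch[0] = 1  # placeholder for real work
        return self.scratch[0]


class OnTheFlyScratch:
    """Hypothetical solver that allocates its work buffer inside each call."""

    def step(self):
        scratch = bytearray(SIZE)  # allocated only for this call
        scratch[0] = 1  # placeholder for real work
        return scratch[0]  # buffer is released when step() returns


def retained_bytes(solver_class):
    """Bytes still allocated after one step(), while the solver is alive."""
    tracemalloc.start()
    solver = solver_class()
    solver.step()
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current


print("stored:    ", retained_bytes(StoredScratch), "bytes retained")
print("on the fly:", retained_bytes(OnTheFlyScratch), "bytes retained")
```

The trade-off is that the on-the-fly version pays for a fresh allocation on every call, so it only makes sense for buffers that are not needed between calls.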
