memory leak in python bindings

by Brian Warner-2

Hi all,

I spent some time a few weeks ago tracking down a memory leak in the metakit
python bindings. I've come up with a patch which seems to fix the issue. I'm
hoping that someone with more experience with the code than I have could take
a look at it and consider applying it upstream.

We're using metakit-2.4.9.5. It appears that the buggy 'PWOMappingMmbr
operator[]' (in python/scxx/PWOMapping.h) is still present in current metakit
CVS. To reproduce the memory leak, run the attached showleak.py and watch the
process's memory usage with top. Basically, every key that is passed to a
lookup operation gets its refcount incremented one time too many and is never
released. Doing a whole bunch of queries where you construct a new string for
each one makes the leak obvious pretty quickly.

A patch which adds the correct matching decref is attached.

cheers,
 -Brian

#! /usr/bin/python

import metakit
import os, os.path

STOREINFO_FORMAT = "storeinfo[name:S,tail:I,freelist:B,path:S,active:I]"
BLOCKDB_FORMAT = "block_db[_B[blockid:B,location:S,size:I,ctime:I,ttl:I,reason:I]]"


def get_memory_usage():
    # only works on linux
    for line in open("/proc/self/status", "r").readlines():
        if line.startswith("VmSize:"):
            tag,kb,units = line.split()
            return int(kb)


def _show_memory(c):
    print "%5d: %d kB" % (c, get_memory_usage())

class Getter:
    def __init__(self):
        _db_filename = "/tmp/foo.mk"
        self._db = metakit.storage(_db_filename, 1)
        self._blocks = self._db.getas(BLOCKDB_FORMAT).blocked().hash(self._db.getas("__mkhash__[_H:I,_R:I]"), 1)

    def query(self):
        # this creates a new 16-character string ("6" * 16) each time,
        # so every query hands the bindings a fresh key object
        bid = ("%c" % 0x36) * 16
        return self._blocks.find(blockid = bid) != -1

def _do_lots_of(func, count=500*1000, check_each=50*1000):
    """Run a function a lot, counting memory usage every once in a while.
    """
    until_next = 0
    for i in range(count):
        func()
        if not until_next:
            _show_memory(i)
            until_next = check_each
        until_next -= 1
    _show_memory(i+1)


#print "my pid:", os.getpid()
g = Getter()
# do a quarter million queries
_do_lots_of(g.query, 250000, 50000)
del g
_show_memory(9999)

--- old-metakit-2.4.9.5/python/scxx/PWOMapping.h 2006-08-09 00:18:33.000000000 -0700
+++ new-metakit-2.4.9.5/python/scxx/PWOMapping.h 2006-08-09 00:18:33.000000000 -0700
@@ -55,9 +59,11 @@
   //PyMapping_GetItemString
   //PyDict_GetItemString
   PWOMappingMmbr operator [] (const char* key) {
+    // note: this PyMapping call creates a new reference
     PyObject* rslt = PyMapping_GetItemString(_obj, (char*) key);
     if (rslt == NULL)
       PyErr_Clear();
+    Py_XDECREF(rslt); // PWOMappingMmbr claims its own refcnt, so decref now
     PWOString _key(key);
     return PWOMappingMmbr(rslt, *this, _key);
   };
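
The effect the patch addresses can be reproduced without metakit. The following is only a rough sketch, written for Python 3 and CPython, that calls PyMapping_GetItemString through ctypes and counts how many references accumulate when the result is never released. The helper names (lookup, refcount_growth) and the "blockid" mapping are made up for illustration; this is not the metakit code itself.

```python
import ctypes
import sys

api = ctypes.pythonapi
# Returning a raw pointer (c_void_p) keeps ctypes from managing the
# reference that the C API hands back, so ownership is left to us.
api.PyMapping_GetItemString.restype = ctypes.c_void_p
api.PyMapping_GetItemString.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.Py_DecRef.restype = None
api.Py_DecRef.argtypes = [ctypes.c_void_p]


def lookup(mapping, key, release):
    """Look up key; drop the new reference only if release is true."""
    raw = api.PyMapping_GetItemString(mapping, key)
    if release:
        api.Py_DecRef(raw)


def refcount_growth(release, lookups=1000):
    """Return how much the value's refcount grew after repeated lookups."""
    value = object()
    mapping = {"blockid": value}
    before = sys.getrefcount(value)
    for _ in range(lookups):
        lookup(mapping, b"blockid", release)
    return sys.getrefcount(value) - before


print("growth without decref:", refcount_growth(release=False))
print("growth with decref:   ", refcount_growth(release=True))
```

Each lookup that skips the decref leaves one extra reference behind, so the object can never be freed. With the matching decref the count returns to its starting value.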

_____________________________________________
Metakit mailing list  -  Metakit@...
http://www.equi4.com/mailman/listinfo/metakit

Re: memory leak in python bindings

by Jack Diederich

On Mon, Oct 02, 2006 at 05:52:29PM -0700, Brian Warner wrote:
>
> Hi all,
>
> I spent some time a few weeks ago tracking down a memory leak in the metakit
> python bindings. I've come up with a patch which seems to fix the issue. I'm
> hoping that someone with more experience with the code than me could take a
> look at it and consider applying it upstream.
>
<snip>

> --- old-metakit-2.4.9.5/python/scxx/PWOMapping.h 2006-08-09 00:18:33.000000000 -0700
> +++ new-metakit-2.4.9.5/python/scxx/PWOMapping.h 2006-08-09 00:18:33.000000000 -0700
> @@ -55,9 +59,11 @@
>    //PyMapping_GetItemString
>    //PyDict_GetItemString
>    PWOMappingMmbr operator [] (const char* key) {
> +    // note: this PyMapping call creates a new reference
>      PyObject* rslt = PyMapping_GetItemString(_obj, (char*) key);
>      if (rslt == NULL)
>        PyErr_Clear();
> +    Py_XDECREF(rslt); // PWOMappingMmbr claims its own refcnt, so decref now
>      PWOString _key(key);
>      return PWOMappingMmbr(rslt, *this, _key);
>    };

Looks good to me.  I tried looking for similar refleaks but
quickly got lost between the PyMapping_* calls and the PyDict_*
calls (the PyDict_GetItem* lookups return borrowed references,
while the abstract PyMapping_* interface returns new ones).  I
think the _obj member must always be a PyDict or there would be
a lot of core dumping going on.

-Jack
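
The reference-ownership difference between the two call families can be observed directly. This is a small sketch, assuming Python 3 on CPython and using ctypes; refcount_delta is a hypothetical helper, not part of metakit or scxx. It measures how much a single lookup changes the refcount of the stored value.

```python
import ctypes
import sys

api = ctypes.pythonapi
for name in ("PyDict_GetItemString", "PyMapping_GetItemString"):
    func = getattr(api, name)
    # Raw pointer result: ctypes does not touch the returned reference.
    func.restype = ctypes.c_void_p
    func.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.Py_DecRef.restype = None
api.Py_DecRef.argtypes = [ctypes.c_void_p]


def refcount_delta(func):
    """Return the refcount change caused by one lookup through func."""
    value = object()
    mapping = {"key": value}
    before = sys.getrefcount(value)
    raw = func(mapping, b"key")
    delta = sys.getrefcount(value) - before
    if delta:
        # The call gave us a new reference; release it so nothing leaks.
        api.Py_DecRef(raw)
    return delta


borrowed_delta = refcount_delta(api.PyDict_GetItemString)
new_delta = refcount_delta(api.PyMapping_GetItemString)
print("PyDict_GetItemString:   ", borrowed_delta)
print("PyMapping_GetItemString:", new_delta)
```

The concrete dict call leaves the count untouched (a borrowed reference), while the abstract mapping call adds one that the caller must release. Mixing the two styles in one wrapper layer is the kind of situation where a missing decref goes unnoticed.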