Potential memory problems in Zend_Db_Table

View: New views
4 Messages — Rating Filter:   Alert me  

Potential memory problems in Zend_Db_Table

by Catharsis :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Hello,

Now I know this is probably not the most efficient way to do things but I can't figure out just why this bit of code is using up the 64Mb Ive allocated to Zend.

First a little bit of background :
Im processing eBay Categories (which I aquire from their API), each category has an id and a parent_id field. If the category is a root node id and parent_id are the same. So to find what your breadcrumb trail is you have to work from the top node down. There are 14,254 records in the table, at most its 4 levels deep.

e.g. Foo > Bar > Wilma > Fred

So I made a Category class (which is a Zend_Db_Table_Abstract) and then use this to run through the data, code to iteratre through it looks a like this. It is designed to prep all the breadcrumb data so I don't have to do it on the fly.

	public function breadcrumbAction()
	{
		$c = new Category();
		$select = $c->select()->order( 'level' );
		$results = $c->fetchAll( $select );

		foreach( $results as $result )
		{
			$breadcrumb = array();
			$currentID = $result->id;
			$parentID = $result->parent_id;

                        // All the time the parent and current id are not the same
			while( $parentID !== $currentID )
			{
                                // Grab the parent row
				$parent = $c->select()->where( 'id = ?', $parentID );
				$row = $c->fetchRow( $parent );
                                // Reset ther vars
				$parentID = $row->parent_id;
				$currentID = $row->id;
                                // Put the parent name in the breadcrumb trail
				array_unshift( $breadcrumb, $row->name );
			}

			$where = $c->getAdapter()->quoteInto('id = ?', $result->id);
			$c->update( array( 'breadcrumb' => implode( ' > ', $breadcrumb ) ), $where );
		}

Now this gets through about 10k of records before dying, I got frustrated and changed the code and this works without dying horribly

	public function breadcrumbAction()
	{
		$select = $this->db->select()->from('categories', array('id','parent_id'))->order( 'level' );
		$stmt = $this->db->query( $select );
		$results = $stmt->fetchAll();

		foreach( $results as $result )
		{
			$breadcrumb = array();
			$currentID = $result['id'];
			$parentID = $result['parent_id'];

			while( $parentID !== $currentID )
			{
				$parent = $c->select()->where( 'id = ?', $parentID );
				$row = $c->fetchRow( $parent );
				$parentID = $row->parent_id;
				$currentID = $row->id;;
				array_unshift( $breadcrumb, $row->name );
			}

			$c = new Category();
			$where = $c->getAdapter()->quoteInto('id = ?', $result['id']);
			$c->update( array( 'breadcrumb' => implode( ' > ', $breadcrumb ) ), $where );
		}
	}
Someone care to explain to me why the first dies?

Re: Potential memory problems in Zend_Db_Table

by Bill Karwin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Catharsis wrote:
Now I know this is probably not the most efficient way to do things but I can't figure out just why this bit of code is using up the 64Mb Ive allocated to Zend.
I don't know for sure why it's exhausting your memory, but 64MB is too little memory for any modern web app.  The default setting for max memory in current versions of PHP is 128MB.  But I can offer some suggestions regarding inefficiency.

You fetched the whole collection of category data before the loop, and you even sorted it by level, but you aren't taking advantage of this.  You are preparing and executing extra SQL queries inside the loop when you don't have to (at least you should prepare outside the loop and execute with parameters inside the loop).  

Your second code sample even creates a new instance of the table object during each iteration of the loop.  This is an expensive operation to put inside a loop.

Here, try something like this:

public function breadcrumb()
{
  $c = new Category();
  $select = $c->select()->order( 'level' );
  $rowset = $c->fetchAll( $select );

  $hash = array();

  foreach ($rowset as $row)
  {
    // keep a running index of each row by its id
    $hash[$row->id] = $row;

    // since we're processing rows in level order,  
    // we know that we've seen the parent already
    $parent = $hash[$row->parent_id];

    $breadcrumb = array();
    if ($row != $parent)
    {
      // the parent's breadcrumb is guaranteed to be set,
      // since we've processed the tree in level order
      $breadcrumb[] = $parent->breadcrumb;
    }

    // append the current row's name to the array and join it
    // to the breadcrumb we got from the parent
    $breadcrumb[] = $row->name;
    $row->breadcrumb = implode( ' > ', $breadcrumb);

    // save the current row's breadcrumb and end the loop
    $row->save();
  }
}

There's also a way to store trees in an efficient manner, so it won't be so costly to produce the breadcrumb on the fly.

  CREATE TABLE CategoryAdjacency (
    ancestor INTEGER NOT NULL REFERENCES Category(id),
    descendant INTEGER NOT NULL REFERENCES Category(id),
    pathlength INTEGER NOT NULL,
    PRIMARY KEY (ancestor, descendant)
  );

Insert into this table every path through the tree, including paths of length zero..  For example, if A->B and B->C, then insert (A, A, 0), (A, B, 1), (A, C, 2), (B, B, 0), (B, C, 1), (C, C, 0).  Then you can query a breadcrumb for C in a single query like this:

  SELECT GROUP_CONCAT(c.name ORDER BY a.pathlength DESC SEPARATOR ' > ')
  FROM Category AS c JOIN CategoryAdjacency AS a ON c.id = a.ancestor
  WHERE a.descendant = 'C';

If you are using MySQL prior to version 4.1 or if you use another brand of database, you can't use GROUP_CONCAT().  Here's a workaround:

  SELECT c.name
  FROM Category AS c JOIN CategoryAdjacency AS a ON c.id = a.ancestor
  WHERE a.descendant = 'C'
  ORDER BY a.pathlength DESC;

Fetch with fetchCol() and implode the result in PHP code using ' > ' as the separator.

Regards,
Bill Karwin

Re: Potential memory problems in Zend_Db_Table

by Catharsis :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Bill Karwin wrote:
But I can offer some suggestions regarding inefficiency.
Yeah I went and re-worked this bit of code yesterday thanks to some input from a collegue, I was using the 'get it working first, then make it better' method of coding, it looks a little bit more like your suggestion, appreciate the help.  Still makes me think as to why it run out of memory though.  It the first bit of code had done something 4 levels deep so no reason as to why it should bomb out or use that much more.

Bill Karwin wrote:
There's also a way to store trees in an efficient manner, so it won't be so costly to produce the
breadcrumb on the fly.

..[snip]..
Yeah that's a good suggestion thanks again.  I was going to try and put together some test code to prove or disprove my memory leek theory but haven't had the time to do so.

Appreciate the help Bill
- Chris

Re: Potential memory problems in Zend_Db_Table

by Bill Karwin :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message


Catharsis wrote:
I was going to try and put together some test code to prove or disprove my memory leek theory but haven't had the time to do so.
It should take you about five minutes.  One easy way to do this would be to call memory_get_usage() inside the loop and output the value returned.

Also, do you have the Zend_Db_Profiler enabled while you are looping over 14,000+ queries?  The profiler instantiates a Zend_Db_Profiler_Query object for each SQL execution.  Since your max memory limit is so low, and part of it has naturally already been allocated, it'd be possible for 14,000 query profile objects to exceed the memory limit.

Regards,
Bill Karwin
LightInTheBox - Buy quality products at wholesale price