RelStorage: Clearing temp_store in replication-friendly way

View: New views
9 Messages — Rating Filter:   Alert me  

RelStorage: Clearing temp_store in replication-friendly way

by Stefan H. Holek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

I have made two checkins to relstorage 1.1 branch:

[1] http://svn.zope.org/relstorage/branches/1.1/?rev=88789&view=rev
[2] http://svn.zope.org/relstorage/branches/1.1/?rev=88790&view=rev

If someone wants to discuss them this is the thread. ;-)

Ad [1]:
This clearly is not the only problem with temporary tables and  
replication. What the change does, however, is turn a hard error -  
which stops replication in a way requiring operator intervention -  
into a recoverable failure. And all this in an application neutral  
way! <Applause here>

We hope to tackle the main issue (a.k.a. better-not-use-temporary-
tables-with-mysql-replication-at-all) in a later installment.

Cheers,
Stefan

P.S.: I am quite excited about the memcached support. Does it "just  
work"? I.e. can I run my ZODB in RAM now? ;-)

--
Anything that, in happening, causes itself to happen again,
happens again.  --Douglas Adams


_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Shane Hathaway :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Stefan H. Holek wrote:
> I have made two checkins to relstorage 1.1 branch:

It's really cool to have another contributor.  Thanks!  1.1c1 has
already been tagged, so I fixed the change log.

> [1] http://svn.zope.org/relstorage/branches/1.1/?rev=88789&view=rev
> [2] http://svn.zope.org/relstorage/branches/1.1/?rev=88790&view=rev
>
> If someone wants to discuss them this is the thread. ;-)
>
> Ad [1]:
> This clearly is not the only problem with temporary tables and  
> replication. What the change does, however, is turn a hard error -  
> which stops replication in a way requiring operator intervention -  
> into a recoverable failure. And all this in an application neutral  
> way! <Applause here>
>
> We hope to tackle the main issue (a.k.a. better-not-use-temporary-
> tables-with-mysql-replication-at-all) in a later installment.

Ok.  Conceptually, what we need is a way for each connection to write to
a scratch table that no other connection can see.  Is there a better way
to do that than temporary tables?

> P.S.: I am quite excited about the memcached support. Does it "just  
> work"? I.e. can I run my ZODB in RAM now? ;-)

I expect the new memcache support to be safe for everyone to use, but we
still require the main database to be connected at all times, since
memcache provides no ACID properties by itself.  We unfortunately can't
do obvious things like cache the current transaction ID for an object,
since that would break MVCC.  What we do cache is:

1. The current tid, given an oid and the transaction ID that is active
for the current connection.

2. The pickle given an oid and tid.

These should both help read-heavy databases, but might be detrimental
for write-heavy databases.  My performance test suite, which writes a
lot, produced slightly *worse* results with memcache enabled.

Shane
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Stephan Richter-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thursday 24 July 2008, Shane Hathaway wrote:

> > P.S.: I am quite excited about the memcached support. Does it "just  
> > work"? I.e. can I run my ZODB in RAM now? ;-)
>
> I expect the new memcache support to be safe for everyone to use, but we
> still require the main database to be connected at all times, since
> memcache provides no ACID properties by itself.  We unfortunately can't
> do obvious things like cache the current transaction ID for an object,
> since that would break MVCC.  What we do cache is:
>
> 1. The current tid, given an oid and the transaction ID that is active
> for the current connection.
>
> 2. The pickle given an oid and tid.
>
> These should both help read-heavy databases, but might be detrimental
> for write-heavy databases.  My performance test suite, which writes a
> lot, produced slightly *worse* results with memcache enabled.

We should get Brian Aker into this discussion, since is one of the main
architects of MySQL and one of the founders of memcached.

As Shane knows, I have some access to Brian these days, so we could have an
online meeting talking about it.

Regards,
Stephan
--
Stephan Richter
Web Software Design, Development and Training
Google me. "Zope Stephan Richter"
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Shane Hathaway :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Stephan Richter wrote:
> We should get Brian Aker into this discussion, since is one of the main
> architects of MySQL and one of the founders of memcached.
>
> As Shane knows, I have some access to Brian these days, so we could have an
> online meeting talking about it.

Cool.  Memcache could do a whole lot more for us if its API had a notion
of MVCC.  I think that might not be as complicated as it sounds. :-)

Shane
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by wichert :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Stephan Richter wrote:
On Thursday 24 July 2008, Shane Hathaway wrote:
  
P.S.: I am quite excited about the memcached support. Does it "just  
work"? I.e. can I run my ZODB in RAM now? ;-)
      
I expect the new memcache support to be safe for everyone to use, but we
still require the main database to be connected at all times, since
memcache provides no ACID properties by itself.  We unfortunately can't
do obvious things like cache the current transaction ID for an object,
since that would break MVCC.  What we do cache is:

1. The current tid, given an oid and the transaction ID that is active
for the current connection.

2. The pickle given an oid and tid.

These should both help read-heavy databases, but might be detrimental
for write-heavy databases.  My performance test suite, which writes a
lot, produced slightly *worse* results with memcache enabled.
    

We should get Brian Aker into this discussion, since is one of the main 
architects of MySQL and one of the founders of memcached.

As Shane knows, I have some access to Brian these days, so we could have an 
online meeting talking about it.
  

Well, if I can make a suggestion :)

>From what I understand a new memcache client library was written for MySQL (libmemcache is quite horrible). Python bindings for that library would improve many things, including this :)

Wichert.

-- 
Wichert Akkerman wichert@...   It is simple to make things.
http://www.wiggy.net/                  It is hard to make things simple.

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Stefan H. Holek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

This is not a problem of the ZODB or relstorage, but specific to how  
MySQL handles a) replication and b) temporary tables.

MySQL employs a log-based replication mechanism. This means the  
replication slave replays the commands performed on the master to  
keep its copy of the database in sync.

In MySQL, temporary tables live in RAM. So when a slave goes down,  
its copy of the table vanishes. When the slave comes back up the log  
may still contain commands using the temporary table however, causing  
execution to barf (and replication to stop). To fix this condition,  
and get replication going again, we have to perform a manual copy of  
the master's database to the affected slave.

We believe the solution is to avoid temporary tables altogether, and  
to recreate the needed semantics in a replication-safe way. The  
refactoring for temp_store could look like:

1) Create 'temp_store' as a permanent table.
2) Add a 'connid' column, storing the MySQL connection id.
3) Use CONNECTION_ID() in all inserts to populate 'connid'.
4) Qualify all updates and queries using 'temp_store' with
    WHERE connid = CONNECTION_ID() or equivalent.
5) Clear entries from 'temp_store' at transaction boundaries with
    DELETE FROM temp_store WHERE connid = CONNECTION_ID().

I plan to work on this in the near future.

Stefan


On 24. Jul 2008, at 18:33, Shane Hathaway wrote:

>> We hope to tackle the main issue (a.k.a. better-not-use-temporary-  
>> tables-with-mysql-replication-at-all) in a later installment.
>
> Ok.  Conceptually, what we need is a way for each connection to  
> write to a scratch table that no other connection can see.  Is  
> there a better way to do that than temporary tables?

--
Anything that, in happening, causes something else to happen,
causes something else to happen.  --Douglas Adams


_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Shane Hathaway :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

Stefan H. Holek wrote:
> In MySQL, temporary tables live in RAM. So when a slave goes down,  
> its copy of the table vanishes. When the slave comes back up the log  
> may still contain commands using the temporary table however, causing  
> execution to barf (and replication to stop). To fix this condition,  
> and get replication going again, we have to perform a manual copy of  
> the master's database to the affected slave.

I would expect that when a slave goes down, the slave must replay all of
the statements since the beginning of a transaction, including the
statements that create temporary tables.  Does it not work that way?

Do slaves respect transaction boundaries?  If they don't, then ZODB
clients of slaves will miss object updates, leading to sporadically
inconsistent ZODB caches, especially under load.

> We believe the solution is to avoid temporary tables altogether, and  
> to recreate the needed semantics in a replication-safe way. The  
> refactoring for temp_store could look like:
>
> 1) Create 'temp_store' as a permanent table.
> 2) Add a 'connid' column, storing the MySQL connection id.
> 3) Use CONNECTION_ID() in all inserts to populate 'connid'.
> 4) Qualify all updates and queries using 'temp_store' with
>     WHERE connid = CONNECTION_ID() or equivalent.
> 5) Clear entries from 'temp_store' at transaction boundaries with
>     DELETE FROM temp_store WHERE connid = CONNECTION_ID().
>
> I plan to work on this in the near future.

We can do that, but I hope it doesn't impact performance too much.  It
seems a shame to not use a RAM-based temporary table.  OTOH, I've tried
to structure RelStorage to allow changes like this without too much effort.

Shane

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Stefan H. Holek :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On 27. Jul 2008, at 18:48, Shane Hathaway wrote:

> I would expect that when a slave goes down, the slave must replay  
> all of the statements since the beginning of a transaction,  
> including the statements that create temporary tables.  Does it not  
> work that way?

Not in MySQL anyway. MySQL knows transactions at table-level only.  
For replication, the slave maintains the name of the master log file  
and a pointer to the next line to be read from it. That's all.

> Do slaves respect transaction boundaries?  If they don't, then ZODB  
> clients of slaves will miss object updates, leading to sporadically  
> inconsistent ZODB caches, especially under load.

Slaves replay BEGIN, COMMIT, and ROLLBACK statements issued on the  
master.

> We can do that, but I hope it doesn't impact performance too much.  
> It seems a shame to not use a RAM-based temporary table.  OTOH,  
> I've tried to structure RelStorage to allow changes like this  
> without too much effort.

I am primarily aiming for redundancy here, not performance.

And -BTW- you have and thanks for that.

Stefan

--
Anything that, in happening, causes itself to happen again,
happens again.  --Douglas Adams


_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev

Re: RelStorage: Clearing temp_store in replication-friendly way

by Stephan Richter-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thursday 24 July 2008, Shane Hathaway wrote:
> Cool.  Memcache could do a whole lot more for us if its API had a notion
> of MVCC.  I think that might not be as complicated as it sounds. :-)

Brian told me that MVCC is internally implemented. He will be in the office
next week and I'll have a more detailed conversation with him.

Regards,
Stephan
--
Stephan Richter
Web Software Design, Development and Training
Google me. "Zope Stephan Richter"
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@...
http://mail.zope.org/mailman/listinfo/zodb-dev
LightInTheBox - Buy quality products at wholesale price