ZODB Tips and Tricks

  Carlos de la Guardia
Most important performance tip

   Find the best size for the ZODB object cache.
   How to calculate best size: take amount of
    available memory and divide by one ;)
   Corollary: Increase RAM as a first step when
    you want better performance.
Looking inside the ZODB

   collective.zodbbrowser is a package that has to
    be installed inside Zope and provides access to
    all objects and their attributes, including
    callables and their source code.
   Eye is an external tool that can be used to
    browse the ZODB without having to install all
    the products it uses.
   You can always use the low-tech approach and
    use the debug mode of an instance to look at
    the values directly using Python.
Oh My God, a POSKey error!

   I feel your pain.
   Unfortunately, getting into the details of how to
    fix this would take a full talk.
   All is not lost, but you'll need to fire up debug
    mode and poke into the internals of your ZODB.
   Before anything else: MAKE A BACKUP!
   Some detailed information here:
    http://plonechix.blogspot.com/2009/12/definitive-guide-to-poskeyerror.html
Getting rid of persistent utilities

   Older products that you uninstall can sometimes
    leave persistent utilities behind.
   This will crash your site, because Zope will try
    to import code that is no longer there.
   There is a package that can help (but
    remember, backup first!):

    http://pypi.python.org/pypi/wildcard.fixpersistentutilities/
Recovering objects

   Brute force way: truncate the database.
   The civilized way: use zc.beforestorage to get a read-only
    view of the database as it was before a given time:

    %import zc.beforestorage
    <before>
      before 2008-12-08T10:29:03
      <filestorage>
        path /zope/var/filestorage/Data.fs
      </filestorage>
    </before>
Searching for transactions

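import base64

from ZODB.FileStorage import FileStorage
from ZODB.TimeStamp import TimeStamp

# Open read-only so the scan cannot modify the database
storage = FileStorage('/path/to/data.fs', read_only=True)

# Transaction timestamps are in GMT
earliest = TimeStamp(2010, 2, 26, 6, 0, 0)

for txn in storage.iterator():
    tid = TimeStamp(txn.tid)
    if tid > earliest:
        # user, description, time in seconds since the epoch, encoded tid
        print(txn.user, txn.description, tid.timeTime(),
              base64.b64encode(txn.tid))
        for rec in txn:
            print(rec.pos)  # file offset of each data record

storage.close()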
RelStorage
   A storage implementation for ZODB that stores pickles in a
    relational database.
   It is a drop-in replacement for FileStorage and ZEO.
   Designed for high volume sites: multiple ZODB instances
    can share the same database. This is similar to ZEO, but
    RelStorage does not require ZEO.
   According to some tests, RelStorage handles high
    concurrency better than the standard combination of ZEO
    and FileStorage.
   RelStorage starts quickly regardless of database size.
   Supports undo, packing, and filesystem-based ZODB
    blobs.
   Capable of failover to replicated SQL databases.
Interesting packages

   zodbshootout – benchmark ZEO vs RelStorage
    with different backends
   zodbupdate – update moved or renamed
    classes
   dm.historical – get history of objects in the
    ZODB
   dm.zodb.repair – restore lost objects from a
    backup to a target database
   zc.zodbactivitylog – provides an activity log that
    lets you track database activity
Beginner tips for ZODB development
    Do not use the root to store objects. It doesn't scale.
    Learn about BTrees.
    Avoid storing plain mutable objects (lists, dicts) as
     attributes; use persistent sub-objects instead.
    If your objects are bigger than 64 KB, you need to divide
     them or use blobs.
    Avoid conflicts: organize application threads and data
     structures so that objects are unlikely to be modified
     by multiple threads at the same time.
    Use data structures that support conflict resolution.
    To resolve conflicts, retry. The developer is in charge
     of managing concurrency, not the database.
Tips From the Experts

   I asked some of the old-time Zope developers for some simple
    tips for using the ZODB. Here are their responses.
David Glick

   “If you want instances of a class to have a new attribute,
    add it as a class attribute so that existing instances get
    a reasonable default.”
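The tip works because attribute lookup falls back to the class when the instance state has no such key. Here is a small sketch in plain Python; the Document class and its fields are invented, and an "old" instance is simulated by restoring a state dictionary saved before the attribute existed.

```python
class Document(object):
    # Attribute added in a later release. The class-level value is what
    # instances stored before that release will report.
    subtitle = ""

    def __init__(self, title, subtitle=""):
        self.title = title
        self.subtitle = subtitle


# Simulate loading an instance whose stored state predates `subtitle`:
# the constructor is not run, only the saved state is restored.
old = Document.__new__(Document)
old.__dict__.update({"title": "Old page"})

new = Document("New page", subtitle="With a subtitle")

print(repr(old.subtitle), repr(new.subtitle))
```

Keep such defaults immutable (a string, a number, None): a list or dict at class level would be shared by every instance that falls back to it.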
Lennart Regebro

   “Products.ZMIntrospection is a quick way to look at all the
    fields of any ZODB object from the ZMI.”
Alec Mitchell

   “If you need to store arbitrary key/value pairs: use
    PersistentDict when the amount of data is "small" and/or you
    tend to require all the data in a given transaction; use
    OOBTree (and friends) when you have a large number of keys
    and tend to only need a small subset of them in a
    transaction.”
Alec Mitchell

   “If you store data in one of the BTree structures and you
    need to count the number of entries, don't use len(), ever.
    Use a BTrees.Length object to keep track of the count
    separately.”
Alan Runyan

   “Use zc.zlibstorage for text-heavy databases; it's a 60-70%
    storage win for those records.”
zc.zlibstorage

Standalone:

%import zc.zlibstorage
<zodb>
 <zlibstorage>
  <filestorage>
   path data.fs
  </filestorage>
 </zlibstorage>
</zodb>

With ZEO:

%import zc.zlibstorage
<zeo>
 address 8100
</zeo>
<serverzlibstorage>
 <filestorage>
  path data.fs
 </filestorage>
</serverzlibstorage>
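To get a feel for the possible saving before changing any configuration, you can pickle a sample record and compress it with the standard zlib module. This is not how zc.zlibstorage is wired internally, just a back-of-the-envelope check, and the sample record is made up (and far more repetitive than real text, so it compresses better than real content will).

```python
import pickle
import zlib

# Made-up, text-heavy record standing in for a stored object state.
record = {
    "title": "ZODB Tips and Tricks",
    "body": "The ZODB stores pickled objects. " * 200,
}

raw = pickle.dumps(record, protocol=2)
packed = zlib.compress(raw)
saving = 1.0 - float(len(packed)) / len(raw)

print("raw=%d bytes, compressed=%d bytes, saved=%.0f%%"
      % (len(raw), len(packed), saving * 100))
```

Running the same measurement over a sample of your own records gives a more honest estimate than any rule of thumb.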
Alan Runyan

   “Use zc.zodbdgc, awesome library which provides inverse graph
    of ZODB tree so you can see what leaves are referenced
    from.”
zc.zodbdgc

   To use zc.zodbdgc, just add a part to the buildout
    that pulls in the egg:

[zodbdgc]
recipe = zc.recipe.egg
eggs = ${instance:eggs}

   You can then call the multi-zodb-gc and
    multi-zodb-checkrefs scripts.
Chris McDonough

   “Use the "BTrees.Length" object to implement counters in the
    ZODB. It has conflict resolution built in to it that has
    the potential to eliminate conflict errors (as opposed to a
    normal integer counter attached to a persistent object).”
Tres Seaver

   “If you find yourself under intense fire, and everything
    around you is crumbling, don't despair, just increase the
    ZEO client cache size.”
Which cache is which

   Don't confuse the ZEO client cache with the ZODB
    object cache.
   The ZODB object cache stores objects in memory for
    faster responses. You set it with zodb-cache-size in a
    buildout.
   The ZEO client cache is used first when an object is
    not in the ZODB object cache and avoids round trips
    to the ZEO server. You set it with
    zeo-client-cache-size in a buildout.
   You can enable cache tracing for analysis by setting
    the ZEO_CACHE_TRACE environment variable. More
    information at:
    http://wiki.zope.org/ZODB/trace.html
Jim Fulton

   “Avoid non-persistent mutable objects.”
Jim Fulton

   “Don't be clever.”
Thank You!

         Email: cguardia@yahoo.com

http://zodb.readthedocs.org/en/latest/index.html
