Caching techniques in
                                 python
                                Michael Domanski
                                europython 2010


Thursday, 22 July 2010
who I am

                     • python developer, professionally for a few
                          years now
                     • experienced also in c and objective-c
                     • currently working for 10clouds.com


Interesting intro

                     • a bit of theory
                     • common patterns
                     • common problems
                     • common solutions

How I think about
                               cache

                     • imagine a giant dict storing all your data
                     • you have to manage all data manually
                     • or provide some automated behaviour


similar to....

                     • manual memory management in c
                     • cache is memory
                     • and you have to control it manually


profits


                     • improved performance
                     • ...?


problems


                     • managing any type of memory is hard
                     • automation often has to be custom-built
                          each time




common patterns



memoization



• very old pattern (circa 1968)
                     • we owe the name to Donald Michie



how it works


                     • we associate input with output, and store
                          it somewhere
                     • based on the assumption that for a given
                          input, output is always the same
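
A minimal sketch of the idea above, assuming the function takes only hashable positional arguments. The names `memoize`, `square` and `calls` are illustrative and do not come from the talk; the arguments themselves serve as the cache key.

```python
import functools

def memoize(func):
    """Cache results of `func`, keyed by its positional arguments."""
    results = {}  # maps argument tuple -> previously computed output

    @functools.wraps(func)
    def wrapper(*args):
        if args not in results:
            results[args] = func(*args)
        return results[args]

    wrapper.results = results  # exposed only so the cache can be inspected
    return wrapper

calls = []  # records every real invocation, to show when the cache is hit

@memoize
def square(n):
    calls.append(n)
    return n * n

first = square(4)
second = square(4)   # served from the cache, the body of square is not run again
third = square(5)    # new input, so the function runs
```

This only stays correct while the assumption from the slide holds: the same input must always produce the same output.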




code example
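                 CACHE_DICT = {}  # process-wide store: key -> cached value

                 def cached(key):
                     """Cache the decorated function's result under a fixed key."""
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             # call the function only on a miss; note that the
                             # arguments are not part of the key
                             if key not in CACHE_DICT:
                                 CACHE_DICT[key] = func(*args, **kwargs)
                             return CACHE_DICT[key]
                         return arg_wrapper
                     return func_wrapper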




what if output can
                               change?

                     • our pattern is still useful
                     • we simply need to add something


cache invalidation



There are only two hard problems in Computer
                           Science: cache invalidation and naming things
                                                                  Phil Karlton


• basically, we update data in cache
                     • we need to know when and what to
                          change

                     • the more granular you want to be, the
                          harder it gets




code example
                 def invalidate(key):
                     """Remove a key from the cache, reporting keys that are missing."""
                     try:
                         del CACHE_DICT[key]
                     except KeyError:
                         print("someone tried to invalidate a missing key: %s" % key)




common problems



invalidating too much/
                                not enough

                     • flushing all data any time something changes
                     • not flushing cache at all
                     • tragic effects


@cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 @cached('big_key1')
                 def some_bigger_function():
                     """
                     this function depends on big_key1, key1 and key2
                     """
                     def inner_workings():
                         db_set(1, 'something totally new')
                     #######
                     ##   imagine 100 lines of code here :)
                     ######
                     inner_workings()

                     return [simple_function1(), simple_function2()]

                 if __name__ == '__main__':
                     simple_function1()
                     simple_function2()
                     a,b = some_bigger_function()
                     assert a == db_get(id=1), "this fails because we didn't invalidate the cache properly"
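
One possible way out, sketched under stated assumptions: invalidate at the place where the data is written, and drop only the keys built from the changed row. The in-memory `DB` dict and the `KEYS_FOR_ROW` mapping are hypothetical stand-ins, not part of the talk's code.

```python
CACHE = {}                      # process-local cache
DB = {1: 'old', 2: 'other'}     # stand-in for a real database

# Which cache keys are derived from which database row (hypothetical mapping).
KEYS_FOR_ROW = {1: ['key1'], 2: ['key2']}

def db_get(id):
    return DB[id]

def db_set(id, value):
    """Write to the database, then drop only the cache keys built from that row."""
    DB[id] = value
    for key in KEYS_FOR_ROW.get(id, []):
        CACHE.pop(key, None)    # no error if the key was never cached

def cached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE:
                CACHE[key] = func(*args, **kwargs)
            return CACHE[key]
        return arg_wrapper
    return func_wrapper

@cached('key1')
def simple_function1():
    return db_get(id=1)

@cached('key2')
def simple_function2():
    return db_get(id=2)

simple_function1()                      # caches 'old' under key1
simple_function2()                      # caches 'other' under key2
db_set(1, 'something totally new')      # invalidates key1 only
fresh = simple_function1()              # recomputed from the database
```

Here `key2` survives the write, so nothing is flushed unnecessarily, and `key1` is never served stale.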




invalidating too soon/
                                  too late

                     • your cache has to be synchronised with your
                          db
                     • sometimes very hard to spot
                     • leads to tragic mistakes


@cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 def some_bigger_function():
                     db_set(1, 'something')
                     value = simple_function1()
                     db_set(2, 'something else')
                     #### now we know we used 2 cached functions so....
                     invalidate('key1')
                     invalidate('key2')
                     #### now we know we are safe, but for a price
                     return simple_function2()

                 if __name__ == '__main__':
                     some_bigger_function()




superposition of
                               dependencies
                     • a somewhat less obvious problem
                     • eventually you will start caching the results of
                          computation
                     • you have to know very precisely what
                          your data depends on


@cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 @cached('key')
                 def some_bigger_function():

                     return {
                         '1': simple_function1(),
                         '2': simple_function2(),
                         '3': db_get(id=3),
                     }

                 if __name__ == '__main__':
                     simple_function1()
                     # somewhere else
                     db_set(1, 'foobar')
                     # and again
                     db_set(3, 'bazbar')
                     invalidate('key')
                     # ooops, we forgot something
                     data = some_bigger_function()
                     assert data['1'] == db_get(id=1), "this fails because we didn't manage to invalidate all the keys"
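
One way to avoid forgetting a key is to record the dependencies explicitly and let invalidation follow them. The sketch below is an assumption-laden illustration: `DEPENDENTS` and `depends_on` are hypothetical names, and the dependency graph is assumed to contain no cycles.

```python
CACHE = {}

# child key -> keys of cached results computed from it (hypothetical registry)
DEPENDENTS = {}

def depends_on(parent_key, *child_keys):
    """Record that `parent_key` is computed from each of `child_keys`."""
    for child in child_keys:
        DEPENDENTS.setdefault(child, set()).add(parent_key)

def invalidate(key):
    """Drop `key` and, recursively, everything computed from it."""
    CACHE.pop(key, None)
    for parent in DEPENDENTS.get(key, ()):
        invalidate(parent)

depends_on('key', 'key1', 'key2')

CACHE.update({'key1': 'a', 'key2': 'b', 'key': {'1': 'a', '2': 'b'}})
invalidate('key1')          # also removes 'key', which was built from key1
remaining = sorted(CACHE)
```

Invalidating `key1` now takes the composite `key` with it, while `key2` stays cached.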




summing up
                     • know your data....
                     • be aware of what you cache and when
                     • take care when using cached data in
                          computation




common solutions



process level cache



why?

                     • very fast access
                     • simple to implement
                     • very effective as long as you’re using a single
                          process
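
For comparison, a process-level cache can also be had from the standard library: `functools.lru_cache` (Python 3.2 and later, so newer than this talk) keeps results in the memory of the current process. The function `load_user` and the list `calls` are illustrative names only.

```python
import functools

calls = []

@functools.lru_cache(maxsize=128)
def load_user(user_id):
    calls.append(user_id)          # stands in for an expensive lookup
    return {'id': user_id}

load_user(1)
load_user(1)                       # hit: served from this process's memory
info = load_user.cache_info()      # hit and miss counters kept by lru_cache
load_user.cache_clear()            # flush everything cached for this function
load_user(1)                       # miss again after the flush
```

Each process has its own copy of this cache, which is why the approach stops being effective once several processes serve the same data.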




clever tricks with dicts



code example
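                 CACHE_DICT = {}  # process-wide store: key -> cached value

                 def cached(key):
                     """Cache the decorated function's result under a fixed key."""
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             # call the function only on a miss; note that the
                             # arguments are not part of the key
                             if key not in CACHE_DICT:
                                 CACHE_DICT[key] = func(*args, **kwargs)
                             return CACHE_DICT[key]
                         return arg_wrapper
                     return func_wrapper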




invalidation



code example
                 def invalidate(key):
                     """Remove a key from the cache, reporting keys that are missing."""
                     try:
                         del CACHE_DICT[key]
                     except KeyError:
                         print("someone tried to invalidate a missing key: %s" % key)




application level cache



memcache



• battle tested
                     • scales
                     • fast
                     • supports a few cool features
                     • behaves a lot like dict
                     • supports time-based expiration
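
To make time-based expiration concrete, here is an in-process sketch of the idea. It is not how memcached is implemented; `TTLCache` and its injectable `clock` are hypothetical names used only for illustration.

```python
import time

class TTLCache:
    """Dict-like store whose entries expire `ttl` seconds after being set."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock          # injectable so expiry can be shown without sleeping
        self._data = {}             # key -> (expires_at, value)

    def set(self, key, value):
        self._data[key] = (self.clock() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._data[key]     # lazily evict on read
            return default
        return value

now = [0.0]                         # fake clock for the demonstration
cache = TTLCache(ttl=30, clock=lambda: now[0])
cache.set('greeting', 'hello')
before = cache.get('greeting')      # still fresh
now[0] = 31.0                       # pretend 31 seconds passed
after = cache.get('greeting')       # expired, so the default is returned
```

Expiry by time needs no knowledge of what changed, at the price of serving data that may be stale for up to `ttl` seconds.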
libraries?

                     • python-memcache
                     • python-libmemcache
                     • python-cmemcache
                     • pylibmc

why no benchmarks

                     • not the point of this talk :)
                     • benchmarks are generic, caching is specific
                     • pick your flavour, think for yourself


code example
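                 import memcache  # module provided by the python-memcached package

                 cache = memcache.Client(['localhost:11211'])

                 def memcached(key):
                     """Cache the decorated function's result in memcached under `key`."""
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             value = cache.get(str(key))
                             # compare with None so that cached falsy values
                             # (0, '', []) still count as hits
                             if value is None:
                                 value = func(*args, **kwargs)
                                 cache.set(str(key), value)
                             return value
                         return arg_wrapper
                     return func_wrapper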




invalidation



code example
                 def mem_invalidate(key):
                     cache.set(str(key), None)




batch key management



• what if I don’t want to expire each key
                          manually?

                     • that’s a lot to remember
                     • and we have to be careful :(


groups?

                     • group keys into sets
                     • which are tied to one key per set
                     • expire one key, instead of twenty


how to get there?

                     • store some extra data
                     • you can store dicts in cache
                     • and cache behaves like dict
                     • so it’s a case of comparing keys and values

                 # get_grouped is an illustrative wrapper, added so that `return` is valid
                 def get_grouped(memcached_client, key='some_key', group='some_group'):
                     # we start with specified key and group,
                     # now retrieve some data from memcached
                     # (get_multi takes a list of keys)
                     data = memcached_client.get_multi([key, group])
                     # now data is a dict that should look like
                     # {'some_key': {'group_key': '1234',
                     #               'value': 'some_value'},
                     #  'some_group': '1234'}
                     if data and (key in data) and (group in data):
                         if data[key]['group_key'] == data[group]:
                             return data[key]['value']
                     return None




          import functools

          # `cache` is the memcache client; `tools.make_key` and `make_group_value`
          # are helpers that are not shown on the slides
          def cached(key, group_key='', exp_time=0):
              # we don't want to mix time based and event based expiration models
              if group_key:
                  assert exp_time == 0, "can't set expiration time for grouped keys"

              def f_wrapper(func):
                  @functools.wraps(func)  # keeps func.__name__ and the docstring
                  def arg_wrapper(*args, **kwargs):
                      value = None
                      if group_key:
                          data = cache.get_multi([tools.make_key(group_key), tools.make_key(key)])
                          data_dict = data.get(tools.make_key(key))
                          if data_dict:
                              value = data_dict['value']
                              # a missing or changed group value means the entry is stale
                              if data_dict['group_value'] != data.get(tools.make_key(group_key)):
                                  value = None
                      else:
                          # read with the same key that is used when writing
                          value = cache.get(tools.make_key(key))
                      if value is None:
                          value = func(*args, **kwargs)
                          if exp_time:
                              cache.set(tools.make_key(key), value, exp_time)
                          elif not group_key:
                              cache.set(tools.make_key(key), value)
                          else:  # exp_time not set and we have group_keys
                              group_value = make_group_value(group_key)
                              data_dict = {'value': value, 'group_value': group_value}
                              cache.set_multi({tools.make_key(key): data_dict,
                                               tools.make_key(group_key): group_value})
                      return value
                  return arg_wrapper
              return f_wrapper
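
The group mechanism can be tried without a memcached server. The sketch below uses `FakeClient`, a hypothetical in-memory stand-in that offers only `get_multi`, `set_multi` and `set`. The helpers `group_get`, `group_set` and `invalidate_group` are illustrative names, and reusing an existing group value on write is a design choice of this sketch, not necessarily what the talk's code does.

```python
import itertools

class FakeClient:
    """In-memory stand-in exposing the few memcached-style calls used here."""
    def __init__(self):
        self.store = {}
    def get_multi(self, keys):
        return {k: self.store[k] for k in keys if k in self.store}
    def set_multi(self, mapping):
        self.store.update(mapping)
    def set(self, key, value):
        self.store[key] = value

client = FakeClient()
_versions = itertools.count(1)      # source of fresh group values

def group_get(key, group):
    """Return the cached value only if it was stored under the current group value."""
    data = client.get_multi([key, group])
    entry = data.get(key)
    if entry is not None and group in data and entry['group_value'] == data[group]:
        return entry['value']
    return None

def group_set(key, group, value):
    """Store `value` tagged with the group's current value, creating one if needed."""
    group_value = client.get_multi([group]).get(group)
    if group_value is None:
        group_value = next(_versions)
    client.set_multi({key: {'value': value, 'group_value': group_value},
                      group: group_value})

def invalidate_group(group):
    """Change the group value: every entry tagged with the old one becomes stale."""
    client.set(group, next(_versions))

group_set('user:1', 'users', 'alice')
group_set('user:2', 'users', 'bob')
hit = (group_get('user:1', 'users'), group_get('user:2', 'users'))
invalidate_group('users')           # one write expires both entries
miss = (group_get('user:1', 'users'), group_get('user:2', 'users'))
```

The stale entries are never deleted explicitly; they simply stop matching the group value and are overwritten on the next write.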



questions?



code samples @
                 http://github.com/mdomans/europython2010

follow me

                 twitter: mdomans
                 blog:    blog.mdomans.com


czwartek, 22 lipca 2010

More Related Content

What's hot

Security Research2.0 - FIT 2008
Security Research2.0 - FIT 2008Security Research2.0 - FIT 2008
Security Research2.0 - FIT 2008Raffael Marty
 
IT Data Visualization - Sumit 2008
IT Data Visualization - Sumit 2008IT Data Visualization - Sumit 2008
IT Data Visualization - Sumit 2008Raffael Marty
 
VJET bringing the best of Java and JavaScript together
VJET bringing the best of Java and JavaScript togetherVJET bringing the best of Java and JavaScript together
VJET bringing the best of Java and JavaScript togetherJustin Early
 
Easy undo.key
Easy undo.keyEasy undo.key
Easy undo.keyzachwaugh
 
Database madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyDatabase madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyJaime Buelta
 
Rails' Next Top Model
Rails' Next Top ModelRails' Next Top Model
Rails' Next Top ModelAdam Keys
 
Drupal Entities - Emerging Patterns of Usage
Drupal Entities - Emerging Patterns of UsageDrupal Entities - Emerging Patterns of Usage
Drupal Entities - Emerging Patterns of UsageRonald Ashri
 
Zend Framework 1 + Doctrine 2
Zend Framework 1 + Doctrine 2Zend Framework 1 + Doctrine 2
Zend Framework 1 + Doctrine 2Ralph Schindler
 
BDD - Behavior Driven Development Webapps mit Groovy Spock und Geb
BDD - Behavior Driven Development Webapps mit Groovy Spock und GebBDD - Behavior Driven Development Webapps mit Groovy Spock und Geb
BDD - Behavior Driven Development Webapps mit Groovy Spock und GebChristian Baranowski
 
Core Data Performance Guide Line
Core Data Performance Guide LineCore Data Performance Guide Line
Core Data Performance Guide LineGagan Vishal Mishra
 
YUI3 Modules
YUI3 ModulesYUI3 Modules
YUI3 Modulesa_pipkin
 

What's hot (15)

Security Research2.0 - FIT 2008
Security Research2.0 - FIT 2008Security Research2.0 - FIT 2008
Security Research2.0 - FIT 2008
 
hibernate
hibernatehibernate
hibernate
 
Build your own entity with Drupal
Build your own entity with DrupalBuild your own entity with Drupal
Build your own entity with Drupal
 
IT Data Visualization - Sumit 2008
IT Data Visualization - Sumit 2008IT Data Visualization - Sumit 2008
IT Data Visualization - Sumit 2008
 
Django - sql alchemy - jquery
Django - sql alchemy - jqueryDjango - sql alchemy - jquery
Django - sql alchemy - jquery
 
VJET bringing the best of Java and JavaScript together
VJET bringing the best of Java and JavaScript togetherVJET bringing the best of Java and JavaScript together
VJET bringing the best of Java and JavaScript together
 
Easy undo.key
Easy undo.keyEasy undo.key
Easy undo.key
 
Spock and Geb in Action
Spock and Geb in ActionSpock and Geb in Action
Spock and Geb in Action
 
Database madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemyDatabase madness with_mongoengine_and_sql_alchemy
Database madness with_mongoengine_and_sql_alchemy
 
Rails' Next Top Model
Rails' Next Top ModelRails' Next Top Model
Rails' Next Top Model
 
Drupal Entities - Emerging Patterns of Usage
Drupal Entities - Emerging Patterns of UsageDrupal Entities - Emerging Patterns of Usage
Drupal Entities - Emerging Patterns of Usage
 
Zend Framework 1 + Doctrine 2
Zend Framework 1 + Doctrine 2Zend Framework 1 + Doctrine 2
Zend Framework 1 + Doctrine 2
 
BDD - Behavior Driven Development Webapps mit Groovy Spock und Geb
BDD - Behavior Driven Development Webapps mit Groovy Spock und GebBDD - Behavior Driven Development Webapps mit Groovy Spock und Geb
BDD - Behavior Driven Development Webapps mit Groovy Spock und Geb
 
Core Data Performance Guide Line
Core Data Performance Guide LineCore Data Performance Guide Line
Core Data Performance Guide Line
 
YUI3 Modules
YUI3 ModulesYUI3 Modules
YUI3 Modules
 

Similar to Caching Techniques in Python: Memoization, Invalidation and Process Level Cache

Dojo for programmers (TXJS 2010)
Dojo for programmers (TXJS 2010)Dojo for programmers (TXJS 2010)
Dojo for programmers (TXJS 2010)Eugene Lazutkin
 
Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构Chappell.Wat
 
Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构taobao.com
 
Drupal 7: What's In It For You?
Drupal 7: What's In It For You?Drupal 7: What's In It For You?
Drupal 7: What's In It For You?karschsp
 
CapitalCamp Features
CapitalCamp FeaturesCapitalCamp Features
CapitalCamp FeaturesPhase2
 
TC39: How we work, what we are working on, and how you can get involved (dotJ...
TC39: How we work, what we are working on, and how you can get involved (dotJ...TC39: How we work, what we are working on, and how you can get involved (dotJ...
TC39: How we work, what we are working on, and how you can get involved (dotJ...Igalia
 
So you want to liberate your data?
So you want to liberate your data?So you want to liberate your data?
So you want to liberate your data?Mogens Heller Grabe
 
Pitfalls of Continuous Deployment
Pitfalls of Continuous DeploymentPitfalls of Continuous Deployment
Pitfalls of Continuous Deploymentzeeg
 
Unbundling the JavaScript module bundler - DublinJS July 2018
Unbundling the JavaScript module bundler - DublinJS July 2018Unbundling the JavaScript module bundler - DublinJS July 2018
Unbundling the JavaScript module bundler - DublinJS July 2018Luciano Mammino
 
D3 in Jupyter : PyData NYC 2015
D3 in Jupyter : PyData NYC 2015D3 in Jupyter : PyData NYC 2015
D3 in Jupyter : PyData NYC 2015Brian Coffey
 
Objective-C: a gentle introduction
Objective-C: a gentle introductionObjective-C: a gentle introduction
Objective-C: a gentle introductionGabriele Petronella
 

Similar to Caching Techniques in Python: Memoization, Invalidation and Process Level Cache (12)

Dojo for programmers (TXJS 2010)
Dojo for programmers (TXJS 2010)Dojo for programmers (TXJS 2010)
Dojo for programmers (TXJS 2010)
 
Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构
 
Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构Bookmarklet型(云端)应用的前端架构
Bookmarklet型(云端)应用的前端架构
 
Drupal 7: What's In It For You?
Drupal 7: What's In It For You?Drupal 7: What's In It For You?
Drupal 7: What's In It For You?
 
CapitalCamp Features
CapitalCamp FeaturesCapitalCamp Features
CapitalCamp Features
 
TC39: How we work, what we are working on, and how you can get involved (dotJ...
TC39: How we work, what we are working on, and how you can get involved (dotJ...TC39: How we work, what we are working on, and how you can get involved (dotJ...
TC39: How we work, what we are working on, and how you can get involved (dotJ...
 
Active domain
Active domainActive domain
Active domain
 
So you want to liberate your data?
So you want to liberate your data?So you want to liberate your data?
So you want to liberate your data?
 
Pitfalls of Continuous Deployment
Pitfalls of Continuous DeploymentPitfalls of Continuous Deployment
Pitfalls of Continuous Deployment
 
Unbundling the JavaScript module bundler - DublinJS July 2018
Unbundling the JavaScript module bundler - DublinJS July 2018Unbundling the JavaScript module bundler - DublinJS July 2018
Unbundling the JavaScript module bundler - DublinJS July 2018
 
D3 in Jupyter : PyData NYC 2015
D3 in Jupyter : PyData NYC 2015D3 in Jupyter : PyData NYC 2015
D3 in Jupyter : PyData NYC 2015
 
Objective-C: a gentle introduction
Objective-C: a gentle introductionObjective-C: a gentle introduction
Objective-C: a gentle introduction
 

Recently uploaded

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfngoud9212
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Bluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdfBluetooth Controlled Car with Arduino.pdf
Bluetooth Controlled Car with Arduino.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

Caching Techniques in Python: Memoization, Invalidation and Process Level Cache

  • 1. Caching techinques in python Michael Domanski europython 2010 czwartek, 22 lipca 2010
  • 2. who I am • python developer, professionally for a few years now • experienced also in c and objective-c • currently working for 10clouds.com czwartek, 22 lipca 2010
  • 3. Interesting intro • a bit of theory • common patterns • common problems • common solutions czwartek, 22 lipca 2010
  • 4. How I think about cache • imagine a giant dict storing all your data • you have to manage all data manually • or provide some automated behaviour czwartek, 22 lipca 2010
  • 5. similar to.... • manual memory managment in c • cache is memory • and you have to controll it manually czwartek, 22 lipca 2010
  • 6. profits • improved performance • ...? czwartek, 22 lipca 2010
  • 7. problems • managing any type of memory is hard • automation often have to be done custom each time czwartek, 22 lipca 2010
  • 10. • very old pattern (circa 1968) • we own the name to Donald Mitchie czwartek, 22 lipca 2010
  • 11. how it works • we assosciate input with output, and store in somewhere • based on the assumption that for a given input, output is always the same czwartek, 22 lipca 2010
  • 12. code example CACHE_DICT = {} def cached(key): def func_wrapper(func): def arg_wrapper(*args, **kwargs): if not key in CACHE_DICT: value = func(*args, **kwargs) CACHE_DICT[key] = value return CACHE_DICT[key] return arg_wrapper return func_wrapper czwartek, 22 lipca 2010
  • 13. what if output can change? • our pattern is still useful • we simply need to add something
  • 15. "There are only two hard problems in Computer Science: cache invalidation and naming things" (Phil Karlton)
  • 16. • basically, we update data in cache • we need to know when and what to change • the more granular you want to be, the harder it gets
  • 17. code example

        def invalidate(key):
            try:
                del CACHE_DICT[key]
            except KeyError:
                # parenthesised form works in both Python 2 and Python 3
                print("someone tried to invalidate a key that is not present: %s" % key)

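
To see the two pieces working together, the following sketch pairs the `cached` decorator with an `invalidate` function against a stand-in data store. `FAKE_DB` and `get_row` are hypothetical names used only for illustration, and `invalidate` is simplified here to use `dict.pop` so that a missing key is silently ignored.

```python
CACHE_DICT = {}
FAKE_DB = {1: "old"}  # hypothetical stand-in for a real database


def cached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE_DICT:
                CACHE_DICT[key] = func(*args, **kwargs)
            return CACHE_DICT[key]
        return arg_wrapper
    return func_wrapper


def invalidate(key):
    # pop() with a default never raises, so a missing key is simply ignored
    CACHE_DICT.pop(key, None)


@cached("key1")
def get_row():
    return FAKE_DB[1]


first = get_row()   # miss: reads FAKE_DB and fills the cache
FAKE_DB[1] = "new"
stale = get_row()   # hit: still returns the value cached earlier
invalidate("key1")
fresh = get_row()   # miss again: re-reads FAKE_DB
```

The second read returns the old value because nothing told the cache that the underlying data changed; only the explicit `invalidate` call makes the new value visible.
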
  • 19. invalidating too much / not enough • flushing all data any time something changes • not flushing cache at all • tragic effects
  • 20.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE
        @cached('big_key1')
        def some_bigger_function():
            """
            this function depends on big_key1, key1 and key2
            """
            def inner_workings():
                db_set(1, 'something totally new')
            #######
            ## imagine 100 lines of code here :)
            ######
            inner_workings()
            return [simple_function1(), simple_function2()]

        if __name__ == '__main__':
            simple_function1()
            simple_function2()
            a, b = some_bigger_function()
            assert a == db_get(id=1), "this fails because we didn't invalidate the cache properly"

  • 21. invalidating too soon / too late • your cache has to be synchronised with your db • sometimes very hard to spot • leads to tragic mistakes
  • 22.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE
        def some_bigger_function():
            db_set(1, 'something')
            value = simple_function1()
            db_set(2, 'something else')
            #### now we know we used 2 cached functions so....
            invalidate('key1')
            invalidate('key2')
            #### now we know we are safe, but for a price
            return simple_function2()

        if __name__ == '__main__':
            some_bigger_function()

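
One way to avoid reading a cached value between a write and its invalidation is to tie the two together in a single helper. The sketch below is an illustration under assumptions, not the approach from the talk: `FAKE_DB` and `db_set_and_invalidate` are hypothetical names, and `db_get` is reduced to a dict lookup.

```python
CACHE_DICT = {}
FAKE_DB = {1: "a", 2: "b"}  # hypothetical stand-in for a real database


def cached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE_DICT:
                CACHE_DICT[key] = func(*args, **kwargs)
            return CACHE_DICT[key]
        return arg_wrapper
    return func_wrapper


def db_get(id):
    return FAKE_DB[id]


def db_set_and_invalidate(id, value, keys):
    """Write to the store, then drop every cache key that reads this row."""
    FAKE_DB[id] = value
    for key in keys:
        CACHE_DICT.pop(key, None)


@cached("key1")
def simple_function1():
    return db_get(id=1)


simple_function1()                                    # warms the cache with "a"
db_set_and_invalidate(1, "something", keys=["key1"])  # write and invalidate together
value = simple_function1()                            # re-reads the store
```

The caller still has to know which keys depend on the row, so this narrows the timing problem without removing the need to know your data.
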
  • 23. superposition of dependency • a somewhat less obvious problem • eventually you will start caching the results of computation • you have to know very precisely what your data depends on
  • 24.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE
        @cached('key')
        def some_bigger_function():
            return {
                '1': simple_function1(),
                '2': simple_function2(),
                '3': db_get(id=3)
            }

        if __name__ == '__main__':
            simple_function1()
            # somewhere else
            db_set(1, 'foobar')
            # and again
            db_set(3, 'bazbar')
            invalidate('key')
            # ooops, we forgot something
            data = some_bigger_function()
            assert data['1'] == db_get(id=1), "this fails because we didn't manage to invalidate all the keys"

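
A possible way to handle such stacked dependencies is to record, at decoration time, which keys a cached value is built from, and to cascade invalidation to everything that depends on the key being dropped. This is only a sketch under assumptions: the `depends_on` parameter, the `DEPENDENTS` registry and `FAKE_DB` are hypothetical additions, and the dependency graph is assumed to contain no cycles.

```python
CACHE_DICT = {}
DEPENDENTS = {}  # key -> set of keys whose cached value was built from it
FAKE_DB = {1: "one", 2: "two", 3: "three"}  # hypothetical data store


def cached(key, depends_on=()):
    # Register this key as a dependent of each key it is built from.
    for parent in depends_on:
        DEPENDENTS.setdefault(parent, set()).add(key)

    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE_DICT:
                CACHE_DICT[key] = func(*args, **kwargs)
            return CACHE_DICT[key]
        return arg_wrapper
    return func_wrapper


def invalidate(key):
    """Drop a key and, recursively, every key that depends on it."""
    CACHE_DICT.pop(key, None)
    for child in DEPENDENTS.get(key, ()):
        invalidate(child)


@cached("key1")
def simple_function1():
    return FAKE_DB[1]


@cached("key", depends_on=["key1"])
def some_bigger_function():
    return {"1": simple_function1(), "3": FAKE_DB[3]}


some_bigger_function()   # caches both "key1" and "key"
FAKE_DB[1] = "foobar"
invalidate("key1")       # also drops "key", which was built from "key1"
data = some_bigger_function()
```

The cascade only runs downward: invalidating `'key'` alone would still leave a stale `'key1'`, so the write path must invalidate the lowest-level key it touches.
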
  • 25. summing up • know your data.... • be aware of what and when you cache • take care when using cached data in computation
  • 28. why? • very fast access • simple to implement • very effective as long as you’re using a single process
  • 29. clever tricks with dicts
  • 30. code example

        CACHE_DICT = {}

        def cached(key):
            """Cache the decorated function's result under a fixed key."""
            def func_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    # call the wrapped function only on a cache miss
                    if key not in CACHE_DICT:
                        value = func(*args, **kwargs)
                        CACHE_DICT[key] = value
                    return CACHE_DICT[key]
                return arg_wrapper
            return func_wrapper

  • 32. code example

        def invalidate(key):
            try:
                del CACHE_DICT[key]
            except KeyError:
                # parenthesised form works in both Python 2 and Python 3
                print("someone tried to invalidate a key that is not present: %s" % key)

  • 35. (memcached) • battle tested • scales • fast • supports a few cool features • behaves a lot like a dict • supports time-based expiration
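
To make the idea of time-based expiration concrete without needing a running server, here is a small in-process sketch that imitates it with a plain dict. It is not how memcached is implemented; `TTLCache` and its `clock` parameter are hypothetical names, and the injectable clock exists only so the example can simulate the passage of time.

```python
import time


class TTLCache:
    """Dict-backed cache whose entries expire after a number of seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are removed lazily, on the next read.
            del self._data[key]
            return default
        return value


now = [0.0]                              # fake clock, advanced by hand
cache = TTLCache(clock=lambda: now[0])
cache.set("answer", 42, ttl=10)
before = cache.get("answer")             # still within the 10 second window
now[0] = 11.0
after = cache.get("answer")              # expired, so the default is returned
```

Expiry by time trades freshness for simplicity: nobody has to remember to invalidate, but readers may see data up to `ttl` seconds old.
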
  • 36. libraries? • python-memcache • python-libmemcache • python-cmemcache • pylibmc
  • 37. why no benchmarks • not the point of this talk :) • benchmarks are generic, caching is specific • pick your flavour, think for yourself
  • 38. code example

        import memcache

        cache = memcache.Client(['localhost:11211'])

        def memcached(key):
            def func_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    value = cache.get(str(key))
                    # note: any falsy cached value (0, '', None) counts as a miss
                    if not value:
                        value = func(*args, **kwargs)
                        cache.set(str(key), value)
                    return value
                return arg_wrapper
            return func_wrapper

  • 40. code example

        def mem_invalidate(key):
            # overwrite with None, which the decorator's falsy check treats as a miss
            cache.set(str(key), None)

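
The decorator on slide 38 treats any falsy result as a miss, and slide 40 invalidates by overwriting with `None`. A variation worth considering wraps the stored value so that results such as `0` or an empty string can be cached, and invalidates by deleting the key. The sketch below runs against `FakeClient`, a hypothetical dict-backed stand-in that exposes only `get`, `set` and `delete`; swapping in a real client is left as an assumption.

```python
class FakeClient:
    """Hypothetical dict-backed stand-in for a memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a miss

    def set(self, key, value):
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)
        return True


cache = FakeClient()


def memcached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            hit = cache.get(str(key))
            if hit is not None:
                return hit[0]                # unwrap the stored 1-tuple
            value = func(*args, **kwargs)
            cache.set(str(key), (value,))    # wrap so falsy results are cacheable
            return value
        return arg_wrapper
    return func_wrapper


def mem_invalidate(key):
    cache.delete(str(key))


calls = []  # counts real computations


@memcached("zero")
def compute_zero():
    calls.append(1)
    return 0


compute_zero()
compute_zero()            # hit, even though the cached value is falsy
mem_invalidate("zero")
compute_zero()            # recomputed after invalidation
```

With the original falsy check, `compute_zero` would run on every call; here it runs once before the invalidation and once after.
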
  • 42. • what if I don’t want to expire each key manually? • that’s a lot to remember • and we have to be careful :(
  • 43. groups? • group keys into sets • which are tied to one key per set • expire one key, instead of twenty
  • 44. how to get there? • store some extra data • you can store dicts in cache • and cache behaves like a dict • so it’s a case of comparing keys and values
  • 45.

        # we start with specified key and group
        key = 'some_key'
        group = 'some_group'
        # now retrieve some data from memcached (get_multi takes a list of keys)
        data = memcached_client.get_multi([key, group])
        # now data is a dict that should look like
        # {'some_key': {'group_key': '1234',
        #               'value': 'some_value'},
        #  'some_group': '1234'}
        #
        # (fragment: the return below belongs inside a lookup function)
        if data and (key in data) and (group in data):
            if data[key]['group_key'] == data[group]:
                return data[key]['value']

  • 46.

        def cached(key, group_key='', exp_time=0):
            # we don't want to mix time based and event based expiration models
            if group_key:
                assert exp_time == 0, "can't set expiration time for grouped keys"
            def f_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    value = None
                    if group_key:
                        data = cache.get_multi([tools.make_key(group_key)] + [tools.make_key(key)])
                        data_dict = data.get(tools.make_key(key))
                        if data_dict:
                            value = data_dict['value']
                            group_value = data_dict['group_value']
                            # .get() so a missing group entry counts as a mismatch
                            if group_value != data.get(tools.make_key(group_key)):
                                value = None
                    else:
                        # read with the same transformed key that is used for writing
                        value = cache.get(tools.make_key(key))
                    if not value:
                        value = func(*args, **kwargs)
                        if exp_time:
                            cache.set(tools.make_key(key), value, exp_time)
                        elif not group_key:
                            cache.set(tools.make_key(key), value)
                        else:
                            # exp_time not set and we have group_keys
                            group_value = make_group_value(group_key)
                            data_dict = {'value': value, 'group_value': group_value}
                            cache.set_multi({
                                tools.make_key(key): data_dict,
                                tools.make_key(group_key): group_value
                            })
                    return value
                arg_wrapper.__name__ = func.__name__
                return arg_wrapper
            return f_wrapper

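
The group mechanism can be tried out in isolation with a simplified sketch. Each entry stores the group's token at write time, a read is valid only while that token still matches the group's current one, and replacing the token expires every member at once. Everything named here is an assumption for illustration: `FakeClient` is a dict-backed stand-in, `get_grouped`, `set_grouped` and `invalidate_group` are hypothetical helpers, and `make_group_value` is a guess at what such a function might do (the talk does not show its implementation).

```python
import uuid


class FakeClient:
    """Hypothetical dict-backed stand-in for a memcached client."""

    def __init__(self):
        self._data = {}

    def get_multi(self, keys):
        # Like a real client, return only the keys that were found.
        return {k: self._data[k] for k in keys if k in self._data}

    def set_multi(self, mapping):
        self._data.update(mapping)

    def set(self, key, value):
        self._data[key] = value


cache = FakeClient()


def make_group_value():
    return uuid.uuid4().hex  # a fresh, effectively unique token


def get_grouped(key, group):
    data = cache.get_multi([key, group])
    entry = data.get(key)
    if entry is not None and group in data and entry["group_value"] == data[group]:
        return entry["value"]
    return None  # miss: no entry, no group, or a stale token


def set_grouped(key, group, value):
    # Reuse the group's current token, or create one if the group is new.
    token = cache.get_multi([group]).get(group) or make_group_value()
    cache.set_multi({key: {"value": value, "group_value": token}, group: token})


def invalidate_group(group):
    cache.set(group, make_group_value())  # every member's token is now stale


set_grouped("user:1", "users", "alice")
set_grouped("user:2", "users", "bob")
hit = get_grouped("user:1", "users")
invalidate_group("users")
miss1 = get_grouped("user:1", "users")
miss2 = get_grouped("user:2", "users")
```

Stale entries are never deleted here; they simply stop matching, which in a real memcached deployment would leave eviction to the server.
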
  • 48. code samples @ http://github.com/mdomans/europython2010
  • 49. follow me • twitter: mdomans • blog: blog.mdomans.com