advanced memcache implementation for gharkikhoj.com


    advanced memcache implementation for gharkikhoj.com - Presentation Transcript

    1. Advanced Cache implementation for gharkikhoj.com using memcache
      • This presentation is
        • A tour of our new memcache-based caching system.
        • What we did and how we did it.
        • Suggestions, recommendations, and things we tried.
        • One presentation in a series about our site.
    2. Our Design goals.
      • Do not use dumb caching just because it is easy.
      • We could throw 100+ GB of RAM at the problem and rely on memcache's default caching policy to discard old items, but that is a bad idea: it is not an efficient use of resources, and a little improvement can help memcache.
      • Intelligent caching is possible and therefore should be implemented.
      • Provide fast cache solution.
      • Automatically update items in cache if they get updated in database.
      • Remove stale items from cache if needed.
      • Maintain the cache.
      • Fault tolerance.
      • Spread the same cache across the cluster for fault tolerance.
      • Maintain cache coherency.
      • Provide notifications to listen for cache events.
      • Continuous queries.
    3. Overall design
      • Use memcache as the in-memory data store.
      • Use UDP for fast messaging wherever possible.
      • Use Spread for MySQL messaging, along with UDP.
      • AMQP (qpid, rabbitmq, zeromq, openmqp) for clustering and data/metadata distribution.
      • Use custom event processing for metadata and notifications.
      • MySQL for data storage.
      • Linux as the OS.
      • samba/glusterfs/afs for the filesystem.
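
The "UDP for fast messaging" idea on this slide can be sketched as a fire-and-forget datagram that carries a small cache-update notice. This is only an illustration: the slides do not describe the wire format, so the JSON layout, the field names, and the `send_update` helper below are assumptions.

```python
import json
import socket


def encode_update(region, key, action):
    """Serialize a cache-update notice (hypothetical JSON format, not the site's)."""
    message = {"region": region, "key": key, "action": action}
    return json.dumps(message).encode("utf-8")


def decode_update(payload):
    """Parse a datagram produced by encode_update back into a dict."""
    return json.loads(payload.decode("utf-8"))


def send_update(host, port, region, key, action):
    """Fire-and-forget send: UDP gives no delivery guarantee, so nothing is awaited."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(encode_update(region, key, action), (host, port))
```

Because a datagram can be lost, a design like this would need the reliable channel mentioned elsewhere in the slides for anything that must not be missed.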
    4. Good points about memcache
      • Good
        • Memcache is very fast cache solution.
        • Clients available for almost all platforms.
        • Simple api
        • Provides cache distribution via client hashing.
        • Barebone cache works very fast.
        • Binary and other protocols available.
        • Can be spawned multiple times on the same machine or across others (capacity is limited only by the number of servers and instances, RAM, and physical limitations).
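
"Cache distribution via client hashing" means the client, not the server, decides which memcache instance owns a key. A minimal sketch of the simplest variant (hash modulo server count) follows; the server addresses and the `pick_server` name are made up for illustration, and the slides do not say which hashing scheme the site's client uses.

```python
import hashlib
from typing import List


def pick_server(key: str, servers: List[str]) -> str:
    """Map a key to one server by hashing the key modulo the server count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical instance addresses; 11211 is memcached's default port.
servers = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211"]
owner = pick_server("listing:42", servers)
```

With plain modulo hashing, adding or removing a server remaps most keys; clients that need to avoid this typically use consistent hashing instead.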
    5. Bad points about memcache
      • Bad (not so bad, since these are not the purpose of memcache).
        • No db integration.
        • No cache coherency wrappers available.
        • No built-in fault tolerance.
        • No built-in load distribution.
        • No events for cache updates or activities such as item removed, added, etc.
        • No continuous queries.
    6. Solution part 1
      • Use memcache.
      • Spawn multiple instances over a cluster of servers (three instances = ~20 GB of RAM).
      • Use a custom Java and PHP memcache-based client to maintain the cache.
      • The main controller is written in Java and runs in fault-tolerant mode via redundant instances.
      • Most of the data items are pagination data consisting of 25+ records in a single item.
      • Some of the cache items hold a single record only.
      • Records are stored in MySQL databases.
      • Multiple sources can work on the data, either through the cache or directly against the database in the backend.
    7. Solution part 2
      • No client goes to memcache directly; every client goes through a proxy client.
      • The proxy client uses UDP and Spread messaging to send cache updates to the main controller.
      • The controller maintains cache items in its local instance as key-value pairs.
      • For paginated data, the controller maintains multi-value key pairs using two-way hash maps.
      • MySQL messaging and some custom user-defined functions are used for fast messaging.
      • If any record is updated in the database, the change is propagated via AMQP.
      • Fanout is used to send a message to the cluster.
      • Memcache items are logically grouped into regions, addressed via a parent group key.
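
The two-way hash maps for paginated data can be pictured as a pair of indexes: one from a cached page to the records it contains, and one from a record back to every page that contains it. The sketch below is an assumption about how such a structure might look, not the controller's actual code; the class name and key formats are invented.

```python
from collections import defaultdict


class TwoWayIndex:
    """Tracks which cached pages contain which database records, in both directions."""

    def __init__(self):
        self.records_by_page = defaultdict(set)  # page key -> record ids
        self.pages_by_record = defaultdict(set)  # record id -> page keys

    def add_page(self, page_key, record_ids):
        """Register a cached page and the records stored inside it."""
        for record_id in record_ids:
            self.records_by_page[page_key].add(record_id)
            self.pages_by_record[record_id].add(page_key)

    def pages_for(self, record_id):
        """Page keys that must be refreshed or removed when record_id changes."""
        return set(self.pages_by_record.get(record_id, ()))

    def remove_page(self, page_key):
        """Forget a page and drop it from the reverse index."""
        for record_id in self.records_by_page.pop(page_key, set()):
            pages = self.pages_by_record[record_id]
            pages.discard(page_key)
            if not pages:
                del self.pages_by_record[record_id]
```

The reverse index is what makes a single-record database update cheap to act on: the controller can look up the affected pages directly instead of scanning every cached page.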
    8. Solution part 3
      • One memcache controller handles only one type of parent key (currently we use a single server that handles all such keys).
      • The memcache controller maintains both
        • A record of the keys and values being accessed and created in the memcache cluster.
        • A record of database updates.
      • If a record in the database is updated, the corresponding key is either updated in memcache or removed from it (depending on the configuration/item).
      • To maintain the rate of operations, throttle control, and other management functions, we have now moved to Esper.
      • The Java client uses a macro framework to enable dynamic routing of messages and to update the memcache controller without any downtime.
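
The "update or remove, depending on configuration" rule can be sketched as a per-region policy lookup. Everything here is illustrative: a plain dict stands in for the memcache cluster, and the policy names, key format, and `apply_db_update` function are assumptions rather than the controller's real interface.

```python
def apply_db_update(cache, policies, region, key, new_value):
    """Refresh or drop a cache entry after its database record changed.

    cache:    dict standing in for the memcache cluster.
    policies: maps a region (parent group key) to "update" or "remove".
    Returns the policy that was applied.
    """
    policy = policies.get(region, "remove")  # removing is the safe default
    full_key = f"{region}:{key}"
    if policy == "update":
        cache[full_key] = new_value
    else:
        cache.pop(full_key, None)
    return policy
```

Removing is the conservative choice, since the next read repopulates the item from the database; updating in place avoids that extra read but requires the notification to carry the full new value.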
    9. Reliability
      • Reliability considerations
        • Record updates are always sent via a reliable medium.
        • Keys and memcache items are handled in a semi-reliable fashion via one global unique sequence (still not 100% reliable).
        • A hard failure breaks the semi-reliable mechanism and requires a cache flush; that is bad, but we do not expect it to happen often.
        • If the cache has to be flushed, we now know how to rebuild the entire cache.
        • In case of stale data, the cache can be cleaned manually at the individual item level.
        • A strict communication policy is available for 100% reliable cache coherency maintenance, but it is not used unless required.
        • Memcache controllers run on the Java platform and are clustered across machines to prevent failures (heartbeat yet to be fully implemented).
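
The slides do not say how the global unique sequence is used. One common way to get semi-reliable delivery over an unreliable channel is for the receiver to check that sequence numbers arrive without gaps and to treat a gap as a sign that some cache state may be stale. The sketch below assumes that approach; the class and method names are hypothetical.

```python
class SequenceTracker:
    """Detects lost messages by checking a monotonically increasing sequence number."""

    def __init__(self):
        self.last_seen = None  # no message received yet

    def accept(self, seq):
        """Return True if seq directly follows the previous one.

        Returns False on a gap, a duplicate, or a reordered message.
        """
        in_order = self.last_seen is None or seq == self.last_seen + 1
        if self.last_seen is None or seq > self.last_seen:
            self.last_seen = seq
        return in_order


tracker = SequenceTracker()
results = [tracker.accept(n) for n in (1, 2, 4, 5)]  # message 3 was lost
```

A receiver that sees `False` could ask for a resend, invalidate the affected items, or, in the worst case described above, flush and rebuild the cache.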
    10. Performance
      • Remember, the cache controller is only in the picture if the db is updated in the backend.
      • A single controller can handle 10,000+ cache/page hits per second.
      • The controller can handle over 100,000 db record updates per second; db + network is the bottleneck, though, and reduces overall throughput to 10,000 updates per second.
      • The custom memcache client normally uses UDP for faster communication; it can be switched to a reliable mode, which is still fast.
      • Use AMQP-based group communication or Spread communication.
      • End-to-end latency is under 350 ms for serving cache items to the end client (including the overhead of cache maintenance in the memcache controller, communication, database notifications, and processing of cache items).
      • This 350 ms includes a lot of one-time cost; the average will be much lower, since cache reads outnumber record updates.
      • An extra throttling parameter is available to improve this performance further. It controls snap in cache coherency by throttling db updates.
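
Throttling db updates trades cache freshness for fewer cache writes. One way to picture it is a coalescing throttle that keeps only the latest value per key and releases a batch at most once per interval. This is a hand-rolled sketch under that assumption, not the Esper-based mechanism the presentation mentions; the class name and the explicit `now` argument are choices made for the example.

```python
class UpdateThrottle:
    """Coalesces database updates per key and releases them at most once per interval."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.pending = {}       # key -> latest value seen since the last flush
        self.last_flush = None  # time of the previous flush, None before the first update

    def offer(self, key, value, now):
        """Record an update; return the batch to apply if the interval has elapsed."""
        self.pending[key] = value
        if self.last_flush is None:
            self.last_flush = now
        if now - self.last_flush >= self.interval:
            batch, self.pending = self.pending, {}
            self.last_flush = now
            return batch
        return {}
```

A larger interval means fewer cache writes but a longer window in which the cache can lag behind the database, which is the coherency trade-off the throttling parameter controls.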
    11. Last slide
      • Once again
        • Memcache is the cache item cluster.
        • udp/spread/rabbitmq/qpid/zeromq/openmqp for communication.
        • A Java-based clustered process is used as the memcache controller.
        • MySQL + MySQL messaging with UDFs and triggers is used to maintain cache coherency.
        • A custom macro library is used in the Java process for dynamic routing.
        • Memcache is used in a fault-tolerant configuration along with clustering.
        • The memcache controller can be split via a parent group key, with each controller handling one type of key.
    12. Extra items.
      • An afs/gluster/samba filesystem combination is used for replicating processes.
      • ssh/scripting/wmi/staf is used for remote process control.
      • A hybrid combination of Ubuntu/Windows is used for the servers.
      • Apache/Jetty and a custom HTTP server are used for the application.
      • The cache client is available in PHP and Java.
      • The cluster is a set of three servers running Linux/Windows, switched on and off.
      • At any time, two servers are running.
      • AMQP servers can be swapped easily due to the binary nature of the AMQP specification.
      • We have tested with zeromq, rabbitmq, and qpid; switching between them needs only configuration changes.