
memcached

Because fetching data from disk is too expensive.

Published in: Technology
  • thanks: becky, terry, anand, young, and dave.
  • platform independence: doesn’t matter what language you use. if the language has low-level network I/O capabilities, memcached is possible.
    struct: dictionary object in CF (thanks, dave).
  • perl: top-left
    clients: although all we need to communicate with memcached is low-level network I/O, generous people have put together memcached clients for the following languages. these clients are equivalent to HTTP libraries found in most modern programming languages. because of HTTP libraries, we can utilize the HTTP protocol without implementing it at a low level (ex: GET / HTTP/1.1).
  • wikimedia: left of twitter.
    purpose: this slide is just to show that memcached isn’t mickey mouse.
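  • order of magnitude: generally refers to a factor of 10, i.e. one step in the exponent when an amount is written as a power of 10. imagine if someone multiplied your salary by 10 -- that’s an order of magnitude difference.
    poll vs epoll: poll() scans every watched file descriptor on each call, so it is O(n) in the number of descriptors. epoll() reports only the descriptors that are ready, so its cost does not grow with the number being watched (often summarized as O(1)).
    myth: this may be a myth, but i’ve read articles that claim Google has memcached pools of up to 1,000 machines.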
  • dumb: please take a second to digest the first bullet -- this is known as the end-to-end principle. ex: IP is dumb in that it simply moves datagrams across the network. TCP, on the other hand, is seen as smart because it provides error detection, retransmission, congestion control, and flow control. core routers only need to support IP, while endpoints support the heavier TCP.
  • envelope: json > xml.
  • ec2: target’s website runs in the ec2 cloud. during the holiday season, while some competitor sites were unresponsive, target’s site, at worst, showed 50% slowdown in response time.
    sqlite: used internally by firefox for bookmarks + browsing history.
  • Transcript of "memcached"

    1. memcached: because fetching data from disk is too expensive.
    2. what is memcached?
       • complex: a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
       • simple: a really big, platform independent, dictionary-like object that lives in memory, and can be intelligently sliced across multiple computers.
    3. memcached clients: C/C++
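Client libraries mostly wrap a small text protocol sent over a socket. The sketch below frames `set` and `get` requests and parses a single-key `get` reply, assuming the classic ASCII protocol. The helper names (`build_set`, `build_get`, `parse_get_response`) are made up for illustration and nothing here talks to a real server.

```python
def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Frame a 'set' request: a header line, then the data block, each ending in CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Frame a single-key 'get' request."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(raw: bytes):
    """Return the value from a single-key 'get' reply, or None on a cache miss."""
    if raw == b"END\r\n":
        return None  # the server sends only END when the key is absent
    header, _, rest = raw.partition(b"\r\n")
    parts = header.split()  # expected shape: VALUE <key> <flags> <bytes>
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected reply")
    size = int(parts[3])
    return rest[:size]


if __name__ == "__main__":
    print(build_set("greeting", b"hello"))
    print(parse_get_response(b"VALUE greeting 0 5\r\nhello\r\nEND\r\n"))
```

A real client adds connection handling, timeouts, and multi-key gets on top of this framing, which is why using an existing library is usually the better choice.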
    4. who uses memcached?
    5. why is memcached fast?
       • data is stored in memory, not disk: memory is usually at least an order of magnitude faster than disk.
       • I/O is non-blocking (asynchronous): permits other processing to continue before data transmission has finished.
       • epoll() instead of poll() for network event loop: as the number of file descriptors increases, so does the advantage epoll() provides over poll().
       • it’s distributed: memcached works well on one machine, or one hundred.
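To make the non-blocking event loop idea concrete, here is a minimal sketch using Python's standard `selectors` module, which picks epoll on Linux when it is available. It illustrates the pattern only and is not how memcached itself is written. The function name `collect_ready` and the use of local socket pairs instead of real network clients are assumptions made for the example.

```python
import selectors
import socket


def collect_ready(messages):
    """Send each message over its own socket pair, then read them all from one event loop."""
    sel = selectors.DefaultSelector()  # epoll on Linux, kqueue on BSD/macOS, else a fallback
    writers = []
    for index, msg in enumerate(messages):
        writer, reader = socket.socketpair()
        reader.setblocking(False)  # reads never stall the loop
        sel.register(reader, selectors.EVENT_READ, data=index)
        writer.sendall(msg)
        writers.append(writer)

    received = {}
    while len(received) < len(messages):
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time; stop instead of spinning
        for key, _mask in events:
            # Only sockets that are actually ready are reported here.
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for writer in writers:
        writer.close()
    sel.close()
    return [received.get(i) for i in range(len(messages))]


if __name__ == "__main__":
    print(collect_ready([b"get a", b"get b", b"get c"]))
```

One thread services every connection, and the loop only touches sockets that have data waiting, which is the property the epoll() bullet is pointing at.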
    6. what memcached isn’t
       • accessed with sql: instead a dictionary-like key/value model.
       • persistent: if a memcached server goes down, its contents are gone. this is perfectly acceptable, because memcached is a cache.
       • replicated: data is spread across N servers, but no key exists on more than one server.
       • secure: there are no built-in security features. a lower level of network security is necessary (firewall).
    7. memcached gotchas
       • a single entry cannot be larger than 1MB: although, you can compress large entries to make them fit into 1MB.
       • keys are limited to 250 characters: hash functions work well as a consistent way to create keys.
       • cache algorithm is lru: least recently used means it evicts the least recently used items first.
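The three gotchas can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not memcached's implementation: `make_key`, `pack_value`, `unpack_value`, and `LRUCache` are hypothetical names, SHA-1 and zlib are just one reasonable choice of hash and compressor, and the real server's eviction is more involved than a single ordered dictionary.

```python
import hashlib
import zlib
from collections import OrderedDict

MAX_KEY_LENGTH = 250            # key limit from the slide
MAX_VALUE_BYTES = 1024 * 1024   # 1MB entry limit from the slide


def make_key(namespace: str, raw: str) -> str:
    """Hash arbitrary input into a short, fixed-length key."""
    digest = hashlib.sha1(raw.encode("utf-8")).hexdigest()  # always 40 hex characters
    key = f"{namespace}:{digest}"
    if len(key) > MAX_KEY_LENGTH:
        raise ValueError("namespace too long for a cache key")
    return key


def pack_value(value: bytes) -> bytes:
    """Compress a value and refuse it if it is still over the entry limit."""
    packed = zlib.compress(value)
    if len(packed) > MAX_VALUE_BYTES:
        raise ValueError("entry too large even after compression")
    return packed


def unpack_value(packed: bytes) -> bytes:
    return zlib.decompress(packed)


class LRUCache:
    """Tiny least-recently-used cache: the stalest entry is evicted first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```

Hashing keeps keys a predictable length no matter how long the user's input is, and compressing before the size check lets repetitive payloads such as large HTML fragments fit under the limit.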
    8. memcached pools
       • memcached is dumb: pools of memcached servers are managed by clients. this design principle was stolen from the internet (keep the insides dumb + add intelligence to the perimeter).
       • clients provide failover: if a server in the pool dies, clients start using another server. price of failure is the cost of a few extra database look-ups.
       • 32-bit? 32-bit processes can only address 4GB of virtual memory. if you have PAE (physical address extension enables systems to use 4GB-64GB RAM) enabled, just start multiple memcached processes.
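A rough sketch of what "the client manages the pool" can look like: the client hashes each key to choose a server and moves on to the next live server when the chosen one is down. Everything here is an assumption for illustration. `Pool` is a hypothetical class, plain dictionaries stand in for memcached servers, and simple modulo hashing is used even though many real clients prefer consistent hashing so that fewer keys move when the pool changes.

```python
import hashlib


class Pool:
    """Client-side pool: the servers know nothing about each other."""

    def __init__(self, names):
        self.names = sorted(names)
        self.servers = {name: {} for name in self.names}  # each dict stands in for one server
        self.alive = set(self.names)

    def _candidates(self, key):
        # Hash the key to a starting server, then list the rest as fallbacks.
        start = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % len(self.names)
        return self.names[start:] + self.names[:start]

    def _pick(self, key):
        for name in self._candidates(key):
            if name in self.alive:
                return name
        raise RuntimeError("no servers available")

    def set(self, key, value):
        self.servers[self._pick(key)][key] = value

    def get(self, key):
        return self.servers[self._pick(key)].get(key)

    def kill(self, name):
        """Simulate a crash: the server leaves the pool and its memory is lost."""
        self.alive.discard(name)
        self.servers[name].clear()


if __name__ == "__main__":
    pool = Pool(["cache-a", "cache-b", "cache-c"])
    pool.set("user:42", "hector")
    owner = pool._pick("user:42")
    pool.kill(owner)
    print(pool.get("user:42"))  # miss: the application falls back to the database
```

After a server dies the first read is a miss, the application reloads from the database and writes the value back, and the key then lives on the fallback server. That is the "few extra database look-ups" price mentioned on the slide.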
    9. memcached resistance?
       • my database already does caching: database optimization is a good thing; however, memcached provides caching at a higher level. built html fragments can be cached, not just query results.
       • my favorite programming language already lets me cache at a higher level: can it handle a ton of simultaneous cache requests well? does it utilize non-blocking I/O? can it be spread intelligently across N machines?
       • who cares about saving a few milliseconds: milliseconds can add up -- i bet you love the slowskys. don’t settle for average response time, especially when a drastic improvement is this easy to implement.
    10. autocomplete demo: because a distributed memory object caching system is expensive...wait, what? it’s completely free?
    11. scenario
       • problem: we just added an autocomplete search box to our website homepage. everyone uses it, but the load is punishing our servers and users are complaining that it isn’t as responsive anymore.
       • solution: insert memcached between application and database. don’t just cache query results, cache results wrapped in a transport envelope (json). if the users enter something we didn’t already cache, cache that too.
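The solution bullet describes a cache-aside pattern. Below is a minimal Python sketch of it rather than the demo's actual PHP code: a plain dictionary stands in for memcached, an in-memory SQLite table stands in for the site database, and the table name `terms`, the key prefix `autocomplete:`, and the envelope fields are all invented for the example.

```python
import json
import sqlite3


def build_db():
    """Create a throwaway in-memory database with a few searchable terms."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE terms (name TEXT)")
    db.executemany(
        "INSERT INTO terms VALUES (?)",
        [("memcached",), ("memory",), ("mysql",)],
    )
    return db


def autocomplete(prefix, cache, db, stats):
    """Return a JSON envelope of matches, consulting the cache before the database."""
    key = f"autocomplete:{prefix}"
    cached = cache.get(key)
    if cached is not None:
        stats["hits"] += 1
        return cached  # already serialized, ready to send to the browser

    stats["misses"] += 1
    rows = db.execute(
        "SELECT name FROM terms WHERE name LIKE ? ORDER BY name",
        (prefix + "%",),
    ).fetchall()
    envelope = json.dumps({"query": prefix, "results": [row[0] for row in rows]})
    cache[key] = envelope  # cache the finished envelope, not just the rows
    return envelope


if __name__ == "__main__":
    cache, db, stats = {}, build_db(), {"hits": 0, "misses": 0}
    print(autocomplete("mem", cache, db, stats))
    print(autocomplete("mem", cache, db, stats))
    print(stats)
```

Caching the serialized envelope means a hit skips both the query and the JSON encoding, and any prefix not seen before is stored on its first miss, as the slide suggests.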
    12. setup details
       • amazon ec2: a pre-configured ubuntu 8.04 server installation. 7.5GB of memory, 4 ec2 compute units (one ec2 compute unit equals 1.0-1.2 GHz 2007 opteron or 2007 xeon processor), 850GB of instance storage, 64-bit platform.
       • apache + php: the most popular web server on the internet (50+% according to netcraft for 01/09) + the free and open source php.
       • sqlite: an acid-compliant rdbms contained in a relatively small C programming library. the sqlite engine is not a standalone process, and the entire database is stored as a single cross-platform file on a host machine.
    13. source + slides: http://github.com/hectcastro/memcached-talk/
