SDEC2011 Going by TACC

Key-value stores are widely used in applications that only require primary key data access, which is common in many web applications. Because developing an industrial-grade key-value store is expensive, the conventional solution is to use one of the existing key-value stores and layer application semantics on top of the primitives provided by the store. This approach leads to potential inefficiencies, because application-specific semantics can often allow optimizations in the implementation of the store.

We present an alternative approach, using the TACC platform to provide a key-value store implementation that is both performant and easily customizable. The TACC programming model separates state from logic: state is stored in a collection of distributed in-memory database instances, while logic is performed by distributed agents that react asynchronously to changes in objects stored in the database instances. Agents can selectively subscribe to updates using a fine-grained hierarchical directory system to mount objects into a local namespace. TACC provides performance comparable to hand-coded C while reducing the source code size to a fraction of the C equivalent. We describe the implementation and performance of a scalable and fault-tolerant key-value store using TACC, pointing out the benefits realized by using TACC's strong, user-defined types and triggering/notification.

http://sdec.kr/


Transcript

  • 1. Going by TACC: Beyond Key-Value to Fault-Tolerant Stores with Easily Customizable Semantics
    Henk Goosen, CEO, goosen@optumsoft.com
  • 2. Key-value stores rule the Web
    • Many applications only need primary key data access
    • Examples: catalogs, shopping carts, web session state
    • No need for the complexity, performance overhead, and lack of scalability of a full database
    • Hence: key-value stores are everywhere
      • Dynamo, CouchDB, Cassandra, Project Voldemort, Riak, Redis, memcached, MongoDB, …
    Improving key-value stores is important.
    OptumSoft, Inc. Proprietary and Confidential
  • 3. Key-value stores in practice
    • Developing a key-value store from scratch using conventional languages is expensive:
      • scalability, performance, and fault tolerance
    • Conventional solution: use an existing key-value store
      • Layer on get() and put() semantics
    • Mismatches between application requirements and library: either accept or extensively modify library code
    Applications are more complex, performance suffers.
  • 4. TACC provides a different model
    • Use a very high-level language to specify the key-value store
    • Then customize the store, applying application-specific semantics
    • Benefits:
      • Simplifies the application business logic
      • Improves the performance of both store and application
    TACC model is better!
  • 5. TACC is an object-oriented, strongly typed language
    • User-defined type: a list of attributes (nouns)
    • Read or write attributes (there are no methods/verbs)
    • Logic primarily implemented via constraints
      • Imperative code is also supported
    • Compact code
      • First-class high-level data types (e.g., queues, hash tables)
      • Several design patterns directly supported in the language (e.g., observer pattern)
    Compact code: fewer bugs, quicker to market.
  • 6. TACC: efficient development of distributed systems
    • Reduce development time by a factor of 2x to 3x
    • Reduce lines of code by 10x or more
    • Eliminate most synchronization and concurrency bugs
    • High, predictable performance using optimized code generation
    • Fault tolerance built into the model, and easy to implement
    TACC is a general-purpose language, focused on distributed systems.
  • 7. Stateful remote proxy objects
    • Proxy: local copy of data
    • Writes are asynchronously copied to SysDB
    • SysDB changes are copied to "interested" agents
    • R/W access is local, fast
    • No remote access exceptions
    [Diagram: agents LR 1 and LR 2 each hold a proxy of a SysDB collection; label: "object added to collection"]
    Simple semantics, and fast.
  • 8. SysDB: a hierarchical in-memory object database
    • Stores state (ideally no logic)
      • Minimizes risk of program logic bugs, hence reliable
    • Concise specification of user-defined types
    • TACC compiler automatically generates all required code for remote access
    • Agents receive automatic notification when values change
    [Diagram: Agents connected to SysDB]
  • 9. Distributed, hierarchical name space
    • SysDB defines and exports a hierarchical name space (similar to a distributed file system)
    • Remote agents can "mount" remote directories into a local namespace
    • Each object is instantiated into a directory; state is made available remotely via proxy objects
    • Updates propagate asynchronously; notifications are delivered on changes
    Simple, powerful, proven way to provide a large, structured name space.
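The mount-and-notify behaviour described on slides 7 to 9 can be pictured with a small Python sketch. This is not TACC code and not the SysDB API: `Directory`, `mount`, and `write` are hypothetical names chosen for illustration, and notifications are delivered synchronously here for brevity, whereas the slides describe asynchronous propagation.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

# Callback signature: (path of the changed object, new value)
Callback = Callable[[str, Any], None]


class Directory:
    """Toy hierarchical object store (hypothetical stand-in, not SysDB)."""

    def __init__(self) -> None:
        self._objects: Dict[str, Any] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def mount(self, prefix: str, callback: Callback) -> None:
        """Register interest in every object stored under `prefix`."""
        self._subscribers[prefix.rstrip("/") + "/"].append(callback)

    def write(self, path: str, value: Any) -> None:
        """Store a value, then notify only agents mounted above `path`."""
        self._objects[path] = value
        for prefix, callbacks in self._subscribers.items():
            if path.startswith(prefix):
                for callback in callbacks:
                    callback(path, value)


store = Directory()
seen = []
store.mount("/users/shard-a-j", lambda path, value: seen.append((path, value)))
store.write("/users/shard-a-j/adams", "R5")  # delivered to the subscriber
store.write("/users/shard-s-z/smith", "R1")  # outside the mounted prefix
```

Only the first write reaches the subscriber, which is the point of a fine-grained subscription: an agent sees changes for the part of the name space it mounted and nothing else.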
  • 10. Fault tolerance is built in
    • When an agent restarts, it recovers its state from SysDB
    • Agents implement invariants, therefore can be restarted at any time, on any server
    • Any number of backup SysDBs are supported
    [Diagram: agents A1, A2, A3, A4 connected to SysDB instances labeled SP and SB]
    Fast recovery for high availability.
  • 11. Example: Location Service as customized key-value store
    • Application needs to track real-time location of user
    • User allowed in only one location at a time
    • Three operations:
      • ENTER <user id> <session id> <location id>
      • LEAVE <user id>
      • QUERY <user id>
    • Throughput > 10,000 requests/sec, latency < 1 ms
    High throughput, low latency required.
  • 12. Location Service Overview
    • HTTP access to service
    • Application (GS) contacts any LR server via load balancer
    • LR servers replicated for scalability and for fault tolerance
    [Diagram: GS clients send "Get location", "Leave", and "Enter" requests through a load balancer to several LR servers]
    Challenge: ensure responses from multiple LR servers are handled correctly.
  • 13. Key-value store tracks location for each user
    [Diagram: GS clients reach LR servers through a load balancer; the LR servers access a key-value store split into shards A-J, K-R, and S-Z. Two concurrent requests, "Enter Smith,1" and "Enter Smith,2", each perform get() and put() for user Smith on shard S-Z; annotation: "Has to be atomic"]
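The range partitioning drawn on slide 13 (shards A-J, K-R, S-Z) can be sketched in a few lines of Python. The slide does not say how a user is mapped to a shard; routing on the first letter of the user id is an assumption made here, and `SHARDS` and `shard_for` are hypothetical names.

```python
# Shard boundaries as drawn on the slide: inclusive first-letter ranges.
SHARDS = [("A", "J"), ("K", "R"), ("S", "Z")]


def shard_for(user_id: str) -> str:
    """Return the label of the shard responsible for `user_id`.

    Assumption: routing uses the first letter of the user id,
    compared case-insensitively.
    """
    first = user_id[:1].upper()
    for low, high in SHARDS:
        if low <= first <= high:
            return f"{low}-{high}"
    raise ValueError(f"no shard covers user id {user_id!r}")


print(shard_for("Smith"))
```

Because every request for a given user lands on the same shard, the two concurrent "Enter Smith" requests in the diagram meet at a single place where they can be ordered.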
  • 14. TACC allows easy customization of key-value update semantics
    • Each partition stores a unique subset of the user state
    • We directly implement ENTER, LEAVE, and QUERY semantics, using a TACC Constrainer
    • No locking or inter-agent synchronization required
    • Requests and responses sent asynchronously
    • High performance: there is no waiting or blocking
    Specializing the key-value store semantics simplifies the application and improves performance.
  • 15. Single-writer collections: no need for synchronization
    [Diagram: LR servers and shards (A-J, K-R) connected by pairs of collections; legend: R = Request Collection, S = Response Collection]
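One way to read the single-writer idea on slide 15 is that each collection has exactly one party allowed to append to it, so writers never contend and no lock is needed. The Python sketch below illustrates that reading only; `SingleWriterCollection` is a hypothetical class, the explicit ownership check stands in for what the slides suggest the platform enforces by construction, and the one-pair-per-link layout is an assumption drawn from the diagram.

```python
from collections import deque
from typing import Any, Deque, List


class SingleWriterCollection:
    """A queue that accepts appends from exactly one named owner."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self._items: Deque[Any] = deque()

    def append(self, writer: str, item: Any) -> None:
        """Append `item`; reject any writer other than the owner."""
        if writer != self.owner:
            raise PermissionError(f"{writer} may not write; owner is {self.owner}")
        self._items.append(item)

    def drain(self) -> List[Any]:
        """Return and clear everything written so far (reader side)."""
        items = list(self._items)
        self._items.clear()
        return items


# One request collection written by the LR server, one response
# collection written by the shard, for a single LR-to-shard link.
requests = SingleWriterCollection(owner="LR1")
responses = SingleWriterCollection(owner="shard-A-J")

requests.append("LR1", ("ENTER", "U1", "R5"))
for op, user, location in requests.drain():
    responses.append("shard-A-J", (user, "OK"))
```

Each side reads what the other wrote and writes only to its own collection, which is why the slides can claim that no inter-agent synchronization is required.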
  • 16. The Serializer Constrainer
    Logic: Notify, then update user status, then write result
    Request Collection       Response Collection
      A   Enter U1, R5         A   OK
      K   Enter U1, R5         K   NOT ALLOWED
      D   Enter U8, R9         D   OK
    Status Collection
      U1  R5
      U8  R9
    Really simple!
  • 17. Details of Constrainer implementation
    • Code for the Serializer constrainer defines three collections:
      • Input collection: requests
      • Output collections: responses and user status
    • A dependency constraint causes imperative code to be executed when a new request arrives from an LR server
    • The imperative code in the constrainer implements the application-specific semantics
    This code is a minor tweak on the put() implementation.
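The request, response, and status collections on slides 16 and 17 can be imitated in plain Python to show what the application-specific semantics amount to. This is not TACC and not the authors' code: `Request`, `Serializer`, and `on_request` are hypothetical names, the session id from the ENTER operation is omitted, and the behaviour of LEAVE and QUERY for an unknown user is an assumption. The rule that a second ENTER for a user who already has a location is rejected follows the example on slide 16.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Request:
    source: str                     # LR server that forwarded the request
    op: str                         # "ENTER", "LEAVE", or "QUERY"
    user: str
    location: Optional[str] = None  # only used by ENTER


class Serializer:
    """Processes one shard's requests one at a time, in arrival order."""

    def __init__(self) -> None:
        self.status: Dict[str, str] = {}             # user -> location
        self.responses: List[Tuple[str, str]] = []   # (source, result)

    def on_request(self, req: Request) -> str:
        """Stand-in for the code triggered when a new request arrives."""
        if req.op == "ENTER" and req.location is not None:
            if req.user in self.status:
                result = "NOT ALLOWED"   # one location per user at a time
            else:
                self.status[req.user] = req.location
                result = "OK"
        elif req.op == "LEAVE":
            self.status.pop(req.user, None)          # assumption: idempotent
            result = "OK"
        elif req.op == "QUERY":
            result = self.status.get(req.user, "NONE")  # assumption: "NONE"
        else:
            result = "BAD REQUEST"
        self.responses.append((req.source, result))
        return result


serializer = Serializer()
for request in (
    Request("A", "ENTER", "U1", "R5"),
    Request("K", "ENTER", "U1", "R5"),
    Request("D", "ENTER", "U8", "R9"),
):
    serializer.on_request(request)
```

Because a single piece of code handles every request for the shard in order, the check and the update happen together, which is the atomic get()/put() that slide 13 asks for.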
  • 18. Constraints and strong typing improve event-handling code
    • Constraint-handling code automatically inserted by compiler
    • No need to manually maintain invariants in many call sites
    • User-defined types organize constraint-handling code and protect against mistakes
    • TACC coroutine further simplifies event handling
    TACC changes event-handling spaghetti into well-structured, type-safe code.
  • 19. Instrumentation and Measurements
    • Stress Agent and SysDB instrumented to collect timestamps (stored in memory, I/O after test)
    • tcpdump run on Stress Agent and SysDB servers
    • Correlate timestamps with tcpdump
  • 20. Low latency pitfalls to avoid
    • Network and TCP behavior
      • Many TCP settings have a dramatic and non-linear performance impact
    • Memory management
      • Memory allocation/deallocation
      • Avoid garbage collection
    "The devil is in the details"
  • 21. Zero-load Latency (μs)
    End-to-end
      Event                    Time   Latency
      1  Request created          0
      2  Request packet          48        48
      7  Response packet        248       200
      8  Notification           288        40
    SysDB
      Event                    Time   Latency
      3  Receive request        0.0
      4  Notification          42.3      42.3
      5  Response enqueued     75.1      32.8
      6  Response packet      108.5      33.4
    Latencies are low and predictable.
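The figures on slide 21 can be checked with a little arithmetic. The sketch below assumes the flattened slide pairs up as two timelines: end-to-end timestamps of 0, 48, 248, and 288 μs (request created, request packet, response packet, notification) and SysDB timestamps of 0.0, 42.3, 75.1, and 108.5 μs (receive request, notification, response enqueued, response packet). It also assumes both timelines describe the same request, so the difference between the 200 μs request-to-response interval and the 108.5 μs spent inside SysDB is time spent elsewhere. The variable and function names are illustrative.

```python
from typing import List, Tuple

Timeline = List[Tuple[str, float]]

# Timestamps in microseconds, as read from the slide.
end_to_end: Timeline = [
    ("request created", 0.0),
    ("request packet", 48.0),
    ("response packet", 248.0),
    ("notification", 288.0),
]
sysdb: Timeline = [
    ("receive request", 0.0),
    ("notification", 42.3),
    ("response enqueued", 75.1),
    ("response packet", 108.5),
]


def stage_latencies(events: Timeline) -> List[float]:
    """Differences between consecutive timestamps, rounded to 0.1 us."""
    return [round(t1 - t0, 1) for (_, t0), (_, t1) in zip(events, events[1:])]


client_stages = stage_latencies(end_to_end)
sysdb_stages = stage_latencies(sysdb)

# Time between request packet and response packet not spent inside SysDB.
outside_sysdb = round(client_stages[1] - sysdb[-1][1], 1)

print(client_stages, sysdb_stages, outside_sysdb)
```

Under these assumptions the per-stage latencies match the slide's Latency columns, and roughly 91.5 μs of the 200 μs interval falls outside SysDB.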
  • 22. Latency, throughput vs SysDBs
    [Charts: latency and throughput plotted against the number of SysDBs; captions: "High scalability under strict latency bound" and "Latency converges to zero-load latency"]
  • 23. Summary
    • TACC enables developers to efficiently create predictably high-performance, scalable, fault-tolerant distributed applications
    • Eliminates synchronization and locking bugs
    • Fewer lines of code
      • Faster to develop, shorter time to market
      • Easier to maintain
      • Fewer bugs
  • 24. Contact me for more information about TACC and OptumSoft!
    goosen@optumsoft.com