SDEC2011 Going by TACC

Key-value stores are widely used in applications that only require primary key data access, which is common in many web applications. Because developing an industrial-grade key-value store is expensive, the conventional solution is to use one of the existing key-value stores and layer application semantics on top of the primitives provided by the store. This approach leads to potential inefficiencies, because application-specific semantics can often allow optimizations in the implementation of the store.
We present an alternative approach, using the TACC platform to provide a key-value store implementation that is both performant and easily customizable. The TACC programming model separates state from logic: state is stored in a collection of distributed in-memory database instances, while logic is performed by distributed agents that react asynchronously to changes in objects stored in the database instances. Agents can selectively subscribe to updates using a fine-grained hierarchical directory system to mount objects into a local namespace. TACC provides performance comparable to hand-coded C while reducing the source code size to a fraction of the hand-coded equivalent. We describe the implementation and performance of a scalable and fault-tolerant key-value store using TACC, pointing out the benefits realized by using TACC's strong, user-defined types and triggering/notification.

http://sdec.kr/

Published in: Technology

Transcript of "SDEC2011 Going by TACC"

  1. Going by TACC: Beyond Key-Value to Fault-Tolerant Stores with Easily Customizable Semantics
     Henk Goosen, CEO
     goosen@optumsoft.com
  2. Key-value stores rule the Web
     - Many applications only need primary key data access
       - Examples: catalogs, shopping carts, web session state
     - No need for the complexity, performance overhead, and lack of scalability of a full database
     - Hence: key-value stores are everywhere
       - Dynamo, CouchDB, Cassandra, Project Voldemort, Riak, Redis, memcached, MongoDB, …
     Improving key-value stores is important
     OptumSoft, Inc. Proprietary and Confidential
  3. Key-value stores in practice
     - Developing a key-value store from scratch using conventional languages is expensive:
       - scalability, performance, and fault tolerance
     - Conventional solution: use an existing key-value store
       - Layer application semantics on top of get() and put()
     - Mismatches between application requirements and library: either accept them or extensively modify the library code
     Applications are more complex, performance suffers
  4. TACC provides a different model
     - Use a very high-level language to specify the key-value store
     - Then customize the store, applying application-specific semantics
     - Benefits:
       - Simplifies the application business logic
       - Improves the performance of both store and application
     TACC model is better!
  5. TACC is an object-oriented, strongly typed language
     - User-defined type: a list of attributes (nouns)
     - Read or write attributes (there are no methods/verbs)
     - Logic primarily implemented via constraints
       - Imperative code is also supported
     - Compact code
       - First-class high-level data types (e.g., queues, hash tables)
       - Several design patterns directly supported in the language (e.g., the observer pattern)
     Compact code → fewer bugs, quicker to market
  6. TACC: efficient development of distributed systems
     - Reduce development time by a factor of 2x to 3x
     - Reduce lines of code by 10x or more
     - Eliminate most synchronization and concurrency bugs
     - High, predictable performance using optimized code generation
     - Fault tolerance built into the model, and easy to implement
     TACC is a general-purpose language, focused on distributed systems
  7. Stateful remote proxy objects
     - Proxy: local copy of data
     - Writes are asynchronously copied to SysDB
     - SysDB changes are copied to “interested” agents
     - R/W access is local, fast
     - No remote access exceptions
     [Diagram labels: agents LR 1 and LR 2; SysDB collection; object 1 added to collection]
     Simple semantics, and fast
  8. SysDB: a hierarchical in-memory object database
     - Stores state (ideally no logic)
       - Minimizes risk of program logic bugs, hence reliable
     - Concise specification of user-defined types
     - TACC compiler automatically generates all required code for remote access
     - Agents receive automatic notification when values change
     [Diagram labels: Agents; SysDB]
  9. Distributed, hierarchical name space
     - SysDB defines and exports a hierarchical name space (similar to a distributed file system)
     - Remote agents can “mount” remote directories into a local namespace
     - Each object is instantiated into a directory; state is made available remotely via proxy objects
     - Updates propagate asynchronously; notifications are delivered on changes
     Simple, powerful, proven way to provide a large, structured name space
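The mount-and-notify model on slides 7 to 9 can be illustrated with a small Python sketch. This is not TACC code and not SysDB's API: `ToySysDB`, `ToyAgent`, `mount`, and `write` are hypothetical names invented for illustration, and notifications are delivered synchronously here for simplicity, whereas the slides describe asynchronous propagation.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Callback = Callable[[str, Any], None]


class ToySysDB:
    """Hypothetical stand-in for a state store with a path-like name space."""

    def __init__(self) -> None:
        self._state: Dict[str, Any] = {}
        self._subscribers: Dict[str, List[Callback]] = defaultdict(list)

    def mount(self, prefix: str, on_change: Callback) -> Dict[str, Any]:
        """Subscribe to a directory and return a snapshot of its contents."""
        prefix = prefix.rstrip("/") + "/"
        self._subscribers[prefix].append(on_change)
        return {p: v for p, v in self._state.items() if p.startswith(prefix)}

    def write(self, path: str, value: Any) -> None:
        """Store a value and notify every agent that mounted a parent directory."""
        self._state[path] = value
        for prefix, callbacks in self._subscribers.items():
            if path.startswith(prefix):
                for callback in callbacks:
                    callback(path, value)  # synchronous in this sketch only


class ToyAgent:
    """Keeps a local copy (proxy) of one mounted directory."""

    def __init__(self, db: ToySysDB, prefix: str) -> None:
        self.seen: List[Tuple[str, Any]] = []
        self.local: Dict[str, Any] = db.mount(prefix, self._on_change)

    def _on_change(self, path: str, value: Any) -> None:
        self.local[path] = value  # reads stay local after this update
        self.seen.append((path, value))


db = ToySysDB()
db.write("/users/smith/location", "R1")
agent = ToyAgent(db, "/users")
db.write("/users/jones/location", "R2")  # agent is notified
db.write("/stats/requests", 3)           # outside the mount, agent is not notified
```

After these calls the agent's local copy holds both `/users/...` entries and nothing from `/stats`, which is the selective subscription the slides describe.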
  10. Fault-tolerance is built in
      - When an agent restarts, it recovers its state from SysDB
      - Agents implement invariants, therefore can be restarted at any time, on any server
      - Any number of backup SysDBs are supported
      [Diagram labels: A1, A2, A3, A4; SP, SB]
      Fast recovery for high availability
  11. Example: Location Service as customized key-value store
      - Application needs to track real-time location of user
      - User allowed in only one location at a time
      - Three operations:
        - ENTER <user id> <session id> <location id>
        - LEAVE <user id>
        - QUERY <user id>
      - Throughput > 10,000 requests/sec, latency < 1 ms
      High throughput, low latency required
  12. Location Service Overview
      - HTTP access to service
      - Application (GS) contacts any LR server via load balancer
      - LR servers replicated for scalability and for fault tolerance
      [Diagram: GS clients send Get location, Leave, and Enter requests through the load balancer to LR servers]
      Challenge: ensure responses from multiple LR servers are handled correctly
  13. Key-value store tracks location for each user
      [Diagram: GS clients reach LR servers through the load balancer; the key-value store is split into shards A-J, K-R, and S-Z. Two LR servers handle "Enter Smith,1" and "Enter Smith,2", and each issues get(), put() for Smith against shard S-Z. The get(), put() pair has to be atomic.]
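To make the atomicity requirement on slide 13 concrete, here is a minimal Python sketch of the conventional approach: route a user to a shard by the first letter of the user id, and guard the check-then-set with a lock. The shard ranges come from the slide; `shard_for`, `LockedShard`, and its `enter` method are hypothetical names, and treating the second field of "Smith,1" as a location id is an assumption.

```python
import threading
from typing import Dict, Optional

# Shard ranges as drawn on the slide.
SHARD_RANGES = (("A", "J"), ("K", "R"), ("S", "Z"))


def shard_for(user_id: str) -> str:
    """Pick the shard whose letter range contains the user's first letter."""
    first = user_id[:1].upper()
    for low, high in SHARD_RANGES:
        if low <= first <= high:
            return f"{low}-{high}"
    raise ValueError(f"no shard for user id {user_id!r}")


class LockedShard:
    """Plain get()/put() store plus an atomic check-then-set (illustrative)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._lock = threading.Lock()

    def get(self, key: str) -> Optional[str]:
        with self._lock:
            return self._data.get(key)

    def put(self, key: str, value: str) -> None:
        with self._lock:
            self._data[key] = value

    def enter(self, user: str, location: str) -> bool:
        """Do the get() and put() under one lock so two servers cannot both win."""
        with self._lock:
            if user in self._data:
                return False
            self._data[user] = location
            return True


shard = LockedShard()
first_enter = shard.enter("Smith", "1")   # first request wins
second_enter = shard.enter("Smith", "2")  # concurrent duplicate is rejected
```

Calling separate `get()` and `put()` from two LR servers would leave a window between the two calls; the single locked `enter` closes it, at the cost of the blocking that the following slides set out to avoid.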
  14. TACC allows easy customization of key-value update semantics
      - Each partition stores a unique subset of the user state
      - We directly implement ENTER, LEAVE, and QUERY semantics, using a TACC Constrainer
      - No locking or inter-agent synchronization required
      - Requests and responses sent asynchronously
      - High performance: there is no waiting or blocking
      Specializing the key-value store semantics simplifies the application and improves performance
  15. Single-writer collections: no need for synchronization
      [Diagram: LR servers and shards A-J and K-R, connected by request collections (R) and response collections (S)]
  16. The Serializer Constrainer
      - Logic: notified of each new request, updates user status, writes result
      - Request Collection: A: Enter U1, R5; K: Enter U1, R5; D: Enter U8, R9
      - Response Collection: A: OK; K: NOT ALLOWED; D: OK
      - Status Collection: U1: R5; U8: R9
      Really simple!
  17. Details of Constrainer implementation
      - Code for the Serializer constrainer defines three collections:
        - Input collection: requests
        - Output collections: responses and user status
      - A dependency constraint causes imperative code to be executed when a new request arrives from an LR server
      - The imperative code in the constrainer implements the application-specific semantics
      This code is a minor tweak on put() implementation
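The Serializer logic on slides 16 and 17 can be approximated in Python as a handler that runs once per new request and writes to the two output collections. This is a sketch of the semantics only, not TACC code: the `Request` and `Serializer` classes are hypothetical, the session id is omitted, and the "UNKNOWN" and "BAD REQUEST" results as well as the OK reply to LEAVE are assumptions. The rule that ENTER is rejected when the user already has a location is inferred from the slide's example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Request:
    op: str                         # "ENTER", "LEAVE", or "QUERY"
    user: str
    location: Optional[str] = None  # only used by ENTER


@dataclass
class Serializer:
    # Output collections: user status and per-requester responses.
    status: Dict[str, str] = field(default_factory=dict)
    responses: Dict[str, str] = field(default_factory=dict)

    def on_request(self, requester: str, req: Request) -> None:
        """Run for each new request; one handler per shard means no locking."""
        if req.op == "ENTER":
            if req.location is None:
                result = "BAD REQUEST"      # assumed result string
            elif req.user in self.status:
                result = "NOT ALLOWED"      # user already has a location
            else:
                self.status[req.user] = req.location
                result = "OK"
        elif req.op == "LEAVE":
            self.status.pop(req.user, None)
            result = "OK"                   # assumed even if user was absent
        elif req.op == "QUERY":
            result = self.status.get(req.user, "UNKNOWN")  # assumed default
        else:
            result = "BAD REQUEST"
        self.responses[requester] = result


serializer = Serializer()
serializer.on_request("A", Request("ENTER", "U1", "R5"))
serializer.on_request("K", Request("ENTER", "U1", "R5"))
serializer.on_request("D", Request("ENTER", "U8", "R9"))
```

Replaying the three requests from slide 16 yields the same response and status collections shown there. Because a single handler processes a shard's requests in order, the check and the update need no lock.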
  18. Constraints and strong typing improve event-handling code
      - Constraint-handling code is automatically inserted by the compiler
        - No need to manually maintain invariants at many call sites
      - User-defined types organize constraint-handling code and protect against mistakes
      - TACC coroutine further simplifies event handling
      TACC changes event-handling spaghetti into well-structured, type-safe code
  19. Instrumentation and Measurements
      - Stress Agent and SysDB instrumented to collect timestamps (stored in memory, I/O after test)
      - tcpdump run on Stress Agent and SysDB servers
      - Correlate timestamps with tcpdump
  20. Low latency pitfalls to avoid
      - Network and TCP behavior
        - Many TCP settings have a dramatic and non-linear performance impact
      - Memory management
        - Memory allocation/deallocation
        - Avoid garbage collection
      “The devil is in the details”
  21. Zero-load Latency (μs)
      End-to-end (time, latency):
      - (1) Request created: 0
      - (2) Request packet: 48, 48
      - (7) Response packet: 248, 200
      - (8) Notification: 288, 40
      SysDB (time, latency):
      - (3) Receive request: 0.0
      - (4) Notification: 42.3, 42.3
      - (5) Response enqueued: 75.1, 32.8
      - (6) Response packet: 108.5, 33.4
      Latencies are low and predictable
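The per-step latencies on slide 21 are differences between consecutive timestamps. The short Python sketch below recomputes them from the times as read from the slide; the step labels and the `step_latencies` helper are illustrative, and the grouping of numbers into two timelines is a reading of the flattened slide text.

```python
from typing import List, Tuple

Timeline = List[Tuple[str, float]]  # (step label, timestamp in microseconds)


def step_latencies(timeline: Timeline) -> List[Tuple[str, float]]:
    """Return the time spent reaching each step from the previous one."""
    return [
        (label, round(t - prev_t, 1))
        for (_, prev_t), (label, t) in zip(timeline, timeline[1:])
    ]


# Timestamps as read from the slide (steps 1, 2, 7, 8 and 3, 4, 5, 6).
END_TO_END: Timeline = [
    ("request created", 0.0),
    ("request packet", 48.0),
    ("response packet", 248.0),
    ("notification", 288.0),
]
SYSDB: Timeline = [
    ("receive request", 0.0),
    ("notification", 42.3),
    ("response enqueued", 75.1),
    ("response packet", 108.5),
]

end_to_end_steps = step_latencies(END_TO_END)
sysdb_steps = step_latencies(SYSDB)

# Packet-to-packet time seen by the client minus the span covered by the
# SysDB timestamps; what the remainder consists of is not stated on the slide.
outside_sysdb = round(end_to_end_steps[1][1] - SYSDB[-1][1], 1)
```

The recomputed steps match the latency column on the slide (48, 200, 40 and 42.3, 32.8, 33.4), which also serves as a check on how the flattened numbers were grouped.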
  22. Latency, throughput vs SysDBs
      [Charts omitted. Captions: "High scalability under strict latency bound"; "Latency converges to zero-load latency"]
  23. Summary
      - TACC enables developers to efficiently create predictably high-performance, scalable, fault-tolerant distributed applications
      - Eliminates synchronization and locking bugs
      - Fewer lines of code
        - Faster to develop, shorter time to market
        - Easier to maintain
        - Fewer bugs
  24. Contact me for more information about TACC and OptumSoft!
      goosen@optumsoft.com