Successfully reported this slideshow.

RESTLess Design with Apache Thrift: Experiences from Apache Airavata


Published on

Apache Airavata is software for providing services to manage scientific applications on a wide range of remote computing resources. Airavata can be used by both individual scientists to run scientific workflows as well as communities of scientists through Web browser interfaces. It is a challenge to bring all of Airavata’s capabilities together in the single API layer that is our prerequisite for a 1.0 release. To support our diverse use cases, we have developed a rich data model and messaging format that we need to expose to client developers using many programming languages. We do not believe this is a good match for REST style services. In this presentation, we present our use and evaluation of Apache Thrift as an interface and data model definition tool, its use internally in Airavata, and its use to deliver and distribute client development kits.

Published in: Software, Technology
  • Be the first to comment

RESTLess Design with Apache Thrift: Experiences from Apache Airavata

  1. 1. Chathuri Wimalasena, Suresh Marru - Apache Airavata (Slide Contributions) Randy Abernethy - Apache Thrift
  2. 2. 40%  discount  coupon  on  any  of  Manning  Publications   Coupon  code  -­‐‑  ********   (listen  through  the  end  for  decryption  key)   A  free  ebook:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning   Publications  Co. (listen  through  the  end  and  impress  us  to  win)     Red Alert: FREE Stuff ….FREE as in *  Image  Source:  hHp://
  3. 3. Outline —  Introduction —  Apache  Thrift   —  Apache  Airavata —  Motivation  for  Airavata  to  explore  Thrift —  Airavata  API —  Road  to  Airavata  1.0 —  Airavata  Thrift  based  architecture —  Experience  &  lessons  learned —  Discussion,  Q  &  A
  4. 4. —  Modern  distributed   applications  are  rarely   composed  of  modules  wriHen   in  a  single  language —  Weaving  together  innovations   made  in  a  range  of  languages   is  a  core  competency  of   successful  enterprises —  Cross  language   communications  are  a   necessity,  not  a  luxury Polyglotism **  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.
  5. 5. —  A  high  performance,  scalable  cross  language  serialization  and  RPC  framework —  What? —  Full  RPC  Implementation  -­‐‑  Apache  Thrift  supplies  a  complete  RPC  solution:     clients,  servers,  everything  but  your  business  logic —  Modularity  -­‐‑  Apache  Thrift  supports  plug-­‐‑in  serialization  protocols:     binary,  compact,  json,  or  build  your  own —  Multiple  End  Points  –  Plug-­‐‑in  transports  for  network,  disk  and  memory  end  points,   making  Thrift  easy  to  integrate  with  other  communications  and  storage  solutions  like   AMQP  messaging  and  HDFS —  Performance  -­‐‑  Apache  Thrift  is  fast  and  efficient,  solutions  for  minimal  parsing  overhead   and  minimal  size   —  Reach  -­‐‑  Apache  Thrift  supports  a  wide  range  of  languages  and  platforms:   Linux,  OSX,  Windows,  Embedded  Systems,  Mobile,  Browser,  C++,  Go,  PHP,  Erlang,   Haskell,  Ruby,  Node.js,  C#,  Java,  C,  OCaml,  ObjectiveC,  D,  Perl,  Python,  SmallTalk,  … —  Flexibility  -­‐‑  Apache  Thrift  supports  interface  evolution,  that  is  to  say,  CI  environments  can   roll  new  interface  features  incrementally  without  breaking  existing  infrastructure. Apache Thrift *  Slide  source:  Randy  Abernethy.
  6. 6. 1.  Define  the  service  interface  in  IDL 2.  Compile  the  IDL  to  generate  client/server  RPC  stubs  in  the  desired   languages 3.  Call  the  remote  functions  as  if  they  were  local  using  the  client  stubs 4.  Connect  the  server  stubs  to  the  desired  implementation 5.  Choose  a  prebuilt  Apache  Thrift  server  to  host  your  service thrift IDL
  7. 7. —  Streaming  –  Communications  characterized  by  an   ongoing  flow  of  bytes  from  a  server  to  one  or  more   clients. —  Example:  An  internet  radio  broadcast  where  the  client   receives  bytes  over  time  transmiHed  by  the  server  in  an   ongoing  sequence  of  small  packets.   —  Messaging  –  Message  passing  involves  one  way   asynchronous,  often  queued,  communications,  producing   loosely  coupled  systems. —  Example:  Sending  an  email  message  where  you  may  get  a   response  or  you  may  not,  and  if  you  do  get  a  response   you  don’t  know  exactly  when  you  will  get  it. —  RPC  –  Remote  Procedure  Call  systems  allow  function   calls  to  be  made  between  processes  on  different   computers. —  Example:  An  iPhone  app  calling  a  service  on  the  Internet   which  returns  the  weather  forecast. Comm schemes Apache  Thrift  is  an  efficient  cross  platform   serialization  solution  for  streaming  interfaces Apache  Thrift  provides  a  complete  RPC   framework **  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.
  8. 8. —  User  Code —  client  code  calls  RPC  methods  and/or   [de]serializes  objects —  service  handlers  implement  RPC  service   behavior —  Generated  Code —  RPC  stubs  supply  client  side  proxies  and   server  side  processors —  type  serialization  code  provides  serialization   for  IDL  defined  types —  Library  Code —  servers  host  user  defined  services,  managing   connections  and  concurrency —  protocols  perform  serialization —  transports  move  bytes  from  here  to  there Architecture **  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.
  9. 9. —  The  Thrift  framework  was  originally  developed  at  Facebook  and   released  as  open  source  in  2007.  The  project  became  an  Apache   Software  Foundation  incubator  project  in  2008,  after  which  four  early   versions  were  released. —  0.9.1 released  2013-­‐‑07-­‐‑16 —  0.9.2 Planned  2014-­‐‑06-­‐‑01 —  1.0.0 Planned  2015-­‐‑01-­‐‑01 Thrift Roadmap it  is  difficult  to  make   predictions,  particularly   about  the  future. -­‐‑-­‐‑  Mark  Twain Open  Source Community  Developed Apache  License  Version  2.0 **  Figure  source:  The  Programmer’s  Guide  to  Apache  Thrift,  Manning  Publications  Co. *  Slide  source:  Randy  Abernethy.
  10. 10. —  Web — — —  Mail —  Users:  user-­‐‑ —  Developers:  dev-­‐‑ —  Chat —  #thrift —  Book —  Abernethy  (2014),  The  Programmer’s  Guide  to  Apache  Thrift,  Manning   Publications  Co.      [hHp://] Thrift Resources Chapter   1  is  free *  Slide  source:  Randy  Abernethy.
  11. 11. Knowledge and Expertise Computational Resources Scientific Instruments Algorithms and Models Archived Data and Metadata Advanced Science Tools Enabling & Democratizing Scientific Research Science Gateways:
  12. 12. Few Noted Examples:
  13. 13. Airavata  API Security Launch  &  Manage    Jobs Save/Load   configurations  and   provenance  data Notify  progress  of  job   or  workflow  execution Real-­‐‑Time   Monitoring Execute  &  Manage   Computations Data  Ingest  &  Discovery Data  Movement Data  &   Provenance   Subsystem Job  &   Workflow   Subsystem Messaging  Subsystem i Information   Subsystem CIPRES Neuro   Science Ultrascan BioVLAB GAAMP Param Chem Academic  &   Commercial   Clouds Inter-­‐‑ national   Grids Apache Airavata
  14. 14. Apache Airavata
  15. 15. Translating Science Gateway Language...
  16. 16. Interactivity through layers
  17. 17. Mathema'cal* Domain* Resource* Model** Refinement* Uncertain)es+ Orchestra)on+level+Interac)ons+ Workflow** Steering* Parametric* Sweeps* Provenance* Job+Level+Interac)ons+ Job*launch,* gliding* Checkpoint/** Restart* Asynchronous+ +refinements+ Applica)on+ Steering+ Challenges the API
  18. 18. Airavata API in a nutshell —  Airavata  provides  a  API  to  science  gateway  developers   —  Science  gateways  =>  diverse  requirements —  Airavata  API  needs  to  be   —  Easy  to  understand  and  simple —  Generic  enough  to  support  different  gateways —  Existing  science  gateway  integration  should  be  smooth —  Flexible  for  changes  arising  with  different  requirements  of  science   gateways
  19. 19. Earlier Airavata Architecture (a simplified view - evolved over 10 years)
  20. 20. Road to Airavata 1.0 —  First  Approach  -­‐‑  SOAP  /  WS —  Pros   —  ability  to  separate  out  context  and  the  payload —  Already  proven  tools  and  techniques  (XML,  WSDL,  WS-­‐‑Security  etc) —  Clients  can  easily  generate  stubs  using  WSDLs —  Cons   —  Heavy  weight —  Data  schema  changes  cannot  be  handled  easily —  Could  not  support  broad  range  of  clients
  21. 21. Contd.. Road to Airavata 1.0 —  Second  Approach  -­‐‑  REST —  Pros   —  Flexible  for  data  representation  (JSON  or  XML) —  Light  weight —  BeHer  performance —  Cons   —  Multiple  object  layers       —  No  standard  way  to  describe  the  service  to  the  client
  22. 22. Key issues with earlier Airavata API —  Only  java  clients  were  developed —  users  had  to  write  their  own  client  for  other  languages —  API  design  was  a  hybrid  model   —  Makes  it  more  complicated  internally —  Complex  data  objects  are  overwhelming  to  work  in  REST —  Lot  of  marshalling  /  unmarshalling  of  data  objects —  Unnecessary  need  for  a  servlet  container —  Overwhelming  number  of  methods  with  lots  of  overloaded  methods
  23. 23. Contd..Road to Airavata 1.0 —  Third  Approach  -­‐‑  RPC  style  tools  (Apache  Thrift,  Protocol  Buffers,   Apache  Avro) —  Ability  to  communicate  easily  across  different  programming  languages —  Easy  way  to  generate  server  and  client  code
  24. 24. Why Apache Thrift —  Apache  project —  For  Airavata  users,  a  client  library  that  can  consume  the  API  is  more   important  than  an  open  API  like  REST —  Light  framework —  No  multiple  dependencies  and  servlet  containers. —  Easy  learning  curve  to  get  started —  If  carefully  crafted,  the  IDLs  and  framework  support  backward  and   forward  compatibility
  25. 25. Clean  way  to  define  IDLs  with  richer  data  structures
  26. 26. Apache Thrift Integration with Airavata ●  Airavata  API  server  and   Orchestrator  are  thrift   services  at  the  moment ●  Other  components  will  also   become  thrift  services  in  the   future
  27. 27. Apache Thrift Integration with Airavata
  28. 28. Experience: What we gain —  Able  to  support  clients  wriHen  in  different  languages —  Clean  design —  Having  different  versions  of  servers —  TSimpleServer  -­‐‑  Simple  single  threaded  server —  TThreadPoolServer  -­‐‑  Uses  Java'ʹs  built  in  ThreadPool  management —  TNonblockingServer  -­‐‑  non-­‐‑blocking  TServer  implementation —  THsHaServer  -­‐‑  extension  of  the  TNonblockingServer  to  a  Half-­‐‑Sync/ Half-­‐‑Async  server
  29. 29. Experience Contd.. What we gain —  No  need  of  marshalling  /  unmarshalling  -­‐‑  data  models  used  internally —  Server  is  very  robust  -­‐‑  able  to  handle  large  number  of  concurrent   requests.   —  Easy  to  do  modifications  to  the  models,  since  code  is  auto-­‐‑generated —  Convenient  way  to  achieve  backward  compatibility
  30. 30. Lessons Learned —  Modularize  the  data  models  in  order  to  ease  the  maintenance —  No  maven  plugin  at  the  moment,  hence  you  have  to  commit  auto-­‐‑ generated  code.
  31. 31. Lessons Learned Contd..
  32. 32. Lessons Learned Contd.. —  Thrift  has  limited  support  to  handle  null  values. —  Experiment  model  is  a  complex  model  with  lot  of  other  structs. —  Enums  cannot  be  passed  as  null  over  the  wire  even  they  are  specified  as   optional  in  the  struct —  Thrift  has  limited  documentation  and  samples —  airavata  can  be  used  as  a  reference   —  hHps://
  33. 33. Future Directions —  March  towards  Airavata  1.0  with  a  stable  API —  Exploring  next  generation  information  model  and  Messaging  Systems —  Ka{a,  flume,  fluend,  Scribe,  Hedwig…. —  Refactor  Registry  Implementation   —  Airavata  architecture  to  support  multi-­‐‑tenanted  PAAS —  Motivated  by  a  downstream  open  source  project  SciGaP  –  Science   Gateways  Platform  as  a  Service  
  34. 34. How you can participate/follow —  Join  the  party  -­‐‑  Architecture  mailing  list —  BYOX   —  X  =  opinions,  software,  ideas,  code,  yourself  ..
  35. 35. Take Home —  2  e  books  on  thrift —  awareness  of  thrift  and  airavata —  lessons  learned —  how  to  get  involved —  We  are  under  same  umbrella  for  a  reason,  lets  party  together —  Join  our  Architecture  List  and  help   —  Other  means??
  36. 36. Discussion, Q & A