Migrating Netflix from Datacenter Oracle to Global Cassandra

Slide deck from July 11, 2011 Cassandra Summit talk.
Updated 7/18 to fix a spelling mistake.


  1. Replacing Datacenter Oracle with Global Apache Cassandra on AWS. July 11, 2011. Adrian Cockcroft, @adrianco, #netflixcloud, http://www.linkedin.com/in/adriancockcroft
  2. Netflix Inc. With more than 23 million subscribers in the United States and Canada, Netflix, Inc. is the world’s leading Internet subscription service for enjoying movies and TV shows. International Expansion: We plan to expand into an additional market in the second half of 2011… If the second market meets our expectations… we will continue to invest and expand aggressively in 2012. Source: http://ir.netflix.com
  3. Building a Global Netflix Service • Netflix Cloud Migration • Data Migration to Cassandra • Highly Available and Globally Distributed Data • Backups and Archives in the Cloud • Monitoring Cassandra • Contributions and Organization
  4. Why Use Public Cloud?
  5. Frictionless Deployment (JFDI)
  6. Things We Don’t Do
  7. Better Business Agility
  8. Data Center: Netflix could not build new datacenters fast enough. Capacity growth is accelerating, unpredictable. Product launch spikes - iPhone, Wii, PS3, XBox
  9. 23 Million Customers. 2011-Q1 year/year customers +69%. [Chart: customers in millions (axis 0 to 25) by quarter, 2009Q2 through 2011Q1.] Source: http://ir.netflix.com
  10. Out-Growing Data Center. http://techblog.netflix.com/2011/02/redesigning-netflix-api.html [Chart labels: 37x Growth Jan 2010-Jan 2011; Datacenter Capacity.]
  11. Netflix.com is now ~100% Cloud. Account sign-up is currently being moved to cloud. All international product is cloud based. USA specific logistics remains in the Datacenter.
  12. Netflix Choice was AWS with our own platform and tools. Unique platform requirements and extreme agility and flexibility.
  13. Leverage AWS Scale: “the biggest public cloud”. AWS investment in features and automation. Use AWS zones and regions for high availability, scalability and global deployment.
  14. We want to use clouds, we don’t have time to build them. Public cloud for agility and scale. AWS because they are big enough to allocate thousands of instances per hour when we need to.
  15. Netflix Deployed on AWS • Content: Video Masters, EC2, S3, CDN • Logs: S3, EMR Hadoop, Hive, Business Intelligence • Play: DRM, CDN routing, Bookmarks, Logging • WWW: Sign-Up, Search, Movie Choosing, Ratings • API: Metadata, Device Config, TV Movie Choosing, Mobile iPhone
  16. Port to Cloud Architecture. Short term investment, long term payback! Pay down technical debt. Robust patterns.
  17. Transition • The Goals – Faster, Scalable, Available and Productive • Anti-patterns and Cloud Architecture – The things we wanted to change and why • Data Migration – Minimizing datacenter dependencies
  18. Datacenter Anti-Patterns: What do we currently do in the datacenter that prevents us from meeting our goals?
  19. Old Datacenter vs. New Cloud Arch • Central SQL Database -> Distributed Key/Value NoSQL • Sticky In-Memory Session -> Shared Memcached Session • Chatty Protocols -> Latency Tolerant Protocols • Tangled Service Interfaces -> Layered Service Interfaces • Instrumented Code -> Instrumented Service Patterns • Fat Complex Objects -> Lightweight Serializable Objects • Components as Jar Files -> Components as Services
  20. The Central SQL Database • Datacenter has central Oracle databases – Everything in one place is convenient until it fails – Customers, movies, history, configuration • Schema changes require downtime. Anti-pattern impacts scalability, availability.
  21. The Distributed Key-Value Store • Cloud has many key-value data stores – More complex to keep track of, do backups etc. – Each store is much simpler to administer – Joins take place in Java code [image label: DBA] • No schema to change, no scheduled downtime • Latency for typical queries – Memcached is dominated by network latency <1ms – Cassandra replication takes a few milliseconds – Oracle for simple queries is a few milliseconds – SimpleDB replication and REST auth overheads >10ms
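Slide 21 notes that joins move out of the database and into application code. A minimal sketch of that idea follows, written in Python for brevity although the slide refers to Java. The three dictionaries are hypothetical in-memory stand-ins for separate key-value stores, and every name and record in them is invented for illustration.

```python
# Hypothetical stand-ins for three independent key-value stores.
customers = {"c1": {"name": "Ann"}, "c2": {"name": "Bob"}}
viewing_history = {"c1": ["m10", "m11"], "c2": []}
movies = {"m10": {"title": "Movie A"}, "m11": {"title": "Movie B"}}


def titles_watched(customer_id):
    """Join customer -> viewing history -> movie titles in application code."""
    customer = customers.get(customer_id)
    if customer is None:
        return None  # unknown customer: nothing to join
    # One lookup per store replaces what a SQL JOIN would do in one query.
    titles = [
        movies[movie_id]["title"]
        for movie_id in viewing_history.get(customer_id, [])
        if movie_id in movies
    ]
    return {"name": customer["name"], "titles": titles}
```

Each lookup is a separate round trip in a real deployment, so the per-store latencies listed on the slide add up for every joined record.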
  22. Data Migration to Cassandra
  23. Transitional Steps • Bidirectional Replication – Oracle to SimpleDB – Queued reverse path using SQS – Backups remain in Datacenter via Oracle • New Cloud-Only Data Sources – Cassandra based – No replication to Datacenter – Backups performed in the cloud
  24. [Architecture diagram: API on AWS EC2. Labels: Front End Load Balancer, Discovery Service, API Proxy, Load Balancer, API etc., Component Services, SQS, memcached, Cassandra, EC2 Internal Disks, S3, SimpleDB, Replication, Oracle (Netflix Data Center).]
  25. Cutting the Umbilical • Transition Oracle Data Sources to Cassandra – Offload Datacenter Oracle hardware – Free up capacity for growth of remaining services • Transition SimpleDB+Memcached to Cassandra – Primary data sources that need backup – Keep simple use cases like configuration service • New challenges – Backup, restore, archive, business continuity – Business Intelligence integration
  26. [Architecture diagram: API on AWS EC2. Labels: Front End Load Balancer, Discovery Service, API Proxy, Load Balancer, API, Component Services, memcached, Cassandra, EC2 Internal Disks, Backup, S3, SimpleDB.]
  27. High Availability • Cassandra stores 3 local copies, 1 per zone – Synchronous access, durable, highly available – Read/Write One fastest, least consistent - ~1ms – Read/Write Quorum 2 of 3, consistent - ~3ms • AWS Availability Zones – Separate buildings – Separate power etc. – Close together
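The "Quorum 2 of 3" figure on slide 27 follows from the usual majority rule, and reads are guaranteed to overlap the latest write when the read and write replica counts together exceed the number of copies. A small sketch of that arithmetic is below. The function names are illustrative and are not part of any Cassandra client API.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of a replica set: floor(N / 2) + 1."""
    return replicas // 2 + 1


def reads_see_latest_write(replicas: int, read_nodes: int, write_nodes: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return read_nodes + write_nodes > replicas
```

With three copies, quorum reads and writes (2 + 2 > 3) always overlap on at least one replica, while reading and writing at level One (1 + 1 <= 3) can miss the newest value, which matches the slide's "least consistent" label.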
  28. Remote Copies • Cassandra duplicates across AWS regions – Asynchronous write, replicates at destination – Doesn’t directly affect local read/write latency • Global Coverage – Business agility – Follow AWS… • Local Access – Better latency – Fault Isolation
  29. Cassandra Backup • Full Backup – Cron on each node – Snapshot -> tar.gz -> S3 • Incremental – SSTable write triggers copy to S3 • Continuous – Scrape commit log – Write to EBS every 30s [Diagram: Cassandra ring nodes copying to an S3 backup.]
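The full-backup path on slide 29 (snapshot -> tar.gz -> S3) can be sketched with the Python standard library. This is only an illustration under stated assumptions: the snapshot directory is taken to exist already, and `upload` is a hypothetical callable standing in for whatever S3 client the real cron job uses.

```python
import tarfile
from pathlib import Path


def archive_snapshot(snapshot_dir: str, archive_path: str) -> str:
    """Pack a snapshot directory into a gzip-compressed tar archive."""
    src = Path(snapshot_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store entries under the directory's own name, not its full path.
        tar.add(str(src), arcname=src.name)
    return archive_path


def backup(snapshot_dir: str, archive_path: str, upload) -> None:
    """Archive the snapshot, then hand the archive to the uploader.

    `upload` is a hypothetical callable (an assumption, not a real S3 API).
    """
    upload(archive_snapshot(snapshot_dir, archive_path))
```

Passing the uploader in as a parameter keeps the archive step testable without network access.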
  30. Cassandra Restore • Full Restore – Replace previous data • New Ring from Backup – New name old data – One line command! [Diagram: S3 backup restored into a Cassandra ring.]
  31. Cassandra Data Extraction • Business Intelligence – Re-normalize data using Hadoop job • Daily Extraction – Create Brisk ring – Extract backup – Run Hadoop job – Remove Brisk ring – Under 1hr… [Diagram: S3 backup loaded into a Brisk ring.]
  32. Cassandra Online BI • Intra-Day Extraction – Use split Brisk ring – Size each separately – Hourly Hadoop job [Diagram: ring split between Cassandra and Brisk nodes, with S3 backup.]
  33. Cassandra Archive. Appropriate level of paranoia needed… • Archive could be un-readable – Base on restored S3 backup and BI extracted data • Archive could be stolen – Encrypt archive • AWS East Region could have a problem – Copy data to AWS West • Production AWS Account could have an issue – Separate Archive account with no-delete S3 ACL • AWS S3 could have a global problem – Create an extra copy on a different cloud vendor
  34. Tools and Automation • Developer and Build Tools – Jira, Perforce, Eclipse, Jenkins, Ivy, Artifactory – Builds, creates .war file, .rpm, bakes AMI and launches • Custom Netflix Application Console – AWS Features at Enterprise Scale (hide the AWS security keys!) – Auto Scaler Group is unit of deployment to production • Open Source + Support – Apache, Tomcat, Cassandra, Hadoop, OpenJDK, CentOS – DataStax support for Cassandra, AWS support for Hadoop via EMR • Monitoring Tools – DataStax OpsCenter for monitoring Cassandra – AppDynamics – Developer focus for cloud http://appdynamics.com
  35. Developer Migration • Detailed SQL to NoSQL Transition Advice – Sid Anand - QConSF Nov 5th – Netflix’ Transition to High Availability Storage Systems – Blog - http://practicalcloudcomputing.com/ – Download Paper PDF - http://bit.ly/bhOTLu • Mark Atwood, “Guide to NoSQL, redux” – YouTube http://youtu.be/zAbFRiyT3LU
  36. Cloud Operations • Cassandra Use Cases • Model Driven Architecture • Capacity Planning & Monitoring • Chaos Monkey
  37. Cassandra Use Cases • Key by Customer – Several separate Cassandra rings, read-intensive – Sized to fit in memory using m2.4xl Instances • Key by Customer:Movie – e.g. Viewing History – Growing fast, write intensive – m1.xl instances – Sized to hold hot data in memory only • Large scale data logging – lots of writes – Column data expires after time period – Working on using distributed counters…
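Two of the ideas on slide 37, composite Customer:Movie keys and columns that expire after a time period, can be illustrated without a Cassandra cluster. The sketch below models columns as a plain dictionary. The helper names and the `(value, written_at)` layout are assumptions made for illustration, not Cassandra's storage format or client API.

```python
def row_key(customer_id: str, movie_id: str) -> str:
    """Build a composite row key in the Customer:Movie style."""
    return f"{customer_id}:{movie_id}"


def live_columns(columns: dict, now: float, ttl_seconds: float) -> dict:
    """Return only the columns whose age is still below the TTL.

    `columns` maps a column name to a (value, written_at) pair, with
    written_at and now expressed in seconds (hypothetical layout).
    """
    return {
        name: value
        for name, (value, written_at) in columns.items()
        if now - written_at < ttl_seconds
    }
```

For example, with a 60 second TTL, a column written at t=0 is dropped when read at t=100 while one written at t=90 is kept.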
  38. Model Driven Architecture • Datacenter Practices – Lots of unique hand-tweaked systems – Hard to enforce patterns • Model Driven Cloud Architecture – Perforce/Ivy/Jenkins based builds for everything – Every production instance is a pre-baked AMI – Every application is managed by an Autoscaler. Every change is a new AMI.
  39. Netflix Platform Cassandra AMI • Tomcat server – Always running, registers with platform – Manages Cassandra state, tokens, backups • SimpleDB configuration – Stores token slots and options – Avoids circular “bootstrap problems” • Removed Root Disk Dependency on EBS – Use S3 backed AMI for stateful services – Normally use EBS backed AMI for fast provisioning
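Slide 39 mentions stored "token slots" without saying how they are chosen. One common convention for Cassandra's RandomPartitioner, whose token space runs from 0 to 2^127, is to space the nodes evenly around the ring. The sketch below shows that convention only. It is an assumption for illustration and not a description of how the Netflix Tomcat helper assigns slots.

```python
TOKEN_SPACE = 2 ** 127  # RandomPartitioner token range (assumed partitioner)


def token_slots(node_count: int) -> list:
    """Evenly spaced initial tokens for a ring of node_count nodes."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    # Integer arithmetic keeps the 127-bit values exact.
    return [i * TOKEN_SPACE // node_count for i in range(node_count)]
```

Keeping a table like this outside the cluster means a replacement node can look up its slot before Cassandra itself is running, which is one way to read the slide's point about avoiding circular bootstrap problems.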
  40. Netflix App Console
  41. Auto Scale Group Configuration
  42. Chaos Monkey • Make sure systems are resilient – Allow any instance to fail without customer impact • Chaos Monkey hours – Monday-Thursday 9am-3pm random instance kill • Application configuration option – Apps now have to opt-out from Chaos Monkey • Computers (Datacenter or AWS) randomly die – Fact of life, but too infrequent to test resiliency
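The schedule and opt-out rule on slide 42 can be expressed as a small selection function. This is a sketch and not Netflix's Chaos Monkey code: the function names are invented, the window is assumed to close at 3pm exactly, and the actual termination call is left out.

```python
import random
from datetime import datetime


def in_chaos_hours(now: datetime) -> bool:
    """Monday-Thursday, 9am up to (but not including) 3pm."""
    # datetime.weekday(): Monday is 0, Thursday is 3.
    return now.weekday() <= 3 and 9 <= now.hour < 15


def pick_victim(instances, opted_out, now, rng=random):
    """Choose one random instance to kill, or None if nothing qualifies.

    `opted_out` holds instances whose application opted out (hypothetical
    representation of the per-application configuration option).
    """
    if not in_chaos_hours(now):
        return None
    candidates = [i for i in instances if i not in opted_out]
    return rng.choice(candidates) if candidates else None
```

Restricting kills to working hours means failures surface while engineers are available to respond.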
  43. Capacity Planning & Monitoring
  44. Capacity Planning in Clouds (a few things have changed…) • Capacity is expensive • Capacity takes time to buy and provision • Capacity only increases, can’t be shrunk easily • Capacity comes in big chunks, paid up front • Planning errors can cause big problems • Systems are clearly defined assets • Systems can be instrumented in detail • Depreciate assets over 3 years (reservations!)
  45. Data Sources • External Testing – External URL availability and latency alerts and reports: Keynote – Stress testing: SOASTA • Request Trace Logging – Netflix REST calls: Chukwa to DataOven with GUID transaction identifier – Generic HTTP: AppDynamics service tier aggregation, end to end tracking • Application logging – Tracers and counters: log4j, tracer central, Chukwa to DataOven – Trackid and Audit/Debug logging: DataOven, AppDynamics GUID cross reference • JMX Metrics – Application specific real time: DataStax OpsCenter, AppDynamics – Service and SLA percentiles: AppDynamics, Epic logged to DataOven • Tomcat and Apache logs – Stdout logs: S3, DataOven – Standard format Access and Error logs: S3, DataOven • JVM – Garbage Collection: AppDynamics – Memory usage, call stacks, resource/call: AppDynamics • Linux – System CPU/Net/RAM/Disk metrics: AppDynamics – SNMP metrics: Epic; Network flows: boundary.com • AWS – Load balancer traffic: Amazon CloudWatch, SimpleDB usage stats – System configuration (CPU count/speed and RAM size, overall usage): AWS
  46. AppDynamics: How to look deep inside your cloud applications • Automatic Monitoring – Base AMI bakes in all monitoring tools – Outbound calls only – no discovery/polling issues – Inactive instances removed after a few days • Incident Alarms (deviation from baseline) – Business Transaction latency and error rate – Alarm thresholds discover their own baseline – Email contains URL to Incident Workbench UI
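Slide 46 describes alarms that fire on deviation from a self-discovered baseline. The slide does not say how AppDynamics computes that baseline, so the sketch below swaps in a generic technique: the mean of recent samples plus or minus a multiple of their standard deviation. The function name and the default multiplier `k` are assumptions.

```python
from statistics import mean, stdev


def is_incident(history, latest, k=3.0):
    """Flag `latest` when it deviates from the baseline by more than k sigma.

    `history` is a list of recent latency (or error rate) samples; the
    baseline is learned from it rather than configured by hand.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline yet
    baseline = mean(history)
    spread = stdev(history)
    return abs(latest - baseline) > k * spread
```

Because the threshold is derived from each metric's own history, the same rule works for a 5 ms call and a 500 ms call without per-service tuning.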
  47. AppDynamics Monitoring of Cassandra – Automatic Discovery
  48. DataStax OpsCenter
  49. Netflix Contributions to Cassandra • Cassandra as a mutable toolkit – Cassandra is in Java, pluggable, well structured – Netflix has a building full of Java engineers… • Actual Contributions delivered in 0.8 – First prototype of off-heap row cache (Vijay) – Incremental backup SSTable write callback • Work In Progress – AWS integration and backup using Tomcat helper – Total re-write of Hector Java client library (Eran)
  50. Netflix “NoOps” Organization • Marketing & Advertising Site for Customer Acquisition • Member Site Personalization for Customer Retention • Cloud Ops Reliability Engineering (Cassandra, AWS) • Build Tools and Automation (Perforce, Jenkins, AWS) • Database Engineering (Cassandra, AWS) • Platform Development (Cassandra, AWS) • Cloud Performance (Cassandra, AWS) • Cloud Solutions (Cassandra, AWS)
  51. Takeaway: Netflix is using Cassandra on AWS as a key infrastructure component of its globally distributed streaming product. http://www.linkedin.com/in/adriancockcroft @adrianco #netflixcloud
  52. Amazon Cloud Terminology Reference. See http://aws.amazon.com/ This is not a full list of Amazon Web Service features. • AWS – Amazon Web Services (common name for Amazon cloud) • AMI – Amazon Machine Image (archived boot disk, Linux, Windows etc. plus application code) • EC2 – Elastic Compute Cloud – Range of virtual machine types m1, m2, c1, cc, cg. Varying memory, CPU and disk configurations. – Instance – a running computer system. Ephemeral, when it is de-allocated nothing is kept. – Reserved Instances – pre-paid to reduce cost for long term usage – Availability Zone – datacenter with own power and cooling hosting cloud instances – Region – group of Availability Zones – US-East, US-West, EU-Eire, Asia-Singapore, Asia-Japan • ASG – Auto Scaling Group (instances booting from the same AMI) • S3 – Simple Storage Service (http access) • EBS – Elastic Block Store (network disk filesystem can be mounted on an instance) • RDS – Relational Database Service (managed MySQL master and slaves) • SDB – SimpleDB (hosted http based NoSQL data store) • SQS – Simple Queue Service (http based message queue) • SNS – Simple Notification Service (http and email based topics and messages) • EMR – Elastic Map Reduce (automatically managed Hadoop cluster) • ELB – Elastic Load Balancer • EIP – Elastic IP (stable IP address mapping assigned to instance or ELB) • VPC – Virtual Private Cloud (extension of enterprise datacenter network into cloud) • IAM – Identity and Access Management (fine grain role based security keys)
