Netflix in the Cloud

Oct 14, 2010
Adrian Cockcroft
@adrianco #netflixcloud
http://www.linkedin.com/in/adriancockcroft
  
Beta Slides

"I didn't have time to write a short letter, so I wrote a long one instead."
Mark Twain

Teaser intro slides now at slideshare.net/adrianco
Oct 14th Beta #devops subset - Cloud Computing Meetup
Nov 3rd GA – QConSF and full slides on slideshare.net
  
With more than 15 million subscribers in the United States and Canada, Netflix, Inc. is the world's leading Internet subscription service for enjoying movies and TV shows.
  
Synopsis

•  Why Give This Talk? Why Use AWS?
•  The Goals
    –  Faster, Scalable, Available and Productive
•  Datacenter Anti-patterns
    –  the things we wanted to change and why
•  Cloud Architecture Features
•  Cloud Bring-up Strategy
•  Developer Transitions and Tools
•  Roadmap and Next Steps
  
Why Give This Talk?

•  Netflix is Pathfinding
    –  Cloud ecosystem is evolving very fast
    –  Share with and learn from the cloud community
•  We want to use clouds, not build them
    –  Cloud technology should be a commodity
    –  Public cloud and open source for agility and scale
•  We are looking for talent to help…
    –  How do we connect with the very best engineers?
    –  http://www.quora.com/10X-Engineers
  
Why Use AWS?

•  We stopped building our own datacenters
    –  Capacity growth rate is accelerating, unpredictable
    –  Product launch spikes - iPhone, Wii, PS3, Xbox
    –  Datacenter is large inflexible capital commitment
•  Leverage AWS Scale – the biggest public cloud
    –  AWS investment in tooling and automation
    –  AWS zones for high availability, scalability
    –  AWS skills are common on resumes…
•  Leverage AWS Feature Set – most advanced
    –  EC2, S3, SDB, SQS, EBS, EMR, ELB, ASG, RDB..
  
“The cloud lets its users focus on delivering differentiating business value instead of wasting valuable resources on the undifferentiated heavy lifting that makes up most of IT infrastructure.”

    - Werner Vogels
      August 25, 2009, ‘All Things Digital’
  
Netflix Deployed on AWS

•  “batch” Movie Encoding farm (2009)
    –  Thousands of EC2 instances, Petabytes of S3
    –  Movie files staged out to CDNs for delivery
•  Hadoop - Elastic Map-Reduce (2009)
    –  Large scale log processing and analytics
•  Streaming Service Back-end (early 2010)
    –  Highly available and scalable “play button”
•  Web site, a page at a time (through 2010)
    –  Ramp from 25% of views to 80% in the coming weeks
•  API for TV devices and iPhone etc. (2010)
    –  Personalized movie choosing algorithm back-end
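
The "ramp from 25% of views to 80%" bullet describes shifting page views to the cloud gradually. The slides do not say how the split was made, so the following is only a minimal sketch of one common approach, assuming a stable hash of a visitor identifier decides which side serves the view. The names `bucket`, `route`, and `cloud_percent` are hypothetical and not taken from the talk.

```python
import hashlib


def bucket(request_id: str) -> int:
    """Map a request identifier to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def route(request_id: str, cloud_percent: int) -> str:
    """Return "cloud" if the identifier falls inside the ramp, else "datacenter"."""
    if not 0 <= cloud_percent <= 100:
        raise ValueError("cloud_percent must be between 0 and 100")
    # Buckets below the threshold go to the cloud, so raising the
    # threshold only ever moves additional visitors to the cloud.
    return "cloud" if bucket(request_id) < cloud_percent else "datacenter"


if __name__ == "__main__":
    # Hypothetical visitor identifiers, used only to show the observed split.
    ids = [f"visitor-{n}" for n in range(10_000)]
    for percent in (25, 80):
        share = sum(route(i, percent) == "cloud" for i in ids) / len(ids)
        print(f"target {percent}% -> observed {share:.1%}")
```

Because the bucket depends only on the identifier, a visitor already served from the cloud at 25% stays there when the setting is raised to 80%, which keeps the experience consistent during the ramp.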
  
Learnings…

•  Datacenter oriented tools don’t work
    –  Ephemeral instances, high rate of change
•  Cloud Tools Don’t Scale for Enterprise
    –  Built our own tools, drove vendors hard
•  “fork-lifted” apps don’t work well
    –  Fragile, too many datacenter oriented assumptions baked in
•  It’s faster in the end to re-code than tinker
    –  Re-architected and re-wrote much of our code base
    –  Fine grain web services
    –  Leveraging open source in Java
    –  Systematically instrumented
    –  “NoSQL” SimpleDB backend.

In the datacenter, robust code is best practice. In the cloud, it’s essential.
  
Next Few Years…

•  “System of Record” moves to Cloud
    –  Cut the datacenter to cloud replication link
•  International Expansion – Global Clouds
    –  Rapid deployments to new markets
•  GPU Clouds optimized for video encoding
•  Cloud Standardization
    –  Cloud features and APIs should be a commodity not a differentiator
    –  Differentiate on scale and quality of service
    –  Competition drives cost down
    –  Higher resilience
    –  Higher scalability

We would prefer to be an insignificant customer in a giant cloud
  
Takeaway

Netflix is path-finding the use of public AWS cloud to replace in-house IT for non-trivial applications with hundreds of developers and thousands of systems.

http://www.linkedin.com/in/adriancockcroft
@adrianco #netflixcloud
  
