Netflix oss season 2 episode 1 - meetup Lightning talks

Published in: Technology

Transcript of "Netflix oss season 2 episode 1 - meetup Lightning talks"

  1. Season 2 Episode 1
     March 12, 2014
  2. Evening Outline
     Lightning Talks:
     - S3mper
     - PigPen
     - STAASH
     - Dynomite
     - Aegisthus
     - Suro
     - Zeno
     - Lipstick on GCE
     - AnsWerS
     - IBM
     - Coursera
  3. 41 projects… Now what?
     ● Cohesive platform
     ● Workshops / Training / Documentation
     ● Participate and contribute: netflixoss@netflix.com
  4. Lightning talks
  5. Lipstick, Hadoop, and Big Data on the Google Cloud
     Matt Bookman, Solutions Architect
  6. Google Confidential and Proprietary
     Google Compute Engine - VMs in Google Datacenters
     ● Public Preview - May 2013
     ● General Availability - December 2013
  7. Demo (Summer 2013): Pig on Compute Engine
     Sweet demo!
  8. Netflix OSS Meetup - July 17, 2013
  9. Lipstick - Providing insights
  10. Google Confidential and Proprietary
  11. Hadoop on GCE + Cloud Storage (GCS) Connector
      Accenture: Cloud vs. Bare-Metal
      ● Cloud-based Hadoop deployments offer better price-performance ratios than bare-metal
      ● Cloud’s virtualization expands performance-tuning opportunities
      ● Using remote storage outperforms local disk HDFS
  12. Data in GCS, Lipstick DB in Cloud SQL
      Diagram labels (Google Cloud Platform): Input Data, Output Data, Lipstick Database, Lipstick Server, Hadoop Master (MapReduce JobTracker), Hadoop Worker (MapReduce TaskTracker) ×3
  13. Resources
      ● Netflix Lipstick on Google Compute Engine
        https://cloud.google.com/developers/articles/netflix-lipstick-on-google-compute-engine
      ● GCS Connector for Hadoop
        https://developers.google.com/hadoop/google-cloud-storage-connector
      ● Cloud-based Hadoop Deployments: Benefits and Considerations
        http://www.accenture.com/SiteCollectionDocuments/PDF/Accenture-Cloud-Based-Hadoop-Deployments-Benefits-and-Considerations.pdf
      ● Apache Hadoop, Hive, and Pig on Google Compute Engine
        https://cloud.google.com/developers/articles/apache-hadoop-hive-and-pig-on-google-compute-engine
  14. Thank you
  15. @Answers4AWS
      Cloud Prize and Beyond
      Peter Sankauskas @pas256
  16. March 2013
  17. First idea
      • AsgardFormation
      • CloudFormation for Asgard
  18. @Answers4AWS
  19. @Answers4AWS
  20. Requirements
      • AsgardFormation
      • Asgard running
      • AWS Credentials
      • IAM user
      • Policy
      • Security Group
      • EC2 instance
      • Asgard downloaded and configured
      • Tomcat downloaded and configured
      • Java downloaded and installed
      • Linux configured
  21. @Answers4AWS
  22. Asgard playbook
      • Base
      • Install usual Linux packages
      • Basic system hardening and security packages
      • Oracle Java 7
      • Tomcat 7
      • Asgard
      • Latest release from GitHub
  23. Other playbooks
      • Eureka
      • Edda
      • Simian Army
      • Ice
      • Aminator
      • Genie
  24. AMIs
      • Initially built using my own scripts based on Eric Hammond’s (@esh) work
      • Then using Aminator
      • Created Ubuntu Foundation AMIs
      • Added the Ansible Provisioner for Aminator
      • Put a couple of them on the AWS Marketplace for free
  25. CloudFormation
      • One-click deploy
      • Well, about 10 going through the AWS Web Console wizard
      • Designed to get you up and running quickly
      • Test it out, see if you like it
      • NOT production quality
      • No real security
      • No HA
  26. @Answers4AWS
  27. What’s next?
  28. Do you do this? (this is not my slide)
  29. @Answers4AWS
  30. Beta users
      • From a successful CI build
      • To a Fully Baked AMI
      • Use in Testing and Production
      • Without you doing anything
      • ZERO clicks
      • Signups are open
  31. Thank you
      http://bakery.answersforaws.com/
      bakery@answersforaws.com
      See me at the demo station
      Peter Sankauskas @pas256
  32. IBM Scalable Services Fabric for Netflix S2E1 Meetup
      Andrew Spyker @aspyker
  33. History and Future
      Timeline labels, in slide order:
      2012 SPECjEnterprise
      2013 AcmeAir Run On IBM Cloud at “Web Scale”
      2014 Scalable Services Fabric internally for IBM Services
      Scalable Services Fabric SaaS and On-Prem?
      Sample application cloud prize work
      AcmeAir Cloud/Mobile Sample/Benchmark born
      Codename: BlueMix
      Portability cloud prize work
  34. Scalable Service Fabric Work
      Netflix OSS IBM port/enablement
      Netflix “Zen” of Cloud
      • Worked with initial services to enable cloud native arch
      • Worked with initial services to enable NetflixOSS usages
      • Created scorecard and tests for “cloud native” readiness
      Highly Available IaaS and Cloud Services
      • Deployment across multiple IBM SoftLayer IaaS datacenters and global and local load balancers
      • Complete automation via IBM SoftLayer IaaS APIs
      • Ensured facilities for automatic failure recovery
      Micro-service Runtimes (Karyon, Eureka Client, Ribbon, Hystrix, Archaius)
      • Ported to work with IBM SoftLayer IaaS and on the WebSphere Liberty Profile application server
      • Created “eureka-sidecar” for non-Java runtimes and ElasticSearch discovery
      Netflix OSS Servers (Asgard, Eureka Server, Turbine)
      • Ported to work with IBM SoftLayer IaaS + RightScale
      • Operationalized HA and secure deployments for multiple service tenants
      Adopted Chaos Testing
      • Ported Chaos Monkey to IBM SoftLayer IaaS
      • Performed manual Chaos Gorilla validation on services
      Worked through devops tool chain
      • Worked with initial services to enable continuous delivery with devops (and image baking via an Aminator-like tool)
  35. Come meet the team!
      Looks like … | Tweets from … | Talks about …
      • Adolfo | @adolforod | API Management and Cloud Integration, user of NetflixOSS platform. Appliances in the cloud.
      • Brian | @bkmartin | IBM BlueMix (PaaS), enabling composable apps in PaaS
      • Darrell | (none listed) | IBM Research focusing on NetflixOSS devops and on-premise deployments
      • David | @dcurrie | WebSphere Liberty Profile application server NetflixOSS development and PaaS integration
      • Jonathan | @ma4jpb | NetflixOSS portability across many aspects. Cloud messaging (in relation to Suro)
      • Matt | @matrober | API Management, user of NetflixOSS platform. Converted service to be cloud native
      • Rachel | @rreinitz | IBM Services, interested in helping you get to this cloud native in SaaS and on-premise
      • Ricky | @rickymoorhouse | API Management, user of NetflixOSS platform. Creator of Imaginator
      • Will | @auwilli98 | API Management operations, user of NetflixOSS platform
  36. Priam + Aegisthus @Coursera
      NetflixOSS Meetup
  37. Introduction
      @DanielChiaJH
      Software Engineer, Infrastructure Team
      Coursera
  38. Overview
      • Philosophy
      • Priam
      • Aegisthus
      • Conclusion
  39. Philosophy
      • Architecture Patterns
      • Use what we can
      • Incorporate the spirit of others
  40. Priam – Wins
      • Token Management
      • S3 Backup + Restore
      • Config
  41. Priam – Next Steps?
      • SimpleDB -> DynamoDB
      • Backups blow out OS disk buffer cache
      • Compatibility with newer C* versions
  42. Aegisthus – Wins
      • Novel workflow
      • Data reduced to one authoritative copy
      • Possibility for incremental jobs
  43. Aegisthus – What Next?
      • C* 1.2 / 2.0
      • CQL3
      • Priam <–> Aegisthus
      • Better compressed SSTable support
  44. Conclusion
      • Come chat with me!
      • Especially if you have similar goals to me
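
The "Token Management" win listed for Priam can be pictured with a small calculation. The following is a minimal sketch, assuming Cassandra's RandomPartitioner token space (0 to 2^127) and evenly spaced tokens for a single ring. The function name `evenly_spaced_tokens` is hypothetical, and this is not Priam's actual code, which may handle further concerns such as multi-region layouts.

```python
# Size of the RandomPartitioner token space (assumption: single ring, no vnodes).
RING_SIZE = 2 ** 127


def evenly_spaced_tokens(node_count):
    """Return one initial token per node, spaced evenly around the ring."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    step = RING_SIZE // node_count
    return [i * step for i in range(node_count)]


# Example: a four-node ring; each node owns a quarter of the token space.
tokens = evenly_spaced_tokens(4)
for slot, token in enumerate(tokens):
    print(slot, token)
```

Automating this kind of assignment means a replacement node can take over the slot, and therefore the token, of the instance it replaces without an operator computing the value by hand.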
  45. Zeno
      ● In-memory data distribution platform
      ● Contains tools for:
        ○ data quality management
        ○ data serialization
      ● We use it to distribute and keep up to date gigabytes of video metadata on tens of thousands of servers across the globe
  46. Zeno
      Why in-memory data?
      - Netflix serves billions of requests per day
      - Each request requires metadata about many movies to answer
  47. Zeno
      Netflix Use Case:
      ● Gigabytes of in-memory data
      ● Hundreds of thousands of in-memory cache requests per second, per application instance
      ● Tens of thousands of application instances
  48. Distribution
      FastBlob: Binary serialization of a complete state of data, and/or the changes in data over time.
      Serialization format designed to propagate, and keep up to date, a large amount of in-memory data across many servers.
      Optimized for: memory GC effects, memory footprint, data transfer size, deserialization CPU usage
  49. Data Quality
      Diff Reports - inspect data changes between releases
  50. Data Quality
      Diff History - inspect changes in data over time
  51. Zeno Framework
      Data Schema (Serializers)
      Operation (SerializationFramework)
      Input Data (POJOs)
      Output
  52. Zeno Framework
      Data Schema (Serializers)
      Operation (SerializationFramework)
      Input Data (POJOs)
      Output
      SerializationFramework implementations shown: JsonSerializationFramework, HashSerializationFramework, DiffSerializationFramework, FastBlobStateEngine
  53. Zeno Benefits
      Development Agility:
      ● Easy to evolve data model, no need to change serialization formats or operation logic
      ● Easy to create new functionality, no need to think about data model structure or semantics
      ● Included “Diff” tools support high data quality across releases without too much effort
      Resource Efficiency:
      ● Included “FastBlob” optimized for Netflix scale
      ● Ask about in-development functionality!
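
The FastBlob description (a complete state of data, and/or the changes in data over time) can be sketched as a snapshot-and-delta scheme. This is a simplified Python illustration only: Zeno itself is a Java library with a binary format, and the names `make_delta` and `apply_delta`, the dictionary layout, and the sample records are all assumptions made for the example.

```python
# Sentinel so that a missing key is never confused with a stored None value.
_MISSING = object()


def make_delta(old_state, new_state):
    """Describe the changes that turn old_state into new_state."""
    upserts = {key: value for key, value in new_state.items()
               if old_state.get(key, _MISSING) != value}
    removals = sorted(key for key in old_state if key not in new_state)
    return {"upserts": upserts, "removals": removals}


def apply_delta(state, delta):
    """Return a new state with the delta applied; the input is not mutated."""
    updated = dict(state)
    for key in delta["removals"]:
        updated.pop(key, None)
    updated.update(delta["upserts"])
    return updated


# Illustrative records only (made-up metadata keyed by a numeric id).
v1 = {1: {"title": "title-1"}, 2: {"title": "title-2"}}
v2 = {1: {"title": "title-1"}, 3: {"title": "title-3"}}

delta = make_delta(v1, v2)             # small payload sent to servers that already hold v1
client_state = apply_delta(v1, delta)  # a server catches up without reloading everything
```

A server that starts cold would load a full snapshot (here, `v2` itself), while servers already running only need the much smaller delta.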
  54. Suro
  55. To Be Processed in Different Ways
  56. A Simple Solution That Supports All These
  57. STAASH
      STorage As A Service over Http
  58. STAASH
  59. STAASH
      ● Storage-Agnostic
      ● Language-Agnostic
      ● REST Interface to data
      ● Pattern Automation / Aware End Points
      ● Wrapper Around Astyanax Recipes
      ● Possibilities: Auditing, Cascading CL, Replication across multiple storages, MapReduce … and many more
  60. Dynomite!!
  61. Dynomite
      ● Cross AZ & Region replication to existing Key Value stores
        ○ memcached
        ○ Redis
      ● Thin Dynamo implementation provides the replication
      ● Keep existing native KV protocol
        ○ No code refactoring
  62. Dynomite
      Diagram labels: App 1; AZ 1, AZ 2; Dynomite + memcached (×2)
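
The replication idea on the Dynomite slides can be pictured with a toy model: each node fronts a local key-value store and forwards writes to its peers in other availability zones. This is a sketch under loud assumptions: an in-memory dictionary stands in for memcached, no real Dynomite or memcached protocol is involved, replication is synchronous for simplicity, and the class name `ReplicatingNode` is hypothetical.

```python
class ReplicatingNode:
    """Toy stand-in for a replication layer in front of a local KV store."""

    def __init__(self, zone):
        self.zone = zone
        self.store = {}   # stands in for the local memcached instance
        self.peers = []   # nodes in other zones that should receive writes

    def add_peer(self, peer):
        self.peers.append(peer)

    def set(self, key, value):
        # Write locally, then forward the same write to every peer.
        self._apply(key, value)
        for peer in self.peers:
            peer._apply(key, value)

    def _apply(self, key, value):
        # Applied without re-forwarding, so writes do not loop between peers.
        self.store[key] = value

    def get(self, key):
        # Reads are served from the local store only.
        return self.store.get(key)


az1 = ReplicatingNode("AZ 1")
az2 = ReplicatingNode("AZ 2")
az1.add_peer(az2)
az2.add_peer(az1)

az1.set("user:42", "cached-profile")   # the application writes in AZ 1
print(az2.get("user:42"))              # the value is also readable in AZ 2
```

Because the application still calls plain `set` and `get`, the model mirrors the slide's point that the existing key-value protocol is kept and no application code needs refactoring.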
  63. What do all those events mean?
  64. {"deviceid": 12345, "action": "played", "titleid": 99999}
  65. {"deviceid": 12345, "action": "played", "titleid": 99999}
      Device C*
      12345: "PS3"
  66. {"deviceid": 12345, "action": "played", "titleid": 99999}
      Device C*
      12345: "PS3"
      Content C*
      99999: "HOC"
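
The event slides make the point that a raw event carries only ids, and its meaning comes from looking those ids up in the device and content data. A minimal Python sketch of that lookup follows, with plain dictionaries standing in for the Device and Content C* tables. The `enrich` function and the added field names `device` and `title` are hypothetical.

```python
import json

# Stand-ins for the Device and Content Cassandra tables shown on the slides.
DEVICES = {12345: "PS3"}
CONTENT = {99999: "HOC"}


def enrich(raw_event, devices, content):
    """Attach human-readable device and title names to a raw JSON event."""
    event = json.loads(raw_event)
    event["device"] = devices.get(event["deviceid"], "unknown")
    event["title"] = content.get(event["titleid"], "unknown")
    return event


raw = '{"deviceid": 12345, "action": "played", "titleid": 99999}'
enriched = enrich(raw, DEVICES, CONTENT)
print(enriched)
```

Running this lookup for every event against live clusters is the load the talk wants to keep off production, which motivates working from offline copies of the data instead.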
  67. Don’t hurt production/our customers
  68. Device/Content C*
      "My Devices": {"PS3:HOC": "12345:99999"}
      ?!?!?
  69. Sometimes you just want all the data
  70. C* → Priam → S3 (SSTables)
  71. S3 (SSTables) → Move to HDFS* → Convert to JSON → Compact Rows → S3 (JSON)
  72. ● A splittable input format for SSTables
        ○ Needs fewer files from the cluster.
        ○ Faster - just deserializing/serializing the files.
      ● An input format for the JSON
        ○ Allows incremental processing of backups
      ● A reducer that can compact SSTables.
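
The "Compact Rows" step and the compacting reducer can be pictured as merging several fragments of the same row and keeping the newest value for each column. The sketch below assumes each column version is a `(value, timestamp)` pair and that the latest timestamp wins. It ignores tombstones and TTLs, the name `compact_rows` is hypothetical, the sample fragments are made up, and it is not Aegisthus's actual reducer.

```python
def compact_rows(fragments):
    """Merge row fragments, keeping the newest (value, timestamp) per column."""
    compacted = {}
    for row_key, columns in fragments:
        row = compacted.setdefault(row_key, {})
        for name, (value, timestamp) in columns.items():
            # Keep the column version with the highest timestamp.
            if name not in row or timestamp > row[name][1]:
                row[name] = (value, timestamp)
    return compacted


# Made-up fragments, as if the same row appeared in several SSTables.
fragments = [
    ("12345", {"type": ("PS3", 100)}),
    ("12345", {"type": ("PS4", 200), "name": ("Living room", 150)}),
    ("67890", {"type": ("TV", 120)}),
]

result = compact_rows(fragments)
for row_key, columns in sorted(result.items()):
    print(row_key, columns)
```

In a MapReduce job the grouping by row key would be done by the shuffle, so the reducer body corresponds to the inner loop over one row's column versions.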
  73. Big Data Platform
  74. Eventual Consistency
  75. Focus on Performance
      ● Get your job running faster
      ● Understand why it was slow
      ● Transition to Hadoop 2