Netflix OSS Season 2 Episode 1 - Meetup Lightning Talks

Published in: Technology

  1. 1. Season 2 Episode 1 March 12, 2014
  2. 2. Evening Outline Lightning Talks: - S3mper - PigPen - STAASH - Dynomite - Aegisthus - Suro - Zeno - Lipstick on GCE - AnsWerS - IBM - Coursera
  3. 3. 41 projects… Now what? ● Cohesive platform ● Workshops / Training / Documentation ● Participate and contribute : netflixoss@netflix.com
  4. 4. Lightning talks
  5. 5. Lipstick, Hadoop, and Big Data on the Google Cloud Matt Bookman Solutions Architect
  6. 6. Google Confidential and Proprietary Google Compute Engine - VMs in Google Datacenters ● Public Preview - May 2013 ● General Availability - December 2013
  7. 7. Demo (Summer 2013): Pig on Compute Engine Sweet demo!
  8. 8. Netflix OSS Meetup - July 17, 2013
  9. 9. Lipstick - Providing insights
  10. 10. Google Confidential and Proprietary
  11. 11. Hadoop on GCE + Cloud Storage (GCS) Connector. Accenture: Cloud vs. Bare-Metal ● Cloud-based Hadoop deployments offer better price-performance ratios than bare-metal ● Cloud’s virtualization expands performance-tuning opportunities ● Using remote storage outperforms local disk HDFS
  12. 12. Data in GCS, Lipstick DB in Cloud SQL [Architecture diagram, Google Cloud Platform: Input Data, Output Data, Lipstick Server, Lipstick Database, Hadoop Master (MapReduce JobTracker), three Hadoop Workers (MapReduce TaskTracker)]
  13. 13. Resources ● Netflix Lipstick on Google Compute Engine https://cloud.google.com/developers/articles/netflix-lipstick-on-google-compute-engine ● GCS Connector for Hadoop https://developers.google.com/hadoop/google-cloud-storage-connector ● Cloud-based Hadoop Deployments: Benefits and Considerations http://www.accenture.com/SiteCollectionDocuments/PDF/Accenture-Cloud-Based-Hadoop-Deployments-Benefits-and-Considerations.pdf ● Apache Hadoop, Hive, and Pig on Google Compute Engine https://cloud.google.com/developers/articles/apache-hadoop-hive-and-pig-on-google-compute-engine
  14. 14. Thank you
  15. 15. @Answers4AWS Cloud Prize and Beyond Peter Sankauskas @pas256
  16. 16. March 2013
  17. 17. First idea • AsgardFormation • CloudFormation for Asgard
  18. 18. @Answers4AWS
  19. 19. @Answers4AWS
  20. 20. Requirements • AsgardFormation • Asgard running • AWS Credentials • IAM user • Policy • Security Group • EC2 instance • Asgard downloaded and configured • Tomcat downloaded and configured • Java downloaded and installed • Linux configured
  21. 21. @Answers4AWS
  22. 22. Asgard playbook • Base • Install usual Linux packages • Basic system hardening and security packages • Oracle Java 7 • Tomcat 7 • Asgard • Latest release from GitHub
  23. 23. Other playbooks • Eureka • Edda • Simian Army • Ice • Aminator • Genie
  24. 24. AMIs • Initially built using my own scripts based on Eric Hammond’s (@esh) work • Then using Aminator • Created Ubuntu Foundation AMIs • Added the Ansible Provisioner for Aminator • Put a couple of them on the AWS Marketplace for free
  25. 25. CloudFormation • One-click deploy • Well, about 10 clicks going through the AWS Web Console wizard • Designed to get you up and running quickly • Test it out, see if you like it • NOT production quality • No real security • No HA
  26. 26. @Answers4AWS
  27. 27. What’s next?
  28. 28. Do you do this? (this is not my slide)
  29. 29. @Answers4AWS
  30. 30. Beta users • From a successful CI build • To a Fully Baked AMI • Use in Testing and Production • Without you doing anything • ZERO clicks • Signups are open
  31. 31. Thank you http://bakery.answersforaws.com/ bakery@answersforaws.com See me at the demo station Peter Sankauskas @pas256
  32. 32. IBM Scalable Services Fabric for Netflix S2E1 Meetup Andrew Spyker @aspyker
  33. 33. History and Future 2012 SPECjEnterprise 2013 AcmeAir Run On IBM Cloud at “Web Scale” 2014 Scalable Services Fabric internally for IBM Services Scalable Services Fabric SaaS and On-Prem? Sample application cloud prize work AcmeAir Cloud/Mobile Sample/Benchmark born Codename: BlueMix Portability cloud prize work
  34. 34. Scalable Service Fabric Work: Netflix OSS IBM port/enablement
      Netflix “Zen” of Cloud
        • Worked with initial services to enable cloud native arch
        • Worked with initial services to enable NetflixOSS usages
        • Created scorecard and tests for “cloud native” readiness
      Highly Available IaaS and Cloud Services
        • Deployment across multiple IBM SoftLayer IaaS datacenters and global and local load balancers
        • Complete automation via IBM SoftLayer IaaS APIs
        • Ensured facilities for automatic failure recovery
      Micro-service Runtimes (Karyon, Eureka Client, Ribbon, Hystrix, Archaius)
        • Ported to work with IBM SoftLayer IaaS and on the WebSphere Liberty Profile application server
        • Created “eureka-sidecar” for non-Java runtimes and ElasticSearch discovery
      Netflix OSS Servers (Asgard, Eureka Server, Turbine)
        • Ported to work with IBM SoftLayer IaaS + RightScale
        • Operationalized HA and secure deployments for multiple service tenants
      Adopted Chaos Testing
        • Ported Chaos Monkey to IBM SoftLayer IaaS
        • Performed manual Chaos Gorilla validation on services
      Worked through devops tool chain
        • Worked with initial services to enable continuous delivery with devops (and image baking via an Aminator-like tool)
  35. 35. Come meet the team! (Name, Tweets from …, Talks about …)
        • Adolfo (@adolforod): API Management and Cloud Integration, user of NetflixOSS platform. Appliances in the cloud.
        • Brian (@bkmartin): IBM BlueMix (PaaS), enabling composable apps in PaaS
        • Darrell: IBM Research focusing on NetflixOSS devops and on-premise deployments
        • David (@dcurrie): WebSphere Liberty Profile application server, NetflixOSS development and PaaS integration
        • Jonathan (@ma4jpb): NetflixOSS portability across many aspects. Cloud messaging (in relation to Suro)
        • Matt (@matrober): API Management, user of NetflixOSS platform. Converted service to be cloud native
        • Rachel (@rreinitz): IBM Services, interested in helping you get to this cloud native in SaaS and on-premise
        • Ricky (@rickymoorhouse): API Management, user of NetflixOSS platform. Creator of Imaginator
        • Will (@auwilli98): API Management operations, user of NetflixOSS platform
  36. 36. Priam + Aegisthus @Coursera NetflixOSS Meetup
  37. 37. Introduction @DanielChiaJH Software Engineer, Infrastructure Team Coursera
  38. 38. Overview • Philosophy • Priam • Aegisthus • Conclusion
  39. 39. Philosophy • Architecture Patterns • Use what we can • Incorporate the spirit of others
  40. 40. Priam – Wins • Token Management • S3 Backup + Restore • Config
  41. 41. Priam – Next Steps? • SimpleDB -> DynamoDB • Backups blow out OS disk buffer cache • Compatibility with newer C* versions
  42. 42. Aegisthus - Wins • Novel workflow • Data reduced to one authoritative copy • Possibility for incremental jobs
  43. 43. Aegisthus – What Next? • C* 1.2 / 2.0 • CQL3 • Priam <–> Aegisthus • Better compressed SSTable support
  44. 44. Conclusion • Come chat with me! • Especially if you have similar goals to me
  45. 45. Zeno ● In-memory data distribution platform ● Contains tools for: ○ data quality management ○ data serialization ● We use it to distribute and keep up to date gigabytes of video metadata on tens of thousands of servers across the globe
  46. 46. Zeno Why in-memory data? - Netflix serves billions of requests per day - Each request requires metadata about many movies to answer
  47. 47. Zeno Netflix Use Case: ● Gigabytes of in-memory data ● Hundreds of thousands of in-memory cache requests per second, per application instance ● Tens of thousands of application instances
  48. 48. Distribution FastBlob: Binary serialization of a complete state of data, and/or the changes in data over time. Serialization format designed to propagate, and keep up to date, a large amount of in-memory data across many servers. Optimized for: memory GC effects, memory footprint, data transfer size, deserialization CPU usage
  49. 49. Data Quality Diff Reports - inspect data changes between releases
  50. 50. Data Quality Diff History - inspect changes in data over time
  51. 51. Zeno Framework Data Schema (Serializers) Operation (SerializationFramework) Input Data (POJOs) Output
  52. 52. Zeno Framework Data Schema (Serializers) Operation (SerializationFramework) Input Data (POJOs) Output JsonSerializationFramework HashSerializationFramework DiffSerializationFramework FastBlobStateEngine
  53. 53. Zeno Benefits Development Agility: ● Easy to evolve data model, no need to change serialization formats or operation logic ● Easy to create new functionality, no need to think about data model structure or semantics ● Included “Diff” tools support high data quality across releases without too much effort Resource Efficiency: ● Included “FastBlob” optimized for Netflix scale ● Ask about in-development functionality!
  54. 54. Suro
  55. 55. To Be Processed in Different Ways
  56. 56. A Simple Solution That Supports All These
  57. 57. STAASH STorage As A Service over Http
  58. 58. STAASH
  59. 59. STAASH ● Storage-Agnostic ● Language-Agnostic ● REST Interface to data ● Pattern Automation / Aware End Points ● Wrapper Around Astyanax Recipes ● Possibilities: Auditing, Cascading CL, Replication across multiple storages, MapReduce, and many more
  60. 60. Dynomite!!
  61. 61. Dynomite ● Cross AZ & Region replication to existing Key Value stores ○ memcached ○ Redis ● Thin Dynamo implementation provides the replication ● Keep existing native KV protocol ○ No code refactoring
  62. 62. Dynomite [Diagram labels: App 1; Dynomite and memcached (shown twice); AZ 1; AZ 2]
  63. 63. What do all those events mean?
  64. 64. {"deviceid": 12345, "action": "played", "titleid": 99999}
  65. 65. {"deviceid": 12345, "action": "played", "titleid": 99999} Device C* 12345: "PS3"
  66. 66. {"deviceid": 12345, "action": "played", "titleid": 99999} Device C* 12345: "PS3" Content C* 99999: "HOC"
  67. 67. Don’t hurt production/our customers
  68. 68. Device/Content C* "My Devices": {"PS3:HOC":"12345:99999"} ?!?!?
  69. 69. Sometimes you just want all the data
  70. 70. C* Priam S3 SSTables
  71. 71. S3 SSTables Move to HDFS* Convert to JSON Compact Rows S3 JSON
  72. 72. ● A splittable input format for SSTables ○ Needs fewer files from the cluster ○ Faster: just deserializing/serializing the files ● An input format for the JSON ○ Allows incremental processing of backups ● A reducer that can compact SSTables
  73. 73. Big Data Platform
  74. 74. Eventual Consistency
  75. 75. Focus on Performance ● Get your job running faster ● Understand why it was slow ● Transition to Hadoop 2
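
Slides 64 to 66 show a raw event being resolved against the Device and Content Cassandra (C*) data. A minimal Python sketch of that lookup might look like the following. The in-memory dictionaries and the `enrich` function are hypothetical stand-ins for illustration, not Netflix code; only the sample event and the two mappings come from the slides.

```python
import json

# Hypothetical stand-ins for the Device and Content C* lookups on the slides.
DEVICE_NAMES = {12345: "PS3"}
CONTENT_TITLES = {99999: "HOC"}


def enrich(event, devices, titles):
    """Return a copy of the event with readable device and title names added."""
    enriched = dict(event)
    enriched["device"] = devices.get(event["deviceid"], "unknown")
    enriched["title"] = titles.get(event["titleid"], "unknown")
    return enriched


# The sample event from slide 64.
raw = '{"deviceid": 12345, "action": "played", "titleid": 99999}'
event = enrich(json.loads(raw), DEVICE_NAMES, CONTENT_TITLES)
print(event)
```

Doing this lookup against the production C* clusters for every event is what slide 67 warns against, which is why the later slides work from backups instead.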

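Slides 42, 71 and 72 describe compacting rows so that the backed-up data is reduced to one authoritative copy. The sketch below illustrates the general idea in Python, assuming last-write-wins by timestamp, which is how Cassandra reconciles versions of a column. The record layout, the `compact_rows` function, and the sample data are assumptions made for illustration and are not Aegisthus's actual implementation; deletion markers (tombstones) are ignored.

```python
from collections import defaultdict


def compact_rows(records):
    """Merge all versions of each row, keeping the newest value per column.

    Each record is a (row_key, column, value, timestamp) tuple (assumed layout).
    """
    latest = defaultdict(dict)  # row_key -> {column: (timestamp, value)}
    for row_key, column, value, timestamp in records:
        current = latest[row_key].get(column)
        if current is None or timestamp > current[0]:
            latest[row_key][column] = (timestamp, value)
    # Drop the timestamps once the winning value for each column is known.
    return {
        row_key: {column: value for column, (_, value) in columns.items()}
        for row_key, columns in latest.items()
    }


# Hypothetical column versions, as they might appear across several SSTables.
records = [
    ("12345", "name", "PS3", 100),
    ("12345", "name", "PlayStation 3", 200),
    ("12345", "region", "US", 150),
    ("99999", "title", "HOC", 120),
]
compacted = compact_rows(records)
print(compacted)
```

Because each row key is merged independently, the same logic fits a reducer that receives all versions of one row at a time.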