Apache Mesos at Twitter (Texas LinuxFest 2014)

12,364 views

Published on

http://2014.texaslinuxfest.org/content/apache-mesos-twitter

Published in: Software, Technology, Education
0 Comments
60 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
12,364
On SlideShare
0
From Embeds
0
Number of Embeds
3,432
Actions
Shares
0
Downloads
373
Comments
0
Likes
60
Embeds 0
No embeds

No notes for slide

Apache Mesos at Twitter (Texas LinuxFest 2014)

  1. 1. @andypiper Chris Aniszczyk Head of Open Source @cra Apache Mesos at Twitter #TXLF 2014
  2. 2. Hi, I’m @cra & run the @TwitterOSS office! 2
  3. 3. Twitter is Built on Open Source… 3
  4. 4. Agenda ! • Introduction • How does Mesos work? • Mesos Ecosystem • Conclusion • Q&A
  5. 5. Twitter Scale… 5 255M+ 500M+ 77% Active users Tweets per day of users are outside the US 2006 2014 100TB+ compressed data per day
  6. 6. 6 Growth challenges… sad times… remember the fail whale?
  7. 7. 7 Ups and Downs… remember World Cup 2010? http://gigaom.com/2010/06/11/is-the-world-cup-bringing-down-twitter/
  8. 8. Easy solution!? Lets add machines… but… ! • Can get expensive… even with commodity hardware… • Hard to fully utilize machines (e.g., 72 GB RAM and 24 CPUs) • Hard to deal with failures… • What else could we do…?
  9. 9. Evaluate industry… ! • Google was ahead of the game of managing warehouse scale computing: http:// research.google.com/pubs/pub35290.html ! • Google hit a lot of these problems before many other companies and came up with interesting solutions: http://youtube.com/watch?v=0ZFMlO98Jkc
  10. 10. Evaluate research at universities… ! • Universities (wooooo PhDs) were doing research in this area, we decided to partner and hire researchers: https://amplab.cs.berkeley.edu/tag/mesos/ ! • “Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon: http://www.wired.com/2013/03/ google-borg-twitter-mesos
  11. 11. Enter Apache Mesos ! • We took university research and spun into an open source project at the Apache Foundation: https:// blog.twitter.com/2012/incubating-apache-mesos • https://twitter.com/ApacheMesos/statuses/ 360039441500340224
  12. 12. What is exactly is Mesos? • Mesos is an open source project with a healthy independent community: http://mesos.apache.org • Mesos is a distributed system to build and run distributed systems • Mesos provides fine-grained resource sharing and isolation • Mesos enables high-availability and fault-tolerance for your cluster
  13. 13. This is your typical data center 1 2 3 4 5 6 7 8 9
  14. 14. This is your typical data center with static partitioned apps 1 2 3 4 5 6 7 8 9
  15. 15. Not sharing wastes resources 0% 11% 22% 33% 0% 11% 22% 33% 0% 11% 22% 33%
  16. 16. Resource sharing increases throughput and utilization 0% 11% 22% 33% 0% 11% 22% 33% 0% 11% 22% 33% 0% 33.333% 66.667% 100%
  17. 17. Running at the container level improves performance… Timetoprovision(seconds) 1 100 10000 Bare metal VM Container Inspired by Tomas Barton’s Mesos talk at InstallFest in Prague
  18. 18. Agenda ! • Introduction • How does Mesos work? • Mesos Ecosystem • Conclusion • Q&A
  19. 19. Mesos Slave Hadoop task-tracker Mesos Executor Task #1 Task #2 ./ruby XYZ Mesos Slave Docker Executor Docker Executor java -jar XYZ.jar ./xyz Mesos Master Mesos Master Mesos Master Hadoop scheduler Marathon scheduler Zookeeper quorum *Thank you to Niklas Nielsen and Adam Borlen for the following diagrams explaining Mesos https://www.youtube.com/watch?v=EI0ROkf0vks Mesos consists of master/slave nodes
  20. 20. Mesos Slave Hadoop task-tracker Mesos Executor Task #1 Task #2 ./ruby XYZ Mesos Slave Docker Executor Docker Executor java -jar XYZ.jar ./xyz Mesos Master Mesos Master Mesos Master Hadoop scheduler Marathon scheduler Zookeeper quorum applications are known as frameworks in Mesos, they interact with master
  21. 21. Mesos Slave Hadoop task-tracker Mesos Executor Task #1 Task #2 ./ruby XYZ Mesos Slave Docker Executor Docker Executor java -jar XYZ.jar ./xyz Mesos Master Mesos Master Mesos Master Hadoop scheduler Marathon scheduler Zookeeper quorum Multiple masters can be in place for HA; coordinate leader election with ZK
  22. 22. Mesos Slave Hadoop task-tracker Mesos Executor Task #1 Task #2 ./ruby XYZ Mesos Slave Docker Executor Docker Executor java -jar XYZ.jar ./xyz Mesos Master Mesos Master Mesos Master Hadoop scheduler Marathon scheduler Zookeeper quorum Master schedules tasks to run on slaves’ available resources; slaves use executors to coordinate execution of tasks Tasks are the unit of execution
  23. 23. Mesos provides fine-grained resource isolation (via cgroups) Compute Node Mesos Slave Process Hadoop task-tracker Mesos Executor Task #1 Task #2 ruby XYZ Container (Cgroups) Executor Slaves isolate executors and tasks via containers (dotted line)
  24. 24. Compute Node Mesos Slave Process Hadoop task-tracker Task #1 Task #2 Container (Cgroups) Task #3 Mesos provides fine-grained resource isolation (via cgroups) Containers can GROW AND SRHINK as tasks run and complete
  25. 25. Mesos provides componentized resource isolation Mesos Slave Process Mesos Containerizer CGroups CPU isolator CGroups Memory isolator Launcher Container foo Task baz Containerizer API Executor bar When a slave starts, you can specify a “containerizer” to launch the container and set of isolators to enforce resource constraints (CPU/memory) Mesos can track and allocate more resource types, allowing you to manage resources like ip-addresses, ports, disk space and even GPUs!
  26. 26. Mesos provides pluggable resource isolation (e.g., Docker) External Containerizer External Containerizer API Mesos Slave Process External Containerizer Program Container foo MySQL Containerizer API Ubuntu 13.10 Container bar Ruby Centos 6.4 github.com/mesosphere/deimos
  27. 27. Everything fails all the time Werner Vogels (Amazon CTO)
  28. 28. Mesos has no single point of failure (master keeps monitoring tasks and waits for a node to reconnect, master will update the framework with any tasks that were completed while it was gone) Tasks keep running! Framework Masters
  29. 29. Master node can fail-over (ZK quorum will elect a new leader) Tasks keep running! Framework Masters
  30. 30. Slave processes can fail over (loads check pointed state to learn what pods to reconnect for reach task and re-registeres with the master) Tasks keep running! Compute Node Mesos Slave Process Mesos Executor Mesos Executor
  31. 31. The Mesos ecosystem is growing, frameworks everywhere) http://mesos.apache.org/documentation/latest/mesos-frameworks/
  32. 32. Chronos: Distributed cron with dependencies https://github.com/airbnb/chronos
  33. 33. Marathon: init.d for your data center https://github.com/mesosphere/marathon
  34. 34. Aurora: Advanced scheduler used by Twitter in production http://aurora.incubator.apache.org
  35. 35. You can also build your own framework…
  36. 36. Agenda ! • Introduction • How does Mesos work? • Mesos Ecosystem • Conclusion • Q&A
  37. 37. #PoweredByMesos (public) http://mesos.apache.org/documentation/latest/powered-by-mesos/
  38. 38. Mesos allow services to scale Engineers think about resources, not machines
  39. 39. Storage MySQL Tweet store Flock User Store Cache Memcached Redis Logic Tweet Service User Service Timeline Service SocialGraph Service DM Service Presentation API Web Search Feature X Feature Y Presentation TFE (netty) Reverse Proxy HTTP Thrift Thrift Aurora Mesos Monorail
  40. 40. Mesos enables multi-tenant clusters Small teams can move fast AWS-based infrastructure beyond just Hadoop
  41. 41. Marathon Mesos Chronos Batch/Streaming Hadoop Spark Kafka Query/Analysis Cascading Presto Hive Shark Pig Services Rails Redis Cassandra KairosDB RDS Hadoop A Hadoop B
  42. 42. Agenda ! • Introduction • How does Mesos work? • Mesos Ecosystem • Conclusion • Q&A
  43. 43. Conclusion • Mesos is a distributed system to build and run distributed systems (think datacenter OS) • Mesos enables resource sharing, high-availability and fault-tolerance for your data centers • Mesos is an open source project with a healthy independent community: http://mesos.apache.org • So please check it out, use it or contribute back if you can to make it better!
  44. 44. https://elastic.mesosphere.io
  45. 45. http://mesos.apache.org Open Source Support from the Mesos Community
  46. 46. http://mesos.apache.org/community/user-groups/ Learn more via Mesos User Groups
  47. 47. http://mesosphere.io/learn Commercial Support from Mesosphere
  48. 48. http://mesoscon.org First #MesosCon to coincide with LinuxCon 2014!
  49. 49. Thank you for listening! Chris Aniszczyk (@cra) zx@twitter.com http://opensource.twitter.com ! http://mesos.apache.org email: {user,dev}@mesos.apache.org 51Also thanks to Niklas Nielsen and Adam Borlen for their slides explaining Mesos from ApacheCon 2014 https://www.youtube.com/watch?v=EI0ROkf0vks
  50. 50. Resources ! http://mesos.apache.org http://mesosphere.io/learn/ http://wired.com/wiredenterprise/2013/03/google-borg-twitter-mesos http://mesosphere.io/2013/09/26/docker-on-mesos/ http://typesafe.com/blog/play-framework-grid-deployment-with-mesos http://research.google.com/pubs/pub35290.html http://nerds.airbnb.com/hadoop-on-mesos/ https://blog.twitter.com/2013/mesos-graduates-from-apache-incubation http://www.ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/ https://www.youtube.com/watch?v=EI0ROkf0vks !

×