Augury and Omens Aside, Part 1:
 The Business Case for Apache Mesos
Published in: Technology

Transcript

  • 1. Augury and Omens Aside, Part 1:
 The Business Case for Apache Mesos 
 Apache Mesos NYC Meetup @Shutterstock
 2014-06-11 
 Paco Nathan 
 http://liber118.com/pxn/
 @pacoid meetup.com/Apache-Mesos-NYC-Meetup/events/187583352/
  • 2. Disclaimer The following content results from research, use case analysis, industry observations, plus personal perspectives and opinions – presented by a speaker who is an independent author/consultant. The following content does not in any way represent the opinions or official messaging for any clients of Liber 118, Apache Foundation, United Nations, Area 51, S.P.E.C.T.R.E., etc. Except, perhaps, for the smarter ones who nurture an ample sense of humor, which unfortunately may disqualify much of Silicon Valley…
  • 3. Recent News
 Apache releases Mesos 0.19
 mesos.apache.org/blog/mesos-0-19-0-released/
 Program announced for inaugural #MesosCon
 events.linuxfoundation.org/events/mesoscon
 Mesosphere takes $10.5M in funding
 techcrunch.com/2014/06/09/mesosphere-grabs-10m-in-series-a-funding-to-transform-datacenters/
 Google releases part of Borg/Omega as OSS
 wired.com/2014/06/google-kubernetes/
  • 4. Recent News
 Apache releases Mesos 0.19
 mesos.apache.org/blog/mesos-0-19-0-released/
 Program announced for inaugural #MesosCon
 events.linuxfoundation.org/events/mesoscon
 Mesosphere takes $10.5M in funding
 techcrunch.com/2014/06/09/mesosphere-grabs-10m-in-series-a-funding-to-transform-datacenters/
 Google releases part of Borg/Omega as OSS
 wired.com/2014/06/google-kubernetes/
 seriously, can’t top that
  • 5. A Big Idea
  • 6. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 Paradigm shifts can be observed at three levels of the tech stack for cluster computing. Each implies orders of magnitude in cost savings over prior best results, based on substantive changes in software engineering practices…
  • 7. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 In other words, now that we have Mesos, Docker, and Spark, why do we need Hadoop legacy software?
  • 8. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems?
 • latency
 • aggregation
 • parallelism
 • data rates
 Countdown: Augury and Omens Aside, Part 3…
  • 9. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems => solutions
 • applicative systems
 • leveraging semigroup structure
 • lazy evaluation aka combinator graph reduction
 • probabilistic data structures
 Countdown: Augury and Omens Aside, Part 3…
  • 10. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems?
 • process, data, and metadata in silos
 • BI + data modeling legacy culture
 • CAP theorem vs. ACID
 • accidental complexity
 • propagating schema and lineage
 • learning curve inertia
 • managing risk vs. innovation
 Countdown: Augury and Omens Aside, Part 2…
  • 11. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems => solutions
 • interdisciplinary teams
 • generalize across batch + real-time + etc.
 • separation of concerns
 • pattern language
 • compiler => query planner
 Countdown: Augury and Omens Aside, Part 2…
  • 12. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems?
 • commodity hardware failure rates
 • scheduling batch is simple; scheduling services is expensive
 • no getting around it: building a distributed system
 • static partitioning => cost of cluster computing
 • monolithic controllers vs. shared state
 • low utilization rates => upside down in power availability
 Countdown: Augury and Omens Aside, Part 1…
  • 13. From Business Use Cases To Bare Metal
 Datacenter Computing, Data Workflow Abstractions, Functional Programming
 hard problems => solutions
 • isolation
 • containerization
 • mixed workloads
 • data locality
 • service+framework architecture
 • predictive scheduling
 Countdown: Augury and Omens Aside, Part 1…
  • 14. Why Does
 This Matter?
  • 15. IoT Data Rates: technologyreview.com/...
  • 16. IoT Data Rates: technologyreview.com/... Tools and techniques that served well for ad-tech will not necessarily apply for “Industrial Internet” data rates … we must retool; power requirements alone would boil the oceans
  • 17. Some History,
 Part 3
  • 18. Theory, Eight Decades Ago: Haskell Curry, known for seminal work on combinatory logic (1927) Alonzo Church, known for lambda calculus (1936) and much more! Both sought formal answers to the question, “What can be computed?” Narrative Arc: Lambda Somethingorother
 Haskell Curry
 haskell.org
 Alonzo Church
 wikipedia.org
  • 19. Praxis, Four Decades Ago: Leveraging lambda calculus, combinators, etc., to increase parallelism of apps as applicative systems
 Narrative Arc: Lambda Somethingorother
 John Backus
 acm.org
 “Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs”
 ACM Turing Award (1977)
 stanford.edu/class/cs242/readings/backus.pdf
 David Turner
 wikipedia.org
 “A new implementation technique for applicative languages”
 Turner, D.A. (1979)
 Softw: Pract. Exper., 9: 31–49. doi: 10.1002/spe.4380090105
  • 20. Today: Add ALL the Things:
 Abstract Algebra Meets Analytics
 infoq.com/presentations/abstract-algebra-analytics
 Avi Bryant, Strange Loop (2013)
 • grouping doesn’t matter (associativity)
 • ordering doesn’t matter (commutativity)
 • zeros get ignored
 In other words, while partitioning data at scale is quite difficult, you can let the math allow your code to be flexible at scale
 Avi Bryant
 @avibryant
 Narrative Arc: Lambda Somethingorother
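The three properties on slide 20 can be checked on a small aggregation. The following is a minimal sketch, not code from the talk: it assumes word-count style aggregation with Python's `collections.Counter`, where `+` is the combine step and an empty `Counter()` plays the role of the zero. The event names and the helper names `merge` and `aggregate` are hypothetical.

```python
import random
from collections import Counter
from functools import reduce


def merge(a, b):
    """Combine two partial counts; associative and commutative for positive counts."""
    return a + b


def aggregate(partials):
    """Fold any sequence of partial counts, starting from the zero element."""
    return reduce(merge, partials, Counter())


# Hypothetical event stream, one partial aggregate per event.
events = ["click", "view", "view", "buy", "click", "view"]
singles = [Counter([e]) for e in events]

# Baseline: one left-to-right pass over everything.
sequential = aggregate(singles)

# Grouping doesn't matter: aggregate two partitions, then merge the results.
grouped = merge(aggregate(singles[:3]), aggregate(singles[3:]))

# Ordering doesn't matter: shuffle the partials first (fixed seed for repeatability).
shuffled = list(singles)
random.Random(0).shuffle(shuffled)
reordered = aggregate(shuffled)

# Zeros get ignored: empty partitions change nothing.
padded = aggregate([Counter()] + singles + [Counter()])

assert sequential == grouped == reordered == padded
```

Because every variant agrees, a cluster is free to partition, reorder, or pad the work however is convenient for scheduling.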
  • 21. Algebra for Analytics
 speakerdeck.com/johnynek/algebra-for-analytics
 Oscar Boykin, Strata SC (2014)
 Oscar Boykin
 @posco
 [Figure: the sum A + B + C + D + E + F + G + H + I + J + K + L + M + N + O + P, shown as a pairwise tree (A + B) + (C + D) + (E + F) + (G + H) + (I + J) + (K + L) + (M + N) + (O + P) and as a left-to-right chain (A + B) + C + D + E + F + G + H + I + J + K + L + M + N + O + P]
 • “Associativity allows parallelism in reducing” by letting you put the () where you want
 • “Lack of associativity increases latency exponentially”
 Narrative Arc: Lambda Somethingorother ???
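The latency point on slide 21 can be made concrete by counting dependent steps. This is a sketch under stated assumptions, not code from the talk: `sequential_steps` and `tree_reduce` are hypothetical helpers, and each "round" is assumed to run all of its pairwise combines in parallel. For the 16 items A through P, a left-to-right chain needs 15 dependent steps, while a pairwise tree needs 4 rounds.

```python
def sequential_steps(n):
    """Number of dependent combine steps in a left-to-right fold over n items."""
    return max(n - 1, 0)


def tree_reduce(items, combine):
    """Pairwise reduction that preserves item order.

    Returns (result, rounds), where rounds counts the levels of the tree;
    all combines within one level are independent of each other.
    """
    level = list(items)
    if not level:
        raise ValueError("tree_reduce needs at least one item")
    rounds = 0
    while len(level) > 1:
        # Combine adjacent pairs; an odd item out is carried to the next level.
        nxt = [combine(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        rounds += 1
    return level[0], rounds


letters = list("ABCDEFGHIJKLMNOP")
# String concatenation is associative (though not commutative), so regrouping is safe.
result, rounds = tree_reduce(letters, lambda a, b: a + b)

assert result == "".join(letters)
assert rounds == 4
assert sequential_steps(len(letters)) == 15
```

The gap between `n - 1` steps and roughly `log2(n)` rounds is one way to read the "exponentially" in the quoted bullet: without associativity the parentheses cannot be moved, so the chain is the only option.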
  • 22. That, plus oh so much more math fun in store! Narrative Arc: Lambda Somethingorother
 [Figures: Bayesian updating, labeled The Prior (past decisions), The Evidence (the data), The Posterior (current decision); a matrix decomposition M = U Σ V^H; a block-matrix linear system in A, b, c, x, z; a network diagram with input, hidden, and output layers]
  • 23. Some History,
 Part 2
  • 24. wikipedia.org/wiki/Firefly
 businessweek.com/1996/41/b349690.htm
 pubs.media.mit.edu/pubs/papers/32paper.ps
 • Firefly, an early commercial recommender system
 • intent: the volume of data about things is more than any person can digest
 • leveraged similarity within a network
 • an evolution of intelligent agents into web apps
 • collect machine data about consumer interests
 • people communicating with each other and with machines
 Narrative Arc: Data Workflow Abstractions
 Pattie Maes
 MIT Media Lab
 machine data about cognitive social systems
  • 25. Q3 1997 inflection point: four independent teams working toward horizontal scale-out of workflows based on commodity hardware This effort prepared the way for huge Internet successes during
 the 1997 holiday season… AMZN, EBAY, Inktomi (YHOO Search), then GOOG MapReduce on clusters of commodity hardware and the 
 Apache Hadoop open source stack emerged from this context Narrative Arc: Data Workflow Abstractions
  • 26. Amazon
 “Early Amazon: Splitting the website” – Greg Linden
 glinden.blogspot.com/2006/02/early-amazon-splitting-website.html
 eBay
 “The eBay Architecture” – Randy Shoup, Dan Pritchett
 addsimplicity.com/adding_simplicity_an_engi/2006/11/you_scaled_your.html
 addsimplicity.com.nyud.net:8080/downloads/eBaySDForum2006-11-29.pdf
 Inktomi (YHOO Search)
 “Inktomi’s Wild Ride” – Eric Brewer (0:05:31 ff)
 youtu.be/E91oEn1bnXM
 Google
 “Underneath the Covers at Google” – Jeff Dean (0:06:54 ff)
 youtu.be/qsan-GQaeyk
 perspectives.mvdirona.com/2008/06/11/JeffDeanOnGoogleInfrastructure.aspx
 Narrative Arc: Data Workflow Abstractions
  • 27. [Diagram labels: RDBMS, SQL Query, result sets, recommenders + classifiers, Web Apps, customer transactions, Algorithmic Modeling, Logs, event history, aggregation, dashboards, Product, Engineering, UX, Stakeholder, Customers, DW, ETL, Middleware, servlets, models] Narrative Arc: Data Workflow Abstractions
  • 28. [Same diagram as slide 27, with the added label “data products”] Narrative Arc: Data Workflow Abstractions
  • 29. See extended discussion + scorecard:
 www.slideshare.net/pacoid/data-workflows-for-machine-learning-33341183
  • 30. General Batch Processing: MapReduce
 Specialized Systems (iterative, interactive, streaming, graph, etc.): Pregel, Giraph, Dremel, Drill, Tez, Impala, GraphLab, Storm, S4
 Narrative Arc: Data Workflow Abstractions
  • 31. 2002 MapReduce @ Google
 2004 MapReduce paper
 2006 Hadoop @ Yahoo!
 2008 Hadoop Summit
 2010 Spark paper
 2014 Apache Spark top-level
 The State of Spark, and Where We're Going Next
 Matei Zaharia
 Spark Summit (2013)
 youtu.be/nU6vO2EJAb4
 [Diagram labels: RDD, transformations, action, value]
 How about a generalized engine for distributed, applicative systems – apps sharing code across multiple use cases: batch, iterative, streaming, etc.
 Narrative Arc: Data Workflow Abstractions
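The RDD diagram on slide 31 distinguishes transformations, which produce another RDD, from actions, which produce a value. A minimal single-machine sketch of that split follows. It is an illustration of lazy evaluation only: `LazyDataset` is a hypothetical toy class, not the Spark API, and it does nothing distributed.

```python
from functools import reduce


class LazyDataset:
    """Toy stand-in for an RDD: transformations are recorded, actions run them."""

    def __init__(self, source, ops=()):
        self._source = source      # any re-iterable, e.g. a list or range
        self._ops = tuple(ops)     # recorded (kind, function) pairs

    # Transformations: return a new dataset, compute nothing yet.
    def map(self, fn):
        return LazyDataset(self._source, self._ops + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self._source, self._ops + (("filter", pred),))

    def _iterate(self):
        items = iter(self._source)
        for kind, fn in self._ops:
            items = map(fn, items) if kind == "map" else filter(fn, items)
        return items

    # Actions: force evaluation and return a plain value.
    def collect(self):
        return list(self._iterate())

    def reduce(self, combine):
        return reduce(combine, self._iterate())


calls = []


def traced_square(x):
    calls.append(x)   # record when the function actually runs
    return x * x


odd_squares = LazyDataset(range(1, 6)).map(traced_square).filter(lambda v: v % 2 == 1)
assert calls == []                                # transformations alone did no work
total = odd_squares.reduce(lambda a, b: a + b)    # the action triggers evaluation
assert total == 1 + 9 + 25
assert calls == [1, 2, 3, 4, 5]
```

Deferring work until an action is what lets an engine see the whole pipeline before running it, which is where the "compiler => query planner" idea from the earlier slides applies.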
  • 32. Some History,
 Part 1
  • 33. Lessons
 from Google
  • 34. Datacenter Computing
 Google has been doing datacenter computing for years, to address the complexities of large-scale data workflows:
 • leveraging the modern kernel: isolation in lieu of VMs
 • “most (>80%) jobs are batch jobs, but the majority of resources (55–80%) are allocated to service jobs”
 • mixed workloads, multi-tenancy
 • relatively high utilization rates
 • JVM FTW? not so much…
 • reality: scheduling batch is simple; scheduling services is hard/expensive
  • 35. The Modern Kernel: Top Linux Contributors… arstechnica.com/information-technology/2013/09/...
  • 36. “Return of the Borg”
 Return of the Borg: How Twitter Rebuilt Google’s Secret Weapon
 Cade Metz
 wired.com/wiredenterprise/2013/03/google-borg-twitter-mesos
 The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
 Luiz André Barroso, Urs Hölzle
 research.google.com/pubs/pub35290.html
 2011 GAFS Omega
 John Wilkes, et al.
 youtu.be/0ZFMlO98Jkc
  • 37. Google describes the technology…
 Omega: flexible, scalable schedulers for large compute clusters
 Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, John Wilkes
 eurosys2013.tudos.org/wp-content/uploads/2013/paper/Schwarzkopf.pdf
  • 38. Google describes the business case…
 Taming Latency Variability
 Jeff Dean
 plus.google.com/u/0/+ResearchatGoogle/posts/C1dPhQhcDRv
  • 39. Commercial OS Cluster Schedulers
 • IBM Platform Symphony
 • Microsoft Autopilot
 Arguably, some grid controllers are quite notable in-category:
 • Univa Grid Engine (formerly SGE)
 • Condor
 • etc.
  • 40. Emerging
 at Berkeley
  • 41. Beyond Hadoop
 Hadoop – an open source solution for fault-tolerant parallel processing of batch jobs at scale, based on commodity hardware… however, other priorities have emerged for the analytics lifecycle:
 • apps require integration beyond Hadoop
 • multiple topologies, mixed workloads, multi-tenancy
 • significant disruptions in h/w cost/performance curves
 • higher utilization
 • lower latency
 • highly-available, long running services
 • more than “Just JVM” – e.g., Py adoption, etc.
  • 42. Just No Getting Around It
 “There's Just No Getting Around It: You're Building a Distributed System”
 Mark Cavage
 ACM Queue (2013-05-03)
 queue.acm.org/detail.cfm?id=2482856
 key takeaways on architecture:
 • decompose the business application into discrete services on the boundaries of fault domains, scaling, and data workload
 • make as many things as possible stateless
 • when dealing with state, deeply understand CAP, latency, throughput, and durability requirements
 “Without practical experience working on successful—and failed—systems, most engineers take a "hopefully it works" approach and attempt to string together off-the-shelf software, whether open source or commercial, and often are unsuccessful at building a resilient, performant system. In reality, building a distributed system requires a methodical approach to requirements along the boundaries of failure domains, latency, throughput, durability, consistency, and desired SLAs for the business application at all aspects of the application.”
  • 43. Mesos – open source datacenter computing
 a common substrate for cluster computing
 mesos.apache.org
 heterogeneous assets in your datacenter or cloud made available as a homogeneous set of resources
 • top-level Apache project
 • scalability to 10,000s of nodes
 • obviates the need for virtual machines
 • isolation (pluggable) for CPU, RAM, I/O, FS, etc.
 • fault-tolerant leader election based on ZooKeeper
 • APIs in C++, Java/Scala, Python, Go, Erlang, Haskell
 • web UI for inspecting cluster state
 • available for Linux, OpenSolaris, Mac OS X
  • 44. What are the costs of Virtualization?
 benchmark type: OpenVZ improvement
 mixed workloads: 210%-300%
 LAMP (related): 38%-200%
 I/O throughput: 200%-500%
 response time: order of magnitude
 more pronounced at higher loads
  • 45. What are the costs of Single Tenancy? [Charts: CPU load (0% to 100%) over time t, shown separately for Rails, Memcached, and Hadoop, then as combined CPU load (Rails, Memcached, Hadoop)]
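The single-tenancy cost on slide 45 comes down to simple arithmetic: dedicated clusters must each be sized for their own peak, while a shared cluster only needs to cover the peak of the combined load. The sketch below uses made-up demand figures (cores per time slot) purely for illustration; none of the numbers come from the slide's charts, and `utilization` is a hypothetical helper.

```python
# Hypothetical CPU demand in cores per time slot; NOT data from the slide.
rails = [10, 30, 40, 30, 10, 5]
memcached = [5, 15, 20, 15, 5, 5]
hadoop = [40, 10, 5, 10, 40, 45]


def utilization(demand, capacity):
    """Average fraction of capacity in use across all time slots."""
    return sum(demand) / (capacity * len(demand))


# Total demand per time slot across the three workloads.
combined = [r + m + h for r, m, h in zip(rails, memcached, hadoop)]

# Single tenancy: one cluster per workload, each provisioned for its own peak.
static_capacity = max(rails) + max(memcached) + max(hadoop)

# Shared pool: provision once, for the peak of the combined demand.
shared_capacity = max(combined)

static_util = utilization(combined, static_capacity)
shared_util = utilization(combined, shared_capacity)

assert static_capacity == 105 and shared_capacity == 65
assert shared_util > static_util
```

With these assumed figures the same work fits in 65 cores instead of 105, because the workloads peak at different times; the benefit shrinks if real workloads peak together.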
  • 46. Arguments for Datacenter Computing
 rather than running several specialized clusters, each at relatively low utilization rates, instead run many mixed workloads
 obvious benefits are realized in terms of:
 • scalability, elasticity, fault tolerance, performance, utilization
 • reduced equipment capex, Ops overhead, etc.
 • reduced licensing, eliminating need for VMs or potential vendor lock-in
 subtle benefits – arguably, more important for Enterprise IT:
 • reduced time for engineers to ramp up new services at scale
 • reduced latency between batch and services, enabling new high ROI use cases
 • enables Dev/Test apps to run safely on a Production cluster
  • 47. Analogies and Architecture
  • 48. Prior Practice: Dedicated Servers • low utilization rates • longer time to ramp up new services DATACENTER
  • 49. Prior Practice: Virtualization DATACENTER PROVISIONED VMS • even more machines to manage • substantial performance decrease 
 due to virtualization • VM licensing costs
  • 50. Prior Practice: Static Partitioning STATIC PARTITIONING • even more machines to manage • substantial performance decrease 
 due to virtualization • VM licensing costs • failures make static partitioning 
 more complex to manage DATACENTER
  • 51. Mesos: One Large Pool of Resources
 “We wanted people to be able to program for the datacenter just like they program for their laptop.”
 Ben Hindman
 [Diagram labels: MESOS, DATACENTER]
  • 52. Fault-tolerant distributed systems…
 …written in 100-300 lines of C++, Java/Scala, Python, Go, etc.
 …building blocks, if you will
 Q: required lines of network code?
 A: probably none
  • 53. Mesos – architecture
 HDFS, distrib file system
 Mesos, distrib kernel
 meta-frameworks: Aurora, Marathon
 frameworks: Spark, Storm, MPI, Jenkins, etc.
 task schedulers: Chronos, etc.
 APIs: C++, JVM, Py, Go
 apps: HA services, web apps, batch jobs, scripts, etc.
 Linux: libcgroup, libprocess, libev, etc.
  • 54. Mesos – dynamics
 Mesos: distrib kernel
 Marathon: distrib init.d
 Chronos: distrib cron
 distrib frameworks, HA services, scheduled apps
  • 55. Mesos – dynamics
 [Diagram labels: distributed framework (Scheduler, Executor, Executor, Executor); resource offers; distributed kernel (Mesos master, Mesos master); available resources; Mesos slaves]
  • 56. Example: Resource Offer in a Two-Level Scheduler mesos.apache.org/documentation/latest/mesos-architecture/
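The resource-offer flow named on slide 56 can be sketched as a small simulation. This is a toy model under stated assumptions, not the Mesos API: the classes `Offer`, `Task`, `FrameworkScheduler`, and `Master`, the slave names, and all resource figures are hypothetical. The first level (the master) decides what to offer; the second level (the framework's scheduler) decides which tasks to place on each offer and implicitly declines the rest.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    slave: str
    cpus: float
    mem_mb: int


@dataclass
class Task:
    name: str
    cpus: float
    mem_mb: int


class FrameworkScheduler:
    """Second level: picks pending tasks that fit into an offer (first fit)."""

    def __init__(self, pending):
        self.pending = list(pending)

    def resource_offer(self, offer):
        launched = []
        cpus, mem = offer.cpus, offer.mem_mb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_mb <= mem:
                cpus -= task.cpus
                mem -= task.mem_mb
                launched.append(task)
                self.pending.remove(task)
        return launched   # anything not used is implicitly declined


class Master:
    """First level: tracks free resources per slave and offers them out."""

    def __init__(self, free):
        self.free = dict(free)   # slave name -> (cpus, mem_mb)

    def offer_round(self, scheduler):
        placements = {}
        for slave, (cpus, mem) in list(self.free.items()):
            tasks = scheduler.resource_offer(Offer(slave, cpus, mem))
            used_cpus = sum(t.cpus for t in tasks)
            used_mem = sum(t.mem_mb for t in tasks)
            self.free[slave] = (cpus - used_cpus, mem - used_mem)
            placements[slave] = [t.name for t in tasks]
        return placements


master = Master({"slave1": (4.0, 4096), "slave2": (2.0, 2048)})
scheduler = FrameworkScheduler([
    Task("t1", 2.0, 1024),
    Task("t2", 2.0, 2048),
    Task("t3", 1.0, 512),
    Task("t4", 4.0, 1024),   # too large for what remains; stays pending
])
placements = master.offer_round(scheduler)
```

In this run `t1` and `t2` land on `slave1`, `t3` lands on `slave2`, and `t4` waits for a later offer. The design point is that the master never needs to understand a framework's placement logic; it only accounts for what was accepted.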
  • 57. Frameworks Integrated with Mesos
 Continuous Integration: Jenkins, GitLab
 Big Data: Hadoop, Spark, Storm, Kafka, Hama
 Python workloads: DPark, Exelixi
 Meta-Frameworks / HA Services: Aurora, Marathon
 Orchestration: Singularity
 Distributed Cron: Chronos, JobServer
 Data Storage: ElasticSearch, Cassandra, Hypertable
 Containers: Docker, Deimos, GearD
 Parallel Processing: Chapel, MPI, Torque
  • 58. Looking Ahead…
  • 59. Quasar+Mesos @ Stanford, Twitter, etc.… Quasar: Resource-Efficient and QoS-Aware Cluster Management
 Christina Delimitrou, Christos Kozyrakis
 stanford.edu/~cdel/2014.asplos.quasar.pdf
  • 60. Quasar+Mesos @ Stanford, Twitter, etc.… Improving Resource Efficiency with Apache Mesos
 Christina Delimitrou
 youtu.be/YpmElyi94AA
  • 61. Quasar+Mesos @ Stanford, Twitter, etc.…
 Consider that for datacenter computing at scale, a surge in workloads implies:
 • large cap-ex investment, long lead-time to build
 • utilities cannot supply the power requirements
 Even for large players that achieve 2x beyond typical industry DC util rates, those factors become show-stoppers. Even so, high rates of over-provisioning are typical, so there’s much room to improve.
 Experiences with Quasar+Mesos showed:
 • 88% apps get >95% performance
 • ~10% overprovisioning instead of 500%
 • up to 70% cluster util at steady state
 • 23% shorter scenario completion
  • 62. Because…
 Use Cases
  • 63. Production Deployments (public)
  • 64. Built-in /
 bare metal Hypervisors Solaris Zones Linux CGroups Opposite Ends of the Spectrum, One Common Substrate
  • 65. Opposite Ends of the Spectrum, One Common Substrate Request /
 Response Batch
  • 66. Case Study: Twitter (bare metal / on premise)
 “Mesos is the cornerstone of our elastic compute infrastructure – it’s how we build all our new services and is critical for Twitter’s continued success at scale. It's one of the primary keys to our data center efficiency.”
 Chris Fry, SVP Engineering
 blog.twitter.com/2013/mesos-graduates-from-apache-incubation
 wired.com/gadgetlab/2013/11/qa-with-chris-fry/
 • key services run in production: analytics, typeahead, ads
 • Twitter engineers rely on Mesos to build all new services
 • instead of thinking about static machines, engineers think about resources like CPU, memory and disk
 • allows services to scale and leverage a shared pool of servers across datacenters efficiently
 • reduces the time between prototyping and launching
  • 67. Case Study: Airbnb (fungible cloud infrastructure)
 “We think we might be pushing data science in the field of travel more so than anyone has ever done before… a smaller number of engineers can have higher impact through automation on Mesos.”
 Mike Curtis, VP Engineering
 gigaom.com/2013/07/29/airbnb-is-engineering-itself-into-a-data...
 • improves resource management and efficiency
 • helps advance engineering strategy of building small teams that can move fast
 • key to letting engineers make the most of AWS-based infrastructure beyond just Hadoop
 • allowed company to migrate off Elastic MapReduce
 • enables use of Hadoop along with Chronos, Spark, Storm, etc.
  • 68. Case Study: eBay (continuous integration)
 eBay PaaS Team
 ebaytechblog.com/2014/04/04/delivering-ebays-ci-solution-with-apache-mesos-part-i/
 • cluster management (PaaS core framework services) for CI
 • integration of: OpenStack, Jenkins, Zookeeper, Mesos, Marathon, Ansible
 In eBay’s existing CI model, each developer gets a personal CI/Jenkins Master instance. This Jenkins instance runs within a dedicated VM, and over time the result has been VM sprawl and poor resource utilization. We started looking at solutions to maximize our resource utilization and reduce the VM footprint while still preserving the individual CI instance model. After much deliberation, we chose Apache Mesos for a POC. This post shares the journey of how we approached this challenge and accomplished our goal.
  • 69. Case Study: HubSpot (cluster management) Tom Petr
 youtu.be/ROn14csiikw mesosphere.io/resources/mesos-case-study-hubspot/ • 500 deployable objects; 100 deploys/day to production; 90 engineers; 3 devops on Mesos cluster • “Our QA cluster is now a fixed $10K/month — that used to fluctuate”
  • 70. DIY
  • 71. http://elastic.mesosphere.io
 http://mesosphere.io/learn
  • 72. Summary
 Question
  • 73. Summary Question:
 Given the points about Part 3, Part 2, Part 1…
 Given the history from Church and Curry to BDAS and Twitter OSS…
 Given the needs, e.g., IoT preferably not boiling the oceans…
 Why do we still see proto-legacy systems like Tez? Or, for that matter, why do we find notable experts stating that “Hadoop is an OS”?
 It’s time to set the legacy of YHOO circa 2009 aside, to step up to contemporary challenges with better understanding of the underlying math and CS theory => solving business use cases at scale
 To paraphrase author William Gibson, the future is already here – it’s just not very evenly distributed, nor is it google-able
  • 74. IoT Data Rates: ???
  • 75. Thank you very much
  • 76. monthly newsletter for updates, 
 events, conf summaries, etc.: liber118.com/pxn/ Enterprise Data Workflows with Cascading O’Reilly, 2013 shop.oreilly.com/product/0636920028536.do Just Enough Math O’Reilly, 2014 oreilly.com/go/enough_math/
 preview: youtu.be/TQ58cWgdCpA
  • 77. Spark Summit
 SF, Jun 30 15% code: Paco2014
 spark-summit.org/2014 OSCON 2014
 PDX, Jul 20 20% code: PACOID
 oscon.com/oscon2014/ #MesosCon
 Chicago, Aug 21
 events.linuxfoundation.org/events/mesoscon Strata NYC + Hadoop World
 NYC, Oct 15
 strataconf.com/stratany2014 Data Day Texas
 Austin, Jan 10
 datadaytexas.com calendar:
