© 2016 Mesosphere, Inc. All Rights Reserved.
From SMACK to SMAACK
Alluxio meets DC/OS
Jörg Schad, Mesosphere
Adit Madan, Alluxio
#smack @Alluxio @dcos @joerg_schad @madanadit
© 2017 Mesosphere, Inc. All Rights Reserved.
20% OFF
MCDCOS20
September 13th - 15th
● Dedicated Tracks
● MesosCon University
● Town Halls
● Hackathon
Accelerating Spark workloads in a Mesos
environment with Alluxio, 09/15, 11AM
Fast Data
● Processing styles, from slowest to fastest: Batch, Micro-Batch, Event Processing
● Latency scale: Days, Hours, Minutes, Seconds, Microseconds
● Slower end: reports what has happened using descriptive analytics
● Faster end: solves problems using predictive and prescriptive analytics
● Example use cases: Billing, Chargeback; Product recommendations; Real-time Pricing and Routing; Real-time Advertising; Predictive User Interface
The SMACK Stack
● EVENTS: ubiquitous data streams from connected devices (sensors, devices, clients)
● INGEST: Apache Kafka, ingests millions of events per second
● STORE: Apache Cassandra, a distributed & highly scalable database
● ANALYZE: Apache Spark, real-time and batch data processing
● ACT: Akka, visualize data and build data-driven applications
● All running on Mesos / DC/OS
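The flow through the stack stages can be sketched with in-memory stand-ins. This is a toy illustration of the ingest, store, analyze, act sequence, not code from the talk; it does not use Kafka, Cassandra, Spark, or Akka, and every name and event in it is hypothetical.

```python
from collections import Counter, deque

# In-memory stand-ins for the real systems; all names and data are hypothetical.
ingest_queue = deque()   # plays the role of a Kafka topic
store = []               # plays the role of a Cassandra table


def ingest(event):
    """INGEST: buffer an incoming event."""
    ingest_queue.append(event)


def persist():
    """STORE: drain the queue into the store."""
    while ingest_queue:
        store.append(ingest_queue.popleft())


def analyze():
    """ANALYZE: count events per device, the kind of aggregation a Spark job might run."""
    return Counter(event["device"] for event in store)


def act(counts, threshold=2):
    """ACT: return the devices whose event count reaches the threshold."""
    return sorted(d for d, n in counts.items() if n >= threshold)


for device in ["sensor-1", "sensor-2", "sensor-1"]:
    ingest({"device": device, "value": 1.0})
persist()
print(act(analyze()))
```

In a real deployment each stage would be a separate distributed service scheduled by Mesos / DC/OS; the sketch only shows how data moves between the stages.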
Datacenter
NAIVE APPROACH
● Typical datacenter: siloed, over-provisioned servers, low utilization
● Industry average: 12-15% utilization
● One silo per workload: MySQL, microservice, Cassandra, Spark/Hadoop, Kafka
MULTIPLEXING OF DATA, SERVICES, USERS, ENVIRONMENTS
● Typical datacenter: siloed, over-provisioned servers, low utilization
● Mesos / DC/OS: automated schedulers, workload multiplexing onto the same machines
● Workloads: MySQL, microservice, Cassandra, Spark/Hadoop, Kafka
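To make the contrast concrete, here is a minimal sketch that compares siloed provisioning with multiplexing the same workloads onto shared machines. The workload names mirror the slide; the CPU demands, the 16-core machine size, and the first-fit-decreasing placement are illustrative assumptions, not figures from the talk. Mesos itself hands out resource offers to frameworks rather than running this simple packing.

```python
# Hypothetical numbers for illustration only; not measurements from the talk.
MACHINE_CORES = 16

# Average CPU cores each workload actually uses (assumed values).
demand = {
    "MySQL": 2,
    "microservice": 1,
    "Cassandra": 4,
    "Spark/Hadoop": 3,
    "Kafka": 2,
}


def siloed_machines(demand):
    """One dedicated machine per workload, as in the siloed datacenter."""
    return len(demand)


def multiplexed_machines(demand, capacity):
    """Pack workloads onto shared machines with first-fit decreasing."""
    free = []  # remaining cores on each machine opened so far
    for cores in sorted(demand.values(), reverse=True):
        for i, remaining in enumerate(free):
            if cores <= remaining:
                free[i] -= cores
                break
        else:
            free.append(capacity - cores)  # no machine fits: open a new one
    return len(free)


def utilization(demand, machines, capacity):
    """Fraction of provisioned cores that the workloads actually use."""
    return sum(demand.values()) / (machines * capacity)


silo = siloed_machines(demand)
shared = multiplexed_machines(demand, MACHINE_CORES)
print(f"siloed:      {silo} machines, {utilization(demand, silo, MACHINE_CORES):.0%} utilized")
print(f"multiplexed: {shared} machines, {utilization(demand, shared, MACHINE_CORES):.0%} utilized")
```

With these assumed demands the siloed layout needs five machines while the multiplexed layout fits on one, which is the effect the slide describes.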
DC/OS ENABLES MODERN DISTRIBUTED APPS
● Modern App Components
○ Microservices (in containers): Functions & Logic
○ Big Data + Analytics Engines: Streaming, Batch, Machine Learning, Analytics, Search, Time Series, SQL / NoSQL Databases
● Datacenter Operating System (DC/OS)
● Distributed Systems Kernel (Mesos)
● Any Infrastructure (Physical, Virtual, Cloud)
BIG DATA ECOSYSTEM YESTERDAY
© 2017 Alluxio
BIG DATA ECOSYSTEM TODAY
BIG DATA ECOSYSTEM ISSUES
The SMAACK Stack
● EVENTS: ubiquitous data streams from connected devices (sensors, devices, clients)
● INGEST: Apache Kafka, ingests millions of events per second
● STORE: Apache Cassandra, a distributed & highly scalable database
● ANALYZE: Apache Spark, real-time and batch data processing
● ACT: Akka, visualize data and build data-driven applications
● Alluxio (the extra "A" in SMAACK)
● All running on Mesos / DC/OS
BIG DATA ECOSYSTEM WITH ALLUXIO
● Application-facing interfaces
○ FUSE Compatible File System Interface
○ Hadoop Compatible File System Interface
○ Native Key-Value Interface
○ Native File System Interface
● Storage-facing interfaces
○ HDFS Interface
○ Amazon S3 Interface
○ Swift Interface
○ GlusterFS Interface
● Enabling applications to access data from any storage system at memory speed
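One way to picture the layering above is a mount table that maps paths in a single namespace onto different storage systems. The sketch below is a toy model of that idea, not Alluxio's API; the class name, mount points, and storage URIs are all hypothetical.

```python
class UnifiedNamespace:
    """Toy mount table: maps paths in one namespace to under-storage URIs."""

    def __init__(self):
        self._mounts = {}  # mount point -> under-storage URI prefix

    def mount(self, mount_point, storage_uri):
        """Attach a storage system at the given path in the namespace."""
        self._mounts[mount_point.rstrip("/")] = storage_uri.rstrip("/")

    def resolve(self, path):
        """Return the under-storage URI for a path, using the longest matching mount."""
        candidates = [
            m for m in self._mounts
            if path == m or path.startswith(m + "/")
        ]
        if not candidates:
            raise KeyError(f"no mount covers {path}")
        best = max(candidates, key=len)
        return self._mounts[best] + path[len(best):]


ns = UnifiedNamespace()
# Hypothetical mounts; a real deployment would use its own clusters and buckets.
ns.mount("/warehouse", "hdfs://namenode:8020/warehouse")
ns.mount("/logs", "s3://example-bucket/logs")

print(ns.resolve("/warehouse/events/part-0"))
print(ns.resolve("/logs/2017/09/15.gz"))
```

An application only sees paths such as `/warehouse/...` and `/logs/...`; which storage system backs each path is a deployment detail.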
WHY ALLUXIO
● Co-located compute and data with memory-speed access to data
● Virtualized across different storage systems under a unified namespace
● Scale-out architecture
● File system API, software only
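The memory-speed point can be illustrated with a read-through cache: the first read of a file goes to the slower under storage, and later reads are served from memory next to the compute. This is a simplified sketch under stated assumptions, not Alluxio's implementation; `SlowStore`, `MemoryTier`, the capacity, and the paths are hypothetical.

```python
class SlowStore:
    """Stand-in for a remote storage system; counts how often it is read."""

    def __init__(self, files):
        self._files = files
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self._files[path]


class MemoryTier:
    """Read-through cache that keeps recently read files in local memory."""

    def __init__(self, store, capacity=2):
        self._store = store
        self._capacity = capacity
        self._cache = {}  # dicts preserve insertion order, used here for LRU

    def read(self, path):
        if path in self._cache:
            data = self._cache.pop(path)   # hit: re-insert below to mark as most recent
        else:
            data = self._store.read(path)  # miss: fetch from the under storage
            if len(self._cache) >= self._capacity:
                oldest = next(iter(self._cache))
                del self._cache[oldest]    # evict the least recently used entry
        self._cache[path] = data
        return data


store = SlowStore({"/data/a": b"alpha", "/data/b": b"beta"})
tier = MemoryTier(store)
for _ in range(3):
    tier.read("/data/a")
print(store.reads)  # 1: only the first read reached the slow store
```

The eviction policy here is plain least-recently-used, chosen only to keep the sketch short.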
ALLUXIO BENEFITS
● Unification: new workflows across any data in any storage system
● Performance: orders of magnitude improvement in run time
● Flexibility: choice in compute and storage; grow each independently, buy only what is needed
WHY DATA SERVICES ON DC/OS?
1. On-demand provisioning
2. Simplified operations
3. Elastic data infrastructure
● Single command install of services
● Runtime software upgrade
● Runtime application settings update
● Monitoring & metrics
● Managed persistent storage volumes
● Data services and containerized apps share resources
● Deploy instances with different versions on the same infrastructure
● Resize instances
● Add more instances
ALLUXIO ON MESOSPHERE DC/OS
Fast, On-demand Unified Data at Memory Speed for Analytics
● Alluxio: Unify Data at Memory Speed
● Mesosphere DC/OS: Runs distributed apps anywhere as simply as running apps on your laptop
● Any Infrastructure: Build apps once in DC/OS, and run anywhere
WHY ALLUXIO ON MESOSPHERE DC/OS?
● Without Mesosphere DC/OS, provisioning of infrastructure is tedious
○ Mesosphere DC/OS automates app & cluster provisioning, management & elastic scaling
● Alluxio brings
○ A unified view of data across disparate storage systems
○ High performance & predictable SLA for analytics workloads
● Benefits include:
○ Process data in your existing cluster faster with Spark and other analytics frameworks
○ Process data from hybrid cloud storage systems (HDFS, S3, on-prem object stores, etc.)
BIG DATA STACK WITH ALLUXIO ON MESOSPHERE DC/OS
Fast, On-demand Unified Data at Memory Speed for Analytics
● Alluxio: Unifying Data at Memory Speed
● Container Orchestration, Management & Monitoring Tools, Apps Universe
● Security, Advanced Operations, Multitenancy, Adv. Network & Storage
● Mesos
DEMO
WHAT HAPPENED?
● Alluxio scheduler (developed using the DC/OS SDK) launched as a Marathon application
○ Marathon manages and restarts the scheduler in case of failures
○ Scheduler consists of YAML + scripting
● Alluxio scheduler launched master and worker processes
○ Scheduler maintains the configured number of instances, even when instances fail
● Configuration changes take effect on the fly
○ Scaled up the worker instances
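The behaviour described above, keeping the configured number of master and worker instances running and reacting to configuration changes, is essentially a reconciliation loop. The sketch below illustrates that idea in plain Python; it does not use the DC/OS SDK or Marathon, and the task names, the starting counts, and the `reconcile` function are hypothetical.

```python
def reconcile(desired, running):
    """Return (to_launch, to_kill) so that running tasks match the desired counts.

    desired: dict of task type -> configured instance count
    running: set of task names such as "worker-0"
    """
    wanted = {f"{kind}-{i}" for kind, count in desired.items() for i in range(count)}
    to_launch = sorted(wanted - running)
    to_kill = sorted(running - wanted)
    return to_launch, to_kill


config = {"master": 1, "worker": 2}  # assumed initial configuration
running = set()

# Initial deployment: every configured instance needs to be launched.
launch, kill = reconcile(config, running)
running |= set(launch)

# A worker fails: the next pass relaunches it.
running.discard("worker-1")
launch, kill = reconcile(config, running)
print(launch)
running |= set(launch)

# Configuration change on the fly: scale the workers from 2 to 3.
config["worker"] = 3
launch, kill = reconcile(config, running)
print(launch)
```

A real scheduler would run this comparison continuously against task status updates instead of being called by hand.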
GET STARTED TODAY
Read:
● Mesosphere Blog: http://ow.ly/ou0530ax9aM
● Alluxio Blog: http://ow.ly/ILOZ30ax8YE
Try it out:
● Install Alluxio from DC/OS Universe
Questions?
Alluxio Mesos Meetup - SMACK to SMAACK

  • 1. © 2016 Mesosphere, Inc. All Rights Reserved. From SMACK to SMAACK Alluxio meets DC/OS Jörg Schad, Mesosphere Adit Madan, Alluxio #smack @Alluxio @dcos @joerg_schad @madanadit
  • 2. © 2017 Mesosphere, Inc. All Rights Reserved. 20% OFF MCDCOS20 September 13th - 15th ● Dedicated Tracks ● MesosCon University ● Town Halls ● Hackathon Accelerating Spark workloads in a Mesos environment with Alluxio, 09/15, 11AM
  • 3. © 2017 Mesosphere, Inc. All Rights Reserved. 3 Fast Data Batch Event ProcessingMicro-Batch Days Hours Minutes Seconds Microseconds Solves problems using predictive and prescriptive analyticsReports what has happened using descriptive analytics Predictive User InterfaceReal-time Pricing and Routing Real-time AdvertisingBilling, Chargeback Product recommendations
  • 4. © 2017 Mesosphere, Inc. All Rights Reserved. 4 The SMACK Stack EVENTS Ubiquitous data streams from connected devices INGEST Apache Kafka STORE Apache Spark ANALYZE Apache Cassandra ACT Akka Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Mesos/ DC/OS Sensors Devices Clients
  • 5. © 2017 Mesosphere, Inc. All Rights Reserved. 5 Datacenter
  • 6. © 2017 Mesosphere, Inc. All Rights Reserved. 6 NAIVE APPROACH Typical Datacenter siloed, over-provisioned servers, low utilization Industry Average 12-15% utilization mySQL microservice Cassandra Spark/Hadoop Kafka
  • 7. © 2017 Mesosphere, Inc. All Rights Reserved. 7
  • 8. © 2017 Mesosphere, Inc. All Rights Reserved. 8 MULTIPLEXING OF DATA, SERVICES, USERS, ENVIRONMENTS Typical Datacenter siloed, over-provisioned servers, low utilization Mesos/ DC/OS automated schedulers, workload multiplexing onto the same machines mySQL microservice Cassandra Spark/Hadoop Kafka
  • 9. Datacenter Operating System (DC/OS) Distributed Systems Kernel (Mesos) DC/OS ENABLES MODERN DISTRIBUTED APPS Big Data + Analytics EnginesMicroservices (in containers) Streaming Batch Machine Learning Analytics Functions & Logic Search Time Series SQL / NoSQL Databases Modern App Components Any Infrastructure (Physical, Virtual, Cloud) 9
  • 10. © 2017 Mesosphere, Inc. All Rights Reserved. 10 The SMACK Stack EVENTS Ubiquitous data streams from connected devices INGEST Apache Kafka STORE Apache Spark ANALYZE Apache Cassandra ACT Akka Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Mesos/ DC/OS Sensors Devices Clients
  • 11. © 2017 Mesosphere, Inc. All Rights Reserved. 11 The SMACK Stack EVENTS Ubiquitous data streams from connected devices INGEST Apache Kafka STORE Apache Spark ANALYZE Apache Cassandra ACT Akka Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Mesos/ DC/OS Sensors Devices Clients
  • 12. © 2016 Mesosphere, Inc. All Rights Reserved. BIG DATA ECOSYSTEM YESTERDAY © 2017 Alluxio 12
  • 13. © 2016 Mesosphere, Inc. All Rights Reserved. BIG DATA ECOSYSTEM TODAY © 2017 Alluxio … … 13
  • 14. © 2016 Mesosphere, Inc. All Rights Reserved. BIG DATA ECOSYSTEM ISSUES © 2017 Alluxio … … 14
  • 15. © 2017 Mesosphere, Inc. All Rights Reserved. 15 The SMAACK Stack EVENTS Ubiquitous data streams from connected devices INGEST Apache Kafka STORE Apache Spark ANALYZE Apache Cassandra ACT Akka Ingest millions of events per second Distributed & highly scalable database Real-time and batch process data Visualize data and build data driven applications Mesos/ DC/OS Sensors Devices Clients Alluxio
  • 17. BIG DATA ECOSYSTEM WITH ALLUXIO
    ● Application-side interfaces: FUSE Compatible File System Interface, Hadoop Compatible File System Interface, Native Key-Value Interface, Native File System Interface
    ● Storage-side interfaces: HDFS Interface, Amazon S3 Interface, Swift Interface, GlusterFS Interface
  • 18. BIG DATA ECOSYSTEM WITH ALLUXIO
    ● Application-side interfaces: FUSE Compatible File System Interface, Hadoop Compatible File System Interface, Native Key-Value Interface, Native File System Interface
    ● Storage-side interfaces: HDFS Interface, Amazon S3 Interface, Swift Interface, GlusterFS Interface
    ● Enabling applications to access data from any storage system at memory speed
  • 19. WHY ALLUXIO
    ● Co-located compute and data with memory-speed access to data
    ● Virtualized across different storage systems under a unified namespace
    ● Scale-out architecture
    ● File system API, software only
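The "unified namespace" point on the slide above can be pictured as a mount table that maps paths in one logical tree onto different backing stores. The following is a minimal sketch under stated assumptions: the `UnifiedNamespace` class, its methods, and the example URIs are all hypothetical illustrations, not Alluxio's actual API or internals.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UnifiedNamespace:
    """Toy mount table (hypothetical, not Alluxio's API)."""

    mounts: Dict[str, str] = field(default_factory=dict)

    def mount(self, mount_point: str, storage_uri: str) -> None:
        # Normalize so "/logs/" and "/logs" refer to the same mount point.
        key = mount_point.rstrip("/") or "/"
        self.mounts[key] = storage_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Translate a namespace path to the URI in the backing store."""
        matches = [
            m for m in self.mounts
            if m == "/" or path == m or path.startswith(m + "/")
        ]
        if not matches:
            raise KeyError(f"no mount point covers {path!r}")
        best = max(matches, key=len)  # longest matching prefix wins
        relative = path if best == "/" else path[len(best):]
        return self.mounts[best] + relative


# Example URIs are made up for illustration.
ns = UnifiedNamespace()
ns.mount("/warehouse", "hdfs://namenode:8020/warehouse")
ns.mount("/logs", "s3a://example-bucket/logs")
print(ns.resolve("/warehouse/events/part-0.parquet"))
print(ns.resolve("/logs/2017/09/15/part-0"))
```

An application addressing `/warehouse/...` and `/logs/...` sees a single tree, while the data itself may live in HDFS and S3 respectively; the sketch leaves out caching, which is where the memory-speed access would come from.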
  • 20. ALLUXIO BENEFITS
    ● Unification: new workflows across any data in any storage system
    ● Performance: orders of magnitude improvement in run time
    ● Flexibility: choice in compute and storage; grow each independently, buy only what is needed
  • 22. WHY DATA SERVICES ON DC/OS?
    Three themes: (1) on-demand provisioning, (2) simplified operations, (3) elastic data infrastructure
    ● Single command install of services
    ● Runtime software upgrade
    ● Runtime application settings update
    ● Monitoring & metrics
    ● Managed persistent storage volumes
    ● Data services and containerized apps share resources
    ● Deploy instances with different versions on the same infrastructure
    ● Resize instances
    ● Add more instances
  • 23. ALLUXIO ON MESOSPHERE DC/OS: Fast, On-demand Unified Data at Memory Speed for Analytics
    ● Alluxio: unify data at memory speed
    ● Mesosphere DC/OS: runs distributed apps anywhere as simply as running apps on your laptop
    ● Any infrastructure: build apps once in DC/OS, and run anywhere
  • 24. ALLUXIO ON MESOSPHERE DC/OS: Fast, On-demand Unified Data at Memory Speed for Analytics
  • 25. WHY ALLUXIO ON MESOSPHERE DC/OS?
    ● Without Mesosphere DC/OS, provisioning of infrastructure is tedious
      ○ Mesosphere DC/OS automates app & cluster provisioning, management & elastic scaling
    ● Alluxio brings
      ○ A unified view of data across disparate storage systems
      ○ High performance & predictable SLA for analytics workloads
    ● Benefits include:
      ○ Process data in your existing cluster faster with Spark and other analytics frameworks
      ○ Process data from hybrid cloud storage systems (HDFS, S3, on-prem object stores, etc.)
  • 26. BIG DATA STACK WITH ALLUXIO ON MESOSPHERE DC/OS: Fast, On-demand Unified Data at Memory Speed for Analytics
    ● Stack components shown: Mesos, Container Orchestration, Management & Monitoring Tools, Apps Universe, Security, Advanced Operations, Multitenancy, Adv. Network & Storage
    ● Unifying Data at Memory Speed
  • 27. DEMO
  • 28. WHAT HAPPENED?
    ● Alluxio scheduler (developed using the DC/OS SDK) launched as a Marathon application
      ○ Marathon manages and restarts the scheduler in case of failures
      ○ Scheduler consists of YAML + scripting
    ● Alluxio scheduler launched master and worker processes
      ○ Scheduler manages the configured number of instances even with failures
    ● Configuration changes take effect on the fly
      ○ Scaled up the worker instances
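The behaviour recapped on the slide above (keeping the configured number of master and worker instances running, and reacting to a scale-up) is essentially a reconciliation step between desired and observed state. The sketch below illustrates that idea only; the `reconcile` function, the task-type names, and the action strings are hypothetical and do not reflect the DC/OS SDK's real interface.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], running: Dict[str, int]) -> List[str]:
    """Return the actions needed to move from the running counts to the desired ones.

    Hypothetical illustration: a real scheduler would issue these through
    Mesos offers rather than return strings.
    """
    actions: List[str] = []
    for task_type in sorted(set(desired) | set(running)):
        want = desired.get(task_type, 0)
        have = running.get(task_type, 0)
        if have < want:
            # Replace failed instances or grow after a configuration change.
            actions.extend(f"launch {task_type}" for _ in range(want - have))
        elif have > want:
            actions.extend(f"stop {task_type}" for _ in range(have - want))
    return actions


# A worker has failed: one replacement is launched.
after_failure = reconcile({"master": 1, "worker": 3}, {"master": 1, "worker": 2})
print(after_failure)

# The configured worker count is raised from 3 to 5: two more are launched.
after_scale_up = reconcile({"master": 1, "worker": 5}, {"master": 1, "worker": 3})
print(after_scale_up)
```

Running the same step after a failure and after a configuration change shows why both cases on the slide can be handled by one mechanism: only the desired counts differ.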
  • 29. GET STARTED TODAY
    Read:
    ● Mesosphere Blog: http://ow.ly/ou0530ax9aM
    ● Alluxio Blog: http://ow.ly/ILOZ30ax8YE
    Try it out:
    ● Install Alluxio from DC/OS Universe
    Questions?