Accelerating Spark Workloads in a Mesos
Environment with Alluxio
Gene Pang, Software Engineer, Alluxio, Inc.
©2017 Alluxio, Inc. All Rights Reserved
About Me
Gene Pang
Software Engineer @ Alluxio, Inc.
Alluxio Open Source PMC Member
Ph.D. from AMPLab @ UC Berkeley
Worked at Google before UC Berkeley
Twitter: @unityxx
GitHub: @gpang
Outline
1. Alluxio Overview
2. Alluxio + Spark + Mesos Use Cases
3. Using Spark with Alluxio on Mesos
4. Deployment with Mesos
5. Demo
Data Ecosystem Yesterday
• One Compute Framework
• Single Storage System
• Co-located
Data Ecosystem Today
• Many Compute Frameworks
• Multiple Storage Systems
• Most not co-located
Data Ecosystem Issues
• Each application manages multiple data sources
• Adding or removing data sources requires application changes
• Storage optimizations require application changes
• Lower performance due to lack of locality
Data Ecosystem with Alluxio
• Apps only talk to Alluxio
• Simple Add/Remove
• No App Changes
• Memory Performance
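The "apps only talk to Alluxio" idea can be sketched as a mount table that maps one namespace onto several storage systems. This is a toy illustration in Python, not Alluxio's implementation: the `MountTable` class, the paths, and the storage URIs are all hypothetical.

```python
from typing import Dict


class MountTable:
    """Toy mount table: maps namespace prefixes to storage URIs (illustration only)."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, prefix: str, storage_uri: str) -> None:
        # Adding a data source is a table update; application paths stay the same.
        self._mounts[prefix.rstrip("/")] = storage_uri.rstrip("/")

    def unmount(self, prefix: str) -> None:
        del self._mounts[prefix.rstrip("/")]

    def resolve(self, path: str) -> str:
        # Pick the longest mounted prefix that contains the path.
        best = None
        for prefix in self._mounts:
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(prefix) > len(best):
                    best = prefix
        if best is None:
            raise KeyError(f"no storage mounted for {path}")
        return self._mounts[best] + path[len(best):]


table = MountTable()
table.mount("/logs", "hdfs://namenode:9000/warehouse/logs")   # hypothetical HDFS location
table.mount("/images", "s3a://example-bucket/images")         # hypothetical S3 bucket
resolved = table.resolve("/logs/2017/06/12.txt")
```

The application only ever sees `/logs/...` and `/images/...`; swapping the storage behind a prefix changes the table, not the application.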
Next Gen Analytics with Alluxio
✓ Big Data/IoT
✓ AI/ML
✓ Deep Learning
✓ Cloud Migration
✓ Multi Platform
✓ Autonomous
• Native File System, Hadoop Compatible File System, Native Key-Value Interface, FUSE Compatible File System
• HDFS Interface, Amazon S3 Interface, Swift Interface, GlusterFS Interface
Apps, Data & Storage at Memory Speed
Enabling Next Gen Analytics
1. Unify your Data
2. Architecture Flexibility
3. Improved I/O Performance
Fastest Growing Big Data
Open Source Project
• Fastest-growing open-source project in the big data ecosystem
• Running the world’s largest production clusters
• 600+ contributors from 100+ organizations
Big Data Case Study
Challenge –
Gain an end-to-end view of the business with a large volume of data for a $5B travel site.
Queries were slow and not interactive, resulting in operational inefficiency.
Solution –
With Alluxio, 300x improvement in performance
Impact –
Increased revenue from immediate response to user behavior
Use case: http://bit.ly/2pDJdrq
[Diagram labels: Spark, Flink, HDFS, Ceph, Mesos]
Machine Learning Case Study
6/12/17
Challenge –
Disparate data both on-prem and in the cloud. Heterogeneous types of data.
Scaling to exabyte-size data.
Slow due to a disk-based approach.
Solution –
Using Alluxio to prevent I/O bottlenecks
Impact –
Orders of magnitude higher performance than before.
http://bit.ly/2p18ds3
[Diagram labels: Spark, HDFS, Minio, Mesos]
Sharing Data via Memory
Storage Engine & Execution Engine: Same Process
• Two copies of data in memory – double the memory used
• Inter-process Sharing Slowed Down by Network / Disk I/O
[Diagram: on Mesos, two Spark processes (compute and storage together) each hold block 1 and block 3 in memory; HDFS / Amazon S3 holds blocks 1–4]
Sharing Data via Memory
Storage Engine & Execution Engine: Different Process
• Half the memory used
• Inter-process Sharing Happens at Memory Speed
[Diagram: on Mesos, two Spark processes read shared blocks (block 1, block 3, block 4) held in Alluxio; HDFS / Amazon S3 holds blocks 1–4]
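The memory claim on these two slides is simple arithmetic, sketched below. The function name and the 100 GB figure are made up for illustration; they are not measurements from the talk.

```python
def cached_memory_gb(dataset_gb: float, num_jobs: int, shared: bool) -> float:
    """Memory needed to keep one dataset cached for several Spark jobs.

    shared=False: each job caches its own copy inside its own process.
    shared=True:  one copy lives in a separate storage process that all jobs read.
    """
    if num_jobs < 1:
        raise ValueError("num_jobs must be at least 1")
    return dataset_gb if shared else dataset_gb * num_jobs


# Hypothetical example: two jobs working on the same 100 GB dataset.
in_process = cached_memory_gb(100, num_jobs=2, shared=False)
out_of_process = cached_memory_gb(100, num_jobs=2, shared=True)
```

With two jobs, the shared layout needs half the memory of the in-process layout, which matches the "half the memory used" bullet; the gap widens as more jobs share the data.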
Data Resilience During Crash
Storage Engine & Execution Engine: Same Process
• Process Crash Requires Network and/or Disk I/O to Re-read Data
[Diagram sequence: Spark compute and Spark storage run in one process on Mesos, caching block 1 and block 3 from HDFS / Amazon S3 (blocks 1–4); when the process crashes, the cached blocks are lost with it]
Data Resilience During Crash
Storage Engine & Execution Engine: Different Process
• Process Crash: Data is Re-read at Memory Speed
[Diagram sequence: Spark runs on Mesos while Alluxio holds block 1, block 3, and block 4 in a separate process, backed by HDFS / Amazon S3 (blocks 1–4); after the Spark process crashes, the blocks remain in Alluxio]
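The crash scenario can be mimicked with a toy cache that either dies with the job or outlives it. Everything here (`RemoteStore`, `BlockCache`, `remote_reads_after_restart`) is a hypothetical stand-in used only to count reads against remote storage; it does not model Spark or Alluxio internals.

```python
class RemoteStore:
    """Stand-in for HDFS / Amazon S3; counts how often it is read."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)
        self.reads = 0

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


class BlockCache:
    """In-memory block cache that falls back to the remote store on a miss."""

    def __init__(self, store):
        self.store = store
        self._mem = {}

    def get(self, block_id):
        if block_id not in self._mem:
            self._mem[block_id] = self.store.read(block_id)
        return self._mem[block_id]


def remote_reads_after_restart(cache_survives_crash: bool) -> int:
    store = RemoteStore({1: b"block-1", 3: b"block-3"})
    cache = BlockCache(store)
    for block_id in (1, 3):          # first run warms the cache
        cache.get(block_id)
    reads_before = store.reads
    if not cache_survives_crash:
        cache = BlockCache(store)    # same-process cache is lost with the crashed job
    for block_id in (1, 3):          # restarted job reads the same blocks
        cache.get(block_id)
    return store.reads - reads_before


same_process = remote_reads_after_restart(cache_survives_crash=False)
different_process = remote_reads_after_restart(cache_survives_crash=True)
```

When the cache is lost, the restarted job goes back to remote storage for every block; when the cache lives in a separate process, the restart needs no remote reads.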
Alluxio Architecture
[Diagram: Application with Alluxio Client, Alluxio Master, Alluxio Workers, and underlying Storage systems]
Alluxio Client
Applications interact with Alluxio via the Alluxio client
● Native Alluxio Filesystem Client
• Alluxio-specific operations like [un]pin, [un]mount, [un]set TTL
● HDFS-Compatible Filesystem Client
• No code change necessary
● S3 API
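With the HDFS-compatible client, pointing an existing job at Alluxio is typically a matter of changing the URI it reads from. The helper below is a minimal sketch of that rewrite. It assumes the Alluxio namespace mirrors the HDFS paths (for example, the HDFS root mounted at the Alluxio root) and that the master listens on port 19998; `to_alluxio_uri` and the host names are hypothetical.

```python
from urllib.parse import urlsplit


def to_alluxio_uri(uri: str, master_host: str, master_port: int = 19998) -> str:
    """Rewrite a storage URI to an alluxio:// URI with the same path.

    Assumption: the path is identical in the Alluxio namespace.
    """
    path = urlsplit(uri).path or "/"
    return f"alluxio://{master_host}:{master_port}{path}"


# Hypothetical host names, for illustration only.
alluxio_path = to_alluxio_uri("hdfs://namenode:9000/data/events.txt", "alluxio-master")

# In a Spark job the rest of the code stays unchanged, e.g. (not executed here):
#   lines = sc.textFile(alluxio_path)
# This assumes the Alluxio client jar is on the Spark driver and executor classpath.
```

Only the input path changes; the job logic that consumes the data is untouched, which is the sense of "no code change necessary" on the slide.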
Alluxio Master
Master is responsible for managing metadata
● Filesystem namespace metadata
● Blocks / workers metadata
Primary master writes journal for durable operations
● Secondary masters replay journal entries
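The journal idea can be sketched as an append-only log that a standby replays to rebuild the same namespace. This is a simplified model under assumed names (`PrimaryMaster`, `SecondaryMaster`, two operation types); Alluxio's actual journal format and failover logic are not shown.

```python
class Namespace:
    """Toy filesystem namespace: just a set of paths."""

    def __init__(self):
        self.paths = set()

    def apply(self, entry):
        op, path = entry
        if op == "create":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)
        else:
            raise ValueError(f"unknown journal operation: {op}")


class PrimaryMaster:
    def __init__(self):
        self.state = Namespace()
        self.journal = []

    def execute(self, op, path):
        entry = (op, path)
        self.journal.append(entry)   # record the operation before applying it
        self.state.apply(entry)


class SecondaryMaster:
    def __init__(self):
        self.state = Namespace()
        self.applied = 0

    def catch_up(self, journal):
        # Replay only the entries not seen yet.
        for entry in journal[self.applied:]:
            self.state.apply(entry)
        self.applied = len(journal)


primary = PrimaryMaster()
primary.execute("create", "/a")
primary.execute("create", "/b")
primary.execute("delete", "/a")

secondary = SecondaryMaster()
secondary.catch_up(primary.journal)
```

After replay, the secondary holds the same namespace as the primary, so it could take over without re-deriving metadata from the workers.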
Alluxio Worker
Worker is responsible for managing block data
Worker stores block data on various storage media
● HDD, SSD, Memory
Reads and writes data to underlying storage systems
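One way to picture a worker that stores blocks on several media is a tiered store where new blocks land in the fastest tier and older blocks are pushed down. The sketch below assumes a simple oldest-first eviction and counts capacity in blocks; `TieredBlockStore` is hypothetical, and real workers may use different eviction and allocation policies.

```python
from collections import OrderedDict


class TieredBlockStore:
    """Toy tiered store: tiers are listed fastest first, capacity counted in blocks."""

    def __init__(self, capacities):
        self.tiers = [(name, cap, OrderedDict()) for name, cap in capacities]

    def put(self, block_id, data):
        # Drop any stale copy, then write to the top tier.
        for _, _, blocks in self.tiers:
            blocks.pop(block_id, None)
        self._insert(0, block_id, data)

    def _insert(self, level, block_id, data):
        if level >= len(self.tiers):
            return  # no tier left; the block would remain only in underlying storage
        _, cap, blocks = self.tiers[level]
        blocks[block_id] = data
        if len(blocks) > cap:
            # Evict the oldest block in this tier to the next, slower tier.
            victim_id, victim_data = blocks.popitem(last=False)
            self._insert(level + 1, victim_id, victim_data)

    def tier_of(self, block_id):
        for name, _, blocks in self.tiers:
            if block_id in blocks:
                return name
        return None


store = TieredBlockStore([("MEM", 2), ("SSD", 2), ("HDD", 2)])
for block_id in range(1, 6):
    store.put(block_id, b"data")
```

After five writes into two-block tiers, the newest blocks sit in memory and the oldest has cascaded down to HDD.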
Alluxio on DC/OS
Alluxio brings
● A unified view of data across disparate storage systems
● High performance & predictable SLA for analytics workloads
DC/OS makes provisioning infrastructure easy
● Automates provisioning, management & elastic scaling
Benefits include:
● Faster analytics with Spark and other frameworks
● Process data from hybrid cloud storage systems (HDFS, S3, etc.)
Demo Environment
[Diagram: Spark and Alluxio running on Mesos]
Demo Setup
Alluxio 1.5.0
DC/OS 1.9.4
Spark 2.0.2
Amazon EC2 (m3.xlarge)
Results
8x improvement
Conclusion
Easy to use Alluxio with Spark in a Mesos environment
Predictable and improved performance
Easily connect to various storage systems
Thank you!
Gene Pang
Software Engineer
gene@alluxio.com
Website: www.alluxio.com
E-mail: info@alluxio.com
Social Media: Twitter.com/alluxio, Linkedin.com/alluxio
