Apache Spark on Kubernetes
Haridas N
How to use Kubernetes as the resource manager for Spark. These slides and the associated tutorial discuss the pros and cons of Spark's resource managers.

Refer to this GitHub project for more details and code samples: https://github.com/haridas/hadoop-env


  1. Apache Spark on Kubernetes (Haridas N)
  2. Agenda
     ● What is Kubernetes?
     ● Why run Spark on Kubernetes?
     ● How Kubernetes compares with other cluster managers.
     ● Hands-on with a few Spark jobs.
  3. Kubernetes
     ● Container orchestrator.
     ● Provisions containers across multiple nodes and abstracts the network that spans them.
     ● Supports multiple namespaces for better project isolation.
     ● User role and privilege management.
  4. Spark on Kubernetes
     ● Supports different deployment modes.
  5. Resource provisioning
  6. Why Spark on Kubernetes
     ● Kubernetes is now a widely used container management platform for SOA and other application environments.
     ● Better isolation between different deployments.
  7. Demo / Workshop
  8. Recap
  9. Hadoop, HDFS and YARN
     ● HDFS is the storage layer; the NameNode and DataNode services take care of data storage.
     ● The YARN resource negotiator provides a compute framework over the HDFS nodes.
     ● MapReduce jobs are written against the YARN framework.
     ● Best fit for batch processing and big-data storage.
  10. Apache Spark on YARN
     ● Using YARN, we deploy the Spark driver and workers into the Hadoop cluster.
     ● YARN provides more flexible resource management.
     ● Dynamic (on-demand) worker allocation.
     ● Best fit if you already have a Hadoop cluster and want to run Spark jobs on it.
  11. Apache Drill
     ● SQL interface for big data, with a Spark-like architecture.
     ● Interfaces with HDFS, NoSQL, Hive, Kafka, etc. and provides a unified, standard SQL interface.
     ● Exposes JDBC and HTTP APIs.
     ● Best fit for quick data analysis using SQL commands.
  12. Apache Spark on Kubernetes
     ● Kubernetes is a widely used container orchestrator.
     ● Major deployments exist outside the big-data domain, for different needs.
     ● The project supports big-data tools such as Spark and Hadoop on top of it.
     ● Run Spark jobs on an existing Kubernetes cluster.
     ● Offers a better resource-management feature set than the other cluster managers.
     ● Best fit if you already have a Kubernetes cluster in your environment.
  13. Q&A
  14. Thank you
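
The slides mention deployment modes, namespaces, and running a Spark job on an existing Kubernetes cluster, but the demo itself is not part of the transcript. As a rough sketch only (not the author's workshop code), the Python helper below assembles a `spark-submit` command line that targets a Kubernetes master. The helper name `build_spark_submit`, the API server address, the container image, and the jar path are all hypothetical placeholders; the `k8s://` master prefix and the `spark.kubernetes.*` settings are the standard Spark options for this purpose.

```python
import shlex


def build_spark_submit(api_server, image, app_jar, main_class,
                       namespace="default", executors=2,
                       deploy_mode="cluster"):
    """Return a spark-submit argument list targeting a Kubernetes cluster.

    Hypothetical helper for illustration; all argument values are supplied
    by the caller and nothing is submitted here.
    """
    if deploy_mode not in ("cluster", "client"):
        raise ValueError("deploy_mode must be 'cluster' or 'client'")
    return [
        "spark-submit",
        # The k8s:// prefix tells Spark to use Kubernetes as the cluster manager.
        "--master", f"k8s://{api_server}",
        # cluster mode: the driver runs in a pod; client mode: it runs locally.
        "--deploy-mode", deploy_mode,
        "--class", main_class,
        # Number of executor pods to request.
        "--conf", f"spark.executor.instances={executors}",
        # Namespace that isolates this job's pods from other projects.
        "--conf", f"spark.kubernetes.namespace={namespace}",
        # Container image that holds the Spark runtime (placeholder name).
        "--conf", f"spark.kubernetes.container.image={image}",
        app_jar,
    ]


if __name__ == "__main__":
    cmd = build_spark_submit(
        api_server="https://k8s.example.com:6443",   # assumed address
        image="example/spark:latest",                # assumed image name
        app_jar="local:///opt/spark/examples/jars/spark-examples.jar",  # assumed path
        main_class="org.apache.spark.examples.SparkPi",
        namespace="analytics",
        executors=3,
    )
    # Print a shell-safe command line instead of executing it.
    print(shlex.join(cmd))
```

Printing the command rather than running it keeps the sketch safe to try without a cluster; for working examples, see the GitHub project linked in the description.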