SlideShare a Scribd company logo
Version 1.0
Apache Livy
A REST API for interacting with Spark Clusters, submitting jobs
and managing Spark Contexts and cached data.
Obioma Anomnachi
Engineer @ Anant
Apache Livy
● Apache Livy is a tool that enables users to interact with a Spark cluster via a REST interface
○ It allows users to submit jobs as pre-compiler jars, or snippets of code via REST
○ It also comes with a Java/Scala client API
○ It manages Spark Contexts, including allowing them to be used for multiple jobs or by multiple clients
○ Allows the sharing of caches RDDs, Dataframes, Datasets across multiple jobs
● It allows the submission of jobs without a Spark Client
● Existing Spark code needs no changes to be compatible with Livy, and existing cluster need little
modification
Spark Job Submission Options
● Normally, Spark interactions take place through the Spark shell (Python or Scala or R), or through
submission of batch jobs with spark-submit
○ These aren't actually very different, the spark-submit deploy mode option can allow submitted jobs to run in
client mode, which can also give access to a REPL shell
○ Either of the above types rely on access to the cluster from a limited number of sources
■ A node permanently part of the cluster
■ Temporarily adding a node to the cluster to use spark-submit
■ Using a cluster gateway
○ All three methods essentially require access to a machine that is part of the cluster, allowing users to access
implementation and configuration details
○ Difficult to deny or allow certain types of access to users in different security groups with a limited number
nodes / gateways to access
■ Makes it hard to integrate with existing security / access control solutions
Demo
● https://github.com/anomnaco/LivyDemo
Resources
● https://livy.apache.org/
● https://github.com/apache/incubator-livy
Extra Discussion
● Livy’s website says “Apache Livy also simplifies the interaction between Spark and application
servers, thus enabling the use of Spark for interactive web/mobile applications.”
● As a thought experiment: What use case would you use this functionality for? Where is it better to
use Livy, so that you can use Spark directly for furnishing your applications rather than loading the
results of Spark’s work into a solution more focused on being application friendly?
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM,CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037

More Related Content

What's hot

Cassandra
Cassandra Cassandra
Cassandra
Pooja GV
 
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
Anant Corporation
 
DevOps and OpenStack December 2012
DevOps and OpenStack December 2012DevOps and OpenStack December 2012
DevOps and OpenStack December 2012
Matt Ray
 
Introducing Kubernetes
Introducing Kubernetes Introducing Kubernetes
Introducing Kubernetes
VikRam S
 
Backup multi-cloud solution based on named pipes
Backup multi-cloud solution based on named pipesBackup multi-cloud solution based on named pipes
Backup multi-cloud solution based on named pipes
Leandro Totino Pereira
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0
ScyllaDB
 
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
ScyllaDB
 
Solving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache CassandraSolving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache Cassandra
Aaron Ploetz
 
Free & Open DynamoDB API for Everyone
Free & Open DynamoDB API for EveryoneFree & Open DynamoDB API for Everyone
Free & Open DynamoDB API for Everyone
ScyllaDB
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architecture
Omid Vahdaty
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
Long Nguyen
 
Scylla Summit 2022: Stream Processing with ScyllaDB
Scylla Summit 2022: Stream Processing with ScyllaDBScylla Summit 2022: Stream Processing with ScyllaDB
Scylla Summit 2022: Stream Processing with ScyllaDB
ScyllaDB
 
Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)
Deepak Mane
 
How LogDNA Scaled Elasticsearch on Kubernetes
How LogDNA Scaled Elasticsearch on KubernetesHow LogDNA Scaled Elasticsearch on Kubernetes
How LogDNA Scaled Elasticsearch on Kubernetes
LogDNA
 
Changing the world with ZeroVM and Swift
Changing the world with ZeroVM and SwiftChanging the world with ZeroVM and Swift
Changing the world with ZeroVM and Swift
Jakub Krajcovic
 
Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020
ScyllaDB
 
Augury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoTAugury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoT
ScyllaDB
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
Li Gao
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes Services
ScyllaDB
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
ScyllaDB
 

What's hot (20)

Cassandra
Cassandra Cassandra
Cassandra
 
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
Apache Cassandra Lunch #71: Creating a User Profile Using DataStax Astra and ...
 
DevOps and OpenStack December 2012
DevOps and OpenStack December 2012DevOps and OpenStack December 2012
DevOps and OpenStack December 2012
 
Introducing Kubernetes
Introducing Kubernetes Introducing Kubernetes
Introducing Kubernetes
 
Backup multi-cloud solution based on named pipes
Backup multi-cloud solution based on named pipesBackup multi-cloud solution based on named pipes
Backup multi-cloud solution based on named pipes
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0
 
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
How to Secure Your Scylla Deployment: Authorization, Encryption, LDAP Authent...
 
Solving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache CassandraSolving Hybrid Cloud Data Replication with Apache Cassandra
Solving Hybrid Cloud Data Replication with Apache Cassandra
 
Free & Open DynamoDB API for Everyone
Free & Open DynamoDB API for EveryoneFree & Open DynamoDB API for Everyone
Free & Open DynamoDB API for Everyone
 
Kafka website activity architecture
Kafka website activity architectureKafka website activity architecture
Kafka website activity architecture
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Scylla Summit 2022: Stream Processing with ScyllaDB
Scylla Summit 2022: Stream Processing with ScyllaDBScylla Summit 2022: Stream Processing with ScyllaDB
Scylla Summit 2022: Stream Processing with ScyllaDB
 
Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)Deployment topologies for high availability (ha)
Deployment topologies for high availability (ha)
 
How LogDNA Scaled Elasticsearch on Kubernetes
How LogDNA Scaled Elasticsearch on KubernetesHow LogDNA Scaled Elasticsearch on Kubernetes
How LogDNA Scaled Elasticsearch on Kubernetes
 
Changing the world with ZeroVM and Swift
Changing the world with ZeroVM and SwiftChanging the world with ZeroVM and Swift
Changing the world with ZeroVM and Swift
 
Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020Scylla Virtual Workshop 2020
Scylla Virtual Workshop 2020
 
Augury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoTAugury: Real-Time Insights for the Industrial IoT
Augury: Real-Time Insights for the Industrial IoT
 
Scaling spark on kubernetes at Lyft
Scaling spark on kubernetes at LyftScaling spark on kubernetes at Lyft
Scaling spark on kubernetes at Lyft
 
Running a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes ServicesRunning a DynamoDB-compatible Database on Managed Kubernetes Services
Running a DynamoDB-compatible Database on Managed Kubernetes Services
 
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2Scylla Summit 2022: Scylla 5.0 New Features, Part 2
Scylla Summit 2022: Scylla 5.0 New Features, Part 2
 

Similar to Data Engineer’s Lunch #45: Apache Livy

Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019
Gerard Klijs
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark example
ShidrokhGoudarzi1
 
Livy: A REST Web Service for Spark
Livy: A REST Web Service for SparkLivy: A REST Web Service for Spark
Livy: A REST Web Service for Spark
Ashish kumar
 
Migrating to spark 2.0
Migrating to spark 2.0Migrating to spark 2.0
Migrating to spark 2.0
datamantra
 
E2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/LivyE2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/Livy
Rikin Tanna
 
Function as a Service
Function as a ServiceFunction as a Service
Function as a Service
rich fernandez
 
A Short Presentation on Kafka
A Short Presentation on KafkaA Short Presentation on Kafka
A Short Presentation on Kafka
Mostafa Jubayer Khan
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Timothy Spann
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
Slim Baltagi
 
Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021
Lalit Panwar
 
Using OSGi for script deployment in Apache Sling
Using OSGi for script deployment in Apache SlingUsing OSGi for script deployment in Apache Sling
Using OSGi for script deployment in Apache Sling
Radu Cotescu
 
Apache frameworks for Big and Fast Data
Apache frameworks for Big and Fast DataApache frameworks for Big and Fast Data
Apache frameworks for Big and Fast Data
Naveen Korakoppa
 
Building Cloud-agnostic Serverless APIs
Building Cloud-agnostic Serverless APIsBuilding Cloud-agnostic Serverless APIs
Building Cloud-agnostic Serverless APIs
Postman
 
RESTLess Design with Apache Thrift: Experiences from Apache Airavata
RESTLess Design with Apache Thrift: Experiences from Apache AiravataRESTLess Design with Apache Thrift: Experiences from Apache Airavata
RESTLess Design with Apache Thrift: Experiences from Apache Airavata
smarru
 
What's New in AWS Serverless and Containers
What's New in AWS Serverless and ContainersWhat's New in AWS Serverless and Containers
What's New in AWS Serverless and Containers
Amazon Web Services
 
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
My own sweet home!
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Timothy Spann
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
Timothy Spann
 
Distributed & Highly Available server applications in Java and Scala
Distributed & Highly Available server applications in Java and ScalaDistributed & Highly Available server applications in Java and Scala
Distributed & Highly Available server applications in Java and Scala
Max Alexejev
 
Java Library for High Speed Streaming Data
Java Library for High Speed Streaming Data Java Library for High Speed Streaming Data
Java Library for High Speed Streaming Data
Oracle Developers
 

Similar to Data Engineer’s Lunch #45: Apache Livy (20)

Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019Rust with-kafka-07-02-2019
Rust with-kafka-07-02-2019
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark example
 
Livy: A REST Web Service for Spark
Livy: A REST Web Service for SparkLivy: A REST Web Service for Spark
Livy: A REST Web Service for Spark
 
Migrating to spark 2.0
Migrating to spark 2.0Migrating to spark 2.0
Migrating to spark 2.0
 
E2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/LivyE2E Data Pipeline - Apache Spark/Airflow/Livy
E2E Data Pipeline - Apache Spark/Airflow/Livy
 
Function as a Service
Function as a ServiceFunction as a Service
Function as a Service
 
A Short Presentation on Kafka
A Short Presentation on KafkaA Short Presentation on Kafka
A Short Presentation on Kafka
 
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...Scenic City Summit (2021):  Real-Time Streaming in any and all clouds, hybrid...
Scenic City Summit (2021): Real-Time Streaming in any and all clouds, hybrid...
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
 
Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021Mule soft meetup_chandigarh_#7_25_sept_2021
Mule soft meetup_chandigarh_#7_25_sept_2021
 
Using OSGi for script deployment in Apache Sling
Using OSGi for script deployment in Apache SlingUsing OSGi for script deployment in Apache Sling
Using OSGi for script deployment in Apache Sling
 
Apache frameworks for Big and Fast Data
Apache frameworks for Big and Fast DataApache frameworks for Big and Fast Data
Apache frameworks for Big and Fast Data
 
Building Cloud-agnostic Serverless APIs
Building Cloud-agnostic Serverless APIsBuilding Cloud-agnostic Serverless APIs
Building Cloud-agnostic Serverless APIs
 
RESTLess Design with Apache Thrift: Experiences from Apache Airavata
RESTLess Design with Apache Thrift: Experiences from Apache AiravataRESTLess Design with Apache Thrift: Experiences from Apache Airavata
RESTLess Design with Apache Thrift: Experiences from Apache Airavata
 
What's New in AWS Serverless and Containers
What's New in AWS Serverless and ContainersWhat's New in AWS Serverless and Containers
What's New in AWS Serverless and Containers
 
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
 
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
Budapest Data/ML - Building Modern Data Streaming Apps with NiFi, Flink and K...
 
AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101AIDevWorldApacheNiFi101
AIDevWorldApacheNiFi101
 
Distributed & Highly Available server applications in Java and Scala
Distributed & Highly Available server applications in Java and ScalaDistributed & Highly Available server applications in Java and Scala
Distributed & Highly Available server applications in Java and Scala
 
Java Library for High Speed Streaming Data
Java Library for High Speed Streaming Data Java Library for High Speed Streaming Data
Java Library for High Speed Streaming Data
 

More from Anant Corporation

LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
Anant Corporation
 
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
Anant Corporation
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Anant Corporation
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Anant Corporation
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
Anant Corporation
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Anant Corporation
 
YugabyteDB Developer Tools
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer Tools
Anant Corporation
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Anant Corporation
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with Airflow
Anant Corporation
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Anant Corporation
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Anant Corporation
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Anant Corporation
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Anant Corporation
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Anant Corporation
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
Anant Corporation
 
CL 121
CL 121CL 121
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Anant Corporation
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Anant Corporation
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Anant Corporation
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Anant Corporation
 

More from Anant Corporation (20)

LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by AnantLLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
LLM Fine Tuning with QLoRA Cassandra Lunch 4, presented by Anant
 
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
QLoRA Fine-Tuning on Cassandra Link Data Set (1/2) Cassandra Lunch 137
 
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdfKono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
Kono.IntelCraft.Weekly.AI.LLM.Landscape.2024.02.28.pdf
 
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache PinotData Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
Data Engineer's Lunch 96: Intro to Real Time Analytics Using Apache Pinot
 
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
NoCode, Data & AI LLM Inside Bootcamp: Episode 6 - Design Patterns: Retrieval...
 
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPTAutomate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
Automate your Job and Business with ChatGPT #3 - Fundamentals of LLM/GPT
 
YugabyteDB Developer Tools
YugabyteDB Developer ToolsYugabyteDB Developer Tools
YugabyteDB Developer Tools
 
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer RoadmapEpisode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
Episode 2: The LLM / GPT / AI Prompt / Data Engineer Roadmap
 
Machine Learning Orchestration with Airflow
Machine Learning Orchestration with AirflowMachine Learning Orchestration with Airflow
Machine Learning Orchestration with Airflow
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
 
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with ArcionData Engineer's Lunch 90: Migrating SQL Data with Arcion
Data Engineer's Lunch 90: Migrating SQL Data with Arcion
 
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
Data Engineer's Lunch 89: Machine Learning Orchestration with AirflowMachine ...
 
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & FutureCassandra Lunch 129: What’s New:  Apache Cassandra 4.1+ Features & Future
Cassandra Lunch 129: What’s New: Apache Cassandra 4.1+ Features & Future
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
 
Data Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data StackData Engineer's Lunch #85: Designing a Modern Data Stack
Data Engineer's Lunch #85: Designing a Modern Data Stack
 
CL 121
CL 121CL 121
CL 121
 
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache IcebergData Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
Data Engineer's Lunch #83: Strategies for Migration to Apache Iceberg
 
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOpsApache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
Apache Cassandra Lunch 120: Apache Cassandra Monitoring Made Easy with AxonOps
 
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache CassandraApache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
Apache Cassandra Lunch 119: Desktop GUI Tools for Apache Cassandra
 
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
Data Engineer's Lunch #82: Automating Apache Cassandra Operations with Apache...
 

Recently uploaded

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
Social Samosa
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Kaxil Naik
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
apvysm8
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
sameer shah
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
a9qfiubqu
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Aggregage
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
VyNguyen709676
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
taqyea
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
wyddcwye1
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
jitskeb
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
Sachin Paul
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
xclpvhuk
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
Timothy Spann
 

Recently uploaded (20)

4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...
 
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
Orchestrating the Future: Navigating Today's Data Workflow Challenges with Ai...
 
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
办(uts毕业证书)悉尼科技大学毕业证学历证书原版一模一样
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
原版一比一弗林德斯大学毕业证(Flinders毕业证书)如何办理
 
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You...
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
writing report business partner b1+ .pdf
writing report business partner b1+ .pdfwriting report business partner b1+ .pdf
writing report business partner b1+ .pdf
 
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
一比一原版(harvard毕业证书)哈佛大学毕业证如何办理
 
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
原版一比一利兹贝克特大学毕业证(LeedsBeckett毕业证书)如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
 
Experts live - Improving user adoption with AI
Experts live - Improving user adoption with AIExperts live - Improving user adoption with AI
Experts live - Improving user adoption with AI
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
 
Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......Palo Alto Cortex XDR presentation .......
Palo Alto Cortex XDR presentation .......
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
 
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
一比一原版(Unimelb毕业证书)墨尔本大学毕业证如何办理
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 

Data Engineer’s Lunch #45: Apache Livy

  • 1. Version 1.0 Apache Livy A REST API for interacting with Spark Clusters, submitting jobs and managing Spark Contexts and cached data. Obioma Anomnachi Engineer @ Anant
  • 2. Apache Livy ● Apache Livy is a tool that enables users to interact with a Spark cluster via a REST interface ○ It allows users to submit jobs as pre-compiler jars, or snippets of code via REST ○ It also comes with a Java/Scala client API ○ It manages Spark Contexts, including allowing them to be used for multiple jobs or by multiple clients ○ Allows the sharing of caches RDDs, Dataframes, Datasets across multiple jobs ● It allows the submission of jobs without a Spark Client ● Existing Spark code needs no changes to be compatible with Livy, and existing cluster need little modification
  • 3. Spark Job Submission Options ● Normally, Spark interactions take place through the Spark shell (Python or Scala or R), or through submission of batch jobs with spark-submit ○ These aren't actually very different, the spark-submit deploy mode option can allow submitted jobs to run in client mode, which can also give access to a REPL shell ○ Either of the above types rely on access to the cluster from a limited number of sources ■ A node permanently part of the cluster ■ Temporarily adding a node to the cluster to use spark-submit ■ Using a cluster gateway ○ All three methods essentially require access to a machine that is part of the cluster, allowing users to access implementation and configuration details ○ Difficult to deny or allow certain types of access to users in different security groups with a limited number nodes / gateways to access ■ Makes it hard to integrate with existing security / access control solutions
  • 6. Extra Discussion ● Livy’s website says “Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications.” ● As a thought experiment: What use case would you use this functionality for? Where is it better to use Livy, so that you can use Spark directly for furnishing your applications rather than loading the results of Spark’s work into a solution more focused on being application friendly?
  • 7. Strategy: Scalable Fast Data Architecture: Cassandra, Spark, Kafka Engineering: Node, Python, JVM,CLR Operations: Cloud, Container Rescue: Downtime!! I need help. www.anant.us | solutions@anant.us | (855) 262-6826 3 Washington Circle, NW | Suite 301 | Washington, DC 20037