SlideShare a Scribd company logo

Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark

F
felixcss

Apache: Big Data North America 2016 session How do you find the needle in the haystack? With Big Data, finding insight is a big problem. Visualization and exploratory analysis help convert on insights and Apache Zeppelin (incubating) is an essential tool for that. In this tutorial, Felix Cheung will introduce you to Apache Zeppelin, and provide step-by-step guides to get you up-and-running with Apache Zeppelin to run Big Data analysis with Apache Spark. This is going to be a heavily hands-on session, no previous experience with Zeppelin, Data Science, or Statistics necessary.

1 of 42
Download to read offline
Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark
•
•
•
•
•
•
• ANY OTHER
PROVIDER PROVISIONING TOOLS
• HTTPS://WWW.VAGRANTUP.COM/DOWNLOADS.HTML
• VIRTUALIZATION
•
•
GUEST OPERATING SYSTEMS
• HTTPS://WWW.VIRTUALBOX.ORG/WIKI/DOWNLOADS
•
• HTTPS://GITHUB.COM/FELIXCHEUNG/VAGRANT-PROJECTS
• SPARK-CASSANDRA-ZEPPELIN
•
•
Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark

Recommended

Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeData Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo LeeSpark Summit
 
Intro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinIntro to Big Data Analytics using Apache Spark and Apache Zeppelin
Intro to Big Data Analytics using Apache Spark and Apache ZeppelinAlex Zeltov
 
Big Data visualization with Apache Spark and Zeppelin
Big Data visualization with Apache Spark and ZeppelinBig Data visualization with Apache Spark and Zeppelin
Big Data visualization with Apache Spark and Zeppelinprajods
 
Data Science with Spark & Zeppelin
Data Science with Spark & ZeppelinData Science with Spark & Zeppelin
Data Science with Spark & ZeppelinVinay Shukla
 
Sparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with SparkSparkly Notebook: Interactive Analysis and Visualization with Spark
Sparkly Notebook: Interactive Analysis and Visualization with Sparkfelixcss
 
Exploratory data analysis using apache lens and apache zeppelin
Exploratory data analysis using apache lens and apache zeppelinExploratory data analysis using apache lens and apache zeppelin
Exploratory data analysis using apache lens and apache zeppelinpraagarw
 

More Related Content

Viewers also liked

Apache Camel: The Swiss Army Knife of Open Source Integration
Apache Camel: The Swiss Army Knife of Open Source IntegrationApache Camel: The Swiss Army Knife of Open Source Integration
Apache Camel: The Swiss Army Knife of Open Source Integrationprajods
 
Apache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data ProcessingApache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data Processingprajods
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelinfelixcss
 
Streaming Python on Hadoop
Streaming Python on HadoopStreaming Python on Hadoop
Streaming Python on HadoopVivian S. Zhang
 
Installing Hadoop / Spark from scratch
Installing Hadoop / Spark from scratchInstalling Hadoop / Spark from scratch
Installing Hadoop / Spark from scratchAndrey Vykhodtsev
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLDavid Gleich
 
Event Driven Architecture with Apache Camel
Event Driven Architecture with Apache CamelEvent Driven Architecture with Apache Camel
Event Driven Architecture with Apache Camelprajods
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsSrinath Perera
 
Praxis and politics of urban data: Building the Dublin Dashboard
Praxis and politics of urban data: Building the Dublin DashboardPraxis and politics of urban data: Building the Dublin Dashboard
Praxis and politics of urban data: Building the Dublin Dashboardrobkitchin
 
Dublin dashboard launch
Dublin dashboard launchDublin dashboard launch
Dublin dashboard launchrobkitchin
 
The ethics of urban big data and smart cities
The ethics of urban big data and smart citiesThe ethics of urban big data and smart cities
The ethics of urban big data and smart citiesrobkitchin
 
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @ShanghaiLuke Han
 
Ethics and Politics of Big Data
Ethics and Politics of Big DataEthics and Politics of Big Data
Ethics and Politics of Big Datarobkitchin
 
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin DataWorks Summit/Hadoop Summit
 
Spark Under the Hood - Meetup @ Data Science London
Spark Under the Hood - Meetup @ Data Science LondonSpark Under the Hood - Meetup @ Data Science London
Spark Under the Hood - Meetup @ Data Science LondonDatabricks
 
Intro to Spark with Zeppelin
Intro to Spark with ZeppelinIntro to Spark with Zeppelin
Intro to Spark with ZeppelinHortonworks
 

Viewers also liked (20)

Apache Camel: The Swiss Army Knife of Open Source Integration
Apache Camel: The Swiss Army Knife of Open Source IntegrationApache Camel: The Swiss Army Knife of Open Source Integration
Apache Camel: The Swiss Army Knife of Open Source Integration
 
Cloudera Impala
Cloudera ImpalaCloudera Impala
Cloudera Impala
 
Apache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data ProcessingApache Spark: The Next Gen toolset for Big Data Processing
Apache Spark: The Next Gen toolset for Big Data Processing
 
SparkR + Zeppelin
SparkR + ZeppelinSparkR + Zeppelin
SparkR + Zeppelin
 
Streaming Python on Hadoop
Streaming Python on HadoopStreaming Python on Hadoop
Streaming Python on Hadoop
 
Installing Hadoop / Spark from scratch
Installing Hadoop / Spark from scratchInstalling Hadoop / Spark from scratch
Installing Hadoop / Spark from scratch
 
Recommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQLRecommendation and graph algorithms in Hadoop and SQL
Recommendation and graph algorithms in Hadoop and SQL
 
PyData Ljubljana meetup #1
PyData Ljubljana meetup #1PyData Ljubljana meetup #1
PyData Ljubljana meetup #1
 
Apache Zeppelin, Helium and Beyond
Apache Zeppelin, Helium and BeyondApache Zeppelin, Helium and Beyond
Apache Zeppelin, Helium and Beyond
 
Event Driven Architecture with Apache Camel
Event Driven Architecture with Apache CamelEvent Driven Architecture with Apache Camel
Event Driven Architecture with Apache Camel
 
ACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics Patterns
 
Praxis and politics of urban data: Building the Dublin Dashboard
Praxis and politics of urban data: Building the Dublin DashboardPraxis and politics of urban data: Building the Dublin Dashboard
Praxis and politics of urban data: Building the Dublin Dashboard
 
Dublin dashboard launch
Dublin dashboard launchDublin dashboard launch
Dublin dashboard launch
 
The ethics of urban big data and smart cities
The ethics of urban big data and smart citiesThe ethics of urban big data and smart cities
The ethics of urban big data and smart cities
 
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
 
Apache Hadoop Crash Course
Apache Hadoop Crash CourseApache Hadoop Crash Course
Apache Hadoop Crash Course
 
Ethics and Politics of Big Data
Ethics and Politics of Big DataEthics and Politics of Big Data
Ethics and Politics of Big Data
 
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin
Crash Course HS16Melb - Hands on Intro to Spark & Zeppelin
 
Spark Under the Hood - Meetup @ Data Science London
Spark Under the Hood - Meetup @ Data Science LondonSpark Under the Hood - Meetup @ Data Science London
Spark Under the Hood - Meetup @ Data Science London
 
Intro to Spark with Zeppelin
Intro to Spark with ZeppelinIntro to Spark with Zeppelin
Intro to Spark with Zeppelin
 

Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark

Editor's Notes

  1. https://docs.mesosphere.com/1-7/usage/services/zeppelin/