• ANY OTHER
PROVIDER PROVISIONING TOOLS
• HTTPS://WWW.VAGRANTUP.COM/DOWNLOADS.HTML
• VIRTUALIZATION
GUEST OPERATING SYSTEMS
• HTTPS://WWW.VIRTUALBOX.ORG/WIKI/DOWNLOADS
• HTTPS://GITHUB.COM/FELIXCHEUNG/VAGRANT-PROJECTS
• SPARK-CASSANDRA-ZEPPELIN
HTTPS://ZEPPELIN.INCUBATOR.APACHE.ORG/
• HTTPS://GITHUB.COM/FELIXCHEUNG/SPARK-NOTEBOOK-EXAMPLES/TREE/MASTER/ZEPPELIN_NOTEBOOK/APACHECON2016
• HTTP://SPARK.APACHE.ORG/DOCS/LATEST/CONFIGURATION.HTML
• HTTPS://GITHUB.COM/FELIXCHEUNG/SPARK-NOTEBOOK-EXAMPLES/TREE/MASTER/ZEPPELIN_NOTEBOOK/APACHECON2016
• PARTITION
CLUSTER MEAN PROTOTYPE
• HTTP://THEORY.STANFORD.EDU/~SERGEI/PAPERS/VLDB12-KMPAR.PDF
K-MEANS++
• STREAMING K-MEANS
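
The k-means keywords above (partition, cluster, mean, prototype, k-means++) can be illustrated with a minimal sketch. It is plain Python rather than Spark MLlib, and it is not taken from the talk's notebooks. The function names `nearest`, `kmeans_pp_init` and `kmeans` are hypothetical, chosen only for this example. It assumes Python 3.8+ (for `math.dist`) and at least `k` distinct input points.

```python
import math
import random


def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def kmeans_pp_init(points, k, rng):
    """k-means++ style seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest center so far.
    Assumes at least k distinct points, so the weights never all reach zero."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, iterations=20, seed=0):
    """Partition points into k clusters; each cluster's mean is its prototype."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iterations):
        # Assignment step: every point joins the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Update step: recompute each mean; an empty cluster keeps its old center.
        centers = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    labels = [nearest(p, centers) for p in points]
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = kmeans(data, k=2)
    print(centers, labels)
```

The seeding step is the only difference from plain random initialization; the assignment and update loop is the usual Lloyd iteration.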
GRAPHFRAMES
• BIGTABLE: A DISTRIBUTED STORAGE SYSTEM FOR STRUCTURED DATA
• HTTPS://HBASE.APACHE.ORG/BOOK.HTML#QUICKSTART
• BORN AT FACEBOOK, BUILT ON AMAZON’S DYNAMO AND GOOGLE’S BIGTABLE
• HTTP://WIKI.APACHE.ORG/CASSANDRA/GETTINGSTARTED
• HTTPS://WWW.DIGITALOCEAN.COM/COMMUNITY/TUTORIALS/HOW-TO-INSTALL-CASSANDRA-AND-RUN-A-SINGLE-NODE-CLUSTER-ON-A-UBUNTU-VPS
• M3.XLARGE
http://www.natalinobusa.com/2015/11/why-is-smack-stack-all-rage-lately.html
• HTTPS://DOCS.MESOSPHERE.COM/ADMINISTRATION/INSTALLING/CLOUD/AWS/
• HTTPS://DCOS.IO/DOCS/1.7/ADMINISTRATION/INSTALLING/CLOUD/AWS/
• https://dcos.io/docs/1.7/usage/tutorials/spark/
• HTTPS://GITHUB.COM/FELIXCHEUNG


Interactive Data Science From Scratch with Apache Zeppelin and Apache Spark

Editor's Notes

  1. https://docs.mesosphere.com/1-7/usage/services/zeppelin/