hadoopsphere
Future of Data Visualization
HadoopSphere Virtual Conclave
August 2015
Commonly understood components of data visualization
• Graphs, maps, tables, shapes
• WYSIWYG editors
• Dashboards
• HTML5 views
• Infographics
Defining data visualization
• Data visualization is the presentation of data in a pictorial or
graphical format. - Wikipedia
• Data visualization is a visual representation of the insights
gained from your analysis. - Datameer
Emerging Trends
• New Channels
– Mobile, VR devices
• More interactive charts
– Redraw, filter, annotations
• Multidimensional visual
– VR, GL
• Network visualization
– Social, Linkages
• Collaborations
– Share, Review, Workflow
• And we may have ‘audiolizations’ as well
– Audio narrations
Process of data visualization
Prepare
Explore
Design
Deliver
Challenges
Access to data
Parse data
Central data access
Fast queries
Complex visual types
Linked Views
Data mining
Collaboration
Workflow
Introducing Apache Zeppelin
[Diagram labels: HDFS / Data Store; Operations; Governance/Security; YARN; Spark / Flink / Tajo …]
• Apache Zeppelin is a web-based multi-purpose notebook for interactive data
analysis.
• It is a 100% open source incubator project of the Apache Software Foundation.
• According to HadoopSphere, Apache Zeppelin will influence big data visualization
tools for the next two years or more.
Zeppelin Notebook
• A web-based notebook that
enables interactive data
analytics.
• You can type in code in SQL,
Scala and more in the
notebook.
• Run the commands directly
from the notebook.
Source for this slide and subsequent slides:
(1) http://zeppelin.apache.org
(2) Lee Moon Soo, Introduction to Zeppelin, ApacheCon 2015
Zeppelin user interface
Behind the scenes
• Java-based backend
• Active development community
- Built-in Apache Spark integration
- Uses AngularJS, D3.js
- Tested on Mac OS X, Ubuntu 14.x, CentOS 6.x
Zeppelin features – Visualization
• Some basic charts are
currently included in Zeppelin
and more will be added in
the future.
• Visualizations are not limited
to Spark SQL queries:
relational output from many
other language backends can
be recognized and visualized.
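As a rough illustration of how relational output from any backend might be made recognizable, the sketch below formats rows as tab-separated text behind a `%table` prefix. This assumes Zeppelin's display-system convention (header row first, tabs between columns, newlines between rows); the helper name `to_zeppelin_table` is hypothetical and not part of Zeppelin.

```python
def to_zeppelin_table(headers, rows):
    """Format rows as text that a %table-style display is assumed to recognize.

    Assumption: the output starts with "%table ", columns are separated by
    tabs, and rows are separated by newlines, with the header row first.
    """
    lines = ["\t".join(str(h) for h in headers)]
    for row in rows:
        lines.append("\t".join(str(cell) for cell in row))
    return "%table " + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Printing this from a notebook paragraph would hand the table to the chart UI.
    print(to_zeppelin_table(["name", "size"], [("sun", 100), ("moon", 10)]))
```

Because the format is plain text, any interpreter that can print a string could in principle feed the same charts.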
Zeppelin features – Pivots
• With simple drag and drop,
Zeppelin aggregates the
values and displays them in
a pivot chart.
• You can easily create a chart
with multiple aggregated
values including sum, count,
average, min, max.
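The aggregation behind a pivot chart can be sketched in a few lines. This is an illustrative stand-in, not Zeppelin's implementation: the function `pivot_aggregate` and the sample field names `region` and `sales` are made up for the example.

```python
from collections import defaultdict


def pivot_aggregate(rows, key, value):
    """Group rows by `key` and compute the five aggregates named on the slide."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        group: {
            "sum": sum(values),
            "count": len(values),
            "average": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
        for group, values in groups.items()
    }


if __name__ == "__main__":
    sample = [
        {"region": "east", "sales": 10},
        {"region": "east", "sales": 30},
        {"region": "west", "sales": 5},
    ]
    for region, stats in pivot_aggregate(sample, "region", "sales").items():
        print(region, stats)
```

In the notebook the same grouping is chosen by dragging a column into the keys area and another into the values area, rather than by writing code.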
Zeppelin features – Dynamic forms
• Zeppelin can dynamically take
inputs in forms as part of the
notebook.
• These dynamic forms can be
used to see input-based results
or render charts.
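One way to picture a dynamic form is as a placeholder in the paragraph text that is replaced by the user's input before the paragraph runs. The sketch below assumes a `${name=default}` placeholder syntax, which is how Zeppelin's text-input forms are usually written; the function `fill_form` and the sample query are hypothetical.

```python
import re

# Matches ${name} or ${name=default}; the syntax is an assumption for this sketch.
FORM_PATTERN = re.compile(r"\$\{(\w+)(?:=([^}]*))?\}")


def fill_form(template, inputs):
    """Replace each placeholder with the user's input, or its default if none is given."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        fallback = default if default is not None else ""
        return str(inputs.get(name, fallback))

    return FORM_PATTERN.sub(substitute, template)


if __name__ == "__main__":
    query = "SELECT * FROM bank WHERE age < ${maxAge=30}"
    print(fill_form(query, {}))              # uses the default value
    print(fill_form(query, {"maxAge": 45}))  # uses the value typed into the form
```

Rerunning the paragraph with a new input value then refreshes the result table or chart.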
Zeppelin features – Collaboration and publishing
• A notebook URL can be shared
among collaborators. Zeppelin
then broadcasts any changes
in real time, much like
collaboration in Google Docs.
• Zeppelin provides a URL that
displays only the results, which
can easily be embedded as an
iframe inside a web page.
Zeppelin interpreter architecture
• A Zeppelin Interpreter is a connector between Zeppelin and a backend data processing
system. For example, to use Scala code in Zeppelin, you need the Scala interpreter.
• Every Interpreter belongs to an InterpreterGroup, which is the unit in which
interpreters are started and stopped. Interpreters in the same InterpreterGroup can
reference each other. For example, SparkSqlInterpreter can reference SparkInterpreter
to get the SparkContext from it, because they are in the same group.
[Diagram: ZeppelinServer connects over Thrift to an InterpreterGroup running in a separate JVM process. The Spark group holds the Spark, PySpark, SparkSQL and Dep interpreters, which share a single SparkDriver attached to the Spark cluster; Dep loads libraries from a Maven repository.]
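The interpreter-group relationship described on this slide can be sketched with a few small classes. These are simplified Python stand-ins written for illustration; they mirror the names on the slide but are not Zeppelin's actual Java classes, and the `context` attribute is a placeholder for a real SparkContext.

```python
class InterpreterGroup:
    """Unit of start/stop; interpreters inside it can look each other up."""

    def __init__(self, name):
        self.name = name
        self.interpreters = {}
        self.running = False

    def add(self, interpreter):
        interpreter.group = self
        self.interpreters[interpreter.name] = interpreter

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class SparkInterpreter:
    name = "spark"

    def __init__(self):
        self.group = None
        self.context = object()  # placeholder for a real SparkContext


class SparkSqlInterpreter:
    name = "sql"

    def __init__(self):
        self.group = None

    def get_context(self):
        # Reuse the context owned by the Spark interpreter in the same group.
        return self.group.interpreters["spark"].context


if __name__ == "__main__":
    group = InterpreterGroup("spark")
    spark, sql = SparkInterpreter(), SparkSqlInterpreter()
    group.add(spark)
    group.add(sql)
    group.start()
    print(sql.get_context() is spark.context)
```

Sharing one context per group is what lets the Spark, PySpark and SparkSQL paragraphs of a notebook operate on the same data.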
Zeppelin interaction ecosystem
* includes future roadmap components
Getting involved with Zeppelin
• http://zeppelin.apache.org/
• http://github.com/apache/incubator-zeppelin
Installation reference:
• http://hortonworks.com/blog/introduction-to-data-science-with-apache-spark/
• http://nflabs.github.io/z-manager/
Mailing List
• users@zeppelin.incubator.apache.org
Other Notebook options
• IPython Notebook
• Beaker
• Spark-Notebook
• Databricks Cloud Notebook
Thank you
scale@hadoopsphere.com
Twitter: @hadoopsphere
