Big Data & Data visualization:
From the Lake to Your Screen
An afterwork by @OCTOSuisse
Geneva, May 9th, 2017
Joseph Glorieux
Alexandre Masselot
OCTO, DIGITAL TRANSFORMATION ACCELERATOR
DIGITAL
TRANSFORMATION
Facilitate and
Accelerate the adoption
of Digital Culture
-
Business, IT,
People
Consulting
& Delivery
OCTO TECHNOLOGY > THERE IS A BETTER WAY
BIG DATA @ OCTO : THE NUMBERS
TB, the biggest
volume of
distributed storage
on a single project
250
TB, the biggest
volume of data
analyzed by
OCTO’s data
scientists
>20
Is the number of
Big Data projects at
OCTO in the past
12 months
The number of
OCTO certified on
the Hadoop
platform
40
850
800
cores, the biggest
Hadoop cluster
built by OCTO
16 The number of active partnerships with
major Big Data actors
BIG DATA @ OCTO: PUBLICATIONS
BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN
1. From Data lake to your Mac
2. Explore, Understand, Communicate
3. About Data Visualization
4. Back to the Lake
LIMITATIONS OF TRADITIONAL ARCHITECTURES
« Traditional / standard » architectures (RDBMS, application server, ETL, ESB) reach their limits along four axes:
- Storage oriented applications (IO bound): over 10 TB, « classical » architectures require huge software and hardware adaptations. Response: distributed storage, share nothing (Storage Grid).
- Transaction oriented applications (transaction bound, TPS): over 1 000 transactions / second, « classical » architectures require huge software and hardware adaptations. Response: XTP (Transaction Grid).
- Computation oriented applications (CPU bound): over 10 threads / CPU core, sequential programming reaches its limits (IO). Response: parallel processing (Calculation Grid).
- Event flow oriented applications (message bound, streaming): over 1 000 events / second, « classical » architectures require huge software and hardware adaptations. Response: Event Stream Processing (Stream Grid).
BIG DATA - EMERGING FAMILIES
Application families:
- Event flow oriented applications: message bound (streaming)
- Transaction oriented applications: transaction bound (TPS)
- Storage oriented applications: IO bound
- Computation oriented applications: CPU bound
Emerging technology families:
- NoSQL: distributed non-relational stores.
- NewSQL: SQL compliant distributed stores.
- Streaming: CEP (Complex Event Processing), ESP (Event Stream Processing).
- Grid / GPU: grid computing on CPU, or on GPU.
- In-memory analytics: in-memory analytics solutions distribute the data in the memory of several nodes to obtain a low processing time.
- Hadoop: the Hadoop ecosystem offers distributed storage, but also distributed computing using MapReduce.
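As a rough illustration of the MapReduce model mentioned above, here is a minimal single-process sketch in Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample records are assumptions made for illustration, and on a real cluster each phase would run in parallel across nodes.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, value) pair per record: here (train line, delay in seconds)."""
    for line, delay_s in records:
        yield line, delay_s


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce each group to a single value: the mean delay per line."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


# Hypothetical sample records (line name, delay in seconds).
records = [("IC", 60), ("RE", 120), ("IC", 180), ("RE", 0)]
mean_delay = reduce_phase(shuffle(map_phase(records)))
print(mean_delay)
```

The point of the split is that the map and reduce steps only see one record or one key at a time, which is what lets a framework distribute them.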
MODELS & DATA
[Chart] Precision score for the TOP 20%, compared across four setups: traditional models; advanced models; advanced models with more data; advanced models with more data and more features.
NEW ARCHITECTURE PATTERN: THE DATALAKE
[Diagram] INTEGRATION, DATALAKE, PUBLICATION
- Sources (INTEGRATION): database, raw files, logs, external data / OpenAPI, messages & events
- DATALAKE storage: non-structured storage; semi-structured storage (NoSQL); structured storage (e.g. relational)
- DATALAKE processing: interactive requests, analytical processing, flow management, machine learning
- Consumers (PUBLICATION): enterprise DWH, operational system, reporting / requests, external data / OpenAPI, messages & events
PEAK OF INFLATED EXPECTATIONS?
From Datalake… … to Dataswamp
You do not need to store/compute petabytes of data…
Big Data? I should buy a Hadoop Cluster
THE 5-LEGGED SHEEP
Source : www.marketingdistillery.com
THE DATALAB
Why a DataLab?
- Limitations of a distributed environment for experimentation:
> fewer algorithms available,
> longer round trips imply slower experimentation,
> other programming paradigms
- It is not necessary to have all the data to experiment: statistically relevant samples are sufficient
Description
- The DataLab is a “sandbox” area where analysts should have great freedom with tools and data usage. It contains a work storage area where they can "play" with the data
- It lives outside of the Datalake to ensure and facilitate its exploitation
- Machine with lots of RAM and CPU to enable in-memory processing: single machine, vertical scalability, multi-user
DataLab components: analytics and machine learning tools, DataViz tools, work storage area
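Since the DataLab works on samples rather than on the full lake, one question is how to draw a uniform sample from a source too large to load. The sketch below uses reservoir sampling (Algorithm R), which is one common technique and not necessarily what the authors used; the function name and the generated rows are assumptions for illustration.

```python
import random


def reservoir_sample(rows, k, seed=None):
    """Return k rows drawn uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)        # fill the reservoir first
        else:
            j = rng.randint(0, i)     # both bounds inclusive
            if j < k:
                sample[j] = row       # keep this row with probability k / (i + 1)
    return sample


# Hypothetical usage: sample 1 000 rows out of a generator of 100 000.
big_source = (f"row-{n}" for n in range(100_000))
sample = reservoir_sample(big_source, k=1_000, seed=42)
print(len(sample))
```

The source is read once and only `k` rows are held in memory, which suits a single machine pulling a sample out of a much larger store.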
DATA SCIENCE LIFE CYCLE
DATALAB: ITERATIVE EXPERIMENTS
Who: data scientists
Activities:
- Data exploration environment
- Machine learning applied to key business questions
Outputs:
- POC
- Preliminary models
- Demos for communication
HADOOP CLUSTER: DEVELOPMENT
Who: developers
Activities:
- Developers implement selected models from the DataLab to run in a distributed environment
- Industrialize external/internal data flows
Outputs:
- Industrialized models
- Applications to access results
- Data ingestion programs
HADOOP: PRODUCTION
Who: business
Activities:
- Interact with the applications accessing the Data Lake and exposing results from models
Scheduled activities on cluster:
- Ingestion of historical data
- Compute associated with all deployed applications
Outputs:
- Populated Data Lake
- Models on distributed data
- Applications for end-users
THE NEW DATASCIENCE PLATFORM: ARCHITECTURE
[Diagram] Environments:
- DMZ/Cloud: reverse proxy (https)
- Personal computer / PAM laptop (Wifi, mobile)
- DataLab (R, Python): data and programs; explores the existing Data Lake
- DEV tools
- HDP SANDBOX (VM): Edge, Name Node, 2 Data Nodes
- HDP DEV (VM): Edge, Name Node, 2 Data Nodes
- HDP PROD (bare metal): Edge, 2 Name Nodes, 3 Data Nodes
- PROD transactional system
- AD ?
Data flows: ETL, https, ssh; data copy with masks / anonymisation; on-demand manual transfer with security check
Proxy rules: GET blacklist with POST/PUT whitelist (>30K); GET whitelist with POST/PUT whitelist
Legend: bare metal, virtual machine
A PERFECT USE CASE: SWISS PUBLIC TRANSPORT
Data types:
- Schedules
- Event storage
- Real time streaming
Usages:
- Data analysis
- Prediction
- End user application
Sources:
- opentransportdata.swiss
- transport.opendata.ch
- gtfs.geops.ch
QUESTION OF THE DAY: “IS MY TRAIN RUNNING LATE?”
“WILL MY TRAIN BE RUNNING LATE?”
WHAT IS INSIDE THE AVAILABLE DATA?
Multiple challenges:
- Different acquisition modes
- Different ids for the same entity
- Different languages (e.g. German)
- Different quality (e.g. missing data)
Multiple sources:
- opentransportdata.swiss
- transport.opendata.ch
- gtfs.geops.ch
EXPLORING THE DATA
Getting acquainted with the data:
- Download a bearable sample (40 million lines)
- Repeat:
1. build the lightest import process
2. observe
3. go to business expert to get insights
4. observe
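The "lightest import process" can be as small as a few lines. The sketch below streams a CSV sample and adds one computed value, the departure delay. The column names (`station`, `scheduled`, `actual`), the timestamp format and the three sample rows are hypothetical; the real open-data files use their own schema.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def read_delays(lines):
    """Yield (station, delay in seconds) per row, skipping rows with missing times."""
    for row in csv.DictReader(lines):
        if not row["scheduled"] or not row["actual"]:
            continue  # missing data: skip rather than guess
        scheduled = datetime.strptime(row["scheduled"], TIME_FORMAT)
        actual = datetime.strptime(row["actual"], TIME_FORMAT)
        yield row["station"], (actual - scheduled).total_seconds()


# Hypothetical three-row sample standing in for a downloaded file.
sample_csv = io.StringIO(
    "station,scheduled,actual\n"
    "Versoix,2017-05-09 07:00:00,2017-05-09 07:03:00\n"
    "Les Tuileries,2017-05-09 07:04:00,2017-05-09 07:05:10\n"
    "Genthod,2017-05-09 07:08:00,\n"
)
delays = list(read_delays(sample_csv))
print(delays)
```

Because rows are processed one at a time, the same function can be pointed at an open file handle on a multi-million-line sample without loading it whole.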
EXPLORE MY DATA?
EXPLORING THE DATA
The need for a visualization tool:
- interactive
- versatile
- handling large amounts of data (samples)
- loading data from various sources
- adding computed values
An Introduction to Tableau:
- Connect to data
- Create computed columns
- Join multiple tables
- Exploring and filtering out data
- Killing preconceived ideas: “InterCity trains are less frequently late”
ANALYZING DATA WITH NOTEBOOKS
- A notebook lets you write text and live code, wrapping together code, output and documentation.
- The full power of programming, interactivity, results and documentation, all in the same place.
- Features: language of choice, interactive widgets, shared notebooks, big data integration.
Loading and munging data
Figures and code
ANALYZING DATA
Building a model
1’ 3’ 5’ 6’ 11’ 12’ 14’ 15’ 20’ 22’ (departure shift)
When will my train leave Les Tuileries?
ANALYZING DATA WITH NOTEBOOKS
My train has a 90% chance of leaving Les Tuileries between 45s and 3’40s late.
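A statement such as "a 90% chance of leaving between two bounds" can be read as an empirical interval between the 5th and 95th percentiles of observed delays. The slides do not say how their interval was computed, so the sketch below is only one plausible reading; the function name and the observed values are assumptions.

```python
import statistics


def central_interval(delays_s, coverage=0.90):
    """Return (low, high) bounds enclosing roughly `coverage` of the observed delays."""
    # quantiles(n=100) returns the 99 percentile cut points P1..P99.
    cuts = statistics.quantiles(delays_s, n=100, method="inclusive")
    tail = round((1 - coverage) / 2 * 100)  # 5 for a 90% interval
    return cuts[tail - 1], cuts[100 - tail - 1]


# Hypothetical observed departure delays, in seconds: 0, 1, 2, ..., 200.
observed = list(range(201))
low, high = central_interval(observed)
print(low, high)
```

Restricting `observed` to one hour of the day before calling the function gives a conditional interval for that hour.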
At 7 AM, my train has a 90% chance of leaving Les Tuileries between 1’10s and 3’30s late.
At 7AM, between 1’15s and 3’30s!
~ consistent delays
If I know how late my train runs in Versoix, I can predict rather precisely how late it will be in Les Tuileries.
If it’s 3’ late in Versoix: between 50s and 1’20s!
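One simple way to turn a delay observed at Versoix into an expected delay at Les Tuileries is a least-squares line fitted on paired observations. The slides do not state which model was used, so this is an illustrative stand-in; the function names and the five data points are assumptions.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit: return (slope, intercept) for ys ≈ slope * xs + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical paired observations, in seconds: delay at Versoix, delay at Les Tuileries.
versoix = [0, 60, 120, 180, 240]
tuileries = [10, 30, 50, 70, 90]
slope, intercept = fit_line(versoix, tuileries)


def predict(delay_at_versoix_s):
    """Predicted delay at Les Tuileries for a given delay at Versoix."""
    return slope * delay_at_versoix_s + intercept


print(round(predict(180)))
```

A point prediction alone does not give the "between ... and ..." range quoted on the slide; that would additionally need the spread of the residuals around the fitted line.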
COMMUNICATION = INFORMATION WITH A MEANING
COMMUNICATING
Tableau can be turned into a communication tool:
- Sharing notebooks + data online
- Assemble and broadcast dashboards
- Design and share stories
Notebooks can be used for communication:
- Sharing data-generated documents with values, figures…
- Publishing on a URL
Browser power:
- D3.js: mapping data to browser DOM (time, station board, trains)
- Browser power with d3.js
- Browser power inheriting from d3.js
- sigma.js, cytoscape.js
- Browser power with high throughput data
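D3.js binds data items to DOM elements and derives their attributes from the data. The sketch below mimics that idea in Python rather than JavaScript, generating one SVG circle per data point through a linear scale. It is an analogy and not D3 code; the function names and the sample delays are assumptions.

```python
def linear_scale(domain, value_range):
    """Return a function mapping [d0, d1] linearly onto [r0, r1]."""
    (d0, d1), (r0, r1) = domain, value_range
    return lambda v: r0 + (v - d0) * (r1 - r0) / (d1 - d0)


def to_svg(delays_s, width=300, height=40):
    """Map each delay to one <circle>, its x position derived from the data."""
    # Assumes at least one strictly positive delay, so the domain is not empty.
    x = linear_scale((0, max(delays_s)), (10, width - 10))
    circles = "".join(
        f'<circle cx="{x(d):.0f}" cy="{height // 2}" r="4"/>' for d in delays_s
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">{circles}</svg>'
    )


# Hypothetical delays in seconds, one per train.
svg = to_svg([0, 70, 140])
print(svg)
```

Saving the printed string to a `.svg` file and opening it in a browser shows the three points laid out by delay.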
VISUALIZATION IS THE OXYGEN OF DATA SCIENCE
BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN
1. From Data lake to your Mac
2. Explore, Understand, Communicate
3. About Data Visualization
4. Conclusion
ANOTHER PERSPECTIVE ON VISUALIZATION
Who said that? When?
“There is danger in giving too much information to executives of small brain capacity.”
“As a cathedral is to its foundations, so is an effective presentation of the fact to the data.”
“The answer is that the executive of the future will be forced on the analysis of facts which have been collected and arranged for his instantaneous and continuous use.”
Willard C. Brinton, 1914
100yrsofbrinton.tumblr.com
1880: TEXTILE PRODUCTION IN ENGLAND (OTTO NEURATH, ~1920)
Changing the world by educating people about the world around them
PARIS-LYON TRAIN SCHEDULE (1880S)
Graphics Milestones: Time course of developments
[Figure] The distribution of milestone items over time, shown by a rug plot and density estimate (axes: year, density). Periods marked: early maps; measurement & theory; new graphic forms; begin modern period; golden age; modern dark ages; high-D vis.
Michael Friendly and Daniel J. Denis. https://www.researchgate.net/publication/221649568
THE PREVIOUS BIG DATA REVOLUTION (END 1800s)
VISUALIZATION THEORY & PRACTICE, BY EDWARD R. TUFTE
The most complete suite of classic books
VISUALIZATION THEORY & PRACTICE
“Pie charts are bad and that the only thing worse than one pie chart is lots of them.” E. Tufte
W. Brinton, W. Playfair (1801)
VISUALIZATION THEORY & PRACTICE
Mackinlay (1986) “Expressiveness Rule”
- Among several authors, Mackinlay (1986) stated an “expressiveness rule” for graphical display.
- The most important information shall use the following attributes (in priority order):
1. position;
2. size;
3. orientation;
4. shape;
5. color.
- The most important dimensions to communicate shall therefore use the first attributes.
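Read as a procedure, the rule pairs data dimensions, ranked by importance, with visual attributes in the priority order listed above. A minimal sketch, assuming hypothetical dimension names from the train dataset; the function name is also an assumption.

```python
# Visual attributes in the priority order given on the slide.
ATTRIBUTES = ["position", "size", "orientation", "shape", "color"]


def assign_attributes(dimensions_by_importance):
    """Pair each dimension (most important first) with the next best attribute."""
    if len(dimensions_by_importance) > len(ATTRIBUTES):
        raise ValueError("more dimensions than available visual attributes")
    return dict(zip(dimensions_by_importance, ATTRIBUTES))


# Hypothetical dimensions, most important first.
encoding = assign_attributes(["delay", "passenger count", "train type"])
print(encoding)
```

The mechanical pairing is only a starting point: whether a given attribute suits a given dimension (for example a categorical one) still has to be judged case by case.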
VISUALIZATION THEORY & PRACTICE
Data/Ink Ratio
Data-ink ratio = data-ink / total ink used to print the graphic = 1 - proportion of a graphic that can be erased
“Above all else show the data”
E. Tufte, 1983
“Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away”
Antoine de St Exupéry, Terre des Hommes, 1939
VISUALIZATION THEORY & PRACTICE
8% of males are color blind (0.6% of females)
Violating all principles
VISUALIZATION: NEW METHODS
- Three.js
- Interactive display wall (http://earlymodernconversions.com/activity/history-visualization-lab/)
- Virtual reality
- Data sonification (http://earlymodernconversions.com/activity/history-visualization-lab/)
- Animation: A Day in the Life of Americans – Nathan Yau
- Animation: Hans Rosling… The Revolutionary
DATA SCIENCE LIFE CYCLE
DATALAB: ITERATIVE EXPERIMENTS
HADOOP CLUSTER: DEVELOPMENT
HADOOP: PRODUCTION
WHAT IS DATA DRIVER?
- Data Driver is a platform for data science exploration/production
- Data Driver integrates all the know-how OCTO has acquired over 5 years
- Data Driver accelerates the path of your data science applications from development to production
By
A COMPANY: INFINITE OPPORTUNITIES FOR DATA SCIENCE
[Diagram] A company's functions, grouped under SUPPORT, CORE BUSINESS and ENTERPRISE MANAGEMENT: information system, HR, strategy, production, compliance and risk management, finance, R&D, sales and distribution, administration, supply chain, planning, marketing, after sales, procurement, …
- May 29 to 30, 2017, Geneva: New Information System Architectures
- July 8 to 9, 2017, Geneva: Discovering agile approaches and culture
- July 26 to 27, 2017, Geneva: The Web Giants: culture, practices, architecture
- May 15 to 17, 2017, Geneva: Data analysis for Hadoop 2.x (Hortonworks)
academy.octo.c
AVENUE DU THÉÂTRE, 7 – 1005 LAUSANNE > SUISSE > WWW.OCTO.CH
OCTO Suisse is hiring 5 consultants in 2017 (rejoins.octo.com):
Architect, Software Craftsman, DataGeek, Methodology Coach, DevOps Expert, Strategy Consultant
Questions ?

More Related Content

Similar to big data et data viz - du lac à votre écran - afterwork

Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsDataconomy Media
 
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...Spark Summit
 
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...Dataconomy Media
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioAlluxio, Inc.
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Maya Lumbroso
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Dataconomy Media
 
Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?Guido Schmutz
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsClusterpoint
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationInside Analysis
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemDan Eaton
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreHPCC Systems
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?Robert Grossman
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceeRic Choo
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...confluent
 
Café da manhã - São Paulo - Use-cases and opportunities in BigData with Hadoop
Café da manhã - São Paulo - Use-cases and opportunities in BigData with HadoopCafé da manhã - São Paulo - Use-cases and opportunities in BigData with Hadoop
Café da manhã - São Paulo - Use-cases and opportunities in BigData with HadoopOCTO Technology
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioCHAKER ALLAOUI
 
Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes John Archer
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 

Similar to big data et data viz - du lac à votre écran - afterwork (20)

Louise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx SystemsLouise McCluskey, Kx Engineer at Kx Systems
Louise McCluskey, Kx Engineer at Kx Systems
 
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...
Digitalising the Core – How Analytics is Shaping the Energy Industry Daniel J...
 
Practical advice to build a data driven company
Practical advice to build a data driven companyPractical advice to build a data driven company
Practical advice to build a data driven company
 
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...Stephen Cantrell, kdb+ Developer at Kx Systems  “Kdb+: How Wall Street Tech c...
Stephen Cantrell, kdb+ Developer at Kx Systems “Kdb+: How Wall Street Tech c...
 
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud using Intel Analytics Zoo & Alluxio
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
 
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
Ronan Corkery, kdb+ developer at Kx Systems: “Kdb+: How Wall Street Tech can ...
 
Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?Internet of Things (IoT) - in the cloud or rather on-premises?
Internet of Things (IoT) - in the cloud or rather on-premises?
 
High-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutionsHigh-performance database technology for rock-solid IoT solutions
High-performance database technology for rock-solid IoT solutions
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter Integration
 
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics EcosystemXDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
XDF 2019 Xilinx Accelerated Database and Data Analytics Ecosystem
 
Big Data and OSS at IBM
Big Data and OSS at IBMBig Data and OSS at IBM
Big Data and OSS at IBM
 
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio VillanustreBig Data Processing Beyond MapReduce by Dr. Flavio Villanustre
Big Data Processing Beyond MapReduce by Dr. Flavio Villanustre
 
What Are Science Clouds?
What Are Science Clouds?What Are Science Clouds?
What Are Science Clouds?
 
Scaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data ScienceScaling up with Cisco Big Data: Data + Science = Data Science
Scaling up with Cisco Big Data: Data + Science = Data Science
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Café da manhã - São Paulo - Use-cases and opportunities in BigData with Hadoop
Café da manhã - São Paulo - Use-cases and opportunities in BigData with HadoopCafé da manhã - São Paulo - Use-cases and opportunities in BigData with Hadoop
Café da manhã - São Paulo - Use-cases and opportunities in BigData with Hadoop
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
 
Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes Democratizing Data Science on Kubernetes
Democratizing Data Science on Kubernetes
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 

More from OCTO Technology Suisse

An afterwork on Microservices by @OCTO Technology Switzerland
An afterwork on Microservices  by @OCTO Technology SwitzerlandAn afterwork on Microservices  by @OCTO Technology Switzerland
An afterwork on Microservices by @OCTO Technology SwitzerlandOCTO Technology Suisse
 
Afterwork Devops : vision et pratiques
Afterwork Devops : vision et pratiquesAfterwork Devops : vision et pratiques
Afterwork Devops : vision et pratiquesOCTO Technology Suisse
 
Êtes-vous API dans votre organisation ?
Êtes-vous API dans votre organisation ?Êtes-vous API dans votre organisation ?
Êtes-vous API dans votre organisation ?OCTO Technology Suisse
 
Dev wednesday-swiss-transport-realtime
Dev wednesday-swiss-transport-realtimeDev wednesday-swiss-transport-realtime
Dev wednesday-swiss-transport-realtimeOCTO Technology Suisse
 
Polar Expeditions and Agility: the 1910 Race to the South Pole and Modern Tales
Polar Expeditions and Agility: the 1910 Race to the South Pole and Modern TalesPolar Expeditions and Agility: the 1910 Race to the South Pole and Modern Tales
Polar Expeditions and Agility: the 1910 Race to the South Pole and Modern TalesOCTO Technology Suisse
 
Afterwork Big Data - Data Science & Machine Learning : explorer, comprendre e...
Afterwork Big Data - Data Science & Machine Learning : explorer, comprendre e...Afterwork Big Data - Data Science & Machine Learning : explorer, comprendre e...
Big data and data viz - from the lake to your screen - afterwork

  • 1. TABLE OF CONTENTS Big Data & Data visualization: From the Lake to Your Screen An afterwork by @OCTOSuisse Geneva, May 9th, 2017 Joseph Glorieux Alexandre Masselot
  • 2. TABLE OF CONTENTS Big Data & Data visualization: From the Lake to Your Screen An afterwork by @OCTOSuisse Geneva, May 9th, 2017 Joseph Glorieux Alexandre Masselot
  • 3.
  • 4. OCTO, DIGITAL TRANSFORMATION ACCELERATOR DIGITAL TRANSFORMATION Facilitate and Accelerate the adoption of Digital Culture - Business, IT, People Consulting & Delivery OCTO TECHNOLOGY > THERE IS A BETTER WAY
  • 5. BIG DATA @ OCTO : THE NUMBERS TB, the biggest volume of distributed storage on a single project 250 TB, the biggest volume of data analyzed by OCTO’s data scientists >20 Is the number of Big Data projects at OCTO in the past 12 months The number of OCTO certified on the Hadoop platform 40 850 800 cores, the biggest Hadoop cluster built by OCTO 16 The number of active partnerships with major Big Data actors
  • 6. BIG DATA @ OCTO: PUBLICATIONS
  • 7. BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN About Data Visualization 1 2 3 4 From Data lake to your Mac Explore, Understand, Communicate Back to the Lake
  • 8. BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN About Data Visualization 1 2 3 4 From Data lake to your Mac Explore, Understand, Communicate Back to the Lake
  • 9. LIMITATIONS OF TRADITIONAL ARCHITECTURES Over 10 TB, « classical » architectures require huge software and hardware adaptations. Over 1 000 transactions / second, « classical » architectures require huge software and hardware adaptations. Over 10 threads per CPU core, sequential programming reaches its limits (IO). Over 1 000 events / second, « classical » architectures require huge software and hardware adaptations. Distributed storage Share nothing XTP Parallel processing Event Stream Processing « Traditional / Standard » architectures RDBMS, Application server, ETL, ESB Event flow oriented application Message Bound (streaming) Transaction oriented applications Transaction Bound (TPS) Storage oriented applications (IO bound) Computation oriented applications CPU bound (Stream Grid) (Calculation Grid) (Transaction Grid) (Storage Grid)
  • 10. BIG DATA - EMERGING FAMILIES Event flow oriented application Message Bound (streaming) Transaction oriented applications Transaction Bound (TPS) Storage oriented applications IO bound Computation oriented applications CPU bound NoSQL NewSQL NoSQL: distributed non-relational stores, NewSQL: SQL compliant distributed stores CEP - Complex Event Processing, ESP - Event Stream Processing Grid - GPU Grid computing on CPU, or on GPU In-memory analytics solutions distribute the data in the memory of several nodes to obtain a low processing time. In-memory analytics Hadoop The Hadoop ecosystem offers distributed storage, but also distributed computing using MapReduce. Streaming In-memory analytics NoSQL NewSQL Streaming Hadoop
  • 11. MODELS & DATA Traditional models Advanced models Advanced models with more data Advanced models with more data and more features Precision Precision score for the TOP 20%
  • 12. NEW ARCHITECTURE PATTERN: THE DATALAKE Non-structured storage Semi-structured storage (NoSQL) structured storage (ex. relational) Interactive requests Analytical processing Flow management Machine Learning Database Raw files Logs External data, OpenAPI Messages & Events Enterprise DWH Operational system Reporting, request External data, OpenAPI Messages & Events DATALAKE INTEGRATION PUBLICATION
  • 13. PEAK OF INFLATED EXPECTATIONS? From Datalake… … to Dataswamp You do not need to store/compute petabytes of data… Big Data? I should buy a Hadoop Cluster
  • 14. NEW ARCHITECTURE PATTERN: THE DATALAKE Non-structured storage Semi-structured storage (NoSQL) structured storage (ex. relational) Interactive requests Analytical processing Flow management Machine Learning Database Raw files Logs External data, OpenAPI Messages & Events Enterprise DWH Operational system Reporting, request External data, OpenAPI Messages & Events DATALAKE INTEGRATION PUBLICATION
  • 15. THE 5-LEGGED SHEEP Source: www.marketingdistillery.com
  • 16. THE DATALAB Why a DataLab? - Limitations of a distributed environment for experimentation: > fewer algorithms available, > longer round trips imply slower experimentation, > other programming paradigms - It is not necessary to have all the data for experimenting; statistically relevant samples are sufficient Description - The DataLab is a “sandbox” area where analysts should have great freedom with tools and data usage. It contains a work storage area that lets them "play" with the data - It lives outside of the Datalake to ensure and facilitate its exploitation - A machine with lots of RAM and CPU to enable in-memory processing: single machine, vertical scalability, multi-user DataLab Analytics Machine Learning Tools Storage DataViz Tools Work Storage Area
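The DataLab slide argues that a statistically relevant sample is enough for experimentation. One way to draw such a sample from an extract too large to hold in memory is reservoir sampling; the sketch below is only an illustration under that assumption, not the method used in the talk. The `big_extract` generator and its `event_id` field are hypothetical stand-ins for a data lake export.

```python
import random


def reservoir_sample(rows, k, seed=0):
    """Return up to k rows drawn uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds are inclusive
            if j < k:
                sample[j] = row
    return sample


# Hypothetical stand-in for a large extract streamed from the data lake.
big_extract = ({"event_id": n} for n in range(100_000))
sample = reservoir_sample(big_extract, k=1_000)
print(len(sample))
```

The stream is read once and only `k` rows are ever kept in memory, which suits the single-machine DataLab described on the slide.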
  • 17. DATA SCIENCE LIFE CYCLE DATALAB ITERATIVE EXPERIMENTS Data scientists Activities: - Data exploration environment - Machine learning applied to key business question - POC - Preliminary models - Demos for communication HADOOP CLUSTER DEVELOPMENT Developers Activities - Developers implement selected models from the DataLab to run in a distributed environment - Industrialize external/internal data flows - Model industrialized - Applications to access results - Data ingestion programs HADOOP PRODUCTION Business Activities - Interacts with the applications accessing the Data Lake and exposing results from models Scheduled activities on cluster - Ingestion of historical data - Compute associated with all deployed applications - Populated Data Lake - Models on distributed data - Applications for end-users
  • 18. ARCHITECTURE proxy proxy proxy Legend GET blacklist POST/PUT whitelist (>30K) proxy DMZ/Cloud Reverse proxy https data data, programs R, python data Reverse proxy On demand manual transfer Security check ETL, https, ssh Wifi mobile data flow Explore existing Data Lake masks / anonymisation data copy https ssh Laptop PAM PROD - System transactional PAM laptop Personal computer DataLab AD ? HDP PROD - Bare metal Edge Name Node Name Node Data Node Data Node Data Node DEV - Tools data GET whitelist POST/PUT whitelist proxy HDP DEV - VM Edge Name Node Data Node Data Node PAM HDP SANDBOX - VM Edge Name Node Data Node Data Node Bare metal Virtual machine
  • 19. THE NEW DATASCIENCE PLATFORM
  • 20. BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN About Data Visualization 1 2 3 4 From Data lake to your Mac Explore, Understand, Communicate Back to the Lake
  • 21. A PERFECT USE CASE: SWISS PUBLIC TRANSPORT Data types - Schedules - Event storage - Real time streaming Usages - Data analysis - Prediction - End user application Sources - opentransportdata.swiss - transport.opendata.ch - gtfs.geops.ch
  • 22.
  • 23. QUESTION OF THE DAY: “IS MY TRAIN RUNNING LATE?” “WILL MY TRAIN BE RUNNING LATE?”
  • 24. “WILL MY TRAIN BE RUNNING LATE?”
  • 25.
  • 26. WHAT IS INSIDE THE AVAILABLE DATA? Multiple challenges - Different acquisition modes - Different ids for the same entity - Different languages (e.g. German) - Different quality (e.g. missing data) Multiple sources - opentransportdata.swiss - transport.opendata.ch - gtfs.geops.ch
  • 27.
  • 28. EXPLORING THE DATA Getting acquainted with the data: - download a bearable sample (40 million lines) - Repeat 1. build the lightest import process 2. observe 3. go to business expert to get insights 4. observe
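The "build the lightest import process, then observe" loop on this slide can be sketched with nothing more than the Python standard library. The column names and rows below are invented for illustration and are not taken from the real open transport feeds; the point is only to show a first pass that counts rows and spots missing values before any heavier tooling is involved.

```python
import csv
import io
from collections import Counter

# Hypothetical extract; the real feeds use their own column names.
RAW = """train_id,stop_name,scheduled,actual
IC 1,Genève,07:00:00,07:01:10
IC 1,Lausanne,07:36:00,
RE 5,Versoix,07:12:00,07:15:00
"""


def profile(lines):
    """Count rows and empty cells per column in a CSV stream."""
    reader = csv.DictReader(lines)
    missing = Counter()
    n_rows = 0
    for row in reader:
        n_rows += 1
        for column, value in row.items():
            if not value:
                missing[column] += 1
    return n_rows, dict(missing)


n_rows, missing = profile(io.StringIO(RAW))
print(n_rows, missing)
```

A summary like this gives something concrete to bring to the business expert in step 3, for example asking why some stops have no recorded actual time.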
  • 29. EXPLORING THE DATA
  • 30. EXPLORE MY DATA?
  • 31. EXPLORING THE DATA The need for a visualization tool: - interactive - versatile - handling large amounts of data (samples) - loading data from various sources - adding computed values
  • 32. EXPLORING THE DATA - Connect to data An Introduction to Tableau
  • 33. EXPLORING THE DATA - Create computed columns An Introduction to Tableau
  • 34. EXPLORING THE DATA - Join multiple tables An Introduction to Tableau
  • 35. EXPLORING THE DATA An Introduction to Tableau
  • 36. EXPLORING THE DATA - Exploring and filtering out data An Introduction to Tableau
  • 37. EXPLORING THE DATA - Killing preconceived ideas: “InterCity trains are less frequently late” An Introduction to Tableau
  • 38. EXPLORING THE DATA An Introduction to Tableau
  • 39. EXPLORING THE DATA An Introduction to Tableau
  • 40.
  • 41. ANALYZING DATA WITH NOTEBOOKS - A notebook lets you write text and live code in order to wrap together code, output and documentation - The full power of programming, interactivity, results and documentation. All in the same place. Language of choice Interactive widgets Share notebooks Big data Integration
  • 42. ANALYZING DATA WITH NOTEBOOKS Loading and munging data
  • 43. ANALYZING DATA WITH NOTEBOOKS Figures and code
  • 44. ANALYZING DATA Building a model 1’ 3’ 5’ 6’ 11’ 12’ 14’ 15’ 20’ 22’ (departure shift)
  • 45. ANALYZING DATA Building a model 1’ 3’ 5’ 6’ 11’ 12’ 14’ 15’ 20’ 22’ (departure shift) When will my train leave Les Tuileries?
  • 46. ANALYZING DATA WITH NOTEBOOKS My train has a 90% chance of leaving Les Tuileries between 45s and 3’40s late Between 45s and 3’40s!
  • 47. ANALYZING DATA WITH NOTEBOOKS At 7 AM, my train has a 90% chance of leaving Les Tuileries between 1’10s and 3’30s late At 7AM, between 1’15s and 3’30s!
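Statements such as "90% chance of leaving between 45s and 3’40s late" can be read as a central interval between the 5th and 95th percentiles of observed delays. The sketch below assumes that reading, which the slides do not spell out, and uses invented observations (hour of day, delay in seconds) rather than the talk's data.

```python
from collections import defaultdict
from statistics import quantiles


def interval_90(delays_s):
    """Return the (5th, 95th) percentile bounds of a list of delays in seconds."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(delays_s, n=20, method="inclusive")
    return cuts[0], cuts[-1]


# Hypothetical observations: (hour of day, departure delay in seconds).
observations = [
    (7, 70), (7, 95), (7, 120), (7, 150), (7, 210),
    (18, 45), (18, 60), (18, 130), (18, 220), (18, 260),
]

# Group delays by hour to compare the overall interval with the 7 AM one.
by_hour = defaultdict(list)
for hour, delay in observations:
    by_hour[hour].append(delay)

overall = interval_90([delay for _, delay in observations])
at_7am = interval_90(by_hour[7])
print(overall, at_7am)
```

Conditioning on the hour, as the 7 AM slide does, simply means computing the same interval on a subset of the observations; with real data the narrower subset usually gives a tighter interval.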
  • 48. ANALYZING DATA WITH NOTEBOOKS ~ consistent delays
  • 49. ANALYZING DATA WITH NOTEBOOKS If I know how late my train runs in Versoix, I can predict rather precisely how late it will be in Les Tuileries If it’s 3’ late in Versoix, between 50s and 1’20s!
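Predicting the delay at Les Tuileries from the delay observed at Versoix can be approximated, as a first attempt, by an ordinary least-squares line. The slides do not say which model was used, so this is a sketch under that assumption, with made-up paired delays in seconds. It returns a point estimate only, whereas the slide reports a range.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical paired delays (seconds) for the same trains at both stops.
versoix = [0, 60, 120, 180, 240]
tuileries = [5, 25, 45, 65, 85]

slope, intercept = fit_line(versoix, tuileries)
predicted = slope * 180 + intercept  # train running 3 minutes late in Versoix
print(round(predicted))
```

To reproduce a range like the one on the slide, the residuals around the fitted line would need to be summarised as well, for instance with percentile bounds.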
  • 50. ANALYZING DATA
  • 51.
  • 52. COMMUNICATION = INFORMATION WITH A MEANING
  • 53. COMMUNICATING
  • 54. COMMUNICATING
  • 55. COMMUNICATING - Sharing notebooks + data online - Assemble and broadcast dashboards - Design and share stories Tableau can be turned into a communication tool
  • 56. COMMUNICATING - Sharing data-generated documents with values, figures… - Publishing at a URL Notebooks can be used for communication
  • 58. COMMUNICATING D3.js: mapping data to browser DOM Browser power time station board trains
  • 59. COMMUNICATING
  • 60. COMMUNICATING Browser power with d3.js
  • 61. COMMUNICATING Browser power inheriting from d3.js
  • 62. COMMUNICATING Browser power sigma.js cytoscape.js
  • 63. COMMUNICATING Browser power with high throughput data
  • 64. VISUALIZATION IS THE OXYGEN OF DATA SCIENCE
  • 65. BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN About Data Visualization 1 2 3 4 From Data lake to your Mac Explore, Understand, Communicate Conclusion
  • 66. ANOTHER PERSPECTIVE ON VISUALIZATION Who said that? When? “There is danger in giving too much information to executives of small brain capacity.” “As a cathedral is to its foundations, so is an effective presentation of the fact to the data.” “The answer is that the executive of the future will be forced on the analysis of facts which have been collected and arranged for his instantaneous and continuous use.”
  • 67. ANOTHER PERSPECTIVE ON VISUALIZATION 1914 Willard C. Brinton 100yrsofbrinton.tumblr.com
  • 68. ANOTHER PERSPECTIVE ON VISUALIZATION 100yrsofbrinton.tumblr.com
  • 69. 1880: TEXTILE PRODUCTION IN ENGLAND (OTTO NEURATH, ~1920) Changing the world by educating people about the world around them
  • 70.
  • 71.
  • 72. PARIS-LYON TRAIN SCHEDULE (1880S)
  • 73. Early maps Measurement & Theory New graphic forms Golden age Begin modern period Modern dark ages High-D Vis Density Year The distribution of milestone items over time, shown by a rug plot and density estimate. Michael Friendly and Daniel J. Denis. https://www.researchgate.net/publication/221649568 Graphics Milestones: Time course of developments
  • 74. THE PREVIOUS BIG DATA REVOLUTION (LATE 1800s)
  • 75. VISUALIZATION THEORY & PRACTICE, BY EDWARD R. TUFTE The most complete suite of classic books
  • 76. VISUALIZATION THEORY & PRACTICE “Pie charts are bad and that the only thing worse than one pie chart is lots of them.” E. Tufte W. Brinton W. Playfair (1801)
  • 77. VISUALIZATION THEORY & PRACTICE
  • 78. VISUALIZATION THEORY & PRACTICE - Among several authors, Mackinlay (1986) stated an “expressiveness rule” for graphical display. - The most important information shall use the following attributes (in priority order): 1. position; 2. size; 3. orientation; 4. shape; 5. color. - And the most important dimensions to communicate shall therefore use the first attributes. Mackinlay (1986) “Expressiveness Rule”
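The priority order on this slide can be turned into a small helper that hands the strongest visual attribute to the most important data dimension. This is only a sketch of the rule as the slide states it: it ignores the data type of each field, which a real chart design would also weigh, and the field names are hypothetical.

```python
# Visual attributes in the priority order given on the slide.
CHANNELS = ["position", "size", "orientation", "shape", "color"]


def assign_channels(fields_by_importance):
    """Map fields (most important first) to visual attributes in priority order."""
    if len(fields_by_importance) > len(CHANNELS):
        raise ValueError("more fields than available visual attributes")
    return dict(zip(fields_by_importance, CHANNELS))


# Hypothetical dimensions of a train-delay chart, most important first.
encoding = assign_channels(["delay", "train_type", "station"])
print(encoding)
```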
  • 79. VISUALIZATION THEORY & PRACTICE Data/Ink Ratio Data-ink ratio = data-ink / total ink used to print the graphic = 1 - proportion of a graphic that can be erased “Above all else show the data” E. Tufte, 1983
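The data-ink ratio on this slide is a simple fraction, so a worked example may help. The ink amounts below are invented (think of them as counts of printed pixels) and only illustrate the arithmetic.

```python
def data_ink_ratio(data_ink, total_ink):
    """Share of a graphic's ink that actually encodes data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    return data_ink / total_ink


# Hypothetical chart: 300 units of ink carry data out of 1200 printed in total.
ratio = data_ink_ratio(300, 1200)
erasable = 1 - ratio  # proportion of the graphic that can be erased
print(ratio, erasable)
```

In this made-up case three quarters of the ink (grid lines, borders, decoration) could be removed without losing any data, which is the situation the slide's quote warns against.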
  • 80.
  • 81. VISUALIZATION THEORY & PRACTICE Data/Ink Ratio “Perfection is achieved not when there is nothing more to add, but when there is nothing left to take away” Antoine de Saint-Exupéry, Terre des Hommes, 1939
  • 82. VISUALIZATION THEORY & PRACTICE 8% of males are color blind (0.6% of females)
  • 83. VISUALIZATION THEORY & PRACTICE Violating all principles
  • 84. VISUALIZATION: NEW METHODS Three.js
  • 85. VISUALIZATION: NEW METHODS Interactive display wall http://earlymodernconversions.com/activity/history-visualization-lab/
  • 86. VISUALIZATION: NEW METHODS Virtual reality
  • 87. VISUALIZATION: NEW METHODS Data sonification http://earlymodernconversions.com/activity/history-visualization-lab/
  • 88. VISUALIZATION: NEW METHODS Animation A Day in the Life of Americans – Nathan Yau
  • 89. VISUALIZATION: NEW METHODS Animation Hans Rosling… The Revolutionary
  • 90. BIG DATA & DATAVIZ : FROM THE LAKE TO YOUR SCREEN About Data Visualization 1 2 3 4 From Datalake to your mac Explore, understand, communicate Back to The Lake
  • 91. ARCHITECTURE proxy proxy proxy Legend GET blacklist POST/PUT whitelist (>30K) proxy DMZ/Cloud Reverse proxy https data data, programs R, python data Reverse proxy On demand manual transfer Security check ETL, https, ssh Wifi mobile data flow Explore existing Data Lake masks / anonymisation data copy https ssh Laptop PAM PROD - System transactional PAM laptop Personal computer DataLab AD ? HDP PROD - Bare metal Edge Name Node Name Node Data Node Data Node Data Node DEV - Tools data GET whitelist POST/PUT whitelist proxy HDP DEV - VM Edge Name Node Data Node Data Node PAM HDP SANDBOX - VM Edge Name Node Data Node Data Node Bare metal Virtual machine
  • 92. DATA SCIENCE LIFE CYCLE DATALAB ITERATIVE EXPERIMENTS HADOOP CLUSTER DEVELOPMENT HADOOP PRODUCTION
  • 93. WHAT IS DATA DRIVER? - Data Driver is a platform for data science exploration/production - Data Driver integrates all the know-how OCTO has acquired over 5 years - Data Driver accelerates the path of your data science applications to production By
  • 94. A COMPANY: INFINITE OPPORTUNITIES FOR DATA SCIENCE SUPPORT Information System HR Strategy Production Compliance, risk management Finance CORE BUSINESS R&D Sales, distribution ENTERPRISE MANAGEMENT … Administration … Supply chain Planning Marketing After sales … Procurement
  • 95. - May 29-30, 2017 in Geneva: New Information System Architectures academy.octo.c - July 8-9, 2017 in Geneva: Discovering agile approaches and culture - July 26-27, 2017 in Geneva: The web giants: culture - practices - architecture - May 15-17, 2017 in Geneva: Data analysis for Hadoop 2.x Hortonworks
  • 96. AVENUE DU THÉÂTRE, 7 – 1005 LAUSANNE > SUISSE > WWW.OCTO.CH OCTO Suisse IS HIRING 5 consultants in 2017 rejoins.octo.com Architect Software Craftsman DataGeek Methodology Coach DevOps Expert Strategy Consultant Questions?