SlideShare a Scribd company logo
1 of 29
Download to read offline
1
OVH Analytics Data Compute
April 2019
Apache Spark Cluster as a Service
Mojtaba Imani
DevOps Cloud and Bigdata @OVH
2
Deep Learning
IoT Platform
Artificial Intelligence
Smart cities
Edge
Computing
Autonomous Vehicles
Blockchains
Biochips
Smart robots & drones
BY 2025, data will be x10 the total datasphere produced till 2016
- IDC Survey (2017) « Data Age 2025 »
The Cloud Transformation: A data odyssey…
“Exponential growth of data, its costs & governance:
IS YOUR BUSINESS READY FOR IT?”
3
90 % of the whole data in the world has been created in last 2 years.
Autonomous cars generate 20TB of data per hour
4
Computing Cluster
https://wiki.rc.hms.harvard.edu/display/O2/O2+HPC+Cluster+and+Computing+Nodes+Hardware
5
Apache Spark
• An open-source distributed general-purpose cluster-computing
framework.
• The largest open source community in big data
• Up to 100 times faster than hadoop mapReduce (in-memory
processing and lazy evaluation)
• The leading platform for large-scale SQL, batch processing, stream
processing, and machine learning
• Easy to use API and coding with Java, Scala, Python, R, and SQL
6
Apache Spark Cluster
source: http://www.buyukveri.co/en/apache-spark-architecture/
7
Computing Cluster
https://wiki.rc.hms.harvard.edu/display/O2/O2+HPC+Cluster+and+Computing+Nodes+Hardware
8
Problem?
• Time and money
• Computer and Network skills and knowledge
• Maintenance
• Spark version
• Scale up
• Idle times
• Cloud Data Connection
9
10
Computing Cluster
11
Cloud Virtual Computing Cluster
12
Problem?
• Time and money (installation and configuration)
• Computer and Network skills and knowledge
• Maintenance
• Spark version
• Scale up
• Idle times
• Cloud Data Connection
13
Problem?
• Time and money (installation and configuration)
• Computer and Network skills and knowledge
• Maintenance
• Spark version
• Scale up
• Idle times
• Cloud Data Connection
ovh-spark-submit
14
OVH Analytics Data Compute
15
OVH Analytics Data Compute
16
What is difference with original Spark-submit command?
17
OVH Analytics Data Compute
• Same command line and options as original Spark command line
• Select Spark version ( --version 2.4.0)
• Cluster is only accessible through HTTPS.
• A new and dedicated cluster will be created for each request and it will be
deleted after finishing the job.
• You have the option to keep your cluster after finishing the job. ( --keep-infra)
• Your cluster is isolated from internet.
• Your cluster computers are created in your own Openstack project.
• Results and output logs will be saved in swift of your Openstack project.
• Input and output of data can be any source or format of data
18
How to use? www.ovh.com - Espace Client
19
Download and use command line: ovh-spark-submit
Full manual: https://docs.ovh.com/gb/en/analytics-data-compute/labs/data-compute/getting-started-with-analytics-data-compute/
20
Customer Feedbacks
A French TV Channels provider:
Analytics Data Compute helps us on defining new jobs without having
impact on our production pipelines.
A French Car Manufacturer:
With such a huge amount of data we have, we needed a solution that
could scale easily. Moreover, we did not needed a full hadoop stack. So,
Analytics Data Compute was the perfect candidate.
A French Bank :
We had trouble with spark versions and spark cluster management. With
Analytics Data Compute, we no longer have to bother with that kind of
problems.
21
Global cloud hyperscalers concentration
cloud providers HQs dictate your data sovereignty options
Freedom Act
Cloud Act
GDPR
22
OVH, A GLOBAL HYPER-SCALE CLOUD PROVIDER
KEY FACTS & FIGURES
1,500,000+
CUSTOMERS*
2200+
EMPLOYEES WORLDWIDE
1,5 BILLION €
OF INVESTMENT OVER 5 YEARS
98%
OF HOSTING ROOMS FREE FROM AIR
CONDITIONING
28 DATACENTERS
350,000 SERVERS
5000+
Partners and communities
*May 2018
23
28 Datacenters
Owned, operated and racked by OVH
manufacturing units.
From sheet metal to automated
operations.
Hillsboro
r x1
Vint Hill
r x1
Beauharnois
r x7
Singapore
r x1
Sydney
r x1
Warsaw
r x1
London
r x1
Roubaix rx7
Gravelines rx2
Strasbourg rx4
Paris rx1
Frankfurt
r x1
24
We operate our own global network and server manufacturing
Keeping control on security, scalability & Quality of Service
P.U.E
1.09
1MServers
produced
(oct.2018)
33xPoints
of Presence
260K
Public cloud
instances
16
Tbps.total network
capacity
350K
Physical servers
Patented Watercooling and manufacturing Owned and operated Data centers Proprietary Network & Anti DDoS security
« No ingress/egress charges on your network usage. »
25
OVH Data Convergence Team
• Analytics Data Platform: A one-click pre-configured Hadoop stack designed to store and
process high volumes of data across OVH Public Cloud infrastructure.
• Analytics Data Compute: A one-time Apache Spark Cluster on the fly
• Analytics Data Collector: A cloud hosted agent to replicate, query and transport data
26
Demo:
Word Count of a big text file
27
Java + Swift Word Count Sample
28
Java + Amazon S3 Word Count Sample
29
Analytics Data Compute :
labs.ovh.com
docs.ovh.com

More Related Content

What's hot

Kappa Architecture, IoT of the cars - LibreCon 2016
Kappa Architecture, IoT of the cars - LibreCon 2016Kappa Architecture, IoT of the cars - LibreCon 2016
Kappa Architecture, IoT of the cars - LibreCon 2016LibreCon
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Bostonkbajda
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache icebergAlluxio, Inc.
 
Transforming Your Business with Fast Data – Five Use Case Examples
Transforming Your Business with Fast Data – Five Use Case ExamplesTransforming Your Business with Fast Data – Five Use Case Examples
Transforming Your Business with Fast Data – Five Use Case ExamplesVoltDB
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Icebergkbajda
 
Turnkey Riak KV Cluster
Turnkey Riak KV ClusterTurnkey Riak KV Cluster
Turnkey Riak KV ClusterJoe Olson
 
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB | InfluxDays...
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB  | InfluxDays...Sam Dillard [InfluxData] | Performance Optimization in InfluxDB  | InfluxDays...
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB | InfluxDays...InfluxData
 
Our journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleOur journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleItai Yaffe
 
Artik cloud deview 2016
Artik cloud   deview 2016Artik cloud   deview 2016
Artik cloud deview 2016NAVER D2
 
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...Toradex
 
Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alina Vilk
 
SC1 Workshop 2 Technical overview
SC1 Workshop 2 Technical overviewSC1 Workshop 2 Technical overview
SC1 Workshop 2 Technical overviewBigData_Europe
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...Databricks
 
Open source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsOpen source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsSoftwareMill
 
Stratio Sparta 2.0
Stratio Sparta 2.0Stratio Sparta 2.0
Stratio Sparta 2.0Stratio
 
How to Develop and Operate Cloud Native Data Platforms and Applications
How to Develop and Operate Cloud Native Data Platforms and ApplicationsHow to Develop and Operate Cloud Native Data Platforms and Applications
How to Develop and Operate Cloud Native Data Platforms and ApplicationsAlluxio, Inc.
 
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...OpenNebula Project
 
Using druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleUsing druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleItai Yaffe
 

What's hot (20)

Kappa Architecture, IoT of the cars - LibreCon 2016
Kappa Architecture, IoT of the cars - LibreCon 2016Kappa Architecture, IoT of the cars - LibreCon 2016
Kappa Architecture, IoT of the cars - LibreCon 2016
 
Presto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 BostonPresto talk @ Global AI conference 2018 Boston
Presto talk @ Global AI conference 2018 Boston
 
Building an open data platform with apache iceberg
Building an open data platform with apache icebergBuilding an open data platform with apache iceberg
Building an open data platform with apache iceberg
 
Transforming Your Business with Fast Data – Five Use Case Examples
Transforming Your Business with Fast Data – Five Use Case ExamplesTransforming Your Business with Fast Data – Five Use Case Examples
Transforming Your Business with Fast Data – Five Use Case Examples
 
Presto Summit 2018 - 09 - Netflix Iceberg
Presto Summit 2018  - 09 - Netflix IcebergPresto Summit 2018  - 09 - Netflix Iceberg
Presto Summit 2018 - 09 - Netflix Iceberg
 
Hybrid Cloud 2014
Hybrid Cloud 2014Hybrid Cloud 2014
Hybrid Cloud 2014
 
Turnkey Riak KV Cluster
Turnkey Riak KV ClusterTurnkey Riak KV Cluster
Turnkey Riak KV Cluster
 
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB | InfluxDays...
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB  | InfluxDays...Sam Dillard [InfluxData] | Performance Optimization in InfluxDB  | InfluxDays...
Sam Dillard [InfluxData] | Performance Optimization in InfluxDB | InfluxDays...
 
Our journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleOur journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scale
 
Artik cloud deview 2016
Artik cloud   deview 2016Artik cloud   deview 2016
Artik cloud deview 2016
 
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...
288 Core ARM® and 13’824 CUDA Core Microserver Cluster with Toradex Apalis Sy...
 
Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.Alexander Pavlenko, Java Software Engineer, DataArt.
Alexander Pavlenko, Java Software Engineer, DataArt.
 
SC1 Workshop 2 Technical overview
SC1 Workshop 2 Technical overviewSC1 Workshop 2 Technical overview
SC1 Workshop 2 Technical overview
 
Certificates
CertificatesCertificates
Certificates
 
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
An End-to-End Spark-Based Machine Learning Stack in the Hybrid Cloud with Far...
 
Open source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applicationsOpen source big data landscape and possible ITS applications
Open source big data landscape and possible ITS applications
 
Stratio Sparta 2.0
Stratio Sparta 2.0Stratio Sparta 2.0
Stratio Sparta 2.0
 
How to Develop and Operate Cloud Native Data Platforms and Applications
How to Develop and Operate Cloud Native Data Platforms and ApplicationsHow to Develop and Operate Cloud Native Data Platforms and Applications
How to Develop and Operate Cloud Native Data Platforms and Applications
 
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...
OpenNebulaConf2017EU: Elastic Clusters for Data Analysis by Carlos de Alfonso...
 
Using druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scaleUsing druid for interactive count distinct queries at scale
Using druid for interactive count distinct queries at scale
 

Similar to OVH Spark Cluster as a Service Overview

OVH Analytics Data Compute and Apache Spark as a Service
OVH Analytics Data Compute and Apache Spark as a ServiceOVH Analytics Data Compute and Apache Spark as a Service
OVH Analytics Data Compute and Apache Spark as a ServiceMojtaba Imani
 
Red hat's updates on the cloud & infrastructure strategy
Red hat's updates on the cloud & infrastructure strategyRed hat's updates on the cloud & infrastructure strategy
Red hat's updates on the cloud & infrastructure strategyOrgad Kimchi
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoTJim Haughwout
 
How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?OVHcloud
 
Open Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - OverviewOpen Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - OverviewKrishna-Kumar
 
All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight Mark Hinkle
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLSingleStore
 
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X Kai Wähner
 
Flexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoTFlexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoTconfluent
 
Cloud Platform for IoT
Cloud Platform for IoTCloud Platform for IoT
Cloud Platform for IoTNaoto Umemori
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePRAGMA PROGETTI
 
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, Smile
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, SmileOCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, Smile
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, SmileOCCIware
 
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017Marc Dutoo
 
HIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlHIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlSascha Oehl
 
Streaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_VirenderStreaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_Virendervithakur
 
Introduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformIntroduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformOVHcloud
 
IBM: The Linux Ecosystem
IBM: The Linux EcosystemIBM: The Linux Ecosystem
IBM: The Linux EcosystemKangaroot
 
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...Kai Wähner
 
Excellent slides on the new z13s announced on 16th Feb 2016
Excellent slides on the new z13s announced on 16th Feb 2016Excellent slides on the new z13s announced on 16th Feb 2016
Excellent slides on the new z13s announced on 16th Feb 2016Luigi Tommaseo
 

Similar to OVH Spark Cluster as a Service Overview (20)

OVH Analytics Data Compute and Apache Spark as a Service
OVH Analytics Data Compute and Apache Spark as a ServiceOVH Analytics Data Compute and Apache Spark as a Service
OVH Analytics Data Compute and Apache Spark as a Service
 
Red hat's updates on the cloud & infrastructure strategy
Red hat's updates on the cloud & infrastructure strategyRed hat's updates on the cloud & infrastructure strategy
Red hat's updates on the cloud & infrastructure strategy
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?
 
Open Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - OverviewOpen Source Edge Computing Platforms - Overview
Open Source Edge Computing Platforms - Overview
 
All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight All Things Open SDN, NFV and Open Daylight
All Things Open SDN, NFV and Open Daylight
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
IIoT / Industry 4.0 with Apache Kafka, Connect, KSQL, Apache PLC4X
 
Flexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoTFlexible and Scalable Integration in the Automation Industry/Industrial IoT
Flexible and Scalable Integration in the Automation Industry/Industrial IoT
 
Cloud Platform for IoT
Cloud Platform for IoTCloud Platform for IoT
Cloud Platform for IoT
 
Presentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobrePresentazione IBM Power System Evento Venaria 14 ottobre
Presentazione IBM Power System Evento Venaria 14 ottobre
 
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, Smile
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, SmileOCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, Smile
OCCIware presentation at EclipseDay in Lyon, November 2017, by Marc Dutoo, Smile
 
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017
Model and pilot all cloud layers with OCCIware - Eclipse Day Lyon 2017
 
HIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha OehlHIPAS UCP HSP Openstack Sascha Oehl
HIPAS UCP HSP Openstack Sascha Oehl
 
Streaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_VirenderStreaming Sensor Data Slides_Virender
Streaming Sensor Data Slides_Virender
 
Introduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data PlatformIntroduction to OVH Analytics Data Platform
Introduction to OVH Analytics Data Platform
 
IBM: The Linux Ecosystem
IBM: The Linux EcosystemIBM: The Linux Ecosystem
IBM: The Linux Ecosystem
 
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...
IoT Open Source Integration Comparison (Kura, Node-RED, Flogo, Apache Nifi, S...
 
Excellent slides on the new z13s announced on 16th Feb 2016
Excellent slides on the new z13s announced on 16th Feb 2016Excellent slides on the new z13s announced on 16th Feb 2016
Excellent slides on the new z13s announced on 16th Feb 2016
 

Recently uploaded

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFxolyaivanovalion
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxolyaivanovalion
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfLars Albertsson
 

Recently uploaded (20)

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 

OVH Spark Cluster as a Service Overview

  • 1. 1 OVH Analytics Data Compute April 2019 Apache Spark Cluster as a Service Mojtaba Imani DevOps Cloud and Bigdata @OVH
  • 2. 2 Deep Learning IoT Platform Artificial Intelligence Smart cities Edge Computing Autonomous Vehicles Blockchains Biochips Smart robots & drones BY 2025, data will be x10 the total datasphere produced till 2016 - IDC Survey (2017) « Data Age 2025 » The Cloud Transformation: A data odyssey… “Exponential growth of data, its costs & governance: IS YOUR BUSINESS READY FOR IT?”
  • 3. 3 90 % of the whole data in the world has been created in last 2 years. Autonomous cars generate 20TB of data per hour
  • 5. 5 Apache Spark • An open-source distributed general-purpose cluster-computing framework. • The largest open source community in big data • Up to 100 times faster than hadoop mapReduce (in-memory processing and lazy evaluation) • The leading platform for large-scale SQL, batch processing, stream processing, and machine learning • Easy to use API and coding with Java, Scala, Python, R, and SQL
  • 6. 6 Apache Spark Cluster source: http://www.buyukveri.co/en/apache-spark-architecture/
  • 8. 8 Problem? • Time and money • Computer and Network skills and knowledge • Maintenance • Spark version • Scale up • Idle times • Cloud Data Connection
  • 9. 9
  • 12. 12 Problem? • Time and money (installation and configuration) • Computer and Network skills and knowledge • Maintenance • Spark version • Scale up • Idle times • Cloud Data Connection
  • 13. 13 Problem? • Time and money (installation and configuration) • Computer and Network skills and knowledge • Maintenance • Spark version • Scale up • Idle times • Cloud Data Connection ovh-spark-submit
  • 16. 16 What is difference with original Spark-submit command?
  • 17. 17 OVH Analytics Data Compute • Same command line and options as original Spark command line • Select Spark version ( --version 2.4.0) • Cluster is only accessible through HTTPS. • A new and dedicated cluster will be created for each request and it will be deleted after finishing the job. • You have the option to keep your cluster after finishing the job. ( --keep-infra) • Your cluster is isolated from internet. • Your cluster computers are created in your own Openstack project. • Results and output logs will be saved in swift of your Openstack project. • Input and output of data can be any source or format of data
  • 18. 18 How to use? www.ovh.com - Espace Client
  • 19. 19 Download and use command line: ovh-spark-submit Full manual: https://docs.ovh.com/gb/en/analytics-data-compute/labs/data-compute/getting-started-with-analytics-data-compute/
  • 20. 20 Customer Feedbacks A French TV Channels provider: Analytics Data Compute helps us on defining new jobs without having impact on our production pipelines. A French Car Manufacturer: With such a huge amount of data we have, we needed a solution that could scale easily. Moreover, we did not needed a full hadoop stack. So, Analytics Data Compute was the perfect candidate. A French Bank : We had trouble with spark versions and spark cluster management. With Analytics Data Compute, we no longer have to bother with that kind of problems.
  • 21. 21 Global cloud hyperscalers concentration cloud providers HQs dictate your data sovereignty options Freedom Act Cloud Act GDPR
  • 22. 22 OVH, A GLOBAL HYPER-SCALE CLOUD PROVIDER KEY FACTS & FIGURES 1,500,000+ CUSTOMERS* 2200+ EMPLOYEES WORLDWIDE 1,5 BILLION € OF INVESTMENT OVER 5 YEARS 98% OF HOSTING ROOMS FREE FROM AIR CONDITIONING 28 DATACENTERS 350,000 SERVERS 5000+ Partners and communities *May 2018
  • 23. 23 28 Datacenters Owned, operated and racked by OVH manufacturing units. From sheet metal to automated operations. Hillsboro r x1 Vint Hill r x1 Beauharnois r x7 Singapore r x1 Sydney r x1 Warsaw r x1 London r x1 Roubaix rx7 Gravelines rx2 Strasbourg rx4 Paris rx1 Frankfurt r x1
  • 24. 24 We operate our own global network and server manufacturing Keeping control on security, scalability & Quality of Service P.U.E 1.09 1MServers produced (oct.2018) 33xPoints of Presence 260K Public cloud instances 16 Tbps.total network capacity 350K Physical servers Patented Watercooling and manufacturing Owned and operated Data centers Proprietary Network & Anti DDoS security « No ingress/egress charges on your network usage. »
  • 25. 25 OVH Data Convergence Team • Analytics Data Platform: A one-click pre-configured Hadoop stack designed to store and process high volumes of data across OVH Public Cloud infrastructure. • Analytics Data Compute: A one-time Apache Spark Cluster on the fly • Analytics Data Collector: A cloud hosted agent to replicate, query and transport data
  • 26. 26 Demo: Word Count of a big text file
  • 27. 27 Java + Swift Word Count Sample
  • 28. 28 Java + Amazon S3 Word Count Sample
  • 29. 29 Analytics Data Compute : labs.ovh.com docs.ovh.com