2
Who are we?
Hubert Stefani, Chief Innovation Officer
Bastien Verdebout, Product Manager data, OVHcloud
Erika Gelinard, Product Marketing, OVHcloud
7
Storage : File, Block, Object, databases, metrics, logs, … + Private Network
Dedicated Servers
Servers
Hosted Private Cloud
VMs
Dedicated
hardware
✓ Private by design (hw+network)
✓ Certification (healthcare/banking)
✓ VMware - based
Public Cloud
Instances
Shared
hardware
✓ Scalability
✓ Automation
✓ Ecosystem
Ecosystem (K8s, big data, AI, …)
Like…
8
AI / ML
✓ Strong CPU
✓ Up to 1.5TB RAM / 112TB Storage SSD
✓ GPU options (P100/V100)
Start at
507€ excl. tax/month
In-memory database
✓ From 96GB to 1.5TB RAM
✓ HDD disks
✓ GPU options (P100/V100)
Start at
519€ excl. tax/month
IOPS intensive
✓ High perf storage SSD NVMe
✓ Up to 384GB RAM
✓ Intel Optane technology
Start at
585€ excl. tax/month
Big data analytics
✓ Strong CPU + lots of storage (30TB SSD)
✓ Up to 1.5TB RAM / 38TB storage SSD
✓ GPU options (P100/V100)
Start at
597€ excl. tax/month
9
IOPS intensive
✓ High perf storage SSD NVMe
✓ Unbeatable price/perf ratio
Amount of IOPS / $1
Full report : https://dochub.cloudspectator.com/ovh-iops-series/
GPU
✓ Powered by NVIDIA Tesla
✓ Up to 180GB RAM
✓ Local SSD storage
Start at
1199€ excl. tax/month
Start at
199€ excl. tax/month
As always, simple and predictable pricing:
- Ingress/egress network traffic included
- Reserved resources (CPU/RAM)
- IPv4/v6 included
- …
12
Process
Analyze
Big data cluster (Hadoop)
Managed Cloudera (Hadoop)
Data Processing (Spark) new
Benefit from a SMART cloud all the way
Learn
Consume
NVIDIA NGC (GPU-accelerated apps)
Machine Learning Serving
AI API Marketplace
AI Training (GPU aaS)
Jupyter Notebooks
soon
Ingest
No products, but compatible with:
✓ Debezium
✓ Dremio
✓ Striim
✓ Stitch Data
✓ …
Store
Object Storage
Block Storage
Cold Storage
Managed databases
Timeseries databases
Logs databases
NAS-HA, Cloud Disk Array, …
new
soon
13
Public Cloud Private Cloud
Agnostic
Object Storage
✓ Powered by OpenStack Swift
✓ Compliant with
✓ Unlimited growth
✓ Pay as you Go
Block Storage
✓ Powered by OpenStack Ceph
✓ Volumes from 10GB to 4TB
✓ Pay as you Go
Datastores
✓ SSD powered
✓ From 2TB to 36TB
Private
network
NAS-HA
✓ Managed NAS (NFS/CIFS protocols)
✓ From 2TB to 36TB
Cloud Disk Array
✓ Powered by Ceph (RADOS Block Device)
✓ From 2TB to Petabytes
14
Relational databases Logs & Metrics
Cold storage
Cloud Databases
✓ Best for sandbox/dev
✓ Multi-tenant
✓ From 500MB to 2GB RAM
✓ MySQL, PostgreSQL, MariaDB
Enterprise Cloud Databases
✓ HA clustered infra (min. 3 nodes)
✓ Dedicated hardware (single tenant)
✓ From 16GB to 256GB RAM per node
✓ PostgreSQL (MySQL soon)
Logs Data Platform
✓ Index and Analyze
✓ Can ingest thousands of lines per second
✓ Compatible with Graylog, Kibana, Grafana, …
Cloud Archive
✓ Powered by OpenStack Swift
✓ Perfect for legal/backups use cases
Metrics Data Platform
✓ Collect/Store/Analyze
✓ Multi-protocol (InfluxDB, Warp10, OpenTSDB, …)
✓ Can store millions of series
✓ From 1 to 10 years retention
15
Cloudera, managed by
✓Fully managed big data Hadoop Cluster
✓24/7 single point of contact (Claranet)
✓Data experts sizing & support
✓On top of dedicated servers or Private cloud
Data processing : Submit jobs in 2 clicks / API / CLI
✓Submit your Apache Spark jobs (Java/Python), we deploy and run them for you
✓Start in seconds, not minutes
✓Integrated in Public Cloud
✓Pay as you go
Analytics Data Platform
✓Preinstalled big data Hadoop cluster
✓Integrated in Public Cloud
✓Up in 1 hour, fully secured
✓Pay as you go
new
16
Develop or find an
ML model
Training
Test / Analyze
results
Use it !
(serving)
GPU offers
✓ Baremetal
✓ Public Cloud
✓ Powered by
ML Serving : deploy models in 2 clicks
✓ We provide API endpoints for your models
✓ Auto-scaling, Versioning and monitoring
✓ Compatible with ONNX, TensorFlow, PMML
Data
+
Idea
new
NVIDIA NGC
✓ GPU accelerated apps
✓ Maintained by
AI Training
✓ GPU as a Service
✓ 1 line of code to train
✓ Pay as you go
soon
AI Marketplace
✓ Catalog of models
✓ Ready out of the box
soon
19
Open Trusted Cloud: https://opentrustedcloud.ovhcloud.com/fr/
Marketplace: https://marketplace.ovhcloud.com
21
Credits:
• Domo ©
• Statista©
22
INPUT data (STORAGE)
Processing (arranging, sorting, combining, mathematical operations, machine learning, …): A LOT OF COMPUTE (RAM + CPU/GPU)
OUTPUT data (STORAGE)
✓ Example 1: Each hour, we collect millions of tweets and want to count the top 10 trends (top 10 #hashtags)
✓ Example 2: A billing team receives the monthly usage of millions of products. The charges are summarized per customer, and we then send a global bill to each customer (and we check for fraud too…)
✓ Example 3: A website accepts various video uploads and converts them to standard formats
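Example 1 can be sketched on a single machine in a few lines of Python. This is only an illustration of the input / processing / output pattern, not the Spark job itself: the sample tweets, the `top_hashtags` name and the hashtag regular expression are assumptions made for this sketch.

```python
import re
from collections import Counter

# Assumption for this sketch: a hashtag is "#" followed by word characters.
HASHTAG = re.compile(r"#\w+")


def top_hashtags(tweets, n=10):
    """Return the n most frequent hashtags with their counts."""
    counts = Counter()
    for text in tweets:
        # Lowercase so that #Spark and #spark are counted together.
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts.most_common(n)


# Hypothetical input data (in practice: the tweets collected over one hour).
sample_tweets = [
    "Trying #Spark on #OVHcloud",
    "#spark jobs start in seconds",
    "Object storage plus #Spark = #bigdata",
]

print(top_hashtags(sample_tweets, n=3))
```

With a few thousand tweets this runs instantly on a laptop; the difficulty starts when the input no longer fits on one machine.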
23
But now imagine 2 million lines…
Or images/videos
Or machine learning needs
…
➔ We need something better!
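The idea behind "something better" is to split the data into chunks, process each chunk independently, then merge the partial results. The sketch below applies it to the billing example in plain Python; the usage records and the function names (`partition`, `summarize`, `merge`) are hypothetical. An engine such as Apache Spark follows the same principle but runs the chunks in parallel on several machines.

```python
from collections import Counter


def partition(records, n_parts):
    """Split records into n_parts chunks (round-robin)."""
    return [records[i::n_parts] for i in range(n_parts)]


def summarize(chunk):
    """Per-chunk step: total charges per customer within one chunk."""
    totals = Counter()
    for customer, amount_cents in chunk:
        totals[customer] += amount_cents
    return totals


def merge(partials):
    """Final step: combine the per-chunk totals into one bill per customer."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds the amounts together
    return dict(result)


# Hypothetical monthly usage lines: (customer, amount in cents).
usage = [("alice", 1200), ("bob", 300), ("alice", 450), ("carol", 990), ("bob", 700)]

# Here the chunks are processed one after the other; a distributed engine
# would hand each chunk to a different worker.
partials = [summarize(chunk) for chunk in partition(usage, n_parts=3)]
print(merge(partials))
```

Because each chunk is summarized without looking at the others, adding more workers lets the same code absorb more data.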
26
Process your data easily and efficiently!
Powered by the most widely used analytics engine
✓ Submit your Apache Spark jobs (Java/Python), we run them
✓ Start in seconds, not minutes
✓ Integrated in Public Cloud
✓ Pay as you go
Concept
AWS EMR / Google Dataproc / Azure Databricks
Why with OVHcloud ?
✓ Easy to use, easy to scale
✓ Simple pricing, no hidden costs
✓ Data privacy
✓ Resources per job, not a big cluster to split : cost-effective !
Challengers
27
Big cluster to split:
1 Deploy a cluster (fixed sizing)
Example: 5 nodes, 300GB RAM
2 Submit jobs on this cluster
Job 1
Job 2

Resources per job:
1 Submit jobs with UI/API/CLI
Code: mycode.jar or mycode.py
Files: s3://… or swift://
Sizing: e.g. 45GB RAM / 8 vCores
>> submit
2 We deploy in seconds in our infra
Job 1: 1 x Spark resources
Job 2: 1 x Spark resources
Job n
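The difference between the two approaches on this slide can be illustrated with simple arithmetic. The sketch below reuses the slide's figures (a 300GB cluster, jobs sized at 45GB) and assumes, for illustration only, that all jobs are identical and that RAM is the only constraint; it ignores CPU, overhead and pricing.

```python
def fixed_cluster_fit(cluster_ram_gb, job_ram_gb):
    """Return (jobs that fit at once, RAM left idle) for a fixed-size cluster."""
    concurrent = cluster_ram_gb // job_ram_gb
    idle = cluster_ram_gb - concurrent * job_ram_gb
    return concurrent, idle


def per_job_reserved(job_ram_gb, running_jobs):
    """RAM reserved when every job gets its own resources."""
    return job_ram_gb * running_jobs


# Fixed cluster from the slide: 300GB RAM, jobs of 45GB each.
print(fixed_cluster_fit(300, 45))  # (6, 30)

# Per-job sizing: two running jobs reserve only what they asked for.
print(per_job_reserved(45, 2))     # 90
```

With a fixed cluster the 300GB are provisioned whether two or six jobs are running; with per-job resources the reservation follows the jobs actually submitted.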
28
1 Via Control Panel
2 Via API OVHcloud
3 Via CLI (Spark-Submit)
$ ./ovh-spark-submit \
    --project-id yourProjectId \
    --upload ./spark-examples.jar \
    --class org.apache.spark.examples.SparkPi \
    --driver-cores 1 \
    --driver-memory 4G \
    --executor-cores 1 \
    --executor-memory 4G \
    --num-executors 1 \
    swift://odp/spark-examples.jar 1000
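When jobs are submitted from a script, the same command line can be assembled programmatically. The helper below is a hypothetical sketch, not part of the OVHcloud tooling: it only rebuilds the exact flags shown on the slide, and it assumes that `odp` in `swift://odp/...` is the name of the storage container that receives the uploaded file.

```python
import shlex


def build_submit_command(project_id, jar_path, main_class, app_args,
                         driver_cores=1, driver_memory="4G",
                         executor_cores=1, executor_memory="4G",
                         num_executors=1, container="odp"):
    """Assemble the ovh-spark-submit argument list shown on the slide."""
    jar_name = jar_path.rsplit("/", 1)[-1]
    return [
        "./ovh-spark-submit",
        "--project-id", project_id,
        "--upload", jar_path,
        "--class", main_class,
        "--driver-cores", str(driver_cores),
        "--driver-memory", driver_memory,
        "--executor-cores", str(executor_cores),
        "--executor-memory", executor_memory,
        "--num-executors", str(num_executors),
        # Assumption: the uploaded file is read back from this container.
        f"swift://{container}/{jar_name}",
        *[str(arg) for arg in app_args],
    ]


cmd = build_submit_command(
    project_id="yourProjectId",
    jar_path="./spark-examples.jar",
    main_class="org.apache.spark.examples.SparkPi",
    app_args=[1000],
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the CLI is installed; keeping the arguments as a list avoids shell quoting issues.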
29
30
https://docs.ovh.com/gb/en/data-processing/
31
Development
Lab: launched (April)
Free! Fully working:
• Product itself
• Control panel/API/CLI
• Documentation
GA: ETA = July 2020
Adding:
• Pricing
• Contract
Soon: H2 2020
Backlog:
• More Datacenters
• More flavors / better prices
• (TBD) Spark 3.x

OVHcloud Partner Webinar - Data Processing
