Scaling out Driverless AI in Enterprise Data Centers with IBM Spectrum Conductor
Kevin Doyle
Lead Architect IBM Spectrum Conductor
IBM
LinkedIn: https://www.linkedin.com/in/kevin-doyle-675a4031/
Benefits of managing H2O with IBM Spectrum Conductor
• H2O Driverless AI can scale across compute nodes for multiple instances, with each instance
allocated to one host
• In a future IBM Spectrum Conductor release, integration improves at the GPU level: You will be
able to run multiple Driverless AI instances on the same host, where each instance is allocated to
an assigned GPU
• Shared file system for data and logs
• Failover to another host if Driverless AI goes down: IBM Spectrum Conductor starts it up on
another host (if resources are available)
• Easily start and stop H2O Driverless AI and maintain instances for each user or groups of users
through role-based access control (RBAC) and consumer association, along with all other
workloads in one shared compute cluster
• H2O Driverless AI and IBM POWER9 GPU systems bring together best-of-breed AI
innovation. To handle the increasingly complex workloads of AI you need an integrated system of
software and hardware:
• IBM POWER9 supports nearly 2.6x more RAM and 9.5x more I/O bandwidth than comparable systems
• Nearly 2X the data ingest speed and over 50% faster feature engineering
• GPU-accelerated machine learning delivers nearly 30X speedup on model building
• Support for up to 6 V100 GPUs on a single system
What is IBM® Spectrum Conductor?
• IBM Spectrum Conductor confidently deploys modern computing frameworks and
services for a multitenant enterprise environment, both on-premises and in the cloud
• Provides multitenancy through application instances and Spark instance groups. You can
deploy modern computing frameworks and services, such as Spark, Anaconda, Driverless
AI, and H2O Sparkling Water efficiently and effectively, supporting multiple versions and
instances of each framework and service
• Increases performance and scale through granular and dynamic resource allocation for
application instances and Spark instance groups that share a resource pool
• Maximizes usage of resources and eliminates silos of resources that would otherwise
each be tied to separate application implementations
• Provides flexible and efficient data management for shared storage and high availability
by connecting to existing storage infrastructure, such as NFS mounts to a file system or
IBM Spectrum Scale™
Traditional vs Conductor Management
• Application examples: simulation, analysis, design, big data
• Traditional: distinct resources for compute and storage, repeated for many apps and groups
• IT constrained: long wait times, low utilization, data access bottlenecks, IT sprawl
• IBM Spectrum Conductor (IBM Software Defined Infrastructure): a virtualized view of compute,
network, and storage resources shared by big data, simulation and modeling, analytics, and
long-running services
• Makes multiple computers look like one
• Prioritized matching of supply with demand
• Converged compute and storage
• Benefits: high utilization, throughput, performance, prioritization, reduced cost
• Faster results, fewer resources
IBM Systems
Shared Services Model for Spark, Machine Learning, and Deep Learning
• Physical view: IBM Spectrum Conductor installed on each Linux server
• Logical view: Users (groups) have their own Spark cluster (optional) that is isolated, protected, and
secured by Spark instance groups or application instances – Managed by SLA
[Figure: an administrator and a grid of Linux compute nodes (x86 systems) running Spectrum
Conductor. Instances #1 to #4 are labeled with their users and workloads: LOB (Marketing…, Fraud
Detection…), data scientists, Driverless AI, and a researcher. Data connectors link to Cloud
Object Storage (COS) and Spectrum Scale.]
IBM Spectrum Conductor
The most complete enterprise-grade solution for Data Science
• Anaconda Distributions
The solution supports multiple distributions of Anaconda running concurrently.
Users can add/remove Conda packages.
• Notebooks Integration
Out-of-the-box notebooks available: Jupyter, Zeppelin, RStudio, H2O
Sparkling Water. Other notebooks and distributed frameworks can be quickly
integrated.
• Spark Distributions
The solution supports multiple versions of Spark running concurrently.
• Workload Management / Scheduling
A proven workload scheduling engine that enhances the Spark master
scheduling logic to enable multi-tenancy.
• Services Management
Management of other long running application services on the same grid.
Spark applications commonly have dependencies on other services that can
now be managed as a single solution.
• Resource Management & Orchestration
Proven architecture at scale. Resources are dynamically allocated to Spark
workloads with fine-grained sharing across applications.
• IBM services and support
A single point of contact for your services and support needs.
[Figure: solution stack with the labels Notebooks, Monitoring & Reporting, Workload Management /
Scheduling, Resource Management & Orchestration, Services Management, Services and Support,
Red Hat Linux, x86…]
© 2016 IBM Corporation
Competitive advantage through faster, more predictable analytics
Throughput: 41% greater than Spark with YARN; 57% greater than Spark with Mesos
Scenario              Spectrum Conductor with Spark   Spark / YARN    Spark / Mesos
When minutes count    10 minutes                      14.1 minutes    15.7 minutes
At quarter-end        80 hours                        112.8 hours     125.6 hours
Product development   26 weeks                        36.7 weeks      40.8 weeks
Source: STAC Report: Spark Resource Managers, Phase 1 (March 28, 2016)
Note: IBM is an active contributor in the Mesos community, helping to advance its capabilities and integration with IBM solutions
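The scenario rows appear to follow directly from the headline throughput figures: if Spectrum Conductor with Spark delivers 41% more throughput than Spark with YARN, the same work takes about 1.41 times as long on YARN (and about 1.57 times as long on Mesos). A minimal Python sketch of that arithmetic follows; the function name and the rounding to one decimal are illustrative assumptions, not part of the STAC report.

```python
# Throughput advantages quoted on the slide (Conductor vs. each alternative).
THROUGHPUT_ADVANTAGE = {"Spark / YARN": 0.41, "Spark / Mesos": 0.57}


def scaled_duration(conductor_duration, advantage):
    """Time the same work would take on the slower system.

    If Conductor's throughput is (1 + advantage) times higher, the same
    amount of work takes (1 + advantage) times longer elsewhere.
    """
    return round(conductor_duration * (1 + advantage), 1)


# Baselines from the slide: 10 minutes, 80 hours, 26 weeks.
for baseline, unit in [(10, "minutes"), (80, "hours"), (26, "weeks")]:
    for system, advantage in THROUGHPUT_ADVANTAGE.items():
        print(f"{system}: {scaled_duration(baseline, advantage)} {unit}")
```

Running the loop reproduces the YARN and Mesos columns of the table from the Conductor baselines.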
Predictability: longest job duration compared with median (lower is better)
Spectrum Conductor with Spark   Spark / YARN   Spark / Mesos
1.51X                           1.62X          66.32X
STAC reported significant advantages, up to 2.2x, for IBM Spectrum Conductor with Spark
over YARN and Mesos.
PowerAI Enterprise ML/DL - Data Science Stack
[Figure: layered stack diagram. Labels recovered from the slide:]
• Layers: Data Layer; Runtimes, Resource & WL Managers; DL Frameworks; ML Libraries; ML/DL UI and
Flow; Data Science Apps
• Data: IBM Spectrum Scale, IBM Cloud Object Store
• Runtimes and managers: IBM Spectrum Conductor, Python, Spark, Anaconda (GPU Support /
Distributed / BYOF / Session Scheduler / MPI / Containers…)
• DL frameworks: TensorFlow, Caffe, PyTorch, Chainer
• ML libraries: MLLib, Graphx, Scikit-learn, R, xgboost
• Distributed training: Distributed Deep Learning (DDL), Elastic Distributed Training (EDT)
• IBM Spectrum Conductor Deep Learning Impact: Data Prep / Parallel Training / Model Tuning /
Model Evaluation / Inference Services…
• Apps: PowerAI Vision, Watson Studio
• Product groupings shown: Open Source Frameworks Distribution, Value-add Tools, IBM PowerAI
Enterprise
Key concepts of IBM Spectrum Conductor
• Application instances
• Customizable feature to support running any long-running service within the cluster
• Application templates (YAML) define the processes (services) that you
want to run in the cluster
• Driverless AI integration is done through application instances
• Spark instance groups
• An installation of Apache Spark that can run Spark core services (master, shuffle,
and history), Anaconda distribution instances, and notebooks as configured
• You can create and run multiple Spark instance groups, associating each instance
group with different Spark/Anaconda/notebook version packages as required
• H2O Sparkling Water integration is treated as a notebook within your Spark instance
groups
Key concepts of IBM Spectrum Conductor (cont.)
• Resource groups
• Provide a simple way of organizing and grouping resources (hosts)
• Define how to divide up the hosts in the group into slots
• Slots are used to decide whether a host is available for new workload
• Consumers
• A way to map organizations/teams to the resources they are allowed to use
• Resource planning uses consumers to determine advanced policies for when
to borrow/lend resources to other consumers
• Resource groups map to consumers, so users who add application
instances or Spark instance groups can use only those resource groups
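The relationship between hosts, slots, resource groups, and consumers can be sketched in a few lines of Python. This is only an illustration of the concepts on the slide, not IBM Spectrum Conductor's API or scheduling algorithm; every class, function, host, and group name below is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    total_slots: int
    used_slots: int = 0

    @property
    def free_slots(self):
        return self.total_slots - self.used_slots


@dataclass
class ResourceGroup:
    name: str
    hosts: list = field(default_factory=list)


def place(consumer, slots_needed, groups, consumer_groups):
    """Return the first host with enough free slots in a resource group
    the consumer is mapped to, or None if nothing fits."""
    for group in groups:
        if group.name not in consumer_groups.get(consumer, set()):
            continue  # consumer is not allowed to use this resource group
        for host in group.hosts:
            if host.free_slots >= slots_needed:
                host.used_slots += slots_needed
                return host.name
    return None


gpu_group = ResourceGroup("gpu-hosts", [Host("gpu01", total_slots=1)])
cpu_group = ResourceGroup("cpu-hosts", [Host("cpu01", total_slots=4)])
groups = [gpu_group, cpu_group]
consumer_groups = {"marketing": {"cpu-hosts"}, "fraud": {"gpu-hosts", "cpu-hosts"}}

print(place("marketing", 2, groups, consumer_groups))  # cpu01
print(place("fraud", 1, groups, consumer_groups))      # gpu01
print(place("fraud", 1, groups, consumer_groups))      # cpu01 (gpu01 is full)
```

A real scheduler would also handle borrowing and lending between consumers, which this first-fit sketch leaves out.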
Role-based access control
• Permissions are assigned to roles
• Roles are assigned to users
• Most permissions are based on a consumer
• Users will have the permissions/role assigned but only for the consumers they
have access to
• Ability to allow users to only access/control what they should
• Example: Each user can see only their Driverless AI instances as desired
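The permission model above (permissions attached to roles, roles attached to users, and each grant scoped to a consumer) can be sketched as follows. The role names, permission names, consumer paths, and helper functions are all hypothetical and only illustrate the idea; they are not IBM Spectrum Conductor identifiers.

```python
# Hypothetical role definitions: role name -> set of permissions.
ROLE_PERMISSIONS = {
    "data_scientist": {"view_instance", "start_instance", "stop_instance"},
    "viewer": {"view_instance"},
}

# Hypothetical assignments: user -> {consumer: role}.
USER_ROLES = {
    "alice": {"/marketing": "data_scientist"},
    "bob": {"/fraud": "data_scientist"},
    "carol": {"/marketing": "viewer"},
}


def is_allowed(user, permission, consumer):
    """A user holds a permission only for consumers where a role grants it."""
    role = USER_ROLES.get(user, {}).get(consumer)
    return role is not None and permission in ROLE_PERMISSIONS[role]


def visible_instances(user, instances):
    """Filter application instances down to those the user may view."""
    return [name for name, consumer in instances
            if is_allowed(user, "view_instance", consumer)]


instances = [("dai-alice", "/marketing"), ("dai-bob", "/fraud")]
print(visible_instances("alice", instances))  # ['dai-alice']
print(visible_instances("bob", instances))    # ['dai-bob']
```

Because every check takes the consumer as an argument, the same role can be granted to many users without letting them see each other's Driverless AI instances.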
How does the integration work?
• H2O Driverless AI is launched on a single host
• The host can have either GPUs or just run with CPUs
• If GPUs are used, the entire host is taken (with the current integration)
• An application instance is created for each user of Driverless AI
• Maintains security for the data this user has access to
• Environment variables through parameters are used to configure Driverless AI
• H2O Sparkling Water runs as a notebook in a Spark instance group
• When the notebook is started up it forms a mini cluster of executors
• These executors stay alive for the entire duration of the notebook
• IBM Spectrum Conductor disables preemption so that these hosts are not reclaimed
• Multiple users can share a Sparkling Water notebook instance or have
dedicated ones per user
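As a rough illustration of configuring each user's Driverless AI instance through environment variables, the sketch below turns per-instance parameters into an environment mapping. It assumes the convention of prefixing upper-cased setting names with `DRIVERLESS_AI_`; check that against the Driverless AI documentation for your version. The parameter names, ports, and paths are hypothetical, and the slides do not say which parameters the application template actually exposes.

```python
def build_environment(parameters, prefix="DRIVERLESS_AI_"):
    """Turn per-instance parameters into environment variables.

    The prefix and upper-casing follow the convention assumed in the
    lead-in; adjust them to whatever your deployment actually expects.
    """
    return {prefix + key.upper(): str(value) for key, value in parameters.items()}


# Hypothetical parameters for two users' application instances.
alice = build_environment({"port": 12345, "data_directory": "/shared/dai/alice"})
bob = build_environment({"port": 12346, "data_directory": "/shared/dai/bob"})

print(alice)
print(bob)
```

Keeping a separate mapping per application instance mirrors the one-instance-per-user model described above, with each user pointed at their own data directory.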
Current Integration
[Figure: current integration architecture. Labels recovered from the slide: Instance Group #1,
Instance Group #2, and Instance Group #5, each with a Session Scheduler; App instance # marketing;
App instance # fraud; Elastic Distributed Training (EDT); # other apps …; Batch Scheduler;
Notebook, Spark, ELK, Python; Security; Data Connector; Report/log management; Container; GPU and
Acceleration; Multi-tenancy; Resource, Cluster, Service Management (K8s/EGO)]
Demo
Future Plans (short term)
• Log retrieval from IBM Spectrum Conductor web UI
• Ability to deploy Driverless AI with IBM Spectrum Conductor instead
of installing on all systems (new application template)
• Ability to modify application instance outputs more effectively
• Enhance job monitor to check when Driverless AI is up
Future Plans (longer term)
• Improved port management
• Today you can specify the ports to use; however, you don’t know whether they are
already in use on existing hosts
• The ports might work at first but not later if something else starts using them
• Improve handling of running Driverless AI with a subset of GPUs on
hosts in the cluster
• Integrate Driverless AI authentication with IBM Spectrum Conductor
authentication/authorization for easier setup
• Look at supporting Driverless AI to run across multiple machines
• Investigate the best approaches to connect to data sources
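The port problem described above can be illustrated with a small check that tries to bind a TCP port on the local host before handing it to an instance. This is a generic sketch, not how IBM Spectrum Conductor manages ports; the function names are hypothetical, and the check covers only the machine it runs on.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind to the port on this host right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates, host="0.0.0.0"):
    """Pick the first candidate port that is currently free, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None
```

The result is only a snapshot: another process can take the port between the check and the moment Driverless AI starts, which is the "works at first but not later" case the slide mentions.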
Long term architecture vision for Driverless AI integrated with IBM Spectrum Conductor
[Figure: H2O Driverless AI, a Batch Scheduler, and a Session Scheduler on top of a grid of Linux
hosts, with four numbered steps:]
(1) Start Driverless AI
(2) Find a host to run Driverless AI
(3) Run workload (training, experiment, etc.)
(4) Find hosts to run the workload on to speed up execution
It’s available now
• Contact Richard Shedrick (rshedrick@us.ibm.com) to get access to
the integration and learn more
• Future announcements and contact points on the integration at:
• IBM Spectrum Conductor Blog:
http://ibm.biz/ConductorBlogs
• IBM Spectrum Conductor’s Slack channel:
http://ibm.biz/ConductorSlack
Top Reasons to Choose PowerAI Enterprise
• Simplicity: Integrated Platform that Just Works
• Curate, Test, and Support Fast Moving Open Source
• Provide Enterprise Distribution on RedHat
• Easy to deploy Enterprise AI Platform
• Ease of Use, Unique Capabilities
• Faster Model Training Time
• Large data & model support due to NVLink
• Acceleration of Analytics & ML
• AutoML: PowerAI Vision
• Elastic Training: Scale GPUs as Required
• Faster Training Times in Single Server
• Scalability to 100s of Servers (Cluster level Integration)
• Leads to Faster Insights and Better Economics
• Platform that Partners can build on
• Software Partners: H2O, IBM, Anaconda
• SIs, Solution Vendors & Accelerator Partners
• Open AI Platform w/ Ecosystem Partners
[Figure: solution stack with the labels Power9, CPU, GPU, PowerAI, IBM SW, ISV SW, Solution, SIs]

Building, Evaluating, and Optimizing your RAG App for ProductionBuilding, Evaluating, and Optimizing your RAG App for Production
Building, Evaluating, and Optimizing your RAG App for Production
 
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
Building LLM Solutions using Open Source and Closed Source Solutions in Coher...
 
Risk Management for LLMs
Risk Management for LLMsRisk Management for LLMs
Risk Management for LLMs
 
Open-Source AI: Community is the Way
Open-Source AI: Community is the WayOpen-Source AI: Community is the Way
Open-Source AI: Community is the Way
 
Building Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2OBuilding Custom GenAI Apps at H2O
Building Custom GenAI Apps at H2O
 
Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical Applied Gen AI for the Finance Vertical
Applied Gen AI for the Finance Vertical
 
Cutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM PapersCutting Edge Tricks from LLM Papers
Cutting Edge Tricks from LLM Papers
 
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
Practitioner's Guide to LLMs: Exploring Use Cases and a Glimpse Beyond Curren...
 
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
Open Source h2oGPT with Retrieval Augmented Generation (RAG), Web Search, and...
 
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
KGM Mastering Classification and Regression with LLMs: Insights from Kaggle C...
 
LLM Interpretability
LLM Interpretability LLM Interpretability
LLM Interpretability
 
Never Reply to an Email Again
Never Reply to an Email AgainNever Reply to an Email Again
Never Reply to an Email Again
 
Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)Introducción al Aprendizaje Automatico con H2O-3 (1)
Introducción al Aprendizaje Automatico con H2O-3 (1)
 
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
From Rapid Prototypes to an end-to-end Model Deployment: an AI Hedge Fund Use...
 
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
AI Foundations Course Module 1 - Shifting to the Next Step in Your AI Transfo...
 

Recently uploaded

AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Thierry Lestable
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
Abida Shariff
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
UiPathCommunity
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
91mobiles
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
Elena Simperl
 

Recently uploaded (20)

AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptxIOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
IOS-PENTESTING-BEGINNERS-PRACTICAL-GUIDE-.pptx
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 

Scaling out Driverless AI with IBM Spectrum Conductor - Kevin Doyle - H2O AI World London 2018

  • 1. Scaling out Driverless AI in Enterprise Data Centers with IBM Spectrum Conductor Kevin Doyle Lead Architect IBM Spectrum Conductor IBM LinkedIn: https://www.linkedin.com/in/kevin-doyle-675a4031/
  • 2. Benefits of managing H2O with IBM Spectrum Conductor
      • H2O Driverless AI can scale across compute nodes for multiple instances, with each instance allocated to one host
      • In a future IBM Spectrum Conductor release, integration improves at the GPU level: you will be able to run multiple Driverless AI instances on the same host, where each instance is allocated to an assigned GPU
      • Shared file system for data and logs
      • Failover to another host if Driverless AI goes down: IBM Spectrum Conductor starts it up on another host (if resources are available)
      • Easily start and stop H2O Driverless AI and maintain instances for each user or group of users through role-based access control (RBAC) and consumer association, along with all other workloads in one shared compute cluster
      • H2O Driverless AI and IBM POWER9 GPU systems bring together best-of-breed AI innovation. To handle the increasingly complex workloads of AI you need an integrated system of software and hardware:
          • IBM POWER9 supports nearly 2.6x more RAM and 9.5x more I/O bandwidth than comparable systems
          • Nearly 2x the data ingest speed and over 50% faster feature engineering
          • GPU-accelerated machine learning delivers nearly 30x speedup on model building
          • Support for up to 6 V100 GPUs on a single system
  • 3. What is IBM® Spectrum Conductor?
      • IBM Spectrum Conductor deploys modern computing frameworks and services for a multitenant enterprise environment, both on-premises and in the cloud
      • Provides multitenancy through application instances and Spark instance groups. You can deploy modern computing frameworks and services, such as Spark, Anaconda, Driverless AI, and H2O Sparkling Water, while supporting multiple versions and instances of each framework and service
      • Increases performance and scale through granular and dynamic resource allocation for application instances and Spark instance groups that share a resource pool
      • Maximizes usage of resources and eliminates silos of resources that would otherwise each be tied to separate application implementations
      • Provides flexible and efficient data management for shared storage and high availability by connecting to existing storage infrastructure, such as NFS mounts to a file system or IBM Spectrum Scale™
  • 4. Traditional vs Conductor Management
      • Traditional: distinct resources for compute and storage, repeated for many apps and groups. Application examples: simulation, analysis, design, big data. IT constrained: long wait times, low utilization, data access bottlenecks, IT sprawl
      • IBM Spectrum Conductor (IBM Software Defined Infrastructure): a virtualized view of compute, network, and storage resources with converged compute and storage. Makes multiple computers look like one, with prioritized matching of supply with demand
      • Workloads shown: big data, simulation and modeling, analytics, long-running services
      • Benefits: high utilization, throughput, performance, prioritization, reduced cost; faster results with fewer resources
  • 5. IBM Systems: Shared Services Model for Spark, Machine Learning, and Deep Learning
      • Physical view: IBM Spectrum Conductor installed on each Linux server
      • Logical view: users (groups) have their own Spark cluster (optional) that is isolated, protected, and secured by Spark instance groups or application instances, managed by SLA
      • [Diagram: an administrator and a pool of Linux compute nodes on x86 systems; Instances #1 to #4 serve lines of business (Marketing…, Fraud Detection…), data scientists, a researcher, and Driverless AI; Spectrum Conductor data connectors link to Cloud Object Storage (COS) and Spectrum Scale]
  • 6. IBM Spectrum Conductor: The most complete enterprise-grade solution for Data Science
      • Anaconda Distributions: the solution supports multiple distributions of Anaconda running concurrently. Users can add and remove Conda packages.
      • Notebooks Integration: out-of-the-box notebooks available: Jupyter, Zeppelin, RStudio, H2O Sparkling Water. Other notebooks and distributed frameworks can be quickly integrated.
      • Spark Distributions: the solution supports multiple versions of Spark running concurrently.
      • Workload Management / Scheduling: a proven workload scheduling engine that enhances the Spark master scheduling logic to enable multi-tenancy.
      • Services Management: management of other long-running application services on the same grid. Spark applications commonly have dependencies on other services that can now be managed as a single solution.
      • Resource Management & Orchestration: proven architecture at scale. Resources are dynamically allocated to Spark workload with fine-grained sharing across applications.
      • IBM services and support: a single point of contact for your services and support needs.
      • [Diagram layers: Notebooks; Monitoring & Reporting; Workload Management / Scheduling; Resource Management & Orchestration; Services Management; Services and Support; Red Hat Linux; x86…]
  • 7. Competitive advantage through faster, more predictable analytics (© 2016 IBM Corporation)
      • Throughput: 41% greater than Spark with YARN; 57% greater than Spark with Mesos
          • When minutes count: 10 minutes (Spectrum Conductor with Spark), 14.1 minutes (Spark / YARN), 15.7 minutes (Spark / Mesos)
          • At quarter-end: 80 hours (Spectrum Conductor with Spark), 112.8 hours (Spark / YARN), 125.6 hours (Spark / Mesos)
          • Product development: 26 weeks (Spectrum Conductor with Spark), 36.7 weeks (Spark / YARN), 40.8 weeks (Spark / Mesos)
      • Predictability: longest job duration compared with median (lower is better): 1.51X (Spectrum Conductor with Spark), 1.62X (Spark / YARN), 66.32X (Spark / Mesos)
      • Source: STAC Report: Spark Resource Managers, Phase 1 (March 28, 2016)
      • Note: IBM is an active contributor in the Mesos community, helping to advance its capabilities and integration with IBM solutions
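The scenario durations on slide 7 follow from the two throughput figures: work that takes time t on Spectrum Conductor with Spark takes roughly t × (1 + gain) on the slower resource manager. A minimal sketch of that arithmetic, assuming the percentages are relative throughput gains (the function and variable names are illustrative, not from the STAC report):

```python
# Assumption: "41% greater throughput" means the competitor needs 1.41x the time.
GAINS = {"Spark / YARN": 0.41, "Spark / Mesos": 0.57}
SCENARIOS = {"minutes": 10, "hours": 80, "weeks": 26}  # Conductor baselines from the slide


def equivalent_duration(conductor_time: float, throughput_gain: float) -> float:
    """Time a slower resource manager needs for work Conductor finishes in conductor_time."""
    return conductor_time * (1.0 + throughput_gain)


def build_table() -> dict:
    """Return {manager: {unit: duration}} rounded to one decimal, as on the slide."""
    return {
        manager: {
            unit: round(equivalent_duration(baseline, gain), 1)
            for unit, baseline in SCENARIOS.items()
        }
        for manager, gain in GAINS.items()
    }


if __name__ == "__main__":
    for manager, row in build_table().items():
        print(manager, row)
```

The rounded values reproduce the figures quoted on the slide for both YARN and Mesos.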
  • 8. STAC reported significant advantages, up to 2.2x, for IBM Spectrum Conductor with Spark over YARN and Mesos.
  • 9. PowerAI Enterprise ML/DL - Data Science Stack
      • Layers labeled in the diagram: Open Source Frameworks Distribution; Data Layer; Runtimes, Resource & WL Managers; DL Frameworks; ML Libraries; ML/DL UI and Flow; Data Science Apps; Value-add Tools
      • Frameworks and libraries shown: TensorFlow, Caffe, PyTorch, Chainer, MLlib, GraphX, scikit-learn, R, xgboost, Anaconda Python, Spark
      • IBM Spectrum Conductor: GPU support / distributed / BYOF / session scheduler / MPI / containers…
      • IBM Spectrum Conductor Deep Learning Impact: data prep / parallel training / model tuning / model evaluation / inference services…
      • Other components shown: Distributed Deep Learning (DDL), Elastic Distributed Training (EDT), PowerAI Vision, IBM PowerAI Enterprise, Watson Studio, IBM Spectrum Scale, IBM Cloud Object Store
  • 10. Key concepts of IBM Spectrum Conductor
      • Application instances
          • Customizable feature to support running any long-running service within the cluster
          • Application templates (YAML) are created to define the processes (services) that you want to run in the cluster
          • Driverless AI integration is done through application instances
      • Spark instance groups
          • An installation of Apache Spark that can run Spark core services (master, shuffle, and history), Anaconda distribution instances, and notebooks as configured
          • You can create and run multiple Spark instance groups, associating each instance group with different Spark/Anaconda/notebook version packages as required
          • H2O Sparkling Water integration is treated as a notebook within your Spark instance groups
  • 11. Key concepts of IBM Spectrum Conductor (cont.)
      • Resource groups
          • Provide a simple way of organizing and grouping resources (hosts)
          • Define how to divide up the hosts in the group into slots
          • Slots are used to decide whether a host is available for new workload to be placed on it
      • Consumers
          • A way to map organizations/teams to the resources they are allowed to use
          • Resource planning uses consumers to determine advanced policies for when to borrow/lend resources to other consumers
          • Resource groups map to consumers, so users adding application instances or Spark instance groups can use only those resource groups
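The relationship between resource groups, slots, and consumers on slide 11 can be sketched as a small placement model. This is only an illustration under stated assumptions: the class names, the first-fit policy, and the consumer paths are hypothetical and do not represent IBM Spectrum Conductor's actual scheduler or API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Host:
    name: str
    total_slots: int
    used_slots: int = 0

    @property
    def free_slots(self) -> int:
        return self.total_slots - self.used_slots


@dataclass
class ResourceGroup:
    name: str
    hosts: List[Host]


def place(consumer: str, slots_needed: int,
          groups_by_consumer: Dict[str, List[ResourceGroup]]) -> Optional[str]:
    """First-fit placement: pick the first host with enough free slots,
    looking only at resource groups mapped to this consumer."""
    for group in groups_by_consumer.get(consumer, []):
        for host in group.hosts:
            if host.free_slots >= slots_needed:
                host.used_slots += slots_needed
                return host.name
    return None  # no host available for this consumer


if __name__ == "__main__":
    gpu_group = ResourceGroup("gpu-hosts", [Host("gpu01", 4), Host("gpu02", 4)])
    mapping = {"/marketing": [gpu_group]}
    print(place("/marketing", 4, mapping))  # first host with 4 free slots
    print(place("/fraud", 1, mapping))      # consumer has no resource group mapped
```

Requesting every slot on a host mirrors the current integration described later in the deck, where a GPU-enabled Driverless AI instance takes the entire host.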
  • 12. Role-based access control
      • Permissions are assigned to roles
      • Roles are assigned to users
      • Most permissions are based on a consumer
      • Users have the assigned permissions/role only for the consumers they have access to
      • Ability to allow users to access/control only what they should
      • Example: each user can see only their own Driverless AI instances, as desired
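The chain described on slide 12 (permissions to roles, roles to users, scoped by consumer) can be sketched as a lookup. All role, permission, user, and consumer names below are hypothetical examples, not IBM Spectrum Conductor identifiers.

```python
from typing import Dict, List, Set, Tuple

# Permissions are assigned to roles (hypothetical names).
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "dai_user": {"view_instance", "start_instance", "stop_instance"},
    "viewer": {"view_instance"},
}

# Roles are assigned to users, but only for specific consumers.
USER_ROLES: Dict[Tuple[str, str], Set[str]] = {
    ("alice", "/marketing"): {"dai_user"},
    ("bob", "/fraud"): {"dai_user"},
    ("carol", "/marketing"): {"viewer"},
}

# Each Driverless AI application instance belongs to one consumer.
INSTANCES: Dict[str, str] = {"dai-alice": "/marketing", "dai-bob": "/fraud"}


def is_allowed(user: str, permission: str, consumer: str) -> bool:
    """True if any role the user holds on this consumer grants the permission."""
    roles = USER_ROLES.get((user, consumer), set())
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def visible_instances(user: str) -> List[str]:
    """Instances the user may view, i.e. those under consumers they can access."""
    return sorted(
        name for name, consumer in INSTANCES.items()
        if is_allowed(user, "view_instance", consumer)
    )


if __name__ == "__main__":
    for user in ("alice", "bob", "carol"):
        print(user, visible_instances(user))
```

Because every check is keyed on the consumer, a user who holds a role on one consumer gains nothing on another, which is what lets each user see only their own Driverless AI instance.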
  • 13. How does the integration work?
      • H2O Driverless AI is launched on a single host
          • The host can have GPUs or run with CPUs only
          • If GPUs are used, the entire host is taken (with the current integration)
      • An application instance is created for each user of Driverless AI
          • Maintains security for the data this user has access to
          • Environment variables, set through parameters, are used to configure Driverless AI
      • H2O Sparkling Water runs as a notebook in a Spark instance group
          • When the notebook starts, it forms a mini cluster of executors
          • These executors stay alive for the entire duration of the notebook
          • IBM Spectrum Conductor disables preemption so that these hosts are not reclaimed
          • Multiple users can share a Sparkling Water notebook instance or have dedicated ones per user
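Slide 13 says per-instance parameters reach Driverless AI as environment variables. A minimal sketch of that mapping follows; the `DRIVERLESS_AI_` prefix, the parameter names, and the paths are assumptions chosen for illustration, so check the Driverless AI and application template documentation for the names your version actually reads.

```python
import os
from typing import Dict, Mapping, Optional


def build_environment(parameters: Mapping[str, object],
                      prefix: str = "DRIVERLESS_AI_",  # assumed prefix, not confirmed by the slides
                      base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the base environment with each parameter added
    as an upper-cased, prefixed environment variable."""
    env = dict(os.environ if base is None else base)
    for key, value in parameters.items():
        env[prefix + key.upper()] = str(value)
    return env


if __name__ == "__main__":
    # Hypothetical per-user parameters for one application instance.
    alice_params = {"port": 12345, "data_directory": "/shared/dai/alice/tmp"}
    env = build_environment(alice_params, base={})
    for name in sorted(env):
        print(f"{name}={env[name]}")
```

Giving each user's instance its own parameter set (for example, a separate data directory on the shared file system) is one way the per-user application instance can keep data access isolated.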
  • 14. Current Integration
      • [Diagram: Spark instance groups (#1, #2, #5), each with its own session scheduler, run alongside application instances (marketing, fraud), Elastic Distributed Training (EDT), and other apps. Shared services: security, data connector, report/log management, notebook, Spark, ELK, Python, batch scheduler, multi-tenancy, container, GPU and acceleration. Base layer: resource, cluster, and service management (K8s/EGO)]
  • 15. Demo
  • 16. Future Plans (short term)
      • Log retrieval from the IBM Spectrum Conductor web UI
      • Ability to deploy Driverless AI with IBM Spectrum Conductor instead of installing it on all systems (new application template)
      • Ability to modify application instance outputs more effectively
      • Enhance the job monitor to check when Driverless AI is up
  • 17. Future Plans (longer term)
      • Improved port management
          • Today you can specify the ports to use; however, you don't know whether they are already in use on existing hosts
          • The ports might work at first but not later if something else is using them
      • Improve handling of running Driverless AI with a subset of GPUs on hosts in the cluster
      • Integrate Driverless AI authentication with IBM Spectrum Conductor authentication/authorization for easier setup
      • Look at supporting Driverless AI running across multiple machines
      • Investigate the best approaches to connect to data sources
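The port problem on slide 17 can be made concrete with a local check before starting an instance. This is a minimal sketch using Python's standard `socket` module; the helper names are illustrative, it is not part of the integration, and it only tests the host it runs on.

```python
import socket
from typing import Iterable, Optional


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Try to bind the port; failure means something else already holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates: Iterable[int], host: str = "127.0.0.1") -> Optional[int]:
    """Return the first candidate port that is free right now, or None."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None


if __name__ == "__main__":
    # The candidate range is an arbitrary example.
    print(first_free_port(range(12345, 12355)))
```

The check is inherently racy: a port that is free when probed can be taken before the service binds it, which is the "might work at first but not later" behavior the slide describes and the reason scheduler-level port management is listed as a future improvement.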
  • 18. Long term architecture vision for Driverless AI integrated with IBM Spectrum Conductor
      • (1) Start Driverless AI
      • (2) Find a host to run Driverless AI
      • (3) Run workload (training, experiment, etc.)
      • (4) Find hosts to run the workload on to speed up execution
      • [Diagram: H2O Driverless AI, the batch scheduler, and the session scheduler placed over a pool of Linux hosts]
  • 19. It's available now
      • Contact Richard Shedrick (rshedrick@us.ibm.com) to get access to the integration and learn more
      • Future announcements and contact points on the integration at:
          • IBM Spectrum Conductor Blog: http://ibm.biz/ConductorBlogs
          • IBM Spectrum Conductor's Slack channel: http://ibm.biz/ConductorSlack
  • 20. Top Reasons to Choose PowerAI Enterprise
      • Simplicity: Integrated Platform that Just Works
          • Curate, test, and support fast-moving open source
          • Provide enterprise distribution on Red Hat
          • Easy to deploy enterprise AI platform
      • Ease of Use, Unique Capabilities
          • Faster model training time
          • Large data and model support due to NVLink
          • Acceleration of analytics and ML
          • AutoML: PowerAI Vision
          • Elastic training: scale GPUs as required
          • Faster training times in a single server
          • Scalability to 100s of servers (cluster-level integration)
          • Leads to faster insights and better economics
      • Platform that Partners can build on
          • Software partners: H2O, IBM, Anaconda
          • SIs, solution vendors, and accelerator partners
          • Open AI platform with ecosystem partners
      • [Diagram stack: POWER9 CPU, GPU, PowerAI, IBM SW, ISV SW, Solution, SIs]