© Hortonworks Inc. 2011 – 2017. All Rights Reserved
Hortonworks Premier Inside Out – Introducing
Data Science Experience (DSX)
Presenters
• #1 Pure Open Source Hadoop Distribution
• 1000+ customers and 2100+ ecosystem partners
• Employs the original architects, developers and operators of Hadoop from Yahoo!
• Best-in-class 24x7 customer support
• Leading professional services and training
• #1 Data Science Platform (Source: Gartner)
• OpenPOWER performance leadership
• Flexible, software-defined storage
• #1 SQL Engine for complex, analytical workloads
• Leader in On-premise and Hybrid Cloud solutions
IBM + Hortonworks = Unlocking Actionable Insights
Data Science Lifecycle
Next Generation Data Science Problems
Multiple data sources & clusters
Data Scientists
• Where is the data I need to answer the business questions?
Data Engineers
• How do I move that data into a central repository?
• How do I transform and cleanse that data?
Next Generation Data Science Problems
Too many tools and technologies
Data Scientists
• How do I learn the latest library/technique?
• I don’t (want to) know Hadoop/Hive etc.
• How do I bring my familiar R/Python library to the new data science platform?
Next Generation Data Science Problems
Socializing insights is challenging
Data Scientists
• How do I collaborate and share my work with others in the organization?
Business Analyst
• How do I move that data into a central repository?
• What is the best visualization to tell my story?
Next Generation Data Science Problems
Going from prototype to production is cumbersome
Data Scientists
• I created this awesome Machine Learning Model; how do I put it into production?
Data Scientists/Data Engineers
• How are my Machine Learning Models performing, and how do I improve them?
Data Science Experience
Explore & Learn
Model & Evaluate
Deploy & Predict
Monitor & Measure
The leading data science platform that allows you to easily collaborate across teams, use the top open source tools and scale at the speed your business requires.
Data Science Solution
Community
• Find tutorials and datasets
• Connect with Data Scientists
• Ask questions
• Read articles and papers
• Fork and share projects
Open Source
• Code in Scala/Python/R/SQL
• Zeppelin & Jupyter Notebooks
• RStudio IDE and Shiny
• Apache Spark
• Your favorite libraries
Scale & Enterprise Security
• Data Science at Scale
• Run Spark Jobs on HDP Cluster
• Secure Hadoop Support
• Ranger Atlas Support for Data
• Support for ABAC
Model Management
• Data Shaping Pipeline UI
• Auto-data preparation & modeling
• Advanced Visualizations
• Model management & deployment
• Documented Model APIs
Data Science Experience
Enterprise Data Science At Scale
Enterprise
Secured, governed and managed
Tools
Leverage your favorite tools, technologies and libraries
Deployment
From pilot to production
Data
Build models using all the data
DEMO
Demo Scenario
Sensors monitoring Trucks
• Stored long-term sensor data about truckers' driving behavior
• New sensor data coming from trucks as they drive in various conditions
• Predict driving violations before they happen
• Alert the driver or manager
• Business monitors driver performance
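The scenario above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SensorReading` fields, the weights, the bias, and the alert threshold are hypothetical and hand-picked, not the model or data schema used in the demo. It shows the general idea only: score each incoming reading with a logistic function and alert when the estimated risk crosses a threshold.

```python
from dataclasses import dataclass
from math import exp


@dataclass
class SensorReading:
    """One hypothetical truck sensor event (field names are assumptions)."""
    speed_mph: float
    foggy: bool
    hours_driven: float


# Hypothetical, hand-picked coefficients. In practice these would be
# learned from the stored long-term sensor data, not set by hand.
WEIGHTS = {"speed_over_65": 0.08, "foggy": 1.2, "hours_over_8": 0.5}
BIAS = -3.0
ALERT_THRESHOLD = 0.5


def violation_risk(reading: SensorReading) -> float:
    """Return a risk estimate in (0, 1) using a logistic function."""
    z = BIAS
    z += WEIGHTS["speed_over_65"] * max(0.0, reading.speed_mph - 65.0)
    z += WEIGHTS["foggy"] * (1.0 if reading.foggy else 0.0)
    z += WEIGHTS["hours_over_8"] * max(0.0, reading.hours_driven - 8.0)
    return 1.0 / (1.0 + exp(-z))


def should_alert(reading: SensorReading) -> bool:
    """Alert the driver or manager when the risk reaches the threshold."""
    return violation_risk(reading) >= ALERT_THRESHOLD


if __name__ == "__main__":
    calm = SensorReading(speed_mph=55.0, foggy=False, hours_driven=3.0)
    risky = SensorReading(speed_mph=85.0, foggy=True, hours_driven=10.0)
    for reading in (calm, risky):
        print(reading, round(violation_risk(reading), 3), should_alert(reading))
```

In a real deployment the scoring function would sit behind a deployed model endpoint and be fed by the incoming sensor stream; the hand-set weights here only stand in for a trained model.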
Demo Flow
Insights from Data Science to Production
Data Scientists
• Where is the data I need to answer the business questions?
Business Users
• Where are the insights & predictions from the data?
HDP Cluster
Knox
Admins
• How do I meet SLA, Performance, .., Feature needs?
Demo Scenario
Problems Solved
• Data Scientists collaborate and learn new tools & frameworks
• Choice of tools, notebooks and languages
• Run favorite notebook on all data in the HDP Cluster
• Deploy the model to production
• Leverage the production model to deliver insights to business
• Monitor models and retrain models as new data comes in
DSX with HDP Roadmap
Summary plan
DSX install with Ambari
DSX Ambari Install, DSX in HDP, Improve Enterprise readiness
Install DSX with Ambari, DSX runs on YARN node-labeled nodes, Ranger, Atlas integration for Model Management, SSO
Improve YARN integration, Model Scoring on YARN
DSX scales on all YARN nodes, Model Scoring and Notebooks run on YARN
Deeper DSX YARN integration
Q & A
Customer Briefings coming to a City Near You!
24 OCT Silicon Valley
25 OCT Salt Lake City
26 OCT Dallas
1 NOV Chicago
2 NOV Toronto
7 NOV Tysons
8 NOV New York City
9 NOV Boston
Join us for a Meetup Session
Enterprise Data Science at Scale Meetup
Silicon Valley 10/30
San Francisco 11/14
Chicago 11/08 (*)
Dallas 11/09 (*)
Toronto 11/09 (*)
NYC 11/15 (*)
Washington DC 11/16 (*)
London 11/24 (*)
Boston 12/01 (*)
(*) Tentative
Thank You
Announcement 13 Jun 2017:
IBM and Hortonworks extend partnership to bring Data Science to HDP
Great Data + Great Data Science = Great Decisions
• IBM chooses Hortonworks Data Platform (HDP®) as their Hadoop distribution
• Hortonworks Data Platform (HDP) combining IBM DSX (Data Science Experience) & IBM Big SQL into new integrated solutions

Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017
