Accelerating Enterprise Spark
Shaun Connolly
Hortonworks Strategy
@shaunconnolly
Apache Spark Unlocks Enormous
Potential of Data in the Enterprise
Page3 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Personalized
Online Ads
Petabytes of
Weblogs Analyzed
with Spark at Scale
• Data streams from a vast array of
desktop and mobile devices
• 13 billion daily events processed, with
latency as low as 40 milliseconds
• No data cleansing necessary prior
to analysis with Apache Spark
• 2 clusters consolidated into 1
YARN-based HDP cluster
• Launched new product Webtrends
Explore™ -- powered by HDP
Per-Customer
Click Path
Web Log
Analysis
SQL Server
Offload
“We’re able to…look at this data set and process it and do predictions,
behavioral analysis. We can do things that allow us to determine ROI for
different actions and behavioral patterns.”
Peter Crossley, Chief Architect
Behavioral
Segmentation
Ad Click
Predictions
LCV
Analysis
New Use Cases
Cable Company: Optimize Advertising
• Monitor channel changes with Spark Streaming
• Correlate changes with Ads/Programming
• Allocate ads in real time: show ads to users who are
watching a show and will stay for over 20 seconds
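
The three steps above can be sketched in a few lines. This is a minimal illustration in plain Python, not the cable company's pipeline and not Spark code: the event tuple layout `(user_id, timestamp_s, channel)`, the function names, and the median-dwell heuristic for guessing who "will stay" are all assumptions made for the example. Only the 20-second threshold comes from the slide. In a Spark Streaming job, the same per-user logic would presumably run over each micro-batch of channel-change events.

```python
from collections import defaultdict
from statistics import median

# Threshold from the slide: target viewers expected to stay over 20 seconds.
DWELL_THRESHOLD_S = 20


def dwell_times(events):
    """Turn channel-change events into per-(user, channel) dwell durations.

    events: iterable of (user_id, timestamp_s, channel) tuples (hypothetical layout).
    Only completed viewing spans are counted, i.e. spans ended by a later change.
    """
    by_user = defaultdict(list)
    for user, ts, channel in events:
        by_user[user].append((ts, channel))

    dwells = defaultdict(list)
    for user, changes in by_user.items():
        changes.sort()  # order each user's changes by timestamp
        for (start, channel), (end, _next_channel) in zip(changes, changes[1:]):
            dwells[(user, channel)].append(end - start)
    return dwells


def likely_to_stay(dwells, user, channel, threshold=DWELL_THRESHOLD_S):
    """Assumed heuristic: median past dwell on this channel exceeds the threshold."""
    history = dwells.get((user, channel), [])
    return bool(history) and median(history) > threshold


sample_events = [
    ("u1", 0, "news"), ("u1", 45, "sports"), ("u1", 50, "news"), ("u1", 140, "movies"),
    ("u2", 0, "news"), ("u2", 5, "sports"), ("u2", 12, "news"), ("u2", 20, "sports"),
]
sample_dwells = dwell_times(sample_events)
show_ad_to_u1 = likely_to_stay(sample_dwells, "u1", "news")  # u1 lingers on news
show_ad_to_u2 = likely_to_stay(sample_dwells, "u2", "news")  # u2 flips quickly
```

A production system would likely replace the median heuristic with a trained model and join the result against the ad and programming schedule; the sketch only shows the shape of the data flow.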
Railroad Company:
Real-time View of State of Track
• Optimize the track and train maintenance
• Large volume and granularity of track data
• GeoSpatial analytics is critical
Spark Trends
Implications for the Enterprise
Data API
Enterprise Ready / “Hardened”
Data Science is still the Frontier
ETL, Streaming, Reporting, Analytics
Must Integrate into Existing Environments
A Critical Tool in the Enterprise Tool Box
The Data API
HA, DR, Tooling, Debugging, Operations
Security, Encryption, Governance Models
Scale
Implications of Enterprise-Ready / “Hardened”
Agile Analytics & Data Science
Need to Democratize
Easy and Better Tooling
Train and Encourage More People to Join Us
Hortonworks Strategy for Enterprise Spark at Scale
Agile Analytics & Data Science
Accelerate Capabilities for the Enterprise
Innovate at the Core
Stay tuned… March 1
Thank You!
Shaun Connolly
@shaunconnolly

Spark Summit Keynote by Shaun Connolly
