Big data as a service


  • 1. Track 2: Big Data as a Service 11:50 A.M. – 12:35 P.M.
  • 2. SPEAKERS INCLUDE:
    • Chris Layton, HPC Systems Administrator, National Center for Computational Sciences, Oak Ridge National Laboratory
    • Xavier Hughes, Chief Innovation Officer, Dept. of Labor
    • Dr. Dave Bauer, Chief Scientist, Data Tactics
    • John Kreisa, VP of Strategic Marketing, Hortonworks
    • Moderator: Toan Do, Director, Intelligence Programs, Red Hat
  • 3. Daniel Ricciuto (left) and Peter Thornton (right) using the Exploratory Data analysis ENvironment (EDEN) to visually explore multiple Community Land Model (CLM) simulation data sets. In particular, Ricciuto and Thornton are analyzing sensitivities in the Amazonia region using the interactive visual analytics in EDEN on EVEREST's Planar display.
  • 4. Chad Steed using EDEN on EVEREST to explore 1000 CLM4 simulations (81 parameters and 7 output variables) on the previous version of the EVEREST display wall.
  • 5. Big & open data provides an opportunity for external partners to help meet our mission and goals. Triumph through crowd-sourcing. Innovation through collaboration.
  • 6. A Traditional Approach Under Pressure
    APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
    DATA SYSTEM: RDBMS, EDW, MPP (REPOSITORIES)
    SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
    2.8 ZB in 2012; 40 ZB by 2020; 85% from New Data Types; 15x Machine Data by 2020 (Source: IDC)
    © Hortonworks Inc. 2013
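
The two volume figures on slide 6 imply a growth rate that the slide does not state. The short Python sketch below works it out, assuming both numbers (2.8 ZB in 2012, 40 ZB by 2020) measure the same quantity and that growth compounds evenly each year. Neither assumption comes from the slide, and the function and variable names are illustrative only.

```python
# Figures quoted on slide 6 (attributed there to IDC).
START_YEAR, START_ZB = 2012, 2.8
END_YEAR, END_ZB = 2020, 40.0


def implied_annual_growth(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1 / years) - 1


# Overall multiple between the two data points.
growth_factor = END_ZB / START_ZB

# Assumption: smooth compounding over the 8 years between the two figures.
rate = implied_annual_growth(START_ZB, END_ZB, END_YEAR - START_YEAR)

print(f"Total growth: {growth_factor:.1f}x")
print(f"Implied annual growth: {rate:.1%}")
```

Under those assumptions the data volume grows about 14-fold overall, or roughly 39% per year.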
  • 7. Most Common NEW TYPES OF DATA
    1. Sentiment: Understand how your customers feel about your brand and products, right now
    2. Clickstream: Capture and analyze website visitors' data trails and optimize your website
    3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
    4. Geographic: Analyze location-based data to manage operations where they occur
    5. Server Logs: Research logs to diagnose process failures and prevent security breaches
    6. Unstructured (txt, video, pictures, etc.): Understand patterns in files across millions of web pages, emails, and documents
    + Keep existing data longer!
  • 8. An Emerging Data Architecture
    APPLICATIONS: New Custom Applications, Business Analytics, Packaged Applications
    DEV & DATA TOOLS: BUILD & TEST
    OPERATIONAL TOOLS: MANAGE & MONITOR
    DATA SYSTEM: RDBMS, EDW, MPP (REPOSITORIES)
    SOURCES: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
  • 9. Federal Government & Big Data
    • Law Enforcement/Security
      – Store and process biometric identification for individuals
      – Multi-modal ID increases accuracy, but requires more data storage and parallel processing for distinct matching algorithms: Facial Recognition, Fingerprints, Voice, Gait
    • Environmental Protection Agency (EPA)
      – Capture machine-generated data to monitor air, water & land quality
      – Combine sensor data and social media / sentiment analysis
    • Social Security Administration (SSA)
      – Finding fraudulent claims for benefits using big data analysis to look for patterns of fraudulent behavior
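
Slide 9 notes that multi-modal ID needs parallel processing across distinct matching algorithms. The sketch below illustrates that idea only; it does not describe any agency's system. The `similarity` function is a toy placeholder for real face, fingerprint, voice, and gait matchers. The feature vectors are made up, and averaging the per-modality scores is just one simple fusion rule chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def similarity(probe, enrolled):
    """Toy similarity in [0, 1]: 1 minus the mean absolute difference.

    Placeholder only; real biometric matchers are far more involved.
    """
    return 1.0 - mean([abs(p - e) for p, e in zip(probe, enrolled)])


# Hypothetical registry: one matcher per modality named on the slide.
# Each would be a distinct algorithm in practice; here they share a stub.
MATCHERS = {
    "face": similarity,
    "fingerprint": similarity,
    "voice": similarity,
    "gait": similarity,
}


def match_all(probe, enrolled):
    """Run every modality's matcher concurrently, then average the scores."""
    with ThreadPoolExecutor(max_workers=len(MATCHERS)) as pool:
        futures = {
            name: pool.submit(matcher, probe[name], enrolled[name])
            for name, matcher in MATCHERS.items()
        }
        scores = {name: future.result() for name, future in futures.items()}
    return mean(list(scores.values())), scores


# Made-up feature vectors with values in [0, 1].
probe = {"face": [0.2, 0.8], "fingerprint": [0.5, 0.5],
         "voice": [0.1, 0.9], "gait": [0.4, 0.6]}
enrolled = {"face": [0.2, 0.7], "fingerprint": [0.5, 0.4],
            "voice": [0.3, 0.9], "gait": [0.4, 0.6]}

fused, scores = match_all(probe, enrolled)
print(scores)
print(f"Fused score: {fused:.3f}")
```

Threads keep the example short. CPU-bound matchers at the scale the slide describes would more plausibly run as separate processes or on a cluster.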