
CWIN17 Frankfurt / Cloudera

Published on

Cloudera


  1. © Cloudera, Inc. All rights reserved. Connected Services. Stefan Lipp / Jochen Faltermeier. CWIN 2017 - Frankfurt
  2. Cloudera at a glance. Customer success (large enterprises fueling growth): 48% customer growth, 140%+ net expansion (last 4 years); Global 8000 customers; expansion driven by data and new use cases. Open partner network (best-of-breed solutions): 3000+ partners; vast ecosystem of solution & service providers. First to market (open source innovation): founded 2008; 1600+ Clouderans; global team doing business in 28 countries; big data innovators from Google, Yahoo and Oracle.
  3. The data-driven enterprise. Explosion of data and devices (IoT): 30B connected devices, 440x more data. Transformation of IT infrastructure: open source, cloud, machine learning. $200B total market (source: IDC Worldwide Big Data and Business Analytics Market Through 2020).
  4. We believe data can make what is impossible today, possible tomorrow
  5. We empower people to transform complex data into clear and actionable insights: DRIVE CUSTOMER INSIGHTS | CONNECT PRODUCTS & SERVICES (IoT) | PROTECT BUSINESS
  6. We deliver the modern platform for machine learning and analytics optimized for the cloud. RUNS ANYWHERE: cloud, multi-cloud, on-premises. SCALABLE: elastic, cost-effective, lower TCO. ENTERPRISE GRADE: secure, performant, compliant.
  7. Cloudera powering data-driven customers. DRIVE CUSTOMER INSIGHTS: delivering greater value through improved customer understanding. CONNECT PRODUCTS & SERVICES (IoT): powering predictive analytics to increase performance and reduce fleet downtime. PROTECT BUSINESS: creating new revenue streams with an advanced anti-fraud solution.
  8. Introduction. Navistar is a leading manufacturer of commercial trucks, buses, defense vehicles and engines. Since 1831, our history has been interwoven with some of the most defining moments in world history. Whether it was America's westward expansion or WWII, we were there, pushing the limits of what's possible and driving history forward. But that doesn't mean we're stuck in the past. We're determined to keep delivering smart, sustainable technologies - because we believe that innovation defines America's future, too.
  9. The Data Challenge & Pre-Hadoop Challenge. In late 2013, Navistar launched OnCommand™ Connection, part of the OnCommand™ family of fleet management services from Navistar. OnCommand™ Connection takes data feeds from telematics service providers and combines them with meteorological, geographical, engineering, vehicle usage, traffic, historical warranty, service and part inventory data to provide: real-time vehicle performance data streamlined within a single portal; service advisories and scheduling before problems occur; optimized service plans and part delivery to the nearest dealer when problems do occur. We now actively monitor more than 300,000 vehicles and are adding to that total daily.
  10. Using Predictive Maintenance to Improve Performance and Reduce Fleet Downtime • OnCommand Connection collects telematics and geolocation data across the fleet • Reduced maintenance costs to $0.03 per mile from $0.12-$0.15 per mile • Centralizing data from 13 systems with varying frequency and semantic definitions • Real-time visibility of approx. 300,000 trucks in order to improve uptime and vehicle performance MANUFACTURING » SERVICE IMPROVEMENT » PREDICTIVE ANALYTICS » PROCESS IMPROVEMENT
  11. Benefits & Impact. Quantifying Hadoop's impact: By having literally all of our data in one place, we can perform analytics on an ad-hoc basis. Historically, simple questions required months to answer as we built out subject areas and transformed data. Our "Publish" cluster brings certified data to the consumer. We have reduced not only hard-dollar spending on proprietary hardware and expensive disk solutions, but also soft-dollar costs, through the speed at which we deliver answers. We can evaluate "what if" scenarios without the risk of impacting production processes. We can evaluate billions of rows of data and deliver answers in hours, not weeks.
  12. Data/Software > Analytics > Automation > AI is eating the world: "the innovation food chain" (Marc Andreessen). Navistar IR Deck, H1 2017: − Connected services to reduce maintenance cost and improve vehicle uptime − Advanced driver assistance systems and platooning to improve fuel efficiency and safety − Automated record-keeping to enhance driver productivity
  13. CASE STUDY: Connected Car Telematics for Insurance (data-driven process; IoT & connected products). #1 telematics provider with 130 billion miles of driving data collected from black boxes in connected cars. Challenge: • Drive analytics on 12 million miles of driving data collected every hour. Solution: • Telematics solution based on Cloudera to process data from black boxes • Analytics around driving behavior, risks, location, braking patterns, contextual elements and crash information • Provide usage-based insurance services. TELEMATICS » CONNECTED VEHICLES » INSURANCE TELEMATICS » PREDICTIVE ANALYTICS
  15. The IoT Ecosystem & Architecture. Connected Things: analytics at the edge, for immediate response. IoT Gateway: edge processing, edge analytics. IoT Data Storage, Processing & Analytics: centralized IoT analytics (time series data and trends, machine learning, context enrichment, deeper business insights); distributed data processing & analytics (cloud and on-premises). Diagram labels: Data Center, Cloud, IoT Analytics, Enterprise Data Sources. Combining sensor data with contextual data is the key to value creation from IoT.
  17. The Cloudera Platform for IoT – Data Mgmt. Value Chain. Stages: Data Sources (connected things / data sources, structured data sources), Data Ingest, Data Storage & Processing, Serving, Analytics & Machine Learning. ENTERPRISE DATA HUB components: Apache Kafka: stream or batch ingestion of IoT data. Apache Sqoop: ingestion of data from relational sources. Apache Hadoop: storage (HDFS) & deep batch processing. Apache Kudu: storage & serving for fast-changing data. Apache HBase: NoSQL data store for real-time applications. Apache Spark: stream & iterative processing, ML. Apache Impala: MPP SQL for fast analytics. Cloudera Search: real-time search. Across all stages: security, scalability & easy management. Deployment flexibility: datacenter, cloud.
  18. Cloudera for IoT – Key Innovations / Differentiators. Kudu (real-time analytics): ideal for real-time analytics on IoT and time series data; simplifies Lambda architectures for running real-time analytics on streaming data. Shared Data Experience (SDX): preserve business flexibility and data portability and minimize cloud lock-in by running in any one of the three major public cloud providers or in private cloud. Data Science Workbench: collaborative hub for enterprise data science and an integrated development environment for running Python, R, & Scala with support for Spark.
  19. Kudu – Fast Analytics on Fast Data. Real-time use cases that fall between HDFS and HBase were difficult to manage. HDFS: fast scans, analytics and processing of stored data; arbitrary storage (active archive). HBase: fast on-line updates & data serving. Kudu: fast analytics (on fast-changing or frequently-updated data). Diagram labels: Pace of Data (Unchanging, Append-Only, Fast Changing, Frequent Updates); Pace of Analysis (Real-Time); Analytic Gap; Complex Hybrid Architectures.
  20. Cloudera Enterprise: the modern platform for machine learning and analytics optimized for the cloud. CORE SERVICES: DATA ENGINEERING | DATA SCIENCE | ANALYTIC DATABASE | OPERATIONAL DATABASE. EXTENSIBLE SERVICES. SHARED DATA EXPERIENCE: DATA CATALOG | INGEST & REPLICATION | SECURITY | GOVERNANCE | WORKLOAD MANAGEMENT. SHARED STORAGE: S3 | ADLS | HDFS | KUDU.
  21. SHARED DATA EXPERIENCE: open platform services. Built for multi-function analytics | Optimized for cloud • Unified security – protects sensitive data with consistent controls, even for transient and recurring workloads • Consistent governance – enables secure self-service access to all relevant data and increases compliance • Easy workload management – increases user productivity and boosts job predictability • Flexible ingest and replication – aggregates a single copy of all data, provides disaster recovery, and eases migration • Shared catalog – defines and preserves structure and business context of data for new applications and partner solutions
  22. Support the complete data science workflow: from data to exploration to action. Stages: Data Engineering, Data Science, Deployment. Steps: Acquisition, Curation, Processing, Data Wrangling, Visualization and Analysis, Model Training & Testing, Batch Scoring, Online Scoring, Serving. Outputs: Reports, Dashboards. Data Governance. Dev: collaboration, version control. Ops: deployment, scheduling, orchestration. Shared: data, operations, governance, security, metadata.
  23. Cloudera Data Science Workbench: self-service data science for the enterprise. Accelerates data science from development to production with: ● Secure self-service data access ● On-demand compute ● Support for Python, R, and Scala ● Project dependency isolation for multiple library versions ● Workflow automation, version control, collaboration and sharing
  24. A modern data science architecture ● Built on Docker and Kubernetes ● Runs on dedicated gateway nodes ● User sessions run in isolated "engine" containers which: ○ Host Kerberos-authenticated Python/R/Scala runtimes ○ Interact with Spark via YARN client mode (driver runs in container, workers on CDH) ● Single-cluster only (for now). Diagram labels: Cloudera Manager; gateway nodes running CDSW (Master, Engines, ...); CDH nodes (Hive, HDFS, ...).
  25. Accelerated deep learning on-demand with GPUs: multi-tenant GPU support on-premises or cloud. "Our data scientists want GPUs, but we can't find a way to deliver multi-tenancy. If they go to the cloud on their own, it's expensive and we lose governance." ● Extend existing CDSW benefits to GPU-optimized deep learning tools ● Schedule & share GPU resources ● Train on GPUs, deploy on CPUs ● Works on-premises or cloud. Diagram labels: Data Science Workbench (GPU, CPU): single-node training; CDH (CPU): distributed training, scoring.
  26. An open ecosystem for agility and innovation. Diagram labels: Open Ecosystem, Black Box.
  27. Run anywhere. Deploy any way. Simple | Unified | Enterprise | Proven at scale | Trusted security | Hybrid or multi cloud | Platform-as-a-Service | Simplifies operations | Works with your tools
  28. Real-time analytics or operational analytics? My definition: "apply logic and mathematics in real time on data to improve operations". Model, Analyze, Repeat. Seeking Abnormal Behavior. # Aggregate relational, NoSQL, structured & unstructured data # Accelerate data science from exploration to production using R, Python, Spark and more # Deploy pipelines and models on-premises or in the cloud # Serve real-time data at scale for real-time decision making # Stream processing & analytics on changing operational data
  29. Is it even worth it? HW > Data/Software > Analytics > Automation > AI/ML. Technology food chain, from "Digital or Dead".
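
Slide 10 gives the maintenance cost before ($0.12-$0.15 per mile) and after ($0.03 per mile) but not the relative saving. The arithmetic can be sketched in Python as below. The per-mile figures come from the slide; the annual mileage is a hypothetical value chosen only to illustrate scale and does not appear in the talk.

```python
from fractions import Fraction


def relative_reduction(before, after):
    """Return the fractional cost reduction going from `before` to `after`."""
    return (before - after) / before


# Figures from slide 10, in US dollars per mile (exact fractions avoid
# floating-point noise in the percentages).
AFTER = Fraction(3, 100)         # $0.03
BEFORE_LOW = Fraction(12, 100)   # $0.12
BEFORE_HIGH = Fraction(15, 100)  # $0.15

low = relative_reduction(BEFORE_LOW, AFTER)
high = relative_reduction(BEFORE_HIGH, AFTER)

# ASSUMPTION: hypothetical annual mileage per vehicle, not from the slides.
ASSUMED_MILES_PER_YEAR = 100_000
saving_low = (BEFORE_LOW - AFTER) * ASSUMED_MILES_PER_YEAR
saving_high = (BEFORE_HIGH - AFTER) * ASSUMED_MILES_PER_YEAR

print(f"reduction: {float(low):.0%} to {float(high):.0%}")
print(f"saving per vehicle-year: ${float(saving_low):,.0f} to ${float(saving_high):,.0f}")
```

Under these figures the reduction works out to between 75% and 80% per mile; the dollar saving per vehicle depends entirely on the assumed mileage.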

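Slides 15 and 28 talk about combining sensor data with contextual data and about "seeking abnormal behavior" in changing operational data, without showing how. One possible reading is sketched below in plain Python: a rolling z-score per vehicle flags unusual readings, and each alert is enriched from a context table. The field names, the context table, the window size and the threshold are all assumptions made for illustration. None of them come from the talk or from any Cloudera API, and a production system would presumably read from a stream (for example Kafka) rather than a list.

```python
from collections import deque
from statistics import mean, pstdev

# ASSUMPTION: hypothetical context table standing in for an enterprise data source.
VEHICLE_CONTEXT = {
    "truck-1": {"model": "model-A", "home_dealer": "dealer-17"},
}


def enrich(reading, context):
    """Return a copy of the reading merged with its vehicle's context."""
    return {**reading, **context.get(reading["vehicle_id"], {})}


class RollingZScore:
    """Flags values far from the mean of the last `window` values."""

    def __init__(self, window=5, threshold=3.0):
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def is_abnormal(self, value):
        abnormal = False
        # Only judge once the window is full, so early readings never alert.
        if len(self.values) == self.values.maxlen:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            abnormal = sigma > 0 and abs(value - mu) / sigma > self.threshold
        self.values.append(value)
        return abnormal


def detect(readings, context, window=5, threshold=3.0):
    """Return enriched readings whose temperature is abnormal per vehicle."""
    detectors = {}
    alerts = []
    for reading in readings:
        vehicle_id = reading["vehicle_id"]
        if vehicle_id not in detectors:
            detectors[vehicle_id] = RollingZScore(window, threshold)
        if detectors[vehicle_id].is_abnormal(reading["coolant_temp_c"]):
            alerts.append(enrich(reading, context))
    return alerts


# Made-up sample stream: one clear spike after five steady readings.
temps = [90, 91, 89, 90, 91, 120, 90]
readings = [{"vehicle_id": "truck-1", "coolant_temp_c": t} for t in temps]
alerts = detect(readings, VEHICLE_CONTEXT)
print(alerts)
```

Keeping one detector per vehicle means a truck is compared only with its own recent history. That is a simplification; a fleet-wide model trained on historical data may be closer to what the slides have in mind.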