Hadoop and Manufacturing

Information-Driven
Manufacturing
Capture Value from Manufacturing Data with an Enterprise
Data Hub
Speaker name // Speaker title

2© Cloudera, Inc. All rights reserved.
Trends in Manufacturing
Everything that can be
measured will be measured.
Only increasing...
Continuous Improvement in
cost and efficiency in all areas of
manufacturing operation
Now, more than ever, Quality is
a top concern both from
consumer, dealer and
regulatory standpoint
Instrumentation Efficency Quality
NEED BETTER PICTURE

Manufacturers are collecting data at an
exponential rate, yet struggle to derive value
from all that data...

:
Manufacturing Enterprise Data HUB
Provides the ability to store, analyze all
the data and quickly uncover new
insights, derive value to all phases of the
process from initial design to final
delivery.

Manufacturing Enterprise Data Hub Overview
Keep all the data, whether its
people generated, machine
generated or external.
Statistical and machine learning
analyses using advanced
analytic tools on all the data
(Spark, R, Python,SAS, Matlab)
Access to all the data from the
enterprise and manufacturing at
your fingertips, consolidate silos
(Self Service BI, Search)
Keep all the data Advanced Analytics Leverage all the data

Where Is the Manufacturing Data?
Mapping and Consolidation Are the Tip of the Iceberg for Big Data
Devices &
Sensors
• Device Readings
• Device Performance
• Device Diagnostics
• Battery / Power
Consumption
• Software Logs
• Environmental
Interactions
• R&D
• Quality / Testing
Plant &
Operations
• MES
• Sensors
• Video / Surveillance
• Line Productivity
• Machines
• Staffing / Scheduling
• Quality data
Supply Chain &
Inventory
• ERP
• Supplier / Manufacturer
• Orders / Receivables
• Commodity Supplies /
Prices
• Chargebacks
• Scorecards
• Delivery Metrics
Marketing
& CRM
• Transactions
• Accounts
• Warranties /
Aftermarket
• Customer Service Logs
• Campaigns /
Promotions
• Website / SEO
• Affiliates / Merchants
• Surveys
• Competitive
Intelligence
Public & Trade
• Market Intelligence
• Policy / Regulation
• Demographic / Census
• Psychographic
• Inflation / Macroeconomic
• Gas Prices
• Labor Statistics
• Social / Search
• Public Health Data
• Clinical Studies
• Store Schematics
• Journals / Editorial
• Seismic / Speculation

A Traditional Architecture: What have we tried
Access Data Experiment FastAnalyze Data
Enterprise Data Warehouse
ImplementData Sources
ETLStructured
Unstructured
Ingest
Storage #1, 2, N
ELT
Store & Process
Traditional Architecture
EDW
Archive
ETL
Access Data
Analyze Data
Search
Serve
Serve
Serve
Optimize
Implement
Custom
Application
Point
Solution
ELT
ELT
Statistical
Machine
Learning
SQL
Filter?
Filter?
Filter?
Filter?
Machine Data
Ingest

ETLStructured
Unstructured
Ingest
Storage #1, 2, N
ELT
Store & Process
Traditional Architecture
EDW
Archive
ETL
Access Data
Analyze Data
Search
Serve
Serve
Serve
Optimize
Implement
Custom
Application
Point
Solution
ELT
ELT
Statistical
Machine
Learning
SQL
Challenges with Traditional Architectures
1) Limited Data 2) Long Time to Value
1
2
2
3) Sub Optimal Decisions
3
Filter? Filter?
Filter?
Filter?
Machine Data
Ingest
Filter?

The New Way Forward
1) Unlimited Data Access 2) Reduce Time to Value 3) Decision on all data
ELT
Store & Process
Modern Architecture
Access Data
Analyze Data
Optimize
Implement
Custom
Application
Point
Solution
Statistical
Machine
Learning
SQL
Structured
Unstructured
EDW
ETL
Serve
ETL
Active
Ingest
Ingest
EDH
Archive
Load
Cloudera
ELT
3
2
2
3
1
Search
Machine Data
ETL
Active
Ingest

10© Cloudera, Inc. All rights reserved.©2014 Cloudera, Inc. All rights
reserved.
Overview on Data Flow in Cloudera EDH
10
3rd party or
public
Network
Equipment
Traditional
RDMBS
EDW
Ffffffffff
Event Based, Near Real Time
• Flume
• Spark Streaming
• Kafka (coming soon)
SQL / Relational
• Sqoop – SQL Import including
Metadata
Web Services/API/Cmd line
• Put/Store Copy/Move files
• NFS Gateway
HUE Web GUI
• User Upload
• User Copy/Move/Rename
Third Part Integrations
Ingest/Storage Process/Transformation
WORKLOAD MANAGEMENT / Yarn (Resource Management) & Oozie (Workflow Engine)
Hadoop File System (HDFS) / Distributed File Storage
ELT, ETL, Transform, Cleanse, Pre-aggregate,
analyze etc.
SQL / Relational
• Hive (Batch SQL)
• Impala (Interactive SQL)
Map Reduce – Java based distributed
processing
• Machine Learning libraries
• Pig – scripting language to perform Map
Reduce
Spark – In Memory distributed processing
• Java, Scala or Python
Third Party Integrations
SQL / Relational (ODBC/JDBC)
• Hive (Batch SQL)
• Impala (Interactive SQL)
Web Services/API/Cmd line
• Get/Move files
• NFS Gateway
HUE Web GUI
• User Download
• User Copy/Move/Rename
Search Index
• Solr Search, full featured with Facets,
NLP, etc.
Third Part Integrations
Raw Data Insight and Value to User
Publish/Consume

AUTHENTICATION
Guarding access to the
system, its data, and its
various systems
LDAP
Kerberos RPC
PROTECTION
Encryption for data at
rest or in motion with
full key management
Cloudera Navigator:
Encrypt & Key Trustee
AUTHORIZATION
Controlling who or
what has access to a
resource or service
POSIX Permissions
Apache Sentry
AUDIT
Capture a complete
and immutable record
of all activity
Cloudera Navigator
SIEM Tools
Security Important?
Cloudera Enterprise Data Hub provides Enterprise-Grade Security, Audit and
Regulatory Compliance
Governing Access to and Management
of All Data-at-Rest and Data-in-Motion
• Cloudera Manager and Navigator
automate protections for Hadoop and
related projects
• Perimeter security
• Role-based access control
• The only complete policy-based
management of sensitive data
• Data lineage and discoverability

Core Benefits of a Manufacturing Enterprise Data Hub
©2014 Cloudera, Inc. All rights reserved.
• Full-Fidelity Active Archive
• Any and All Kinds of Data
• Accelerate Time to Insight (Scale)
• Unlock Agility and Exploration
• Consolidate Silos for 360o View
• Enable Pervasive Analytics across the
entire Value Chain (Design to Post
Sales Delivery and Warranty)

What value is there in Manufacturing Data Hub?
• What product issues are paramount?
• What are technology trends?
• Efficient Parts Utilization—what is the best
part for my design?
• Is all my machine data being utilized?
Design, R&D, PD, Engineering
Hadoop
Cloudera
Secure
Scalable
Flexible
Open
Production, Quality, Manufacturing
• Diagnose Production problems
• What is the cause? People, Parts, Process,
Suppliers?
• Plant inventory
• Resource utilization
• Is all my shop floor data being analyzed?
Supply Chain, Purchasing
• Who are my best Suppliers?
• Who are my worst Suppliers?
• Consolidated view of the Supply Chain?
• Supply Chain disruption impact analysis?
• Consolidated Purchasing (360 Supplier view)
Manufacturing Data Hub
Delivery, Warranty, Support, Service
• Review Customer 360
• Analyze Product Launch information
• Detect Emerging Warranty Issues
• Decrease Correction Times
• Increased Accuracy of Warranty Forecast
• Knowledge base for After Delivery Service
Ask Bigger Questions of all the data

Customer Story

About Vehicle Manufacturer
What do we do -
Manufacture, Sell and Service Vehicles
Who is this Manufacturer
A worldwide leading Manufacturer of Vehicles

Our Objectives
Store and Analyze worldwide data
from Dealers, Customers and Vehicles
Better, Deeper
Analysis
Smarter Predictions, Earlier Detection

The Pre-Hadoop Environment
Parts
Suppliers
Dealers
1 Difficult to connect to multiple sources
1
BI/RDBMS/DW
Challenge
Claims
Machine Data
IDLE
Vehicle Data
WHY?
• Volume
• Too much to store, let alone
query
• Variety
• Different formats, not all Table
Based data
?
?
?

The Pre-Hadoop Environment
Parts
Suppliers
Dealers
2 Impossible to analyze all that data
2
BI/RDBMS/DW
Another Challenge
Claims
Machine Data
IDLE
Vehicle Data
WHY?
• It wasn’t even in one system.
• Different workloads (Macro vs.
Micro Analysis)
Advanced
Analytics

Vehicle Manufacturer Modern Hadoop Architecture
Complete storage of data
(structured and unstructured)
1
Improvements
Process
1
Store
HDFS, HBase
Claims
Machine Data
IDLE
Vehicle Data
Sqoop
Flume, KafkaCopy (XML)

Process Data as needed2
Improvements
Process
2
Store
HDFS, HBase
Claims
Machine Data
IDLE
Vehicle Data
Sqoop
Flume, KafkaCopy (XML)
MR,Pig, Spark, ETL Tools on Hadoop
Process
1

Analysis and Large Scale Ad-Hoc
Queries
3
Improvements
Process
3
Store
HDFS, HBase
Claims Machine Data
IDLE
Vehicle Data
Sqoop Flume, KafkaCopy (XML)
MR,Pig, Spark, ETL Tools on Hadoop
Process
HUE
Discover
Impala Solr
Access
BI
Hive
1

Analysis and Large Scale Ad-Hoc
Queries
3
Improvements
Process
4
Store
HDFS, HBase
Claims Machine Data
IDLE
Vehicle Data
Sqoop Flume, KafkaCopy (XML)
MR, Pig, Spark, ETL Tools on Hadoop
Process
HUE
Discover
Impala Solr
Access
BI
Hive
1
R MLlibSpark
4
Advanced Analytics

Business and Technical ROI
Technology ROI
Business ROI
Proactive Quality Assurance
Build machine learning algorithms that identify production anomalies prior to field testing and find
performance flaws that could not be identified in R&D.
Predictive Intervention
Combine data streaming from machine data (vehicles, plant floor), diagnostics, and
product/engineering data to proactively avoid or address issues and deploy upgrades.
Merge together storage systems for simpler management – Active Archive – Retire Legacy Systems
Unified access to disparate, Siloed data – Retire single use systems
Scale affordably – Grow without destroying the budget
Flexible and Agile – IT can focus on Solutions for the Business vs. being a Data Plumber

24© Cloudera, Inc. All rights reserved. 24
What happened at the parts supplier that caused a spike in support calls
during the past 30 minutes for devices manufactured in Birmingham?
How many devices were returned last month?
Reduce the time to QC issue resolution from weeks to hours.
Drive $15 to $25 million annual savings for each manufacturer.
© 2014 Cloudera, Inc. All rights reserved.

Can we predict which chips have the highest likelihood of failure
and intervene to proactively prevent manufacturing issues?
Which chips most commonly failed last week?
Analyses now executable on hundreds of thousands of units in just seconds.
60x faster data reload and 300% query speedup enable real-time debugging.
25© 2014 Cloudera, Inc. All rights reserved.

Thank you

Cloudera Snapshot
Founded 2008, by former employees of
Employees Today ~ 850
World Class Support 24x7 Global Staff
Pro-active & Predictive Support Programs
Mission Critical Thousands of Enterprise Users
Over 500+ Paying Subscription Customers
The Largest Ecosystem Over 1450+ Partners
Cloudera University Over 100,000+ Trained
Open Source Leaders Cloudera Employees are Leading Developers & Contributors
Total Capital Raised $1B+ (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission Help Organizations Leverage the Power of
All Their Data to Ask Bigger Questions.

Expanding Data Requires A New Approach
What we do
Copy Data to Applications
What we should do
Bring Applications to Data
Data
Information-centric
businesses use all Data:
Multi-structured,
Internal & external data
of all types
App
App
App
Process-centric
businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
• Multiple copies of data
App
App
App
Data
Data
Data
Data

Hadoop Changes the Game: Storage & Compute Together
©2014 Cloudera, Inc. All rights
The Hadoop WayThe Old Way
$30,000+ per TB
Expensive & Unattainable
• Hard to scale
• Network is a bottleneck
• Only handles relational data
• Difficult to add new fields & data types
Expensive, Special purpose, “Reliable” Servers
Expensive Licensed Software
Network
Data Storage
(SAN, NAS)
Compute
(RDBMS, EDW)
$300-$1,000 per TB
Affordable & Attainable
• Scales out forever
• No bottlenecks
• Easy to ingest any data
• Agile data access
Commodity “Unreliable” Servers
Hybrid Open Source Software
Compute
(CPU)
Memory Storage
(Disk)
z
z

Enabling the “App Store” of Big Data (Large Ecosystem)
Data
Systems
Enterprise Data Hub
Security and Administration
Unlimited Storage
Process Discover Model Serve
Applications
System Integration
Infrastructure
More than 1,450 partners
ensure compatibility with existing
investments, lower skill barriers, and
help maximize value from your data.Operational
Tools

WEB/MOBILE APPLICATIONS
ONLINE SERVING
SYSTEM
ENTERPRISE DATA
WAREHOUSE
ENTERPRISE
REPORTINGBI / ANALYTICSMACHINE
LEARNING
CONVERGED
APPLICATIONS
CLOUDERA
MANAGER
META DATA /
ETL TOOLS
ENTERPRISE DATA HUB
The Modern Information Architecture
Data Architects System Operators Engineers Data Scientists Analysts Business Users
Customers & End Users
SYS LOGS WEB LOGS FILES RDBMS

32© Cloudera, Inc. All rights reserved.©2014 Cloudera, Inc. All rights
A High Level View of the Journey
Not
Only
SQL
Agile
Exploration
ETL
Acceleration
Operational Efficiency
(Faster, Bigger, Cheaper)
Transformative Applications
(New Business Value)
Cheap
Storage
BusinessIT
EDW
Optimization
Pervasive
Analytics

Hadoop and Manufacturing

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (18)

Similar to Hadoop and Manufacturing

Similar to Hadoop and Manufacturing (20)

More from Cloudera, Inc.

More from Cloudera, Inc. (20)

Recently uploaded

Recently uploaded (20)

Hadoop and Manufacturing

Editor's Notes