1
Bringing Hadoop to a production-ready state
Eddie Garcia - Security Architect, Office of the CTO
2
Agenda
• The Future of Data Management
• Identifying your first Hadoop project
• Common Hadoop project challenges
• Ensuring success from POC to Production
©2014 Cloudera, Inc. All rights reserved.
3
The Future of Data Management
The Enterprise Data Hub
4
Expanding Data Requires a New Approach
1980s: Bring Data to Compute
Process-centric businesses use:
• Structured data mainly
• Internal data only
• “Important” data only
Now: Bring Compute to Data
Information-centric businesses use all data: multi-structured, internal & external data of all types
[Diagram: relative size & complexity of data versus compute, 1980s compared with now]
5
The Old Way: Bringing Data to Compute
Complex Architecture
• Many special-purpose systems
• Moving data around
• No complete views
Cost of Analytics
• Existing systems strained
• No agility
• “BI backlog”
Time to Data
• Up-front modeling
• Transforms slow
• Transforms lose data
Missing Data
• Leaving data behind
• Risk and compliance
• High cost of storage
[Diagram: data sources (ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources) feeding separate systems (EDWs, marts, servers, documents, storage, search, archive)]
6
The New Way: Bringing Compute to Data
1. Active Compliance Archive
• Full fidelity original data
• Indefinite time, any source
• Lowest cost storage
2. Persistent Staging
• One source of data for all analytics
• Persist state of transformed data
• Significantly faster & cheaper
3. Self-Service Exploratory BI
• Simple search + BI tools
• “Schema on read” agility
• Reduce BI user backlog requests
4. Diverse Analytic Platform
• Bring applications to data
• Combine different workloads on common data (e.g. SQL + Search)
• True analytic agility
[Diagram: the same data sources (ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources) and systems (servers, marts, EDWs, documents, storage, search, archive) as in the old way, annotated with callouts 1 to 4]
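The “schema on read” idea listed under Self-Service Exploratory BI can be illustrated with a minimal sketch. It is not taken from the deck: the sample events, the `Schema` alias, and the `read_with_schema` helper are hypothetical names invented for illustration. Raw records are kept exactly as they arrived, and each consumer applies its own schema only when it reads them.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, Optional

# Hypothetical raw events, stored as they arrived (no up-front modeling).
RAW_EVENTS = [
    '{"user": "alice", "amount": "12.50", "ts": "2014-06-01T10:00:00"}',
    '{"user": "bob", "amount": 7, "channel": "web"}',
    'not valid json',
]

# A schema here is a mapping of field name -> function that casts the raw value.
Schema = Dict[str, Callable[[Any], Any]]


def read_with_schema(
    lines: Iterable[str], schema: Schema
) -> Iterator[Dict[str, Optional[Any]]]:
    """Apply a schema while reading; the stored lines are never modified."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparseable lines; the raw copy is still retained
        if not isinstance(record, dict):
            continue
        row: Dict[str, Optional[Any]] = {}
        for name, cast in schema.items():
            raw = record.get(name)
            try:
                row[name] = None if raw is None else cast(raw)
            except (TypeError, ValueError):
                row[name] = None  # value present but not castable
        yield row


# Two consumers read the same raw data through different schemas.
billing_view = list(read_with_schema(RAW_EVENTS, {"user": str, "amount": float}))
channel_view = list(read_with_schema(RAW_EVENTS, {"user": str, "channel": str}))
```

Because the stored lines are untouched, a field that one view ignores (such as `channel`) remains available to a later view, which is the contrast with the up-front modeling and lossy transforms described for the old way.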
7
Identifying your first Hadoop Use Case
8
Usage of Hadoop – current
What are you currently using your Hadoop infrastructure for?
• ETL processing: 71.4%
• Data warehouse off-loading: 57.1%
• Data archival: 42.9%
• Advanced Analytics: 64.3%
• Log management: 14.3%
• Search application: 42.9%
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
9
Usage of Hadoop – next 6-12 months
How do you anticipate using your Hadoop infrastructure in the next 6-12 months?
• ETL processing: 64.3%
• Data warehouse off-loading: 85.7%
• Data archival: 57.1%
• Advanced Analytics: 85.7%
• Log management: 50.0%
• Search application: 50.0%
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
10
Common Hadoop project challenges
11
Percentage of time spent to Production-Ready
• Installation & configuration: 13.73%
• Training / up-skilling people to be familiar with the technology: 13.42%
• User onboarding: 9.82%
• Application development: 18.27%
• Data migration: 16.09%
• Workload migration: 9.00%
• Integrating: 9.60%
• Setting up security and reviewing with infosec: 9.67%
• Testing: 11.00%
• Other: 3.33%
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
12
Blockers to Production-Ready
• Unable to identify the right business use case: 6.38
• Lack of engagement / sponsorship by the business: 13.33
• Deployment challenges (i.e. installation & configuration): 14.38
• Security/Compliance challenges: 19.55
• Data Management (Discovery, Lineage, Lifecycle mgmt): 22.08
• Lack of integration with key ISV tools: 19.29
• Lack of SQL features: 9.44
• Lack of IT skill sets: 14.55
• Lack of ‘Data Science’ skill sets: 12.00
• Other: 14.00
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
13
Top challenges – Security/Compliance
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues w.r.t. Hadoop security/compliance?
Issues rated (bar chart, axis 0.00 to 6.00):
• Data encryption, masking
• PCI compliance
• Lack of unified security model for Hadoop
• Lack of adequate governance
• Audit/Access mgmt
• Kerberos setup & mgmt
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
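The rating scale above treats 1 as “not applicable” and 2 to 6 as increasing difficulty. The deck does not say how the charted scores were computed, so the following is only one plausible way to summarise such responses: average the ratings while leaving NA answers out. The function name and the sample responses are hypothetical.

```python
from statistics import mean
from typing import List, Optional

NA = 1          # "not applicable" on the survey scale
LOWEST = 2      # least challenging
HIGHEST = 6     # most challenging


def average_rating(responses: List[int]) -> Optional[float]:
    """Mean of the 2..6 ratings, ignoring NA; None if nobody gave a rating."""
    for r in responses:
        if r != NA and not LOWEST <= r <= HIGHEST:
            raise ValueError(f"rating out of range: {r}")
    scored = [r for r in responses if r != NA]
    return mean(scored) if scored else None


# Hypothetical responses for a single issue (not survey data).
sample = [1, 2, 6, 4]
kerberos_score = average_rating(sample)
```

Whether NA answers are excluded or counted changes the result noticeably, so any comparison between charts of this kind depends on knowing which convention was used.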
14
Cloudera Enterprise Security
• Perimeter: Guarding access to the cluster itself
  Technical Concepts: Authentication, Network isolation
• Data: Protecting data in the cluster from unauthorized visibility
  Technical Concepts: Encryption, Tokenization, Data masking
• Access: Defining what users and applications can do with data
  Technical Concepts: Permissions, Authorization
• Visibility: Reporting on where data came from and how it’s being used
  Technical Concepts: Auditing, Lineage
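Two of the data-protection concepts named above, data masking and tokenization, can be sketched briefly. This is an illustration only and not how any Cloudera product implements them. The function names are hypothetical, and the keyed hash used here is a simple stand-in for tokenization: production tokenization typically relies on a token vault or format-preserving encryption.

```python
import hashlib
import hmac


def mask_card_number(pan: str, visible: int = 4) -> str:
    """Data masking: hide all but the last `visible` digits of a card number."""
    digits = [ch for ch in pan if ch.isdigit()]
    keep = max(0, min(visible, len(digits)))
    hidden = len(digits) - keep
    return "*" * hidden + "".join(digits[hidden:])


def tokenize(value: str, key: bytes) -> str:
    """Stand-in for tokenization: a keyed, deterministic, one-way token.

    The same value and key always give the same token, so joins and counts
    still work on tokenized columns, but the original value cannot be read back.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical inputs for illustration.
masked = mask_card_number("4111 1111 1111 1111")
token = tokenize("alice@example.com", key=b"demo-key-not-for-production")
```

Masking suits display and reporting, where a partial value is enough; a deterministic token suits analytics, where records must still be matched without exposing the underlying identifier.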
15 
16
Top challenges – Hadoop Operations
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues
Issues rated (bar chart, axis 0.00 to 5.00):
• Hardware procurement
• Deployment on virtualized infrastructure, cloud environments
• Configuration management
• Diagnostics and Troubleshooting
• Platform stability
• SLA management
• Capacity Planning
• Chargeback Modeling
• Integration with other IT Mgmt tools
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
17
Top challenges – Organization issues
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging)
Issues rated (bar chart, axis 0.00 to 6.00):
• Platform stability
• Too entrenched in existing technologies
• Lack of skills - Analytics
• Lack of skills - Operational
• Lack of ROI for business sponsors
• Lack of concrete use cases
• Unable to demonstrate quick wins
• Lack of understanding of Hadoop (and its benefits)
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
18
Top challenges – Data Warehouse off-load
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues
Issues rated (bar chart, axis 0.00 to 4.50):
• Platform stability
• Lack of non-ANSI SQL language extensions from their EDW
• Lack of nested/complex structures (struct, map, array)
• Lack of Resource/Workload management capabilities
• Unable to identify workloads that can be off-loaded from a DW
• Lack of robust ISV tools (for BI, ETL etc.)
• Lack of SQL compliance to ANSI standards
• Lack of data types support (e.g. Decimal, varchar etc.) in Impala/Hive
• Lack of SQL performance
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
19
Top challenges – Hadoop for ETL
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues
Issues rated (bar chart, axis 0.00 to 4.50):
• Unable to leverage/migrate existing ETL pipelines to Hadoop easily
• Lack of robust ISV tools; current tools from vendors like Informatica, Pentaho still immature
• Lack of skills to leverage Pig, MR etc. to build ETL pipelines
• Lack of capabilities in existing tools (Pig, Hive, Crunch, Morphlines etc.) to easily build ETL pipelines
• Platform stability
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
20
Top challenges – Hadoop for Advanced Analytics
Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues
Issues rated (bar chart, axis 0.00 to 4.50):
• Platform stability
• Lack of robust workload/resource management capabilities
• Lack of certified Partner tools from Cloudera
• Lack of robust ISV tools for advanced analytics (including Machine Learning) in Hadoop; existing tools are immature or relatively new (offerings from SAS, Revolution etc.)
• Lack of Data analysis skills
• Lack of certified Partners
*Results are based on an internal Cloudera sampled survey of CDH install-base users.
21
Hadoop POC to Production Challenges
• Invested heavily in learning and ramp-up
• Re-writing logic from scratch to optimize rather than re-using existing routines (involves significant time/resources)
• Took more than 6 months to get Kerberos to work with my existing identity infrastructure
• I need to track the metadata for data flow in/out of Hadoop
• Configuring multi-tenancy/workload management
22
Hadoop POC to Production Challenges
• Time and resources spent researching Analytics, BI, ETL tools available for Hadoop
• Challenges making data in Hadoop clusters available to end business users
• Existing skill sets more tuned towards Data Warehouse
• Some resistance to adoption of newer technologies like Hadoop (from existing IT folks)
23
Ensuring success from POC to Production
Big Data Center of Excellence
24
Manage Disruption Throughout the Enterprise
Optimize the Value of an Enterprise Data Hub as Hadoop Adoption Increases
• SAVINGS: Consolidate knowledge and resources to bolster infrastructure
• SPEED: Increase resource utilization over time and exploit valuable expertise
• STANDARDS: Syndicate best practices and design to assure efficient integration
• STRATEGY: Drive adoption via a solutions engine, not autonomous projects
• SCALE: Systematically learn, develop, deploy, operate, publish, and repeat
25
Optimize Your Hadoop Knowledge Flow
Big Data Is All About Continuous Improvement and Operational Efficiency
Learn → Develop → Deploy → Operate → Publish → REPEAT
26
What Is a Center of Excellence?
Centrally Incubate, Integrate, and Scale Across Lines of Business
Big Data Solutions as a Service
• Knowledge Hub: Best Practices, Expertise
• Solutions Engine: Infrastructure, Clusters
• Dedicated Team: Architects, Administrators, Developers, Data Scientists
27
Why Build a Center of Excellence?
Silos Obstruct Scale, Prevent Flexibility, and Drive Up Costs
• Novice Skill Level
• Redundant Infrastructure
• Incompatible Systems
• Disparate Data
28
Project-Level Benefits of COE
29
Big Data as a Service Drives Success
Coordinate Workloads to Distribute Advantages Widely and Seamlessly
• Insights & Best Practices
• Hadoop Experts
• Broadly Accessible Big Data Technology & Equipment
30
Building Your COE
31
Big Data COE Program Roles
Staff Centrally and Train to Scale
Role groups: Management & Leadership; Business & Data; Technology & Ops
Roles shown: Lead Data Scientist, Lead Business Analyst, Developers, Big Data Visionary, Executive Sponsor, Program Manager, Data Warehouse Admin Specialist, Architects, Data Wranglers, LOB Rep (three)
32
A Five-Phase Development Plan
Work with Hadoop Experts from Planning to Support
Phases: Staffing, Incubation, Operations, Continuity, Expansion
Staffing
• Personnel
• Training
• Mentoring
Activities listed for the remaining phases: Scope, Build, Certify, Deploy, Integrate, Optimize, Use Cases, Real-Time, Document, Cost Model, Support, Sustain
33
Thank You!
Eddie Garcia
eddie@cloudera.com

More Related Content

What's hot

Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hortonworks
 
Oracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureOracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureRiccardo Romani
 
Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...DataWorks Summit
 
Oracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusOracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusAndy Panayiotou
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack EuropeHortonworks
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...jdijcks
 
Creando un Portal Oracle para una Empresa
Creando un Portal Oracle para una EmpresaCreando un Portal Oracle para una Empresa
Creando un Portal Oracle para una Empresaisarmientop
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsjdijcks
 
What the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityWhat the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityCloudera, Inc.
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Pactera_US
 
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks
 
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics ApplicationsHortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applicationsrussell_jurney
 
Oracle Cloud Networking And Security Exposed
Oracle Cloud Networking And Security Exposed Oracle Cloud Networking And Security Exposed
Oracle Cloud Networking And Security Exposed Riccardo Romani
 
Expand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big DataExpand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big Datajdijcks
 
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Cynthia Saracco
 
Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Hortonworks
 

What's hot (20)

Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
Hadoop Operations, Innovations and Enterprise Readiness with Hortonworks Data...
 
Oracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureOracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and Architecture
 
Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...Leveraging advanced technologies to support critical applications in a secure...
Leveraging advanced technologies to support critical applications in a secure...
 
Oracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in CyprusOracle Database Appliance - Introduction in Cyprus
Oracle Database Appliance - Introduction in Cyprus
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
 
Creando un Portal Oracle para una Empresa
Creando un Portal Oracle para una EmpresaCreando un Portal Oracle para una Empresa
Creando un Portal Oracle para una Empresa
 
Highly Automated IT
Highly Automated ITHighly Automated IT
Highly Automated IT
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analytics
 
What the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and VisibilityWhat the Enterprise Requires - Business Continuity and Visibility
What the Enterprise Requires - Business Continuity and Visibility
 
Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration Hortonworks Oracle Big Data Integration
Hortonworks Oracle Big Data Integration
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
 
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
 
Hortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics ApplicationsHortonworks: Agile Analytics Applications
Hortonworks: Agile Analytics Applications
 
Oracle Cloud Networking And Security Exposed
Oracle Cloud Networking And Security Exposed Oracle Cloud Networking And Security Exposed
Oracle Cloud Networking And Security Exposed
 
Expand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big DataExpand a Data warehouse with Hadoop and Big Data
Expand a Data warehouse with Hadoop and Big Data
 
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
 
OpenPOWER Update
OpenPOWER UpdateOpenPOWER Update
OpenPOWER Update
 
Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?
 

Viewers also liked

18364 1 artificial intelligence
18364 1 artificial intelligence18364 1 artificial intelligence
18364 1 artificial intelligenceAbhishek Abhi
 
Artificial Intelligence Chapter two agents
Artificial Intelligence Chapter two agentsArtificial Intelligence Chapter two agents
Artificial Intelligence Chapter two agentsEhsan Nowrouzi
 
Презентация бантиков
Презентация бантиковПрезентация бантиков
Презентация бантиковAkella251
 
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannes
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannesTED Talk: There is Magic in the Future #CannesLions / #OgilvyCannes
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannesOgilvy
 
Interactive Minds July 2009
Interactive Minds July 2009Interactive Minds July 2009
Interactive Minds July 2009ABirkill
 
Earn more money - build your personal brand online
Earn more money - build your personal brand onlineEarn more money - build your personal brand online
Earn more money - build your personal brand onlineKatie McGregor
 
Personal learning styles
Personal learning stylesPersonal learning styles
Personal learning stylesMatthew Ritter
 
Letters from our prophet (saas) (pbuh). Communicating Islam. english
Letters from our prophet (saas) (pbuh). Communicating Islam. englishLetters from our prophet (saas) (pbuh). Communicating Islam. english
Letters from our prophet (saas) (pbuh). Communicating Islam. englishHarunyahyaEnglish
 
The Regional Marketer's Playbook - Asia Pacific - 2016
The Regional Marketer's Playbook - Asia Pacific - 2016The Regional Marketer's Playbook - Asia Pacific - 2016
The Regional Marketer's Playbook - Asia Pacific - 2016Ryan Bonnici
 
Slide show jessie j
Slide show jessie jSlide show jessie j
Slide show jessie jroom24eps
 
Exemple de sisteme cu comportament haotic
Exemple de sisteme cu comportament haoticExemple de sisteme cu comportament haotic
Exemple de sisteme cu comportament haoticDiana Stănescu
 
Drupal as a Data Purveyor, Part I
Drupal as a Data Purveyor, Part IDrupal as a Data Purveyor, Part I
Drupal as a Data Purveyor, Part ITim Hamilton
 

Viewers also liked (20)

18364 1 artificial intelligence
18364 1 artificial intelligence18364 1 artificial intelligence
18364 1 artificial intelligence
 
Data Mining: Key definitions
Data Mining: Key definitionsData Mining: Key definitions
Data Mining: Key definitions
 
Learning agents
Learning agentsLearning agents
Learning agents
 
Learning
LearningLearning
Learning
 
Planning
PlanningPlanning
Planning
 
Artificial Intelligence Chapter two agents
Artificial Intelligence Chapter two agentsArtificial Intelligence Chapter two agents
Artificial Intelligence Chapter two agents
 
Презентация бантиков
Презентация бантиковПрезентация бантиков
Презентация бантиков
 
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannes
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannesTED Talk: There is Magic in the Future #CannesLions / #OgilvyCannes
TED Talk: There is Magic in the Future #CannesLions / #OgilvyCannes
 
Interactive Minds July 2009
Interactive Minds July 2009Interactive Minds July 2009
Interactive Minds July 2009
 
台灣新傳奇
台灣新傳奇台灣新傳奇
台灣新傳奇
 
Setting the group mode in quickmail
Setting the group mode in quickmailSetting the group mode in quickmail
Setting the group mode in quickmail
 
Earn more money - build your personal brand online
Earn more money - build your personal brand onlineEarn more money - build your personal brand online
Earn more money - build your personal brand online
 
2014_HM_Physician_web
2014_HM_Physician_web2014_HM_Physician_web
2014_HM_Physician_web
 
Personal learning styles
Personal learning stylesPersonal learning styles
Personal learning styles
 
BD2K Update
BD2K Update BD2K Update
BD2K Update
 
Letters from our prophet (saas) (pbuh). Communicating Islam. english
Letters from our prophet (saas) (pbuh). Communicating Islam. englishLetters from our prophet (saas) (pbuh). Communicating Islam. english
Letters from our prophet (saas) (pbuh). Communicating Islam. english
 
The Regional Marketer's Playbook - Asia Pacific - 2016
The Regional Marketer's Playbook - Asia Pacific - 2016The Regional Marketer's Playbook - Asia Pacific - 2016
The Regional Marketer's Playbook - Asia Pacific - 2016
 
Slide show jessie j
Slide show jessie jSlide show jessie j
Slide show jessie j
 
Exemple de sisteme cu comportament haotic
Exemple de sisteme cu comportament haoticExemple de sisteme cu comportament haotic
Exemple de sisteme cu comportament haotic
 
Drupal as a Data Purveyor, Part I
Drupal as a Data Purveyor, Part IDrupal as a Data Purveyor, Part I
Drupal as a Data Purveyor, Part I
 

Similar to What it takes to bring Hadoop to a production-ready state

The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataPentaho
 
Oracle Database Lifecycle Management
Oracle Database Lifecycle ManagementOracle Database Lifecycle Management
Oracle Database Lifecycle ManagementHari Srinivasan
 
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...
Big Data Management System: Smart SQL Processing Across Hadoop and your Data ...DataWorks Summit
 
Complement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopComplement Your Existing Data Warehouse with Big Data & Hadoop
Complement Your Existing Data Warehouse with Big Data & HadoopDatameer
 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Part 2: A Visual Dive into Machine Learning and Deep Learning 

Part 2: A Visual Dive into Machine Learning and Deep Learning 
Cloudera, Inc.
 
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Analyzing Hadoop Data Using Sparklyr

Analyzing Hadoop Data Using Sparklyr
Cloudera, Inc.
 
Data Science in the Enterprise
Data Science in the EnterpriseData Science in the Enterprise
Data Science in the EnterpriseThe Hive
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSteven Totman
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Stefan Lipp
 
Contexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti / Oracle - Big Data : From Pilot to Production
Contexti / Oracle - Big Data : From Pilot to ProductionContexti
 
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...TheInevitableCloud
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderainevitablecloud
 
Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationMichael Rainey
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionCloudera, Inc.
 
Strata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applicationsStrata EU tutorial - Architectural considerations for hadoop applications
Strata EU tutorial - Architectural considerations for hadoop applicationshadooparchbook
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightPrecisely
 
Part 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchPart 1: Introducing the Cloudera Data Science Workbench
Part 1: Introducing the Cloudera Data Science WorkbenchCloudera, Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 

What it takes to bring Hadoop to a production-ready state

  • 1. Bringing Hadoop to a production-ready state. Eddie Garcia - Security Architect, Office of the CTO
  • 2. Agenda • The Future of Data Management • Identifying your first Hadoop project • Common Hadoop project challenges • Ensuring success from POC to Production ©2014 Cloudera, Inc. All rights reserved.
  • 3. The Future of Data Management: The Enterprise Data Hub
  • 4. Expanding Data Requires A New Approach • 1980s: Bring Data to Compute • Now: Bring Compute to Data • Relative size & complexity • Information-centric businesses use all data: multi-structured, internal & external data of all types • Process-centric businesses use: structured data mainly; internal data only; “important” data only
  • 5. The Old Way: Bringing Data to Compute. Complex Architecture • Many special-purpose systems • Moving data around • No complete views Cost of Analytics • Existing systems strained Time to Data • Up-front modeling • Transforms slow • Transforms lose data Missing Data • Leaving data behind • Risk and compliance • High cost of storage • No agility • “BI backlog”. Diagram labels: EDWs, marts, servers, documents, storage, search, archive; ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources
  • 6. The New Way: Bringing Compute to Data. Diverse Analytic Platform • Bring applications to data • Combine different workloads on common data (i.e. SQL + Search) • True analytic agility Active Compliance Archive • Full fidelity original data • Indefinite time, any source • Lowest cost storage Persistent Staging • One source of data for all analytics • Persist state of transformed data • Significantly faster & cheaper Self-Service Exploratory BI • Simple search + BI tools • “Schema on read” agility • Reduce BI user backlog requests. Diagram labels: servers, marts, EDWs, documents, storage, search, archive; ERP, CRM, RDBMS, machines; files, images, videos, logs, clickstreams; external data sources
  • 7. Identifying your first Hadoop Use Case
  • 8. Usage of Hadoop – current. What are you currently using your Hadoop infrastructure for? Categories: ETL processing; Data warehouse off-loading; Data archival; Advanced Analytics; Log management; Search application. Chart values: 71.4%, 57.1%, 42.9%, 64.3%, 14.3%, 42.9%. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 9. Usage of Hadoop – next 6-12 months. How do you anticipate using your Hadoop infrastructure in the next 6-12 months? Categories: ETL processing; Data warehouse off-loading; Data archival; Advanced Analytics; Log management; Search application. Chart values: 64.3%, 85.7%, 57.1%, 85.7%, 50.0%, 50.0%. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 10. Common Hadoop project challenges
  • 11. Percentage of time spent to Production-Ready. Categories: Installation & configuration; Training / up-skilling people to be familiar with the technology; User onboarding; Application development; Data migration; Workload migration; Integrating; Setting up security and reviewing with infosec; Testing; Other. Chart values: 13.73, 13.42, 9.82, 18.27, 16.09, 9.00, 9.60, 9.67, 11.00, 3.33. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 12. Blockers to Production-Ready. Categories: Unable to identify the right business use case; Lack of engagement / sponsorship by the business; Deployment challenges (i.e. installation & configuration); Security/Compliance challenges; Data Management (Discovery, Lineage, Lifecycle mgmt); Lack of integration with key ISV tools; Lack of SQL features; Lack of IT skill sets; Lack of ‘Data Science’ skill sets; Other. Chart values: 6.38, 13.33, 14.38, 19.55, 22.08, 19.29, 9.44, 14.55, 12.00, 14.00. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 13. Top challenges – Security/Compliance. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues w.r.t. Hadoop security/compliance? Categories: Data encryption, masking; PCI compliance; Lack of unified security model for Hadoop; Lack of adequate governance; Audit/Access mgmt; Kerberos setup & mgmt. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 14. Cloudera Enterprise Security • Perimeter: Guarding access to the cluster itself. Technical Concepts: Authentication, Network isolation • Data: Protecting data in the cluster from unauthorized visibility. Technical Concepts: Encryption, Tokenization, Data masking • Access: Defining what users and applications can do with data. Technical Concepts: Permissions, Authorization • Visibility: Reporting on where data came from and how it’s being used. Technical Concepts: Auditing, Lineage
  • 16. Top challenges – Hadoop Operations. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues. Categories: Hardware procurement; Deployment on virtualized infrastructure, cloud environments; Configuration management; Diagnostics and Troubleshooting; Platform stability; SLA management; Capacity Planning; Chargeback Modeling; Integration with other IT Mgmt tools. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 17. Top challenges – Organization issues. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging). Categories: Platform stability; Too entrenched in existing technologies; Lack of skills - Analytics; Lack of skills – Operational; Lack of ROI for business sponsors; Lack of concrete use cases; Unable to demonstrate quick wins; Lack of understanding of Hadoop (and its benefits). *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 18. Top challenges – Data Warehouse off-load. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues. Categories: Platform stability; Lack of non-ANSI SQL language extensions from their EDW; Lack of nested/complex structures (struct, map, array); Lack of Resource/Workload management capabilities; Unable to identify workloads that can be off-loaded from a DW; Lack of robust ISV tools (for BI, ETL etc.); Lack of SQL compliance to ANSI standards; Lack of data types support (e.g. Decimal, varchar etc.) in Impala/Hive; Lack of SQL performance. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 19. Top challenges – Hadoop for ETL. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues. Categories: Unable to leverage/migrate existing ETL pipelines to Hadoop easily; Lack of robust ISV tools; current tools from vendors like Informatica, Pentaho still immature; Lack of skills to leverage Pig, MR etc. to build ETL pipelines; Lack of capabilities in existing tools (Pig, Hive, Crunch, Morphlines etc.) to easily build ETL pipelines; Platform stability. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 20. Top challenges – Hadoop for Advanced Analytics. Please rate on a scale of 1 (NA), 2 (least challenging) to 6 (most challenging) the top issues. Categories: Platform stability; Lack of robust workload/resource management capabilities; Lack of certified Partner tools from Cloudera; Lack of robust ISV tools for advanced analytics (including Machine Learning) in Hadoop; existing tools are immature or relatively new (offerings from SAS, Revolution etc.); Lack of Data analysis skills; Lack of certified Partners. *Results are based on internal Cloudera sampled survey of a CDH install base users
  • 21. Hadoop POC to Production Challenges • Invested heavily in learning and ramp-up • Re-writing logic from scratch to optimize rather than re-using existing routines, which involves significant time/resources • Took more than 6 months to get Kerberos to work with my existing identity infrastructure • I need to track the metadata for data flow in/out of Hadoop • Configuring multi-tenancy/workload management
  • 22. Hadoop POC to Production Challenges • Time and resources spent researching Analytics, BI, ETL tools available for Hadoop • Challenges making data in Hadoop clusters available to end business users • Existing skill sets more tuned towards Data Warehouse • Some resistance to adoption of newer technologies like Hadoop (from existing IT folks)
  • 23. Ensuring success from POC to Production: Big Data Center of Excellence
  • 24. Manage Disruption Throughout the Enterprise. Optimize the Value of an Enterprise Data Hub as Hadoop Adoption Increases • SAVINGS: Consolidate knowledge and resources to bolster infrastructure • SPEED: Increase resource utilization over time and exploit valuable expertise • STANDARDS: Syndicate best practices and design to assure efficient integration • STRATEGY: Drive adoption via a solutions engine, not autonomous projects • SCALE: Systematically learn, develop, deploy, operate, publish, and repeat
  • 25. Optimize Your Hadoop Knowledge Flow. Big Data Is All About Continuous Improvement and Operational Efficiency • Learn • Develop • Deploy • Operate • Publish • REPEAT
  • 26. What Is a Center of Excellence? Centrally Incubate, Integrate, and Scale Across Lines of Business • Big Data Solutions as a Service • Knowledge Hub • Best Practices • Expertise • Solutions Engine • Infrastructure • Clusters • Dedicated Team • Architects • Administrators • Developers • Data Scientists
  • 27. Why Build a Center of Excellence? Silos Obstruct Scale, Prevent Flexibility, and Drive Up Costs • Novice Skill Level • Redundant Infrastructure • Incompatible Systems • Disparate Data
  • 29. Big Data as a Service Drives Success. Coordinate Workloads to Distribute Advantages Widely and Seamlessly • Insights & Best Practices • Hadoop Experts • Broadly Accessible Big Data Technology & Equipment
  • 31. Big Data COE Program Roles. Staff Centrally and Train to Scale. Groups: Management & Leadership; Business & Data; Technology & Ops. Roles: Lead Data Scientist; Lead Business Analyst; Developers; Big Data Visionary; Executive Sponsor; Program Manager; Data Warehouse Admin Specialist; LOB Rep; Architects; LOB Rep; LOB Rep; Data Wranglers
  • 32. A Five-Phase Development Plan. Work with Hadoop Experts from Planning to Support. Staffing • Personnel • Training • Mentoring Incubation Operations Continuity • Scope • Build • Certify • Deploy • Integrate • Optimize • Use Cases • Real-Time • Document • Cost Model • Support • Sustain Expansion
  • 33. Thank You! Eddie Garcia eddie@cloudera.com