20140228 - Singapore - BDAS - Ensuring Hadoop Production Success

Published in: Technology
Transcript of "20140228 - Singapore - BDAS - Ensuring Hadoop Production Success"

  1. © 2014 MapR Technologies, confidential
  2. TREND 1: Hadoop is Providing Value Across Organizations
      ENTERPRISE DATA HUB
      • Multi-structured data staging & archive
      • ETL / DW optimization
      • Mainframe optimization
      • Data exploration
      MARKETING ANALYTICS
      • Recommendation engines & targeting
      • Ad optimization
      • Pricing analysis
      • Lead scoring
      RISK ANALYTICS
      • Network security monitoring
      • Security information & event management
      • Fraudulent behavioral analysis
      OPERATIONS INTELLIGENCE
      • Supply chain & logistics
      • System log analysis
      • Manufacturing quality assurance
      • Preventative maintenance
      • Sensor analysis
  3. Sellers Cloud, Advertising Automation Cloud, Buyers Cloud
      90B AD AUCTIONS per day
  4. TREND 2: Organizations Have Many Workload-specific Systems
      ENTERPRISE USERS
      OPERATIONAL SYSTEMS
      • Mission-critical reliability
      • Transaction guarantees
      • Deep security
      • Real-time performance
      • Backup and recovery
      ANALYTICAL SYSTEMS
      • Interactive SQL
      • Rich analytics
      • Mixed workload management
      • Data governance
      • Security
      • Backup and recovery
  5. REALITY: Hadoop Can Relieve the Pressure from Enterprise Systems
      ENTERPRISE USERS, OPERATIONAL SYSTEMS, ANALYTICAL SYSTEMS
      Keys for Production Success
      • Data protection and recovery
      • Inter-operability
      • Read-write performance
      • Supports operations and analytics
      Further bullets on the slide
      • Data staging
      • Archive
      • Data transformation
      • Data exploration
      • Streaming, interactions
  6. Fortune 100 Financial Services Company
      104M CARD MEMBERS
  7. REALITY 2: Most Hadoop Projects are Still Science Experiments
      [Chart: Number of Companies vs. Cluster Size]
      Stages shown: Development/Testing (Focus: Educ/Svc); 1st Production Use Case; Wide-scale Production
      Cluster-size labels: 1 – 10 Nodes; 10 – 2000 Nodes
  8. Largest Biometric Database in the World
      1.2B PEOPLE
  9. REALITY 3: Going Big Requires a Rock-Solid Architecture
      FOUNDATION
  10. REALITY 3: Going Big Requires a Rock-Solid Architecture
      • Enterprise-grade
      • Multi-tenancy
      • High Performance
      • Open Standards for Interoperability
      • Data Protection
      • Operational & Analytical
      FOUNDATION
  11. MapR Distribution for Hadoop
      APACHE HADOOP ECOSYSTEM: Hive/Stinger/Tez, Drill, Impala, Shark, Hue, ..., Flume, Mahout, Cascading, Solr, Spark, Storm, Sentry, Zookeeper, Management, Sqoop, Whirr, Pig, YARN, MapReduce, Oozie, HBase
      Platform labels: MapR Data Platform; MAPR-FS (FILES); MAPR-DB (TABLES); Patent Pending
      Section labels: Enterprise-grade, Performance, Multi-tenancy, Data Protection, Inter-operability, Operational & Analytical
      Bullets (separated from the overlapping slide layers)
      • High availability
      • Data protection
      • Disaster recovery
      • Standard file access
      • Standard database access
      • Pluggable services
      • Broad developer support
      • Performance 2X-5X
      • Ability to logically divide a cluster to support different use cases, job types, user groups, and administrators
      • Enterprise security authorization
      • Wire-level authentication
      • Data governance
      • Ability to support predictive analytics, real-time database operations, and support high arrival rate data
      • Unit of work framework to provide transactional integrity
  12. Apache Hadoop NameNode High Availability (HA)
      [Diagrams: HDFS HA and HDFS Federation, showing a NAS Appliance, Primary and Standby NameNodes, additional NameNodes, DataNodes, and data blocks A–F]
      Annotations
      • Single point of failure
      • Only one active NameNode
      • Multiple single points of failure w/o HA
      • Limited to 50-200 million files
      • Needs 20 NameNodes for 1 Billion files
      • Metadata must fit in memory
      • Double the block reports
      • Commercial NAS needed
      • Commercial NAS possibly needed
      • Performance bottleneck (shown three times)
      HDFS-based Distributions
  13. No NameNode Architecture
      [Diagram: data blocks A–F, a NameNode label, and nine DataNodes]
      • No special config to enable HA
      • Up to 1T files (> 5000x advantage)
      • Automatic failover & re-replication
      • Metadata is persisted to disk
      • Significantly less hardware & OpEx
      • Higher performance
  14. Comparative Study of Hadoop Distributions: I/O Performance
      Read and Write Throughput Benchmarks
      [Charts: DFSIO Read Throughput and DFSIO Write Throughput, in MB per second]
      Distributions compared: IDH 2.4.1, HDP 1.3, MapR M5 2.1.3, CDH 4.3
      Chart values as extracted (the mapping of values to distributions and to the read/write charts is not recoverable): 262, 276, 212, 465, 475, 59, 69, 64
      Source: Flux7 Labs Study, October 2013
  15. World Record Performance
      NEW MINUTESORT WORLD RECORD With a Fraction of the Hardware
      1.65 TB IN 1 MINUTE, 298 NODES
      PREVIOUS RECORD: 1.6 TB with 2200 nodes
  16. HBase Apps: High Performance with Consistent Low Latency
      [Chart legend: M7 Read Latency; Others Read Latency]
  17. MapR M7: The Best In-Hadoop Database
      Other Distros stack: HBase, JVM, HDFS, JVM, ext3/ext4, Disks
      MapR M7 stack: Tables/Files, Disks
      NoSQL Columnar Store
      • Apache HBase API
      • In-Hadoop database
      The most scalable, enterprise-grade, NoSQL database that supports online applications and analytics
  18. MapR M7: The Best In-Hadoop Database
      Diagram labels: BigData Application, HBase Interface, HDFS Interface, JVM, JVM, ext3/ext4, Tables/Files, Disks, Disks
      Columns: Other Distros, MapR M7
      NoSQL Columnar Store
      • Apache HBase API
      • In-Hadoop database
      The most scalable, enterprise-grade, NoSQL database that supports online applications and analytics
  19. Opportunity to Revolutionize Enterprise Data Architecture
      From Redundant Processing Silos and Data Science Experiments…
  20. The Production Enterprise BigData Platform
      … to Consolidated Operational and Analytical Workloads
  21. Q&A
      Engage with us!
      @allenday, @mapr
      linkedin.com/in/allenday
      allenday@mapr.com
      tsheng@mapr.com
      mdarling@mapr.com
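
The scale figures on slides 12, 13, and 15 can be checked with simple arithmetic. The sketch below is not part of the deck: the variable and function names are illustrative, it assumes 1 TB = 1000 GB, and it assumes the "> 5000x" claim compares 1T files against the upper HDFS limit of 200 million files quoted on slide 12.

```python
# Figures quoted on the slides; the names are illustrative, not from the deck.
HDFS_FILES_LOW = 50_000_000          # "Limited to 50-200 million files" (lower bound)
HDFS_FILES_HIGH = 200_000_000        # same claim, upper bound
TARGET_FILES = 1_000_000_000         # "Needs 20 NameNodes for 1 Billion files"
MAPR_MAX_FILES = 1_000_000_000_000   # "Up to 1T files"


def namenodes_needed(total_files: int, files_per_namenode: int) -> int:
    """Ceiling division: NameNodes required to hold metadata for total_files."""
    return -(-total_files // files_per_namenode)


def gb_sorted_per_node(total_tb: float, nodes: int) -> float:
    """Data sorted per node in one minute, assuming 1 TB = 1000 GB."""
    return total_tb * 1000 / nodes


# Slide 12: 1 billion files at 50 million files per NameNode.
namenodes = namenodes_needed(TARGET_FILES, HDFS_FILES_LOW)

# Slide 13: 1T files against the most favorable HDFS limit (assumption).
file_advantage = MAPR_MAX_FILES // HDFS_FILES_HIGH

# Slide 15: MinuteSort, new record vs. previous record.
new_per_node = gb_sorted_per_node(1.65, 298)
old_per_node = gb_sorted_per_node(1.6, 2200)
per_node_ratio = new_per_node / old_per_node

print(f"NameNodes for 1 billion files: {namenodes}")
print(f"File-count advantage: {file_advantage}x")
print(f"GB per node per minute: new {new_per_node:.2f}, previous {old_per_node:.2f}")
print(f"Per-node ratio: {per_node_ratio:.1f}x")
```

Under these assumptions the 20-NameNode figure follows from the lower limit of 50 million files per NameNode, while the 5000x figure follows from the upper limit of 200 million, so the two slides appear to use different ends of the same range.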
