1
Top 10 lessons learned
from deploying Hadoop
in a private cloud
A use case from OpenLogic, Inc., a Rogue Wave company
Rod Cope, CTO
2
• Introduction
• The problem
• The solution
• Top 10 lessons
• Final thoughts
• Q&A
3
Rod Cope, CTO
Rogue Wave Software
4
• “Big data”
– All the world’s open source software
– Metadata, code, indexes
– Individual tables contain many terabytes
– Relational databases aren’t scale-free
• Growing every day
• Need real-time random access to all data
• Long-running and complex analysis jobs
5
• Hadoop, HBase, and Solr
– Hadoop – distributed file system, map/reduce
– HBase – “NoSQL” data store – column-oriented
– Solr – search server based on Lucene
– All are scalable, flexible, fast, well-supported, and used in
production environments
• And a supporting cast of thousands…
– Stargate, MySQL, Rails, Redis, Resque,
– Nginx, Unicorn, HAProxy, Memcached,
– Ruby, JRuby, CentOS, …
6
[Architecture diagram: Internet, Application LAN, Data LAN. Caching and load balancing not shown.]
7
• Private cloud
– 100+ CPU cores
– 100+ terabytes of disk
– Machines don’t have identity
– Add capacity by plugging in new machines
• Why not Amazon EC2?
– Great for computational bursts
– Expensive for long-term storage of big data
– Not yet consistent enough for mission-critical usage of HBase
8
• Configuration is key
• “Commodity hardware” is not an old desktop
• Hadoop & HBase crave bandwidth
• Big data takes a long time…
• Big data is hard
• Scripting languages can help
• Public clouds are expensive
• Not possible without open source
• Expect things to fail – a lot
• It’s all still cutting edge
9
• Many moving parts
• Pay attention to the details
– Operating system – max open files, sockets, and other limits
– Hadoop – max map/reduce jobs, memory, disk
– HBase – region size, memory
– Solr – merge factor, norms, memory
• Minor versions are very important
– Use a known-good combination of Hadoop and HBase
– Specific patches are critical
– The fine print matters
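As a minimal sketch of checking one of the operating system limits above, the Ruby snippet below reads the current process's open-file limit and compares it with a minimum. The 32,768 threshold and the function name are illustrative assumptions, not values from the slides; use whatever your Hadoop and HBase versions recommend.

```ruby
# Minimum soft limit on open files wanted before starting Hadoop/HBase daemons.
# 32_768 is an illustrative value, not a figure from the slides.
MIN_OPEN_FILES = 32_768

# Returns a human-readable verdict for a given soft limit.
def open_files_verdict(soft_limit, minimum = MIN_OPEN_FILES)
  if soft_limit >= minimum
    "ok: #{soft_limit} open files allowed (minimum #{minimum})"
  else
    "too low: #{soft_limit} open files allowed, want at least #{minimum}"
  end
end

# Process.getrlimit returns [soft, hard] for the current process.
soft, _hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts open_files_verdict(soft)
```

Run it as the same user that starts the daemons, since limits are per user and per session.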
10
• Follow all HBase configuration advice here:
– http://hbase.apache.org/book.html#trouble
– Yes, that’s a whole lot of configuration
– Skip steps at your own peril!
• If you really need HA Hadoop:
– http://blog.cloudera.com/blog/2009/07/hadoop-ha-configuration/
• If you hit datanode timeouts while writing to sockets:
– dfs.datanode.socket.write.timeout = 0
– Even though it should be ignored…
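A small Ruby sketch of how the timeout setting above could be rendered as a Hadoop configuration property. The helper name is hypothetical, and which file the property belongs in (commonly hdfs-site.xml) depends on your Hadoop version, so treat that as an assumption to verify.

```ruby
# Renders one <property> element in the layout Hadoop's *-site.xml files use.
def hadoop_property(name, value)
  <<~XML
    <property>
      <name>#{name}</name>
      <value>#{value}</value>
    </property>
  XML
end

# The setting from the slide: 0 disables the datanode socket write timeout.
puts hadoop_property("dfs.datanode.socket.write.timeout", 0)
```

Paste the output inside the `<configuration>` element of the relevant file and restart the datanodes.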
11
• Linux kernel is important – affects configuration switches, both
required and optional
– Example: epoll limits required as of 2.6.27, then no longer required in
newer kernels such as 2.6.33+
– http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linuxkernel-2627-epoll-limits/
• Upgrade your machine BIOS, network card BIOS, and all hardware
drivers
– Example: issues with certain default configurations of Dell boxes on
CentOS/RHEL 5.x and Broadcom NICs
• Will drop packets & cause other problems under high load
– Disable MSI in Linux & power saver (C-states) in machine BIOS
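To make the kernel example concrete, here is a hedged Ruby sketch that decides whether a kernel release string falls in the range where, per the slide, epoll limits need attention (2.6.27 up to, but not including, 2.6.33). The function names are illustrative, and the exact upper boundary is taken from the slide's "such as 2.6.33+" wording.

```ruby
require "rubygems" # provides Gem::Version; loaded by default in modern Ruby

EPOLL_REQUIRED_FROM = Gem::Version.new("2.6.27")
EPOLL_RELAXED_FROM  = Gem::Version.new("2.6.33")

# Extracts the leading numeric part of a `uname -r` string,
# e.g. "2.6.32-71.el6.x86_64" becomes version 2.6.32.
def kernel_version(release)
  Gem::Version.new(release[/\A\d+(\.\d+)*/])
end

# True when the kernel is in the range where epoll limits need tuning.
def epoll_tuning_needed?(release)
  version = kernel_version(release)
  version >= EPOLL_REQUIRED_FROM && version < EPOLL_RELAXED_FROM
end

puts epoll_tuning_needed?("2.6.32-71.el6.x86_64")
```

In practice the argument would come from `uname -r` on each machine in the cluster.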
12
• Many problems only show up under severe load
– Sustained, massive data loads running for 2-24 hours
• Change only one parameter at a time
– Yes, this can be excruciating
• Ask the mailing list or your support provider
– They’ve seen a lot, likely including your problem…but not always
– Don’t be afraid to dig in and read some code…
13
Ideally we should wait after transferTo returns 0. But
because of a bug in JRE on Linux
(http://bugs.sun.com/view_bug.do?bug_id=5103988), which
throws an exception instead of returning 0, we wait for the
channel to be writable before writing to it. If you ever
see IOException with message "Resource temporarily
unavailable" thrown here, please let us know. Once we move
to JAVA SE 7, wait should be moved to correct place.
• Hadoop stresses every bit of networking code in Java and tends to expose all the
cracks
• This bug was fixed in JDK 1.6.0_18 (after 6 years)
14
• “Commodity hardware” != 3-year-old desktop
• Dual quad-core, 32GB RAM, 4+ disks
• Don’t bother with RAID on Hadoop data disks
– Be wary of non-enterprise drives
• Expect ugly hardware issues at some point
15
• Dual quad-core and dual hex-core Dell boxes
• 32-64GB RAM
– ECC (highly recommended by Google)
• 6 x 2TB enterprise hard drives
• RAID 1 on two of the drives
– OS, Hadoop, HBase, Solr, NFS mounts (be careful!), job code, etc.
– Key “source” data backups
• Hadoop datanode gets remaining drives
• Redundant enterprise switches
• Dual- and quad-gigabit NICs
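A rough Ruby sketch of what this drive layout means for usable HDFS space. It assumes the HDFS default replication factor of 3 and a 20-machine cluster; both are illustrative inputs, and the function name is hypothetical.

```ruby
# Usable HDFS capacity in TB, ignoring space reserved for temp files and logs.
def usable_hdfs_tb(nodes:, data_drives_per_node:, tb_per_drive:, replication: 3)
  raw_tb = nodes * data_drives_per_node * tb_per_drive
  raw_tb.to_f / replication
end

# 6 drives per box minus the 2 in the RAID 1 pair leaves 4 for the datanode.
puts usable_hdfs_tb(nodes: 20, data_drives_per_node: 4, tb_per_drive: 2).round(1)
```

Every block is stored `replication` times, so raw disk shrinks to about a third of its size as usable space.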
16
• Hadoop
– Map/reduce jobs shuffle lots of data
– Continuously replicating blocks and rebalancing
– Loves bandwidth – dual-gigabit network on dedicated switches
– 10Gbps network can help
• HBase
– Needs 5+ machines to stretch its legs
– Depends on ZooKeeper – low-latency is important
– Don’t let it run low on memory
17
• Big data takes a long time to do anything
– Load, list, walk directory structures, count, process, test, back up
– I’m not kidding
• Hard to test, but don’t be tempted to skip it
– You’ll eventually hit every corner case you know and don’t know
• Backups are difficult
– Consider a backup Hadoop cluster
– HBase team is working on live replication
– Solr already has built-in replication
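To put a number on "a long time", this Ruby sketch estimates how many days it takes to move a data set over the network. The function name and the `efficiency` parameter are illustrative assumptions; real throughput depends on disks, protocol overhead, and contention.

```ruby
SECONDS_PER_DAY = 86_400.0

# Days needed to move `terabytes` of data over `links` parallel links of
# `gbps` each, at a given fraction of line rate (1 TB = 10^12 bytes).
def transfer_days(terabytes:, gbps:, links: 1, efficiency: 1.0)
  bits = terabytes * 1e12 * 8
  bits_per_second = gbps * 1e9 * links * efficiency
  bits / bits_per_second / SECONDS_PER_DAY
end

# 100 TB through a single gigabit link versus 20 machines working in parallel.
puts transfer_days(terabytes: 100, gbps: 1).round(2)
puts transfer_days(terabytes: 100, gbps: 1, links: 20).round(2)
```

Even at full line rate, a single gigabit link needs more than nine days for 100 TB, which is why backups and bulk loads have to be spread across many machines.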
18
• Don’t use a single machine to load the cluster
– You might not live long enough to see it finish
• At OpenLogic, we spread raw source data across many machines and hard
drives via NFS
– Be very careful with NFS configuration – can hang machines
• Load data into HBase via Hadoop map/reduce jobs
– Turn off WAL for much better performance
– put.setWriteToWAL(false)
• Avoid large values (> 5MB)
– Works, but may cause instability and/or performance issues
– Rows and columns are cheap, so use more of them instead
• Be careful not to over-commit to Solr
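One way to follow the "avoid large values" advice is to split a big value across several columns. The Ruby sketch below shows the idea on plain strings; the qualifier naming scheme (`chunk_00000`, ...) and both function names are hypothetical, and no HBase client calls are made.

```ruby
MAX_VALUE_BYTES = 5 * 1024 * 1024 # the 5 MB guideline from the slide

# Splits a binary string into pieces no larger than max_bytes and returns
# a hash of column qualifier => chunk.
def chunk_value(data, max_bytes = MAX_VALUE_BYTES, prefix = "chunk")
  chunks = {}
  offset = 0
  index = 0
  while offset < data.bytesize
    chunks[format("%s_%05d", prefix, index)] = data.byteslice(offset, max_bytes)
    offset += max_bytes
    index += 1
  end
  chunks
end

# Rebuilds the original value by concatenating chunks in qualifier order.
def join_chunks(chunks)
  chunks.sort.map { |_qualifier, chunk| chunk }.join
end

columns = chunk_value("x" * 12, 5)
puts columns.keys.inspect
```

Zero-padded qualifiers keep the chunks in order when the row is read back, since qualifiers sort lexicographically.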
19
• HBase NoSQL
– Think hash table, not relational database
• How do I find my data if the primary key won’t cut it?
• Solr to the rescue
– Very fast, highly scalable search server with built-in sharding and
replication – based on Lucene
– Dynamic schema, powerful query language, faceted search,
accessible via simple REST-like web API w/XML, JSON, Ruby, and
other data formats
20
• Sharding
– Query any server – it executes the same query against all other servers in
the group
– Returns aggregated result to original caller
• Async replication (slaves poll their masters)
– Can use repeaters if replicating across data centers
• OpenLogic
– Solr farm, sharded, cross-replicated, fronted with HAProxy
• Load balanced writes across masters, reads across slaves and masters
– Billions of lines of code in HBase, all indexed in Solr for real-time search in
multiple ways
– Over 20 Solr fields indexed per source file
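The sharded query flow described above is a scatter-gather pattern. This Ruby sketch simulates it in memory: each shard is just an array of scored hits, and the function names are illustrative. It is a toy model of the idea, not Solr's implementation.

```ruby
# Each shard is modeled as an array of {id:, score:} hits; a real shard
# would be a Solr server answering the same query over HTTP.
def query_shard(shard, limit)
  shard.sort_by { |hit| -hit[:score] }.first(limit)
end

# Runs the query against every shard, then merges the partial results into
# one list ordered by score, highest first.
def distributed_query(shards, limit)
  shards.flat_map { |shard| query_shard(shard, limit) }
        .sort_by { |hit| -hit[:score] }
        .first(limit)
end

shards = [
  [{ id: "a1", score: 0.9 }, { id: "a2", score: 0.2 }],
  [{ id: "b1", score: 0.7 }, { id: "b2", score: 0.6 }]
]
p distributed_query(shards, 3).map { |hit| hit[:id] }
```

Each shard only needs to return its own top `limit` hits, because nothing below that can appear in the merged top `limit`.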
21
• Expect to learn and experiment quite a bit
– Many moving parts, lots of decisions to make
– You won’t get them all right the first time
• Expect to discover new and better ways of modeling your
data and processes
– Don’t be afraid to start over once or twice
• Consider getting outside help
– Training, consulting, mentoring, support
22
• Scripting is faster and easier than writing Java
• Great for system administration tasks, testing
• Standard HBase shell is based on JRuby
• Very easy map/reduce jobs with JRuby and Wukong
• Used heavily at OpenLogic
– Productivity of Ruby
– Power of Java Virtual Machine
– Ruby on Rails, Hadoop integration, GUI clients
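To give a flavor of how short a map/reduce job can be in Ruby, here is a word count written as two plain functions. This is a sketch of the programming model only: it does not use the Wukong or Hadoop APIs, and on a real cluster the framework performs the shuffle between the two phases.

```ruby
# Mapper: emits a [word, 1] pair for every word in a line of input.
def map_line(line)
  line.downcase.scan(/[a-z0-9_]+/).map { |word| [word, 1] }
end

# Reducer: sums the counts for each word.
def reduce_pairs(pairs)
  pairs.each_with_object(Hash.new(0)) do |(word, count), totals|
    totals[word] += count
  end
end

lines = ["Hadoop HBase Solr", "hadoop and friends"]
p reduce_pairs(lines.flat_map { |line| map_line(line) })
```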
23
24
JRuby
list = ["Rod", "Neeta", "Eric", "Missy"]
shorts = list.find_all { |name| name.size <= 4 }
puts shorts.size
shorts.each { |name| puts name }
-> 2
-> Rod
Eric
Groovy
list = ["Rod", "Neeta", "Eric", "Missy"]
shorts = list.findAll { name -> name.size() <= 4 }
println shorts.size()
shorts.each { name -> println name }
-> 2
-> Rod
Eric
25
• Amazon EC2
– EBS Storage
• 100TB * $0.10/GB/month = $120k/year
– Double Extra Large instances
• 13 EC2 compute units, 34.2GB RAM
• 20 instances * $1.00/hr * 8,760 hrs/yr = $175k/year
• 3-year reserved instances
– 20 * 4k = $80k up front to reserve
– (20 * $0.34/hr * 8,760 hrs/yr * 3 yrs) / 3 = $86k/year to operate
– Totals for 20 virtual machines
• 1st year cost: $120k + $80k + $86k = $286k
• 2nd & 3rd year costs: $120k + $86k = $206k
• Average: ($286k + $206k + $206k) / 3 = $232k/year
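A small Ruby sketch that recomputes the line items above from the listed rates, so the figures can be re-run with current pricing. The function names are illustrative. Worth noting when reading the output: the hourly part of the reserved instances alone comes to roughly $60k/year, and a figure of about $86k/year results when the $80k upfront fee is averaged in over the three-year term.

```ruby
HOURS_PER_YEAR = 8_760

# Yearly EBS storage cost in dollars (1 TB counted as 1,000 GB).
def ebs_cost_per_year(terabytes, dollars_per_gb_month)
  terabytes * 1_000 * dollars_per_gb_month * 12
end

# Yearly cost of running `instances` around the clock at an hourly rate.
def instance_cost_per_year(instances, dollars_per_hour)
  instances * dollars_per_hour * HOURS_PER_YEAR
end

# Average yearly cost over a reservation term, spreading the upfront fee.
def reserved_average_per_year(instances, upfront_each, dollars_per_hour, years)
  upfront = instances * upfront_each
  (upfront + instance_cost_per_year(instances, dollars_per_hour) * years) / years.to_f
end

puts ebs_cost_per_year(100, 0.10).round                   # storage
puts instance_cost_per_year(20, 1.00).round               # on-demand instances
puts instance_cost_per_year(20, 0.34).round               # reserved, hourly part only
puts reserved_average_per_year(20, 4_000, 0.34, 3).round  # reserved, upfront averaged in
```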
26
• Buy your own
– 20 * Dell servers w/12 CPU cores, 32GB RAM, 5 TB disk = $160k
• Over 33 EC2 compute units each
– Total: $53k/year (amortized over 3 years)
27
• Amazon EC2
– 20 instances * 13 EC2 compute units = 260 EC2 compute units
– Cost: $232k/year
• Buy your own
– 20 machines * 33 EC2 compute units = 660 EC2 compute units
– Cost: $53k/year
– Does not include hosting and maintenance costs
• Don’t think system administration goes away
– You still “own” all the instances – monitoring, debugging, support
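The comparison is easier to read as cost per unit of compute. This Ruby sketch divides the yearly cost by the total EC2 compute units using the figures from the slide; the function name is illustrative, and hosting and maintenance costs are left out, as on the slide.

```ruby
# Dollars per EC2 compute unit per year.
def cost_per_compute_unit(yearly_cost, machines, units_per_machine)
  yearly_cost.to_f / (machines * units_per_machine)
end

ec2 = cost_per_compute_unit(232_000, 20, 13)
own = cost_per_compute_unit(53_000, 20, 33)
puts format("EC2: $%.0f per unit-year, own hardware: $%.0f per unit-year", ec2, own)
puts format("Ratio: %.1fx", ec2 / own)
```

With these inputs the owned hardware comes out roughly an order of magnitude cheaper per compute unit, before hosting and staff time are added back in.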
28
29
• Hadoop, HBase, Solr
• Apache, Tomcat, ZooKeeper,
HAProxy
• Stargate, JRuby, Lucene, Jetty,
HSQLDB, Geronimo
• Apache Commons, JUnit
• CentOS
• Dozens more
• Too expensive to build or buy
everything
30
• Hardware
– Power supplies, hard drives
• Operating system
– Kernel panics, zombie processes, dropped packets
• Hadoop and friends
– Hadoop datanodes, HBase regionservers, Stargate servers, Solr
servers
• Your code and data
– Stray map/reduce jobs, strange corner cases in your data leading
to program failures
31
• Hadoop
– SPOF around Namenode, append functionality
• HBase
– Backup, replication, and indexing solutions in flux
• Solr
– Several competing solutions around cloud-like scalability and
fault-tolerance, including Zookeeper and Hadoop integration
32
• You can host big data in your own private cloud
– Tools are available today that didn’t exist a few years ago
– Fast to prototype – production readiness takes time
– Expect to invest in training and support
• Public clouds
– Great for learning, experimenting, testing
– Best for bursts vs. sustained loads
– Beware latency, expense of long-term big data storage
• You still need “small data”
– SQL and NoSQL coexist peacefully
– OpenLogic uses MySQL & Redis in addition to HBase, Solr, Memcached
33
Q&A
34
Rod Cope
rod.cope@roguewave.com
See us in action:
www.roguewave.com
35

 
Enterprise Linux: Justify your migration from Red Hat to CentOS
Enterprise Linux: Justify your migration from Red Hat to CentOSEnterprise Linux: Justify your migration from Red Hat to CentOS
Enterprise Linux: Justify your migration from Red Hat to CentOS
 
Walk through an enterprise Linux migration
Walk through an enterprise Linux migrationWalk through an enterprise Linux migration
Walk through an enterprise Linux migration
 
How to keep developers happy and lawyers calm
How to keep developers happy and lawyers calmHow to keep developers happy and lawyers calm
How to keep developers happy and lawyers calm
 

Recently uploaded

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
Georgi Kodinov
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
Adele Miller
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
Alina Yurenko
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
Google
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Yara Milbes
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
Paco van Beckhoven
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
takuyayamamoto1800
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
Ortus Solutions, Corp
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
Aftab Hussain
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
Globus
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
Google
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
ShamsuddeenMuhammadA
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
Matt Welsh
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
Shane Coughlan
 

Recently uploaded (20)

2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx2024 RoOUG Security model for the cloud.pptx
2024 RoOUG Security model for the cloud.pptx
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Prosigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns: Transforming Business with Tailored Technology Solutions
Prosigns: Transforming Business with Tailored Technology Solutions
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)GOING AOT WITH GRAALVM FOR  SPRING BOOT (SPRING IO)
GOING AOT WITH GRAALVM FOR SPRING BOOT (SPRING IO)
 
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing SuiteAI Pilot Review: The World’s First Virtual Assistant Marketing Suite
AI Pilot Review: The World’s First Virtual Assistant Marketing Suite
 
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi ArabiaTop 7 Unique WhatsApp API Benefits | Saudi Arabia
Top 7 Unique WhatsApp API Benefits | Saudi Arabia
 
Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024Cracking the code review at SpringIO 2024
Cracking the code review at SpringIO 2024
 
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoamOpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
OpenFOAM solver for Helmholtz equation, helmholtzFoam / helmholtzBubbleFoam
 
Vitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume MontevideoVitthal Shirke Microservices Resume Montevideo
Vitthal Shirke Microservices Resume Montevideo
 
BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024BoxLang: Review our Visionary Licenses of 2024
BoxLang: Review our Visionary Licenses of 2024
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeA Study of Variable-Role-based Feature Enrichment in Neural Models of Code
A Study of Variable-Role-based Feature Enrichment in Neural Models of Code
 
Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024Globus Compute Introduction - GlobusWorld 2024
Globus Compute Introduction - GlobusWorld 2024
 
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI AppAI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
AI Fusion Buddy Review: Brand New, Groundbreaking Gemini-Powered AI App
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptxText-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
Text-Summarization-of-Breaking-News-Using-Fine-tuning-BART-Model.pptx
 
Large Language Models and the End of Programming
Large Language Models and the End of ProgrammingLarge Language Models and the End of Programming
Large Language Models and the End of Programming
 
openEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain SecurityopenEuler Case Study - The Journey to Supply Chain Security
openEuler Case Study - The Journey to Supply Chain Security
 

Top 10 lessons learned from deploying Hadoop in a private cloud

  • 1. Top 10 lessons learned from deploying Hadoop in a private cloud
      A use case from OpenLogic, Inc., a Rogue Wave company
      Rod Cope, CTO
  • 2.
      • Introduction
      • The problem
      • The solution
      • Top 10 lessons
      • Final thoughts
      • Q&A
  • 3. Rod Cope, CTO, Rogue Wave Software
  • 4.
      • “Big data”
          – All the world’s open source software
          – Metadata, code, indexes
          – Individual tables contain many terabytes
          – Relational databases aren’t scale-free
      • Growing every day
      • Need real-time random access to all data
      • Long-running and complex analysis jobs
  • 5.
      • Hadoop, HBase, and Solr
          – Hadoop – distributed file system, map/reduce
          – HBase – “NoSQL” data store – column-oriented
          – Solr – search server based on Lucene
          – All are scalable, flexible, fast, well-supported, and used in production environments
      • And a supporting cast of thousands…
          – Stargate, MySQL, Rails, Redis, Resque, Nginx, Unicorn, HAProxy, Memcached, Ruby, JRuby, CentOS, …
  • 6. [Architecture diagram: Internet, Application LAN, Data LAN. Caching and load balancing not shown.]
  • 7.
      • Private cloud
          – 100+ CPU cores
          – 100+ terabytes of disk
          – Machines don’t have identity
          – Add capacity by plugging in new machines
      • Why not Amazon EC2?
          – Great for computational bursts
          – Expensive for long-term storage of big data
          – Not yet consistent enough for mission-critical usage of HBase
  • 8.
      • Configuration is key
      • “Commodity hardware” is not an old desktop
      • Hadoop & HBase crave bandwidth
      • Big data takes a long time…
      • Big data is hard
      • Scripting languages can help
      • Public clouds are expensive
      • Not possible without open source
      • Expect things to fail – a lot
      • It’s all still cutting edge
  • 9.
      • Many moving parts
      • Pay attention to the details
          – Operating system – max open files, sockets, and other limits
          – Hadoop – max map/reduce jobs, memory, disk
          – HBase – region size, memory
          – Solr – merge factor, norms, memory
      • Minor versions are very important
          – Use a known-good combination of Hadoop and HBase
          – Specific patches are critical
          – The fine print matters
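
The operating-system limits mentioned on this slide can be checked from a script before a node joins the cluster. The following is a minimal sketch using Python's standard `resource` module (Unix only). The helper name `open_file_limit_ok` and the 32,768 threshold are assumptions for illustration, not values taken from the slides.

```python
import resource


def open_file_limit_ok(minimum: int) -> bool:
    """Return True if the soft limit on open files is at least `minimum`."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY means "no limit", which always satisfies the minimum.
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"open files: soft={soft} hard={hard}")
    # 32_768 is a hypothetical threshold chosen for this sketch only.
    if not open_file_limit_ok(32_768):
        print("soft limit is below this sketch's threshold; consider raising it")
```

The check reads the limit of the current process, so it should run as the same user that starts the Hadoop and HBase daemons.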
  • 10.
      • Follow all HBase configuration advice here:
          – http://hbase.apache.org/book.html#trouble
          – Yes, that’s a whole lot of configuration
          – Skip steps at your own peril!
      • If you really need HA Hadoop:
          – http://blog.cloudera.com/blog/2009/07/hadoop-ha-configuration/
      • If you hit datanode timeouts while writing to sockets:
          – dfs.datanode.socket.write.timeout = 0
          – Even though it should be ignored…
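
Hadoop reads settings such as `dfs.datanode.socket.write.timeout` from XML files made of `<property>` entries with a `<name>` and a `<value>`. As a rough sketch, the snippet below renders that structure from a Python dictionary. The helper name `hadoop_property_xml` is hypothetical, and the assumption here is that the result would be merged into the cluster's HDFS site configuration file (typically `hdfs-site.xml`).

```python
import xml.etree.ElementTree as ET


def hadoop_property_xml(settings: dict[str, str]) -> str:
    """Render settings as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The property name and value come from the slide above.
    print(hadoop_property_xml({"dfs.datanode.socket.write.timeout": "0"}))
```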
  • 11.
      • Linux kernel is important – affects configuration switches, both required and optional
          – Example: epoll limits required as of 2.6.27, then no longer required in newer kernels such as 2.6.33+
          – http://pero.blogs.aprilmayjune.org/2009/01/22/hadoop-and-linuxkernel-2627-epoll-limits/
      • Upgrade your machine BIOS, network card BIOS, and all hardware drivers
          – Example: issues with certain default configurations of Dell boxes on CentOS/RHEL 5.x and Broadcom NICs
              • Will drop packets & cause other problems under high load
          – Disable MSI in Linux & power saver (C-states) in machine BIOS
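
Because the required switches depend on the kernel version, a provisioning script can branch on it. The sketch below parses a kernel release string and flags the range the slide mentions (2.6.27 up to, but not including, 2.6.33). Treating 2.6.33 as the exact cut-off is an assumption, since the slide only says "such as 2.6.33+", and both function names are hypothetical.

```python
import platform
import re


def kernel_version(release: str) -> tuple[int, int, int]:
    """Parse the leading 'major.minor[.patch]' of a kernel release string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def epoll_limit_may_apply(release: str) -> bool:
    """True for kernels in the range where epoll limits needed tuning.

    The upper bound (2.6.33) is an assumption based on the slide's wording.
    """
    return (2, 6, 27) <= kernel_version(release) < (2, 6, 33)


if __name__ == "__main__":
    release = platform.release()
    print(release, "-> epoll limits may need tuning:", epoll_limit_may_apply(release))
```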
  • 12.
      • Many problems only show up under severe load
          – Sustained, massive data loads running for 2-24 hours
      • Change only one parameter at a time
          – Yes, this can be excruciating
      • Ask the mailing list or your support provider
          – They’ve seen a lot, likely including your problem…but not always
          – Don’t be afraid to dig in and read some code…
  • 13.
      Ideally we should wait after transferTo returns 0. But because of a bug in JRE on Linux (http://bugs.sun.com/view_bug.do?bug_id=5103988), which throws an exception instead of returning 0, we wait for the channel to be writable before writing to it. If you ever see IOException with message "Resource temporarily unavailable" thrown here, please let us know. Once we move to JAVA SE 7, wait should be moved to correct place.
      • Hadoop stresses every bit of networking code in Java and tends to expose all the cracks
      • This bug was fixed in JDK 1.6.0_18 (after 6 years)
  • 14.
      • “Commodity hardware” != 3-year-old desktop
      • Dual quad-core, 32GB RAM, 4+ disks
      • Don’t bother with RAID on Hadoop data disks
          – Be wary of non-enterprise drives
      • Expect ugly hardware issues at some point
  • 15.
      • Dual quad-core and dual hex-core Dell boxes
      • 32-64GB RAM
          – ECC (highly recommended by Google)
      • 6 x 2TB enterprise hard drives
      • RAID 1 on two of the drives
          – OS, Hadoop, HBase, Solr, NFS mounts (be careful!), job code, etc.
          – Key “source” data backups
      • Hadoop datanode gets remaining drives
      • Redundant enterprise switches
      • Dual- and quad-gigabit NICs
  • 16.
      • Hadoop
          – Map/reduce jobs shuffle lots of data
          – Continuously replicating blocks and rebalancing
          – Loves bandwidth – dual-gigabit network on dedicated switches
          – 10Gbps network can help
      • HBase
          – Needs 5+ machines to stretch its legs
          – Depends on ZooKeeper – low latency is important
          – Don’t let it run low on memory
  • 17.
      • …to do anything
          – Load, list, walk directory structures, count, process, test, back up
          – I’m not kidding
      • Hard to test, but don’t be tempted to skip it
          – You’ll eventually hit every corner case you know and don’t know
      • Backups are difficult
          – Consider a backup Hadoop cluster
          – HBase team is working on live replication
          – Solr already has built-in replication
  • 18.
      • Don’t use a single machine to load the cluster
          – You might not live long enough to see it finish
      • At OpenLogic, we spread raw source data across many machines and hard drives via NFS
          – Be very careful with NFS configuration – can hang machines
      • Load data into HBase via Hadoop map/reduce jobs
          – Turn off WAL for much better performance
          – put.setWriteToWAL(false)
      • Avoid large values (> 5MB)
          – Works, but may cause instability and/or performance issues
          – Rows and columns are cheap, so use more of them instead
      • Be careful not to over-commit to Solr
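
One way to follow the "avoid large values" advice is to split a large blob across several columns before writing it. The sketch below is plain Python with no HBase client involved: it only shows the splitting and reassembly. The column family `d`, the `part` qualifier naming, and both function names are hypothetical, and the 5MB default mirrors the threshold on the slide.

```python
def split_into_columns(
    value: bytes,
    family: str = "d",
    max_bytes: int = 5 * 1024 * 1024,
) -> dict[str, bytes]:
    """Split `value` into chunks keyed by 'family:partNNNNNN' column names."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    columns: dict[str, bytes] = {}
    for index, start in enumerate(range(0, len(value), max_bytes)):
        # Zero-padded indices keep the columns in order when sorted by name.
        columns[f"{family}:part{index:06d}"] = value[start:start + max_bytes]
    return columns


def join_columns(columns: dict[str, bytes]) -> bytes:
    """Reassemble a value produced by split_into_columns."""
    return b"".join(columns[name] for name in sorted(columns))


if __name__ == "__main__":
    blob = bytes(12 * 1024 * 1024)  # 12 MB of zero bytes as stand-in data
    parts = split_into_columns(blob)
    print(len(parts), "columns, largest is", max(map(len, parts.values())), "bytes")
```

Each entry in the returned dictionary would become one cell in a single row, so no individual value exceeds the chosen limit.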
  • 19.
      • HBase NoSQL
          – Think hash table, not relational database
      • How do I find my data if the primary key won’t cut it?
      • Solr to the rescue
          – Very fast, highly scalable search server with built-in sharding and replication – based on Lucene
          – Dynamic schema, powerful query language, faceted search, accessible via simple REST-like web API w/XML, JSON, Ruby, and other data formats
  • 20.
      • Sharding
          – Query any server – it executes the same query against all other servers in the group
          – Returns aggregated result to original caller
      • Async replication (slaves poll their masters)
          – Can use repeaters if replicating across data centers
      • OpenLogic
          – Solr farm, sharded, cross-replicated, fronted with HAProxy
              • Load balanced writes across masters, reads across slaves and masters
          – Billions of lines of code in HBase, all indexed in Solr for real-time search in multiple ways
          – Over 20 Solr fields indexed per source file
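
In Solr's distributed search, the server that receives a request fans it out to the hosts listed in the `shards` request parameter and merges the results. The sketch below builds such a request URL with Python's standard library. The host names, the `/solr/select` path layout, and the function name `distributed_query_url` are assumptions for illustration, not OpenLogic's actual setup.

```python
from urllib.parse import urlencode


def distributed_query_url(
    entry_host: str,
    shard_hosts: list[str],
    query: str,
    rows: int = 10,
) -> str:
    """Build a Solr select URL that asks `entry_host` to query every shard."""
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        # Comma-separated list of shards the receiving server should query.
        "shards": ",".join(f"{host}/solr" for host in shard_hosts),
    }
    return f"http://{entry_host}/solr/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host names.
    hosts = ["solr1.example.com:8983", "solr2.example.com:8983"]
    print(distributed_query_url(hosts[0], hosts, "license:apache"))
```

Putting a load balancer such as HAProxy in front of the farm means the entry host can be any healthy server, which matches the "query any server" point on the slide.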
  • 21.
      • Expect to learn and experiment quite a bit
          – Many moving parts, lots of decisions to make
          – You won’t get them all right the first time
      • Expect to discover new and better ways of modeling your data and processes
          – Don’t be afraid to start over once or twice
      • Consider getting outside help
          – Training, consulting, mentoring, support
  • 22.
      • Scripting is faster and easier than writing Java
      • Great for system administration tasks, testing
      • Standard HBase shell is based on JRuby
      • Very easy map/reduce jobs with JRuby and Wukong
      • Used heavily at OpenLogic
          – Productivity of Ruby
          – Power of Java Virtual Machine
          – Ruby on Rails, Hadoop integration, GUI clients
  • 24.
      JRuby
          list = ["Rod", "Neeta", "Eric", "Missy"]
          shorts = list.find_all { |name| name.size <= 4 }
          puts shorts.size
          shorts.each { |name| puts name }
          -> 2
          -> Rod Eric
      Groovy
          list = ["Rod", "Neeta", "Eric", "Missy"]
          shorts = list.findAll { name -> name.size() <= 4 }
          println shorts.size()
          shorts.each { name -> println name }
          -> 2
          -> Rod Eric
  • 25.
      • Amazon EC2
          – EBS Storage
              • 100TB * $0.10/GB/month = $120k/year
          – Double Extra Large instances
              • 13 EC2 compute units, 34.2GB RAM
              • 20 instances * $1.00/hr * 8,760 hrs/yr = $175k/year
      • 3 year reserved instances
          – 20 * $4k = $80k up front to reserve
          – (20 * $0.34/hr * 8,760 hrs/yr * 3 yrs) / 3 = $86k/year to operate
          – Totals for 20 virtual machines
              • 1st year cost: $120k + $80k + $86k = $286k
              • 2nd & 3rd year costs: $120k + $86k = $206k
              • Average: ($286k + $206k + $206k) / 3 = $232k/year
  • 26.
      • Buy your own
          – 20 * Dell servers w/12 CPU cores, 32GB RAM, 5 TB disk = $160k
              • Over 33 EC2 compute units each
          – Total: $53k/year (amortized over 3 years)
  • 27.
      • Amazon EC2
          – 20 instances * 13 EC2 compute units = 260 EC2 compute units
          – Cost: $232k/year
      • Buy your own
          – 20 machines * 33 EC2 compute units = 660 EC2 compute units
          – Cost: $53k/year
          – Does not include hosting and maintenance costs
      • Don’t think system administration goes away
          – You still “own” all the instances – monitoring, debugging, support
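
The cost comparison on the preceding slides can be reproduced as a short worked calculation. The sketch below uses only figures printed on the slides, treating 100TB as 100,000 GB. One thing it makes visible: 20 * $0.34/hr * 8,760 hrs evaluates to roughly $60k per year, not the $86k shown, so the sketch keeps the slide's $86k for the totals and computes the other value separately. Which of the two inputs is off cannot be determined from the slides. The variable names and the cost-per-compute-unit lines are additions for illustration.

```python
HOURS_PER_YEAR = 8_760
MACHINES = 20

# Amazon EC2, figures from the slides (US dollars).
ebs_per_year = 100_000 * 0.10 * 12                # 100 TB at $0.10/GB/month
on_demand_per_year = MACHINES * 1.00 * HOURS_PER_YEAR
reserve_up_front = MACHINES * 4_000
operate_per_year_slide = 86_000                   # value as printed on the slide
operate_per_year_at_034 = MACHINES * 0.34 * HOURS_PER_YEAR  # recomputed from $0.34/hr

first_year = ebs_per_year + reserve_up_front + operate_per_year_slide
later_year = ebs_per_year + operate_per_year_slide
ec2_average = (first_year + 2 * later_year) / 3

# Owned hardware: $160k purchase amortized over 3 years, excluding hosting.
own_per_year = 160_000 / 3

ec2_units = MACHINES * 13
own_units = MACHINES * 33

print(f"EBS storage:          ${ebs_per_year:,.0f}/year")
print(f"On-demand instances:  ${on_demand_per_year:,.0f}/year")
print(f"Operate at $0.34/hr:  ${operate_per_year_at_034:,.0f}/year (slide shows $86k)")
print(f"EC2 3-year average:   ${ec2_average:,.0f}/year for {ec2_units} compute units")
print(f"Owned hardware:       ${own_per_year:,.0f}/year for {own_units} compute units")
print(f"Cost per compute unit: EC2 ${ec2_average / ec2_units:,.0f}, owned ${own_per_year / own_units:,.0f}")
```

Hosting, power, and staff time for the owned machines are left out, as the slide itself notes.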
  • 29.
      • Hadoop, HBase, Solr
      • Apache, Tomcat, ZooKeeper, HAProxy
      • Stargate, JRuby, Lucene, Jetty, HSQLDB, Geronimo
      • Apache Commons, JUnit
      • CentOS
      • Dozens more
      • Too expensive to build or buy everything
  • 30.
      • Hardware
          – Power supplies, hard drives
      • Operating system
          – Kernel panics, zombie processes, dropped packets
      • Hadoop and friends
          – Hadoop datanodes, HBase regionservers, Stargate servers, Solr servers
      • Your code and data
          – Stray map/reduce jobs, strange corner cases in your data leading to program failures
  • 31.
      • Hadoop
          – SPOF around Namenode, append functionality
      • HBase
          – Backup, replication, and indexing solutions in flux
      • Solr
          – Several competing solutions around cloud-like scalability and fault-tolerance, including ZooKeeper and Hadoop integration
  • 32.
      • You can host big data in your own private cloud
          – Tools are available today that didn’t exist a few years ago
          – Fast to prototype – production readiness takes time
          – Expect to invest in training and support
      • Public clouds
          – Great for learning, experimenting, testing
          – Best for bursts vs. sustained loads
          – Beware latency, expense of long-term big data storage
      • You still need “small data”
          – SQL and NoSQL coexist peacefully
          – OpenLogic uses MySQL & Redis in addition to HBase, Solr, Memcached
  • 34. Rod Cope, rod.cope@roguewave.com
      See us in action: www.roguewave.com