This document discusses HBase, an open-source, non-relational, distributed database built on top of Hadoop. It provides an overview of why HBase is useful, examples of how Navteq uses HBase at scale, and considerations for designing HBase schemas and deploying HBase clusters, including hardware requirements and configuration tuning. The document also outlines some desired future features for HBase like better tools, secondary indexes, and security improvements.
2. Topics
Why HBase?
HBase Use Cases – HBase @ NAVTEQ
Design Considerations
Hardware/Deployment Considerations
Practical Tips (Tuning/Optimization)
Wanted Features
Ravi Veeramachaneni HBase – In Practice 2
3. Hadoop Benefits
• Stores (HDFS) and processes (MapReduce) large amounts of data
• Scales to hundreds and thousands of nodes
• Inexpensive (no license cost, low-cost hardware)
• Fast (1 TB sorted in 62 s, 1 PB in 16.25 h*)
• Availability (failover built into the platform)
• Data recoverability (a failure should not result in any data loss)
• Replication (3-way replication out of the box, configurable)
• Better throughput (time to read the whole dataset matters more than latency in reading the first record)
• Write-once, read-many-times access pattern
• Works well with structured, unstructured, or semi-structured data
*YDN Blog: Jim Gray’s benchmark @ http://developer.yahoo.com/blogs/hadoop/posts/2009/05/hadoop_sorts_a_petabyte_in_162/
4. But …
Not so good at, or does not support:
• Random access
• Updating data or files in place (writes always append at the end of the file)
• Applications that require low-latency access to data
• Lots of small files
• Multiple writers
• Not a solution for every data problem
5. Featuring HBase
HBase scales (runs on top of Hadoop)
HBase provides fast table scans for time ranges and fast key-based lookups
HBase stores null values for free
• Saves both disk space and disk I/O time
HBase supports unstructured/semi-structured data through column families
HBase has built-in version management
MapReduce data input
• Tables are sorted and have unique keys
• Reducer is often optional
• Combiner not needed
Strong community support and wide adoption
6. HBase Use Cases
To solve Big Data problems:
Sparse data (un- or semi-structured)
Cost-effectively scalable
Versioned data
Other features that may interest you:
Linear distribution of data across the data nodes
Rows are stored in byte-lexicographic sorted order
Atomic read/write/update
Data access – random and sequential reads and writes
Automatic replication of data for HA
But not for every data problem
7. NAVTEQ’s Use Case
Content is
– Constantly growing (into the high terabytes)
– Sparse and unstructured
– Provided in multiple data formats
– Ingested, processed, and delivered in transactional and batch modes
Content breadth
– 100s of millions of content records
– 100s of content suppliers + community input
Content depth
– On average, a content record has 120 attributes
– Certain types of content have more than 400 attributes
– Content is classified across 270+ categories
8. Content Processing High-level Overview
[Architecture diagram: bulk content sources, customer/community UGC, and merchant data and media enter through batch and transactional APIs; records receive a Place ID from the Place Registry and a Location ID from Location Referencing; Source & Blended Record Management feeds a Tiered Quality System; publishing delivers content in real time and on demand for bulk delivery, search, and mobile devices.]
9. HBase @ NAVTEQ
Started in 2009, HBase 0.19.x (Apache)
• 8-node VMware sandbox cluster
• Flaky, unstable, region server failures
• Switched to CDH
Early 2010, HBase 0.20.x (CDH2)
• 10-node physical sandbox cluster
• Still had a lot of challenges: RS failures, META corruption
• Cluster expanded significantly with multiple environments
Current (HBase 0.90.3)
• Moved to the CDH3u1 official release
• Multiple teams/projects using the Hadoop/HBase implementation
• Working on Hive/HBase integration, Oozie, Lucene/Solr integration, Cloudera Enterprise, and a few others
10. Measured Business Value
Scalability & deployment
• Spikes are handled by simply adding nodes
• No code changes or new deployment needed
• From 15 to 30 to 60 nodes and more, as data grows
• Deployments are well managed and controlled (from 12–16 hours down to < 2 hours)
Speed to market
• Supports real-time transactions (instead of quarterly updates)
• Batch updates are handled more efficiently (from days to hours)
Faster supplier on-boarding
• Flexible, externally managed business rules
Cheaper than the existing solution
• <$2M vs. $12M (based on projected growth)
11. HBase & Zookeeper
ZK – distributed coordination service
• Coordinates messages sent across the network between nodes (handles network failures, etc.)
HBase depends on ZK and authorizes ZK to manage cluster state
HBase hosts key information in ZK
• Location of the root catalog table
• Address of the current cluster master
• Bootstrapping a client connection to an HBase cluster
Clients connect to the ZK quorum first
• To learn the location of -ROOT-
• Clients consult -ROOT- to find the location of the .META. region
• Clients then do a lookup against the found .META. region to find the hosting user-space region and its location
• Clients cache all of the above for future traversals
12. Design Considerations
Database/schema design
• Transition to a column-oriented or flat schema
Understand your access pattern
Row-key design/implementation
• Sequential keys
• Suffer from poor distribution of load, but make good use of the block cache
• Can be addressed by pre-splitting the regions
• Randomized keys give better distribution
• Achieved through hashing on key attributes – SHA-1 or MD5
• But suffer on range scans
Too many column families (NOT good)
• Initially we had about 30 or so, now reduced to 8
Compression
• LZO or Snappy (20% better than LZO) – block level (default)
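The "hash on key attributes" idea can be sketched as a small helper (hypothetical code, not NAVTEQ's; the bucket count and key format are assumptions):

```python
import hashlib

def salted_key(natural_key: str, buckets: int = 16) -> bytes:
    """Build an HBase row key as <salt>-<natural key>.

    An MD5-derived salt prefix spreads otherwise-sequential keys across a
    fixed number of pre-split regions. The salt is deterministic, so point
    reads can recompute it from the natural key alone.
    """
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets        # deterministic bucket in [0, buckets)
    return f"{bucket:02d}-{natural_key}".encode("utf-8")
```

This buys even write distribution at the cost the slide notes: a range scan over the natural key order now has to run once per salt bucket and merge the results.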
13. Design Considerations
Serialization
• Avro didn’t work well – deserialization issues
• Developed a configurable serialization mechanism that uses JSON for everything except the Date type
Secondary indexes
• Were using ITHBase and IHBase from contrib – they don’t work well
• Redesigned the schema to remove the need for an index
• We still need them, though
Performance
• Several tunable parameters
• Hadoop, HBase, OS, JVM, networking, hardware
Scalability
• Interfacing with real-time (interactive) systems from a batch-oriented system
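A "JSON except Date" serializer of the kind described above could look like this (a hypothetical sketch; the `__type__` tag and ISO-8601 convention are assumptions, not NAVTEQ's actual wire format):

```python
import json
from datetime import datetime

def encode_record(record: dict) -> str:
    """Serialize a record to JSON, tagging datetime values explicitly."""
    def default(value):
        if isinstance(value, datetime):
            return {"__type__": "date", "iso": value.isoformat()}
        raise TypeError(f"unserializable type: {type(value).__name__}")
    return json.dumps(record, default=default)

def decode_record(payload: str) -> dict:
    """Reverse of encode_record: restore tagged dates as datetime objects."""
    def hook(obj):
        if obj.get("__type__") == "date":
            return datetime.fromisoformat(obj["iso"])
        return obj
    return json.loads(payload, object_hook=hook)
```

Tagging the one type JSON cannot represent, instead of switching formats entirely, keeps the stored cell values human-readable while still round-tripping dates losslessly.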
15. Hardware/Deployment Considerations
Hardware (Hadoop+HBase)
• Data node – 24 GB RAM, 8 cores, 4x1TB (64 GB, 24 cores, 8x2TB)
• 6 mappers and 6 reducers per node (16 mappers, 4 reducers)
• Memory allocation by process
• Data Node – 1 GB (2 GB)
• Task Tracker – 1 GB (2 GB)
• Map tasks – 6x1 GB (16x1.5 GB)
• Reduce tasks – 6x1 GB (4x1.5 GB)
• Region Server – 8 GB (24 GB)
• Total allocation: 24 GB (64 GB)
Deployment
• Do not run ZK instances on data nodes; have a separate ZK quorum (3 minimum)
• Do not run the HMaster on the NameNode
• Avoid a SPOF for the HMaster (run additional master(s))
16. HBase Configuration/Tuning
Configuring HBase
• Configuration is the key
• Many moving parts – typos, configs out of sync
• Operating system
• Raise the open-file limit (ulimit) to 32K or even higher (/etc/security/limits.conf)
• Lower vm.swappiness, or set it to 0
• HDFS
• Adjust the block size based on the use case
• Increase xceivers to 2047 (dfs.datanode.max.xceivers)
• Set the socket write timeout to 0 (dfs.datanode.socket.write.timeout)
• HBase
• Needs more memory
• No swapping – the JVM hates it
• GC pauses can cause timeouts or RS failures (read the article posted by Todd Lipcon on avoiding full GCs)
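The OS-level settings above translate to config fragments along these lines (values are the ones quoted on the slide; the `hdfs`/`hbase` user names are assumptions about which accounts run the daemons):

```
# /etc/security/limits.conf – raise the open-file limit to 32K
hdfs   -  nofile  32768
hbase  -  nofile  32768

# /etc/sysctl.conf – discourage the kernel from swapping out the JVM heap
vm.swappiness = 0
```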
17. HBase Configuration/Tuning
HBase
• Per-cluster
• Turn off the block cache if the hit ratio is low (hfile.block.cache.size, default 20%)
• Per-table
• MemStore flush size (hbase.hregion.memstore.flush.size, default 64 MB, and hbase.hregion.memstore.block.multiplier, default 2)
• Max file size (hbase.hregion.max.filesize, default 256 MB)
• Per-CF
• Compression
• Bloom filter
• Per-RS
• Amount of heap in each RS to reserve for all MemStores (hbase.regionserver.global.memstore.upperLimit, default 0.4)
• MemStore flush size
• Max file size
• Per-SF
• Maximum number of StoreFiles per store to allow (hbase.hstore.blockingStoreFiles, default 7)
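The per-CF and per-table knobs above are set from the HBase shell; a sketch (table and CF names are examples, example values rather than the defaults, and attribute syntax varies a little between HBase versions):

```ruby
# Per-CF settings at table creation: compression and a row-level Bloom filter
create 'content', {NAME => 'cf', COMPRESSION => 'SNAPPY', BLOOMFILTER => 'ROW'}

# Per-table settings via table attributes (0.90-era shell requires disabling first)
disable 'content'
alter 'content', METHOD => 'table_att', MAX_FILESIZE => '1073741824'       # 1 GB
alter 'content', METHOD => 'table_att', MEMSTORE_FLUSHSIZE => '134217728'  # 128 MB
enable 'content'
```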
18. HBase Configuration/Tuning
• HBase
• Write (put) optimization (Ryan Rawson’s HUG8 presentation – HBase importing)
– hbase.regionserver.global.memstore.upperLimit=0.3
– hbase.regionserver.global.memstore.lowerLimit=0.15
– hbase.regionserver.handler.count=256
– hbase.hregion.memstore.block.multiplier=8
– hbase.hstore.blockingStoreFiles=25
• Control the number of store files (hbase.hregion.max.filesize)
Security
• Still in flux; robust RBAC is needed
Reliability
• The NameNode is a SPOF
• HBase is sensitive to region server failures
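As an hbase-site.xml fragment, the write-optimization values listed above (taken directly from the slide; tune against your own workload) would read:

```xml
<!-- hbase-site.xml: write-heavy import settings quoted on the slide -->
<property>
  <name>hbase.regionserver.global.memstore.upperLimit</name>
  <value>0.3</value>
</property>
<property>
  <name>hbase.regionserver.global.memstore.lowerLimit</name>
  <value>0.15</value>
</property>
<property>
  <name>hbase.regionserver.handler.count</name>
  <value>256</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>8</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <value>25</value>
</property>
```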
19. Desired Features
Better operational tools for using Hadoop and HBase
• Job management, backup, restore, user provisioning, general administrative tasks, etc.
Support for secondary indexes
Full-text indexing and search (Lucene/Solr integration?)
HA support for the NameNode
Data replication for HA & DR
Security at the table, CF, and row level
Good documentation (it’s getting better, though – Lars George’s book is now out)