With HBase hitting the 1.0 mark and adoption/production use cases continuing to grow, it's been an exciting year since last we met at HBaseCon 2014. What is the state of HBase today, and where does it go from here?
Outline
● State of the Project
● State of the Software
● State of the Ecosystem
2.0
State of the Project
● Backing medium- and high-scale services
o Hundreds of enterprises
o Some of the largest Internet companies in the world
● Well established, mature codebase
o >100 contributors, 4.2M lines of code, 1200+ man-years of total effort*
● Runs on HDFS, MapR, Gluster, GPFS, etc.
● As a service: AWS EMR, HDInsight, etc.
*Source: OpenHub https://www.openhub.net/p/hbase
Project: Vision
Simple, steady, and powerful: “A first class high
performance horizontally scalable data storage
engine for Big Data, suitable as the store of
record for mission critical data.”
Project: Goals
● Availability: Always more, always faster
● Stability and Operability
o Continuous Improvement
● Scaling (up and down)
● Readying for NextGen ‘commodity’ hardware
● Multi-tenancy
● Diversifying our ecosystem
o Come talk to us if you’re building a Big Data product
Project: Committers
● Eight new committers
Zhang Duo (duozhang), Andrey Stepachev (octo47),
Liu Shaohui (liushaohui), Virag Kothari (virag),
Sean Busbey (busbey), Srikanth Srungarapu (ssrungarapu),
Jing Chen (Jerry) He (jerryjch), Misty Stanley-Jones (misty)
● Now 43 committers from a diverse group of companies,
including Cask, Cloudera, Facebook, Hortonworks,
IBM, Intel, Salesforce, Xiaomi, Yahoo!, and Google
Project: PMC
● First chair rotation in the project lifetime
Michael Stack (stack), outgoing
Andrew Purtell (apurtell), incoming
● Four new members
Sean Busbey (busbey)
Matteo Bertozzi (mbertozzi)
Nick Dimiduk (ndimiduk)
Jeffrey Zhong (jefferyz)
Software: Semantic Versioning
● Client/server API cleanup continuing
● Dependency isolation/shading
● Goal is full semver compliance
● See the HBase 1.0 talk and HBase 2.0 panel for more
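To make the semver goal concrete, here is a minimal Python sketch of the generic Semantic Versioning rule: only a major version bump may break the public API, and 0.x releases promise nothing. The names `parse_version`, `upgrade_kind`, and `is_api_compatible` are hypothetical helpers for illustration. This is not HBase's own compatibility policy, which the project documents separately and which may differ in detail.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a plain 'MAJOR.MINOR.PATCH' string.

    Assumption: no pre-release or build suffix (e.g. '-SNAPSHOT') is present.
    """
    major, minor, patch = (int(part) for part in text.split("."))
    return Version(major, minor, patch)


def upgrade_kind(old: str, new: str) -> str:
    """Classify an upgrade as 'major', 'minor', or 'patch'."""
    a, b = parse_version(old), parse_version(new)
    if b <= a:
        raise ValueError(f"{new} is not newer than {old}")
    if b.major != a.major:
        return "major"
    if b.minor != a.minor:
        return "minor"
    return "patch"


def is_api_compatible(old: str, new: str) -> bool:
    """Return True if generic semver promises the public API will not break."""
    if parse_version(old).major == 0:
        # Semver treats 0.x as initial development: anything may change.
        return False
    return upgrade_kind(old, new) != "major"
```

Under these rules, `is_api_compatible("1.0.0", "1.1.0")` holds, while a move from `"1.1.0"` to `"2.0.0"`, or any upgrade starting from a 0.x release, carries no such promise.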
Software: Focus
● Smaller regions, more regions (scaling)
o Less write amplification
o 1M+ region clusters
● Stability
o Procedure v2
o Assignment improvements/stability
● Scanners
o Chunking, Heartbeating, ‘Parking’, Streaming
Software: Focus
● Adaptation: Workloads
o HBase as Medium Object Store (MOB)
● Tunable Consistency
o TIMELINE Consistency
● Improving coprocessor API supportability
● Profile-driven optimization
● Improved GC-friendliness, use more RAM
o Offheaping
Ecosystem: SQL
● Phoenix
o Phoenix 4.4.0 RC for HBase 1.0.0
o SQL over raw HBase tables
● Trafodion
o Trafodion 1.1.0 announced
o Heading for Apache Incubator!
● LeanXcale
Get Involved!
Follow us on Twitter
@HBase
Follow us on Facebook
Follow our Blog
https://blogs.apache.org/hbase/
Join our mailing lists
user-subscribe@hbase.apache.org
dev-subscribe@hbase.apache.org
Welcome, everyone! Today is going to be fantastic, and we have quite the agenda for you. In the interest of time, let's dive right in.
Deployments you will hear about today.
Not competing with Cassandra (C*). MTTR work. Our replication is better than theirs: master-master, large sequential scans. We should not have to do these C* vs. HBase fights anymore… No advertising, no coherent leadership, no PM… That doesn’t help in sales. No one talks about it.
New logo
We won’t leave you behind. 0.94 might be finished; we want you to move up to newer versions. Counts are since the last HBaseCon.
Add and delete column family while table is online.
See the ecosystem track and use cases for a sampling of what is going on in the HBase ecosystem these days. Be sure to attend the SQL SmackDown.
Ambari shipping