HBaseCon 2015 General Session: State of HBase

With HBase hitting the 1.0 mark and adoption/production use cases continuing to grow, it's been an exciting year since last we met at HBaseCon 2014. What is the state of HBase today, and where does it go from here?

Published in: Software

  1. hbasecon.com
  2. The State of HBase: Andrew Purtell, Enis Söztutar, Michael Stack
  3. About Us ● Andrew Purtell: Salesforce, Release Manager for 0.98, @akpurtell ● Enis Söztutar: Hortonworks, Release Manager for 1.0, @enissoz ● Michael Stack: Cloudera, @saintstack
  4. Outline ● State of the Project ● State of the Software ● State of the Ecosystem
  5. Outline ● State of the Project ● State of the Software ● State of the Ecosystem
  6. State of the Project ● Backing medium- and high-scale services o Hundreds of enterprises o Some of the largest Internet companies in the world ● Well established, mature codebase o >100 contributors, 4.2M lines of code, 1200+ man-years of total effort* ● Runs on HDFS, MapR, Gluster, GPFS, etc. ● As a service: AWS EMR, HDInsight, etc. *Source: OpenHub https://www.openhub.net/p/hbase
  7. Project: Vision Simple, steady, and powerful: “A first class high performance horizontally scalable data storage engine for Big Data, suitable as the store of record for mission critical data.”
  8. Project: Goals ● Availability: Always more, always faster ● Stability and Operability o Continuous Improvement ● Scaling (up and down) ● Readying for NextGen ‘commodity’ hardware ● Multi-tenancy ● Diversifying our ecosystem o Come talk to us if you’re building a Big Data product
  9. Project: Delivered ● 1.0.0: Released Feb 24, 2015 ● 2.0.0: by HBaseCon 2016!
  10. Project: Committers ● Eight new committers: Zhang Duo (duozhang), Andrey Stepachev (octo47), Liu Shaohui (liushaohui), Virag Kothari (virag), Sean Busbey (busbey), Srikanth Srungarapu (ssrungarapu), Jing Chen (Jerry) He (jerryjch), Misty Stanley-Jones (misty) ● Now 43 committers, from a diverse group of companies including Cask, Cloudera, Facebook, Hortonworks, IBM, Intel, Salesforce, Xiaomi, Yahoo!, and Google
  11. Project: PMC ● First chair rotation in the project lifetime o Michael Stack (stack), outgoing o Andrew Purtell (apurtell), incoming ● Four new members o Sean Busbey (busbey) o Matteo Bertozzi (mbertozzi) o Nick Dimiduk (ndimiduk) o Jeffrey Zhong (jefferyz)
  12. Project: dev@hbase.a.o
  13. Project: user@hbase.a.o
  14. Outline ● State of the Project ● State of the Software ● State of the Ecosystem
  15. Software: Semantic Versioning MAJOR.MINOR.PATCH[-identifiers] ● PATCH: only backwards-compatible (BC) bug fixes ● MINOR: BC new features ● MAJOR: incompatible changes
  16. Software: Releases
  17. Software: Releases ● 0.94.x o Eight releases: 0.94.20 - 0.94.27* ● 0.98.x o Twelve releases: 0.98.2 - 0.98.11 ● 1.0.x o Two releases: 1.0.0 - 1.0.1 ● 1.1.x o Release candidate
  18. Software: Issues ● 13K+ issues created ● 12K issues resolved!
  19. Software: Issues ● ~3K issues resolved last year
  20. Software: Semantic Versioning Starting with 1.0.0, HBase is working towards Semantic Versioning* of releases… * http://semver.org/
  21. Software: Semantic Versioning MAJOR.MINOR.PATCH[-identifiers]
  22. Software: Semantic Versioning MAJOR.MINOR.PATCH[-identifiers] ● PATCH: only backwards-compatible (BC) bug fixes
  23. Software: Semantic Versioning MAJOR.MINOR.PATCH[-identifiers] ● PATCH: only backwards-compatible (BC) bug fixes ● MINOR: BC new features
  24. Software: Semantic Versioning 1.0.0, 1.0.1, 1.1.0, 2.0.0-alpha, 2.0.0-beta
  25. Software: Semantic Versioning ● Client/server API cleanup continuing ● Dependency isolation/shading ● Goal is full semver compliance ● See the HBase-1.0 talk and HBase-2.0 panel for more
  26. Software: Focus ● Smaller regions, more regions (scaling) o Less write amplification o 1M+ region clusters ● Stability o Procedure Version 2 o Assignment improvements/stability ● Scanners o Chunking, Heartbeating, ‘Parking’, Streaming
  27. Software: Focus ● Adaptation: Workloads o HBase as Medium Object Store (MOB) ● Tunable Consistency o TIMELINE Consistency ● Improving coprocessor API supportability ● Profile-driven optimization ● Improved GC-friendliness, use more RAM o Offheaping
  28. Software: Focus ● Multitenancy o Table groups o Quotas o Priorities ● Using all of the machine o RAM o IOPS o All of the CPUs
  29. Outline ● State of the Project ● State of the Software ● State of the Ecosystem
  30. Ecosystem ● OpenTSDB ● Transaction Managers o Themis, Tephra, Omid2 ● Lots-o-Graphs-on-HBase ● SQL ● Hadoop dogfooding HBase ● Google Cloud Bigtable (keynote follows)
  31. Ecosystem: SQL ● Phoenix o Phoenix 4.4.0RC for HBase 1.0.0 o SQL over raw HBase tables ● Trafodion o Trafodion 1.1.0 announced o Heading for Apache Incubator! ● LeanXcale
  32. Ecosystem: Dogfooding ● YARN-2928 Application Timeline Service ● HIVE-9452 HBase to store Hive metadata ● AMBARI-5707 Ambari Metrics System
  33. Get Involved! ● Follow us on Twitter: @HBase ● Follow us on Facebook ● Follow our blog: https://blogs.apache.org/hbase/ ● Join our mailing lists: user-subscribe@hbase.apache.org, dev-subscribe@hbase.apache.org
  34. hbasecon.com
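
The Semantic Versioning slides describe the MAJOR.MINOR.PATCH[-identifiers] scheme and list the sequence 1.0.0, 1.0.1, 1.1.0, 2.0.0-alpha, 2.0.0-beta. As a rough illustration of how that scheme orders and classifies releases, here is a minimal sketch in Python. The helper names (`parse`, `sort_key`, `change_kind`) are hypothetical and are not part of HBase or any HBase tooling; the ordering rules are taken from semver.org, and the sketch ignores build metadata.

```python
import re
from typing import NamedTuple, Tuple

# MAJOR.MINOR.PATCH with optional "-identifiers" (dot-separated), as on the slides.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    identifiers: Tuple[str, ...] = ()


def parse(text: str) -> Version:
    """Parse a version string; hypothetical helper for illustration only."""
    m = _VERSION_RE.match(text)
    if m is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH[-identifiers] version: {text!r}")
    ids = tuple(m.group(4).split(".")) if m.group(4) else ()
    return Version(int(m.group(1)), int(m.group(2)), int(m.group(3)), ids)


def sort_key(v: Version):
    """Key that orders versions by semver precedence.

    A version with identifiers (a pre-release) sorts before the same
    version without them; numeric identifiers sort before alphanumeric ones.
    """
    ids = tuple((0, int(i), "") if i.isdigit() else (1, 0, i) for i in v.identifiers)
    return (v.major, v.minor, v.patch, 0 if v.identifiers else 1, ids)


def change_kind(old: str, new: str) -> str:
    """Name the most significant field that differs between two versions."""
    a, b = parse(old), parse(new)
    if a.major != b.major:
        return "major"        # incompatible changes allowed
    if a.minor != b.minor:
        return "minor"        # backwards-compatible new features
    if a.patch != b.patch:
        return "patch"        # backwards-compatible bug fixes only
    return "pre-release" if a.identifiers != b.identifiers else "same"


if __name__ == "__main__":
    releases = ["2.0.0-beta", "1.1.0", "1.0.0", "2.0.0-alpha", "1.0.1"]
    print(sorted(releases, key=lambda s: sort_key(parse(s))))
    print(change_kind("1.0.1", "1.1.0"))
```

Under these assumptions, a client could use `change_kind` to decide whether an upgrade is expected to be backwards compatible ("patch" or "minor") or may contain incompatible changes ("major").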
