HBase 1.0 is the new stable major release, and the start of "semantic versioned" releases. We will cover new features, changes in behavior and requirements, source/binary and wire compatibility details, and upgrading. We'll also dive deep into the new standardized client API in 1.0, which establishes a separation of concerns, encapsulates what is needed from how it's delivered, and guarantees future compatibility while freeing the implementation to evolve.
5. Why 1.0 now?
Ran out of numbers, which was the plan for switching to the 0.9x versions
Community agreement that HBase has already reached the maturity level
Start semantic versioning and compatibility guarantees
6. "Apache HBase v1.0 marks a major milestone in the project's development. It is a monumental moment that the army of contributors who have made this possible should all be proud of. The result is a thing of collaborative beauty that also happens to power key, large-scale Internet platforms."
-- Michael Stack
"The HBase 1.0 release appropriately acknowledges a maturity already achieved by the Apache HBase community and software both, and is a great occasion to learn more about HBase, how it can help you solve your scale data challenges, and the growing ecosystem of Open Source and commercial software that chooses HBase as foundation."
-- Andrew Purtell
https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72
10. Release goals
The 1.0.0 release has three goals:
1. Lay a stable foundation for future 1.x releases;
2. Stabilize running HBase clusters and their clients; and
3. Make versioning and compatibility dimensions explicit.
12. Overview
Over 1500 JIRA issues resolved on top of 0.98.0!
See the release announcement for a comprehensive summary
13. API overhaul
Introduced new base interfaces
Client API is explicitly marked (InterfaceAudience.Public)
Javadoc for the client side is published separately
Client API will remain source compatible within 1.x
14. Read availability with region replicas
Phase 1 of “region replicas” feature. (Phase 2 in 1.1)
Each region can have “replicas” hosted on other region servers
Only the primary accepts writes
Reads can be performed with STRONG or TIMELINE consistency
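A TIMELINE read might look roughly like the sketch below. It assumes the hbase-client 1.0 jar on the classpath, an already open Connection, and a table created with region replication enabled; the class name TimelineRead and its parameters are hypothetical and chosen only for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Consistency;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative helper, not part of HBase.
final class TimelineRead {

  /** Reads one row, allowing the answer to come from a secondary replica. */
  static Result read(Connection connection, String tableName, String row) throws IOException {
    try (Table table = connection.getTable(TableName.valueOf(tableName))) {
      Get get = new Get(Bytes.toBytes(row));
      // The default is Consistency.STRONG, which is served by the primary only.
      get.setConsistency(Consistency.TIMELINE);
      Result result = table.get(get);
      if (result.isStale()) {
        // The data came from a secondary replica and may lag behind the primary.
        System.err.println("Stale read for row " + row);
      }
      return result;
    }
  }
}
```

Callers that cannot tolerate stale data would keep the default STRONG consistency and accept that reads wait on the primary.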
15. Online config change
Configuration can be updated while the region
server is running
hbase> update_all_config
hbase> update_master_config
hbase> update_config '<serverName>'
Only some configs can be updated online
(some compaction / load balancer configs for now)
Other forward ports from 0.89-fb branch
16. New and noteworthy
Extensive documentation/website improvements
Automatic tuning of global memstore and block cache sizes
Bucket cache easier to configure
Compressed blocks in the block cache
Pluggable replication endpoint
Basic client backpressure mechanism
17. New and noteworthy cont.
Docker file
Per-cell TTL
CopyTable with --bulkload
Truncate table command
Atomic Table.checkAndMutate()
Namespace permissions
18. Under the covers
Cell based read/write path
Ring buffer based WAL improvements
Multi WAL files in HRegionServer
ZK-less assignment (disabled by default)
Client Preemptive Fast Fail
Combining mvcc and seqIds
Various security, tags and visibility labels improvements
Various fixes to REST server
Numerous improvements in other areas and bug fixes too long to list here.
19. Changes in behavior: JDK
JDK Version   HBase-1.1   HBase-1.0   HBase-0.98
JDK 6         ✗           ✗           ✓
JDK 7         ✓           ✓           ✓
JDK 8         ✓*          ✓*          ✓*
✓*: should work, but not well tested
https://hbase.apache.org/book.html#basic.prerequisites
20. Changes in behavior: Hadoop
Hadoop Version   HBase-1.1   HBase-1.0   HBase-0.98
Hadoop-1.x       ✗           ✗           ✓*
Hadoop-2.2       ✗           ✓*          ✓*
Hadoop-2.3       ✓*          ✓*          ✓
Hadoop-2.4       ✓           ✓           ✓
Hadoop-2.5       ✓           ✓           ✓
Hadoop-2.6       ✓           ✓           ✓*
✓*: should work, but not well tested
https://hbase.apache.org/book.html#basic.prerequisites
21. Changes in behavior
ZooKeeper-3.4.x is required
Default ports changed to 160XX (out of ephemeral range)
HFile v3 is default
Slab cache removed
Default heap is ¼ of physical memory (instead of 1 GB)
23. Semantic Versioning
Starting with the 1.0.0 release, HBase works toward Semantic Versioning
MAJOR.MINOR.PATCH[-identifiers]
PATCH: only backward-compatible (BC) bug fixes
MINOR: backward-compatible new features
MAJOR: incompatible changes
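The scheme above can be sketched in a few lines of plain Java. This is an illustrative parser and not code from HBase: the class SemVer, its Bump enum and its method names are invented for the example, and it compares versions without regard to direction (upgrade or downgrade).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Minimal MAJOR.MINOR.PATCH[-identifiers] parser; an illustrative sketch, not HBase code. */
final class SemVer {

  /** Most significant component that differs between two versions. */
  enum Bump { NONE, PATCH, MINOR, MAJOR }

  private static final Pattern FORMAT =
      Pattern.compile("(\\d+)\\.(\\d+)\\.(\\d+)(?:-([0-9A-Za-z.-]+))?");

  final int major;
  final int minor;
  final int patch;
  final String identifiers; // null when the optional suffix is absent

  private SemVer(int major, int minor, int patch, String identifiers) {
    this.major = major;
    this.minor = minor;
    this.patch = patch;
    this.identifiers = identifiers;
  }

  static SemVer parse(String text) {
    Matcher m = FORMAT.matcher(text);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not MAJOR.MINOR.PATCH[-identifiers]: " + text);
    }
    return new SemVer(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)),
        Integer.parseInt(m.group(3)), m.group(4));
  }

  /** Classifies the difference between this version and another one. */
  Bump bumpTo(SemVer other) {
    if (major != other.major) return Bump.MAJOR;
    if (minor != other.minor) return Bump.MINOR;
    if (patch != other.patch) return Bump.PATCH;
    return Bump.NONE;
  }

  /** Per the slide, only a MAJOR change may carry incompatible changes. */
  static boolean promisesBackwardCompatibility(Bump bump) {
    return bump != Bump.MAJOR;
  }
}
```

For instance, `SemVer.parse("1.0.0").bumpTo(SemVer.parse("1.1.0"))` classifies the step as a MINOR bump, for which backward compatibility is promised.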
24. Post 1.0 versions
New versioning already in action
● 1.0.0
● 1.0.1 (patch release)
● 1.1.0 (minor release)
1.0.x and 1.1.x are expected to have roughly monthly releases
1.2.0 and 2.0.0 in the works
25. HBase API surface
Client API
Explicitly marked with InterfaceAudience.Public
Get/Put/Table/Connection, etc
LimitedPrivate API
Explicitly marked with InterfaceAudience.LimitedPrivate
Coprocessors, replication APIs
Private API
Explicitly marked with InterfaceAudience.Private
All other classes that are not marked
Also InterfaceStability.{Stable,Evolving,Unstable}
26. Compatibility dimensions by release type
                                          Major   Minor    Patch
Client-Server Wire Compatibility          ✗       ✓        ✓
Server-Server Compatibility               ✗       ✓        ✓
File Format Compatibility                 ✗*      ✓        ✓
Client API Compatibility                  ✗       ✓        ✓
Client Binary Compatibility               ✗       ✗        ✓
Server-Side Limited API Compatibility     ✗       ✗*/✓*    ✓
Dependency Compatibility                  ✗       ✓        ✓
Operational Compatibility                 ✗       ✗        ✓
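As a rough illustration, the unambiguous rows of this table can be encoded as a lookup. The sketch below is plain Java with invented names (CompatibilityMatrix, Dimension, Release); it deliberately leaves out the File Format and Server-Side Limited API rows, whose footnoted cells are not explained on the slide.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Lookup over the un-footnoted rows of the compatibility table; illustrative only. */
final class CompatibilityMatrix {

  enum Release { MAJOR, MINOR, PATCH }

  enum Dimension {
    CLIENT_SERVER_WIRE, SERVER_SERVER, CLIENT_API, CLIENT_BINARY, DEPENDENCY, OPERATIONAL
  }

  // For each dimension, the release types across which compatibility is kept.
  private static final Map<Dimension, Set<Release>> KEPT = new EnumMap<>(Dimension.class);

  static {
    Set<Release> minorAndPatch = EnumSet.of(Release.MINOR, Release.PATCH);
    Set<Release> patchOnly = EnumSet.of(Release.PATCH);
    KEPT.put(Dimension.CLIENT_SERVER_WIRE, minorAndPatch);
    KEPT.put(Dimension.SERVER_SERVER, minorAndPatch);
    KEPT.put(Dimension.CLIENT_API, minorAndPatch);
    KEPT.put(Dimension.CLIENT_BINARY, patchOnly);
    KEPT.put(Dimension.DEPENDENCY, minorAndPatch);
    KEPT.put(Dimension.OPERATIONAL, patchOnly);
  }

  /** True when the table shows a check mark for this dimension and release type. */
  static boolean isKept(Dimension dimension, Release release) {
    return KEPT.get(dimension).contains(release);
  }
}
```

Read this way, a minor release keeps wire and source compatibility, so old clients can still talk to the cluster, but it may require recompiling client applications.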
27. 1.0.x Compatibility with earlier: Source
1.0.x is (mostly) source compatible with earlier versions
Filter / Coprocessor users will see some changes
We strongly advise ALL users to switch to the new API
Deprecated APIs will be removed (in 2.0)
28. 1.0.x Compatibility with earlier: Binary
1.0 is NOT binary compatible with earlier versions
Clients/coprocessors have to be recompiled to link against 1.0 jars
You cannot simply drop in 1.0 jars under an application compiled against 0.98
29. 1.0.x Compatibility with earlier: Wire
1.0.x is wire compatible with 0.98.x releases
A 0.98.x client can be used to access a 1.0.x cluster (allows rolling upgrades)
NOT binary compatible with earlier (0.96, 0.94)
HFile v3 is default. Once upgraded, cannot “go back”
31. Upgrade to 1.0.x
From 0.98.x
Upgrading in either regular or rolling fashion is supported.
From 0.96.x
Supported with a shutdown and restart of the cluster.
No rolling upgrades.
No need to run extra steps/scripts.
From 0.94.x
Supported similarly to upgrade from 0.94 -> 0.96.
The upgrade script should be run to rewrite cluster level metadata.
From earlier versions (0.92,0.90,etc) upgrade is not supported
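The upgrade rules above amount to a small decision table. A minimal sketch in plain Java follows; the class UpgradePath and the Procedure names are invented for illustration, and versions the slide does not mention (for example 0.95 or 0.97) are rejected rather than guessed at.

```java
/** Maps a source version to the 1.0.x upgrade procedure listed on the slide; illustrative only. */
final class UpgradePath {

  enum Procedure {
    ROLLING_OR_RESTART,               // from 0.98.x
    FULL_RESTART,                     // from 0.96.x: no rolling upgrade, no extra scripts
    FULL_RESTART_WITH_UPGRADE_SCRIPT, // from 0.94.x: as for 0.94 -> 0.96
    NOT_SUPPORTED                     // 0.92, 0.90 and earlier
  }

  static Procedure to10xFrom(String version) {
    String[] parts = version.split("\\.");
    if (parts.length < 2 || !parts[0].equals("0")) {
      throw new IllegalArgumentException("Expected a 0.x version, got: " + version);
    }
    int minor = Integer.parseInt(parts[1]);
    switch (minor) {
      case 98:
        return Procedure.ROLLING_OR_RESTART;
      case 96:
        return Procedure.FULL_RESTART;
      case 94:
        return Procedure.FULL_RESTART_WITH_UPGRADE_SCRIPT;
      default:
        if (minor < 94) {
          return Procedure.NOT_SUPPORTED;
        }
        // The slide says nothing about this release line, so do not guess.
        throw new IllegalArgumentException("Upgrade path not covered: " + version);
    }
  }
}
```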
33. Why the new interfaces?
HBase 1.0 had a goal to create new client interfaces
● Explicit contracts - Clear definition of the surface
● Defining a standard API in the code
● Clearer focus of responsibility - each piece doing one thing well.
36. Managed Connections Going Away
The HBase client used to have implicit connection management.
Managed connections tried to do lifecycle management without understanding the application, sometimes with unpredictable results.
HBase 1.0 introduces explicit Connection management.
37. Connection
Simple replacement for HConnection
Focal point to get a Table, RegionLocator, Admin, or BufferedMutator
Use TableName instead of String/byte[]
User Managed - must call connection.close()
Connections have a cache of region metadata and a shared threadpool; close() releases shared resources.
38. Admin
Replaces HBaseAdmin for administration functionality
create/delete/list Table and Snapshots, split table, add/remove table columns, etc.
Retrieved via connection.getAdmin().
Use TableName object instead of String/byte[]
Remember to .close()
39. RegionLocator
Region metadata related functionality
get start/end keys, get all regions, get region for a row key
No manipulation of regions. That’s in Admin.
Lightweight - uses cached region information from connection
Remember to .close()
40. Table (part I)
Most of HTable’s methods - CRUD
put, delete, get - both single and list
increment, append
scan
batch
checkAnd*
coprocessor service
41. Table (Part II)
Removed autoflush
The autoflush functionality was complex and was used for batch writes. BufferedMutator was introduced for that purpose.
One Table per thread (Table instances are not thread-safe)
Remember to close()
Releases the threadpool
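The Connection, Admin, RegionLocator and Table slides above can be tied together in one short sketch. It assumes the hbase-client 1.0 jar on the classpath and a reachable cluster configured through hbase-site.xml; the table name demo_table, the column family cf and the class name ClientApiSketch are hypothetical.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

final class ClientApiSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create(); // picks up hbase-site.xml from the classpath
    TableName name = TableName.valueOf("demo_table");  // hypothetical, pre-created table
    byte[] family = Bytes.toBytes("cf");               // hypothetical column family
    byte[] qualifier = Bytes.toBytes("q");
    byte[] row = Bytes.toBytes("row1");

    // The Connection is heavyweight and user managed: create it once and close it explicitly.
    try (Connection connection = ConnectionFactory.createConnection(conf)) {

      // Admin, RegionLocator and Table are lightweight; get them per use and close them.
      try (Admin admin = connection.getAdmin()) {
        if (!admin.tableExists(name)) {
          throw new IllegalStateException("Create " + name + " before running this sketch");
        }
      }

      try (RegionLocator locator = connection.getRegionLocator(name)) {
        HRegionLocation location = locator.getRegionLocation(row);
        System.out.println("Row is served by " + location.getHostname());
      }

      try (Table table = connection.getTable(name)) {
        Put put = new Put(row);
        put.addColumn(family, qualifier, Bytes.toBytes("value"));
        table.put(put);

        Result result = table.get(new Get(row));
        System.out.println(Bytes.toString(result.getValue(family, qualifier)));
      }
    }
  }
}
```

The try-with-resources blocks make the "remember to close()" advice mechanical: each lightweight object is closed as soon as its block ends, and the Connection is closed last.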
42. BufferedMutator (part I)
Autoflush and BufferedMutator are used when “writes are small and many; it especially makes sense when there is no natural flush point.” -- stack on HBASE-12728
Supports all batched Mutations
Puts were supported before.
Adds batched Deletes, Appends, Increments, RowMutations
43. BufferedMutator (part II)
Used in MapReduce jobs
Can be used in high-performance servlets, if you can tolerate some data loss.
Use an ExceptionListener to handle failed writes
CLOSE!
close() does a flush(); without it you might lose data still in the buffer
also closes threadpools
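A buffered write path along these lines might look like the following sketch. It assumes the hbase-client 1.0 jar and a reachable cluster; the table name demo_table, the column family cf and the class name BufferedWriteSketch are hypothetical, and the listener only logs failures where a real application would decide how to recover.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.BufferedMutator;
import org.apache.hadoop.hbase.client.BufferedMutatorParams;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
import org.apache.hadoop.hbase.util.Bytes;

final class BufferedWriteSketch {
  public static void main(String[] args) throws IOException {
    // Called when buffered mutations still fail after the client's retries.
    BufferedMutator.ExceptionListener listener = new BufferedMutator.ExceptionListener() {
      @Override
      public void onException(RetriesExhaustedWithDetailsException e, BufferedMutator mutator) {
        for (int i = 0; i < e.getNumExceptions(); i++) {
          System.err.println("Failed to send mutation for " + e.getRow(i));
        }
      }
    };

    BufferedMutatorParams params =
        new BufferedMutatorParams(TableName.valueOf("demo_table")).listener(listener);

    try (Connection connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
         BufferedMutator mutator = connection.getBufferedMutator(params)) {
      for (int i = 0; i < 1000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value-" + i));
        mutator.mutate(put); // buffered on the client, sent in batches
      }
      // Leaving the try block calls close(), which flushes whatever is still buffered.
    }
  }
}
```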
48. Thanks to the users and developers who made 1.0 happen!
References:
https://hbase.apache.org/book.html#hbase.versioning
https://hbase.apache.org/book.html#basic.prerequisites
https://hbase.apache.org/book.html#hadoop
https://hbase.apache.org/book.html#upgrade1.0
https://mail-archives.apache.org/mod_mbox/hbase-dev/201502.mbox/%3CCAMUu0w-3K1aZgY7nJReUaMBF1Qj%2B2DNwDNOth1su%2Bxr93zGy3w%40mail.gmail.com%3E
https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces72