HBase Data Types

Describes the motivations behind the new DataType interface, considerations when adding types to a schemaless database, and example usage.

Published in: Technology, Education

  1. HBase Data Types. Nick Dimiduk, Hortonworks. @xefyr, n10k.com
  2. Agenda • Motivations • Progress thus far • Future work • Examples • More Examples. Licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License. 2014-11-18
  3. Why introduce types? • Δ(SQL, byte[]): (╯°□°)╯︵ ┻━┻ • Rule of least surprise • Interoperability across tools • Distill best practices
  4. Considerations • Opt-in for current users • Easy transition for existing applications • Client-side only, mostly – Filters, Split policies, Coprocessors, Block encoding • Avoid POJO constraints – No required base-class/interface – No magic (avoid ASM, ORM) • Non-Java clients • HBASE-8089
  6. Inspiration • Orderly • PostgreSQL / PostGIS • HBASE-7221 • HBASE-7692
  7. Features: Encoding • Order preservation • Override direction (ASC/DSC) • Fixed, variable-width • Nullable • Self-identifying • Efficient
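
To make "order preservation" and "override direction" on slide 7 concrete, here is a minimal, self-contained sketch for a 32-bit integer. It is not the OrderedBytes implementation; the class and method names are hypothetical. The idea shown is a common one: flip the sign bit and write big-endian so that unsigned byte-wise comparison (the way HBase compares row keys) matches numeric order, then invert every byte to get descending order.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of HBase.
class OrderPreservingInt {
    /** Encode so that unsigned lexicographic byte order matches numeric order. */
    static byte[] encode(int value, boolean ascending) {
        int flipped = value ^ 0x80000000; // flip the sign bit: negatives sort before positives
        byte[] out = {
            (byte) (flipped >>> 24), (byte) (flipped >>> 16),
            (byte) (flipped >>> 8), (byte) flipped
        };
        if (!ascending) {
            // Inverting every byte reverses the sort direction.
            for (int i = 0; i < out.length; i++) {
                out[i] = (byte) ~out[i];
            }
        }
        return out;
    }

    /** Reverse of encode(); the caller must pass the same direction. */
    static int decode(byte[] bytes, boolean ascending) {
        int v = 0;
        for (byte b : bytes) {
            int u = (ascending ? b : ~b) & 0xff;
            v = (v << 8) | u;
        }
        return v ^ 0x80000000;
    }

    public static void main(String[] args) {
        int[] values = {42, -7, 0, Integer.MIN_VALUE, Integer.MAX_VALUE};
        byte[][] keys = new byte[values.length][];
        for (int i = 0; i < values.length; i++) {
            keys[i] = encode(values[i], true);
        }
        // Sort the encoded keys as raw unsigned bytes.
        Arrays.sort(keys, (a, b) -> Arrays.compareUnsigned(a, b));
        for (byte[] k : keys) {
            System.out.println(decode(k, true)); // -2147483648, -7, 0, 42, 2147483647
        }
    }
}
```

A plain two's-complement big-endian encoding would sort -7 after 42, because its leading byte is 0xFF; the sign-bit flip is what repairs the ordering.
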
  8. Features: API • Complex type encoding – Compound rowkey pattern – Order preservation – Nullable fields • Runtime metadata • User-extensible
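
The "compound rowkey pattern" with order preservation on slide 8 can be sketched without any HBase classes. The example below is an assumption-laden illustration, not the Struct encoding: it terminates each UTF-8 field with a 0x00 byte (and therefore assumes fields never contain 0x00) and compares the result with naive concatenation. `CompoundKey`, `encode`, and `naive` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not part of HBase.
class CompoundKey {
    /** Concatenate fields, terminating each with 0x00 so field boundaries survive. */
    static byte[] encode(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String f : fields) {
            byte[] utf8 = f.getBytes(StandardCharsets.UTF_8);
            for (byte b : utf8) {
                if (b == 0) {
                    throw new IllegalArgumentException("field contains 0x00");
                }
            }
            out.write(utf8, 0, utf8.length);
            out.write(0); // terminator sorts below every other byte value
        }
        return out.toByteArray();
    }

    /** Naive concatenation with no delimiter, for comparison. */
    static byte[] naive(String... fields) {
        return String.join("", fields).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // ("a","z") should sort before ("ab","a") because "a" < "ab".
        System.out.println(Arrays.compareUnsigned(naive("a", "z"), naive("ab", "a")) > 0);   // true: naive order is wrong
        System.out.println(Arrays.compareUnsigned(encode("a", "z"), encode("ab", "a")) < 0); // true: terminated order is right
        // Naive concatenation also makes distinct keys collide.
        System.out.println(Arrays.equals(naive("ab", "c"), naive("a", "bc")));               // true
        System.out.println(Arrays.equals(encode("ab", "c"), encode("a", "bc")));             // false
    }
}
```
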
  9. Implementation (HBASE-8089)
  10. Implementation: Encoding
        o.a.h.h.util.OrderedBytes: • null • numeric, +/-Inf, NaN • int8, int16, int32, int64 • float32, float64 • variable-length text • variable-length blob
        o.a.h.h.util.Bytes: • numeric • boolean • int16, int32, int64 • float32, float64 • variable-length text
  11. Implementation: API
        interface DataType<T>: • decode() • encode() • encodedClass() • encodedLength() • getOrder() • isNullable() • isOrderPreserving() • isSkippable() • skip()
        implements DataType: • OrderedXXX • RawXXX • Struct – StructBuilder – StructIterator – TerminatedWrapper – FixedLengthWrapper • Union{2,3,4}
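
The shape of the API on slide 11 can be approximated with a small stand-in. The sketch below is deliberately not the HBase `DataType<T>` interface: it uses `java.nio.ByteBuffer` instead of `PositionedByteRange`, keeps only a few of the listed methods, and all names (`SimpleType`, `RawLongType`, `SimpleTypeDemo`) are hypothetical. It shows how encode, decode, and skip cooperate on a shared, positioned buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not part of HBase.
class SimpleTypeDemo {
    public static void main(String[] args) {
        SimpleType<Long> type = new RawLongType();
        ByteBuffer buf = ByteBuffer.allocate(16);
        type.encode(buf, 1L);
        type.encode(buf, 2L);
        buf.flip();                           // switch from writing to reading
        type.skip(buf);                       // step over the first value without decoding it
        System.out.println(type.decode(buf)); // 2
    }
}

/** Simplified stand-in for a typed codec over a positioned buffer. */
interface SimpleType<T> {
    int encode(ByteBuffer dst, T value); // returns the number of bytes written
    T decode(ByteBuffer src);
    int skip(ByteBuffer src);            // returns the number of bytes skipped
    int encodedLength(T value);
    boolean isOrderPreserving();
}

/** Fixed-width, big-endian long. */
class RawLongType implements SimpleType<Long> {
    public int encode(ByteBuffer dst, Long value) {
        dst.putLong(value);
        return Long.BYTES;
    }

    public Long decode(ByteBuffer src) {
        return src.getLong();
    }

    public int skip(ByteBuffer src) {
        src.position(src.position() + Long.BYTES);
        return Long.BYTES;
    }

    public int encodedLength(Long value) {
        return Long.BYTES;
    }

    public boolean isOrderPreserving() {
        // Two's complement puts negative values after positive ones in unsigned byte order.
        return false;
    }
}
```

Keeping `skip()` separate from `decode()` lets a reader step over fields it does not need, which matters when several values are packed into one row key.
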
  12. Up Next • “Default” types • More complex types – Arrays/Lists – Maps/Dicts • Tool integration – Apache Phoenix – Cloudera Kite • Performance audit, HBASE-8694 • Improved metadata, HBASE-8863 – isCastableTo – isCoercableTo – isComparableTo • TypedTable, HBASE-7941 • Beyond Java, HBASE-10091 – REST – Thrift – Shell • ImportTsv, HBASE-8593 • User documentation • Coprocessors? • Filters? • CAS? • DataBlockEncoders?
  13. Examples
  14. A case for TypedTable
        Put p = new Put(Bytes.toBytes(u.user));
        p.add(INFO_FAM, USER_COL, Bytes.toBytes(u.user));
        p.add(INFO_FAM, NAME_COL, Bytes.toBytes(u.name));
        p.add(INFO_FAM, EMAIL_COL, Bytes.toBytes(u.email));
        p.add(INFO_FAM, PASS_COL, Bytes.toBytes(u.password));
  15. A case for TypedTable
        static final RawString ENC_STR = new RawString();
        static final RawLong ENC_LONG = new RawLong();
        --
        SimplePositionedByteRange pbr =
            new SimplePositionedByteRange(100);
        ENC_STR.encode(pbr, u.user);
        Put p = new Put(Bytes.copy(pbr.getBytes(), pbr.getOffset(), pbr.getPosition()));
        p.add(INFO_FAM, USER_COL, Bytes.copy(pbr.getBytes(), ...));
        pbr.setPosition(0);
        ENC_STR.encode(pbr, u.name);
        p.add(INFO_FAM, NAME_COL, Bytes.copy(pbr.getBytes(), ...));
        ...
  16. Structs: writing
        Struct struct = new StructBuilder()
            .add(OrderedNumeric.ASCENDING)
            .add(OrderedString.ASCENDING)
            .toStruct();
        PositionedByteRange buf1 =
            new SimplePositionedByteRange(7);
        struct.encode(buf1,
            new Object[] { BigDecimal.ONE, "foo" });
  17. Structs: reading
        buf1.setPosition(0);
        StructIterator it = struct.iterator(buf1);
        while (it.hasNext()) {
          System.out.print(it.next() + ", ");
        }

        > BigDecimal.ONE, foo
  18. Structs: schema migration
        Struct addedFields = new StructBuilder()
            .add(OrderedNumeric.ASCENDING)
            .add(OrderedString.ASCENDING)
            .add(OrderedString.ASCENDING)
            .add(OrderedNumeric.ASCENDING)
            .toStruct();

        buf1.setPosition(0);
        StructIterator it = addedFields.iterator(buf1);
        while (it.hasNext()) {
          System.out.print(it.next() + ", ");
        }

        > BigDecimal.ONE, foo, null, null
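
The output on slide 18 suggests that a value written with a shorter struct can be read with a longer one, with the missing trailing fields coming back as null. The sketch below reproduces that behaviour in plain Java as an illustration only. The mechanism used here (treat an exhausted buffer as null) is an assumption, fields are simplified to 0x00-terminated strings, and `TrailingNulls` is a hypothetical name, not an HBase class.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of HBase.
class TrailingNulls {
    /** Write each field as UTF-8 followed by a 0x00 terminator. */
    static byte[] encode(String... fields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String f : fields) {
            byte[] utf8 = f.getBytes(StandardCharsets.UTF_8);
            out.write(utf8, 0, utf8.length);
            out.write(0);
        }
        return out.toByteArray();
    }

    /** Read fieldCount fields; fields past the end of the buffer decode as null. */
    static List<String> decode(byte[] encoded, int fieldCount) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        for (int i = 0; i < fieldCount; i++) {
            if (pos >= encoded.length) {
                out.add(null); // buffer exhausted: the field was added to the schema later
                continue;
            }
            int end = pos;
            while (end < encoded.length && encoded[end] != 0) {
                end++;
            }
            out.add(new String(encoded, pos, end - pos, StandardCharsets.UTF_8));
            pos = end + 1; // step over the terminator
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] written = encode("1", "foo");    // written with a two-field schema
        System.out.println(decode(written, 4)); // [1, foo, null, null]
    }
}
```

This only works when new fields are appended at the end and are allowed to be null; inserting a field in the middle would shift every later field.
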
  19. Protobuf (HBASE-11161)
        class PBKeyValue extends PBType<CellProtos.KeyValue> {

          @Override
          public int encode(PositionedByteRange dst, KeyValue val) {
            CodedOutputStream os = outputStreamFromByteRange(dst);
            int before = os.spaceLeft(), after, written;
            val.writeTo(os);
            after = os.spaceLeft();
            written = before - after;
            dst.setPosition(dst.getPosition() + written);
            return written;
          }
  20. More Examples: https://gist.github.com/ndimiduk/bcf33f09cc7e4408f684
  21. Thanks! • Manning: Nick Dimiduk, Amandeep Khurana; foreword by Michael Stack; hbaseinaction.com • Nick Dimiduk • github.com/ndimiduk • @xefyr • n10k.com • http://s.apache.org/bGN
