HBase Key Design

Based on the book HBase: The Definitive Guide

Transcript of "H base key design"

  1. HBase Key Design
  2. HBase, unlike an RDBMS, has: ● A data model that is basically a sorted, nested map of maps – There is no schema that constrains the list of keys (relational table keys have a fixed schema that is uniform for all rows) – The value may itself be a complex structure (it can be another map)
  3. HBase does not store null values; all row keys and column keys must be set.
  4. Tall-Narrow Versus Flat-Wide Tables ● HBase can only split at row boundaries ● A single row could outgrow the maximum file/region size and work against the region split facility ● A better approach is to divide the data logically
  5. Partial Key Scans ● Specify a start key and an end key ● Can use a date ● Powerful mechanism (lexicographic order) ➢ Need to pad the values to a fixed length ● Can utilize pagination ● Drawback of composite keys: atomicity ● Not optimal for updates
  6. Time Series Data
  7. Secondary Index(es) ● Multiple secondary indexes can be emulated by using multiple column families (not recommended)
  8. Secondary Index(es) ● Multiple secondary indexes can be emulated by using multiple column families (not recommended)
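
The "sorted, nested map of maps" model and the partial key scan from slides 2 and 5 can be illustrated with a small in-memory sketch. This is not the HBase client API: SortedTable, row_key, the "user-date-sequence" key layout, and the "msg:body" column name are all hypothetical and exist only to show why composite key parts need fixed-length padding and how a start/end key selects a contiguous range of rows.

```python
from bisect import bisect_left, insort


def row_key(user_id: str, day: str, seq: int) -> str:
    """Build a hypothetical composite row key.

    seq is zero-padded to a fixed width so that lexicographic (string)
    order agrees with numeric order.
    """
    return f"{user_id}-{day}-{seq:05d}"


class SortedTable:
    """Toy stand-in for a table: row key -> {column -> value}, rows kept sorted."""

    def __init__(self):
        self._keys = []  # row keys, always in lexicographic order
        self._rows = {}  # row key -> {column: value}

    def put(self, key, column, value):
        if key not in self._rows:
            insort(self._keys, key)  # keep the key list sorted on insert
            self._rows[key] = {}
        self._rows[key][column] = value

    def scan(self, start, stop):
        """Yield (key, columns) for start <= key < stop, in key order."""
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]


table = SortedTable()
for seq in (2, 10, 1):
    table.put(row_key("user1", "20120101", seq), "msg:body", f"message {seq}")
table.put(row_key("user1", "20120102", 1), "msg:body", "next day")
table.put(row_key("user2", "20120101", 1), "msg:body", "other user")

# Partial key scan: every row for user1 on 2012-01-01.
prefix = "user1-20120101-"
# Stop key: the prefix with its last character incremented, which sorts
# after every key that starts with the prefix.
stop = prefix[:-1] + chr(ord(prefix[-1]) + 1)
keys = [key for key, _ in table.scan(prefix, stop)]
```

Here keys holds the three user1 rows for that day in sequence order (1, 2, 10). Without the zero padding, the string "10" would sort before "2", so the scan would return rows out of numeric order.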