Your SlideShare is downloading. ×
0
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Hbase: Introduction to column oriented databases
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Hbase: Introduction to column oriented databases

12,247

Published on

Big Data is getting more attention each day, followed by new storage paradigms. This presentation shows a fast intro to HBase, a column oriented database used by Facebook and other big players to …

Big Data is getting more attention each day, followed by new storage paradigms. This presentation shows a fast intro to HBase, a column oriented database used by Facebook and other big players to store and extract knowledge of high volume of data.

Published in: Technology
0 Comments
10 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
12,247
On Slideshare
0
From Embeds
0
Number of Embeds
8
Actions
Shares
0
Downloads
385
Comments
0
Likes
10
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. HBase Introduction to column oriented databases Luís Cipriani @lfcipriani (twitter, linkedin, github, ...) 22o. GURU (2012-02-25) - Sao Paulo/Brazilsexta-feira, 24 de fevereiro de 12
  • 2. MEsexta-feira, 24 de fevereiro de 12
  • 3. intro “A BigTable HBase is a sparse, distributed, persistent multidimensional sorted map” http://research.google.com/archive/bigtable.htmlsexta-feira, 24 de fevereiro de 12
  • 4. intro > data model { <-- table // ... "aaaaa" : { <-- row "A" : { <-- column family "foo" : { <-- column (qualifier) 15: "y", <-- timestamp, value 4: "m" } "bar" : {...} }, "B" : { "" : {...} } }, "aaaab" : { "A" : { "foo" : {...}, "bar" : {...}, "joe" : {...} }, "B" : { "" : {...} } }, // ... }sexta-feira, 24 de fevereiro de 12
  • 5. intro > data model (Table, RowKey, Family, Column, Timestamp) → Valuesexta-feira, 24 de fevereiro de 12
  • 6. intro > hadoop stack • hadoop HDFS (or not) • hadoop MapReduce • hadoop ZooKeeper • hadoop HBase • hadoop Hue, Whirr, etc...sexta-feira, 24 de fevereiro de 12
  • 7. architecturesexta-feira, 24 de fevereiro de 12
  • 8. key design > read/write model • randon reads (get) • sequential reads (scan) • partial key scans • writes (put = update)sexta-feira, 24 de fevereiro de 12
  • 9. key design > storage model http://ofps.oreilly.com/titles/9781449396107/advanced.htmlsexta-feira, 24 de fevereiro de 12
  • 10. key design > strategies • tall-narrow vs flat-wide • partial key scans • pagination • time series • salting • field swap • randomization • secondary indexessexta-feira, 24 de fevereiro de 12
  • 11. key design > examplesexta-feira, 24 de fevereiro de 12
  • 12. development • installation modes • standalone, pseudo-distributed, distributed • JRuby console • Access • java/jruby API (more features) • entrypoints REST, Thrift, Avro, Protobuffers • there several other libssexta-feira, 24 de fevereiro de 12
  • 13. cons • complex config and maintenance • hot regions • no secondary index built-in • no transactions built-in • complex schema designsexta-feira, 24 de fevereiro de 12
  • 14. pros • distributed • scalable (auto-sharding) • built on Hadoop stack • handles Big Data • high performance for write and read • no SPOF • fault tolerant, no data loss • active communitysexta-feira, 24 de fevereiro de 12
  • 15. Reformulação Box de Login Abril ID http://engineering.abril.com.br/ http://abr.io/hbase-intro https://pinboard.in/u:lfcipriani/t:hbase/ http://hbase.apache.org/ ? http://shop.oreilly.com/product/0636920014348.dosexta-feira, 24 de fevereiro de 12

×