An Introduction to Apache HBase

What is Apache HBase in terms of big data and Hadoop?
How does it relate to the other Hadoop tools?

Published in: Technology

  1. Apache Hadoop HBase
     ● What is it?
     ● Why use it?
     ● Architecture
     ● Storage
     ● Related Projects
  2. HBase – What is it?
     ● A Hadoop data store
     ● A NoSQL store for big data
     ● It is open source, written in Java
     ● It is a distributed database
     ● Automatic sharding: table data is spread over the cluster
     ● Automatic region server failover
  3. HBase – Why / When use it?
     ● Data in billions of rows
     ● Complex data
     ● High volume of I/O
     ● A large number of data nodes (5 or more)
     ● No need for extra RDBMS functions, e.g. transactions
  4. HBase – Architecture
     ● Where does HBase sit in relation to Hadoop?
  5. HBase – Architecture
     ● HBase is a data store
     ● Uses Hadoop for distributed storage
     ● Data is stored across region servers
     ● Region server data is spread across HDFS data nodes
     ● A write-ahead log (WAL) is used to record changes
  6. HBase – Storage
     ● What is the architecture?
  7. HBase – Storage
     ● Client makes a call, e.g. put
     ● Request is sent by RPC as a key value to the region server
     ● Key value is routed to the region for the row
     ● Data is written to the WAL
     ● Data is written to the region memStore
     ● If the region server crashes, the WAL can be used to recover data
  8. HBase – Related Projects
     ● Apache Flume – move large data sets to Hadoop
     ● Apache Sqoop – command-line tool to move RDBMS data to Hadoop
     ● Apache HBase – non-relational database
     ● Apache Pig – analyse large data sets
     ● Apache Oozie – workflow scheduler
     ● Apache Mahout – machine learning and data mining
     ● Apache Hue – Hadoop user interface
     ● Apache ZooKeeper – distributed configuration and coordination
  9. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You can just pay for those hours that you need to solve your problems
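
The write path on the "HBase – Storage" slide (put, route to the region for the row, write to the WAL, write to the memStore, recover from the WAL after a crash) can be sketched as a toy model in plain Java. This is not HBase code and uses no HBase API: the class `ToyRegionServer`, its methods, and the region start keys `""` and `"m"` are all made up for illustration, and the in-memory list only stands in for a log that would really be kept on HDFS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the slide's write path; hypothetical names, not HBase internals. */
class ToyRegionServer {
    // Region start key -> that region's in-memory store (the "memStore").
    private final TreeMap<String, TreeMap<String, String>> regions = new TreeMap<>();
    // Write-ahead log: every change is appended here before it is applied.
    private final List<Map.Entry<String, String>> wal;

    /** The first start key should be "" so that every row falls into some region. */
    ToyRegionServer(List<Map.Entry<String, String>> wal, String... regionStartKeys) {
        this.wal = wal;
        for (String start : regionStartKeys) {
            regions.put(start, new TreeMap<>());
        }
    }

    /** Route a row to the region with the greatest start key that is <= the row key. */
    String regionFor(String row) {
        return regions.floorKey(row);
    }

    /** put: record the change in the WAL first, then update the region's memStore. */
    void put(String row, String value) {
        wal.add(Map.entry(row, value));
        regions.get(regionFor(row)).put(row, value);
    }

    /** Read a value back from the memStore of the region that owns the row. */
    String get(String row) {
        return regions.get(regionFor(row)).get(row);
    }

    /** After a crash the memStores are gone; rebuild them by replaying the WAL. */
    static ToyRegionServer recover(List<Map.Entry<String, String>> survivingWal,
                                   String... regionStartKeys) {
        ToyRegionServer fresh = new ToyRegionServer(new ArrayList<>(), regionStartKeys);
        for (Map.Entry<String, String> edit : survivingWal) {
            fresh.put(edit.getKey(), edit.getValue());
        }
        return fresh;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> wal = new ArrayList<>();
        ToyRegionServer server = new ToyRegionServer(wal, "", "m");
        server.put("apple", "1");   // lands in the region starting at ""
        server.put("pear", "2");    // lands in the region starting at "m"

        // Simulate a crash: discard the server, keep only the log.
        ToyRegionServer recovered = ToyRegionServer.recover(wal, "", "m");
        System.out.println(recovered.regionFor("pear") + " " + recovered.get("pear"));
    }
}
```

Writing to the log before the in-memory store is what makes recovery possible in this sketch: anything held in a lost memStore is still in the log and can be replayed in order.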
