SlideShare a Scribd company logo
Felipe Ferreira
Conhecendo o
Natural Partner for Innovation
felipe.ferreira@indt.org.br
• NoSQL datastore built on top of HDFS (Hadoop)
• An Apache Top Level Project
• The goal is the hosting of very large tables (billions of
rows X millions of columns)
• Based on Google’s BigTable paper
What Is HBase?
• Storing large amounts of data (TB/PB)
• High throughput for a large number of requests
• Storing unstructured or variable column data
• Big Data with random read and writes
Why Use HBase?
• Only use with Big Data problems
• Read straight through files
• Write all at once or append new files
– Not random reads or writes
• Access patterns of the data are ill-defined
When to Consider Not Using HBase?
• More complete list at http://wiki.apache.org/hadoop/Hbase/PoweredBy
Hbase in production
HBase Architecture – How It works
• HBase Master
• RegionServer
• ZooKeeper
• HDFS
– NameNode/Standby NameNode
– DataNode
Meet the Daemons
Daemon Locations
Tables and Column Families
Rows and Columns
Regions
Regions
Write Path
Read Path
HBase API – How to access the data
• Data is not accessed over SQL
• You must:
– Create your own connections
– Keep track of the type of data in a column
– Give each row a key
– Access a row by its key
No SQL Means No SQL
• Gets
– Gets a row’s data based on the row key
• Puts
– Update/inserts a row with data based on the row key
• Scans
– Finds all matching rows based on the row key
– Scan logic can be increased by using filters
Types of Access
Gets
Puts
Puts
HBase Schema Design – How to design
• Designing schemas for HBase requires an in-depth knowledge
• Schema Design is ‘data-centric’ not ‘relationship-centric’
• You design around how data is accessed
• Row keys are engineered
No SQL Means No SQL
• A row key is more than the glue between two tables
• Engineering time is spent just on constructing a row key
– Contents of a row key vary by access pattern
– Often made up of several pieces of data
Row Keys
• Schema design does not start in an ERD
• Access pattern must be known and ascertained
• Denormalize to improve performance
– Fewer, bigger tables
Schema Design
HBase in production - examples
• Use of HBase to integrate SMS, chat, email and Facebook Messages into
one inbox
• HydraBase – The evolution of HBase@Facebook
• HBase provides a distributed, read/write backup of all mysql tables in
Twitter's production
• A number of applications including people search rely on HBase internally
for data generation
• Additionally, the operations team uses HBase as a timeseries database for
cluster-wide monitoring/performance data
• Uses HBase as a foundation for cloud scale storage for a variety of
applications
• Uses HBase to build a graph service for global web threat entities
evaluation and reputation
Internal Use Only
Non-profit R&D Center
founded by Nokia in 2001 in Brazil
Focused on projects
delivering solutions and products in the mobile
technology area
Technical team of 200+
Located in Brazil
Manaus | Brasilia | Recife | São Paulo
50+
invention reports
accepted by
Nokia/Microsoft to file
patent application
500+
items of scientific
production
300+
completed projects
Internal Use Only
OUR
CERTIFICATIONS
Internal Use Only
OUR
AWARDS
Eco System Saving Tips (app)
Mobile World Congress 2012
Facelock1st prize
London Hackathon | Nokia World 2010
Audio Aid
1st prize |Forum Nokia
Calling All Innovators 2009
Microsoft Data Gathering
Tele.Síntese
2012 & 2013
award
• About training in Big Data (Developer, Analyst, Admin):
http://www.indt.org/servicos/treinamentos/hadoop-developer
http://www.indt.org/servicos/treinamentos/hadoop-analyst
http://www.indt.org/servicos/treinamentos/hadoop-admin
• About Hbase
http://hbase.apache.org/
• About INDT:
http://www.indt.org
communications@indt.org.br
• About Hortonworks:
http://www.hortonworks.com
communications@indt.org.br
INFOS + CONTACT

More Related Content

What's hot

Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
Shweta Patnaik
 
Hadoop, Infrastructure and Stack
Hadoop, Infrastructure and StackHadoop, Infrastructure and Stack
Hadoop, Infrastructure and Stack
John Dougherty
 
H-Base in Data Base Mangement System
H-Base in Data Base Mangement SystemH-Base in Data Base Mangement System
H-Base in Data Base Mangement System
Preetham Devisetty
 
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use caseBDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
David Lauzon
 
BDM8 - Near-realtime Big Data Analytics using Impala
BDM8 - Near-realtime Big Data Analytics using ImpalaBDM8 - Near-realtime Big Data Analytics using Impala
BDM8 - Near-realtime Big Data Analytics using Impala
David Lauzon
 
Hadoop jon
Hadoop jonHadoop jon
Hadoop jon
Humoyun Ahmedov
 
Introduction to hbase
Introduction to hbaseIntroduction to hbase
Introduction to hbase
Uday Vakalapudi
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Md. Afif Al Mamun
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
Bigdatapump
 
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
larsgeorge
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big Data
Andrew Brust
 
Big data stores
Big data  storesBig data  stores
Big data stores
Kumaran Ramanujam
 
Big data Intro by Kaushik Dutta
Big data Intro by Kaushik DuttaBig data Intro by Kaushik Dutta
Big data Intro by Kaushik Dutta
Kaushik Dutta
 
Getting started with big data in Azure HDInsight
Getting started with big data in Azure HDInsightGetting started with big data in Azure HDInsight
Getting started with big data in Azure HDInsight
Nilesh Gule
 
Lviv EDGE 2 - NoSQL
Lviv EDGE 2 - NoSQLLviv EDGE 2 - NoSQL
Lviv EDGE 2 - NoSQL
zenyk
 
ODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" Sources
Mark Rittman
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012
Daniel Abadi
 
Hadoop
HadoopHadoop
Hadoop
avnishagr
 

What's hot (20)

HBase
HBaseHBase
HBase
 
Apache hadoop technology : Beginners
Apache hadoop technology : BeginnersApache hadoop technology : Beginners
Apache hadoop technology : Beginners
 
Hadoop, Infrastructure and Stack
Hadoop, Infrastructure and StackHadoop, Infrastructure and Stack
Hadoop, Infrastructure and Stack
 
H-Base in Data Base Mangement System
H-Base in Data Base Mangement SystemH-Base in Data Base Mangement System
H-Base in Data Base Mangement System
 
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use caseBDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
BDM9 - Comparison of Oracle RDBMS and Cloudera Impala for a hospital use case
 
BDM8 - Near-realtime Big Data Analytics using Impala
BDM8 - Near-realtime Big Data Analytics using ImpalaBDM8 - Near-realtime Big Data Analytics using Impala
BDM8 - Near-realtime Big Data Analytics using Impala
 
Hadoop jon
Hadoop jonHadoop jon
Hadoop jon
 
Introduction to hbase
Introduction to hbaseIntroduction to hbase
Introduction to hbase
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012HBase Advanced Schema Design - Berlin Buzzwords - June 2012
HBase Advanced Schema Design - Berlin Buzzwords - June 2012
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big Data
 
Big data stores
Big data  storesBig data  stores
Big data stores
 
Big data Intro by Kaushik Dutta
Big data Intro by Kaushik DuttaBig data Intro by Kaushik Dutta
Big data Intro by Kaushik Dutta
 
Getting started with big data in Azure HDInsight
Getting started with big data in Azure HDInsightGetting started with big data in Azure HDInsight
Getting started with big data in Azure HDInsight
 
Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014Apache Drill at ApacheCon2014
Apache Drill at ApacheCon2014
 
Lviv EDGE 2 - NoSQL
Lviv EDGE 2 - NoSQLLviv EDGE 2 - NoSQL
Lviv EDGE 2 - NoSQL
 
ODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" Sources
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012
 
Hadoop
HadoopHadoop
Hadoop
 

Similar to Conhecendo o Apache HBase

Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
JAX London
 
Techincal Talk Hbase-Ditributed,no-sql database
Techincal Talk Hbase-Ditributed,no-sql databaseTechincal Talk Hbase-Ditributed,no-sql database
Techincal Talk Hbase-Ditributed,no-sql database
Rishabh Dugar
 
Nyc hadoop meetup introduction to h base
Nyc hadoop meetup   introduction to h baseNyc hadoop meetup   introduction to h base
Nyc hadoop meetup introduction to h base智杰 付
 
Unit II Hadoop Ecosystem_Updated.pptx
Unit II Hadoop Ecosystem_Updated.pptxUnit II Hadoop Ecosystem_Updated.pptx
Unit II Hadoop Ecosystem_Updated.pptx
BhavanaHotchandani
 
Apache Hadoop Hive
Apache Hadoop HiveApache Hadoop Hive
Apache Hadoop Hive
Some corner at the Laboratory
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Esther Kundin
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Esther Kundin
 
Big data solutions in azure
Big data solutions in azureBig data solutions in azure
Big data solutions in azure
Mostafa
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
Mayank Singh
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud ComputingFarzad Nozarian
 
Hive
HiveHive
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
Prashant Gupta
 
Impala for PhillyDB Meetup
Impala for PhillyDB MeetupImpala for PhillyDB Meetup
Impala for PhillyDB Meetup
Shravan (Sean) Pabba
 
Hadoop - Apache Hbase
Hadoop - Apache HbaseHadoop - Apache Hbase
Hadoop - Apache Hbase
Vibrant Technologies & Computers
 
Big data solutions in Azure
Big data solutions in AzureBig data solutions in Azure
Big data solutions in Azure
Mostafa
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
Prashanth Yennampelli
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
Andrew Brust
 
hive.pptx
hive.pptxhive.pptx
hive.pptx
SVijaylakshmi
 
Building Big data solutions in Azure
Building Big data solutions in AzureBuilding Big data solutions in Azure
Building Big data solutions in Azure
Mostafa
 
Hive and querying data
Hive and querying dataHive and querying data
Hive and querying data
KarthigaGunasekaran1
 

Similar to Conhecendo o Apache HBase (20)

Intro to HBase - Lars George
Intro to HBase - Lars GeorgeIntro to HBase - Lars George
Intro to HBase - Lars George
 
Techincal Talk Hbase-Ditributed,no-sql database
Techincal Talk Hbase-Ditributed,no-sql databaseTechincal Talk Hbase-Ditributed,no-sql database
Techincal Talk Hbase-Ditributed,no-sql database
 
Nyc hadoop meetup introduction to h base
Nyc hadoop meetup   introduction to h baseNyc hadoop meetup   introduction to h base
Nyc hadoop meetup introduction to h base
 
Unit II Hadoop Ecosystem_Updated.pptx
Unit II Hadoop Ecosystem_Updated.pptxUnit II Hadoop Ecosystem_Updated.pptx
Unit II Hadoop Ecosystem_Updated.pptx
 
Apache Hadoop Hive
Apache Hadoop HiveApache Hadoop Hive
Apache Hadoop Hive
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry TrendsBig Data and Hadoop - History, Technical Deep Dive, and Industry Trends
Big Data and Hadoop - History, Technical Deep Dive, and Industry Trends
 
Big data solutions in azure
Big data solutions in azureBig data solutions in azure
Big data solutions in azure
 
NoSql - mayank singh
NoSql - mayank singhNoSql - mayank singh
NoSql - mayank singh
 
Big Data and Cloud Computing
Big Data and Cloud ComputingBig Data and Cloud Computing
Big Data and Cloud Computing
 
Hive
HiveHive
Hive
 
Apache HBase™
Apache HBase™Apache HBase™
Apache HBase™
 
Impala for PhillyDB Meetup
Impala for PhillyDB MeetupImpala for PhillyDB Meetup
Impala for PhillyDB Meetup
 
Hadoop - Apache Hbase
Hadoop - Apache HbaseHadoop - Apache Hbase
Hadoop - Apache Hbase
 
Big data solutions in Azure
Big data solutions in AzureBig data solutions in Azure
Big data solutions in Azure
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
 
hive.pptx
hive.pptxhive.pptx
hive.pptx
 
Building Big data solutions in Azure
Building Big data solutions in AzureBuilding Big data solutions in Azure
Building Big data solutions in Azure
 
Hive and querying data
Hive and querying dataHive and querying data
Hive and querying data
 

Recently uploaded

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
ThousandEyes
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
Pierluigi Pugliese
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
UiPathCommunity
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 

Recently uploaded (20)

DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Assure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyesAssure Contact Center Experiences for Your Customers With ThousandEyes
Assure Contact Center Experiences for Your Customers With ThousandEyes
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..UiPath Community Day Dubai: AI at Work..
UiPath Community Day Dubai: AI at Work..
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 

Conhecendo o Apache HBase

  • 1. Felipe Ferreira Conhecendo o Natural Partner for Innovation felipe.ferreira@indt.org.br
  • 2. • NoSQL datastore built on top of HDFS (Hadoop) • An Apache Top Level Project • The goal is the hosting of very large tables (billions of rows X millions of columns) • Based on Google’s BigTable paper What Is HBase?
  • 3. • Storing large amounts of data (TB/PB) • High throughput for a large number of requests • Storing unstructured or variable column data • Big Data with random read and writes Why Use HBase?
  • 4. • Only use with Big Data problems • Read straight through files • Write all at once or append new files – Not random reads or writes • Access patterns of the data are ill-defined When to Consider Not Using HBase?
  • 5. • More complete list at http://wiki.apache.org/hadoop/Hbase/PoweredBy Hbase in production
  • 6. HBase Architecture – How It works
  • 7. • HBase Master • RegionServer • ZooKeeper • HDFS – NameNode/Standby NameNode – DataNode Meet the Daemons
  • 9. Tables and Column Families
  • 15. HBase API – How to access the data
  • 16. • Data is not accessed over SQL • You must: – Create your own connections – Keep track of the type of data in a column – Give each row a key – Access a row by its key No SQL Means No SQL
  • 17. • Gets – Gets a row’s data based on the row key • Puts – Update/inserts a row with data based on the row key • Scans – Finds all matching rows based on the row key – Scan logic can be increased by using filters Types of Access
  • 18. Gets
  • 19. Puts
  • 20. Puts
  • 21. HBase Schema Design – How to design
  • 22. • Designing schemas for HBase requires an in-depth knowledge • Schema Design is ‘data-centric’ not ‘relationship-centric’ • You design around how data is accessed • Row keys are engineered No SQL Means No SQL
  • 23. • A row key is more than the glue between two tables • Engineering time is spent just on constructing a row key – Contents of a row key vary by access pattern – Often made up of several pieces of data Row Keys
  • 24. • Schema design does not start in an ERD • Access pattern must be known and ascertained • Denormalize to improve performance – Fewer, bigger tables Schema Design
  • 25. HBase in production - examples
  • 26. • Use of HBase to integrate SMS, chat, email and Facebook Messages into one inbox • HydraBase – The evolution of HBase@Facebook
  • 27. • HBase provides a distributed, read/write backup of all mysql tables in Twitter's production • A number of applications including people search rely on HBase internally for data generation • Additionally, the operations team uses HBase as a timeseries database for cluster-wide monitoring/performance data
  • 28. • Uses HBase as a foundation for cloud scale storage for a variety of applications • Uses HBase to build a graph service for global web threat entities evaluation and reputation
  • 29. Internal Use Only Non-profit R&D Center founded by Nokia in 2001 in Brazil Focused on projects delivering solutions and products in the mobile technology area Technical team of 200+ Located in Brazil Manaus | Brasilia | Recife | São Paulo 50+ invention reports accepted by Nokia/Microsoft to file patent application 500+ items of scientific production 300+ completed projects
  • 31. Internal Use Only OUR AWARDS Eco System Saving Tips (app) Mobile World Congress 2012 Facelock1st prize London Hackathon | Nokia World 2010 Audio Aid 1st prize |Forum Nokia Calling All Innovators 2009 Microsoft Data Gathering Tele.Síntese 2012 & 2013 award
  • 32. • About training in Big Data (Developer, Analyst, Admin): http://www.indt.org/servicos/treinamentos/hadoop-developer http://www.indt.org/servicos/treinamentos/hadoop-analyst http://www.indt.org/servicos/treinamentos/hadoop-admin • About Hbase http://hbase.apache.org/ • About INDT: http://www.indt.org communications@indt.org.br • About Hortonworks: http://www.hortonworks.com communications@indt.org.br INFOS + CONTACT