Big Data – 4 V’s
NoSQL 
• NoSQL is all about scalability 
• Scaling to size 
• Scaling to complexity 
• Handling heavy read/write workloads
• Data duplication and denormalization are first-class citizens
RDBMS vs NoSQL
NoSQL Types
Database Chart
CAP Theorem
Re-check
• What is the CAP theorem?
• Does NoSQL support transactions?
• What are the NoSQL types?
HBase 
• Scalable, distributed data store 
• Sorted map of maps / key-value store
• Open-source implementation modeled on Google’s Bigtable
• Sparse
• Multidimensional
• Tightly integrated with Hadoop
• Not an RDBMS
Architecture
• HDFS (DataNodes): storage
• ZooKeeper: membership management
• RegionServers: serve the regions
• HBase Masters: janitorial work
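What "serve the regions" means can be sketched in a few lines. The sketch assumes, as in HBase, that a table is split into regions that each cover a contiguous range of sorted row keys. ToyRegionLocator, its methods, and the server names are hypothetical, and row keys are compared as Java strings rather than as raw bytes.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy region lookup; class, method, and server names are hypothetical, not HBase's API.
final class ToyRegionLocator {
    // Region start key (inclusive) -> name of the RegionServer hosting that region.
    private final TreeMap<String, String> serverByStartKey = new TreeMap<>();

    void addRegion(String startKey, String server) {
        serverByStartKey.put(startKey, server);
    }

    // The owning region is the one with the greatest start key <= rowKey.
    String serverFor(String rowKey) {
        Map.Entry<String, String> region = serverByStartKey.floorEntry(rowKey);
        if (region == null) {
            throw new IllegalStateException("no region covers row key " + rowKey);
        }
        return region.getValue();
    }

    public static void main(String[] args) {
        ToyRegionLocator locator = new ToyRegionLocator();
        locator.addRegion("", "regionserver-a");        // first region starts at the empty key
        locator.addRegion("Device2", "regionserver-b"); // second region starts at "Device2"
        System.out.println(locator.serverFor("Device1"));
        System.out.println(locator.serverFor("Device2"));
        System.out.println(locator.serverFor("Device9"));
    }
}
```

Because regions are ranges over sorted keys, a single floor lookup on the start keys is enough to route a row key to its server in this toy model.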
Column Oriented
Distributed
Variable number of columns
Important Terms 
• Table
  - Consists of rows and columns
• Row
  - Has a bunch of columns
  - Identified by a rowkey (the "primary key")
• Column Qualifier
  - Dynamic column name
• Column Family
  - Column groups, logical and physical (columns with a similar access pattern)
• Cell
  - The actual element that contains the data at a row-column intersection
• Version
  - Every cell can hold multiple versions, distinguished by timestamp
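The terms above fit together as a "sorted map of maps": row key, then column, then timestamp, then value. Below is a minimal sketch in plain Java, assuming a toy in-memory model. ToyTable and its methods are made-up names, this is not the HBase client API, and the family and qualifier are flattened into one "family:qualifier" key for brevity. The older contract number is an invented value.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy in-memory model of the HBase data model; this is NOT the HBase client API.
final class ToyTable {
    // One cell: timestamp -> value, newest timestamp first.
    static final class Cell {
        final NavigableMap<Long, String> versions =
                new TreeMap<Long, String>(Comparator.reverseOrder());
    }

    // rowkey -> "family:qualifier" -> cell; both levels are kept sorted.
    private final NavigableMap<String, NavigableMap<String, Cell>> rows = new TreeMap<>();

    void put(String rowKey, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family + ":" + qualifier, k -> new Cell())
            .versions.put(timestamp, value);
    }

    // Newest value of a cell, or null when the row or column is absent (the table is sparse).
    String get(String rowKey, String family, String qualifier) {
        NavigableMap<String, Cell> columns = rows.get(rowKey);
        if (columns == null) {
            return null;
        }
        Cell cell = columns.get(family + ":" + qualifier);
        return cell == null ? null : cell.versions.firstEntry().getValue();
    }

    // Row keys in sorted order, the order in which a scan would visit them.
    Iterable<String> rowKeys() {
        return rows.navigableKeySet();
    }

    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        // "11111111" is an invented older version, used only to show versioning.
        table.put("Device2", "CONTRACT_INFO", "CONTRACT_NUMBER", 1L, "11111111");
        table.put("Device2", "CONTRACT_INFO", "CONTRACT_NUMBER", 2L, "22222222");
        table.put("Device1", "BASIC_INFO", "IP_ADDR", 1L, "10.10.10.10");
        System.out.println(table.get("Device2", "CONTRACT_INFO", "CONTRACT_NUMBER")); // newest version
        System.out.println(table.get("Device1", "CONTRACT_INFO", "CONTRACT_NUMBER")); // absent cell
        System.out.println(table.rowKeys());
    }
}
```

Absent cells simply have no map entry, which is what "sparse" and "variable number of columns" amount to in this model.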
Logical & Physical Structure
Tall vs. wide table storage footprint
Logical representation of an HBase table. We'll look at what it means to Get() row r5 from this table.
Row   CF1             CF2
r1    c1:v1           c1:v9  c6:v2
r2    c1:v2  c3:v6
r3    c2:v3           c5:v6
r4    c2:v4
r5    c1:v1  c3:v5    c7:v8
Actual physical storage of the table
HFile for CF1
r1:CF1:c1:t1:v1
r2:CF1:c1:t2:v2
r2:CF1:c3:t3:v6
r3:CF1:c2:t1:v3
r4:CF1:c2:t1:v4
r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5
HFile for CF2
r1:CF2:c1:t1:v9
r1:CF2:c6:t4:v2
r3:CF2:c5:t4:v6
r5:CF2:c7:t3:v8
Result object returned for a Get() on row r5 (KeyValue objects)
r5:CF1:c1:t2:v1
r5:CF1:c3:t3:v5
r5:CF2:c7:t3:v8
Structure of a KeyValue object
Key: Row Key, Col Fam, Col Qual, Time Stamp
Value: Cell Value
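The slide's example can be replayed in code: flatten the logical table into KeyValue entries, keep one sorted list per column family (standing in for that family's HFile), and answer a Get() by collecting the row's entries from every list. This is a sketch under stated assumptions. ToyKeyValue and its methods are illustrative names rather than HBase classes, and the sort order used (row, family, qualifier ascending, then newest timestamp first) is assumed to mirror HBase's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the KeyValue layout; names are illustrative, not the HBase classes.
final class ToyKeyValue {
    final String row, family, qualifier, value;
    final long timestamp;

    ToyKeyValue(String row, String family, String qualifier, long timestamp, String value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.value = value;
    }

    // Key order: row, family, qualifier ascending, then newest timestamp first.
    static final Comparator<ToyKeyValue> KEY_ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c == 0) c = a.family.compareTo(b.family);
        if (c == 0) c = a.qualifier.compareTo(b.qualifier);
        if (c == 0) c = Long.compare(b.timestamp, a.timestamp);
        return c;
    };

    // One sorted list per column family, standing in for that family's HFile.
    static Map<String, List<ToyKeyValue>> storeFiles(List<ToyKeyValue> cells) {
        Map<String, List<ToyKeyValue>> files = new TreeMap<>();
        for (ToyKeyValue kv : cells) {
            files.computeIfAbsent(kv.family, f -> new ArrayList<>()).add(kv);
        }
        for (List<ToyKeyValue> file : files.values()) {
            file.sort(KEY_ORDER);
        }
        return files;
    }

    // A Get on one row collects that row's KeyValues from every family's file.
    static List<ToyKeyValue> get(Map<String, List<ToyKeyValue>> files, String row) {
        List<ToyKeyValue> result = new ArrayList<>();
        for (List<ToyKeyValue> file : files.values()) {
            for (ToyKeyValue kv : file) {
                if (kv.row.equals(row)) {
                    result.add(kv);
                }
            }
        }
        return result;
    }

    @Override
    public String toString() {
        return row + ":" + family + ":" + qualifier + ":t" + timestamp + ":" + value;
    }

    // The eleven cells of the slide's example table, deliberately out of order.
    static List<ToyKeyValue> slideCells() {
        return Arrays.asList(
            new ToyKeyValue("r5", "CF2", "c7", 3, "v8"),
            new ToyKeyValue("r5", "CF1", "c3", 3, "v5"),
            new ToyKeyValue("r5", "CF1", "c1", 2, "v1"),
            new ToyKeyValue("r4", "CF1", "c2", 1, "v4"),
            new ToyKeyValue("r3", "CF2", "c5", 4, "v6"),
            new ToyKeyValue("r3", "CF1", "c2", 1, "v3"),
            new ToyKeyValue("r2", "CF1", "c3", 3, "v6"),
            new ToyKeyValue("r2", "CF1", "c1", 2, "v2"),
            new ToyKeyValue("r1", "CF2", "c6", 4, "v2"),
            new ToyKeyValue("r1", "CF2", "c1", 1, "v9"),
            new ToyKeyValue("r1", "CF1", "c1", 1, "v1"));
    }

    public static void main(String[] args) {
        Map<String, List<ToyKeyValue>> files = storeFiles(slideCells());
        files.forEach((family, file) -> System.out.println("HFile for " + family + ": " + file));
        System.out.println("Get(r5): " + get(files, "r5"));
    }
}
```

The linear scan in get() is only for readability; because each list is sorted by key, a real store can seek to the first entry for a row instead of reading the whole file.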
(J)Ruby Shell Commands
• General
• DDL: Create, Describe
• Namespace
• DML: Put, Get, Scan, Delete
• Tools
• Replication
• Snapshot
• Security
• Visibility
Creating a table:
create 'DEVICE_DETAIL','BASIC_INFO','CONTRACT_INFO'
Generating data:
put 'DEVICE_DETAIL','Device1','BASIC_INFO:IP_ADDR','10.10.10.10'
put 'DEVICE_DETAIL','Device2','BASIC_INFO:IP_ADDR','20.20.20.20'
Describing the table:
describe 'DEVICE_DETAIL'
Altering the table (keep up to 3 versions per cell in CONTRACT_INFO):
alter 'DEVICE_DETAIL',{NAME => 'CONTRACT_INFO',VERSIONS => 3 }
Updating data:
put 'DEVICE_DETAIL','Device2','CONTRACT_INFO:CONTRACT_NUMBER','22222222'
Multi-version example:
get 'DEVICE_DETAIL','Device2', {COLUMN=>'CONTRACT_INFO:CONTRACT_NUMBER', VERSIONS=>2}
Scanning the table:
scan 'DEVICE_DETAIL'
Scan with filter:
scan 'DEVICE_DETAIL', { COLUMNS => 'CONTRACT_INFO:STATUS', LIMIT => 10, FILTER => "ValueFilter( =, 'binary:IN_ACTIVE' )" }
Deleting data:
delete 'DEVICE_DETAIL','Device2','CONTRACT_INFO:STATUS'
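The interplay between the alter command (VERSIONS => 3) and the multi-version get (VERSIONS=>2) can be sketched as a cell that retains at most a fixed number of versions and returns the newest ones on request. ToyVersionedCell is a made-up name for this illustration, and all contract numbers except '22222222' are invented. The sketch trims on every write for simplicity, whereas HBase itself cleans up surplus versions later, during compactions.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model of per-cell version retention; not part of the HBase API.
final class ToyVersionedCell {
    private final int maxVersions;
    // timestamp -> value, newest timestamp first
    private final NavigableMap<Long, String> versions =
            new TreeMap<Long, String>(Comparator.reverseOrder());

    ToyVersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // In newest-first order the last entry is the oldest version.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    // Up to `requested` values, newest first.
    List<String> get(int requested) {
        List<String> result = new ArrayList<>();
        for (String value : versions.values()) {
            if (result.size() == requested) {
                break;
            }
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        ToyVersionedCell contractNumber = new ToyVersionedCell(3); // like VERSIONS => 3
        contractNumber.put(1L, "00000000"); // invented value
        contractNumber.put(2L, "11111111"); // invented value
        contractNumber.put(3L, "22222222");
        contractNumber.put(4L, "33333333"); // invented value; pushes out the oldest version
        System.out.println(contractNumber.get(2)); // like get ... VERSIONS=>2
    }
}
```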
Java API 
• HTable 
• HBaseAdmin 
• HTablePool 
• Get 
• Put 
• Delete 
• Scan 
• Increment 
• HTableDescriptor 
• HTableInterface 
• Result 
• ResultScanner 
• KeyValue 
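// Pre-1.0 client API, as listed above; newer releases use Connection/Table and Put.addColumn.
// Client classes come from org.apache.hadoop.hbase.client, Bytes from org.apache.hadoop.hbase.util.
HTable table = new HTable(configuration, hbasetablename);
// Write one cell: row key, column family, column qualifier (key), value
Put row = new Put(Bytes.toBytes(rowKey));
row.add(Bytes.toBytes(columnFamily), Bytes.toBytes(key), Bytes.toBytes(value));
table.put(row);
// Read the row back by its row key
Get getKey = new Get(Bytes.toBytes(rowKey));
Result result = table.get(getKey);
table.close();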
Spark HBase 
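import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
// create configuration
val config = HBaseConfiguration.create()
config.set("hbase.zookeeper.quorum", "localhost")
config.set("hbase.zookeeper.property.clientPort", "2181")
config.set("hbase.mapreduce.inputtable", "hbaseTableName")
// read data; the mapreduce TableInputFormat is a new-API input format, so use newAPIHadoopRDD
val hbaseData = sparkContext.newAPIHadoopRDD(config, classOf[TableInputFormat],
  classOf[ImmutableBytesWritable], classOf[Result])
// count rows
println(hbaseData.count)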
HBase Architecture
Write & Read Logic
SQL
Re-check
• What is a column family?
• What are the HBase components?
• Name a few shell commands.
• What is a version in HBase?
Reference Slides
Use Case 
• Canonical use case: storing crawl data and indices for search
Web Search powered by Bigtable
Indexing the Internet
1 Crawlers constantly scour the Internet for new pages. Those pages are stored as individual records in Bigtable.
2 A MapReduce job runs over the entire table, generating search indexes for the Web Search application.
Searching the Internet
3 The user initiates a Web Search request.
4 The Web Search application queries the Search Indexes and retrieves matching documents directly from Bigtable.
5 Search results are presented to the user.
(Diagram labels: Internets, Crawlers, Bigtable, MapReduce, Search Index, Web Search, You)
HBase Architecture
Replication
CAP Theorem

Fundamentals of Programming and Language Processors
 
Atelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissancesAtelier - Innover avec l’IA Générative et les graphes de connaissances
Atelier - Innover avec l’IA Générative et les graphes de connaissances
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
May Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdfMay Marketo Masterclass, London MUG May 22 2024.pdf
May Marketo Masterclass, London MUG May 22 2024.pdf
 
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf2024 eCommerceDays Toulouse - Sylius 2.0.pdf
2024 eCommerceDays Toulouse - Sylius 2.0.pdf
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 
Vitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdfVitthal Shirke Java Microservices Resume.pdf
Vitthal Shirke Java Microservices Resume.pdf
 
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
原版定制美国纽约州立大学奥尔巴尼分校毕业证学位证书原版一模一样
 

NoSQL & HBase overview

  • 1.
  • 2. Big Data – 4 V’s
  • 3. NoSQL • NoSQL is all about scalability • Scaling to size • Scaling to complexity • Delivering heavy read/write workloads • Data duplication and denormalization are first-class citizens
  • 8. Recheck • What is the CAP theorem? • Does NoSQL support transactions? • What are the NoSQL types?
  • 9. HBase • Scalable, distributed data store • Sorted map of maps / key-value store • Open-source counterpart of Google’s Bigtable • Sparse • Multidimensional • Tightly integrated with Hadoop • Not an RDBMS
  • 10.
  • 11. Architecture • HDFS (DataNodes): storage • ZooKeeper: membership management • RegionServers: serve the regions • HBase Masters: janitorial work
  • 15. Important Terms • Table • Consists of rows and columns • Row • Has a set of columns • Identified by a rowkey (primary key) • Column Qualifier • Dynamic column name • Column Family • Column groups, logical and physical (similar access pattern) • Cell • The actual element that contains the data at a row-column intersection • Version • Every cell can have multiple versions
  • 16. Logical & Physical Structure (tall vs. wide table storage footprint)
      Logical representation of an HBase table. We'll look at what it means to Get() row r5 from this table.
          Row   CF1            CF2
          r1    c1:v1          c1:v9 c6:v2
          r2    c1:v2 c3:v6
          r3    c2:v3          c5:v6
          r4    c2:v4
          r5    c1:v1 c3:v5    c7:v8
      Actual physical storage of the table
          HFile for CF1        HFile for CF2
          r1:CF1:c1:t1:v1      r1:CF2:c1:t1:v9
          r2:CF1:c1:t2:v2      r1:CF2:c6:t4:v2
          r2:CF1:c3:t3:v6      r3:CF2:c5:t4:v6
          r3:CF1:c2:t1:v3      r5:CF2:c7:t3:v8
          r4:CF1:c2:t1:v4
          r5:CF1:c1:t2:v1
          r5:CF1:c3:t3:v5
      Result object returned for a Get() on row r5 (KeyValue objects)
          r5:CF1:c1:t2:v1
          r5:CF1:c3:t3:v5
          r5:CF2:c7:t3:v8
      Structure of a KeyValue object
          Key: Row Key, Col Fam, Col Qual, Time Stamp
          Value: Cell Value
  • 17. (J)Ruby Shell Commands • General • DDL • Create • Describe • Namespace • DML • Put • Get • Scan • Delete • Tools • Replication • Snapshot • Security • Visibility
      Creating a table:
          create 'DEVICE_DETAIL','BASIC_INFO','CONTRACT_INFO'
      Generating data:
          put 'DEVICE_DETAIL','Device1','BASIC_INFO:IP_ADDR','10.10.10.10'
          put 'DEVICE_DETAIL','Device2','BASIC_INFO:IP_ADDR','20.20.20.20'
      Describing the table:
          describe 'DEVICE_DETAIL'
      Altering the table (keep up to 3 versions in CONTRACT_INFO):
          alter 'DEVICE_DETAIL',{NAME => 'CONTRACT_INFO',VERSIONS => 3 }
      Updating data:
          put 'DEVICE_DETAIL','Device2','CONTRACT_INFO:CONTRACT_NUMBER','22222222'
      Multi-version example:
          get 'DEVICE_DETAIL','Device2', {COLUMN=>'CONTRACT_INFO:CONTRACT_NUMBER', VERSIONS=>2}
      Scanning the table:
          scan 'DEVICE_DETAIL'
      Scanning with a filter:
          scan 'DEVICE_DETAIL' , { COLUMNS => 'CONTRACT_INFO:STATUS', LIMIT => 10, FILTER => "ValueFilter( =, 'binary:IN_ACTIVE' )" }
      Deleting data:
          delete 'DEVICE_DETAIL','Device2','CONTRACT_INFO:STATUS'
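  • 18. Java API • HTable • HBaseAdmin • HTablePool • Get • Put • Delete • Scan • Increment • HTableDescriptor • HTableInterface • Result • ResultScanner • KeyValue
      // Client classes are in org.apache.hadoop.hbase.client; Bytes is in org.apache.hadoop.hbase.util
      HTable table = new HTable(configuration, hbasetablename);
      // Write one cell: row key, column family, column qualifier, value
      Put row = new Put(Bytes.toBytes(rowKey));
      row.add(Bytes.toBytes(columnFamily), Bytes.toBytes(key), Bytes.toBytes(value));
      table.put(row);
      // Read the row back by its row key
      Get getKey = new Get(Bytes.toBytes(rowKey));
      Result result = table.get(getKey);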
  • 19. Spark HBase
      // create configuration
      val config = HBaseConfiguration.create()
      config.set("hbase.zookeeper.quorum", "localhost")
      config.set("hbase.zookeeper.property.clientPort", "2181")
      config.set("hbase.mapreduce.inputtable", "hbaseTableName")
      // read data (TableInputFormat from org.apache.hadoop.hbase.mapreduce uses the new Hadoop API)
      val hbaseData = sparkContext.newAPIHadoopRDD(config, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result])
      // count rows
      println(hbaseData.count)
  • 21. Write & Read Logic
  • 22. SQL
  • 23. Recheck • What is a column family? • What are the HBase components? • Can you name a few shell commands? • What is a version in HBase?
  • 25.
  • 26.
  • 27.
  • 28. Use Case • Canonical use case: storing crawl data and indices for search • Web Search powered by Bigtable • Indexing the Internet: (1) Crawlers constantly scour the Internet for new pages. Those pages are stored as individual records in Bigtable. (2) A MapReduce job runs over the entire table, generating search indexes for the Web Search application. • Searching the Internet: (3) The user initiates a Web Search request. (4) The Web Search application queries the Search Indexes and retrieves matching documents directly from Bigtable. (5) Search results are presented to the user. • Diagram components: Internet, Crawlers, Bigtable, MapReduce, Search Index, Web Search, You
  • 31.
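
The "sorted map of maps" description on slide 9 and the Get() on row r5 from slide 16 can be sketched in plain Java. This is only a toy in-memory model under stated assumptions, not the HBase client API: the class name SortedMapOfMaps, its put and get methods, and the extra t5 cell are hypothetical and exist only for illustration. Timestamps t1..t4 from the slide are modelled as the numbers 1..4.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of an HBase table as a sorted map of maps (hypothetical, not the HBase API). */
final class SortedMapOfMaps {
    // rowkey -> column family -> column qualifier -> timestamp (newest first) -> value
    private final TreeMap<String, TreeMap<String, TreeMap<String, TreeMap<Long, String>>>> rows =
            new TreeMap<>();

    /** Stores one cell version; intermediate maps are created on demand, so the table stays sparse. */
    void put(String row, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(qualifier, q -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    /** Returns up to maxVersions newest versions of every cell in a row, formatted like the slide. */
    List<String> get(String row, int maxVersions) {
        List<String> cells = new ArrayList<>();
        TreeMap<String, TreeMap<String, TreeMap<Long, String>>> families = rows.get(row);
        if (families == null) {
            return cells; // unknown row: empty result
        }
        for (Map.Entry<String, TreeMap<String, TreeMap<Long, String>>> family : families.entrySet()) {
            for (Map.Entry<String, TreeMap<Long, String>> column : family.getValue().entrySet()) {
                int emitted = 0;
                for (Map.Entry<Long, String> version : column.getValue().entrySet()) {
                    if (emitted == maxVersions) {
                        break;
                    }
                    cells.add(row + ":" + family.getKey() + ":" + column.getKey()
                            + ":t" + version.getKey() + ":" + version.getValue());
                    emitted++;
                }
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        SortedMapOfMaps table = new SortedMapOfMaps();
        // Cells for rows r1 and r5, copied from slide 16.
        table.put("r1", "CF1", "c1", 1L, "v1");
        table.put("r1", "CF2", "c1", 1L, "v9");
        table.put("r1", "CF2", "c6", 4L, "v2");
        table.put("r5", "CF1", "c1", 2L, "v1");
        table.put("r5", "CF1", "c3", 3L, "v5");
        table.put("r5", "CF2", "c7", 3L, "v8");

        // Get() on row r5, newest version of each cell.
        System.out.println(table.get("r5", 1));

        // Hypothetical newer version of r5:CF1:c1 (not on the slide) to show versioning.
        table.put("r5", "CF1", "c1", 5L, "v7");
        System.out.println(table.get("r5", 2));
    }
}
```

Nested TreeMaps keep rows, families, and qualifiers in sorted order, and the reversed comparator on the innermost map puts the newest timestamp first, which is why asking for one version returns the latest value. The maxVersions argument plays the same role as VERSIONS in the shell get example on slide 17.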

Editor's Notes

  1. Most NoSQL stores lack true ACID transactions, although a few recent systems, such as FairCom c-treeACE, Google Spanner (though technically a NewSQL database) and FoundationDB, have made them central to their designs. Eventual consistency is a consistency model used in distributed computing to achieve high availability. Informally, it guarantees that if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value. Eventually consistent services are often classified as providing BASE (Basically Available, Soft state, Eventual consistency) semantics, in contrast to traditional ACID (Atomicity, Consistency, Isolation, Durability) guarantees.
  2. http://blog.monitis.com/2011/05/22/picking-the-right-nosql-database-tool/
  3. Eric Brewer’s CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.) Consistency means that each client always has the same view of the data. Availability means that all clients can always read and write. Partition tolerance means that the system works well across physical network partitions.
  4. http://localhost:60010/master-status