HBase backups and performance on MapR

lohitvijayarenu

Slides from the HBase User Group on HBase backups and performance on the MapR distribution for Apache Hadoop.

HBase on MapR. Lohit VijayaRenu, MapR Technologies, Inc. HBase Contributor Day at Yahoo, June 30, 2011

Who am I? Lohit VijayaRenu, Software Engineer at MapR Technologies (lohit@maprtech.com). MapR combines the best of the Hadoop community contributions with significant internally financed infrastructure development to provide a complete distribution for Apache Hadoop (www.mapr.com).

HBase on MapR: backups using snapshots; performance on MapR; highly available MapR; MapR Control System

HBase Backups

"We're trying to come up with right strategy for backing up HBase tables ...Currently, we're employing exports (writing onto HDFS of another cluster directly), but is taking too long (~5 hours to export ~5GB of data)..." Manoj Murumkar

"...Recently I encountered a problem about data loss of HBase. So it comes to the question that how to backup HBase data to recover table records...What about copy the directory of HBase to another directory in HDFS?..." Liu Xianglong

Source: hbase-user group

Available options:

5. CopyTable

6. Distcp

7. Backup from Mozilla

8. Cluster Replication

9. Table Snapshots Source: http://blog.sematext.com/2011/03/11/hbase-backup-options/

11. Snapshots are consistent

12. Saves space by sharing blocks

13. Lightning fast

14. Zero performance loss on writing to original

15. Scheduled, or on-demand

16. REST API for creation and deletion of snapshots

[Figure: MapR redirect on write for snapshots. HBase reads and writes go to /hbase; snapshots appear as /hbase/.snapshot/Snapshot20110630, /hbase/.snapshot/Snapshot20110629, and /hbase/.snapshot/Snapshot3. Data blocks: A, B, C, C', D.]

17. MapR Snapshots: HBase table in DFS; take a snapshot on running HBase; restore from snapshot
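The redirect-on-write idea from the snapshot slides can be sketched with a toy model. This is only an illustration under stated assumptions, not MapR's implementation: the `Volume` class, its methods, and the block contents are hypothetical names invented here. It shows why a snapshot can be cheap to take (only a block map is copied) and why snapshots share unchanged blocks with the live data.

```python
class Volume:
    """Toy volume (hypothetical): a block map plus a shared block store."""

    def __init__(self):
        self.store = {}      # block id -> bytes, shared by live data and snapshots
        self.live = {}       # block name -> block id for the live volume
        self.snapshots = {}  # snapshot name -> frozen copy of the block map
        self._next_id = 0

    def write(self, name, data):
        # Redirect on write: new data always lands in a fresh block, so
        # blocks referenced by existing snapshots are never modified.
        block_id = self._next_id
        self._next_id += 1
        self.store[block_id] = data
        self.live[name] = block_id

    def snapshot(self, snap_name):
        # Only the map is copied; no data blocks are duplicated.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name, snap_name=None):
        block_map = self.live if snap_name is None else self.snapshots[snap_name]
        return self.store[block_map[name]]


vol = Volume()
for name in "ABCD":
    vol.write(name, f"{name}-v1".encode())
vol.snapshot("Snapshot20110629")
vol.write("C", b"C-v2")            # plays the role of C' in the slide's diagram
vol.snapshot("Snapshot20110630")

print(vol.read("C", "Snapshot20110629"))  # old contents, still readable
print(vol.read("C"))                      # current contents
print(len(vol.store))                     # blocks stored in total
```

In this model the two snapshots and the live volume reference the same blocks for A, B, and D; only the rewritten block C exists twice.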

18. MapR Control System: snapshot information; snapshot schedules; all UI operations have REST APIs. More info at www.mapr.com

20. Consistent, point-in-time data replication to different cluster

21. Differential deltas are updated

22. Compressed and check-summed

23. Scheduled or on-demand

24. REST API to set up, start, and stop a mirror

[Figure: mirroring between Production and Backup clusters in Datacenter 1 and Datacenter 2 over a WAN.]
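The mirroring properties listed above (differential deltas, compression, checksums) can be illustrated with a small sketch. It is a toy model under stated assumptions, not how MapR mirroring is implemented: the block dictionaries and the helpers `checksum`, `build_delta`, and `apply_delta` are hypothetical, and SHA-256 and zlib are stand-ins chosen for the example. Deleted blocks are not handled.

```python
import hashlib
import zlib


def checksum(data: bytes) -> str:
    """Hex digest used to detect changed or corrupted blocks."""
    return hashlib.sha256(data).hexdigest()


def build_delta(source: dict, mirror_sums: dict) -> dict:
    """Collect compressed payloads for blocks whose checksum differs on the mirror."""
    delta = {}
    for name, data in source.items():
        digest = checksum(data)
        if mirror_sums.get(name) != digest:
            delta[name] = (zlib.compress(data), digest)
    return delta


def apply_delta(mirror: dict, delta: dict) -> None:
    """Decompress each block, verify its checksum, then update the mirror."""
    for name, (payload, digest) in delta.items():
        data = zlib.decompress(payload)
        if checksum(data) != digest:
            raise ValueError(f"checksum mismatch for block {name}")
        mirror[name] = data


# Hypothetical block contents: only block C changed since the last sync.
source = {"A": b"a" * 1000, "B": b"b" * 1000, "C": b"c-new" * 200}
mirror = {"A": b"a" * 1000, "B": b"b" * 1000, "C": b"c-old" * 200}

mirror_sums = {name: checksum(data) for name, data in mirror.items()}
delta = build_delta(source, mirror_sums)
apply_delta(mirror, delta)

print(sorted(delta))     # names of the blocks that crossed the WAN
print(mirror == source)  # whether the mirror now matches the source
```

Only the changed block is shipped, which is the point of sending deltas rather than a full copy on every sync.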

25. HBase performance

"...Initially, when the table was empty I was getting around 300 inserts per second with 50 writing threads. Then, when the region split and a second server was added the rate suddenly jumped to 3000 inserts/sec per server, so ~6000 for the two servers..." Eran Kutner

"...My scenario is similar, we need under 10k rows, 10-20 columns and which can have thousands of version with value not greater than 300 bytes...Can we get 40-50k records/sec insertion speed in HBase??..." Gaurav Vashishth

Source: hbase-user group
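To put the two quotes side by side, a back-of-the-envelope calculation is sketched below. It assumes throughput scales linearly with the number of region servers, which the quotes do not establish; the per-server rate is taken from the first quote and the target range from the second. The helper `servers_needed` is a hypothetical name.

```python
import math

PER_SERVER_RATE = 3000  # inserts/sec per server, from the first quote


def servers_needed(target_rate: int, rate_per_server: int) -> int:
    """Servers required for a target rate, assuming linear scaling."""
    return math.ceil(target_rate / rate_per_server)


observed_total = PER_SERVER_RATE * 2              # the two-server case in the quote
low = servers_needed(40_000, PER_SERVER_RATE)     # lower end of the requested range
high = servers_needed(50_000, PER_SERVER_RATE)    # upper end of the requested range

print(observed_total, low, high)
```

Under that linear-scaling assumption, the requested 40-50k records/sec would take roughly 14 to 17 servers at the quoted per-server rate.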

27. HMaster and RegionServer running on MapR

28. YCSB client running on RS nodes

[Figure: YCSB setup. A YCSB client runs on each RegionServer (RS) node, alongside ZooKeeper and the HBase Master, all on MapR. Source: https://github.com/lohitvijayarenu/YCSB]

30. Throughput rates were similar from all nodes

31. All operations in the cluster completed at around the same time.

[Figure: YCSB operations from nodes]

32. Insert performance. Dataset: 1B rows; row size: 1K; 10 RS; 11 × 2TB disks @ 7200 RPM; 8 cores, 24GB RAM, 2Gbps; replication factor 3, no compression.

[Plot: Insert (one node), ops vs. seconds]

33. Read performance. Dataset: 0.9B rows; row size: 1K; 9 RS; 5 × 500G disks @ 7200 RPM; 8 cores, 24GB RAM, 2Gbps.

[Plot: Read (one node), ops vs. seconds]
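The dataset sizes implied by the two benchmark slides can be worked out as a rough estimate. The sketch assumes "1K" means 1024 bytes of payload per row and ignores HBase's per-cell key and metadata overhead, so real on-disk sizes would be larger. The helper `raw_bytes` is a hypothetical name; replication is applied only to the insert test because the read slide does not state it.

```python
TIB = 1024 ** 4


def raw_bytes(rows: int, row_bytes: int, replication: int = 1) -> int:
    """Payload bytes for a dataset, optionally multiplied by the replication factor."""
    return rows * row_bytes * replication


insert_logical = raw_bytes(1_000_000_000, 1024)                 # 1B rows of 1K
insert_on_disk = raw_bytes(1_000_000_000, 1024, replication=3)  # with 3 replicas
read_logical = raw_bytes(900_000_000, 1024)                     # 0.9B rows of 1K
rows_per_rs = 1_000_000_000 // 10                               # insert test, 10 RS

print(f"insert, logical:   {insert_logical / TIB:.2f} TiB")
print(f"insert, 3 replicas: {insert_on_disk / TIB:.2f} TiB")
print(f"read, logical:     {read_logical / TIB:.2f} TiB")
print(f"rows per RS:       {rows_per_rs}")
```

So the insert test writes on the order of one terabyte of row payload, roughly three terabytes once replicated, spread over ten region servers.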

34. HBase High Availability

"...In HBase 0.90 I have seen that it has a fault tolerant behavior of triggering lease recovery and closing the file when the writer dies in the middle. Yet does hbase have any workaround/recovery when NameNode is restarted in the middle of the file write(possibly the HLog file , after some syncs)???..." Gokulakannan M

Source: hbase-user group

35. MapR High Availability: no single point of failure. Distributed NameNode: automatic and transparent failover; better performance; replicated and persisted to disk; fully distributed and highly scalable. Real time HBase on MapR.

[Figure: HBase read/write on MapR (no single point of failure), with NameNode (NN) function present on every node.]

36. MapR Heatmap™: intuitive; insightful; comprehensive; one node or thousands. More at www.mapr.com

38. http://mapr.com/only-with-mapr.html

39. Follow us @mapr

40. Download and try from www.mapr.com
