The document discusses NHN Japan's use of HBase for the LINE messaging platform's storage infrastructure. Some key points:
- HBase is used to store tens of billions of message rows per day for LINE, achieving sub-10ms response times and high availability through dual clusters.
- The presentation covers their experience migrating HBase clusters between data centers online, handling NameNode failures, and stabilizing the LINE message storage cluster.
- It describes the custom HBase replication and bulk data migration tools developed by NHN Japan to support online cluster migrations without downtime. Failure handling and cluster stabilization techniques are also discussed.
Hadoop Summit 2012 | HBase Consistency and Performance Improvements - Cloudera, Inc.
The latest Apache HBase releases, 0.92 and 0.94, contain many improvements over prior releases in both correctness and performance. We discuss a couple of these improvements from a development and operations perspective. For correctness, we discuss the ACID guarantees of HBase, give a case study of problems with earlier releases, and give an overview of the implementation internals that were improved to fix the issues. For performance, we discuss recent improvements in 0.94 and how to monitor the performance of a cluster with new metrics.
Hadoop 0.23 contains major architectural changes in both the HDFS and Map-Reduce frameworks. The fundamental changes include HDFS (Hadoop Distributed File System) Federation and YARN (Yet Another Resource Negotiator), which overcome the scalability limitations of both HDFS and the Job Tracker. Despite the major architectural changes, the impact on user applications and the programming model has been kept to a minimum to ensure that existing user Hadoop applications written for Hadoop 0.20 will continue to function with minimal changes. In this talk we discuss the architectural changes Hadoop 0.23 introduces and compare it to Hadoop 0.20. Since this is the biggest major release of Hadoop adopted at Yahoo! in 3 years (after Hadoop 0.20), we also talk about the customer impact and potential deployment issues of Hadoop 0.23 and its ecosystem. The deployment of Hadoop 0.23 at Yahoo! is an ongoing process and is being conducted in a phased manner on our clusters.
Presenter: Viraj Bhat, Principal Engineer, Yahoo!
Design, Scale and Performance of MapR's Distribution for Hadoop - mcsrivas
Details the first ever Exabyte-scale system that can hold a Trillion large files. Describes MapR's Distributed NameNode (tm) architecture, and how it scales very easily and seamlessly. Shows map-reduce performance across a variety of benchmarks like dfsio, pig-mix, nnbench, terasort and YCSB.
Apache HBase is the Hadoop open-source, distributed, versioned storage manager well suited for random, realtime read/write access. This talk gives an overview of how HBase achieves random I/O, focusing on the storage layer internals: starting from how the client interacts with Region Servers and the Master, then going into WAL, MemStore, compaction, and on-disk format details. It also looks at how the storage is used by features like snapshots, and how it can be improved to gain flexibility, performance, and space efficiency.
Introduction to HBase. HBase is a NoSQL database that has experienced a tremendous increase in popularity in recent years. Large companies like Facebook, LinkedIn, and Foursquare use HBase. In this presentation we address questions like: What is HBase? How does it compare to relational databases? What is the architecture? How does HBase work? What about schema design? What about IT resources? Questions that should help you consider whether this solution might be suitable in your case.
Architectural Overview of MapR's Apache Hadoop Distribution - mcsrivas
Describes the thinking behind MapR's architecture. MapR's Hadoop achieves better reliability on commodity hardware compared to anything on the planet, including custom, proprietary hardware from other vendors. Apache HDFS and Cassandra replication are also discussed, as are SAN and NAS storage systems like NetApp and EMC.
This talk delves into the many ways a user can employ HBase in a project. Lars will look at many practical examples based on real applications in production, for example at Facebook and eBay, and the right approach for those wanting to find their own implementation. He will also discuss advanced concepts, such as counters, coprocessors and schema design.
Adobe has packaged HBase in Docker containers and uses Marathon and Mesos to schedule them, allowing us to decouple the RegionServer from the host, express resource requirements declaratively, and open the door for unassisted real-time deployments, elastic (up and down) real-time scalability, and more. In this talk, you'll hear what we've learned and why this approach could fundamentally change HBase operations.
HBaseCon 2012 | HBase and HDFS: Past, Present, Future - Todd Lipcon, Cloudera
Apache HDFS, the file system on which HBase is most commonly deployed, was originally designed for high-latency high-throughput batch analytic systems like MapReduce. Over the past two to three years, the rising popularity of HBase has driven many enhancements in HDFS to improve its suitability for real-time systems, including durability support for write-ahead logs, high availability, and improved low-latency performance. This talk will give a brief history of some of the enhancements from Hadoop 0.20.2 through 0.23.0, discuss some of the most exciting work currently under way, and explore some of the future enhancements we expect to develop in the coming years. We will include both high-level overviews of the new features as well as practical tips and benchmark results from real deployments.
There's a big shift at both the architecture and API level from Hadoop 1 to Hadoop 2, particularly YARN, and we held our first meetup to talk about this (http://www.meetup.com/Atlanta-YARN-User-Group/) on 10/13/2013.
Tokyo HBase Meetup - Realtime Big Data at Facebook with Hadoop and HBase (ja) - tatsuya6502
This is the Japanese translation of the presentation at Tokyo HBase Meetup (July 1, 2011)
Author:
Jonathan Gray
Software Engineer / HBase Committer at Facebook
With the arrival of the Oracle Zero Data Loss Recovery Appliance alongside the existing Oracle Database versions 10g and 11g, Oracle Recovery Manager has become an ever more important feature. The author of the popular OTN series "Shibacho-sensei's Try It and Be Convinced! The Road to DBA" discusses it, holding nothing back: from RMAN backup operation examples to the internal workings and tuning of fast incremental backups.
This is the material presented at the 2nd NHN Technology Conference.
References: LINE Storage: Storing billions of rows in Sharded-Redis and HBase per Month (http://tech.naver.jp/blog/?p=1420), an entry I posted in March 2012.
Jingwei Lu and Jason Zhang (Airbnb)
AirStream is a realtime stream computation framework built on top of Spark Streaming and HBase that allows our engineers and data scientists to easily leverage HBase to get real-time insights and build real-time feedback loops. In this talk, we will introduce AirStream, and then go over a few production use cases.
MapR is an amazing new distributed filesystem modeled after Hadoop. It maintains API compatibility with Hadoop, but far exceeds it in performance, manageability, and more.
Strata + Hadoop World 2012: HDFS: Now and Future - Cloudera, Inc.
Hadoop 1.0 is a significant milestone in being the most stable and robust Hadoop release tested in production against a variety of applications. It offers improved performance, support for HBase, disk-fail-in-place, Webhdfs, etc over previous releases. The next major release, Hadoop 2.0 offers several significant HDFS improvements including new append-pipeline, federation, wire compatibility, NameNode HA, further performance improvements, etc. We describe how to take advantages of the new features and their benefits. We also discuss some of the misconceptions and myths about HDFS.
In this session you will learn:
1. History of hadoop
2. Hadoop Ecosystem
3. Hadoop Animal Planet
4. What is Hadoop?
5. Distinctions of hadoop
6. Hadoop Components
7. The Hadoop Distributed Filesystem
8. Design of HDFS
9. When Not to use Hadoop?
10. HDFS Concepts
11. Anatomy of a File Read
12. Anatomy of a File Write
13. Replication & Rack awareness
14. Mapreduce Components
15. Typical Mapreduce Job
BDM37: Hadoop in production – the war stories by Nikolaï Grigoriev, Principal... - Big Data Montreal
Sharing of Hadoop cluster deployment experience in production from scratch on real hardware. Brief overview of Hadoop stack, its components, major deployment and configuration challenges, performance tuning and application tuning experience. Some “war stories” about the issues we have faced while operating, the benefits of DevOps approach for running Hadoop apps.
UiPath Test Automation using UiPath Test Suite series, part 4 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work and a knack for helping others understand how things work. He brings around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Generating a custom Ruby SDK for your web service or Rails API using Smithy - g2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Connector Corner: Automate dynamic content and events by pushing a button - DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
GraphRAG is All You need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
DevOps and Testing slides at DASA Connect - Kari Kakkonen
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We closed with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Essentials of Automations: Optimizing FME Workflows with Parameters - Safe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors and newer malware, including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered Quality - Inflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Monitoring Java Application Security with JDK Tools and JFR Events
Storage infrastructure using HBase behind LINE messages
1.
2. Storage infrastructure using
HBase behind LINE messages
NHN Japan Corp.
LINE Server Task Force
Shunsuke Nakamura
@sunsuk7tp
13.1.21 Hadoop Conference Japan 2013 Winter
3. To support LINE's users, we have built
message storage that is
Large scale (tens of billions of rows/day)
Responsive (under 10 ms)
Highly available (dual clusters)
4. Outline
• About LINE
• LINE & Storage requirements
• What we achieved
• Today’s topics
– IDC online migration
– NN failover
– Stabilizing LINE message cluster
• Conclusion
5. LINE
- A global messenger powered by NHN Japan -
Devices
5 different mobile platforms
+ Desktop support
9. New Year 2013 in Japan
Number of requests in a HBase cluster: usual peak hours vs. New Year 2013, plotted by 1 min
あけおめ! (Happy New Year!) 新年好! (Happy New Year!)
A 3x traffic explosion; LINE Storage had no problems :)
10. LINE on Hadoop
Storages for service, backup and log
For HBase, M/R and log archive
Bulk migration and ad-hoc analysis
For HBase and Sharded-Redis
Collecting Apache and Tomcat logs
KPI, Log analysis
12. LINE service requirements
LINE is a…
Messaging Service - Should be fast
Global Service - Downtime not allowed
But, not a Simple Messaging Service.
Message synchronization b/w phone & PCs
– Messages should be kept for a while.
13. LINE’s storage requirements
No
data
loss
Eventual
Low
consistency
latency
HA
Flexible
schema
Easy
scale-‐
management
out
13.1.21
Hadoop
Conference
Japan
2013
Winter
13
14. Our selection is HBase
• Low latency for large amount of data
• Linearly scalable
• Relatively lower operating cost
– Replication by nature
– Automatic failover
• Data model fits our requirements
– Semi-structured
– Timestamp
15. Stored rows per day in a cluster
[chart: stored rows per day, y-axis from 2 to 10 billions/day]
16. What we achieved with HBase
• No data loss
– Persistent
– Data replication
• Automatic recovery from server failure
• Reasonable performance for large data sets
– Hundreds of billions of rows
– Write: ~ 1 ms
– Read: 1 ~ 10 ms
17. Many issues we had
• Heterogeneous storages coordination
• IDC online migration
• Flush & Compaction Storms by “too many HLogs”
• Row & Column distribution
• Secondary Index
• Region Management
– load, size balancing
– RS Allocation
– META region
– M/R
• Monitoring for diagnostics
• Traffic burst by decommission
• NN problems
• Performance degradation
– hotspot problem
– timeout burst
– GC problem
• Client bugs
– Thread Blocking on server failure (HBASE-6364)
18. Today’s topics
IDC online migration
NN failover
Stabilizing LINE message cluster
20. Why?
• Move whole HBase clusters and data
• For better network infrastructure
• Without downtime
21. IDC online migration
Before migration
[diagram: the App Server writes to src-HBase only]
22. IDC online migration
• Write to both (client-level replication)
[diagram: the App Server writes to both src-HBase and dst-HBase]
23. IDC online migration
• New data: Incremental replication
• Old data: Bulk migration
• dst’s timestamp equals src’s one
[diagram: the App Server writes to both src-HBase and dst-HBase]
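The write-to-both step can be sketched as follows. This is a minimal illustration with in-memory stand-ins for the two clusters (`ClusterStub` and `dual_write` are hypothetical names, not LINE's actual client code); the point it shows is that the client issues every mutation to both clusters with one explicit timestamp, so that dst's cell timestamps equal src's and bulk-migrated old data never conflicts with live writes.

```python
import time

class ClusterStub:
    """Hypothetical in-memory stand-in for one HBase cluster's table."""
    def __init__(self):
        self.rows = {}  # (rowkey, column) -> (timestamp, value)

    def put(self, rowkey, column, value, timestamp):
        key = (rowkey, column)
        # Keep the cell with the newest timestamp, as HBase would.
        if key not in self.rows or self.rows[key][0] <= timestamp:
            self.rows[key] = (timestamp, value)

def dual_write(src, dst, rowkey, column, value):
    # One explicit timestamp shared by both writes, so dst's cell
    # timestamps equal src's (required for conflict-free migration).
    ts = int(time.time() * 1000)
    src.put(rowkey, column, value, ts)
    dst.put(rowkey, column, value, ts)
    return ts

src_hbase, dst_hbase = ClusterStub(), ClusterStub()
ts = dual_write(src_hbase, dst_hbase, b"user1|msg42", b"m:body", b"hello")
```

After this call both stubs hold identical cells, which is exactly the invariant client-level replication relies on.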
24. LINE HBase Replicator & BulkMigrator
Replicator is for incremental replication
BulkMigrator is for bulk migration
25. LINE HBase Replicator
• Our own implementation
• Prefer pull to push
• Throughput throttling
• Workload isolation of replicator and RS
• Rowkey conversion and filtering
[diagram: stock HBase replication pushes edits from src-HBase to dst-HBase; the LINE HBase Replicator instead pulls from src-HBase and writes to dst-HBase]
26. LINE HBase Replicator
- A simple daemon to replicate local regions -
1. HLogTracker reads a checkpoint and selects the next HLog.
2. For each entry in the HLog:
1. Filter & convert the HLog.Entry
2. Create Puts and batch them to the dst HBase
• Periodic checkpointing
• Generally, entries are replicated within seconds
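The replication loop on this slide might look roughly like the sketch below. The entry format and the `checkpoint`, `convert`, and `batch_put` hooks are illustrative stand-ins, since the real Replicator parses HLog files on the RegionServer's local disk; the sketch only shows the checkpoint-skip, filter/convert, and batch steps.

```python
def replicate_once(hlog_entries, checkpoint, convert, batch_put):
    """Replay HLog entries newer than the checkpoint.

    hlog_entries: iterable of (seq_id, rowkey, column, value) tuples,
                  a stand-in for parsed HLog.Entry records.
    checkpoint:   last sequence id already replicated.
    convert:      rowkey conversion/filtering hook; returns None to skip.
    batch_put:    writes a batch of (rowkey, column, value) Puts to dst.
    """
    batch, last_seq = [], checkpoint
    for seq_id, rowkey, column, value in hlog_entries:
        if seq_id <= checkpoint:
            continue              # replicated before the last checkpoint
        last_seq = seq_id         # advance past filtered entries too
        new_key = convert(rowkey)
        if new_key is None:
            continue              # filtered out
        batch.append((new_key, column, value))
    if batch:
        batch_put(batch)
    return last_seq               # persist this as the new checkpoint

# Usage: checkpoint 1 skips the first entry; rows under b"tmp|" are filtered.
out = []
new_ckpt = replicate_once(
    [(1, b"a", b"c:1", b"x"), (2, b"tmp|b", b"c:1", b"y"), (3, b"c", b"c:1", b"z")],
    checkpoint=1,
    convert=lambda k: None if k.startswith(b"tmp|") else k,
    batch_put=out.extend,
)
```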
27. Bulk migration
1. MapReduce between any storages
– Map task only
– Read source, write destination
– Task scheduling problem depends on region allocation
2. Non MapReduce version (BulkMigrator)
– Our own implementation
– HBase → HBase
– On each RS, scan & batch by a region
– Throughput throttling
– Slow, but easy to implement and debug
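The non-MapReduce BulkMigrator behavior described above (per-region scan, batched writes, throughput throttling) can be sketched like this; `scan_region` and `batch_put` are hypothetical stand-ins for the HBase scan and batch-put calls, not the actual tool's API.

```python
import time

def migrate_region(scan_region, batch_put, batch_size=100, max_rows_per_sec=1000):
    """Copy one region's rows: scan the source, write fixed-size batches to
    the destination, and sleep between batches to cap throughput."""
    batch, copied = [], 0
    for row in scan_region():  # generator over the source region's rows
        batch.append(row)
        if len(batch) >= batch_size:
            batch_put(batch)
            copied += len(batch)
            batch = []
            time.sleep(batch_size / float(max_rows_per_sec))  # throttle
    if batch:  # flush the final partial batch
        batch_put(batch)
        copied += len(batch)
    return copied

# Usage with in-memory stand-ins for a source region and destination table:
src_rows = [(b"row%d" % i, b"value") for i in range(5)]
dst_rows = []
copied = migrate_region(lambda: iter(src_rows), dst_rows.extend,
                        batch_size=2, max_rows_per_sec=10000)
```

Slow, as the slide says, but trivially easy to reason about and debug compared with a MapReduce job whose task scheduling depends on region allocation.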
31. NameNode failure in October 2012
32. HA-NN failover failed
• The failure was not in the NameNode process itself
• Incorrect leader election at network partitioning
• Complicated configuration
– Easy to mistake, difficult to control
– Pacemaker scripting was not straightforward
– VIP is risky to HDFS
• DRBD split-brain problem
– Protocol C
– Unable to re-sync while service is online
33. Now: In-house NN failure handling
• Bye-bye old HA-NN
– Had to restart whole HBase clusters after NN failover
• Alternative ideas
– Quorum-based leader election (Using ZK)
– Using L4 switch
– Implement our own AvatarNode
• Safer solution instead of a little downtime
34. In-house NN failure handling (1)
rsync with --link-dest, run periodically
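The `--link-dest` trick keeps periodic snapshots of the NameNode metadata cheap: files unchanged since the previous snapshot are hard-linked rather than copied, so each snapshot costs only the changed files. A sketch of building such an rsync command (the paths and function name are illustrative, not LINE's actual configuration):

```python
def rsync_snapshot_cmd(src_dir, snapshots_dir, prev_snapshot, new_snapshot):
    """Build an rsync invocation that copies the NameNode metadata directory
    into a new snapshot, hard-linking files unchanged since the previous
    snapshot (rsync's --link-dest option)."""
    return [
        "rsync", "-a", "--delete",
        "--link-dest=%s/%s" % (snapshots_dir, prev_snapshot),
        src_dir + "/",
        "%s/%s/" % (snapshots_dir, new_snapshot),
    ]

cmd = rsync_snapshot_cmd("/data/namenode", "/backup/nn",
                         "2013-01-20", "2013-01-21")
```

Run from cron, this yields a chain of browsable point-in-time copies without the DRBD split-brain risks listed on slide 32.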
35. In-house NN failure handling (2)
Bomb
36. In-house NN failure handling (3)
38. Stabilizing LINE message cluster
Case 1: “Too many HLogs”
Case 2: Hotspot performance problems
Case 3: META region workload isolation
Case 4: Region mappings to RS
(also: H/W failure handling, RS GC storms)
39. Case1: “Too many HLogs”
• Effect
– MemStore flush storm
– Compaction storm
• Cause
– Different regions growth
– Heterogeneous tables in a RS
• Solution
– Region balancing
– External flush scheduler
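An external flush scheduler in the spirit of this slide could look like the sketch below: it picks the largest MemStores above a threshold and flushes only a few per round, spreading flushes out instead of letting "too many HLogs" force them all at once. `flush_region` is a hypothetical hook; in practice it would call the HBase admin flush API.

```python
def schedule_flushes(memstore_sizes, flush_region, threshold_bytes, max_per_round=2):
    """Flush at most max_per_round of the largest MemStores above the
    threshold, so flushes are staggered instead of all forced at once."""
    candidates = sorted(
        ((size, region) for region, size in memstore_sizes.items()
         if size >= threshold_bytes),
        reverse=True,  # largest MemStores first
    )
    flushed = []
    for _size, region in candidates[:max_per_round]:
        flush_region(region)  # in practice: an HBase admin flush call
        flushed.append(region)
    return flushed

# Usage: only the two largest MemStores over 20 MB get flushed this round.
sizes = {"r1": 10 << 20, "r2": 50 << 20, "r3": 40 << 20, "r4": 5 << 20}
flushed = schedule_flushes(sizes, lambda r: None, threshold_bytes=20 << 20)
```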
40. Case1: Number of HLogs
[chart: in the better case, periodic flushes keep the HLog count low through peak and off-peak; in the worse case, repeated forced flushes pile up and end in a flush storm]
41. Case2: Hotspot problems
• Effect
– Excessive GC
– RS performance degradation (High CPU usage)
• Cause
– Get/Scan:
• Row or column, updated too frequently
• Row which has too many columns (+ tombstones)
• Solution
– Schema and row/column distribution are important
– Hotspot region isolation
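One common way to distribute rows, in line with the "schema and row/column distribution" advice above (a generic technique, not necessarily LINE's exact scheme), is to salt the rowkey with a short hash-derived prefix so writes for different users spread across key ranges, and thus regions:

```python
import hashlib

def salted_rowkey(user_id, message_id, buckets=16):
    """Prefix the rowkey with a small hash-derived salt so that writes for
    different users spread across `buckets` key ranges (and thus regions)."""
    digest = hashlib.md5(user_id).hexdigest()
    salt = int(digest, 16) % buckets
    return b"%02x|%s|%s" % (salt, user_id, message_id)

key = salted_rowkey(b"user123", b"m0001")
```

Because the salt is derived from the user id, all of one user's messages stay in a single contiguous range (scans still work per user), while the user population as a whole fans out over 16 ranges.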
42. Case3: META region workload isolation
• Effect
1. RS high CPU
2. Excessive timeout
3. META lookup timeout
• Cause
– Inefficient exception handling of HBase client
– Hotspot region and META in same RS
• Solution
– META only RS
43. Case4: Region mappings to RS
• Effect
– Region mapping is not restored on RS restart
– Some region mappings aren’t restored properly
after graceful restart
• graceful_stop.sh --restart --reload
• Cause
– HBase does not support it well
• Solution
– Periodic dump and restore it
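The periodic dump-and-restore workaround can be sketched as follows; the dump format and the `move_region` hook are illustrative (in a real deployment the move would go through HBase's admin `move` command), but the idea matches the slide: record the region-to-RegionServer mapping periodically, and after a restart re-issue moves for any region that came back in the wrong place.

```python
import json

def dump_mapping(assignment):
    """Serialize the current region -> RegionServer assignment."""
    return json.dumps(assignment, sort_keys=True)

def restore_mapping(dumped, current_assignment, move_region):
    """Re-issue a move for every region that came back on a different
    RegionServer than the one recorded in the dump."""
    desired = json.loads(dumped)
    moved = []
    for region, rs in sorted(desired.items()):
        if current_assignment.get(region) != rs:
            move_region(region, rs)  # in practice: an admin 'move' call
            moved.append(region)
    return moved

# Usage: region-a came back on rs3 instead of rs1, so only it is moved.
snapshot = dump_mapping({"region-a": "rs1", "region-b": "rs2"})
moves = []
moved = restore_mapping(snapshot, {"region-a": "rs3", "region-b": "rs2"},
                        lambda r, rs: moves.append((r, rs)))
```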
44. Summary
• IDC online migration
– Without downtime
– LINE HBase Replicator & BulkMigrator
• NN failover
– Simple solution for a person saying
“What’s Hadoop?”
• Stabilizing LINE message cluster
– Improved response time of RS
45. Conclusion
We won 100M users adopting HBase
LINE Storage is a successful example
of a messaging service using HBase