Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
hosted by
HBase Practice
At China Life Insurance Co., Ltd.
Fan Zheng
August 17,2018
hosted by
Content
01
02
04
03
Scenarios
Integration, Processing, Query, Export
Optimizations
Cluster, Configuration, Writi...
hosted by
Scenarios
Integration, Processing, Query, Export
01
hosted by
Scenarios
Overview
1
2
3
4
Processing
Fast batch R/W through snapshot,
bulkload…
Export
Hive external table supp...
hosted by
Scale
Cluster
3 Clusters
200+ Nodes
Data
HBase data 300TB+
largest table 30TB+ , 2500 regions
Processing
Hundred...
hosted by
Integration
Scenarios
policy_detail
Business
Systems
Sales management
systems
ECIF CallCenter
Customer contact
s...
hosted by
Integration
Scenarios
• Data integrated by the key of business entity.
• One row for all information
hosted by
Integration
Scenarios
• RDBMS->Textfile->Hfile->BulkLoad to HBase
hosted by
Processing
Scenarios
• Analysis of entities (customer,policy …) .
• Build indexes between entities.
hosted by
Processing
Scenarios
• Processing hundreds of labels within one time I/O
hosted by
Processing
Scenarios
• Encapsulated, Configurable, Sql-like development framework
Development:
• Sql-like&Native...
hosted by
Query
Scenarios
• Unified query services centering on entities such as customer
hosted by
Query
Scenarios
• Sql-like input
• One Service for All labels
hosted by
Export
Scenarios
• Exporting to Hive to support batch query and analysis
hosted by
Export
Scenarios
• Universal MR, read snapshot
hosted by
Optimizations
Cluster, Configuration, Writing, Reading
02
hosted by
Optimizations
Multiple Cluster
• Read/Write Splitting
• Platform/Application Splitting
hosted by
Optimizations
Configuration
• Region Balanced By Table
• G1GC
• Daily/Weekly MajorCompaction
• Pre-split Regions...
hosted by
Optimizations
Reading
• Snapshot read
• ColumnPrefixFilter
• Data Blacklist
• Incremental read
• ……
• Minimal im...
hosted by
Optimizations
writing
• Write HFiles ->Do bulkLoad
• Skip WAL
• ……
• Minimal impact on RS
hosted by
Problems
Copy failed, Compact never end
03
hosted by
Problems
Failed to copy tables
1.Create snapshot
2.Export snapshot
3.Disable table
4.Restore snapshot
5.Enable t...
hosted by
Problems
Failed to copy tables
Problems
1.Failed to create snapshot
(Timeout/Wasn’t complet in expectedTime)
Sol...
hosted by
Problems
Failed to copy tables
Problems
1. Failed to export snapshot
(File not found)
Solutions 1. Disable table...
hosted by
Problems
Compact never ends
Problems
Solutions
Recreate table without this ‘PREFIX_TREE’
hosted by
Problems
Unresolved
• Table unavailable at the end of copy
(disable->restore->enable)
• Region’s locality drop t...
hosted by
Future Works
Pheonix, Realtime
04
hosted by
Future Works
Pheonix
Apps
Real-time
Customer View
On Phoenix
Business Systems
Kafka
Shareplex
sparkstreaming
• F...
hosted by
Future Works
Real-time
• Real-time detail tables and label tables
hosted by
Thanks
Upcoming SlideShare
Loading in …5
×

HBaseConAsia2018 Track3-3: HBase at China Life Insurance

309 views

Published on

Zheng Fan of China Life Insurance

Published in: Internet
  • Be the first to comment

HBaseConAsia2018 Track3-3: HBase at China Life Insurance

  1. 1. hosted by HBase Practice At China Life Insurance Co., Ltd. Fan Zheng August 17,2018
  2. 2. hosted by Content 01 02 04 03 Scenarios Integration, Processing, Query, Export Optimizations Cluster, Configuration, Writing, Reading Problems Copy failed, Compact never end Future Work Pheonix, Realtime
  3. 3. hosted by Scenarios Integration, Processing, Query, Export 01
  4. 4. hosted by Scenarios Overview 1 2 3 4 Processing Fast batch R/W through snapshot, bulkload… Export Hive external table support Integration Easy update, schemaless, millions of columns Query High concurrency & Low latency • Storage for integration • Data source for query and processing
  5. 5. hosted by Scale Cluster 3 Clusters 200+ Nodes Data HBase data 300TB+ largest table 30TB+ , 2500 regions Processing Hundreds of MR/Hive/Spark jobs per day 50TB+ Incremental data for update&insert Scenarios Queries Tens of millions of queries per day
  6. 6. hosted by Integration Scenarios policy_detail Business Systems Sales management systems ECIF CallCenter Customer contact system …… Customer_detailAgent_detail …… • Integrate policy, customer, agent… data from various systems
  7. 7. hosted by Integration Scenarios • Data integrated by the key of business entity. • One row for all information
  8. 8. hosted by Integration Scenarios • RDBMS->Textfile->Hfile->BulkLoad to HBase
  9. 9. hosted by Processing Scenarios • Analysis of entities (customer,policy …) . • Build indexes between entities.
  10. 10. hosted by Processing Scenarios • Processing hundreds of labels within one time I/O
  11. 11. hosted by Processing Scenarios • Encapsulated, Configurable, Sql-like development framework Development: • Sql-like&Native API • Off-line Unit Test …… Runtime Configuration: • Name of Source/target table • Read Table/Snapshot • Write Hive/Hbase • Rowkey blacklist • Init/Incr switch • BulkLoad switch • ColumnPrefixFilter switch • Single Row Processing • Single Label Processing ……
  12. 12. hosted by Query Scenarios • Unified query services centering on entities such as customer
  13. 13. hosted by Query Scenarios • Sql-like input • One Service for All labels
  14. 14. hosted by Export Scenarios • Exporting to Hive to support batch query and analysis
  15. 15. hosted by Export Scenarios • Universal MR, read snapshot
  16. 16. hosted by Optimizations Cluster, Configuration, Writing, Reading 02
  17. 17. hosted by Optimizations Multiple Cluster • Read/Write Splitting • Platform/Application Splitting
  18. 18. hosted by Optimizations Configuration • Region Balanced By Table • G1GC • Daily/Weekly MajorCompaction • Pre-split Regions • …… • Minimal impact on RS
  19. 19. hosted by Optimizations Reading • Snapshot read • ColumnPrefixFilter • Data Blacklist • Incremental read • …… • Minimal impact on RS
  20. 20. hosted by Optimizations writing • Write HFiles ->Do bulkLoad • Skip WAL • …… • Minimal impact on RS
  21. 21. hosted by Problems Copy failed, Compact never end 03
  22. 22. hosted by Problems Failed to copy tables 1.Create snapshot 2.Export snapshot 3.Disable table 4.Restore snapshot 5.Enable table 6.Delete snapshot Processes Problems 1.Failed to create snapshot (Wasn’t complet in expectedTime) 2. Failed to export snapshot (File not found)
  23. 23. hosted by Problems Failed to copy tables Problems 1.Failed to create snapshot (Timeout/Wasn’t complet in expectedTime) Solutions 1.Put required column into a small table. 2.Wait and retry. 3.Disable table. 4.Disable region balance temporarily. 5.skip_flush 6.Increase the timeout limit Reasons 1. compacting/splitting/transiting while creating snapshot 2. Spending to much time on flushing
  24. 24. hosted by Problems Failed to copy tables Problems 1. Failed to export snapshot (File not found) Solutions 1. Disable table before copy. 2. Fix bug to search hfile correctly Reasons 1. compaction/split happend after snapshot creation 2. Bug in ExportSnapshot
  25. 25. hosted by Problems Compact never ends Problems Solutions Recreate table without this ‘PREFIX_TREE’
  26. 26. hosted by Problems Unresolved • Table unavailable at the end of copy (disable->restore->enable) • Region’s locality drop to 0 • ……
  27. 27. hosted by Future Works Pheonix, Realtime 04
  28. 28. hosted by Future Works Pheonix Apps Real-time Customer View On Phoenix Business Systems Kafka Shareplex sparkstreaming • Flexible, real-time, precise query scenario
  29. 29. hosted by Future Works Real-time • Real-time detail tables and label tables
  30. 30. hosted by Thanks

×