How We Migrate PBs Data from Beijing to Shanghai
Wang Yuxi, Umeng
w@umeng.com

We spent more than six months migrating petabytes of data from Beijing to Shanghai.
These slides give a brief introduction to how we did it.
For background, see *Umeng Operations Infrastructure & Practice*: http://www.slideshare.net/jaseywang/umeng-operations-infrastructure-practice

Published in: Internet

  1. How We Migrate PBs Data from Beijing to Shanghai
     Wang Yuxi, Umeng
     w@umeng.com
  2. Agenda
     ● Why Migrate
     ● Current Infrastructure
     ● Environment Setup
     ● Data Transfer (HBase)
     ● Data Transfer (MongoDB)
     ● Data Transfer (MySQL/Redis)
     ● Application Provision
     ● Monitoring
     ● Benchmark and Stress Testing
     ● Go!
     ● Results
     ● Recap
  3. About Me
     ● Before 2014, the only ops engineer at Umeng
     ● Now a core member of the ops team
     ● Technical generalist, responsible for the overall reliability and performance of Umeng
     ● ArchLinux user
     @Jasey_Wang | http://JaseyWang.Me
  4. About Umeng
     ● Founded in April 2010
     ● Incubated by Innovation Works
     ● $10 million raised from Matrix China
     ● Acquired by Alibaba
     ● Largest mobile app analytics platform in China
     ● 400K+ apps
     ● ~1B mobile devices
  5. Why Migrate
     ● Capex/Opex
     ● Unlimited resources, no worries about IDC, racks, bandwidth, etc.
     ● Integration with the group's internal systems and its many advanced tools
     ● Make our petabytes of data safer
  6. Current Infrastructure
     ● Data centers: 4
     ● Servers: 1000
     ● Networking devices: 100
     ● Bandwidth: 4 Gbps+
     ● Realtime analytics: 150K qps
     ● Batch processing: 4 PB/5 PB storage usage
     ● Want to know more? See Umeng Operations Infrastructure & Practice
  7. Current Infrastructure (Cont.)
  8. Environment Setup
     ● 1 Gbps dedicated fiber link between Beijing and Shanghai in place
     ● For security reasons, SYN packets can only be sent from Shanghai to Beijing (connections must be initiated from Shanghai)
     ● Set up DNAT for the Beijing cluster
     ● Raw data transfer test to saturate the bandwidth (iperf -P / netperf)
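As a rough, back-of-the-envelope sketch of what a 1 Gbps link implies for bulk transfer, the calculation below estimates how long a given volume takes to move. The `efficiency` factor and the 500 TB volume are illustrative assumptions, not figures from the talk.

```python
def transfer_days(data_bytes: float, link_bps: float, efficiency: float = 1.0) -> float:
    """Days needed to move data_bytes over a link of link_bps bits per second.

    efficiency is the assumed fraction of the raw link rate that is actually
    achieved (protocol overhead, competing traffic); it is a hypothetical knob.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = data_bytes * 8 / (link_bps * efficiency)
    return seconds / 86_400  # seconds per day


TB = 10**12  # decimal terabyte

# Illustrative volume only: 500 TB over a fully saturated 1 Gbps link.
print(round(transfer_days(500 * TB, 1e9), 1))
# Same volume if only 80% of the link rate is achieved (assumed figure).
print(round(transfer_days(500 * TB, 1e9, efficiency=0.8), 1))
```

At full saturation a 1 Gbps link moves about 10.8 TB per day, which is why saturating the link in the raw transfer test matters before planning a long-running copy.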
  9. Data Transfer (HBase)
     ● HBase 0.94 @ Beijing, 0.98 @ Shanghai
     ● Built-in import/export tools don't work
     ● Wrote our own import/export tool, with integrity checks
     ● Task scheduler for historical and daily incremental data
     ● Data transfer took 2 months
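The slides do not describe how the integrity check works. One possible approach, shown here as a minimal sketch, is to hash every row on both sides and compare the digests. Tables are modelled as plain in-memory dictionaries; `row_digest` and `diff_tables` are hypothetical helper names, not part of HBase or of the authors' tool.

```python
import hashlib
from typing import Dict, List, Mapping, Tuple

Row = Mapping[bytes, bytes]  # column name -> cell value (simplified model)


def row_digest(key: bytes, row: Row) -> str:
    """SHA-256 over the row key and its cells, in sorted column order."""
    h = hashlib.sha256()
    h.update(len(key).to_bytes(4, "big") + key)
    for col, val in sorted(row.items()):
        for part in (col, val):
            # Length-prefix each part so adjacent fields cannot be confused.
            h.update(len(part).to_bytes(4, "big") + part)
    return h.hexdigest()


def diff_tables(src: Dict[bytes, Row], dst: Dict[bytes, Row]) -> Tuple[List[bytes], List[bytes]]:
    """Return (keys missing from dst, keys whose contents differ)."""
    missing = sorted(k for k in src if k not in dst)
    mismatched = sorted(
        k for k in src
        if k in dst and row_digest(k, src[k]) != row_digest(k, dst[k])
    )
    return missing, mismatched


beijing = {b"r1": {b"cf:a": b"1"}, b"r2": {b"cf:a": b"2"}, b"r3": {b"cf:a": b"3"}}
shanghai = {b"r1": {b"cf:a": b"1"}, b"r2": {b"cf:a": b"X"}}
print(diff_tables(beijing, shanghai))
```

In a real cluster the digests would be computed per key range on each side, so that only the digests, not the rows, cross the inter-city link.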
  10. Data Transfer (HBase) (Cont.)
  11. Data Transfer (MongoDB)
     ● Master/Slave
       ○ 2.4.11, obsolete
       ○ built-in replication mechanism
     ● Primary/Secondary/Arbiter
       ○ replica set architecture
       ○ since Beijing can't connect to Shanghai, wrote tools to read the oplog and replay it into the Shanghai DB cluster
     ● Oplog, lag, TCP keepalive, slow queries, deadlocks
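The idea of reading the oplog and replaying it on the destination can be sketched as follows. This is a simplified in-memory model, not the authors' tool: entries use the oplog's `op`/`ns`/`o`/`o2` fields, but `ts` is a plain integer here (a real oplog uses BSON timestamps), only `$set`/`$unset` updates are handled, and `apply_entry` and `replay` are hypothetical names.

```python
from typing import Any, Dict, Iterable

Store = Dict[str, Dict[Any, dict]]  # namespace -> {_id: document}


def apply_entry(store: Store, entry: dict) -> None:
    """Apply one insert ("i"), update ("u") or delete ("d") entry to store."""
    op = entry["op"]
    if op not in ("i", "u", "d"):
        return  # no-ops ("n") and commands ("c") are ignored in this sketch
    coll = store.setdefault(entry["ns"], {})
    if op == "i":
        coll[entry["o"]["_id"]] = dict(entry["o"])
    elif op == "u":
        _id = entry["o2"]["_id"]
        change = entry["o"]
        if "$set" in change or "$unset" in change:
            doc = coll.setdefault(_id, {"_id": _id})
            doc.update(change.get("$set", {}))
            for field in change.get("$unset", {}):
                doc.pop(field, None)
        else:
            coll[_id] = dict(change, _id=_id)  # full-document replacement
    else:
        coll.pop(entry["o"]["_id"], None)


def replay(store: Store, entries: Iterable[dict], last_ts: int = 0) -> int:
    """Replay entries newer than last_ts in order; return the new checkpoint."""
    for entry in sorted(entries, key=lambda e: e["ts"]):
        if entry["ts"] <= last_ts:
            continue  # already applied in an earlier run
        apply_entry(store, entry)
        last_ts = entry["ts"]
    return last_ts


oplog = [
    {"ts": 1, "op": "i", "ns": "app.users", "o": {"_id": 1, "name": "a"}},
    {"ts": 2, "op": "u", "ns": "app.users", "o2": {"_id": 1}, "o": {"$set": {"name": "b"}}},
    {"ts": 3, "op": "i", "ns": "app.users", "o": {"_id": 2, "name": "c"}},
    {"ts": 4, "op": "d", "ns": "app.users", "o": {"_id": 2}},
]
shanghai: Store = {}
checkpoint = replay(shanghai, oplog)
print(checkpoint, shanghai)
```

Persisting the returned checkpoint lets the replay resume after a dropped connection, and the gap between the checkpoint and the newest source entry is the replication lag.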
  12. Data Transfer (MySQL/Redis)
     ● MySQL
       ○ percona-xtrabackup, quite handy
     ● Redis
       ○ single instance: slaveof command
       ○ twemproxy: exported and imported later with tools
  13. Application Provision
     ● Stateless or stateful?
     ● Kafka & Mirror
       ○ consumer/producer queue
       ○ qps, topics, io
     ● Storm
       ○ throughput, lag
     ● ZooKeeper
       ○ 5 nodes, four-letter words, log cleaning
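Lag is one of the quantities listed above. For a Kafka consumer (for example a mirroring process or a Storm spout), a common definition is the log end offset minus the committed offset, per partition. The sketch below assumes the offsets have already been fetched into dictionaries; `consumer_lag` is a hypothetical helper, and treating a missing committed offset as 0 is an assumption of this example.

```python
from typing import Dict


def consumer_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Per-partition lag: log end offset minus committed offset, floored at 0."""
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


# Illustrative offsets for a three-partition topic.
end_offsets = {0: 100, 1: 250, 2: 40}
committed = {0: 90, 1: 250}  # partition 2 has no committed offset yet
lag = consumer_lag(end_offsets, committed)
print(lag, sum(lag.values()))
```

Tracking the total alongside the per-partition values helps distinguish a consumer that is uniformly slow from one that is stuck on a single partition.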
  14. Monitoring
     ● Internal monitoring system backed by HBase
     ● Zabbix, Ganglia
     ● Graphite for metrics
     ● Monit for process monitoring
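For the Graphite item, a minimal sketch of emitting a metric with Carbon's plaintext protocol, where each line has the form `path value timestamp`. The metric name and host are hypothetical, and port 2003 is assumed to be the plaintext listener (Carbon's usual default).

```python
import socket
import time
from typing import Iterable, Optional, Union


def format_metric(path: str, value: Union[int, float], timestamp: Optional[float] = None) -> str:
    """Build one plaintext-protocol line: '<path> <value> <unix_ts>\\n'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


def send_metrics(lines: Iterable[str], host: str, port: int = 2003) -> None:
    """Send pre-formatted lines to a Carbon plaintext listener over TCP."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


# Hypothetical metric name; the timestamp is fixed only to keep the example stable.
line = format_metric("migration.hbase.rows_copied", 42, 1400000000)
print(repr(line))
# send_metrics([line], "graphite.example.internal")  # hypothetical host, not run here
```

Emitting counters like this from the transfer tools themselves makes progress and stalls visible on the same dashboards as system metrics.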
  15. Benchmark and Stress Testing
     ● Single component
       ○ system-level metrics
       ○ application-level metrics
     ● Multiple components
     ● Part of online traffic
  16. Go!
     ● What about PLAN B or C?
       ○ plan B usually does not work
     ● Friday night from 00:00 to 08:00
     ● Routed traffic for tens of products to Shanghai smoothly
     ● The site stayed fully available, with no outage
  17. Results
     ● A complex project, with all team members involved
     ● 6 months of work paid off
     ● Hugely successful, no rollback
     ● Some products now run on a private cloud
  18. Recap
     ● Tools are the #1 productive force
     ● Test, test and test
     ● Monitoring and metrics
  19. End
     Q & A
