SlideShare a Scribd company logo
How We Migrate PBs Data from Beijing to Shanghai
Wang Yuxi, Umeng
w@umeng.com
Agenda
● Why migrating
● Current Infrastructure
● Environment Setup
● Data Transfer(HBase)
● Data Transfer (MongoDB)
● Data Transfer (Mysql/Redis)
● Application Provision
● Monitoring
● Benchmark and Stress Testing
● Go!
● Results
● Recap
About Me
● Before 2014, the only ops at Umeng
● Now, core member of ops team
● Technical generalist, responsible for the overall reliability and performance
of Umeng
● ArchLinux user
@Jasey_Wang | http://JaseyWang.Me
About Umeng
● Founded on April 2010
● Incubated by Innovation
Works
● $10 Million raised from
Matrix China
● Acquired by Alibaba
● Largest Mobile app
analytical platform in
China
● 400K+ Apps
● ~1B mobile device
Why Migrating
● Capex/Opex
● Unlimited resources, no worries for IDC, Racks, Bandwidth, etc.
● Integration with group's internal systems, massive advanced tools
● Make our PBs data safer
Current Infrastructure
● Data center: 4
● Server: 1000
● Networking device: 100
● Bandwidth: 4Gbps+
● Realtime analytics: 150K qps
● Batch processing: 4P/5P storage usage
● Know more? see Umeng Operations Infrastructure & Practice
Current Infrastructure(Cont.)
Environment Setup
● 1G dedicated fiber between Beijing and Shanghai ready
● Due to security reason, can only send SYN from Shanghai to Beijing
● Setup DNAT for Beijing cluster
● Raw data transfer test, saturate the bandwidth(iperf -P/netperf)
Data transfer(HBase)
● HBase, 0.94 @ Beijing, 0.98 @ Shanghai
● Build-in import/export tools don’t work
● Write own import/export tool, integrity check
● Task scheduler, historical & daily incremental data
● 2 months data transfer
Data transfer(HBase)(Cont.)
Data transfer(MongoDB)
● Master/Slave
○ 2.4.11, obsolete
○ build-in replica mechanism
● Primary/Secondary/Arbiter
○ replica arch
○ since Beijing can’t connect to Shanghai, write tools to read Oplog and
replay into Shanghai DB cluster
● Oplog, lag, TCP keepalive, slow query, dead lock
Data Transfer(MySQL/Redis)
● MySQL
○ percona-xtrabackup, quite handy
● Redis
○ single instance, slaveof command
○ twemproxy, export and import later by tools
Application Provision
● Stateless or stateful?
● Kafka & Mirror
○ consumer/producer queue
○ qps, topics, io
● Storm
○ throughput, lag
● Zookeeper
○ 5 nodes, 4 letter words, log cleaning
Monitoring
● Internal monitoring system backed by HBase
● Zabbix, Ganglia
● Graphite for metrics
● Monit for process monitoring
Benchmark and Stress Testing
● Single component
○ system level metrics
○ application level metrics
● Multiple components
● Part of online traffic
Go!
● What about PLAN B or C?
○ plan B usually does not work
● Friday night from 00:00 to 08:00
● Route tens of products traffic to Shanghai smoothly
● The site is fully available without outage
Results
● Sophisticated projects, all members in
● 6 months work pays off
● Hugely successful, no roll-back
● Part of products now running on private cloud
Recap
● Tools are #1 productive forces
● Test, test and test
● Monitoring and metrics
End
Q & A

More Related Content

What's hot

[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
Anna Ossowski
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in Spark
Digital Vidya
 
Rootconf 2017 - State of the Open Source monitoring landscape
Rootconf 2017 - State of the Open Source monitoring landscape Rootconf 2017 - State of the Open Source monitoring landscape
Rootconf 2017 - State of the Open Source monitoring landscape
NETWAYS
 
Database ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFiDatabase ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFi
Lucian Neghina
 
Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid
arupmalakar
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
kbajda
 
Apache flink
Apache flinkApache flink
Apache flink
pranay kumar
 
CDC to the Max!
CDC to the Max!CDC to the Max!
CDC to the Max!
Bronco Oostermeyer
 
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life ExperiencesaOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
Rene Modery
 
Build intelligent, real-time applications using Machine Learning
Build intelligent, real-time applications using Machine LearningBuild intelligent, real-time applications using Machine Learning
Build intelligent, real-time applications using Machine Learning
Hotstar
 
Cypher for Apache Spark
Cypher for Apache SparkCypher for Apache Spark
Cypher for Apache Spark
openCypher
 
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen LiTowards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
Bowen Li
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
DataWorks Summit
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Flink Forward
 
Improving Mobile Payments With Real time Spark
Improving Mobile Payments With Real time SparkImproving Mobile Payments With Real time Spark
Improving Mobile Payments With Real time Spark
datamantra
 
Grafana 7.0
Grafana 7.0Grafana 7.0
Grafana 7.0
Juraj Hantak
 
ResourceSpace: Recent pains and future gains
ResourceSpace: Recent pains and future gainsResourceSpace: Recent pains and future gains
ResourceSpace: Recent pains and future gains
ResourceSpace
 
University program - writing an apache apex application
University program  - writing an apache apex applicationUniversity program  - writing an apache apex application
University program - writing an apache apex application
Akshay Gore
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your Analytics
Yaroslav Tkachenko
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
kbajda
 

What's hot (20)

[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
[Virtual Meetup] Using Elasticsearch as a Time-Series Database in the Endpoin...
 
Structured Streaming in Spark
Structured Streaming in SparkStructured Streaming in Spark
Structured Streaming in Spark
 
Rootconf 2017 - State of the Open Source monitoring landscape
Rootconf 2017 - State of the Open Source monitoring landscape Rootconf 2017 - State of the Open Source monitoring landscape
Rootconf 2017 - State of the Open Source monitoring landscape
 
Database ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFiDatabase ingest with Apache NiFi and MiNiFi
Database ingest with Apache NiFi and MiNiFi
 
Streaming sql and druid
Streaming sql and druid Streaming sql and druid
Streaming sql and druid
 
Presto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix ContainersPresto Summit 2018 - 04 - Netflix Containers
Presto Summit 2018 - 04 - Netflix Containers
 
Apache flink
Apache flinkApache flink
Apache flink
 
CDC to the Max!
CDC to the Max!CDC to the Max!
CDC to the Max!
 
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life ExperiencesaOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
aOS Kuala Lumpur - Migrating to SharePoint Online - Real-life Experiences
 
Build intelligent, real-time applications using Machine Learning
Build intelligent, real-time applications using Machine LearningBuild intelligent, real-time applications using Machine Learning
Build intelligent, real-time applications using Machine Learning
 
Cypher for Apache Spark
Cypher for Apache SparkCypher for Apache Spark
Cypher for Apache Spark
 
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen LiTowards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
Towards Apache Flink 2.0 - Unified Data Processing and Beyond, Bowen Li
 
Iceberg: a fast table format for S3
Iceberg: a fast table format for S3Iceberg: a fast table format for S3
Iceberg: a fast table format for S3
 
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...Unify Enterprise Data Processing System Platform Level Integration of Flink a...
Unify Enterprise Data Processing System Platform Level Integration of Flink a...
 
Improving Mobile Payments With Real time Spark
Improving Mobile Payments With Real time SparkImproving Mobile Payments With Real time Spark
Improving Mobile Payments With Real time Spark
 
Grafana 7.0
Grafana 7.0Grafana 7.0
Grafana 7.0
 
ResourceSpace: Recent pains and future gains
ResourceSpace: Recent pains and future gainsResourceSpace: Recent pains and future gains
ResourceSpace: Recent pains and future gains
 
University program - writing an apache apex application
University program  - writing an apache apex applicationUniversity program  - writing an apache apex application
University program - writing an apache apex application
 
Storing State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your AnalyticsStoring State Forever: Why It Can Be Good For Your Analytics
Storing State Forever: Why It Can Be Good For Your Analytics
 
Presto Summit 2018 - 03 - Starburst CBO
Presto Summit 2018  - 03 - Starburst CBOPresto Summit 2018  - 03 - Starburst CBO
Presto Summit 2018 - 03 - Starburst CBO
 

Similar to How We Migrate PBs Data from Beijing to Shanghai

Are we there yet?
Are we there yet?Are we there yet?
Are we there yet?
Johann Höchtl
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
Camille Salas
 
About VisualDNA Architecture @ Rubyslava 2014
About VisualDNA Architecture @ Rubyslava 2014About VisualDNA Architecture @ Rubyslava 2014
About VisualDNA Architecture @ Rubyslava 2014
Michal Harish
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
StampedeCon
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Demi Ben-Ari
 
Go at uber
Go at uberGo at uber
Go at uber
Rob Skillington
 
Testing data streaming applications
Testing data streaming applicationsTesting data streaming applications
Testing data streaming applications
Lars Albertsson
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009
fschupp
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
Omid Vahdaty
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
Omid Vahdaty
 
Designing for operability and managability
Designing for operability and managabilityDesigning for operability and managability
Designing for operability and managability
Gaurav Bahrani
 
Machine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsMachine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systems
Zhenxiao Luo
 
TRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseTRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use Case
Hakan Ilter
 
Apache Airflow in Production
Apache Airflow in ProductionApache Airflow in Production
Apache Airflow in Production
Robert Sanders
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Omid Vahdaty
 
Introducing TiDB Operator
Introducing TiDB OperatorIntroducing TiDB Operator
Introducing TiDB Operator
Kevin Xu
 
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Neo4j
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
Itai Yaffe
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Alluxio, Inc.
 

Similar to How We Migrate PBs Data from Beijing to Shanghai (20)

Are we there yet?
Are we there yet?Are we there yet?
Are we there yet?
 
#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR#RADC4L16: An API-First Archives Approach at NPR
#RADC4L16: An API-First Archives Approach at NPR
 
About VisualDNA Architecture @ Rubyslava 2014
About VisualDNA Architecture @ Rubyslava 2014About VisualDNA Architecture @ Rubyslava 2014
About VisualDNA Architecture @ Rubyslava 2014
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-AriThinking DevOps in the era of the Cloud - Demi Ben-Ari
Thinking DevOps in the era of the Cloud - Demi Ben-Ari
 
Go at uber
Go at uberGo at uber
Go at uber
 
Testing data streaming applications
Testing data streaming applicationsTesting data streaming applications
Testing data streaming applications
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009
 
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | EnglishAWS big-data-demystified #1.1  | Big Data Architecture Lessons Learned | English
AWS big-data-demystified #1.1 | Big Data Architecture Lessons Learned | English
 
AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned AWS Big Data Demystified #1: Big data architecture lessons learned
AWS Big Data Demystified #1: Big data architecture lessons learned
 
Designing for operability and managability
Designing for operability and managabilityDesigning for operability and managability
Designing for operability and managability
 
Machine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systemsMachine learning and big data @ uber a tale of two systems
Machine learning and big data @ uber a tale of two systems
 
TRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use CaseTRHUG 2015 - Veloxity Big Data Migration Use Case
TRHUG 2015 - Veloxity Big Data Migration Use Case
 
Apache Airflow in Production
Apache Airflow in ProductionApache Airflow in Production
Apache Airflow in Production
 
Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3  Big Data in 200 km/h | AWS Big Data Demystified #1.3
Big Data in 200 km/h | AWS Big Data Demystified #1.3
 
Introducing TiDB Operator
Introducing TiDB OperatorIntroducing TiDB Operator
Introducing TiDB Operator
 
AmazonRedshift
AmazonRedshiftAmazonRedshift
AmazonRedshift
 
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
 
A Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's RoadmapA Day in the Life of a Druid Implementor and Druid's Roadmap
A Day in the Life of a Druid Implementor and Druid's Roadmap
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
 

More from Elmer Brown

CDN 行业研究报告
CDN 行业研究报告CDN 行业研究报告
CDN 行业研究报告
Elmer Brown
 
OAuth 简介
OAuth 简介OAuth 简介
OAuth 简介
Elmer Brown
 
Archlinux 更适合开发人员的发行版
Archlinux 更适合开发人员的发行版Archlinux 更适合开发人员的发行版
Archlinux 更适合开发人员的发行版
Elmer Brown
 
Google,产品线与开源相关
Google,产品线与开源相关Google,产品线与开源相关
Google,产品线与开源相关
Elmer Brown
 
How to become a free software hacker
How to become a free software hackerHow to become a free software hacker
How to become a free software hacker
Elmer Brown
 
Ubuntu Natty 11.04 新特性
Ubuntu Natty 11.04 新特性Ubuntu Natty 11.04 新特性
Ubuntu Natty 11.04 新特性
Elmer Brown
 
火狐俱乐部有奖问答题
火狐俱乐部有奖问答题火狐俱乐部有奖问答题
火狐俱乐部有奖问答题
Elmer Brown
 

More from Elmer Brown (9)

CDN 行业研究报告
CDN 行业研究报告CDN 行业研究报告
CDN 行业研究报告
 
OAuth 简介
OAuth 简介OAuth 简介
OAuth 简介
 
Archlinux 更适合开发人员的发行版
Archlinux 更适合开发人员的发行版Archlinux 更适合开发人员的发行版
Archlinux 更适合开发人员的发行版
 
Google,产品线与开源相关
Google,产品线与开源相关Google,产品线与开源相关
Google,产品线与开源相关
 
How to become a free software hacker
How to become a free software hackerHow to become a free software hacker
How to become a free software hacker
 
Ubuntu Natty 11.04 新特性
Ubuntu Natty 11.04 新特性Ubuntu Natty 11.04 新特性
Ubuntu Natty 11.04 新特性
 
PKI
PKIPKI
PKI
 
火狐俱乐部有奖问答题
火狐俱乐部有奖问答题火狐俱乐部有奖问答题
火狐俱乐部有奖问答题
 
Gnu linux-start
Gnu linux-startGnu linux-start
Gnu linux-start
 

Recently uploaded

Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
CIOWomenMagazine
 
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaalmanuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
wolfsoftcompanyco
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
Danica Gill
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
3ipehhoa
 
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
zoowe
 
Search Result Showing My Post is Now Buried
Search Result Showing My Post is Now BuriedSearch Result Showing My Post is Now Buried
Search Result Showing My Post is Now Buried
Trish Parr
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
zyfovom
 
Bài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docxBài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docx
nhiyenphan2005
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
eutxy
 
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
ysasp1
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC
 
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Florence Consulting
 
Explore-Insanony: Watch Instagram Stories Secretly
Explore-Insanony: Watch Instagram Stories SecretlyExplore-Insanony: Watch Instagram Stories Secretly
Explore-Insanony: Watch Instagram Stories Secretly
Trending Blogers
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
3ipehhoa
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
Javier Lasa
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
Arif0071
 
Gen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needsGen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needs
Laura Szabó
 
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
vmemo1
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
harveenkaur52
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
SEO Article Boost
 

Recently uploaded (20)

Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
Internet of Things in Manufacturing: Revolutionizing Efficiency & Quality | C...
 
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaalmanuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
manuaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaal
 
7 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 20247 Best Cloud Hosting Services to Try Out in 2024
7 Best Cloud Hosting Services to Try Out in 2024
 
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
急速办(bedfordhire毕业证书)英国贝德福特大学毕业证成绩单原版一模一样
 
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
国外证书(Lincoln毕业证)新西兰林肯大学毕业证成绩单不能毕业办理
 
Search Result Showing My Post is Now Buried
Search Result Showing My Post is Now BuriedSearch Result Showing My Post is Now Buried
Search Result Showing My Post is Now Buried
 
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
学位认证网(DU毕业证)迪肯大学毕业证成绩单一比一原版制作
 
Bài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docxBài tập unit 1 English in the world.docx
Bài tập unit 1 English in the world.docx
 
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
一比一原版(LBS毕业证)伦敦商学院毕业证成绩单专业办理
 
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
成绩单ps(UST毕业证)圣托马斯大学毕业证成绩单快速办理
 
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
APNIC Foundation, presented by Ellisha Heppner at the PNG DNS Forum 2024
 
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdfMeet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
Meet up Milano 14 _ Axpo Italia_ Migration from Mule3 (On-prem) to.pdf
 
Explore-Insanony: Watch Instagram Stories Secretly
Explore-Insanony: Watch Instagram Stories SecretlyExplore-Insanony: Watch Instagram Stories Secretly
Explore-Insanony: Watch Instagram Stories Secretly
 
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
原版仿制(uob毕业证书)英国伯明翰大学毕业证本科学历证书原版一模一样
 
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdfJAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
JAVIER LASA-EXPERIENCIA digital 1986-2024.pdf
 
test test test test testtest test testtest test testtest test testtest test ...
test test  test test testtest test testtest test testtest test testtest test ...test test  test test testtest test testtest test testtest test testtest test ...
test test test test testtest test testtest test testtest test testtest test ...
 
Gen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needsGen Z and the marketplaces - let's translate their needs
Gen Z and the marketplaces - let's translate their needs
 
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
重新申请毕业证书(RMIT毕业证)皇家墨尔本理工大学毕业证成绩单精仿办理
 
Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027Italy Agriculture Equipment Market Outlook to 2027
Italy Agriculture Equipment Market Outlook to 2027
 
Understanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdfUnderstanding User Behavior with Google Analytics.pdf
Understanding User Behavior with Google Analytics.pdf
 

How We Migrate PBs Data from Beijing to Shanghai

  • 1. How We Migrate PBs Data from Beijing to Shanghai Wang Yuxi, Umeng w@umeng.com
  • 2. Agenda ● Why migrating ● Current Infrastructure ● Environment Setup ● Data Transfer(HBase) ● Data Transfer (MongoDB) ● Data Transfer (Mysql/Redis) ● Application Provision ● Monitoring ● Benchmark and Stress Testing ● Go! ● Results ● Recap
  • 3. About Me ● Before 2014, the only ops at Umeng ● Now, core member of ops team ● Technical generalist, responsible for the overall reliability and performance of Umeng ● ArchLinux user @Jasey_Wang | http://JaseyWang.Me
  • 4. About Umeng ● Founded on April 2010 ● Incubated by Innovation Works ● $10 Million raised from Matrix China ● Acquired by Alibaba ● Largest Mobile app analytical platform in China ● 400K+ Apps ● ~1B mobile device
  • 5. Why Migrating ● Capex/Opex ● Unlimited resources, no worries for IDC, Racks, Bandwidth, etc. ● Integration with group's internal systems, massive advanced tools ● Make our PBs data safer
  • 6. Current Infrastructure ● Data center: 4 ● Server: 1000 ● Networking device: 100 ● Bandwidth: 4Gbps+ ● Realtime analytics: 150K qps ● Batch processing: 4P/5P storage usage ● Know more? see Umeng Operations Infrastructure & Practice
  • 8. Environment Setup ● 1G dedicated fiber between Beijing and Shanghai ready ● Due to security reason, can only send SYN from Shanghai to Beijing ● Setup DNAT for Beijing cluster ● Raw data transfer test, saturate the bandwidth(iperf -P/netperf)
  • 9. Data transfer(HBase) ● HBase, 0.94 @ Beijing, 0.98 @ Shanghai ● Build-in import/export tools don’t work ● Write own import/export tool, integrity check ● Task scheduler, historical & daily incremental data ● 2 months data transfer
  • 11. Data transfer(MongoDB) ● Master/Slave ○ 2.4.11, obsolete ○ build-in replica mechanism ● Primary/Secondary/Arbiter ○ replica arch ○ since Beijing can’t connect to Shanghai, write tools to read Oplog and replay into Shanghai DB cluster ● Oplog, lag, TCP keepalive, slow query, dead lock
  • 12. Data Transfer(MySQL/Redis) ● MySQL ○ percona-xtrabackup, quite handy ● Redis ○ single instance, slaveof command ○ twemproxy, export and import later by tools
  • 13. Application Provision ● Stateless or stateful? ● Kafka & Mirror ○ consumer/producer queue ○ qps, topics, io ● Storm ○ throughput, lag ● Zookeeper ○ 5 nodes, 4 letter words, log cleaning
  • 14. Monitoring ● Internal monitoring system backed by HBase ● Zabbix, Ganglia ● Graphite for metrics ● Monit for process monitoring
  • 15. Benchmark and Stress Testing ● Single component ○ system level metrics ○ application level metrics ● Multiple components ● Part of online traffic
  • 16. Go! ● What about PLAN B or C? ○ plan B usually does not work ● Friday night from 00:00 to 08:00 ● Route tens of products traffic to Shanghai smoothly ● The site is fully available without outage
  • 17. Results ● Sophisticated projects, all members in ● 6 months work pays off ● Hugely successful, no roll-back ● Part of products now running on private cloud
  • 18. Recap ● Tools are #1 productive forces ● Test, test and test ● Monitoring and metrics