SERVICE PRODUCTION

Walter Liu

2012/06/11




Confidential | Copyright 2012 Trend Micro Inc. | 12/24/2012
About me
• Architect in Core Tech WRS
  • Trend Micro 2007~ Now
• Chief Technical Director in Netgame Dep.
  • Softstar Inc. Taiwan 1998~2007
• Expertise:
  • Backend service development and operation
It’s coming ……
The beginning




Assault - Error 12 !!!




The nightmare Error 37
D3 packages out of stock in Taiwan
     • D3 packages are out of stock in the three largest convenience
       store chains in Taiwan and in all game shops.
     • Some people shared that they finally got packages on Jibei
       island.
     • Many people shared that they chased convenience store
       delivery trucks to get the packages.




Taiwan Game Cards Out of Stock




Blizzard Korea: lower priority for Taiwan IPs
[Figure labels: Korean IP, Taiwan IP]
Questions?
What are the most important
             things to users?



What mistakes did Blizzard
             make in this D3 service
             production?


Why Quantity Estimation?
    Cost Effective


 What if it is wrong?
Scalability & Elasticity
Fail in Scalability
[Chart: Performance, y-axis 0 to 1200, x-axis 0 to 11]
EPIC Fail in Scalability
[Chart: Performance, y-axis 0 to 1200, x-axis 0 to 10]
Ideal Horizontal Scalability
[Chart: Performance, y-axis 0 to 2500, x-axis 0 to 11]
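One way to put a number on the contrast between these charts is scaling efficiency: throughput measured with n servers divided by n times the single-server throughput. The sketch below is only an illustration. The helper name `scaling_efficiency` and all measurements are made up and are not read from the charts.

```python
def scaling_efficiency(throughput_by_servers):
    """Return {n: efficiency}, where efficiency = throughput(n) / (n * throughput(1)).

    1.0 means ideal horizontal scaling; values falling toward 0 mean
    that each added server contributes less and less.
    """
    base = throughput_by_servers[1]
    return {n: t / (n * base) for n, t in sorted(throughput_by_servers.items())}


# Hypothetical measurements (requests/sec), for illustration only.
ideal = {1: 200, 2: 400, 5: 1000, 10: 2000}
failing = {1: 200, 2: 380, 5: 700, 10: 900}

ideal_eff = scaling_efficiency(ideal)
failing_eff = scaling_efficiency(failing)
print(ideal_eff)
print(failing_eff)
```

An efficiency that stays near 1.0 as servers are added matches the "ideal" chart; one that keeps dropping matches the "fail" charts.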
Elasticity - Unpredictable Traffic
Elasticity - Cloud Solution
Elasticity – Others
• Elastic Application Architecture.
• Several flexible hardware providers.
• Flexible ISPs and pricing.
• ……
Customer Service and Social
     Communication
     • Blizzard is doing pretty badly.
       • FB event: closed-beta accounts in Taiwan.
       • Needs to build an image of helping its users. For example:
             • No explanation about incidents.




Incidents happen
    - Especially when your service goes into production.
Avengers Assemble !!!
Fast and Responsive Organization
• Teams
• Effective Communication
• Awareness
Fast and Responsive Process
• Incident management
• Problem management
Some other practices
• Interlock with related teams at the beginning and middle of the project.
   • Customer service prepares resources for bursts of incoming
     calls/tickets.
   • Customer service prepares training for the new service/product.
   • The data center team gives advice and a plan for your project.
• Recruit a Service Manager
  • Fail case: something does not belong to any team.
  • Have someone responsible for the whole service.
• Update/patch/change SOP
  • Fail case: The service changed, but your CS doesn’t know about it. Your
    customers are confused when they call your CS.
  • Fail case: The service changed and caused some trouble, but your
    service manager said he didn’t decide/say it.
Not related to these Diablo 3 failures, but
     important to any system
     • Availability
     • Security
     • Easy to administer
       • System Health/Statistics Monitoring
       • Easy Deployment
       • Easy Configuration
     • Risk Management




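A word from our sponsor
     • Web Reputation Service
       • Parental Control and Productivity Control
             • For example, the HiNet adult-content gatekeeper service, or the child-protection lock in antivirus software
             • Or a company that does not want employees viewing pornography, gambling, and similar sites at work
        • Web Threat Protection
             • Blocks malicious web pages such as virus/trojan downloads, phishing sites, and so on.

     • Advanced Persistent Threat
       • Targets specific victims
       • Spoofed emails or other means
       • Low-profile and slow
       • Customized malicious components
       • Installs remote-control tools
       • Sends out stolen intelligence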
WRS – Parental Control
WRS – Web Threat Protection
Funny Diablo 3 sales on Taobao, China




Thank You!
Risk Management
     • Identify critical failures
     • Develop a feasible plan to stabilize customer
       satisfaction.
        • Workaround.
        • Rollback.




Quantity Estimation
     • Goal: Cost Effective Quantity Estimation
       • For estimating
       • For wrong estimation
     • What if the estimation is not correct?
       • Too few
       • Too many
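The "too few" versus "too many" trade-off can be made concrete with a small cost model. This is a sketch under simple assumptions: unused capacity costs a fixed amount per unit, and unmet demand costs a fixed penalty per unit. The function name, parameters, and numbers are all hypothetical.

```python
def misestimation_cost(provisioned, actual, unit_cost, shortfall_penalty):
    """Cost of a wrong estimate, in the same currency as the inputs.

    provisioned: capacity (or stock) prepared in advance
    actual: real demand
    unit_cost: cost of one unit of capacity that goes unused
    shortfall_penalty: estimated loss per unit of unmet demand
    """
    if provisioned >= actual:
        return (provisioned - actual) * unit_cost      # too many: idle capacity
    return (actual - provisioned) * shortfall_penalty  # too few: unmet demand


# Illustrative numbers only.
too_many = misestimation_cost(provisioned=120_000, actual=100_000,
                              unit_cost=2, shortfall_penalty=10)
too_few = misestimation_cost(provisioned=80_000, actual=100_000,
                             unit_cost=2, shortfall_penalty=10)
print(too_many, too_few)
```

When the shortfall penalty is much larger than the unit cost, as assumed here, erring on the side of "too many" is the cheaper mistake.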




Scalability & Elasticity
     • Scalability
       • Is your application horizontally scalable?
     • Elasticity
       • Speed of commissioning / decommissioning
       • Maximum amount of resources that can be brought in
       • Granularity of usage accounting
     • Develop a plan for high traffic.
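As a rough aid for a high-traffic plan, the sketch below estimates how long it takes to bring enough servers online, using two of the elasticity properties listed above: speed of commissioning and the maximum amount of resources available. It assumes servers arrive in fixed-size batches at a fixed interval. The function and every number are hypothetical.

```python
import math


def minutes_to_meet_demand(current, needed, batch_size, minutes_per_batch, max_servers):
    """Return minutes until `needed` servers are online.

    Returns 0 if capacity is already sufficient, and None if the
    provider's ceiling (`max_servers`) makes the target unreachable.
    """
    if needed <= current:
        return 0
    if needed > max_servers:
        return None
    batches = math.ceil((needed - current) / batch_size)
    return batches * minutes_per_batch


# Illustrative numbers only.
reachable = minutes_to_meet_demand(current=20, needed=65, batch_size=10,
                                   minutes_per_batch=15, max_servers=100)
capped = minutes_to_meet_demand(current=20, needed=65, batch_size=10,
                                minutes_per_batch=15, max_servers=50)
print(reachable, capped)
```

If the answer is longer than users will tolerate, or None, the plan needs a faster provider, a higher ceiling, or capacity commissioned before launch.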




Dare to fail
• Failure is inevitable for fast-changing applications
  (web, services).
• Fast changing
• Create a dare-to-fail process and environment
  • Facebook
  • Backup plan
  • Rollback plan
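A rollback plan can be built into the deployment step itself. The sketch below is one minimal way to do it, assuming the deploy, health-check, and rollback actions are supplied as plain callables. `deploy_with_rollback` is a hypothetical helper, not part of any specific tool.

```python
def deploy_with_rollback(deploy, health_check, rollback):
    """Run deploy(); if it raises or the health check fails, run rollback().

    Returns True if the new version stays in place, False if rolled back.
    """
    try:
        deploy()
        healthy = bool(health_check())
    except Exception:
        healthy = False
    if not healthy:
        rollback()
    return healthy


# Simulated run: the health check fails, so the change is rolled back.
log = []
ok = deploy_with_rollback(
    deploy=lambda: log.append("deploy v2"),
    health_check=lambda: False,
    rollback=lambda: log.append("rollback to v1"),
)
print(ok, log)
```

Making rollback automatic keeps the cost of a failed change low, which is what makes daring to fail affordable.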

Service production from d3 pitfall viewpoint


Editor's Notes

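  • #4 After a long 10-year wait, the mistress of men and public enemy of women has finally arrived on Earth, ready to destroy the world. Besides working hard writing code at the office, engineers also have to work hard saving the world after hours.
  • #5 Raise your hand if you know Diablo. Raise your hand if you are playing it.
  • #6 Being able to use the service they want to use. Being able to use most of the main features.
  • #7 The supply chain slipped up. Quantity estimation was wrong (for both physical packages and the number of players). The service infrastructure was not elastic enough. Customer service and community management in Taiwan were very poor at responding to outside voices.
  • #14 Goal: Cost Effective Quantity Estimation. For estimating. For wrong estimation. What if the estimation is not correct? Too few. Too many.
  • #15 Closed-beta account event in Taiwan: foolish marketing that started half a year earlier and handed accounts out like charity. The D3 closed-beta accounts Taiwan received all had to be fought over on FB: a picture was posted, a crowd of people replied with numbers, and then one number was picked to receive a CB account. Meanwhile, the Americas test simply gave out 100,000 accounts. About the same as how the Taiwanese government governs.
  • #25 Awareness: for example, RO money laundering; don’t hire/recruit people who don’t care about your users. Effective Communication: war room, MSN meeting room, etc. Team: all kinds of talent; include related people in your team.
  • #27 Targets specific victims: aimed at particular governments or enterprises. Customized malicious components: few in number and not circulating in the wild, so many antivirus vendors cannot catch them. Sends out stolen intelligence: transmitted encrypted, so many data leak prevention solutions cannot catch it.
  • #28 During Error 12, urgently granting all users closed-beta eligibility would let them get in and play.
  • #29 There is no single best way to estimate quantity. It all depends.
  • #30 Elasticity examples: have flexible hardware providers; use a cloud solution. Good scalability: the system can hold hundreds of thousands of people at the same time without many problems. Bad elasticity: the resources the system needs could not be added or removed quickly according to demand, so many players kept retrying on Error 37.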