Siguccs presentation pre

  1. ACM SIGUCCS 2012 Service & Support Conference: Operational Experiences from the Viewpoint of University IT System Administrators in the Metropolitan Area on East Japan Great Earthquake. Kohichi Ogawa and Noriaki Yoshiura, Information Technology Center, Saitama University. Friday, October 19, 2012
  2. Great Earthquake and Great Tsunami
  11. Location of Earthquake and our University [Map labels: Damaged Areas; Epicenter; Saitama University; Tokyo; about 130 miles]
  12. Topics of the presentation • Energy Problems Caused by the Earthquake • Some Troubles in the Rolling Blackouts • Relocation to Data Center and VPS • Lessons and Experiences
  13. Outline: 1. Introduction • System at the time of the earthquake 2. Situation after the Earthquake • Immediately after the Earthquake • Operation for rolling power outage • Impact of rolling power outage 3. Countermeasures against this situation • Data Center • VPS 4. Effectiveness of countermeasures 5. New System after the earthquake 6. Lessons and Approaches
  15. System at the time of the disaster (2007-2011) • Network System – L3 Switches x 6 – Wi-Fi Access Points x 80 • Server System – About 40 servers • Hosting Services – Web Hosting Service (200 sites) – DNS Hosting Service (100 zones) – Mail Hosting Service (40 subdomains) • Housing Service – Server room space rented to other organizations in the university
  16. Network Topology • Star topology network • One-to-one connection from each lab to the server room • No network switch between each room and the server room
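A minimal sketch of the star topology on this slide, assuming hypothetical room names (the slide only says that each lab has a one-to-one link to the server room with no switch in between). It illustrates why a power loss in the server room isolates every lab.

```python
# Hypothetical room names, used only for illustration.
HUB = "server_room"
LABS = ["lab_a", "lab_b", "lab_c"]

def build_star(hub, leaves):
    """Return an adjacency map for a star network centred on hub."""
    links = {hub: set(leaves)}
    for leaf in leaves:
        links[leaf] = {hub}
    return links

def reachable(links, start, down=frozenset()):
    """Return the nodes reachable from start when the nodes in down are unpowered."""
    if start in down:
        return set()
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in links[node]:
            if nxt not in seen and nxt not in down:
                seen.add(nxt)
                stack.append(nxt)
    return seen

star = build_star(HUB, LABS)
print(sorted(reachable(star, "lab_a")))              # normal operation
print(sorted(reachable(star, "lab_a", down={HUB})))  # server room without power
```

With the hub down, each lab can reach only itself, which is consistent with the effort the presenters put into keeping the server room powered.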
  18. Situation immediately after the Great Earthquake • Seismic intensity 5-lower (Japanese scale) at Saitama University • No direct damage such as collapsed buildings • Information Infrastructure – No optical fiber cut in the server room – No trouble in network equipment and servers
  20. The Rolling Blackouts • Damaged nuclear power plant → the supply of electricity was weakened • The government announced implementation of the rolling blackouts. • 5 groups by region • Saitama University was in the 4th group • Blackouts for about 4 hours at a time
  21. Impacts of rolling blackouts [Figure labels: Groups of the Rolling Blackouts; Electricity Place by the Rolling Blackouts]
  22. Countermeasures against the disasters • Information Infrastructure during Rolling Blackouts – to support the activities of the university by email and web servers • Rented Power Generator • Switching to the emergency power supply – requires manpower
  23. Practical use of Rented Power Generator [Photos: Rented Power Generator; Temporary Power Connection Board]
  24. Schedule for the rolling blackouts [Chart: dates 3/14 (Mon) to 3/23 (Wed) against time of day, with axis marks at 0:00, 6:00, 12:00 and 18:00. Blackout windows shown: 9:20∼12:30, 6:20∼10:00, 13:50∼17:30, 15:20∼18:40, 15:50∼18:45, 18:50∼21:45, 18:20∼21:00. Three entries are marked "Wait".]
  25. Some troubles for Information Infrastructure • March 22: Three UPS units and two servers failed at the time of the switchover. – Failure of the DNS server – E-mail and Web servers became inaccessible • March 23: Trouble with L3 switches – Layer 3 switch trouble caused by routing processing unit failure • A part of the Campus Network stopped for 3 days
  26. Problems of fuel exhaustion • Emergency power fuel exhaustion – Oil refinery damaged by earthquake – Reduction of oil fuel supply • Staff Problems: – Scheduling of operation staff – Traffic paralysis – Health status of operation staff. The difficulty of maintaining the information infrastructure
  28. Countermeasures against this situation • Data Center – Physical Relocation – Relocation of critical servers • VPS (Virtual Private Server) – Logical Relocation – Relocation of functions
  29. Preparation of data center relocation • Ready-to-use Data Center • Tour of the data center – Two weeks before the earthquake • A data center near the university by chance • Specification – 1 rack (Full Rack) 60A/100V – 100Mbps internet
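The rack specification on this slide (60 A at 100 V) implies a power budget of 60 × 100 = 6000 VA. The sketch below shows how such a budget might be checked before relocating hardware. The per-server draw and the 80% headroom are assumptions for illustration, not figures from the presentation.

```python
RACK_AMPS = 60    # from the slide
RACK_VOLTS = 100  # from the slide

def rack_capacity_va(amps=RACK_AMPS, volts=RACK_VOLTS):
    """Apparent power available to the rack, in volt-amperes."""
    return amps * volts

def servers_that_fit(draw_va_per_server, headroom_percent=80):
    """Number of servers that fit while using only headroom_percent of capacity.

    Both arguments are assumed values; the slides give no per-server draw.
    """
    usable_va = rack_capacity_va() * headroom_percent // 100
    return usable_va // draw_va_per_server

print(rack_capacity_va())     # capacity in VA
print(servers_that_fit(300))  # hypothetical 300 VA per server
```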
  30. Criteria for selecting the data center • Close to the university • Private power generator available – 3 days of fuel always in storage • Physical security • Earthquake-proof construction
  31. Plan of relocation to the Data Center • Carry out servers in three groups – Many checks – Carefully • First relocation – impacting only a few users • Last relocation – E-mail System – impacting many users
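One way to read the plan on this slide is "move low-impact services first and the e-mail system last". The sketch below orders services by an estimated number of affected users and splits them into groups. The service names and user counts are hypothetical and do not come from the presentation.

```python
def plan_waves(impact, waves=3):
    """Split services into at most `waves` groups, least-impact first.

    impact maps a service name to an estimated number of affected users.
    """
    ordered = sorted(impact, key=impact.get)
    size = -(-len(ordered) // waves)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]

# Hypothetical estimates, for illustration only.
impact = {
    "test_web": 10,
    "dns_hosting": 100,
    "mailing_lists": 300,
    "mail_hosting": 400,
    "web_mail": 5000,
    "university_mail": 8000,
}

for number, group in enumerate(plan_waves(impact), start=1):
    print(number, group)
```

Moving the least-used services first lets the team rehearse the procedure and its checks before touching the systems that many users depend on.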
  40. How to move to the Data Center [Diagram, components in the order they appear on the slide: Firewall; Net; LDAP Server; DNS Hosting Server, Mailing Lists Server, Mail Hosting Server; Outside SMTP Server, Web Mail Server, University Mail Server, Spam Filter Appliance]
  41. The actual relocation to the Data Center • About one week from the application for the data center • Completed the relocation of all the hardware at the end of March 2011 • Relocation experience – one of the operation staff had it
  42. Some troubles with DNS settings • Mis-operation of DNS setting – mail servers became inaccessible • Changing the IP addresses of servers in the data center relocation – shorten the "TTL values" of the DNS configuration • Laboratory Routers – have a DNS cache function – reboot after the big change of infrastructure
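The TTL advice on this slide can be turned into a simple timeline: lower the TTL first, wait at least the old TTL so that resolvers holding the long-lived record expire it, and only then change the address. The sketch below computes that timeline. The timestamps and TTL values are hypothetical, and it does not cover caches that ignore TTLs, such as the laboratory routers the slide says needed a reboot.

```python
from datetime import datetime, timedelta

def cutover_plan(ttl_lowered_at, old_ttl_s, new_ttl_s):
    """Return the earliest safe time to change an address record.

    Resolvers may keep the old record for up to old_ttl_s seconds after the
    TTL is lowered. After the address change, a stale answer can survive for
    at most new_ttl_s seconds in a cache that honours TTLs.
    """
    return {
        "earliest_ip_change": ttl_lowered_at + timedelta(seconds=old_ttl_s),
        "max_stale_after_change": timedelta(seconds=new_ttl_s),
    }

# Hypothetical values: a one-day TTL lowered to five minutes.
plan = cutover_plan(datetime(2011, 3, 24, 9, 0), old_ttl_s=86400, new_ttl_s=300)
print(plan["earliest_ip_change"])
print(plan["max_stale_after_change"])
```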
  43. Usage of VPS • VPS (Virtual Private Server) – Operations via Web Browsers – Installing and setting up some OS (CentOS, Fedora…) – Setting up servers freely
  44. Servers relocated to VPS • Secondary Mail Spool Server – Prevent lost mail during data center relocation or the rolling blackouts • DNS Server (Slave Server) – Secondary DNS Server • Web Server of Saitama University – www.saitama-u.ac.jp • Web Hosting Server – Virtual Web server for laboratories and offices
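A secondary mail spool prevents lost mail because sending servers try mail exchangers in order of MX preference, lowest value first, and fall back to the next one when the primary is unreachable. The sketch below models that selection. The host names are hypothetical, and `is_up` stands in for a real connection attempt.

```python
def pick_mx(mx_records, is_up):
    """Return the first reachable host in MX preference order, or None.

    mx_records is a list of (preference, host) pairs; a lower preference
    value is tried first. is_up is a callable that reports reachability.
    """
    for _preference, host in sorted(mx_records):
        if is_up(host):
            return host
    return None

# Hypothetical hosts: a primary on campus and a secondary spool on a VPS.
MX = [(20, "spool.vps.example.net"), (10, "mail.example.ac.jp")]

print(pick_mx(MX, lambda host: True))                          # primary reachable
print(pick_mx(MX, lambda host: host != "mail.example.ac.jp"))  # primary down
```

While the primary is offline, mail collects on the secondary and is forwarded once the primary returns.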
  46. Effectiveness of Relocation (1) • Relocation of the servers decreases consumption of electric power. • The savings from reduced electricity consumption cover the cost of the data center
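The claim that electricity savings cover the data center cost amounts to a simple monthly comparison. The sketch below shows that arithmetic. All figures in the example are hypothetical, since the slide gives no consumption, price, or fee.

```python
def monthly_net_saving(kwh_before, kwh_after, yen_per_kwh, dc_fee_yen):
    """Electricity cost avoided per month minus the monthly data center fee.

    A positive result means the savings cover the fee.
    """
    return (kwh_before - kwh_after) * yen_per_kwh - dc_fee_yen

# Hypothetical monthly figures, for illustration only.
net = monthly_net_saving(kwh_before=30000, kwh_after=18000,
                         yen_per_kwh=15, dc_fee_yen=150000)
print(net)
```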
  53. Trends in electricity usage [Chart: electricity use by date, with markers for the earthquake and the start of the new system]
  54. Effectiveness of Relocation (2) • Reduction of the operation work for maintaining information infrastructure • Contribution to stable Mail and Web services – Availability of remote support without going to the server room at the university
  56. New System (2012∼now) • Wi-Fi Access Points (300 APs) • Virtualization Technology • Designed with use of the Data Center in mind [Photos: Cisco UCS; EMC VMX]
  57. Comparison of server hardware • 2007∼2011: about 40 servers (1U or 2U servers) • 2012∼now: 2 servers (Cisco UCS)
  62. Organizations: Lessons • The top executives of the university and the person in charge have the same viewpoint. – "The information infrastructure is important" • Staff skill and manpower are important Approaches • Communicate smoothly within the organization • Improve the technology skills of operation staff • Make the information system compact • Set priorities among the elements of the system
  65. Environments: Lessons • Because it was one campus, communication between faculty and staff was good. Approaches • With separate campuses, telephones may be unavailable → Prepare satellite-based mobile phones
  68. Cooperation among Universities: Lessons • Universities back up each other's data • Service for the damaged university was provided by another, undamaged university Approaches • "Disaster Net Box" (from WTC2012) - Low cost backup system among universities
  71. System Administrators in disasters: Lessons • Changing over to the power generator required manpower. • In disasters, traffic paralysis disrupted the commute of system administrators. Approaches • Measures to maintain the information infrastructure remotely are effective.
  78. Contribution for Areas near the University: Lessons • Mobile phones were unavailable in disasters. • People could not use the Internet during disasters. Approaches • Open the university resources to commuters and neighborhood residents in disasters • The information infrastructure of the university • Be careful about false rumors!
  79. Conclusion • We relocated servers to a Data Center and VPS as countermeasures against the Rolling Blackouts. • We learned some lessons from the Great Earthquake and the Rolling Blackouts.
  80. Questions and Answers: If you have questions or interest, please send e-mail as follows. kogawa@mail.saitama-u.ac.jp twitter @gawakouen facebook gawakou and face to face communication.
