Open Standards and Open Source in Datacenter Management
蔡鎮宇 Chen-Yu Tsai <wens@csie.org>
2014/4/11 OSDC 2014 1
Who am I?
• Software Engineer @ CloudMosa, Inc.
• System Administrator for 10+ years, starting in college
• Skills: breaking and fixing things
Overview
• Monitoring
• Management
• Provisioning

- Monitoring -

Log Everything!

Where to start?
MRTG

Based on SNMP
Supported by most network devices

Exports data and metrics

Network traffic counters
– used by MRTG
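Traffic graphs of this kind come from polling an interface octet counter twice and dividing the difference by the elapsed time. The following is a minimal sketch of that arithmetic, assuming a 32-bit counter such as `ifInOctets` that wraps at 2^32; the function name and sample values are illustrative and not taken from the talk.

```python
COUNTER32_MODULUS = 2 ** 32  # assumption: a 32-bit SNMP counter such as ifInOctets


def bits_per_second(prev_octets, curr_octets, interval_seconds,
                    modulus=COUNTER32_MODULUS):
    """Average bit rate between two polls of an octet counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The modulo absorbs a single counter wrap between the two polls.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_seconds
```

On fast links a 32-bit counter can wrap more than once per polling interval, in which case a 64-bit counter (and `modulus=2 ** 64`) is the safer choice.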
Known MAC addresses
- Map the network

Whatever the device supports
Look up vendor specific MIBs
RRDTool
Time Series Database

MRTG uses it

Munin uses it

… uses it

Write your own!
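One way to "write your own" is to drive the `rrdtool` command line from a script. The sketch below only builds the argument lists for `rrdtool create` and `rrdtool update`; the file name, data source name, 5-minute step, and archive size are assumptions chosen for illustration, not values from the talk.

```python
def rrd_create_args(path, ds_name, step=300, rows=288):
    """argv for `rrdtool create` with one GAUGE data source and one AVERAGE archive."""
    heartbeat = step * 2  # gaps longer than this are recorded as unknown
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        f"DS:{ds_name}:GAUGE:{heartbeat}:U:U",  # U:U = no minimum/maximum limit
        f"RRA:AVERAGE:0.5:1:{rows}",            # keep `rows` samples at full resolution
    ]


def rrd_update_args(path, value, timestamp="N"):
    """argv for `rrdtool update`; the timestamp "N" means "now"."""
    return ["rrdtool", "update", path, f"{timestamp}:{value}"]
```

Passing either list to `subprocess.run(..., check=True)` would execute it, provided `rrdtool` is installed.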
Munin – Resource Monitoring

System is slow…

CPU usage?

Memory usage?

Disk I/O?

Web requests?

Use plugins from standard set

Or write Your Own!
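A Munin plugin is just an executable: called with the argument `config` it prints graph metadata, and called without arguments it prints one `<field>.value <number>` line per data source. A minimal sketch, assuming a Unix host and a hypothetical plugin that graphs the 1-minute load average (the field name `load` and the graph labels are illustrative):

```python
import os
import sys


def plugin_output(argv):
    """Return the lines a Munin node expects from this plugin."""
    if len(argv) > 1 and argv[1] == "config":
        # Graph metadata: title, axis label, category, and field labels.
        return [
            "graph_title Load average (1 min)",
            "graph_vlabel load",
            "graph_category system",
            "load.label load",
        ]
    # Normal run: one "<field>.value <number>" line per data source.
    one_minute_load = os.getloadavg()[0]
    return [f"load.value {one_minute_load:.2f}"]


if __name__ == "__main__":
    print("\n".join(plugin_output(sys.argv)))
```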
Aggregate Data
Manual configuration for now
Others
• Monitoring
  • Xymon (Hobbit)
  • Nagios
  • Cacti
• Data collection / Graphing
  • Graphite
  • ZipKin (Twitter)
• Log collection
  • Scribe (Facebook)
Management

IPMI
Intelligent Platform Management Interface

[Figure: image from Wikipedia]

Built into most BMCs

Out-of-Band vs Side-band
Power Control
On, Off, Reset

Serial over LAN
Console Access

Boot Order
Force PXE boot?
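These three operations map onto commands of `ipmitool`, one common IPMI client (the talk does not name a specific client, so this is an assumption). The sketch below only builds the argument lists and runs nothing; the host and user names in any call are hypothetical.

```python
def ipmi_args(host, user, *command):
    """argv for a remote ipmitool call over the IPMI v2.0 LAN interface.

    -E makes ipmitool read the password from the IPMI_PASSWORD
    environment variable instead of the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *command]


def power(host, user, action):
    """Power control: on, off, reset, or a status query."""
    if action not in {"on", "off", "reset", "status"}:
        raise ValueError(f"unsupported power action: {action}")
    return ipmi_args(host, user, "chassis", "power", action)


def force_pxe_boot(host, user):
    """Ask the BMC to boot from the network on the next boot."""
    return ipmi_args(host, user, "chassis", "bootdev", "pxe")


def serial_console(host, user):
    """Attach to the Serial over LAN console."""
    return ipmi_args(host, user, "sol", "activate")
```

Forcing a PXE boot and then issuing a reset is what lets a node be reinstalled without anyone touching it.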
SSH
Secure Shell

SSH Public Key Authentication
No need to enter a password every time.

OmniTTY
Console-based interactive SSH multiplexer

Parallel-SSH (pssh)
Parallel versions of OpenSSH

Fabric
Scriptable, Parallel SSH
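The idea behind the parallel SSH tools can be sketched with the standard library alone. This is a stand-in for illustration, not the pssh or Fabric API: it assumes public key authentication is already set up (so `BatchMode=yes` never prompts), and the function names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=30):
    """Run one command on one host; BatchMode makes ssh fail instead of prompting."""
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.returncode, result.stdout


def run_on_all(hosts, command, runner=ssh_run, max_workers=16):
    """Run `command` on every host concurrently; returns {host: (returncode, stdout)}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}
```

The `runner` parameter exists so the fan-out logic can be exercised without a network, for example with a function that returns canned output.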
Provisioning

DHCP
Network Provisioning

PXE Boot
Boot over Network
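DHCP and PXE meet in the DHCP reply, which tells the booting node its address, the TFTP server to contact, and the boot file to fetch. A minimal sketch that renders one host entry in ISC dhcpd syntax; the choice of ISC dhcpd, the `pxelinux.0` boot file, and all names and addresses are assumptions for illustration.

```python
def dhcpd_host_entry(name, mac, ip, tftp_server, boot_file="pxelinux.0"):
    """Render an ISC dhcpd `host` block that pins an address and points at a PXE loader."""
    return (
        f"host {name} {{\n"
        f"  hardware ethernet {mac};\n"
        f"  fixed-address {ip};\n"
        f"  next-server {tftp_server};\n"   # TFTP server holding the boot file
        f'  filename "{boot_file}";\n'      # boot loader fetched over TFTP
        f"}}\n"
    )
```

Generating these entries from an inventory of known MAC addresses keeps the DHCP configuration in step with what is actually racked.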
Auto-configuration via DHCP
Network Switches

Kickstart/Preseed
Automatic Install
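With PXELINUX as the boot loader (an assumption; the talk does not name one), an unattended install can be selected per machine: PXELINUX looks for a config file named after the client's MAC address, and the Debian installer accepts a preseed URL on the kernel command line. The kernel and initrd paths and the URL below are illustrative placeholders.

```python
def pxelinux_config_name(mac):
    """File name PXELINUX looks up under pxelinux.cfg/ for this MAC.

    The "01-" prefix is the ARP hardware type for Ethernet; the MAC is
    written in lowercase hex with dashes.
    """
    return "01-" + mac.lower().replace(":", "-")


def pxelinux_preseed_config(preseed_url,
                            kernel="debian-installer/amd64/linux",
                            initrd="debian-installer/amd64/initrd.gz"):
    """Render a PXELINUX config that starts a preseeded Debian install."""
    return (
        "DEFAULT install\n"
        "LABEL install\n"
        f"  KERNEL {kernel}\n"
        f"  APPEND initrd={initrd} auto=true priority=critical url={preseed_url}\n"
    )
```

A Kickstart setup follows the same pattern with a different installer kernel and boot parameters.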
Chef
Puppet
Disclaimer: We don’t use them.

Custom Packages
Put programs/services/settings into native packages.

Apt-cacher-ng
Web cache for package files
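Clients use the cache by pointing apt at it as an HTTP proxy. A small sketch that renders the apt configuration line, assuming apt-cacher-ng listens on its default port 3142; the host name passed in and the target file under `/etc/apt/apt.conf.d/` are up to the site.

```python
def apt_proxy_conf(cache_host, port=3142):
    """Render the apt.conf line that routes package downloads through the cache."""
    return f'Acquire::http::Proxy "http://{cache_host}:{port}";\n'
```

Shipping this line inside one of the custom packages means every freshly installed node picks up the cache automatically.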
Put It All Together

With the proper hardware/software

Datacenters Become Manageable

2~3 People
2k+ Nodes in 4 Datacenters

Hands free after racking and cabling

10k nodes?

100k nodes?

Evolve!

We are Hiring!

Thank You