Open Standards and Open Source in Datacenter Management
蔡鎮宇 Chen-Yu Tsai <wens@csie.org>
2014/4/11 OSDC 2014 1
Who am I?
• Software Engineer @ CloudMosa, Inc.
• System Administrator for 10+ years, starting in college
• Skills: breaking and fixing things
Overview
• Monitoring
• Management
• Provisioning
- Monitoring -
Log Everything!
Where to start?
MRTG
Based on SNMP
Supported by most network devices
Exports data and metrics
Network traffic counters
– used by MRTG
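To make the traffic counter idea concrete, here is a minimal sketch of how a poller could turn two counter samples into a rate. It assumes the samples are octet counters such as ifInOctets from IF-MIB, that the counter is 32 bits wide, and that it wrapped at most once between polls. The function name and sample values are hypothetical; this is not MRTG's actual code.

```python
def counter_rate_bps(prev_octets, curr_octets, interval_s, counter_bits=32):
    """Return the average bits per second between two octet-counter samples."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction gives the right delta even if the counter
    # wrapped around once between the two polls.
    delta_octets = (curr_octets - prev_octets) % modulus
    return delta_octets * 8 / interval_s


if __name__ == "__main__":
    # Two hypothetical samples taken 300 seconds apart.
    print(counter_rate_bps(1_000_000, 4_750_000, 300))
```

On fast links a 32-bit counter can wrap more than once per polling interval, in which case the 64-bit counters (ifHCInOctets) with `counter_bits=64` would be the safer choice.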
Known MAC addresses
- Map the network
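One way to map the network from known MAC addresses is to read each switch's forwarding table over SNMP and group the addresses by port. The sketch below assumes the rows come from dot1dTpFdbPort in BRIDGE-MIB and have already been fetched as (OID, port) pairs without a leading dot. The helper names and sample data are hypothetical.

```python
from collections import defaultdict

# dot1dTpFdbPort from BRIDGE-MIB. Each row is indexed by the six octets
# of a learned MAC address, written as decimal sub-identifiers.
FDB_PORT_OID = "1.3.6.1.2.1.17.4.3.1.2"


def mac_from_oid(oid):
    """Extract the MAC address encoded in a dot1dTpFdbPort row OID."""
    prefix = FDB_PORT_OID + "."
    if not oid.startswith(prefix):
        raise ValueError("not a dot1dTpFdbPort row: " + oid)
    octets = [int(part) for part in oid[len(prefix):].split(".")]
    if len(octets) != 6 or any(o < 0 or o > 255 for o in octets):
        raise ValueError("malformed MAC index: " + oid)
    return ":".join("%02x" % o for o in octets)


def macs_by_port(rows):
    """Group learned MAC addresses by bridge port number."""
    ports = defaultdict(list)
    for oid, port in rows:
        ports[port].append(mac_from_oid(oid))
    return {port: sorted(macs) for port, macs in ports.items()}


if __name__ == "__main__":
    # Hypothetical rows, as an SNMP walk of one switch might return them.
    sample = [
        (FDB_PORT_OID + ".0.37.144.1.2.3", 5),
        (FDB_PORT_OID + ".0.37.144.10.11.12", 5),
        (FDB_PORT_OID + ".0.37.144.255.0.1", 1),
    ]
    print(macs_by_port(sample))
```

A port that has learned many addresses is usually an uplink to another switch, while a port with a single address is usually a host.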
Whatever the device supports
Look up vendor specific MIBs
RRDTool
Time Series Database
MRTG uses it
Munin uses it
… uses it
Write your own!
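Writing your own collector on top of RRDTool can be as small as two commands: create a database once, then feed it a value on every poll. The sketch below only builds the command lines for the `rrdtool` CLI. The file name, data source name, and retention settings are assumptions chosen for illustration, not values from the talk.

```python
import subprocess


def rrd_create_args(path, ds_name="value", step=300, rows=288):
    """Build argv for `rrdtool create` with a single GAUGE data source."""
    # The data source becomes "unknown" if no update arrives in two steps.
    heartbeat = step * 2
    return [
        "rrdtool", "create", path,
        "--step", str(step),
        "DS:%s:GAUGE:%d:U:U" % (ds_name, heartbeat),
        # Keep `rows` averaged points, each covering exactly one step.
        "RRA:AVERAGE:0.5:1:%d" % rows,
    ]


def rrd_update_args(path, value, timestamp="N"):
    """Build argv for `rrdtool update`; the timestamp "N" means now."""
    return ["rrdtool", "update", path, "%s:%s" % (timestamp, value)]


def run(args):
    """Execute a command built above; requires rrdtool to be installed."""
    subprocess.run(args, check=True)


if __name__ == "__main__":
    print(" ".join(rrd_create_args("temperature.rrd")))
    print(" ".join(rrd_update_args("temperature.rrd", 23.5)))
```

With a 300 second step and 288 rows, this archive holds one day of samples and then overwrites the oldest, which is the round-robin behaviour the tool is named after.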
Munin – Resource Monitoring
System is slow…
CPU usage?
Memory usage?
Disk I/O?
Web requests?
Use plugins from the standard set
Or write your own!
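A Munin plugin is just an executable: called with the argument `config` it describes the graph, and called with no argument it prints the current values. Here is a minimal sketch that counts files in a spool directory. The directory path, field name, and graph labels are hypothetical and only serve as an example.

```python
#!/usr/bin/env python3
import os
import sys

# Hypothetical directory to watch; adjust for a real service.
SPOOL_DIR = "/var/spool/myapp"


def config_lines():
    """Lines printed when Munin calls the plugin with `config`."""
    return [
        "graph_title Files waiting in spool",
        "graph_vlabel files",
        "graph_category other",
        "queued.label queued files",
    ]


def count_files(spool_dir):
    """Return the number of directory entries, or None if unreadable."""
    try:
        return len(os.listdir(spool_dir))
    except OSError:
        return None


def value_lines(count):
    """Lines printed on a normal fetch; "U" tells Munin the value is unknown."""
    if count is None:
        return ["queued.value U"]
    return ["queued.value %d" % count]


def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        lines = config_lines()
    else:
        lines = value_lines(count_files(SPOOL_DIR))
    print("\n".join(lines))


if __name__ == "__main__":
    main(sys.argv)
```

To try it, the script would typically be made executable and linked into the node's plugin directory; the exact location depends on the distribution.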
Aggregate Data
Manual configuration for now
Others
• Monitoring
  • Xymon (Hobbit)
  • Nagios
  • Cacti
• Data collection / Graphing
  • Graphite
  • Zipkin (Twitter)
• Log collection
  • Scribe (Facebook)
- Management -
IPMI
Intelligent Platform Management Interface
Image from Wikipedia
Built into most BMCs
Out-of-Band vs Side-band
Power Control
On, Off, Reset
Serial over LAN
Console Access
Boot Order
Force PXE boot?
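Power control, Serial over LAN, and boot order can all be driven from a script. The sketch below builds command lines for the `ipmitool` CLI, assuming the BMC speaks IPMI v2.0 over the network (the `lanplus` interface). The host and user names are hypothetical, and the password is expected in the IPMI_PASSWORD environment variable.

```python
def ipmitool_args(host, user, *command):
    """Build an ipmitool command line for a remote BMC.

    `-E` makes ipmitool read the password from the IPMI_PASSWORD
    environment variable, which keeps it off the command line.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            *command]


def power(host, user, action):
    """Power control: on, off, reset, or a status query."""
    if action not in ("on", "off", "reset", "status"):
        raise ValueError("unsupported power action: " + action)
    return ipmitool_args(host, user, "chassis", "power", action)


def force_pxe_next_boot(host, user):
    """Ask the BMC to boot from the network on the next boot."""
    return ipmitool_args(host, user, "chassis", "bootdev", "pxe")


def serial_console(host, user):
    """Open a Serial over LAN session for console access."""
    return ipmitool_args(host, user, "sol", "activate")


if __name__ == "__main__":
    # Hypothetical BMC address and account.
    for args in (force_pxe_next_boot("bmc-node001", "admin"),
                 power("bmc-node001", "admin", "reset")):
        print(" ".join(args))
```

Forcing PXE and then resetting, as in the usage lines, is one way to reinstall a machine without touching it.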
SSH
Secure Shell
SSH Public Key Authentication
No need to type a password every time.
OmniTTY
Console-based interactive SSH multiplexer
Parallel-SSH (pssh)
Parallel versions of the OpenSSH tools
Fabric
Scriptable, Parallel SSH
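The idea behind scriptable, parallel SSH fits in a few lines. The sketch below is neither pssh nor Fabric; it simply runs the stock OpenSSH client from a thread pool. It assumes public key authentication is already set up, since `BatchMode=yes` makes `ssh` fail instead of prompting for a password. The function names and host names are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_run(host, command, timeout=30):
    """Run `command` on `host`; return (exit status, combined output)."""
    proc = subprocess.run(
        # BatchMode=yes disables password prompts, so this relies on
        # public key authentication.
        ["ssh", "-o", "BatchMode=yes", host, command],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        universal_newlines=True, timeout=timeout,
    )
    return proc.returncode, proc.stdout


def run_everywhere(hosts, command, runner=ssh_run, max_workers=32):
    """Run `command` on every host concurrently; return {host: result}."""
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda host: runner(host, command), hosts)
        return dict(zip(hosts, results))


if __name__ == "__main__":
    # A stand-in runner so the example works without any real hosts.
    def fake(host, command):
        return 0, "%s ran %s" % (host, command)

    print(run_everywhere(["node001", "node002"], "uptime", runner=fake))
```

Passing the runner as a parameter keeps the fan-out logic testable without network access; in real use the default `ssh_run` would be left in place.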
- Provisioning -
DHCP
Network Provisioning
PXE Boot
Boot over Network
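DHCP and PXE meet in the DHCP server's configuration: each node gets a fixed address plus the name of a boot file and the server to fetch it from. As a sketch, the function below renders one host entry in ISC dhcpd syntax from inventory data. It assumes ISC dhcpd and a PXELINUX boot loader; the host name, addresses, and boot file name are hypothetical.

```python
def dhcpd_host_entry(name, mac, ip, next_server, filename="pxelinux.0"):
    """Render an ISC dhcpd `host` block that also points the node at PXE."""
    return (
        "host %s {\n"
        "  hardware ethernet %s;\n"
        "  fixed-address %s;\n"
        # next-server is the TFTP server; filename is the boot loader.
        "  next-server %s;\n"
        "  filename \"%s\";\n"
        "}\n" % (name, mac.lower(), ip, next_server, filename)
    )


if __name__ == "__main__":
    # Hypothetical inventory record.
    print(dhcpd_host_entry("node001", "00:25:90:AB:CD:EF",
                           "10.0.0.11", "10.0.0.1"))
```

Generating these entries from a single inventory keeps DHCP, DNS, and monitoring in agreement about which machine is which.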
Auto-configuration via DHCP
Network Switches
Kickstart/Preseed
Automatic Install
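An automatic install can be tied to PXE boot with a per-host boot loader config. The sketch below assumes PXELINUX, which looks for a file named after the client's MAC address, and the Debian installer, which accepts a preseed file URL on the kernel command line. The kernel and initrd paths follow the usual Debian netboot layout but are assumptions, as is the preseed URL.

```python
def pxelinux_config_name(mac):
    """Per-host PXELINUX config path: "01-" plus the MAC with dashes."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError("expected a 6-octet MAC address: " + mac)
    return "pxelinux.cfg/01-" + "-".join(octets)


def preseed_boot_config(preseed_url,
                        kernel="debian-installer/amd64/linux",
                        initrd="debian-installer/amd64/initrd.gz"):
    """Render a PXELINUX config that starts an unattended Debian install."""
    return "\n".join([
        "DEFAULT install",
        "LABEL install",
        "  KERNEL " + kernel,
        # auto=true and priority=critical suppress installer questions;
        # url= tells the installer where to fetch the preseed file.
        "  APPEND initrd=%s auto=true priority=critical url=%s"
        % (initrd, preseed_url),
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical MAC address and preseed location.
    print(pxelinux_config_name("00:25:90:AB:CD:EF"))
    print(preseed_boot_config("http://10.0.0.1/preseed/node.cfg"))
```

For Kickstart-based installers the same structure applies, with the kernel arguments swapped for that installer's equivalent.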
Chef
Puppet
Disclaimer: We don’t use them.
Custom Packages
Put programs/services/settings into native packages.
Apt-cacher-ng
Web cache for package files
Put It All Together
With the proper hardware/software
Datacenters Become Manageable
2~3 People
2k+ Nodes in 4 Datacenters
Hands free after racking and cabling
10k nodes?
100k nodes?
Evolve!
We are Hiring!
Thank You