Summary Report, Yupeng Chen, 2012.8.15

SJTU Summary report

Published in: Technology

  1. Summary Report, Yupeng Chen, 2012.8.15
  2. Introduction to Ganglia; Problems & Solutions; My Harvest
  3. Introduction and overview
     • Scalable distributed monitoring system for high-performance computing systems
     • XML - data representation
     • XDR (External Data Representation) - compact, portable data transport
     • RRDTool - data storage and visualization
     • PHP - web frontend interface
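To make "compact, portable data transport" concrete, here is a minimal sketch of how XDR (RFC 4506) lays out a string and a float: every item is big-endian and padded to a multiple of 4 bytes. The helper names are hypothetical, and the metric name and value are illustrative; this is not the actual gmond packet layout, only the basic encoding rules it builds on.

```python
import struct


def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_float(value: float) -> bytes:
    """Encode a single-precision float as 4 big-endian IEEE 754 bytes."""
    return struct.pack(">f", value)


def xdr_string(text: str) -> bytes:
    """Encode a string: 4-byte length, the bytes, then zero padding to a multiple of 4."""
    data = text.encode("ascii")
    padding = (-len(data)) % 4
    return xdr_uint(len(data)) + data + b"\x00" * padding


# Illustrative payload (assumption): a metric name followed by its value.
encoded = xdr_string("load_one") + xdr_float(0.5)
print(encoded.hex())
```

The same name/value pair written as XML text would be several times larger, which is presumably why the compact binary form is used between nodes and XML only where readability matters.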
  4. Ganglia Architecture
     • Gmond - Ganglia Monitoring Daemon: metric-gathering agent installed on individual servers
     • Gmetad - Ganglia Meta Daemon: metric-aggregation agent installed on specific servers
     • Apache (Nginx + php5-fpm) web frontend: metric presentation and analysis server
     • Model - Multicast or Unicast
  5. Multicast - All gmond nodes are capable of listening to and reporting on the status of the entire cluster.
  6. Unicast - Each node sends its local monitoring data to specific machines; sending across network segments is allowed.
  7. [Architecture diagram]
     • VMHTest Cluster: master, slave1, slave2, slave3, slave4, zookeeper (each running gmond)
     • Omnilab Cluster: dev, ganglia, omnilab (each running gmond)
     • Edge labels: gmond nodes push over XDR / UDP; gmetad polls over XML / TCP
     • gmetad feeds rrdtool and the web-frontend
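The "poll, XML / TCP" edges in the diagram mean that the aggregator reads an XML dump of cluster state. A minimal sketch of parsing such a dump follows. The sample document is hand-written and abridged: the element and attribute names follow the layout gmond is understood to emit, and the version string, IP addresses, and values are made up for illustration. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written, abridged sample (assumption); not captured from a real gmond.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="VMHTest Cluster">
    <HOST NAME="master" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
      <METRIC NAME="mem_free" VAL="1024000" TYPE="float" UNITS="KB"/>
    </HOST>
    <HOST NAME="slave1" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="1.30" TYPE="float" UNITS=""/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>"""


def metrics_by_host(xml_text):
    """Return {host name: {metric name: value string}} from an XML dump."""
    root = ET.fromstring(xml_text)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


for host_name, metrics in metrics_by_host(SAMPLE_XML).items():
    print(host_name, metrics)
```

Values are kept as strings here because the declared TYPE attribute, not the parser, decides how each one should be interpreted.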
  8. Gmond - Metric Gathering Agent
     • Built-in metrics
       – Various CPU, network I/O, disk, and memory metrics
     • Extensible
       – Gmetric: out-of-process utility capable of invoking command-line-based metric-gathering scripts
       – Loadable modules capable of gathering multiple metrics or using advanced metric-gathering APIs
     • Works with Hadoop & HBase
       – NameNode, DataNode, JobTracker, TaskTracker, etc.
       – JVM, RPC, etc.
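As a rough illustration of the gmetric extension point, the sketch below builds a gmetric command line for a custom metric. The function name, the metric name "queue_depth", and its value are hypothetical; the block only prints the command, so it runs without Ganglia installed.

```python
import shlex


def gmetric_command(name, value, metric_type="float", units=""):
    """Build the argument list for publishing one custom metric via gmetric."""
    argv = ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]
    if units:
        argv += ["--units", units]
    return argv


# Hypothetical custom metric, for illustration only.
cmd = gmetric_command("queue_depth", 17, "uint32", "jobs")
print(shlex.join(cmd))
```

On a host where gmetric is installed and gmond is configured, the list could be passed to `subprocess.run(cmd, check=True)`, typically from a cron job or a small collector script.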
  9. Features & Advantages
     • Based on open standards
     • Low per-node overhead and high concurrency
     • High reliability and independence: failover
     • Data storage and presentation: RRDTool
     • Ported to various platforms (Linux, FreeBSD, Solaris, others)
  10. Problems & Bottlenecks
      • Overhead evaluation of central node
        – CPU (XDR, XML)
        – Network I/O
        – Disk I/O
      • Gmetad RRD write bottleneck
        – Every metric has a corresponding data file (*.rrd)
        – A large number of small files are written at the same time
        – Example: 20 nodes, each with 500+ metrics, produce 10,000+ read/write requests within a few seconds
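The figure on this slide can be reproduced with simple arithmetic. The sketch below assumes one RRD file per metric per node (as the slide states) and one update per file per sampling interval; the 15-second and 60-second intervals are illustrative assumptions, not values taken from the slides.

```python
def rrd_files(nodes, metrics_per_node):
    """One RRD file per metric per node."""
    return nodes * metrics_per_node


def updates_per_second(nodes, metrics_per_node, interval_seconds):
    """Average RRD updates per second if every file is updated once per interval."""
    return rrd_files(nodes, metrics_per_node) / interval_seconds


files = rrd_files(20, 500)
rate_15s = updates_per_second(20, 500, 15)  # assumed 15 s interval
rate_60s = updates_per_second(20, 500, 60)  # assumed 60 s interval

print(f"{files} RRD files")
print(f"{rate_15s:.0f} updates/s at a 15 s interval")
print(f"{rate_60s:.0f} updates/s at a 60 s interval")
```

Because each update touches a different small file, the load on the disk is dominated by seeks rather than by the amount of data written.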
  11. Solutions
      • Distributed monitoring system
      • Separate clusters into smaller pieces
      • Multiple Gmetad instances
  12. Solutions
      • Database should be placed in RAM
        – tmpfs
      • RAID 0
      • Reduce the sampling frequency
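One way to check that the RRD directory really lives in RAM is to look it up in the mount table. The sketch below parses text in the format of /proc/mounts and returns the filesystem type of the longest matching mount point. The function name is hypothetical, the path /var/lib/ganglia/rrds is an assumed location, and the sample mount table is made up so the block runs anywhere.

```python
def filesystem_type(mounts_text, path):
    """Return the filesystem type of the mount point that contains `path`."""
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the most specific (longest) matching mount point.
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


# Made-up mount table in /proc/mounts format (assumption).
SAMPLE_MOUNTS = """/dev/sda1 / ext4 rw 0 0
tmpfs /var/lib/ganglia/rrds tmpfs rw,size=512m 0 0
"""

print(filesystem_type(SAMPLE_MOUNTS, "/var/lib/ganglia/rrds/cluster1"))
print(filesystem_type(SAMPLE_MOUNTS, "/var/log"))
```

On a real Linux host the first argument would be the contents of /proc/mounts. Note that tmpfs contents do not survive a reboot, so the RRD files need periodic copies to disk if history matters.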
  13. My Harvest
      • Dev-Ops
      • Linux
      • git
      • wiki
      • Cloud computing
      • OpenStack
      • Virtualization
      • BigData
      • Hadoop
      • HBase
  14. Thank you
