SJTU Summary report

Transcript

  • 1. Summary Report, Yupeng Chen, 2012.8.15
  • 2. Introduction to Ganglia • Problems & Solutions • My Harvest
  • 3. Introduction and overview
      • Scalable distributed monitoring system for high-performance computing systems
      • XML - data representation
      • XDR (External Data Representation) - compact, portable data transport
      • RRDTool - data storage and visualization
      • PHP - web frontend interface
  • 4. Ganglia Architecture
      • Gmond - Ganglia Monitoring Daemon: metric gathering agent installed on individual servers
      • Gmetad - Ganglia Meta Daemon: metric aggregation agent installed on specific servers
      • Apache (Nginx + php5-fpm) web frontend: metric presentation and analysis server
      • Model - multicast or unicast
  • 5. Multicast – All gmond nodes are capable of listening to and reporting on the status of the entire cluster.
  • 6. Unicast – Each node sends its local monitoring data to specific machines; sending across network segments is allowed.
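  • 7. [Figure: deployment diagram, labeled VMH. Test Cluster: master, slave1, slave2, slave3, slave4, and zookeeper, each running gmond. Omnilab Cluster: dev, ganglia, and omnilab, each running gmond. Arrow labels: push (XDR / UDP) between gmond nodes, poll (XML / TCP) from gmetad to each cluster; gmetad feeds rrdtool and the web frontend.]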
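The diagram shows gmetad polling each cluster as XML over TCP. A minimal Python sketch of such a poll is given below; it is an illustration, not how gmetad itself is implemented. It assumes gmond exposes its XML dump on TCP port 8649 (a commonly used default, which may differ in a given `gmond.conf`) and that the dump uses `HOST` and `METRIC` elements with `NAME` and `VAL` attributes. The `SAMPLE` document and its host names are hand-written for illustration.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump that gmond writes to a TCP client before closing."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the peer closed the connection: dump is complete
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_bytes):
    """Return {host_name: {metric_name: value_string}} from an XML dump."""
    root = ET.fromstring(xml_bytes)
    result = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        result[host.get("NAME")] = metrics
    return result


# Hand-written sample (assumed shape, trimmed to the attributes used above).
SAMPLE = b"""<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
  <CLUSTER NAME="Test Cluster">
    <HOST NAME="master" IP="10.0.0.1">
      <METRIC NAME="load_one" VAL="0.10" TYPE="float" UNITS=""/>
    </HOST>
    <HOST NAME="slave1" IP="10.0.0.2">
      <METRIC NAME="load_one" VAL="0.25" TYPE="float" UNITS=""/>
      <METRIC NAME="mem_free" VAL="204800" TYPE="float" UNITS="KB"/>
    </HOST>
  </CLUSTER>
</GANGLIA_XML>
"""

metrics = parse_metrics(SAMPLE)
```

Against a live node, `parse_metrics(read_gmond_xml("some-host"))` would return the same structure, where `some-host` is a placeholder for any machine running gmond with a TCP accept channel.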
  • 8. Gmond – Metric Gathering Agent
      • Built-in metrics
          – Various CPU, network I/O, disk, and memory metrics
      • Extensible
          – Gmetric: out-of-process utility capable of invoking command-line-based metric gathering scripts
          – Loadable modules capable of gathering multiple metrics or using advanced metric gathering APIs
      • Works with Hadoop & HBase
          – NameNode, DataNode, JobTracker, TaskTracker, etc.
          – JVM, RPC, etc.
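As an illustration of the gmetric path, the sketch below builds a `gmetric` command line from Python and optionally runs it. It assumes the `gmetric` binary is on `PATH` and uses its `--name`, `--value`, `--type`, and `--units` options; the metric name `queue_depth` is hypothetical.

```python
import subprocess

# Value types accepted by gmetric's --type option.
VALID_TYPES = {
    "string", "int8", "uint8", "int16", "uint16",
    "int32", "uint32", "float", "double",
}


def build_gmetric_command(name, value, value_type, units=""):
    """Return the argument list for one gmetric invocation."""
    if value_type not in VALID_TYPES:
        raise ValueError("unsupported gmetric type: %s" % value_type)
    cmd = ["gmetric", "--name", name, "--value", str(value), "--type", value_type]
    if units:
        cmd += ["--units", units]
    return cmd


def publish(name, value, value_type, units=""):
    """Run gmetric once; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_gmetric_command(name, value, value_type, units), check=True)


# Hypothetical custom metric; only builds the command, does not run it.
example = build_gmetric_command("queue_depth", 42, "uint32", "items")
```

A script like this would typically be run from cron or a small loop so that the custom metric is refreshed at a regular interval.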
  • 9. Features & Advantages
      • Based on open standards
      • Low per-node overhead and high concurrency
      • High reliability and independence: failover
      • Data storage and presentation: RRDTool
      • Ported to many platforms (Linux, FreeBSD, Solaris, others)
  • 10. Problems & Bottlenecks
      • Overhead evaluation of the central node
          – CPU (XDR to XML conversion)
          – Network I/O
          – Disk I/O
      • Gmetad RRD write bottleneck
          – Every metric has a corresponding data file (*.rrd)
          – A large number of small files are written at the same time
          – Example: 20 nodes with 500+ metrics each produce 10,000+ read/write requests within a few seconds
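The figure on this slide follows from simple arithmetic: one RRD file per metric per node. The sketch below reproduces it and converts it to an average write rate. The 15-second polling interval is an assumption for illustration, not a value stated in the slides, and any extra summary files gmetad may keep are ignored.

```python
def rrd_write_load(nodes, metrics_per_node, polling_interval_s):
    """Return (number of RRD files, average RRD updates per second)."""
    rrd_files = nodes * metrics_per_node  # one .rrd file per metric per node
    return rrd_files, rrd_files / polling_interval_s


# 20 nodes with 500 metrics each, polled every 15 s (assumed interval).
files, per_second = rrd_write_load(20, 500, 15)
print(files, round(per_second))  # prints: 10000 667
```

Each update touches a different small file, which is why the load appears as many scattered disk writes instead of one large sequential write.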
  • 11. Solutions
      • Distributed monitoring system
      • Separate clusters into smaller pieces
      • Multiple Gmetad instances
  • 12. Solutions
      • Place the database in RAM
      • tmpfs
      • RAID 0
      • Reduce the sampling frequency
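To compare two of these options, a rough sketch can estimate the per-server RRD update rate when nodes are split across several gmetad instances or when the sampling interval is lengthened. The node and metric counts come from the bottleneck example (20 nodes, 500 metrics each); the 15-second and 60-second intervals and the even split of nodes are assumptions for illustration.

```python
import math


def writes_per_second(nodes, metrics_per_node, interval_s, gmetad_count=1):
    """Average RRD updates per second handled by the busiest gmetad."""
    # Assume nodes are divided as evenly as possible between gmetad instances.
    nodes_per_gmetad = math.ceil(nodes / gmetad_count)
    return nodes_per_gmetad * metrics_per_node / interval_s


baseline = writes_per_second(20, 500, 15)               # one gmetad, 15 s
split = writes_per_second(20, 500, 15, gmetad_count=2)  # two gmetad, 15 s
slower = writes_per_second(20, 500, 60)                 # one gmetad, 60 s

for label, rate in [("baseline", baseline), ("split", split), ("slower", slower)]:
    print("%-8s %6.1f updates/s" % (label, rate))
```

Under these assumptions, halving the nodes per gmetad halves the rate, while sampling four times less often cuts it to a quarter. Placing the RRD files on tmpfs does not change the rate; it moves those writes from disk to memory.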
  • 13. My Harvest
      • Dev-Ops: Linux, git, wiki
      • Cloud computing: OpenStack, virtualization
      • Big Data: Hadoop, HBase
  • 14. Thank you