Performance Analysis
Performance Analysis: Presentation Transcript

  • Performance Analysis: Necessity or Add-on in Grid Computing. Michael Gerndt, Technische Universität München
  • LRR at Technische Universität München
    • Chair for Computer Hardware & Organisation / Parallel Computer Architecture (Prof. A. Bode)
    • Three groups in parallel & distributed architectures
      • Architectures
        • SCI Smile project
        • DAB
        • Hotswap
      • Tools
        • CrossGrid
        • APART
      • Applications
        • CFD
        • Medicine
        • Bioinformatics
  • New Campus at Garching
  • Outline
    • PA on parallel systems
    • Scenarios for PA in Grids
    • PA support in Grid projects
    • APART
  • Performance Analysis for Parallel Systems
    • Development cycle
      • Assumption: Reproducibility
    • Instrumentation
      • Static vs Dynamic
      • Source-level vs object-level
    • Monitoring
      • Software vs Hardware
      • Statistical profiles vs Event traces
    • Analysis
      • Source-based tools
      • Visualization tools
      • Automatic analysis tools
    [Figure: development cycle with the stages Coding, Performance Monitoring and Analysis, Program Tuning, and Production]
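The difference between event traces and profiles listed above can be illustrated with a minimal sketch. The trace records, region names, and the helper `profile_from_trace` are hypothetical; real tools sample or instrument a running program instead of using a hard-coded list. The sketch only shows that a trace keeps every timestamped event, while a profile reduces them to one summary value per region.

```python
from collections import defaultdict

# Hypothetical event trace: (timestamp in seconds, region, kind) records.
trace = [
    (0.0, "solve", "enter"),
    (0.4, "exchange", "enter"),
    (0.7, "exchange", "exit"),
    (1.5, "solve", "exit"),
]


def profile_from_trace(events):
    """Reduce an event trace to the inclusive time spent in each region."""
    open_since = {}
    totals = defaultdict(float)
    for timestamp, region, kind in events:
        if kind == "enter":
            open_since[region] = timestamp
        else:
            totals[region] += timestamp - open_since.pop(region)
    return dict(totals)


# The profile no longer knows when "exchange" happened, only how long it took.
profile = profile_from_trace(trace)
```

The trace grows with the run time of the program, whereas the profile has a fixed size per region, which is the usual trade-off between the two monitoring styles.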
  • Grid Computing
    • Grids
      • enable communities (“virtual organizations”) to share geographically distributed resources as they pursue common goals -- assuming the absence of…
        • central location,
        • central control,
        • omniscience,
        • existing trust relationships.
          • [Globus Tutorial]
    • Major differences to parallel systems
      • Dynamic system of resources
      • Large number of diverse systems
      • Sharing of resources
      • Transparent resource allocation
  • Scenarios for Performance Monitoring and Analysis
    • Post-mortem application analysis
    • Self-tuning applications
    • Grid scheduling
    • Grid management
      • [GGF performance working group, DataGrid, CrossGrid]
  • Post-Mortem Application Analysis
    • Requires
      • either resources with known performance characteristics (QoS)
      • or system-level information to assess performance data
      • scalability of performance tools
    • Focus will be on interacting components
    • George submits job to the Grid
    • Job is executed on some resources
    • George receives performance data
    • George analyzes performance
  • Self-Tuning Applications
    • Requires
      • Integration of system and application monitoring
      • On-the-fly performance analysis
      • API for accessing monitor data (if PA by application)
      • Performance model and interface to steer adaptation (if PA and tuning decision by external component)
    • Chris submits job
    • Application adapts to assigned resources
    • Application starts
    • Application monitors performance and adapts to resource changes
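The monitor-and-adapt loop of this scenario might look like the following sketch. The tuning parameter (`block_size`), the target time, and the halving/doubling rule are assumptions made for illustration; the slides do not prescribe any particular adaptation policy.

```python
def choose_block_size(current, seconds_per_iteration, target_seconds):
    """Halve the work per iteration when too slow, double it when there is headroom."""
    if seconds_per_iteration > target_seconds:
        return max(1, current // 2)
    if seconds_per_iteration < 0.5 * target_seconds:
        return current * 2
    return current


def run(measurements, block_size=64, target_seconds=1.0):
    """Feed monitored iteration times into the tuning decision, one per iteration."""
    history = []
    for seconds in measurements:
        block_size = choose_block_size(block_size, seconds, target_seconds)
        history.append(block_size)
    return history


# Hypothetical monitored iteration times while the assigned resources change.
history = run([0.3, 1.4, 1.6, 0.8])
```

Here the analysis and the tuning decision both live inside the application; with an external tuning component, `choose_block_size` would sit behind the steering interface mentioned above.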
  • Grid-Scheduling
    • Requires
      • PA of the grid application
      • Possibly benchmarking the application
      • Access to current performance capabilities of resources
      • Even better: access to predicted capabilities
    • Gloria determines performance critical application properties
    • She specifies a performance model
    • Grid scheduler selects resources
    • Application is started
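A minimal sketch of this scenario, assuming the performance model is a function from a resource description to a predicted run time. The resource fields (`cpus`, `speed`, `load`), the work estimate, and the site names are hypothetical and not taken from any Grid scheduler.

```python
def gloria_model(resource):
    """Hypothetical user-supplied model: predicted run time in seconds."""
    work_units = 1200.0
    available = resource["cpus"] * resource["speed"] * (1.0 - resource["load"])
    return work_units / available


def select_resource(model, resources):
    """Pick the resource with the smallest predicted run time."""
    return min(resources, key=lambda name: model(resources[name]))


# Current capabilities as a scheduler might obtain them from a monitoring service.
resources = {
    "site_a": {"cpus": 16, "speed": 1.0, "load": 0.5},
    "site_b": {"cpus": 8, "speed": 1.5, "load": 0.0},
    "site_c": {"cpus": 32, "speed": 1.0, "load": 0.9},
}

chosen = select_resource(gloria_model, resources)
```

The largest site is not chosen because its current load leaves little capacity, which is why access to current (or predicted) capabilities matters for the scheduler.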
  • Grid-Management
    • Requires
      • PA of historical system information
      • Needs to be done in a distributed fashion
    • George reports that performance has been bad for a week.
    • The helpdesk runs the Grid performance analysis software.
    • Periodic saturation of connections is detected.
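As an illustration of analysing historical system information, the sketch below looks for hours of the day in which a link was saturated in every recorded sample. The sample format, the threshold, and the function name are assumptions; a real Grid management tool would run such an analysis in a distributed fashion over many links.

```python
def saturated_hours(samples, threshold=0.9):
    """Return the hours of day whose recorded utilization never drops below threshold."""
    by_hour = {}
    for hour_of_day, utilization in samples:
        by_hour.setdefault(hour_of_day, []).append(utilization)
    return sorted(
        hour for hour, values in by_hour.items() if min(values) >= threshold
    )


# Hypothetical history for one connection: (hour of day, link utilization 0..1).
samples = [(2, 0.95), (14, 0.40), (2, 0.97), (14, 0.92), (2, 0.93), (9, 0.10)]

periodic = saturated_hours(samples)
```

In this made-up history the link is saturated every day at 02:00, while the single high reading at 14:00 is not reported because it does not recur.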
  • New Aspect of Performance Analysis
    • Transparent resource allocation
    • Dynamism in resource availability
    • Approaches in the following projects:
      • DAMIEN
      • DataGrid
      • CrossGrid
      • GrADS
  • Analyzing Meta-Computing Applications
    • DAMIEN (IST-25406), 5 partners
    • www.hlrs.de/organization/pds/projects/damien/
    • Goals
      • Analysis of GRID-enabled applications
        • using MpCCI (www.mpcci.org)
        • using PACX-MPI (www.hlrs.de/organization/pds/projects/pacx-mpi)
      • Analysis of GRID components
        • PACX-MPI and MpCCI
      • Extend Vampir/Vampirtrace technology
  • MetaVampirtrace for Application Analysis (call chain from the slide diagram)
    • Application code (MPI_Send)
    • Name shift (CPP)
    • Compiled code (PACX_Send)
    • Routine call: MetaVT wrapper (PACX_Send), with output to the tracefile
    • Routine call: GRID-MPI profiling routine (PPACX_Send)
    • Native MPI and GRID communication layer
  • MetaVampirtrace for GRID Component Analysis (call chain from the slide diagram)
    • Application code (MPI_Send)
    • Name shift (CPP)
    • Compiled code (PACX_Send)
    • Routine call: GRID-MPI layer (PACX_Send)
    • Routine call: MetaVT wrapper (MPI_Send), with output to the tracefile
    • MPI profiling routine (PMPI_Send)
    • TCP/IP and GRID-MPI communication layer
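The wrapper idea behind both diagrams can be sketched in Python as follows. This is not MetaVampirtrace code: `pmpi_send` is a hypothetical stand-in for the underlying send routine, the list `tracefile` stands in for the trace file, and rebinding the name `mpi_send` only mimics the preprocessor name shift.

```python
import time

tracefile = []  # stand-in for the trace file written by the wrapper


def pmpi_send(dest, payload):
    """Hypothetical stand-in for the real send routine; returns bytes 'sent'."""
    return len(payload)


def traced(name, routine):
    """Wrap a routine so that every call is recorded with enter and exit times."""
    def wrapper(*args):
        start = time.perf_counter()
        result = routine(*args)
        tracefile.append((name, start, time.perf_counter()))
        return result
    return wrapper


# "Name shift": the application keeps calling mpi_send, which now resolves
# to the tracing wrapper; the wrapper forwards to the underlying routine.
mpi_send = traced("MPI_Send", pmpi_send)

sent = mpi_send(1, b"halo")
```

Which layer gets wrapped decides what is measured: wrapping the call made by the application analyses the application, while wrapping the call made by the Grid layer analyses the Grid component itself.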
  • MetaVampir
    • General counter support
      • Grid component metrics
    • Hierarchical analysis
      • Analysis at each level
      • Aggregate data for groups
      • Improves scalability
    • Structured tracefiles
      • Subdivided into frames
      • Stripe data across multiple files
    [Figure: metacomputer hierarchy. Labels: Metacomputer; Node 1, Node 2; SMP node 1, SMP node 2; MPI processes P_1 to P_n; GRID daemons; Send, Recv; All MPI Processes]
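Hierarchical analysis with aggregation for groups can be pictured with a small sketch. The nested dictionary, the node and process names, and the metric values are hypothetical; the point is only that a viewer can show one aggregated value per group and expand a group on demand.

```python
def aggregate(tree):
    """Sum a metric over a nested dict whose leaves are per-process values."""
    if isinstance(tree, dict):
        return sum(aggregate(child) for child in tree.values())
    return tree


# Hypothetical communication time in seconds per MPI process, grouped by node.
metacomputer = {
    "node_1": {"P_1": 1.2, "P_2": 0.8},
    "node_2": {"P_3": 2.0, "P_4": 1.0},
}

per_node = {name: aggregate(subtree) for name, subtree in metacomputer.items()}
total = aggregate(metacomputer)
```

Showing `per_node` instead of every process is what keeps the display manageable as the number of processes grows.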
  • Process Level
  • System Level
  • Grid Monitoring Architecture
    • Developed by GGF Performance working group
    • Separation of data discovery and data transfer
      • Data discovery via (possibly distributed) directory service
      • Data transfer among producer – consumer
    • GMA interactions
      • Publish/subscribe
      • Query/response
      • Notification
    • Directory includes
      • Types of events
      • Accepted protocols
      • Security mechanisms
    [Figure: GMA components Consumer, Producer, and Directory Service; the links between the directory service and both the consumer and the producer are labelled "event publication information"]
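The separation of data discovery from data transfer can be sketched with three small classes. All class and method names here are illustrative assumptions, not an API defined by the GGF; the sketch runs in one process, whereas real GMA components communicate over a network.

```python
class Directory:
    """Data discovery: maps event types to the producers that offer them."""

    def __init__(self):
        self._producers = {}

    def register(self, event_type, producer):
        self._producers.setdefault(event_type, []).append(producer)

    def lookup(self, event_type):
        return list(self._producers.get(event_type, []))


class Producer:
    """Data transfer: serves queries and pushes events to subscribers."""

    def __init__(self, name):
        self.name = name
        self._latest = {}
        self._subscribers = {}

    def subscribe(self, event_type, callback):
        self._subscribers.setdefault(event_type, []).append(callback)

    def publish(self, event_type, value):
        self._latest[event_type] = value
        for callback in self._subscribers.get(event_type, []):
            callback(self.name, value)

    def query(self, event_type):
        return self._latest.get(event_type)


directory = Directory()
producer = Producer("host1")
directory.register("cpu_load", producer)

# Consumer side: discover producers via the directory, then talk to them directly.
received = []
for found in directory.lookup("cpu_load"):
    found.subscribe("cpu_load", lambda name, value: received.append((name, value)))

producer.publish("cpu_load", 0.75)      # publish/subscribe interaction
answer = producer.query("cpu_load")     # query/response interaction
```

The directory never sees the measurement data itself, only the information about who publishes what.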
  • R-GMA in DataGrid
    • DataGrid www.eu-datagrid.org
    • R-GMA www.cs.nwu.edu/~rgis
    • DataGrid WP3 hepunx.rl.ac.uk/edg/wp3
    • Relational approach to GMA
      • Producers
        • announce: SQL “CREATE TABLE”
        • publish: SQL “INSERT”
      • Consumers
        • collect: SQL “SELECT”
      • Applies the relational model in a distributed environment
      • Can be used as an information service as well as for system and application monitoring
  • P-Grade and R-GMA
    • P-GRADE Environment developed at MTA SZTAKI
      • GRM (Distributed monitor)
      • Prove (Visualization tool)
    • GRM creates two tables in R-GMA
      • GRMTrace (String appName, String event): all events
      • GRMHeader (String appName, String event): important header events only
    • GRM Main Monitor
      • SELECT * FROM GRMHeader WHERE appName = '...'
      • SELECT * FROM GRMTrace WHERE appName = '...'
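The producer and consumer roles for the two GRM tables can be tried out locally with the following sketch. It uses Python's built-in `sqlite3` module as a stand-in, which is an assumption made purely for illustration: R-GMA distributes the tables across the Grid and has its own interfaces, and the application name and event strings below are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Producer side: announce the tables, then publish events.
conn.execute("CREATE TABLE GRMTrace (appName TEXT, event TEXT)")
conn.execute("CREATE TABLE GRMHeader (appName TEXT, event TEXT)")
conn.execute("INSERT INTO GRMHeader VALUES (?, ?)", ("demo", "start"))
conn.executemany(
    "INSERT INTO GRMTrace VALUES (?, ?)",
    [("demo", "send"), ("demo", "recv"), ("other", "send")],
)

# Consumer side (the role of the GRM Main Monitor): collect one application's events.
header = conn.execute(
    "SELECT * FROM GRMHeader WHERE appName = ? ORDER BY rowid", ("demo",)
).fetchall()
events = conn.execute(
    "SELECT * FROM GRMTrace WHERE appName = ? ORDER BY rowid", ("demo",)
).fetchall()
conn.close()
```

Filtering on `appName` is what lets several monitored applications share the same two tables.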
  • Connection to R-GMA [Figure labels: Main Monitor, Site, User’s Host, Host 1, Host 2, application processes, R-GMA, PROVE]
  • Analyzing Interactive Applications in CrossGrid
    • CrossGrid funded by EU: 03/2002 – 02/2005
    • www.eu-crossgrid.org
    • Simulation of vascular blood flow
      • Interactive visualization and simulation
        • response times are critical
        • 0.1 sec (head movement) to 5 min (change in simulation)
      • Performance analysis
        • response time and its breakdown
        • performance data for specific interactions
  • CrossGrid Application Monitoring Architecture
    • OCM-G = Grid-enabled OMIS-Compliant Monitor
    • OMIS = On-line Monitoring Interface Specification
    • Application-oriented
      • Information about running applications
    • On-line
      • Information collected at runtime
      • Immediately delivered to consumers
    • Information collected via instrumentation
      • Activated / deactivated on demand
      • Information of interest defined at runtime (lower overhead)
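Instrumentation that is activated and deactivated on demand can be sketched as below. The `Probe` class and the probe name are hypothetical and do not reflect the OMIS interface; the sketch only shows why a disabled probe costs little and collects nothing.

```python
class Probe:
    """An instrumentation point that records values only while enabled."""

    def __init__(self, name):
        self.name = name
        self.enabled = False
        self.events = []

    def hit(self, value):
        # The check is the only cost paid while the probe is switched off.
        if self.enabled:
            self.events.append(value)


probe = Probe("frame_rendered")

probe.hit(1)            # ignored: nobody asked for this information yet
probe.enabled = True    # a tool requests the information at runtime
probe.hit(2)
probe.hit(3)
probe.enabled = False   # the tool is no longer interested
probe.hit(4)
```

Only the values recorded while a consumer was interested end up in `probe.events`, which is the source of the lower overhead mentioned above.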
  • OMIS Performance Tool [Figure: a th_stop(Sim) request from the performance tool is split by the Service Manager into th_stop(P1,P2), th_stop(P3), and th_stop(P4,P5) for three LMs, which stop processes P1 to P5]
  • G-PM
  • Application Specific Measurement
    • G-PM offers standard metrics
      • CPU time, communication time, disk I/O, ...
    • Application programmer provides
      • Relevant events inside application (probes)
      • Relevant data computed by the application
      • Association between events in different processes
    • G-PM allows new metrics to be defined
      • Based on existing ones and application specific information
      • Metric Definition Language under development
      • Compilation or interpretation will be done by the High-Level Analysis Component.
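A user-defined metric built from probe events and a standard metric might look like the sketch below. It is written in plain Python because the Metric Definition Language is described as under development; the interaction names, timestamps, and function names are assumptions for illustration only.

```python
def response_times(begin_events, end_events):
    """Pair probe events by interaction id and return the elapsed time of each."""
    return {
        interaction: end_events[interaction] - begin_events[interaction]
        for interaction in begin_events
        if interaction in end_events
    }


def communication_fraction(communication_time, response_time):
    """Derived metric: share of the response time spent communicating."""
    return communication_time / response_time


# Hypothetical probe timestamps in seconds; begin and end may come from
# different processes and are associated through the interaction id.
begin = {"rotate": 10.0, "zoom": 12.0}
end = {"rotate": 10.5, "zoom": 14.0}

times = response_times(begin, end)
fraction = communication_fraction(0.125, times["rotate"])
```

The first function relies on application-specific information (the probes), the second combines it with a standard metric (communication time), matching the two ingredients listed above.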
  • Managing Dynamism: The GrADS Approach
    • GrADS (Grid Application Development Software)
      • Funded by National Science Foundation, started 2000
    • Goal:
      • Provide application development technologies that make it easy to construct and execute applications with reliable [and often high] performance in the constantly-changing environment of the Grid.
    • Major techniques to handle transparency and dynamism:
      • Dynamic configuration to available resources (configurable object programs)
      • Performance contracts and dynamic reconfiguration
  • GrADS Software Architecture [Figure: the Program Preparation System (PSE, whole-program compiler, libraries, software components) turns the source application into a configurable object program; the Execution Environment (scheduler/service negotiator, dynamic optimizer, real-time performance monitor, Grid runtime system (Globus)) runs it; negotiation and performance feedback link the two]
  • Configurable Object Programs
    • Integrated mapping strategy and cost model
    • Performance enhanced by context-dependent variants
    • Context includes potential execution platforms
    • Dynamic Optimizer performs final binding
      • Implements mapping strategy
      • Chooses machine-specific variants
      • Inserts sensors and actuators
      • Performs final compilation and optimization
  • Performance Contracts
    • A performance contract specifies the measurable performance of a grid application.
    • Given
      • set of resources,
      • capabilities of resources,
      • problem parameters
    • the application will
      • achieve a specified, measurable performance
  • Creation of Performance Contracts
    • [Figure: the program's performance model and the resource assignment from the resource broker feed into the performance contract; MDS and NWS appear as information sources]
    • Developer
    • Compiler
    • Measurements
  • History-Based Contracts
    • Resources given by broker
    • Capabilities of resources given by
      • Measurements of this code on those resources
      • Possibly scaled by the Network Weather Service
      • e.g. Flops/second and Bytes/second
    • Problem parameters
      • Given by the input data set
    • Application intrinsic parameters
      • Independent of execution platform
      • Measurements of this code with same problem parameters
      • e.g. floating point operation count, message count, message bytes count
    • Measurable Performance Prediction
      • Combining application parameters and resource capabilities
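Combining application intrinsic parameters with resource capabilities can be sketched as a simple additive model. The formula, the dictionary keys, and the two scale factors are assumptions for illustration; the slides do not give the prediction function that GrADS uses.

```python
def predict_seconds(app, capabilities, cpu_scale=1.0, network_scale=1.0):
    """Predict run time from operation counts and measured resource rates.

    The scale factors model a forecast of how much of each measured
    capability is currently available (1.0 means fully available).
    """
    compute = app["flop_count"] / (capabilities["flops_per_second"] * cpu_scale)
    transfer = app["message_bytes"] / (capabilities["bytes_per_second"] * network_scale)
    return compute + transfer


# Application intrinsic parameters: independent of the execution platform.
app = {"flop_count": 2e9, "message_bytes": 4e8}

# Resource capabilities: measured for this code on the assigned resources.
capabilities = {"flops_per_second": 1e9, "bytes_per_second": 1e8}

unloaded = predict_seconds(app, capabilities)
half_cpu = predict_seconds(app, capabilities, cpu_scale=0.5)
```

The predicted value is what a contract can then state as the measurable performance for the given resources and problem parameters.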
  • Application and System Space Signature
    • Application Signature
      • trajectory of values through N-dimensional metric space
      • one trajectory per process
      • e.g. one point per iteration
      • e.g. metric: iterations/flop
    • System Signature
      • trajectory of values through N-dimensional metric space
      • will vary across application executions, even on the same resources
      • e.g. metric iterations/second
    [Figure label: resource capabilities]
  • Verification of Performance Contracts [Figure: during execution, sensor data feeds the contract monitor, which can trigger rescheduling or steer the dynamic optimizer]
    • Violation detection
    • Fault detection
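Violation detection by a contract monitor can be sketched as a comparison of sensor data against the contracted rate. The tolerance, the rule of waiting for several consecutive violations, and all names are assumptions for illustration; the slides do not specify how GrADS decides that a contract is violated.

```python
def within_contract(predicted, measured, tolerance=0.2):
    """True if the measured rate is at most `tolerance` below the predicted rate."""
    return measured >= predicted * (1.0 - tolerance)


def monitor(predicted, sensor_readings, tolerance=0.2, patience=2):
    """Return the index of the reading that triggers rescheduling, or None.

    Rescheduling is requested only after `patience` consecutive violations,
    so that a single noisy reading does not cause a migration.
    """
    consecutive = 0
    for index, measured in enumerate(sensor_readings):
        if within_contract(predicted, measured, tolerance):
            consecutive = 0
        else:
            consecutive += 1
            if consecutive >= patience:
                return index
    return None


# Hypothetical sensor data: iterations per second against a contract of 10.0.
trigger = monitor(10.0, [9.5, 7.0, 9.0, 7.5, 6.0])
no_trigger = monitor(10.0, [9.0, 8.5])
```

In the first run the isolated dip at the second reading is tolerated, and the request for rescheduling is raised only when two low readings follow each other.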
  • APART
    • ESPRIT IV Working Group, 01/1999 – 12/2000
    • IST Working Group, 08/2001 – 07/2004
    • www.fz-juelich.de/apart
    • Focus:
      • Networking of European development projects for automatic performance analysis tools
        • Testsuite for automatic analysis tools
      • Automatic Performance Analysis and Grid Computing (WP3 – Peter Kacsuk)
  • Summary
    • Scenarios
      • Post-mortem Application Tuning
      • Self-tuning applications
      • Grid scheduling
      • Grid management
    • How to handle transparency and dynamism?
    • Approaches here:
      • DAMIEN: Provide static environment.
      • DataGrid: Combining system and application monitoring
      • CrossGrid: On-line analysis
      • GrADS: Performance models and contracts