4 HPA Examples Of Vampir Usage

Transcript

  • 1. What Is It Good For?! Real-World Examples of Vampir Usage at IU Robert Henschel rhensche@indiana.edu Thanks to Scott Teige, Don Berry, Scott Michael, Huian Li, and Judy Qiu for providing traces. May 2009
  • 2. Contents • IPP – Repeated Execution of an Application • Runtime Anomaly on the iDataPlex (Tachyon) • Cell B.E. Tracing • Particle Swarm Optimizer – Memory Tracing • Finding an I/O Bottleneck (EMAN) • Cluster Challenge – WPP, GAMESS • Tracing on Windows • Swarp – Pthread Tracing • MD Simulation
  • 3. What is this all about • Provide a feeling for what Vampir can be used for • Show a few of Vampir's features • Raise awareness that tracing software is a complex undertaking
  • 4. IPP - Repeated Execution of an Application • Image analysis pipeline • Every module/binary is called a few hundred times during a single run • Setting VT_FILE_UNIQUE=yes allows normal execution of the pipeline while traces are produced • We ended up with 640 traces that we did not want to inspect one by one • otfprofile was used in a bash script to extract the profile of every trace and then combine them into a profile of the entire run (see the sketch after the next slide's file listing)
  • 5. IPP - Repeated Execution of an Application – trace and otfprofile output files for one module run (ppImage_103):
      ppImage_103.0.def.z  ppImage_103.0.marker.z  ppImage_103.1.events.z  ppImage_103.otf
      ppImage_103.otf_collop.csv  ppImage_103.otf_data.csv  ppImage_103.otf_func.csv  ppImage_103.otf_p2p.csv
      ppImage_103.otf_result.aux  ppImage_103.otf_result.dvi  ppImage_103.otf_result.log  ppImage_103.otf_result.ps  ppImage_103.otf_result.tex
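A minimal sketch of the workflow from slides 4–5, in shell. The pipeline driver name is made up, and the exact otfprofile options vary between OTF versions, so treat this as an illustration rather than the exact script used:

    # Let every run of every module write its own, uniquely numbered
    # trace instead of overwriting the previous one.
    export VT_FILE_UNIQUE=yes

    ./run_pipeline.sh              # hypothetical driver for the image pipeline

    # Profile each resulting trace (640 of them in the run described
    # above) to get per-function CSV summaries next to each .otf file.
    for trace in ppImage_*.otf; do
        otfprofile "$trace"        # writes *_func.csv, *_p2p.csv, ...
    done

The per-trace CSV files can then be aggregated into a single profile of the whole run, e.g. with a small awk or Python script.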
  • 6.–17. IPP - Repeated Execution of an Application (screenshot slides)
  • 18. IBM iDataPlex Testing • We are testing an IBM iDataPlex system – an 84-node cluster of dual quad-core nodes – InfiniBand interconnect • Tachyon, part of the SPEC MPI2007 benchmark suite, showed strange runtime behavior – 100 vs. 200 seconds from run to run • Tracing the application made the problem go away: a stable runtime of 110 seconds • Open MPI and VampirTrace will interact more deeply in the future; Open MPI 1.3 already contains VampirTrace
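A rough sketch of tracing such a run with the VampirTrace bundled in Open MPI 1.3. The *-vt wrapper compilers and the VT_BUFFER_SIZE variable are real; the source and binary names are placeholders:

    # Open MPI 1.3 bundles VampirTrace; its wrapper compilers
    # instrument the code at build time (file names are made up).
    mpicc-vt -O2 -o tachyon tachyon.c

    # Larger trace buffers reduce mid-run flushes, which could
    # otherwise perturb the very timing behavior under investigation.
    export VT_BUFFER_SIZE=64M

    # Run as usual; an OTF trace is written at the end of the run.
    mpirun -np 16 ./tachyon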
  • 19.–23. IBM iDataPlex Testing (screenshot slides)
  • 24. Cell B.E. Tracing • The code exists as a serial version, an MPI version, and a Cell B.E. version • It is great to be able to use the same tracing tool on different platforms and for different parallelization strategies • We used a beta version of VampirTrace for the Cell B.E.
  • 25.–37. Cell B.E. Tracing (screenshot slides)
  • 38. PSO – Memory Footprint • Determining the memory footprint of a particle swarm optimizer
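A sketch of how such a memory footprint can be captured, assuming VampirTrace's memory tracing was used (the VT_MEMTRACE switch is real; the build line and file names are placeholders):

    # Build the PSO with the VampirTrace compiler wrapper.
    vtcc -O2 -o pso pso.c

    # VT_MEMTRACE records allocations/deallocations via the glibc
    # memory hooks, adding a memory-usage counter to the trace.
    export VT_MEMTRACE=yes
    ./pso

Vampir then shows the footprint over time in its counter display.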
  • 39.–40. PSO – Memory Footprint (screenshot slides)
  • 41. EMAN Tracing • The application ran very slowly on regular hardware with an NFS-mounted file system • It was faster on Lustre • It was really fast on a shared-memory file system • We suspected an I/O problem • However, the actual I/O problem was not visible in Vampir, as it was caused by file locking
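A sketch of making I/O calls visible in such a trace, assuming VampirTrace's LIBC I/O tracing (VT_IOTRACE is a real switch; the run line is a placeholder):

    # Record LIBC I/O calls (open/read/write/close, ...) as trace
    # events, so time spent in I/O shows up in the Vampir timeline.
    export VT_IOTRACE=yes
    mpirun -np 8 ./eman_binary     # placeholder for the EMAN executable

    # Caveat, per the slide above: lock contention inside the file
    # system is not an I/O call and therefore stays invisible here.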
  • 42.–44. EMAN Tracing (screenshot slides)
  • 45. Cluster Challenge • WPP (Wave Propagation Program) • Vampir helped us understand the code (few comments; it was unclear what the code does and how) • PAPI counters showed that on x86 peak floating-point performance was already achieved, so there was not much room for tuning besides rewriting the algorithm
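A sketch of recording PAPI counters alongside a VampirTrace run, as used for WPP. VT_METRICS is a real VampirTrace variable; counter names must be supported by the CPU (papi_avail lists them), and the file names here are placeholders:

    # Sample floating-point operations and total cycles with the trace.
    export VT_METRICS=PAPI_FP_OPS:PAPI_TOT_CYC

    mpirun -np 8 ./wpp input.in    # placeholder names

    # PAPI_FP_OPS divided by wall time, compared against the core's
    # peak FLOP rate, shows whether any tuning headroom is left.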
  • 46.–53. Cluster Challenge (screenshot slides)
  • 54. Cluster Challenge • GAMESS • A problem with MPI communication, while socket-based communication worked just fine
  • 55. Tracing on Windows • On Windows, trace creation is handled by MS MPI • Unfortunately, not a lot of detail is provided, so the traces look a bit different from traces created on Linux – MPI collective operations are not marked as such! • However, on Windows you can trace C# applications in addition to C/C++/Fortran applications – and possibly other applications that are built on top of MS MPI
  • 56.–64. Tracing on Windows (screenshot slides)
  • 65. Swarp - Pthread Tracing • SWarp can be run as a Pthread-parallel application • Tracing Pthread-parallel applications is a new feature of the latest VampirTrace version • It is enabled by default if the Pthread flag is detected on the compile and link commands, but it can also be forced • Compiling with -DVTRACE_PTHREAD and including vt_user.h in the appropriate file will also trace the overhead of the Pthread functions (see the sketch below)
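A sketch of both ways to enable Pthread tracing, based on the slide above. The vtcc wrapper is real; the file names are placeholders:

    # Automatic: vtcc sees -pthread on the compile/link line and
    # enables Pthread tracing by itself.
    vtcc -pthread -O2 -o swarp swarp.c

    # Forced: define VTRACE_PTHREAD and include vt_user.h in the
    # source; this also traces the Pthread functions' own overhead.
    vtcc -DVTRACE_PTHREAD -pthread -O2 -o swarp swarp.c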
  • 66.–70. Swarp - Pthread Tracing (screenshot slides)
  • 71. MD Application, Serial, OpenMP, MPI, Hybrid • An MD application, part of the research of physics professor Chuck Horowitz • Used for studying dense nuclear matter in supernovae, white dwarfs, and neutron stars • In production use on a daily basis • Exists as serial, MPI, OpenMP, hybrid, and MDGRAPE-2 versions • MDGRAPE-2 tracing is not supported out of the box but could be implemented using manual instrumentation (see the sketch below)
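Manual instrumentation, as suggested for the MDGRAPE-2 version, could look roughly like this. VT_USER_START/VT_USER_END are VampirTrace's real user API; the function and region names are invented:

    # C region markers around the (hypothetical) MDGRAPE-2 force call:
    cat > forces.c <<'EOF'
    #include "vt_user.h"                   /* VampirTrace user API */

    void compute_forces(void)
    {
        VT_USER_START("mdgrape2_forces");  /* enter user-defined region */
        /* ... call into the MDGRAPE-2 library here ... */
        VT_USER_END("mdgrape2_forces");    /* leave region */
    }
    EOF

    # -DVTRACE activates the VT_USER_* macros at compile time.
    vtcc -DVTRACE -c forces.c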
  • 72.–87. MD Application, Serial, OpenMP, MPI, Hybrid (screenshot slides)