This document presents research on mapping parallel programs onto hierarchical distributed computer systems. It introduces a heuristic algorithm, TMMGP, that partitions a program's task graph and maps the resulting parts onto the system so as to minimize execution time. In experiments on two computer clusters, TMMGP reduced the execution time of MPI benchmarks by up to 40% on average compared with standard MPI mapping tools. The research aims to develop models and algorithms that better account for both the hierarchical organization of modern distributed systems and the structure of parallel programs. Future work includes mapping to arbitrary subsystems and integrating the algorithm with resource management systems.
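To make the mapping problem concrete, the following minimal sketch scores a task-to-core assignment on a two-level system (nodes containing cores) and compares a linear rank placement with a simple greedy grouping. It is not TMMGP: the function names `mapping_cost` and `greedy_map`, the cost ratio between intra-node and inter-node communication, and the example task graph are all assumptions chosen for illustration.

```python
def mapping_cost(edges, mapping, cores_per_node, intra_cost=1.0, inter_cost=10.0):
    """Total communication cost of a mapping (hypothetical cost model).

    edges: {(u, v): traffic} for an undirected task graph.
    mapping: {task: core index}; cores are numbered node by node.
    """
    total = 0.0
    for (u, v), traffic in edges.items():
        same_node = mapping[u] // cores_per_node == mapping[v] // cores_per_node
        total += traffic * (intra_cost if same_node else inter_cost)
    return total


def greedy_map(n_tasks, edges, cores_per_node):
    """Fill one node at a time with the task that talks most to the tasks
    already placed on that node (illustrative heuristic, not TMMGP)."""
    # Symmetric adjacency: weight[u][v] is the traffic between u and v.
    weight = [dict() for _ in range(n_tasks)]
    for (u, v), traffic in edges.items():
        weight[u][v] = weight[u].get(v, 0) + traffic
        weight[v][u] = weight[v].get(u, 0) + traffic

    unplaced = set(range(n_tasks))
    mapping = {}
    next_core = 0
    while unplaced:
        group = []
        while unplaced and len(group) < cores_per_node:
            # Ties are broken in favour of the lowest task id.
            best = max(
                sorted(unplaced),
                key=lambda t: sum(weight[t].get(g, 0) for g in group),
            )
            group.append(best)
            unplaced.remove(best)
        for task in group:
            mapping[task] = next_core
            next_core += 1
    return mapping


# Assumed example: 4 tasks, 2 cores per node, heavy traffic on (0, 2) and (1, 3).
edges = {(0, 2): 10, (1, 3): 10, (0, 1): 1, (2, 3): 1}
linear = {task: task for task in range(4)}  # task i placed on core i
greedy = greedy_map(4, edges, cores_per_node=2)

print("linear cost:", mapping_cost(edges, linear, cores_per_node=2))
print("greedy cost:", mapping_cost(edges, greedy, cores_per_node=2))
```

In this toy case the greedy grouping places the heavily communicating pairs on the same node, so the modelled cost is lower than for the linear placement. A real mapper would need a cost model calibrated to the target system's communication levels.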