This document surveys fault tolerance techniques for computational grids. It opens with an introduction to grid computing and defines key terms related to faults and failures. It then describes the types of faults that can occur in grids, including physical, network, and process faults, and outlines several fault tolerance techniques used in grids: job and data replication, checkpointing, scheduling approaches, and load balancing strategies. It concludes with suggestions for future work, such as optimizing checkpoint storage and granularity.