This document summarizes a research project on GPUrdma, which enables direct RDMA communication from GPU kernels without CPU intervention. GPUrdma provides a 5 microsecond latency for GPU-to-GPU communication and up to 50 Gbps bandwidth. It implements a direct data path and control path from the GPU to the InfiniBand HCA. Evaluation shows GPUrdma outperforms CPU-based RDMA by a factor of 4.5x for small messages. The document also discusses using GPUrdma to enable the GPI2 framework for partitioned global address space programming across GPUs.