This document discusses using Hadoop clusters and MapReduce programming to analyze large electronic health records databases. It proposes that Hadoop can efficiently process and analyze huge healthcare datasets in a distributed computing environment. Specifically, the document proposes a MapReduce program to analyze electronic health records stored in Hadoop clusters. Analyzing these large datasets with Hadoop could provide personalized treatment planning, assisted diagnosis, and utilization review to improve healthcare outcomes and reduce costs.