This research project parses the human rights reports produced by the United States Government and rates the human rights practices of various countries according to the CIRI (Cingranelli-Richards) Human Rights Data Project dataset. The United States Human Rights Reports are annual reports that cover internationally recognized human rights practices with regard to individual, civil, political, and worker rights. CIRI analyzed the annual reports from 1981 to 2011 and then stopped releasing the dataset for later years. CIRI coders rely on a manual process of scoring the Human Rights Reports and then assigning a rating score to each human rights practice for each country. This project automates the process of scoring the human rights country reports. To accomplish this objective, we use the GATE (General Architecture for Text Engineering) text mining platform, an open source software project for text processing. We used various customizable GATE plugins in conjunction with the coding schemes provided by the CIRI Project documentation to create an automated ratings process. The accuracy of this tool was evaluated by comparing the automated ratings to the existing ratings in the CIRI dataset. Evaluation results indicate that our automated process has higher accuracy (60% to 85%) for physical integrity practices and much lower accuracy for empowerment (40% to 60%) and women’s rights (30% to 60%) practices. The CIRI scoring process for physical integrity consists primarily of quantified measures, whereas the other two categories rely more on subjective measures. Thus, additional research and modifications to the text mining algorithm are required to improve the automated process of rating human rights reports.
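The evaluation step described above, comparing automated ratings with the existing CIRI ratings, can be illustrated with a minimal Python sketch. It is not the project's actual tooling, which is built on GATE. The function name `accuracy_by_practice`, the `(country, year, practice)` key layout, the practice labels, and all rating values are hypothetical, and accuracy is assumed here to mean the share of exact matches.

```python
from collections import defaultdict


def accuracy_by_practice(automated, reference):
    """Return the share of exact rating matches for each practice.

    Both arguments map (country, year, practice) -> integer rating.
    This key layout is an assumption made for illustration.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for key, ref_score in reference.items():
        if key not in automated:
            continue  # skip reports the automated process did not rate
        practice = key[2]
        totals[practice] += 1
        if automated[key] == ref_score:
            hits[practice] += 1
    return {practice: hits[practice] / totals[practice] for practice in totals}


# Toy values for illustration only; these are NOT results from the project.
reference = {
    ("Country A", 2010, "torture"): 0,
    ("Country B", 2010, "torture"): 1,
    ("Country A", 2010, "womens_political_rights"): 2,
    ("Country B", 2010, "womens_political_rights"): 1,
}
automated = {
    ("Country A", 2010, "torture"): 0,
    ("Country B", 2010, "torture"): 1,
    ("Country A", 2010, "womens_political_rights"): 1,
    ("Country B", 2010, "womens_political_rights"): 1,
}

result = accuracy_by_practice(automated, reference)
for practice, accuracy in sorted(result.items()):
    print(f"{practice}: {accuracy:.0%}")
```

Grouping by practice rather than reporting a single overall figure mirrors how the results are stated, with separate accuracy ranges for physical integrity, empowerment, and women’s rights practices.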