Automating CIRI Ratings of Human Rights Reports Using GATE

This project involves parsing human rights reports produced by the United States Government and rating the human rights practices of various countries based on the CIRI (Cingranelli-Richards) Human Rights Data Project dataset. The United States Human Rights Reports are annual reports that cover internationally recognized human rights practices with regard to individual, civil, political, and worker rights. Students, scholars, policymakers, and analysts use the CIRI data for practical and research purposes. CIRI analyzed the annual reports from 1981 to 2011 and then stopped releasing the dataset for later years. CIRI coders rely on a manual process of scouring the Human Rights Reports and then applying rating scores to each human rights practice for each country. The objective of this project is to automate the process of scouring the human rights country reports. To accomplish this objective, we use the GATE (General Architecture for Text Engineering) text mining platform. GATE is an open source software project used to provide solutions for text processing. We use various customizable GATE plugins in conjunction with the coding schemes provided by the CIRI Project documentation to create an automated ratings process. The accuracy of this tool will be evaluated by comparing the automated ratings to the existing ratings within the CIRI dataset. The expected contribution of this project is an automated way to rate country human rights practices so that the purpose of the CIRI Data Project can be continued.
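
The evaluation described above, comparing automated ratings against the existing CIRI ratings, can be sketched as a per-variable exact-match accuracy. This is only an illustrative sketch, not the project's evaluation code: the function name `accuracy_by_variable`, the `(country, year, variable)` key layout, the made-up country names, and the scores are all assumptions introduced for the example.

```python
from collections import defaultdict


def accuracy_by_variable(automated, reference):
    """Fraction of exact rating matches per CIRI variable.

    Both arguments map (country, year, variable) -> integer rating.
    Only keys present in both mappings are compared.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for key, ref_score in reference.items():
        if key not in automated:
            continue  # no automated rating for this report; skip it
        variable = key[2]
        totals[variable] += 1
        if automated[key] == ref_score:
            hits[variable] += 1
    return {var: hits[var] / totals[var] for var in totals}


# Made-up countries and scores, purely for illustration.
reference = {
    ("Atlantis", 2010, "KILL"): 2,
    ("Atlantis", 2010, "TORT"): 1,
    ("Ruritania", 2010, "KILL"): 0,
    ("Ruritania", 2010, "TORT"): 0,
}
automated = {
    ("Atlantis", 2010, "KILL"): 2,
    ("Atlantis", 2010, "TORT"): 0,  # disagrees with the reference rating
    ("Ruritania", 2010, "KILL"): 0,
    ("Ruritania", 2010, "TORT"): 0,
}

print(accuracy_by_variable(automated, reference))
```

Exact match is the strictest choice for an ordinal scale; a variant that also counts off-by-one ratings would show how close the misses are.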

Published in: Data & Analytics

Automating CIRI Ratings of Human Rights Reports Using GATE

Joshua Joiner and Karthikeyan Umapathy
School of Computing, University of North Florida, Jacksonville, FL USA 32224

RESEARCH CONTEXT
This project involves parsing human rights reports produced by the U.S. Government and rating the human rights practices of various countries. The U.S. Human Rights Reports are annual reports that cover internationally recognized human rights practices with regard to individual, civil, political, and worker rights.

The CIRI (Cingranelli-Richards) Human Rights Data Project rates the human rights practices described in the U.S. Human Rights country reports. Students, scholars, policymakers, and analysts use the CIRI ratings for practical and research purposes.

CIRI rating: CIRI coders rely on a manual process of reading through the Human Rights Reports and then applying ratings to each human rights practice for each country.

Project Objective:
  • The objective of this project is to automate the process of scouring the human rights country reports.

[Figures: CIRI Sample Dataset; U.S. Department of State]

TEXT MINING TOOL
GATE is an open source text mining platform used for developing custom text processing solutions.

[Figure: GATE Architecture Overview]

GENERATING CIRI RATING USING GATE
[Figures: Text Mining of Human Rights Reports; Standard ANNIE process flow; CIRI Rating of Human Rights Reports]

CIRI RATINGS COMPARISON
[Figure: Rating Produced by Automation]

CONTRIBUTIONS
  • CIRI Coding Annotation Processing Resource
  • Custom JAPE patterns for keywords and phrases.
  • Custom annotations for entity extraction.
  • Custom implementation of sentiment analysis.
  • Ontology Storage
  • CIRI Dataset Source Ratings.
  • Automatically generated CIRI Ratings.

EVALUATION PLAN
  • For the Occurrence section, which includes KILL, DISAP, POLPRIS, and TORT, the accuracy of the rating is 60%.
  • Women’s Rights averaged 45% accuracy overall, and Independent Judiciary averaged 70%.

CONCLUSIONS
In conclusion, I believe the automated process will not achieve high accuracy when compared to the CIRI dataset, because the dataset was compiled by humans. I do, however, believe that the steps involved in building the automated process can create a more objective standard for analyzing country report text and producing ratings for human rights practices. More patterns also need to be implemented within the automated process to match the qualitative text of the Women’s Rights and Independent Judiciary sections more accurately.
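
The idea behind the keyword and phrase patterns listed under the contributions can be sketched outside GATE with plain regular expressions. This is not JAPE and not the poster's actual pattern set, which is not given: the cue phrases, the `rate` function, and the 0/1/2 scale (assumed here to mean frequent, occasional, and no reported occurrences) are hypothetical choices made for illustration only.

```python
import re

# Hypothetical cue phrases; the poster does not list its real patterns.
NONE_CUES = re.compile(r"\bno (?:credible )?reports\b", re.IGNORECASE)
FREQUENT_CUES = re.compile(
    r"\b(?:numerous|widespread|systematic|routinely)\b", re.IGNORECASE
)

# Hypothetical topic terms for three of the Occurrence variables.
TOPIC_TERMS = {
    "KILL": re.compile(r"\b(?:unlawful|extrajudicial) killings?\b", re.IGNORECASE),
    "DISAP": re.compile(r"\bdisappearances?\b", re.IGNORECASE),
    "TORT": re.compile(r"\btorture\b", re.IGNORECASE),
}


def rate(text, variable):
    """Return an assumed 0/1/2 rating, or None if the topic is never mentioned.

    0 = some topic sentence carries a "frequent" cue,
    2 = every topic sentence carries a "no reports" cue,
    1 = anything in between.
    """
    topic = TOPIC_TERMS[variable]
    # Naive sentence split on end punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text) if topic.search(s)]
    if not sentences:
        return None
    if any(FREQUENT_CUES.search(s) for s in sentences):
        return 0
    if all(NONE_CUES.search(s) for s in sentences):
        return 2
    return 1


sample = (
    "There were no reports of politically motivated disappearances. "
    "There were numerous reports that security forces used torture."
)
print(rate(sample, "DISAP"), rate(sample, "TORT"), rate(sample, "KILL"))
```

Rules this rigid match formulaic passages far better than discursive ones, which fits the poster's remark that more patterns are needed for the more qualitative sections.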
