The goal of FamilySearch is to help people find their ancestors. It is a freely available resource that compiles information from databases around the world. The LDS Church sponsors it, but anyone can use it at no cost.
FamilySearch Indexing’s role is to transcribe text from scanned images into a machine-readable format that can be searched. This work is done by hundreds of thousands of indexers, making it likely one of the largest crowdsourcing projects in the world.
The current quality control mechanism is called A-B-Arbitrate (or just A-B-ARB for short). In this process, A and B index the document independently, and an experienced arbitrator (ARB) reviews any discrepancies between the two.
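As a rough illustration, the A-B-ARB routing might look like the following Python sketch. All names here are hypothetical; the talk does not describe FamilySearch's actual implementation at this level.

    # Sketch of A-B-Arbitrate: two independent transcriptions are
    # compared field by field; only disagreements go to an arbitrator.
    def ab_arbitrate(index_a: dict, index_b: dict, arbitrate) -> dict:
        """index_a/index_b map field names to transcribed values;
        arbitrate(field, a_value, b_value) returns the final value."""
        final = {}
        for field in index_a:
            a_val, b_val = index_a[field], index_b[field]
            if a_val == b_val:
                final[field] = a_val  # agreement: accept as-is
            else:
                final[field] = arbitrate(field, a_val, b_val)
        return final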
Documents are being scanned at an increasing rate. If we are to benefit from these new resources, indexing will need to keep pace with scanning.
A new approach based on peer review instead of independent indexing would likely improve efficiency, but its effect on quality is unknown. Anecdotal evidence suggests that peer reviewing may be twice as fast as indexing from scratch.
The model could include arbitration (ARB), or that step could be skipped if A-R produces high enough quality on its own.
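A comparable sketch of the proposed A-R(-ARB) flow, under the assumption that arbitration is triggered by a quality estimate falling below a threshold; the reviewer function, quality estimator, and threshold value are all illustrative, not part of the proposal as stated.

    # Sketch of A-R(-ARB): the reviewer (R) sees A's values pre-filled
    # and keeps or corrects each one; arbitration runs only if the
    # estimated quality falls below a (hypothetical) threshold.
    def ar_arbitrate(index_a, review, arbitrate, estimate_quality,
                     threshold=0.95):
        index_r = {f: review(f, v) for f, v in index_a.items()}
        if estimate_quality(index_a, index_r) >= threshold:
            return index_r  # skip ARB: A-R alone judged sufficient
        # otherwise ARB resolves only the fields R changed
        return {f: (index_r[f] if index_r[f] == index_a[f]
                    else arbitrate(f, index_a[f], index_r[f]))
                for f in index_a}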
Data is currently being collected for the R and ARB steps; collection should be complete in a few weeks.
Combining humans and algorithms in the same process would allow FamilySearch to continue improving machine learning algorithms based on millions of records.
Improving FamilySearch Indexing Efficiency and Quality
IMPROVING INDEXING EFFICIENCY & QUALITY: COMPARING A-B-ARBITRATE AND PEER REVIEW
Family History Technology Workshop, February 3, 2012
Derek Hansen, Jake Gehring, Patrick Schone, and Matthew Reid
OUR APPROACH
• Historical Data Analysis
• Field Experiment comparing quality control models
HISTORICAL DATA ANALYSIS
• Quality (estimated based on A-B agreement)
  • Measures difficulty more than actual quality
  • Underestimates quality, since an experienced Arbitrator reviews all A-B disagreements
  • Good at capturing differences across people, fields, and projects
• Time (calculated using keystroke-logging data)
  • Idle time is tracked separately, making actual time measurements more accurate
  • Outliers removed
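For concreteness, the two measures could be computed roughly as in the sketch below. The idle-gap threshold and outlier rule are assumptions; the slides do not specify how idle time is detected or which outlier criterion is used.

    # Quality proxy: share of fields where A and B agree.
    def ab_agreement(index_a, index_b):
        matches = sum(index_a[f] == index_b[f] for f in index_a)
        return matches / len(index_a)

    IDLE_GAP = 60.0  # seconds; hypothetical idle threshold

    # Time: sum inter-keystroke gaps, excluding idle gaps.
    def active_time(keystroke_times):
        gaps = [b - a for a, b in zip(keystroke_times, keystroke_times[1:])]
        return sum(g for g in gaps if g < IDLE_GAP)

    # Drop times more than k standard deviations from the mean.
    def drop_outliers(times, k=3.0):
        mean = sum(times) / len(times)
        sd = (sum((t - mean) ** 2 for t in times) / len(times)) ** 0.5
        return [t for t in times if abs(t - mean) <= k * sd]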
A NEW APPROACH? (A-R-ARB)
• Peer review model
• Efficiency ++
• Quality ?
PEER REVIEW PROCESS (A-R-ARB)
[Diagram: A → R → ARB; R's form is already filled in with A's values; the ARB step is marked "Optional?"]
FIELD EXPERIMENT
• Develop Truth Set of 2,000 1930 Census images
• Use historical A-B-ARB data
• Create new A-R-ARB dataset by having new indexers review and arbitrate
• Compare quality & efficiency
• Qualitatively identify types of errors
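The quality comparison against the truth set might reduce to a per-field accuracy like the sketch below. The exact-match rule is an assumption; a real evaluation would likely normalize spelling and formatting variants first.

    # Per-field accuracy of a model's final output against the truth set.
    def field_accuracy(final_index: dict, truth: dict) -> float:
        correct = sum(final_index.get(f) == truth[f] for f in truth)
        return correct / len(truth)

    # Usage (hypothetical): field_accuracy(ab_result, truth) vs.
    # field_accuracy(ar_result, truth), alongside active time per image.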
DISCUSSION
IMPLICATIONS
• Transition users from novice to expert
• Recruit foreign language indexers
• Intelligent matching based on expertise (in A-B-ARB and/or A-R-ARB)
FUTURE POSSIBILITIES
• Peer review by algorithms?
• Initial indexing by algorithms?
QUESTIONS
• Derek Hansen (email@example.com)
• Jake Gehring (GehringJG@familysearch.org)
• Patrick Schone (BoiseBound@aol.com)
• Matthew Reid (firstname.lastname@example.org)