FactMiners & PRImA’s
Knight News Challenge Entry
“Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
A self-running video slideshow.
One slide every 15 seconds.
Pause as needed. 
Solution: People
Crowdsourcing Ground-Truth
Q: Why do we need people?
• If all we had to do was write some
smart “Robot” programs & simply put
them to work, we wouldn’t need people.
• But writing smart code is just the “birth”
of a Machine-Learning Robot.
• We have to teach our Robots how to
read magazines & newspapers!
FactMiners & PRImA: Knight News Challenge – “Turning Text Soup into Smart Data in Newspaper & Magazine Archives”
011010..01.. INIT…
What am I? Who will
teach me? What’s a
magazine?
Q: What is Ground-Truth?
• Teaching means training: lessons, study
materials, tests & their answer sheets, etc.
• An “answer sheet” in OCR research is
called a Ground-Truth solution – the
human-crafted “perfect answer” to
recognition of a scanned page.
• To teach our Robots to read magazines,
we’ll need a pile of TOC* Ground-Truth!
*TOC being “Table of Contents”
See our 3rd Silent Ignite slideshow
for more on TOCs & Technology
Yes, your Honor…
That is EXACTLY
what I saw and ONLY
what I saw on the
Table of Contents
page shown to me
as Exhibit A.
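To make the “answer sheet” idea concrete, here is a minimal sketch of how a Robot's reading of a page could be scored against a Ground-Truth solution. It uses character error rate (edit distance divided by the length of the Ground-Truth text), a common OCR evaluation measure; the slides do not say which measure the project uses, so treat this as an illustration only. The function names and the sample strings are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(ocr_text: str, ground_truth: str) -> float:
    """Edit distance normalized by the length of the Ground-Truth text."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ocr_text, ground_truth) / len(ground_truth)


# Hypothetical sample data, not taken from the project.
ground_truth = "Contents: Letters, page 4"
ocr_output = "Contents: Lettcrs, pagc 4"
cer = character_error_rate(ocr_output, ground_truth)
print(f"character error rate: {cer:.2%}")
```

The lower the rate, the closer the Robot's answer is to the human-crafted one; without the Ground-Truth text there is nothing to measure against.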
Q: What’s the TOC Pattern Reference Library?
• It will be a Special Purpose Research
Collection at the Internet Archive to
be used to “teach Robots to read
magazines & newspapers.”
• Will include a TOC Image Dataset,
TOC Ground-Truth Solutions, & an Open
Source library of TOC-Spotting &
TOC-Reading software.
Welcome to the
TOC Pattern Reference Library
Yes, counsel, in answer to
your question allow me to
reference material from
the Library.
Q: How will Citizen Scientists help?
• “Volunpeers” are already generating
Ground-Truth data for the TOC Pattern
Reference Library through our
project on the Zooniverse
crowdsourcing platform.
• In addition to refining the workflow for
Ground-Truth data collection, this
project will develop Zooniverse data
export to PRImA’s Aletheia.
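One way several volunteers' answers for the same Table of Contents entry could be combined into a single Ground-Truth candidate is a simple majority vote, sketched below. This is an assumption made for illustration: the slides do not describe the project's actual aggregation workflow, and the code does not use any Zooniverse or Aletheia API. The function names, the agreement threshold, and the sample transcriptions are hypothetical.

```python
from collections import Counter


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so trivial differences do not split votes."""
    return " ".join(text.split()).casefold()


def consensus(transcriptions: list[str], min_agreement: float = 0.5):
    """Return (answer, agreement). The answer is None unless strictly more
    than min_agreement of the volunteers gave the same normalized text."""
    if not transcriptions:
        raise ValueError("need at least one transcription")
    counts = Counter(normalize(t) for t in transcriptions)
    winner, votes = counts.most_common(1)[0]
    agreement = votes / len(transcriptions)
    if agreement <= min_agreement:
        return None, agreement  # no clear majority: flag for expert review
    # Report the winning answer in the first volunteer's original casing.
    original = next(t for t in transcriptions if normalize(t) == winner)
    return " ".join(original.split()), agreement


# Hypothetical volunteer answers for one TOC entry.
answers = ["Editor's Notes", "Editor's  Notes", "Editors Notes"]
answer, agreement = consensus(answers)
print(answer, f"({agreement:.0%} agreement)")
```

Entries with no clear majority come back as None, so they can be routed to a closer look in a tool such as Aletheia rather than accepted as Ground-Truth.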
Q: What is Aletheia “FactMiners Ed.”?
• Aletheia is PRImA’s desktop & web
Ground-Truth Tool.
• Funding will allow PRImA to add
features to Aletheia to support
“whole issue” modeling in
Ground-Truth Solutions.
• We get a Power Tool for Citizen
Scientists who want to “dig deeper”
into Internet Archive newspaper &
magazine collections as pioneer
FactMiners!
We have a design to “tame” Text Soup and
unlock “facts” in archive data.
• We are confident that the applied research project
submitted as our Knight News Challenge entry will
make substantive contributions to the domain of Open
Data by helping to turn Text Soup into Smart Data in
newspaper & magazine archives.
• We hope you have enjoyed all four of our “silent Ignite Talk”
video slideshows. We welcome your comments, questions,
& (of course) “applause” at: https://goo.gl/99Vn5M
FactMiners & PRImA:
Our Knight News Challenge Entry
• “Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
https://goo.gl/99Vn5M
• Team
• Jim Salmons, FactMiners
• Timlynn Babitsky, FactMiners
• Apostolos Antonacopoulos, PRImA
• Christian Clausner, PRImA

FactMiners & PRImA's "Turning Text Soup into Smart Data" - The Solution: People
