This is the last of four short "silent Ignite Talk" video slideshows that explain FactMiners and PRImA's entry in the Knight News Challenge. Our Solution: People. How will we teach Robots to read magazines? How will Ground-Truth lead to Fact-Mining?
FactMiners & PRImA's "Turning Text Soup into Smart Data" - The Solution: People
1. FactMiners & PRImA’s
Knight News Challenge Entry
“Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
A self-running video slideshow.
One slide every 15 seconds.
Pause as needed.
Solution: People
Crowdsourcing Ground-Truth
2. Q: Why do we need people?
• If all we had to do was write some
smart “Robot” programs & simply put
them to work, we wouldn’t need people.
• But writing smart code is just the “birth”
of a Machine-Learning Robot.
• We have to teach our Robots how to
read magazines & newspapers!
FactMiners & PRImA: Knight News Challenge – “Turning Text Soup into Smart Data in Newspaper & Magazine Archives”
011010..01.. INIT…
What am I? Who will
teach me? What’s a
magazine?
3. Q: What is Ground-Truth?
• Teaching means training; lessons, study
materials, tests & their answer sheets, etc.
• An “answer sheet” in OCR research is
called a Ground-Truth solution – the
human-crafted “perfect answer” to
recognition of a scanned page.
• To teach our Robots to read magazines,
we’ll need a pile of TOC* Ground-Truth!
*TOC being “Table of Contents”
See our 3rd Silent Ignite slideshow
for more on TOCs & Technology
Yes, your Honor…
That is EXACTLY
what I saw and ONLY
what I saw on the
Table of Contents
page shown to me
as Exhibit A.
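The "answer sheet" idea on slide 3 can be illustrated with a minimal sketch: a machine's reading of a page is scored against the human-crafted Ground-Truth text. Everything here is an assumption for illustration only. The function name `character_accuracy`, the sample strings, and the use of Python's standard `difflib` module are not taken from the FactMiners or PRImA tools, which work with richer page-layout Ground-Truth than plain text.

```python
from difflib import SequenceMatcher


def character_accuracy(ocr_text: str, ground_truth: str) -> float:
    """Return the share of ground-truth characters that the OCR output matched.

    Hypothetical helper for illustration; not part of any PRImA or FactMiners tool.
    """
    if not ground_truth:
        # With an empty answer sheet, only an empty reading counts as correct.
        return 1.0 if not ocr_text else 0.0
    # autojunk=False keeps the comparison exact for longer texts.
    matcher = SequenceMatcher(None, ground_truth, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ground_truth)


# Hypothetical example: the human "perfect answer" versus a flawed machine reading.
ground_truth = "Table of Contents"
ocr_text = "Tab1e of Contents"  # the OCR engine misread the letter "l" as the digit "1"
score = character_accuracy(ocr_text, ground_truth)
print(f"Character accuracy against Ground-Truth: {score:.2%}")
```

A score like this is only meaningful because a person produced the Ground-Truth first, which is why the project needs a large pile of human-made TOC answer sheets.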
4. Q: What’s the TOC Pattern Reference Library?
• It will be a Special Purpose Research
Collection at the Internet Archive to
be used to “teach Robots to read
magazines & newspapers.”
• Will include a TOC Image Dataset,
TOC Ground-Truth Solutions, & an Open
Source library of TOC-Spotting &
TOC-Reading software.
Welcome to the
TOC Pattern Reference Library
Yes, counsel, in answer to
your question allow me to
reference material from
the Library.
5. Q: How will Citizen Scientists help?
• “Volunpeers” are already generating
Ground-Truth data for the TOC Pattern
Reference Library through our
project on the Zooniverse
crowdsourcing platform.
• In addition to refining the workflow for
Ground-Truth data collection, this
project will develop Zooniverse data
export to PRImA’s Aletheia.
6. Q: What is Aletheia “FactMiners Ed.”?
• Aletheia is PRImA’s desktop & web
Ground-Truth Tool.
• Funding will allow PRImA to add
features to Aletheia to support
“whole issue” modeling in
Ground-Truth Solutions.
• We get a Power Tool for Citizen
Scientists who want to “dig deeper”
into Internet Archive newspaper &
magazine collections as pioneer
FactMiners!
7. We have a design to “tame” Text Soup and
unlock “facts” in archive data.
• We are confident that the applied research project
submitted as our Knight News Challenge entry will
make substantive contributions to the domain of Open
Data by helping to turn Text Soup into Smart Data in
newspaper & magazine archives.
• We hope you have enjoyed all four of our “silent Ignite Talk”
video slideshows. We welcome your comments, questions,
& (of course) “applause” at: https://goo.gl/99Vn5M
8. FactMiners & PRImA:
Our Knight News Challenge Entry
• “Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
https://goo.gl/99Vn5M
• Team
• Jim Salmons, FactMiners
• Timlynn Babitsky, FactMiners
• Apostolos Antonacopoulos, PRImA
• Christian Clausner, PRImA