Goal: Smart Data
From “readable” to “computable”
FactMiners & PRImA’s
Knight News Challenge Entry
“Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
A self-running video slideshow.
One slide every 15 seconds.
Pause as needed. 
Q: What is Smart Data?
• A: Smart Data is self-descriptive
data that can “carry on a conversation”
with Smart Programs to support
access, editing, and visualization of
the data itself.
FactMiners & PRImA: Knight News Challenge – “Turning Text Soup into Smart Data in Newspaper & Magazine Archives”
The “actual” data of the database
To access the “actual” data of the database,
Smart Programs “talk” to an embedded
“database about the database” (AKA a metamodel).
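A minimal sketch of that idea in Python, not the FactMiners implementation: the dataset below carries its own metamodel, and a generic program reads the metamodel first to learn which record types and fields exist. The names SMART_DATA, describe, and render, and the sample record, are all hypothetical.

```python
# Hypothetical self-descriptive dataset: the "metamodel" describes the "data".
SMART_DATA = {
    "metamodel": {
        "Article": {"title": "str", "page": "int"},
    },
    "data": [
        {"type": "Article", "title": "Example headline", "page": 3},
    ],
}


def describe(dataset):
    """Ask the embedded metamodel which record types and fields exist."""
    return {kind: sorted(fields) for kind, fields in dataset["metamodel"].items()}


def render(dataset):
    """Build display lines using only the fields the metamodel declares."""
    lines = []
    for record in dataset["data"]:
        fields = dataset["metamodel"][record["type"]]
        parts = [f"{name}={record[name]!r}" for name in sorted(fields)]
        lines.append(f"{record['type']}({', '.join(parts)})")
    return lines


if __name__ == "__main__":
    print(describe(SMART_DATA))
    print(render(SMART_DATA))
```

The program never hard-codes "title" or "page"; adding a field to the metamodel is enough for render to pick it up.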
Q: What does Smart Data look like?
• A: Smart Data includes BOTH the
complex document structure
of the source AND the underlying
conceptual model of the source
content.
Q: What can Smart Data do?
• A: Turn expensive, time-consuming,
labor-intensive research studies
into “Just ask!” queries.
• Good for things like:
• How did local reporting of race
relations impact public policy in
Indiana in the 1950s?
• Did advertising or editorial
coverage account for the
popularity of programs in the
Softalk Bestseller lists?
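To make the “Just ask!” idea concrete, here is a small Python sketch under stated assumptions: facts are stored as (subject, predicate, object) triples and a single generic function answers pattern queries. The triples, the predicate names, and the ask function are invented for illustration and are not real archive data or the project's query interface.

```python
# Made-up facts as (subject, predicate, object) triples; not real archive data.
FACTS = [
    ("ProgramA", "advertised_in", "Issue1"),
    ("ProgramA", "reviewed_in", "Issue2"),
    ("ProgramB", "advertised_in", "Issue2"),
    ("ProgramA", "listed_in_bestsellers", "Issue2"),
]


def ask(facts, subject=None, predicate=None, obj=None):
    """Return the triples matching every field that is not None."""
    return [
        (s, p, o)
        for (s, p, o) in facts
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


if __name__ == "__main__":
    # Which programs were advertised, and in which issues?
    for s, _, o in ask(FACTS, predicate="advertised_in"):
        print(s, o)
```

Once facts are in this shape, a research question becomes a query or a short chain of queries rather than a manual reading project.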
Q: How “smart” is our Smart Data design?
• We spent a year researching
museum informatics and
prototyping Smart Data designs.
• Our software architecture is based
on CIDOC-CRM (Conceptual
Reference Model for Museums)
microservice workflows and
PRESSoo, the ISSN.org
metamodel for serial publications
Timeline (Winter 2013, Spring 2014, Fall 2014, Summer 2015); milestones in slide order:
• Neo4j GraphGist Challenge: 1st place for the Metamodel Subgraph domain model.
• Semi-finals, Ashoka/LEGO “Re-imagine Learning” Challenge.
• #MW2014 FactMiners demo. Introduced to #cidocCRM.
• Museum Computer Network Emerging Professional Scholarship.
• #MCN2014 paper & demo.
• “Massively Addressable Text” published in peer-reviewed CODE|WORDS.
• #HILT2015 Crowdsourcing Course.
• DPLA Community Reps.
• Internet Archive Content Partner.
• ICOM #cidocCRM SIG member.
• Incorporate PRESSoo into design.
• Begin PRImA Collaboration.
Q: How “open” is our Smart Data design?
• Using a metamodel subgraph design
pattern to embed and pass information
about the data, its access, and its
transformation keeps the design
technology-neutral and future-proof.
Without Smart Data:
Database
10 Load X
20 Print X
30 Goto 10
Domain knowledge is written into task-specific programs.

With Smart Data:
Metamodel statically stored within the #TEI header section of source documents (standard text files):
<teiHeader>
<metamodel />
<structure />
<content />
Any “smart” DB: for dynamic Linked Open Data access, the DB need only have import and the ability to represent data structures read from the metamodel header.
“Smart” program in any language:
10 Load metamodel
20 Configure editors
30 Do stuff…
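The “Load metamodel, Configure editors” steps can be sketched in Python as follows. This is an illustration under assumptions, not the project's code: the <metamodel> and <field> elements, the attribute names, and the widget names are hypothetical and are not part of the TEI standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <metamodel> and <field> are illustrative element
# names, not part of the TEI standard.
DOCUMENT = """
<TEI>
  <teiHeader>
    <metamodel>
      <field name="title" type="text" />
      <field name="page" type="number" />
    </metamodel>
  </teiHeader>
  <text>Example body text.</text>
</TEI>
"""


def load_metamodel(xml_text):
    """Step 10: read field declarations from the embedded metamodel."""
    root = ET.fromstring(xml_text.strip())
    return {
        field.get("name"): field.get("type")
        for field in root.findall("./teiHeader/metamodel/field")
    }


def configure_editors(metamodel):
    """Step 20: choose an editor widget name for each declared field."""
    widgets = {"text": "TextBox", "number": "NumberSpinner"}
    return {name: widgets.get(kind, "TextBox") for name, kind in metamodel.items()}


if __name__ == "__main__":
    metamodel = load_metamodel(DOCUMENT)
    print(configure_editors(metamodel))
```

Because the program learns the fields from the header, the same code can serve a different document whose header declares different fields.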
We have a design to “tame” Text Soup and
unlock “facts” in archive data.
• An innovative design that combines international standards
for conceptual modeling of museum collections
(cidocCRM and PRESSoo) with a “self-descriptive”
software/database design pattern provides the
foundation for mining Smart Data from Text Soup.
• In the next slideshow, we describe our design for the
technology to “fact-mine” Smart Data from
newspaper & magazine digital archives…
FactMiners & PRImA:
Our Knight News Challenge Entry
• “Turn Text Soup into Smart Data in
Newspaper & Magazine Archives”
https://goo.gl/99Vn5M
• Team
• Jim Salmons, FactMiners
• Timlynn Babitsky, FactMiners
• Apostolos Antonacopoulos, PRImA
• Christian Clausner, PRImA
