Karma: A Data Integration Tool
Pedro Szekely 
USC/Information Sciences Institute 
pszekely@isi.edu, http://isi.edu/~szekely 
November 2014 
CC-By 2.0
Outline 
• Problem 
• Linked Data 
• Tools to produce Linked Data 
• Karma 
• Storing and maintaining the data 
Problem
dsearles/Flickr 
Karma’s Goals 
tear down data silos 
connect information in separate databases 
expose untapped value of database content 
Karma’s Audience 
Cultural heritage 
Entertainment 
Intelligence 
Science 
... anyone who has data silo problems 
Linked Data
The Web of Documents 
What We See
What the Computer Sees 
blah blah blah blah blah blah …
web pages are machine processable, 
but not machine understandable 
impractical for building applications using the data 
Solution: Linked Data
A method of publishing structured data so that it can be interlinked and become more useful.
Builds upon standard Web technologies such as HTTP and URIs to share information in a way that can be read automatically by computers.
(from Wikipedia)
Represent Resources Using URIs
That guy has first name “Pedro”
That guy: http://szekelys.com/family#pedro
has first name: http://xmlns.com/foaf/0.1/firstName
“Pedro”: “Pedro”
Represent Information as Triples
Subject: http://szekelys.com/family#pedro (the resource being described)
Predicate: http://xmlns.com/foaf/0.1/firstName (a property of the resource)
Object: “Pedro” (the value of the property)
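As a rough illustration of the subject/predicate/object idea, the sketch below stores the slide's triple in a small Python tuple type and prints it as one N-Triples line. The `Triple` class and `to_ntriples` helper are hypothetical names introduced here, not part of Karma, and the sketch only handles plain string literals as objects.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI of the resource being described
    predicate: str  # URI of a property of the resource
    obj: str        # value of the property (a plain literal in this sketch)


def to_ntriples(t: Triple) -> str:
    """Serialize one triple with a literal object as an N-Triples line."""
    # Escape backslashes and double quotes inside the literal.
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'


pedro = Triple(
    "http://szekelys.com/family#pedro",
    "http://xmlns.com/foaf/0.1/firstName",
    "Pedro",
)
print(to_ntriples(pedro))
```

A real application would more likely rely on an RDF library than on hand-written serialization; the point here is only that a triple is three values.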
RDF Graphs
http://szekelys.com/family#pedro rdf:type foaf:Person
http://szekelys.com/family#pedro foaf:firstName “Pedro”
http://szekelys.com/family#pedro foaf:homepage http://isi.edu/~szekely
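One way to picture an RDF graph in code is as a set of triples that share nodes. The sketch below is a minimal stand-in, not an RDF library: `expand`, `objects`, and the `PREFIXES` table are names assumed for this example, and prefixed names such as foaf:Person are expanded by simple string concatenation.

```python
# Namespace prefixes used on the slide (rdf: and foaf: are the standard ones).
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(term: str) -> str:
    """Expand a prefixed name such as foaf:Person into a full URI."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in PREFIXES:
        return PREFIXES[prefix] + local
    return term  # already a full URI, or a literal


PEDRO = "http://szekelys.com/family#pedro"

# The graph is just a set of (subject, predicate, object) tuples.
graph = {
    (PEDRO, expand("rdf:type"), expand("foaf:Person")),
    (PEDRO, expand("foaf:firstName"), "Pedro"),
    (PEDRO, expand("foaf:homepage"), "http://isi.edu/~szekely"),
}


def objects(graph, subject, predicate):
    """Return all objects linked to a subject by a given predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


print(objects(graph, PEDRO, expand("foaf:firstName")))
```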
Linked Open Data 
Tools to Produce Linked Data
Steps to Create Linked Open Data
• Select ontologies
… that define classes and properties for our data
• Convert data to RDF
… from data sources to the ontologies
• Identify links to other Linked Data datasets
… to other Linked Data
• Select ontologies
… that define classes and properties for our data
e.g. CIDOC CRM, http://www.cidoc-crm.org/
• Convert data to RDF
… from data sources to the ontologies
RDF Mapping Tools
• custom code
Shortcomings: labor intensive, error prone
Benefits: flexible
• R2RML
Shortcomings: difficult to learn, only for SQL databases
Benefits: W3C standard, good documentation, multiple vendors
• RDF Refine
Shortcomings: only for tabular data
Benefits: graphical user interface, support for reconciliation, open source
• Karma
Benefits: semi-automatic, graphical user interface, supports tabular data, XML and JSON, multiple export formats
Karma
Interactive tool for rapidly extracting, cleaning, transforming, integrating and publishing data
Tabular Sources, Hierarchical Sources, Services → Karma → Model, RDF, Database, JSON, …
Inputs: Ontologies and Data Sources
Domain Ontology (diagram legend: object property, data property, subClassOf)
Classes: Person, Organization, Place, State, City, Event
Properties: birthdate, name, bornIn, worksFor, state, phone, livesIn, ceo, location, organizer, nearby, startDate, title, isPartOf, postalCode
Data Source
Column 1         Column 2   Column 3    Column 4       Column 5
Bill Gates       Oct 1955   Microsoft   Seattle        WA
Mark Zuckerberg  May 1984   Facebook    White Plains   NY
Larry Page       Mar 1973   Google      East Lansing   MI
Semantic Model: maps source to domain ontology
(diagram: the same domain ontology and data source as above, with the source columns mapped onto the ontology)
Semantic Model = Semantic Types + Relationships 
Semantic Types
Column 1: Person, name
Column 2: Person, birthdate
Column 3: Organization, name
Column 4: City, name
Column 5: State, name
Column 1         Column 2   Column 3    Column 4       Column 5
Bill Gates       Oct 1955   Microsoft   Seattle        WA
Mark Zuckerberg  May 1984   Facebook    White Plains   NY
Larry Page       Mar 1973   Google      East Lansing   MI
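A semantic type pairs a source column with a class and a data property from the ontology. The sketch below shows one possible way to hold that assignment in Python and apply it to a row of the example table. The `SEMANTIC_TYPES` dictionary and the `annotate` function are hypothetical, this is not how Karma stores its models, and merging the two Person columns into a single Person entry is an assumption of the sketch.

```python
# Column -> (ontology class, data property), as on the slide.
SEMANTIC_TYPES = {
    "Column 1": ("Person", "name"),
    "Column 2": ("Person", "birthdate"),
    "Column 3": ("Organization", "name"),
    "Column 4": ("City", "name"),
    "Column 5": ("State", "name"),
}

COLUMNS = list(SEMANTIC_TYPES)

# Rows of the example data source.
ROWS = [
    dict(zip(COLUMNS, ["Bill Gates", "Oct 1955", "Microsoft", "Seattle", "WA"])),
    dict(zip(COLUMNS, ["Mark Zuckerberg", "May 1984", "Facebook", "White Plains", "NY"])),
    dict(zip(COLUMNS, ["Larry Page", "Mar 1973", "Google", "East Lansing", "MI"])),
]


def annotate(row):
    """Group a row's values by ontology class using the semantic types."""
    entities = {}
    for column, value in row.items():
        cls, prop = SEMANTIC_TYPES[column]
        # Assumption: columns typed with the same class describe one entity.
        entities.setdefault(cls, {})[prop] = value
    return entities


print(annotate(ROWS[0]))
```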
Relationships
Person worksFor Organization
Person bornIn City
City state State
Column 1: Person, name
Column 2: Person, birthdate
Column 3: Organization, name
Column 4: City, name
Column 5: State, name
Column 1         Column 2   Column 3    Column 4       Column 5
Bill Gates       Oct 1955   Microsoft   Seattle        WA
Mark Zuckerberg  May 1984   Facebook    White Plains   NY
Larry Page       Mar 1973   Google      East Lansing   MI
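Relationships connect the entities that the semantic types identify. As a sketch of what that could produce for one row, the code below mints a URI per entity and emits one linking triple per relationship. The `http://example.org/` base, the URI pattern, and the `uri` and `link_triples` helpers are all assumptions made for illustration; in Karma the URIs are defined by the user, and predicates would be full ontology URIs rather than the bare names used here.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical URI base, not from the slides

# (subject class, object property, object class), as on the slide.
RELATIONSHIPS = [
    ("Person", "worksFor", "Organization"),
    ("Person", "bornIn", "City"),
    ("City", "state", "State"),
]


def uri(cls: str, name: str) -> str:
    """Mint a URI for an entity from its class and name (assumed pattern)."""
    return f"{BASE}{cls.lower()}/{quote(name)}"


def link_triples(names: dict) -> list:
    """Emit one triple per relationship; `names` maps class -> entity name."""
    return [
        (uri(subj, names[subj]), prop, uri(obj, names[obj]))
        for subj, prop, obj in RELATIONSHIPS
    ]


row = {
    "Person": "Larry Page",
    "Organization": "Google",
    "City": "East Lansing",
    "State": "MI",
}
for triple in link_triples(row):
    print(triple)
```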
so what?
Karma uses semantic models to create Linked Data
Karma semi-automatically builds semantic models
… and provides a nice GUI to edit them
Karma Slide Demo
Load A Source
Define URIs
Define All URIs
Karma Suggests Semantic Types
Semantic Model with One Semantic Type
Two Semantic Types
3 Semantic Types + 1 Relationship
… 5 Minutes Later: Complete Semantic Model
so what?
Publish Linked Data
Linked Data in JSON-LD Format
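The slide's screenshot is not available in this transcript, so the following is only a hand-written sketch of what Linked Data in JSON-LD can look like, reusing the earlier foaf example. It is not Karma's actual output. The `@context`, `@id`, and `@type` keywords come from the JSON-LD standard, and the choice of properties is an assumption.

```python
import json

# A JSON-LD document is ordinary JSON plus a few reserved "@" keywords.
doc = {
    "@context": {"foaf": "http://xmlns.com/foaf/0.1/"},  # prefix definition
    "@id": "http://szekelys.com/family#pedro",           # the subject URI
    "@type": "foaf:Person",                              # rdf:type
    "foaf:firstName": "Pedro",                           # literal value
    "foaf:homepage": {"@id": "http://isi.edu/~szekely"}, # link to a resource
}

text = json.dumps(doc, indent=2)
print(text)
```

Because the result is plain JSON, it can be consumed by mainstream tooling that knows nothing about RDF.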
Storing and Maintaining the Data
Storage Options
• SPARQL endpoint
Shortcomings: low reliability, esoteric, slow
Benefits: sophisticated query language
• RDF dump
Shortcomings: no query capability, esoteric
Benefits: flexibility: clients can download and use in applications, easy to publish
• JSON-LD + ElasticSearch
Shortcomings: restricted query language
Benefits: very high performance, mainstream technology, easy to publish
Karma supports the three options
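For the JSON-LD + ElasticSearch option, one plausible loading path is the Elasticsearch bulk format: newline-delimited JSON in which each document is preceded by an action line. The sketch below only builds that request body and does not contact a server. The `to_bulk` helper, the `people` index name, the example documents, and the use of `@id` as the document id are all assumptions for illustration.

```python
import json


def to_bulk(docs, index):
    """Build a bulk request body: one action line, then one source line, per doc."""
    lines = []
    for doc in docs:
        # Action line: index this document under the given index and id.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["@id"]}}))
        # Source line: the JSON-LD document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical JSON-LD documents.
docs = [
    {"@id": "http://example.org/person/1", "name": "Bill Gates"},
    {"@id": "http://example.org/person/2", "name": "Larry Page"},
]

body = to_bulk(docs, "people")
print(body)
```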
thanks for your attention 
https://github.com/usc-isi-i2/Web-Karma 
Open Source, Apache 2 License 
November 19, 2014 NISO Virtual Conference: Can't We All Work Together?: Interoperability & Systems Integration

  • 1. Karma: A Data Integration Tool Pedro Szekely USC/Information Sciences Institute pszekely@isi.edu, http://isi.edu/~szekely November 2014 CC-By 2.0
  • 2. Outline • Problem • Linked Data • Tools to produce Linked Data • Karma • Storing and maintaining the data Pedro Szekely CC-By 2.0 2
  • 3. Pedro Szekely Problem CC-By 2.0 3
  • 4. dsearles/Flickr Karma’s Goals tear down data silos connect information in separate databases expose untapped value of database content Pedro Szekely CC-By 2.0 4
  • 5. Karma’s Audience Cultural heritage Entertainment Intelligence Science ... anyone who has data silo problems Pedro Szekely CC-By 2.0 5
  • 6. Pedro Szekely Linked Data CC-By 2.0 6
  • 7. The Web of Documents Pedro Szekely CC-By 2.0 7
  • 8. What We See Pedro Szekely CC-By 2.0 8
  • 9. What the Computer Sees blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah blah Pedro Szekely CC-By 2.0 9
  • 10. Pedro Szekely web pages are machine processable, but not machine understandable impractical for building applications using the data Pedro Szekely CC-By 2.0 10
  • 11. Solution: Linked Data A method of publishing structured data so that it can be interlinked and become more useful Builds upon standard Web technologies such as HTTP and URIs to share information in a way that can be read automatically by from Wikipedia computers Pedro Szekely CC-By 2.0 11
  • 12. Represent Resources Using URIs That guy has first name “Pedro” http://szekelys.com/family#pedro “Pedro” http://xmlns.com/foaf/0.1/firstName Pedro Szekely CC-By 2.0 12
  • 13. Represent Information as Triples http://szekelys.com/family#pedro http://xmlns.com/foaf/0.1/firstName Subject Predicate “Pedro” Object The resource being described A property of the resource The value of the property Pedro Szekely CC-By 2.0 13
  • 14. RDF Graphs http://szekelys.com/family#pedro “Pedro” foaf:firstName foaf:Person rdf:type foaf:homepage http://isi.edu/~szekely Pedro Szekely CC-By 2.0 14
  • 15. Linked Open Data Pedro Szekely CC-By 2.0 15
  • 16. Pedro Szekely Tools to Produce Linked Data CC-By 2.0 16
  • 17. Steps to Create Linked Open Data • Select ontologies … that define classes and properties for our data • Convert data to RDF … from data sources to the ontologies • Identify links … to other Linked Data datasets Pedro Szekely CC-By 2.0 17
  • 18. • Select ontologies … that define classes and properties for our data, e.g. CIDOC CRM http://www.cidoc-crm.org/ Pedro Szekely CC-By 2.0 18
  • 19. • Select ontologies … that define classes and properties for our data • Convert data to RDF … from data sources to the ontologies Pedro Szekely CC-By 2.0 19
  • 20. RDF Mapping Tools (Tool: shortcomings; benefits)
      custom code: shortcomings: labor intensive, error prone; benefits: flexible
      R2RML: shortcomings: difficult to learn, only for SQL databases; benefits: W3C standard, good documentation, multiple vendors
      RDF Refine: shortcomings: only for tabular data; benefits: graphical user interface, support for reconciliation, open source
      Karma: benefits: semi-automatic, graphical user interface, supports tabular data, XML and JSON, multiple export formats
      Pedro Szekely CC-By 2.0 20
  • 21. Karma Pedro Szekely CC-By 2.0 21
  • 22. Karma: Interactive tool for rapidly extracting, cleaning, transforming, integrating and publishing data. Tabular Sources, Hierarchical Sources, Services, Karma, Model, RDF, Database, JSON, … Pedro Szekely 22
  • 23. Inputs: Ontologies and Data Sources. Domain Ontology (legend: object property, data property, subClassOf): birthdate Person Organization Place State name bornIn worksFor state name phone name livesIn City Event ceo location organizer nearby startDate title isPartOf postalCode. Data Source: Column 1, Column 2, Column 3, Column 4, Column 5; Bill Gates, Oct 1955, Microsoft, Seattle, WA; Mark Zuckerberg, May 1984, Facebook, White Plains, NY; Larry Page, Mar 1973, Google, East Lansing, MI Pedro Szekely CC-By 2.0 23
  • 24. Semantic Model: maps source to domain ontology. Domain Ontology (legend: object property, data property, subClassOf): birthdate Person Organization Place State name bornIn worksFor state name phone name livesIn City Event ceo location organizer nearby startDate title isPartOf postalCode. Source: Column 1, Column 2, Column 3, Column 4, Column 5; Bill Gates, Oct 1955, Microsoft, Seattle, WA; Mark Zuckerberg, May 1984, Facebook, White Plains, NY; Larry Page, Mar 1973, Google, East Lansing, MI Pedro Szekely CC-By 2.0 24
  • 25. Semantic Model = Semantic Types + Relationships Pedro Szekely CC-By 2.0 25
  • 26. Semantic Types. Column 1: Person name; Column 2: Person birthdate; Column 3: Organization name; Column 4: City name; Column 5: State name. Rows: Bill Gates, Oct 1955, Microsoft, Seattle, WA; Mark Zuckerberg, May 1984, Facebook, White Plains, NY; Larry Page, Mar 1973, Google, East Lansing, MI Pedro Szekely CC-By 2.0 26
  • 27. Relationships. Person worksFor Organization; Person bornIn City; City state State. Column 1: Person name; Column 2: Person birthdate; Column 3: Organization name; Column 4: City name; Column 5: State name. Rows: Bill Gates, Oct 1955, Microsoft, Seattle, WA; Mark Zuckerberg, May 1984, Facebook, White Plains, NY; Larry Page, Mar 1973, Google, East Lansing, MI Pedro Szekely CC-By 2.0 27
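To make "semantic model = semantic types + relationships" concrete, here is a small sketch that applies the model from slides 26 and 27 to the three example rows and emits triples. This is not Karma's implementation or API. The `example.org` namespaces, the URI-minting scheme, the choice of key column per class, and all function names are assumptions made for illustration; the relationship directions are read off the slide layout.

```python
import re

BASE = "http://example.org/"          # hypothetical namespace for minted URIs
ONT = "http://example.org/ontology/"  # hypothetical namespace for the ontology

rows = [
    ["Bill Gates", "Oct 1955", "Microsoft", "Seattle", "WA"],
    ["Mark Zuckerberg", "May 1984", "Facebook", "White Plains", "NY"],
    ["Larry Page", "Mar 1973", "Google", "East Lansing", "MI"],
]

# Semantic types: column index -> (class, data property).
SEMANTIC_TYPES = {
    0: ("Person", "name"),
    1: ("Person", "birthdate"),
    2: ("Organization", "name"),
    3: ("City", "name"),
    4: ("State", "name"),
}

# Relationships: (source class, object property, target class).
RELATIONSHIPS = [
    ("Person", "worksFor", "Organization"),
    ("Person", "bornIn", "City"),
    ("City", "state", "State"),
]

# Assumption: each class instance gets its URI from one key column.
KEY_COLUMN = {"Person": 0, "Organization": 2, "City": 3, "State": 4}


def mint_uri(cls: str, value: str) -> str:
    """Build a URI for a class instance from a cell value (illustrative)."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", value).strip("_")
    return f"{BASE}{cls.lower()}/{slug}"


def row_to_triples(row: list) -> list:
    """Apply the semantic model to one row and return its triples."""
    uris = {cls: mint_uri(cls, row[col]) for cls, col in KEY_COLUMN.items()}
    triples = [(uri, "rdf:type", ONT + cls) for cls, uri in uris.items()]
    for col, (cls, prop) in SEMANTIC_TYPES.items():
        triples.append((uris[cls], ONT + prop, row[col]))  # literal values
    for src, prop, dst in RELATIONSHIPS:
        triples.append((uris[src], ONT + prop, uris[dst]))  # links between nodes
    return triples


triples = [t for row in rows for t in row_to_triples(row)]
for t in triples[:5]:
    print(t)
```

The semantic types decide which class and property each column feeds, and the relationships connect the resulting nodes. The same source columns could yield a different graph under a different model.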
  • 28. So what? Pedro Szekely CC-By 2.0 28
  • 29. Karma uses semantic models to create Linked Data Pedro Szekely CC-By 2.0 29
  • 30. Karma uses semantic models to create Linked Data. Karma semi-automatically builds semantic models Pedro Szekely CC-By 2.0 30
  • 31. Karma uses semantic models to create Linked Data. Karma semi-automatically builds semantic models … and provides a nice GUI to edit them Pedro Szekely CC-By 2.0 31
  • 32. Karma Slide Demo Pedro Szekely CC-By 2.0 32
  • 33. Load A Source Pedro Szekely CC-By 2.0 33
  • 34. Define URIs Pedro Szekely CC-By 2.0 34
  • 35. Define All URIs Pedro Szekely CC-By 2.0 35
  • 36. Karma Suggests Semantic Types Pedro Szekely CC-By 2.0 36
  • 37. Semantic Model with One Semantic Type Pedro Szekely CC-By 2.0 37
  • 38. Two Semantic Types Pedro Szekely CC-By 2.0 38
  • 39. 3 Semantic Types + 1 Relationship Pedro Szekely CC-By 2.0 39
  • 40. … 5 Minutes Later: Complete Semantic Model Pedro Szekely CC-By 2.0 40
  • 41. So what? Pedro Szekely CC-By 2.0 41
  • 42. Publish Linked Data Pedro Szekely CC-By 2.0 42
  • 43. Linked Data in JSON-LD Format Pedro Szekely CC-By 2.0 43
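The slide itself shows a screenshot, so the actual output is not in this transcript. As a hand-written sketch of what Linked Data in JSON-LD looks like, the example below reuses the resource from slide 14. The `@context`, `@id` and `@type` keywords are standard JSON-LD; the short keys `firstName` and `homepage` are choices made here, and this is not Karma's generated output.

```python
import json

# A JSON-LD document: ordinary JSON plus a context that maps keys to URIs.
doc = {
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "firstName": "foaf:firstName",
        # "@type": "@id" marks homepage values as URIs, not plain strings.
        "homepage": {"@id": "foaf:homepage", "@type": "@id"},
    },
    "@id": "http://szekelys.com/family#pedro",
    "@type": "foaf:Person",
    "firstName": "Pedro",
    "homepage": "http://isi.edu/~szekely",
}

text = json.dumps(doc, indent=2)
print(text)
```

Because the result is plain JSON, it can be loaded by any JSON tooling, which is what makes the JSON-LD storage option on the later slide attractive.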
  • 44. Storing and Maintaining the Data Pedro Szekely CC-By 2.0 44
  • 45. Storage Options (Technology: shortcomings; benefits)
      SPARQL endpoint: shortcomings: low reliability, esoteric, slow; benefits: sophisticated query language
      RDF dump: shortcomings: no query capability, esoteric; benefits: flexibility: clients can download and use in applications, easy to publish
      JSON-LD + ElasticSearch: shortcomings: restricted query language; benefits: very high performance, mainstream technology, easy to publish
      Karma supports the three options
      Pedro Szekely CC-By 2.0 45
  • 46. thanks for your attention https://github.com/usc-isi-i2/Web-Karma Open Source, Apache 2 License CC-By 2.0 46