Best practices for generating linked data
Tutorial @ ICBO 2013
Tutorial Roadmap
Bio2RDF Best Practices
1. Assign a URI for all things
2. Assign labels and identifiers
3. Declare and assign types
4. Provide dataset provenance
1. Assign URIs for all things
● The base Bio2RDF URI pattern:
http://bio2rdf.org/namespace:identifier
● Data provider record identifiers are maintained from source
● Linked Data = no blank nodes!
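The base pattern can be sketched in a few lines of Python. This is only an illustration: the function name bio2rdf_uri is hypothetical and is not part of any Bio2RDF library.

```python
BASE = "http://bio2rdf.org/"

def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Build a URI of the form http://bio2rdf.org/namespace:identifier."""
    if not namespace or not identifier:
        raise ValueError("both a namespace and an identifier are required")
    # The source identifier is kept exactly as the data provider wrote it.
    return f"{BASE}{namespace}:{identifier}"

if __name__ == "__main__":
    print(bio2rdf_uri("drugbank", "DB00650"))
```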
1. Assign URIs for all things
● Data provider record identifiers are maintained from source
○ e.g. the Bio2RDF IRI for DrugBank's Leucovorin record:
http://bio2rdf.org/drugbank:DB00650
1. Assign URIs for all things
● Vocabulary namespaces are used for dataset-specific types and predicates
http://bio2rdf.org/drugbank_vocabulary:Drug
● Resource namespaces are used to assign an identifier when one isn't provided by the source
○ build a unique identifier from a UUID, hash, counter, concatenated strings, etc.
http://bio2rdf.org/drugbank_resource:DB00440_DB00650
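A minimal Python sketch of the two namespace kinds, assuming the _vocabulary and _resource suffixes shown in the URIs above. The function names are hypothetical, and the choice of MD5 for the hashed variant is an assumption made only for illustration.

```python
import hashlib

BASE = "http://bio2rdf.org/"

def vocabulary_uri(namespace: str, term: str) -> str:
    """URI for a dataset-specific type or predicate."""
    return f"{BASE}{namespace}_vocabulary:{term}"

def resource_uri(namespace: str, *parts: str) -> str:
    """Mint an identifier by concatenating source identifiers with '_'."""
    return f"{BASE}{namespace}_resource:{'_'.join(parts)}"

def hashed_resource_uri(namespace: str, text: str) -> str:
    """Mint an identifier from a hash when there is nothing to concatenate."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return f"{BASE}{namespace}_resource:{digest}"

if __name__ == "__main__":
    print(vocabulary_uri("drugbank", "Drug"))
    print(resource_uri("drugbank", "DB00440", "DB00650"))
```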
1. Assign URIs for all things
● All valid namespaces are listed in the Bio2RDF Life Sciences Registry
○ ensures that URIs are consistent across all Bio2RDF datasets
○ registry is publicly available at http://tinyurl.com/dataregistry
2. Assign labels and identifiers
● Use rdfs:label to assign a language-tagged label to every resource
○ can be a source-provided title, a script-generated phrase, or a phrase provided in a third-party dataset
○ Pattern: rdfs:label "label [ns:id]"@lang
● Use Dublin Core predicates for source-provided labels and identifiers
○ Pattern: dc:title "label"@lang (assign a language tag only when one is provided)
○ Pattern: dc:identifier "ns:id"^^xsd:string
2. Assign labels and identifiers
● Use Bio2RDF predicates to assign Bio2RDF namespace and Bio2RDF identifiers:
○ Pattern: bio2rdf_vocabulary:namespace "ns"^^xsd:string
○ Pattern: bio2rdf_vocabulary:identifier "id"^^xsd:string
2. Assign labels and identifiers
Example: DrugBank entry for Nitrazepam
drugbank:DB0159
    rdfs:label "Nitrazepam [drugbank:DB0159]"@en ;
    dc:title "Nitrazepam"@en ;
    dc:identifier "drugbank:DB0159"^^xsd:string ;
    bio2rdf_vocabulary:namespace "drugbank"^^xsd:string ;
    bio2rdf_vocabulary:identifier "DB0159"^^xsd:string .
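The label and identifier patterns can be generated mechanically. The Python sketch below is one possible way to do it, not the Bio2RDF implementation: the function names are hypothetical, prefix declarations are omitted, and the escaping covers only backslashes and double quotes.

```python
def escape(text: str) -> str:
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')

def describe(ns: str, ident: str, title: str, lang: str = "en") -> str:
    """Return Turtle applying the label and identifier patterns to one record."""
    t = escape(title)
    lines = [
        f"{ns}:{ident}",
        f'    rdfs:label "{t} [{ns}:{ident}]"@{lang} ;',
        f'    dc:title "{t}"@{lang} ;',
        f'    dc:identifier "{ns}:{ident}"^^xsd:string ;',
        f'    bio2rdf_vocabulary:namespace "{ns}"^^xsd:string ;',
        f'    bio2rdf_vocabulary:identifier "{ident}"^^xsd:string .',
    ]
    return "\n".join(lines)

if __name__ == "__main__":
    print(describe("drugbank", "DB0159", "Nitrazepam"))
```

Called with the Nitrazepam values, the function reproduces the statements of the example above.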
3. Declare and assign types
● All resources should be typed as being resources of the dataset
○ Pattern: rdf:type namespace_vocabulary:Resource
● Instances of a dataset vocabulary type should also be typed as owl:NamedIndividual
○ Pattern: rdf:type namespace_vocabulary:Type
○ Pattern: rdf:type owl:NamedIndividual
● Classes should be typed as owl:Class
○ Pattern: rdf:type owl:Class
○ If the superclass has been described using the namespace_vocabulary pattern, link the class to it using rdfs:subClassOf
3. Declare and assign types
● Object properties and datatype properties should also be typed
○ Pattern: rdf:type owl:ObjectProperty
○ Pattern: rdf:type owl:DatatypeProperty
● Examples:
drugbank:DB0159
    rdf:type drugbank_vocabulary:Resource ;
    rdf:type owl:Class ;
    rdfs:subClassOf drugbank_vocabulary:Drug .
drugbank_vocabulary:ddi-interactor-in
    rdf:type owl:ObjectProperty .
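The typing rules can be captured in a small helper. The Python sketch below is an assumption-laden illustration: the function name and the kind labels are hypothetical, and properties receive only their OWL type because that is what the example above shows.

```python
OWL_TYPES = {
    "class": "owl:Class",
    "individual": "owl:NamedIndividual",
    "object_property": "owl:ObjectProperty",
    "datatype_property": "owl:DatatypeProperty",
}

def type_statements(subject, namespace, kind, related=None):
    """Return the rdf:type (and rdfs:subClassOf) statements for one entity.

    `related` is the dataset vocabulary type of an individual, or the
    superclass of a class; it is ignored for properties.
    """
    if kind not in OWL_TYPES:
        raise ValueError(f"unknown kind: {kind}")
    statements = []
    if kind in ("class", "individual"):
        statements.append(f"{subject} rdf:type {namespace}_vocabulary:Resource .")
    if kind == "individual" and related:
        statements.append(f"{subject} rdf:type {related} .")
    statements.append(f"{subject} rdf:type {OWL_TYPES[kind]} .")
    if kind == "class" and related:
        statements.append(f"{subject} rdfs:subClassOf {related} .")
    return statements

if __name__ == "__main__":
    for line in type_statements("drugbank:DB0159", "drugbank", "class",
                                "drugbank_vocabulary:Drug"):
        print(line)
```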
4. Provide dataset provenance
[Diagram: data item → void:inDataset → Bio2RDF dataset → prov:wasDerivedFrom → Source dataset]
Features
● Entity-dataset link
● Creator
● Publisher
● Date created
● License & rights
● Source
● Availability
○ SPARQL endpoint
○ Data dump
Vocabularies
● VoID
● Dublin Core
● W3C Provenance
● Bio2RDF vocabulary
4. Provide dataset provenance
● Link every resource to the versioned/dated Bio2RDF dataset in which it is described
○ Pattern: void:inDataset <http://bio2rdf.org/dataset:namespace-dd-mm-yyyy.rdf>
○ Example:
drugbank:DB0159 void:inDataset <http://bio2rdf.org/dataset:drugbank-03-07-2013> .
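A short Python sketch of the entity-to-dataset link, with hypothetical function names. It follows the form of the example above, which has no .rdf suffix, and it assumes the date is written day-month-year as in the pattern.

```python
from datetime import date

def dataset_uri(namespace: str, released: date) -> str:
    """URI of the dated Bio2RDF dataset, using a dd-mm-yyyy date."""
    return f"http://bio2rdf.org/dataset:{namespace}-{released.strftime('%d-%m-%Y')}"

def in_dataset_statement(resource: str, namespace: str, released: date) -> str:
    """Link one resource to the dataset in which it is described."""
    return f"{resource} void:inDataset <{dataset_uri(namespace, released)}> ."

if __name__ == "__main__":
    print(in_dataset_statement("drugbank:DB0159", "drugbank", date(2013, 7, 3)))
```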
A crash course in PHP
PHP: Hypertext Preprocessor
● A general-purpose open source scripting language
○ homepage: http://php.net
● PHP scripts can be executed from the command line or embedded in HTML documents
● Syntactically similar to C/C++/Java, but not strongly typed
A hello world PHP script
● PHP code is enclosed in the <?php and ?> tags (the closing ?> tag can be omitted in files that contain only PHP)
Declaring and instantiating classes
Using the Bio2RDF PHP API to create an RDFizer
● Basic structure of a Bio2RDFizer script:
○ Initialize script parameters: input file(s), default dataset namespace, etc.
○ Define a Run() function that handles downloading and iterating over input files, as well as function calls to parse and convert input data to RDF
○ Define function(s) to convert input data to RDF using Bio2RDF API helper functions
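The three-part structure can be mirrored in Python to show how the pieces fit together. This is an analogy only, not the Bio2RDF PHP API: every name is hypothetical, the tab-separated input with id and name columns is an assumption, and the download step is replaced by a caller-supplied function that opens an input file.

```python
import csv
import io

RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"

def convert_record(namespace, record):
    """Convert one input record into a list of N-Triples statements."""
    ident, name = record["id"], record["name"]
    subject = f"<http://bio2rdf.org/{namespace}:{ident}>"
    label = f"{name} [{namespace}:{ident}]"
    return [f'{subject} {RDFS_LABEL} "{label}"@en .']

def run(params, open_input):
    """Iterate over the input files and convert every record to RDF."""
    triples = []
    for filename in params["files"]:
        with open_input(filename) as handle:
            for record in csv.DictReader(handle, delimiter="\t"):
                triples.extend(convert_record(params["namespace"], record))
    return triples

if __name__ == "__main__":
    # Script parameters: input file(s) and default dataset namespace.
    params = {"namespace": "drugbank", "files": ["drugs.tsv"]}
    sample = "id\tname\nDB00650\tLeucovorin\n"
    # An in-memory file stands in for a downloaded one.
    for line in run(params, lambda _name: io.StringIO(sample)):
        print(line)
```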
Using the Bio2RDF PHP API to create an RDFizer
● Bio2RDF PHP API defines helper functions that implement Bio2RDF best practices:
○ getNamespace()
○ getVoc()
○ getRes()
○ triplify($subject, $predicate, $object) //object is an rdf resource
○ triplifyString($subject, $predicate, "string")// object is a literal
○ describeIndividual($uri, $label, $type, $title, $description, $language)
○ describeClass( ... )
○ describeProperty ( ... )
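To give a feel for what helpers of this kind might do, here is a Python sketch of two of them. It is not the PHP API and makes no claim about how that API is implemented: the names triplify and triplify_string only echo the list above, and expanding every prefixed name under http://bio2rdf.org/ is an assumption.

```python
BASE = "http://bio2rdf.org/"

def expand(qname: str) -> str:
    """Turn 'namespace:identifier' into a full IRI in angle brackets."""
    return f"<{BASE}{qname}>"

def triplify(subject: str, predicate: str, obj: str) -> str:
    """Statement whose object is a resource."""
    return f"{expand(subject)} {expand(predicate)} {expand(obj)} .\n"

def triplify_string(subject: str, predicate: str, text: str, lang=None) -> str:
    """Statement whose object is a literal, with an optional language tag."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    suffix = f"@{lang}" if lang else ""
    return f'{expand(subject)} {expand(predicate)} "{escaped}"{suffix} .\n'

if __name__ == "__main__":
    # The identifiers below are taken from earlier slides for illustration only.
    print(triplify("drugbank:DB00650",
                   "drugbank_vocabulary:ddi-interactor-in",
                   "drugbank_resource:DB00440_DB00650"), end="")
```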
Example: The Comparative Toxicogenomics Database
CTD Bio2RDFizer script is available on GitHub
Using and contributing to the Bio2RDF project on GitHub
1. Fork the bio2rdf-scripts and php-lib repositories on GitHub
https://help.github.com/articles/fork-a-repo
2. Write some code!
3. Commit code to your fork
4. Make a pull request to the bio2rdf-scripts repo

More Related Content

What's hot

Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphsandyseaborne
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFSNilesh Wagmare
 
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod LacoulShamod Lacoul
 
RDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itRDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itJose Luis Lopez Pino
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesJose Emilio Labra Gayo
 
Efficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesEfficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesAlexandra Roatiș
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introductionGraphity
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDFNarni Rajesh
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapesJose Emilio Labra Gayo
 
RDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesRDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesMarin Dimitrov
 

What's hot (19)

Getting triples from records: the role of ISBD
Getting triples from records: the role of ISBDGetting triples from records: the role of ISBD
Getting triples from records: the role of ISBD
 
Data shapes-test-suite
Data shapes-test-suiteData shapes-test-suite
Data shapes-test-suite
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
Two graph data models : RDF and Property Graphs
Two graph data models : RDF and Property GraphsTwo graph data models : RDF and Property Graphs
Two graph data models : RDF and Property Graphs
 
RDFa Tutorial
RDFa TutorialRDFa Tutorial
RDFa Tutorial
 
java programming
java programmingjava programming
java programming
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFS
 
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
"RDFa - what, why and how?" by Mike Hewett and Shamod Lacoul
 
RDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use itRDFa: introduction, comparison with microdata and microformats and how to use it
RDFa: introduction, comparison with microdata and microformats and how to use it
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectives
 
Efficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF DatabasesEfficient Query Answering against Dynamic RDF Databases
Efficient Query Answering against Dynamic RDF Databases
 
5 rdfs
5 rdfs5 rdfs
5 rdfs
 
Semantic Web introduction
Semantic Web introductionSemantic Web introduction
Semantic Web introduction
 
Introduction to SPARQL
Introduction to SPARQLIntroduction to SPARQL
Introduction to SPARQL
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDF
 
ShEx by Example
ShEx by ExampleShEx by Example
ShEx by Example
 
Challenges and applications of RDF shapes
Challenges and applications of RDF shapesChallenges and applications of RDF shapes
Challenges and applications of RDF shapes
 
RDF validation tutorial
RDF validation tutorialRDF validation tutorial
RDF validation tutorial
 
RDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesRDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic Repositories
 

Viewers also liked

Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge SystemBio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge SystemFrançois Belleau
 
As Outline
As OutlineAs Outline
As Outlinedc1
 
What's up with Prototype and script.aculo.us?
What's up with Prototype and script.aculo.us?What's up with Prototype and script.aculo.us?
What's up with Prototype and script.aculo.us?Christophe Porteneuve
 
Email Delivery Support
Email Delivery SupportEmail Delivery Support
Email Delivery Supportrobbie2629
 
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?Charles Nouyrit
 
Compa 2009 Giurus
Compa 2009 GiurusCompa 2009 Giurus
Compa 2009 Giurusgiurus
 
Sardsos more than a map, the role of the community in osm SOTMEU 2014
Sardsos more than a map, the role of the community in osm SOTMEU 2014Sardsos more than a map, the role of the community in osm SOTMEU 2014
Sardsos more than a map, the role of the community in osm SOTMEU 2014Francesca Murtas
 
Info literacy and social media in a public library
Info literacy and social media in a public libraryInfo literacy and social media in a public library
Info literacy and social media in a public librarySue Lawson
 
Visual Public Communication And Art
Visual Public Communication And ArtVisual Public Communication And Art
Visual Public Communication And ArtFrancesca Murtas
 
DevOps D-Day - Streamline DevOps workflows with APIs
DevOps D-Day - Streamline DevOps workflows with APIsDevOps D-Day - Streamline DevOps workflows with APIs
DevOps D-Day - Streamline DevOps workflows with APIsJerome Louvel
 
Best Practice Solutions for Frequest Ajax Use Cases With Prototype
Best Practice Solutions for Frequest Ajax Use Cases With PrototypeBest Practice Solutions for Frequest Ajax Use Cases With Prototype
Best Practice Solutions for Frequest Ajax Use Cases With PrototypeChristophe Porteneuve
 

Viewers also liked (20)

Querying Bio2RDF data
Querying Bio2RDF dataQuerying Bio2RDF data
Querying Bio2RDF data
 
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge SystemBio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System
Bio2RDF: Towards A Mashup To Build Bioinformatics Knowledge System
 
Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009
 
As Outline
As OutlineAs Outline
As Outline
 
What's up with Prototype and script.aculo.us?
What's up with Prototype and script.aculo.us?What's up with Prototype and script.aculo.us?
What's up with Prototype and script.aculo.us?
 
Email Delivery Support
Email Delivery SupportEmail Delivery Support
Email Delivery Support
 
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?
Ignite Paris 2009 - Is World of Warcraft the best leadership training solution?
 
Compa 2009 Giurus
Compa 2009 GiurusCompa 2009 Giurus
Compa 2009 Giurus
 
Sardsos more than a map, the role of the community in osm SOTMEU 2014
Sardsos more than a map, the role of the community in osm SOTMEU 2014Sardsos more than a map, the role of the community in osm SOTMEU 2014
Sardsos more than a map, the role of the community in osm SOTMEU 2014
 
Gezinsbond
GezinsbondGezinsbond
Gezinsbond
 
Info literacy and social media in a public library
Info literacy and social media in a public libraryInfo literacy and social media in a public library
Info literacy and social media in a public library
 
Visual Public Communication And Art
Visual Public Communication And ArtVisual Public Communication And Art
Visual Public Communication And Art
 
DevOps D-Day - Streamline DevOps workflows with APIs
DevOps D-Day - Streamline DevOps workflows with APIsDevOps D-Day - Streamline DevOps workflows with APIs
DevOps D-Day - Streamline DevOps workflows with APIs
 
Best Practice Solutions for Frequest Ajax Use Cases With Prototype
Best Practice Solutions for Frequest Ajax Use Cases With PrototypeBest Practice Solutions for Frequest Ajax Use Cases With Prototype
Best Practice Solutions for Frequest Ajax Use Cases With Prototype
 
Vertsol Report
Vertsol ReportVertsol Report
Vertsol Report
 
Docker wjax2014
Docker wjax2014Docker wjax2014
Docker wjax2014
 
Thesis 1 4
Thesis 1 4Thesis 1 4
Thesis 1 4
 
Nilai nilai Aqidah
Nilai nilai AqidahNilai nilai Aqidah
Nilai nilai Aqidah
 
Clutrain Ppt
Clutrain PptClutrain Ppt
Clutrain Ppt
 
RIM Conference
RIM ConferenceRIM Conference
RIM Conference
 

Similar to Best practices for generating Bio2RDF linked data

GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelabCAMELIA BOBAN
 
Exploring Oracle Database 12c Multitenant best practices for your Cloud
Exploring Oracle Database 12c Multitenant best practices for your CloudExploring Oracle Database 12c Multitenant best practices for your Cloud
Exploring Oracle Database 12c Multitenant best practices for your Clouddyahalom
 
Hooking up Semantic MediaWiki with external tools via SPARQL
Hooking up Semantic MediaWiki with external tools via SPARQLHooking up Semantic MediaWiki with external tools via SPARQL
Hooking up Semantic MediaWiki with external tools via SPARQLSamuel Lampa
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Michel Dumontier
 
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)Rensselaer Polytechnic Institute
 
Php training in_noida
Php training in_noidaPhp training in_noida
Php training in_noidaTech Mentro
 
Keep your repo clean
Keep your repo cleanKeep your repo clean
Keep your repo cleanHector Canto
 
Lifting the Lid on Linked Data
Lifting the Lid on Linked DataLifting the Lid on Linked Data
Lifting the Lid on Linked DataJane Stevenson
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And VisualizationIvan Ermilov
 
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Ontotext
 
Dublin Core Description Set Profiles
Dublin Core Description Set ProfilesDublin Core Description Set Profiles
Dublin Core Description Set ProfilesPete Johnston
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataMetaSolutions AB
 
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)Rensselaer Polytechnic Institute
 
W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2nolmar01
 
Health Datapalooza 2013: Open Government Data - Natasha Noy
Health Datapalooza 2013: Open Government Data - Natasha NoyHealth Datapalooza 2013: Open Government Data - Natasha Noy
Health Datapalooza 2013: Open Government Data - Natasha NoyHealth Data Consortium
 

Similar to Best practices for generating Bio2RDF linked data (20)

GDG Meets U event - Big data & Wikidata - no lies codelab
GDG Meets U event - Big data & Wikidata -  no lies codelabGDG Meets U event - Big data & Wikidata -  no lies codelab
GDG Meets U event - Big data & Wikidata - no lies codelab
 
Exploring Oracle Database 12c Multitenant best practices for your Cloud
Exploring Oracle Database 12c Multitenant best practices for your CloudExploring Oracle Database 12c Multitenant best practices for your Cloud
Exploring Oracle Database 12c Multitenant best practices for your Cloud
 
Hooking up Semantic MediaWiki with external tools via SPARQL
Hooking up Semantic MediaWiki with external tools via SPARQLHooking up Semantic MediaWiki with external tools via SPARQL
Hooking up Semantic MediaWiki with external tools via SPARQL
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
 
Data in RDF
Data in RDFData in RDF
Data in RDF
 
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)
Engineering a Semantic Web: ITWS Capstone Lecture (Spring 2014)
 
Php training in_noida
Php training in_noidaPhp training in_noida
Php training in_noida
 
Keep your repo clean
Keep your repo cleanKeep your repo clean
Keep your repo clean
 
Lifting the Lid on Linked Data
Lifting the Lid on Linked DataLifting the Lid on Linked Data
Lifting the Lid on Linked Data
 
Data Integration And Visualization
Data Integration And VisualizationData Integration And Visualization
Data Integration And Visualization
 
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018
 
Dublin Core Description Set Profiles
Dublin Core Description Set ProfilesDublin Core Description Set Profiles
Dublin Core Description Set Profiles
 
Introduction to Bio SPARQL
Introduction to Bio SPARQL Introduction to Bio SPARQL
Introduction to Bio SPARQL
 
Nobel Prizes as Linked Open Data
Nobel Prizes as Linked Open DataNobel Prizes as Linked Open Data
Nobel Prizes as Linked Open Data
 
Xiaoli Li: MARC to BIBFRAME (Linked Data)
Xiaoli Li: MARC to BIBFRAME (Linked Data)Xiaoli Li: MARC to BIBFRAME (Linked Data)
Xiaoli Li: MARC to BIBFRAME (Linked Data)
 
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
ITWS 4310: Building and Consuming the Web of Data (Fall 2013)
 
How To Recoord
How To RecoordHow To Recoord
How To Recoord
 
W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2W4 4 marc-alexandre-nolin-v2
W4 4 marc-alexandre-nolin-v2
 
Expanding the content categories at JaLC
Expanding the content categories at JaLCExpanding the content categories at JaLC
Expanding the content categories at JaLC
 
Health Datapalooza 2013: Open Government Data - Natasha Noy
Health Datapalooza 2013: Open Government Data - Natasha NoyHealth Datapalooza 2013: Open Government Data - Natasha Noy
Health Datapalooza 2013: Open Government Data - Natasha Noy
 

Recently uploaded

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 

Recently uploaded (20)

My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 

Best practices for generating Bio2RDF linked data

  • 1. Best practices for generating linked data Tutorial @ ICBO 2013
  • 3. Bio2RDF Best Practices 1. Assign a URI for all things 2. Assign labels and identifiers 3. Declare and assign types 4. Provide dataset provenance
  • 4. 1. Assign URIs for all things ● The base Bio2RDF URI pattern: http://bio2rdf.org/namespace:identifier ● Data provider record identifiers are maintained from source ● Linked Data = no blank nodes!
  • 5. 1. Assign URIs for all things ● Data provider records are maintained from source ○ e.g. DrugBank’s resource IRI for Leucovorin http://bio2rdf.org/drugbank:DB00650
  • 6. 1. Assign URIs for all things ● Vocabulary namespaces are used for dataset specific types and predicates http://bio2rdf.org/drugbank_vocabulary:Drug ● Resource namespaces are used to assign an identifier when one isn't a provided by the source - unique identifier with UUID, hash, counter, concatenated strings, etc http://bio2rdf.org/drugbank_resource:DB00440_DB00650
  • 7. 1. Assign URIs for all things
      ● All valid namespaces are listed in the Bio2RDF Life Sciences Registry
        ○ ensures that URIs are consistent across all Bio2RDF datasets
        ○ registry is publicly available at http://tinyurl.com/dataregistry
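
The three URI patterns above (record, vocabulary, resource) can be sketched as small string-building helpers. This is a minimal illustration in Python, not the Bio2RDF PHP API: the function names are hypothetical, and using an MD5 hash for source-less identifiers is just one of the options the slides list (UUID, hash, counter, concatenated strings).

```python
import hashlib

BASE = "http://bio2rdf.org/"


def record_uri(namespace: str, identifier: str) -> str:
    """URI for a record whose identifier comes from the data provider."""
    return f"{BASE}{namespace}:{identifier}"


def vocabulary_uri(namespace: str, term: str) -> str:
    """URI for a dataset-specific type or predicate."""
    return f"{BASE}{namespace}_vocabulary:{term}"


def resource_uri(namespace: str, *parts: str) -> str:
    """URI for a thing the source does not identify, built by concatenating strings."""
    return f"{BASE}{namespace}_resource:{'_'.join(parts)}"


def hashed_resource_uri(namespace: str, text: str) -> str:
    """Alternative (assumption): derive the local identifier from an MD5 hash of some text."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return f"{BASE}{namespace}_resource:{digest}"


if __name__ == "__main__":
    print(record_uri("drugbank", "DB00650"))
    print(vocabulary_uri("drugbank", "Drug"))
    print(resource_uri("drugbank", "DB00440", "DB00650"))
```

Whichever scheme is used for resource identifiers, it needs to be deterministic so that regenerating the dataset yields the same URIs instead of falling back on blank nodes.
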
  • 8. 2. Assign labels and identifiers
      ● Use rdfs:label to assign a language-specified label for all resources
        ○ can be a source-provided title, a script-generated phrase, or a phrase provided in a third-party dataset
        ○ Pattern: rdfs:label "label [ns:id]"@lang
      ● Use Dublin Core predicates for source-provided labels and identifiers
        ○ Pattern: dc:title "label"@lang (assign language tag only when one is provided)
        ○ Pattern: dc:identifier "ns:id"^^xsd:string
  • 9. 2. Assign labels and identifiers
      ● Use Bio2RDF predicates to assign Bio2RDF namespace and Bio2RDF identifiers:
        ○ Pattern: bio2rdf_vocabulary:namespace "ns"^^xsd:string
        ○ Pattern: bio2rdf_vocabulary:identifier "id"^^xsd:string
  • 10. 2. Assign labels and identifiers
      Example: DrugBank entry for Nitrazepam
        drugbank:DB0159
          rdfs:label "Nitrazepam [drugbank:DB0159]"@en ;
          dc:title "Nitrazepam"@en ;
          dc:identifier "drugbank:DB0159"^^xsd:string ;
          bio2rdf_vocabulary:namespace "drugbank"^^xsd:string ;
          bio2rdf_vocabulary:identifier "DB0159"^^xsd:string .
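
The label and identifier patterns can be generated mechanically from a namespace, an identifier and a title. The following is a minimal sketch in Python, assuming the prefixes (rdfs, dc, xsd, bio2rdf_vocabulary, drugbank) are declared elsewhere; `literal` and `describe_labels` are hypothetical helper names, not functions of the Bio2RDF PHP API.

```python
from typing import Optional


def literal(value: str, lang: Optional[str] = None, datatype: Optional[str] = None) -> str:
    """Format a Turtle literal with an optional language tag or datatype."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if lang:
        return f'"{escaped}"@{lang}'
    if datatype:
        return f'"{escaped}"^^{datatype}'
    return f'"{escaped}"'


def describe_labels(ns: str, identifier: str, title: str,
                    label_lang: str = "en", title_lang: Optional[str] = None) -> str:
    """Emit the rdfs:label, dc:title, dc:identifier and Bio2RDF namespace/identifier statements."""
    qname = f"{ns}:{identifier}"
    pairs = [
        ("rdfs:label", literal(f"{title} [{qname}]", lang=label_lang)),
        # The title only gets a language tag when the source provides one.
        ("dc:title", literal(title, lang=title_lang)),
        ("dc:identifier", literal(qname, datatype="xsd:string")),
        ("bio2rdf_vocabulary:namespace", literal(ns, datatype="xsd:string")),
        ("bio2rdf_vocabulary:identifier", literal(identifier, datatype="xsd:string")),
    ]
    body = " ;\n    ".join(f"{predicate} {obj}" for predicate, obj in pairs)
    return f"{qname}\n    {body} ."


if __name__ == "__main__":
    print(describe_labels("drugbank", "DB0159", "Nitrazepam", title_lang="en"))
```

Calling it with the values from the Nitrazepam example reproduces the five statements shown on the slide.
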
  • 11. 3. Declare and assign types
      ● All resources should be typed as being resources of the dataset
        ○ Pattern: rdf:type namespace_vocabulary:Resource
      ● Instances of a dataset vocabulary type should also be typed as owl:NamedIndividual
        ○ Pattern: rdf:type namespace_vocabulary:Type
        ○ Pattern: rdf:type owl:NamedIndividual
      ● Classes should be typed as owl:Class
        ○ Pattern: rdf:type owl:Class
        ○ If the superclass has been described using the namespace_vocabulary pattern, then link the class using rdfs:subClassOf
  • 12. 3. Declare and assign types
      ● Object properties and datatype properties should also be typed
        ○ Pattern: rdf:type owl:ObjectProperty
        ○ Pattern: rdf:type owl:DatatypeProperty
      ● Examples:
          drugbank:DB0159
            rdf:type drugbank_vocabulary:Resource ;
            rdf:type owl:Class ;
            rdfs:subClassOf drugbank_vocabulary:Drug .
          drugbank_vocabulary:ddi-interactor-in
            rdf:type owl:ObjectProperty .
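
The typing rules can be summarized as a function that returns the type statements for a resource, depending on whether it is treated as an individual or as a class. This is a minimal Python sketch; `type_triples` and its `kind` parameter are hypothetical and only illustrate the patterns above.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]


def type_triples(ns: str, identifier: str, kind: str,
                 type_name: Optional[str] = None,
                 superclass: Optional[str] = None) -> List[Triple]:
    """Return the rdf:type / rdfs:subClassOf statements for one resource.

    kind is "individual" or "class" (an assumption of this sketch).
    """
    subject = f"{ns}:{identifier}"
    # Every resource is typed as a resource of its dataset.
    triples: List[Triple] = [(subject, "rdf:type", f"{ns}_vocabulary:Resource")]
    if kind == "individual":
        if type_name is None:
            raise ValueError("an individual needs a dataset vocabulary type")
        triples.append((subject, "rdf:type", f"{ns}_vocabulary:{type_name}"))
        triples.append((subject, "rdf:type", "owl:NamedIndividual"))
    elif kind == "class":
        triples.append((subject, "rdf:type", "owl:Class"))
        if superclass is not None:
            triples.append((subject, "rdfs:subClassOf", f"{ns}_vocabulary:{superclass}"))
    else:
        raise ValueError(f"unknown kind: {kind}")
    return triples


if __name__ == "__main__":
    for s, p, o in type_triples("drugbank", "DB0159", "class", superclass="Drug"):
        print(f"{s} {p} {o} .")
```
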
  • 13. 4. Provide dataset provenance
      [Diagram: data item → (void:inDataset) → Bio2RDF dataset → (prov:wasDerivedFrom) → Source dataset]
      Features:
        - Entity-dataset link
        - Creator
        - Publisher
        - Date created
        - License & rights
        - Source
        - Availability
          - SPARQL endpoint
          - Data dump
      Vocabularies:
        - VoID
        - Dublin Core
        - W3C Provenance
        - Bio2RDF vocabulary
  • 14. 4. Provide dataset provenance
      ● Link every resource to the versioned/dated Bio2RDF dataset in which it is described
        ○ Pattern: void:inDataset <http://bio2rdf.org/dataset:namespace-dd-mm-yyyy.rdf>
        ○ Example: drugbank:DB0159 void:inDataset <http://bio2rdf.org/dataset:drugbank-03-07-2013> .
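
The dated dataset URI can be derived from the namespace and the release date. The sketch below is a Python illustration with hypothetical function names; it follows the example on the slide, which omits the `.rdf` suffix shown in the pattern, and that choice is an assumption.

```python
from datetime import date


def dataset_uri(namespace: str, released: date) -> str:
    """Build the dated Bio2RDF dataset URI (dd-mm-yyyy, no .rdf suffix, as in the slide's example)."""
    return f"http://bio2rdf.org/dataset:{namespace}-{released.strftime('%d-%m-%Y')}"


def in_dataset_triple(subject: str, namespace: str, released: date) -> str:
    """Link one resource to the dated dataset in which it is described."""
    return f"{subject} void:inDataset <{dataset_uri(namespace, released)}> ."


if __name__ == "__main__":
    print(in_dataset_triple("drugbank:DB0159", "drugbank", date(2013, 7, 3)))
```
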
  • 15. A crash course in PHP
  • 16. PHP: Hypertext Preprocessor
      ● A general-purpose open source scripting language
        ○ homepage: http://php.net
      ● PHP scripts can be executed from the command line or embedded in HTML documents
      ● Syntactically similar to C/C++/Java but it is not strongly typed
  • 17. A hello world PHP script
      ● All PHP scripts are surrounded by the <?php and ?> tags
  • 19. Using the Bio2RDF PHP API to create an RDFizer
      ● Basic structure of a Bio2RDFizer script:
        ○ Initialize script parameters: input file(s), default dataset namespace, etc.
        ○ Define a Run() function that handles downloading and iterating over input files, as well as function calls to parse and convert input data to RDF
        ○ Define function(s) to convert input data to RDF using Bio2RDF API helper functions
  • 20. Using the Bio2RDF PHP API to create an RDFizer
      ● Bio2RDF PHP API defines helper functions that implement Bio2RDF best practices:
        ○ getNamespace()
        ○ getVoc()
        ○ getRes()
        ○ triplify($subject, $predicate, $object) // object is an RDF resource
        ○ triplifyString($subject, $predicate, "string") // object is a literal
        ○ describeIndividual($uri, $label, $type, $title, $description, $language)
        ○ describeClass( ... )
        ○ describeProperty( ... )
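
The script structure described above (initialize parameters, a run step that iterates over input, and a convert step that calls helper functions) can be sketched as follows. This is a Python stand-in written only to show the shape of such a script; it is not the Bio2RDF PHP API. The class `MiniRdfizer`, the prefix strings its helpers return, and the `id`/`name` input columns are all assumptions.

```python
import csv
import io


class MiniRdfizer:
    """Hypothetical Python analogue of a Bio2RDFizer script."""

    def __init__(self, namespace: str):
        # Script parameters: here only the default dataset namespace.
        self.namespace = namespace
        self.lines = []

    # Prefix helpers, loosely mirroring getNamespace(), getVoc() and getRes().
    def get_namespace(self) -> str:
        return f"{self.namespace}:"

    def get_voc(self) -> str:
        return f"{self.namespace}_vocabulary:"

    def get_res(self) -> str:
        return f"{self.namespace}_resource:"

    def triplify(self, subject: str, predicate: str, obj: str) -> None:
        """Record a statement whose object is a resource."""
        self.lines.append(f"{subject} {predicate} {obj} .")

    def triplify_string(self, subject: str, predicate: str, text: str) -> None:
        """Record a statement whose object is a literal."""
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        self.lines.append(f'{subject} {predicate} "{escaped}" .')

    def run(self, handle) -> str:
        """Iterate over a tab-separated input and convert each row."""
        for row in csv.DictReader(handle, delimiter="\t"):
            self.convert(row)
        return "\n".join(self.lines)

    def convert(self, row: dict) -> None:
        """Convert one input row to statements using the helpers."""
        subject = self.get_namespace() + row["id"]
        self.triplify(subject, "rdf:type", self.get_voc() + "Resource")
        self.triplify_string(subject, "dc:title", row["name"])


if __name__ == "__main__":
    sample = "id\tname\nDB00650\tLeucovorin\n"
    print(MiniRdfizer("drugbank").run(io.StringIO(sample)))
```

A real Bio2RDFizer would also download the source files and write the output to disk; those steps are left out here to keep the sketch short.
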
  • 21. Example: The Comparative Toxicogenomics Database (CTD)
      ● The CTD Bio2RDFizer script is available on GitHub
  • 22. Using and contributing to the Bio2RDF project on GitHub
  • 23. Using and contributing to the Bio2RDF project on GitHub
      1. Fork the bio2rdf-scripts and php-lib repositories on GitHub (https://help.github.com/articles/fork-a-repo)
      2. Write some code!
      3. Commit code to your fork
      4. Make a pull request to the bio2rdf-scripts repo