Bringing semantic publishing into TEI: ideas and pointers
Silvio Peroni
Fabio Vitali
Department of Computer Science and Engineering
University of Bologna
Italy
3. Semantic Web / Open Linked Data
Yet another definition of Semantic Web:
The evolution of the World Wide Web encompassing the integration of the
WWW with formal semantics to:
• enable visualisation and elaboration of complex data
• provide languages (e.g., OWL) to formalise the meaning of data (e.g.,
using description logics)
Yet another definition of Open Linked Data:
The incremental implementation of many layers of semantics of data
released to the Commons:
• Structured and semi-structured data
• Abstraction and conceptualisation of data
• Inferences on data
4. Semantic publishing
« anything that
• enhances the meaning of a published journal article,
• facilitates its automated discovery,
• enables its linking to semantically related articles,
• provides access to data within the article in actionable form, or
• facilitates integration of data between papers.
Among other things, it involves enriching the article with appropriate
metadata that
• are amenable to automated processing and analysis,
• allowing enhanced verifiability of published information and
• providing the capacity for automated discovery and summarization »
Shotton, D. (2009). Semantic publishing: the coming revolution in scientific
journal publishing. Learned Publishing, 22(2): 85–94. DOI: 10.1087/2009202
5. Why Semantic Publishing?
• Increase the intrinsic value of publications
• Increase the richness of information, understanding and knowledge that can be extracted from publications
• Enable the development of additional services
• Integrate information from multiple enhanced articles
• Provide additional business opportunities for publishers
6. Goals of semantic publishing
• Evaluating the pertinence of a document to a scientific field
• Discovering research trends and propagation of research findings
• Tracking of research activities, institutions and disciplines
• Analysing quantitative aspects of the output of researchers
• Evaluating the multi-disciplinarity of the output of scholars
• Measuring positive/negative citations to a particular work
• Designing and including algorithms to compute metric indicators
• Helping end users find materials related to a topic and/or article
• Evaluating the social acceptability of the scientific production
• Enabling users to annotate documents with related semantic data
• Querying (semantic) bibliographic data
7. SPAR
• One of the most complete sets of ontologies for describing scholarly objects
• It uses:
– A common vocabulary of terms
– External metadata schemas (SKOS, PRISM, DC)
– FRBR concepts to distinguish between work, version, edition and copy
– Document components
– Roles of people, status of documents and publishing workflows
– Citations, citation contexts, reference lists
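As an illustration of how typed citations could support the goal of measuring positive and negative citations, here is a minimal Python sketch. It assumes citations are available as (citing, property, cited) triples using property names from CiTO, the citation typing ontology in the SPAR family; the paper identifiers and the grouping of properties into positive and negative are assumptions made for this example, not something the slides prescribe.

```python
from collections import Counter

# Hypothetical citation triples: (citing paper, CiTO property, cited paper).
# The paper identifiers are invented for illustration.
CITATIONS = [
    ("paper:A", "cito:supports", "paper:X"),
    ("paper:B", "cito:disputes", "paper:X"),
    ("paper:C", "cito:extends", "paper:X"),
    ("paper:C", "cito:cites", "paper:Y"),
]

# Assumed polarity grouping; a real analysis would choose its own.
POSITIVE = {"cito:supports", "cito:extends", "cito:confirms"}
NEGATIVE = {"cito:disputes", "cito:critiques", "cito:refutes"}


def citation_polarity(cited, triples=CITATIONS):
    """Count positive, negative and neutral citations received by `cited`."""
    counts = Counter()
    for _citing, prop, target in triples:
        if target != cited:
            continue
        if prop in POSITIVE:
            counts["positive"] += 1
        elif prop in NEGATIVE:
            counts["negative"] += 1
        else:
            counts["neutral"] += 1
    return dict(counts)
```

Calling `citation_polarity("paper:X")` tallies the three typed citations to that paper by polarity; untyped `cito:cites` links are counted as neutral.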
8. Semantic lenses
• Particular points of view on scholarly entities
• Contextual data:
– Research context
– Roles and contribution
– Publishing context
• Content data:
– Text:
• Text structure
• Rhetoric
– Message:
• Argumentation
• Citation network
• Textual semantics
9. An example
The Tempest by William Shakespeare
as available in the Oxford Text Archive
Closed view
:work a fabio:Play ; frbr:realization :expression ;
    dcterms:creator [ a foaf:Person ; foaf:name "William Shakespeare" ] .

:expression a fabio:Book ; frbr:embodiment :manifestation .

:manifestation a fabio:DigitalManifestation ; frbr:exemplar :item ;
    dcterms:format [ a dcterms:MediaType ; dcterms:description "application/tei+xml" ] ;
    dcterms:publisher [ a foaf:Organization ; foaf:name "OUCS" ] .

:item a fabio:ComputerFile ; fabio:storedOn fabio:web .
Open (Linked Data) view
dbpedia:The_Tempest a fabio:Play ; frbr:realization <http://ota.ox.ac.uk/id/5725> ;
    dcterms:creator dbpedia:William_Shakespeare .

<http://ota.ox.ac.uk/id/5725> a fabio:Book ;
    frbr:embodiment <http://ota.ox.ac.uk/text/5725/xml> .

<http://ota.ox.ac.uk/text/5725/xml> a fabio:DigitalManifestation ;
    frbr:exemplar <http://ota.ox.ac.uk/text/5725.xml> ; dcterms:format application:tei+xml ;
    dcterms:publisher dbpedia:Oxford_University_Computing_Services .

<http://ota.ox.ac.uk/text/5725.xml> a fabio:ComputerFile ; fabio:storedOn fabio:web .
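The FRBR layering in the open (Linked Data) view can be followed programmatically. The sketch below is a rough Python illustration that transcribes the three FRBR links from that view as plain tuples and walks them from work to item; the function name `frbr_chain` and the tuple representation are assumptions for this example, and a real application would more likely load the Turtle into an RDF library.

```python
# FRBR links transcribed from the open (Linked Data) view of the example.
TRIPLES = [
    ("dbpedia:The_Tempest", "frbr:realization",
     "http://ota.ox.ac.uk/id/5725"),
    ("http://ota.ox.ac.uk/id/5725", "frbr:embodiment",
     "http://ota.ox.ac.uk/text/5725/xml"),
    ("http://ota.ox.ac.uk/text/5725/xml", "frbr:exemplar",
     "http://ota.ox.ac.uk/text/5725.xml"),
]

# Properties linking the FRBR layers in order:
# work -> expression -> manifestation -> item.
FRBR_STEPS = ["frbr:realization", "frbr:embodiment", "frbr:exemplar"]


def frbr_chain(work, triples=TRIPLES):
    """Follow the FRBR links from a work down to its item, as far as they go."""
    chain = [work]
    for prop in FRBR_STEPS:
        nxt = next(
            (o for s, p, o in triples if s == chain[-1] and p == prop),
            None,
        )
        if nxt is None:
            break  # the chain is incomplete for this resource
        chain.append(nxt)
    return chain
```

Starting from `dbpedia:The_Tempest`, the chain ends at the XML file that the Oxford Text Archive serves, which is the resource the later annotations point into.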
10. Annotating the content
<body>
  ...
  <sp>
    <speaker rend="italic">Ari.</speaker>
    <ab>
      All haile, great Master, graue Sir, haile: I come<lb n="301"/>
      To answer thy best pleasure; be’t to fly,<lb n="302"/>
      To swim, to diue into the fire: to ride<lb n="303"/>
      On the curld clowds: to thy strong bidding,taske<lb n="304"/>
      <hi rend="italic">Ariel,</hi> and all his Qualitie.<lb n="305"/>
    </ab>
  </sp>
  <sp>
    <speaker rend="italic">Pro.</speaker>
    <ab>
      Hast thou, Spirit,<lb n="306"/>
      Performd to point, the Tempest that I
      <seg type="homograph">bad</seg> thee.<lb n="307"/>
    </ab>
  </sp>
  ...
</body>
“Ari.”, “Ariel” and “Spirit” refer to the same entity.
“Master” and “Pro.” refer to the same entity.
Both are defined in DBpedia.
How can I annotate such an XML document
without having permission to modify it?
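One common answer is stand-off annotation: the source stays untouched and annotations are stored separately as character ranges pointing into it. The Python sketch below illustrates the idea on a short fragment of the speech; the helper names `locate` and `annotated_text` are hypothetical, and the DBpedia identifier used for Ariel is an assumption (only `dbpedia:Prospero` appears in these slides).

```python
# A fragment of the TEI source, treated as read-only text.
SOURCE = (
    '<speaker rend="italic">Ari.</speaker>'
    "<ab>All haile, great Master, graue Sir, haile: I come</ab>"
)


def locate(text, needle):
    """Return the (begin, end) character offsets of the first match."""
    begin = text.index(needle)
    return begin, begin + len(needle)


# Annotations live outside the document and only point into it.
# "dbpedia:Ariel_(The_Tempest)" is an assumed identifier for illustration.
annotations = [
    {"range": locate(SOURCE, "Ari."),
     "refers_to": "dbpedia:Ariel_(The_Tempest)"},
    {"range": locate(SOURCE, "Master"),
     "refers_to": "dbpedia:Prospero"},
]


def annotated_text(text, annotation):
    """Recover the string an annotation points at, without editing the text."""
    begin, end = annotation["range"]
    return text[begin:end]
```

Because each annotation holds only offsets and a target, several people can annotate the same file independently, and the file itself never needs write access.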
11. EARMARK
• The Extremely Annotational RDF Markup, a.k.a. EARMARK, is an OWL 2 DL ontology that defines document meta-markup
• It is an ontologically precise definition of markup that instantiates the markup of a text document as an independent OWL document outside of the text strings it annotates
• It can define structures such as trees or graphs (i.e. overlapping markup) and can be used to generate validity constraints (including co-constraints currently unavailable in most validation languages)
• Using the Linguistic Meta-Model, it becomes possible to express and assess facts, constraints and rules about the markup structure as well as about the semantics of the content of the document
• Key classes:
– URIDocuverse defines the whole textual content of the document to annotate (in this case the Oxford Text Archive TEI version of the play The Tempest, available at a particular URL)
– PointerRange defines textual ranges upon it
– LinguisticAct represents annotations made on ranges by someone at a certain time
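To make the relationship between ranges and annotations concrete, here is a minimal Python sketch in which plain dataclasses stand in for PointerRange and LinguisticAct. This is an assumption-laden analogy rather than the ontology itself: the field names and the helper `meanings_by_author` are illustrative, and the sample values are taken from the Tempest interpretations shown in the following slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointerRange:
    """A span of characters inside a docuverse identified by its URL."""
    docuverse: str
    begins: int
    ends: int


@dataclass(frozen=True)
class LinguisticAct:
    """One person's interpretation of a range at a given time."""
    information_entity: PointerRange
    reference: str
    meaning: str
    attributed_to: str
    generated_at: str


# The string "Master" in the Oxford Text Archive file.
master = PointerRange("http://ota.ox.ac.uk/text/5725.xml", 34023, 34029)

acts = [
    LinguisticAct(master, "dbpedia:Prospero", "foaf:Person",
                  "silvio", "2013-06-18T17:23:23Z"),
    LinguisticAct(master, "dbpedia:Prospero", "yago:ShakespeareanCharacters",
                  "fabio", "2013-07-23T17:45:23Z"),
]


def meanings_by_author(rng, all_acts=acts):
    """Map each annotator to the meaning they assigned to the given range."""
    return {
        act.attributed_to: act.meaning
        for act in all_acts
        if act.information_entity == rng
    }
```

Both acts share the same range and the same referent but differ in meaning and provenance, which is how competing interpretations can coexist without touching the annotated text.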
12. Multiple interpretations
<ab>
  All haile, great Master, graue Sir, haile: I come<lb n="301"/>
  ...
</ab>
# The textual content of the document to annotate
:content a earmark:URIDocuverse ;
    earmark:hasContent "http://ota.ox.ac.uk/text/5725.xml"^^xsd:anyURI .
# The string "Master"
:master-string a earmark:PointerRange ;
    earmark:refersTo :content ;
    earmark:begins "34023"^^xsd:nonNegativeInteger ;
    earmark:ends "34029"^^xsd:nonNegativeInteger .
# Silvio’s interpretation
:prospero-as-person a la:LinguisticAct ;
    la:hasInformationEntity :master-string ;
    la:hasReference dbpedia:Prospero ;
    la:hasMeaning foaf:Person ;
    prov:wasAttributedTo :silvio ;
    prov:generatedAtTime "2013-06-18T17:23:23Z"^^xsd:dateTime .
# Fabio’s interpretation
:prospero-as-character a la:LinguisticAct ;
    la:hasInformationEntity :master-string ;
    la:hasReference dbpedia:Prospero ;
    la:hasMeaning yago:ShakespeareanCharacters ;
    prov:wasAttributedTo :fabio ;
    prov:generatedAtTime "2013-07-23T17:45:23Z"^^xsd:dateTime .
13. Conclusions
• Semantic Publishing is a natural and inevitable evolution of the technological advances of the publishing industry
• Shared ontologies are the only way to provide interoperability of data between publishers
• SPAR and EARMARK provide interesting contact points between metadata hidden in XML vocabularies and shared publishing ontologies
• TEI, which is orthogonal to these languages, can and should work well with them
14. Thank you for your attention
Emails:
essepuntato@cs.unibo.it
fabio@cs.unibo.it