1. An architecture and implementation process for Linked Data environments
A case study for the Library of Congress of Chile
Francisco Cifuentes – José María Álvarez
Christian Sifaqui – José Emilio Labra
http://www.weso.es TLDE-CAEPIA 2011 http://www.bcn.cl
2. Overview: this talk in 1’
Why?
Linked Open Data in Public Administrations
How?
Proposal of Architecture
Adoption process
Where?
Library of Congress - Chile
3. Linked Open Data in Public Administrations
Government data & actions can be supervised
Improve transparency & confidence
4. Linked Open Data in Public Administrations
Public value (generates citizen experience)
Research & Collaboration
Reuse data
5. Linked Open Data in Public Administrations
Public information belongs to citizens
Financed by public resources
Return on investment
6. Linked Open Data in Public Administrations
Legislation is public information…
…and must be in the public domain
Everyone is affected by laws
8. Architecture & Adoption Process
There is huge interest in publishing LOD
Practical guidelines & methodologies?
Our proposal:
Architecture of Linked Open Data
Implementation methodology
9. Considerations in the Public Administrations context
Large volumes of data
Semistructured content
Contents of general interest
High expectations
New projects should not interfere
Small teams in large organizations
Low semantic expertise
10. Linked Open Data Architecture
[Architecture diagram]
Client side: Web Browser, Semantic Application
Server side: Web Application Server, Output, Update RDF Graph Service, Ontologies, RDF Graph, SPARQL Endpoint, Documentation Portal, Cache, RDF Storage, DB, Web Server, Operating System
11. Adoption Process
Phases (ordered in time):
Contextualization
Ontology design
RDF Graph Modeling
SPARQL Endpoint Implementation
RDF Graph Implementation
Update Graph Service
Documentation Web Portal
Non-functional Requirements
Optional: Data Visualization & demos
15. Contextualization
Publish Linked Open Data – 5 stars
Norms and relationships in a global RDF graph
Infrastructure for future developments
First stage, pilot project
16. Contextualization
≈ 300,000 norms and their relationships
Modifications, Concordances, etc.
First stage ⇒ Only main metadata of norms
Title, important dates, types, relationships
We exclude body text (articles, chapters, etc.)
17. Contextualization
Definition of domain model:
Norms, relationships, types of norms, metadata
Functional Requirements for Bibliographic Records (FRBR)
Output formats: RDFa, RDF/XML, JSON, N3,…
18. Domain Ontologies
Small Ontology about Norms
19. RDF Graph Modeling
A norm can be modified by another norm
Decree 296 (published 1995-02-17)
Art. 1. abc.
Art. 2. def.
Art. 3. ghi.
Decree 12066 (published 2005-05-15)
Art. 1. Modify Decree 296 in the following way: substitute in Art. 1 the words “a” by “xyz”.
Now, Decree 296 should read:
Decree 296
Art. 1. xyzbc.
Art. 2. def.
Art. 3. ghi.
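The Decree 296 example can be sketched in a few lines of Python. This is only an illustration of how a modifying norm yields a new consolidated version; the `Substitution` class and the `apply_modifications` function are hypothetical names, not part of the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substitution:
    """One textual change that a modifying norm applies to an article."""
    article: int  # article number in the modified norm
    old: str      # words to be replaced
    new: str      # replacement words


def apply_modifications(articles, substitutions):
    """Return a new {article number: text} dict with each substitution applied once."""
    consolidated = dict(articles)  # copy, so the original version is preserved
    for sub in substitutions:
        text = consolidated[sub.article]
        consolidated[sub.article] = text.replace(sub.old, sub.new, 1)
    return consolidated


# Decree 296 as originally published on 1995-02-17.
decree_296 = {1: "abc.", 2: "def.", 3: "ghi."}

# Decree 12066, Art. 1: substitute "a" by "xyz" in Art. 1 of Decree 296.
decree_12066 = [Substitution(article=1, old="a", new="xyz")]

latest_296 = apply_modifications(decree_296, decree_12066)
```

Keeping the original dictionary untouched mirrors the idea that both the original and the latest version of a norm remain addressable.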
20. RDF Graph Modeling
Careful URI Design
Expressiveness
21. RDF Graph Modeling
Decree 296: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/
Original version: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@1995-02-17
Latest version: http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/es@2005-05-10
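A minimal sketch of how URIs with this shape could be assembled. The reading of the path segments (country, norm type, issuing body, publication date, number, then language@version date) is an assumption inferred from the three example URIs; `norm_uri` and `version_uri` are hypothetical helper names.

```python
BASE = "http://datos.bcn.cl/recurso"


def norm_uri(country, norm_type, issuer, published, number):
    """URI of the norm itself, independent of any particular version."""
    return f"{BASE}/{country}/{norm_type}/{issuer}/{published}/{number}/"


def version_uri(norm, language, version_date):
    """URI of one dated, language-specific version of a norm."""
    return f"{norm}{language}@{version_date}"


decree_296 = norm_uri("cl", "DTO", "ministerio-del-interior", "1995-02-17", 296)
original = version_uri(decree_296, "es", "1995-02-17")
latest = version_uri(decree_296, "es", "2005-05-10")
```

Because every version URI extends the norm URI, a client can move from any version back to the norm by trimming the last segment.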
22. SPARQL Endpoint
Links to other datasets (Countries for International Treaties)
DBpedia, Geonames
Reuse vocabularies / Ontologies
SKOS, DC, FOAF, DBpedia, ORG
Triplestore: OpenLink Virtuoso
23. SPARQL Endpoint
Example query
Find all norms issued by a municipality between 1995 and 2000 that were modified after 2005.
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX n: <http://datos.bcn.cl/ontologies/bcn-norms#>
SELECT ?normTitle ?creatorName ?pubDate ?pubDateOther
WHERE {
  # The norm, its issuing body, its title and its publication date
  ?norm n:createdBy ?creator .
  ?creator n:hasName ?creatorName .
  ?norm dc:title ?normTitle .
  ?norm n:publishDate ?pubDate .
  # The norm that modifies it, and when that one was published
  ?norm n:isModifiedBy ?otherNorm .
  ?otherNorm n:publishDate ?pubDateOther .
  # Keep only municipalities (case-insensitive match on the name)
  FILTER (regex(?creatorName, "MUNICIPALIDAD", "i"))
  FILTER (?pubDate > "1995" &&
          ?pubDate < "2000" &&
          ?pubDateOther > "2005")
}
ORDER BY (?pubDate)
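A query like this is normally sent to the endpoint over HTTP, following the SPARQL Protocol (the query text goes in a `query` parameter). The sketch below only builds the request and does not send it. The endpoint address is a hypothetical placeholder, since the slides do not give the endpoint URL, and the short query is a simplified stand-in for the one above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical endpoint address, used only for illustration.
ENDPOINT = "http://datos.bcn.cl/sparql"

QUERY = """PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?norm ?title
WHERE { ?norm dc:title ?title }
LIMIT 5"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def query_of(url):
    """Recover the SPARQL text from a request URL, e.g. for logging or caching."""
    return parse_qs(urlsplit(url).query)["query"][0]


request = build_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would perform the call against a live endpoint.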
24. RDF Graph Implementation
We developed a Linked Data Frontend (WESO-DESH)
Content negotiation based on HTTP 303 See Other
Definition of URIs based on regular expressions
Easy configuration
Support for CONSTRUCT, ASK & DESCRIBE
Delegates output formats to SPARQL Endpoint
Result caching
GUI for administration backend (in progress)
http://code.google.com/p/weso-desh/
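The combination of regular-expression URI definitions and HTTP 303 redirects can be sketched as follows. This is not WESO-DESH code: the pattern, the `/doc/...` document paths and the `negotiate` function are hypothetical, and `q` values in the Accept header are ignored for brevity.

```python
import re

# Hypothetical pattern for resource URIs; the real tool reads its patterns from XML.
RESOURCE_PATTERN = re.compile(r"^/recurso/(?P<path>.+)$")

# Media type -> hypothetical URL template of the document describing the resource.
FORMATS = {
    "text/html": "/doc/html/{path}",
    "application/rdf+xml": "/doc/rdf/{path}",
    "application/json": "/doc/json/{path}",
}


def negotiate(path, accept):
    """Return (status, location) for a GET on a resource URI."""
    match = RESOURCE_PATTERN.match(path)
    if match is None:
        return 404, None  # not a resource URI this frontend knows about
    resource = match.group("path")
    # Take the first media type in the Accept header that we can serve.
    for item in accept.split(","):
        media_type = item.split(";")[0].strip()
        if media_type in FORMATS:
            return 303, FORMATS[media_type].format(path=resource)
    # Nothing matched: fall back to the HTML view.
    return 303, FORMATS["text/html"].format(path=resource)


decree = "/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/"
rdf_response = negotiate(decree, "application/rdf+xml")
html_response = negotiate(decree, "image/png, text/html;q=0.8")
```

The 303 status tells the client that the resource itself (a decree) is not a document, and points it to a document about the resource in the requested format.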
25. RDF Graph Implementation
WESO-DESH (Linked Data Frontend)
XML Configuration
Output HTML+RDFa
26. Update Graph Service
Automatic extraction & transformation process to update the RDF Graph
Based on Pentaho - Kettle ETL
Executes Transformations in threads
Configuration in XML
*ETL = Extract, Transform, Load
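The extract, transform and load steps, with transformations running in threads, can be illustrated in plain Python. This is a stand-in for the idea, not Pentaho Kettle: the sample rows are made up, the URI of Decree 12066 is hypothetical, and the "graph" is just an in-memory set of N-Triples lines.

```python
from concurrent.futures import ThreadPoolExecutor

DC_TITLE = "http://purl.org/dc/elements/1.1/title"
PUBLISH_DATE = "http://datos.bcn.cl/ontologies/bcn-norms#publishDate"


def extract():
    """Stand-in for reading rows from the source database (made-up sample data)."""
    return [
        {"uri": "http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/1995-02-17/296/",
         "title": "Decree 296", "published": "1995-02-17"},
        {"uri": "http://datos.bcn.cl/recurso/cl/DTO/ministerio-del-interior/2005-05-15/12066/",
         "title": "Decree 12066", "published": "2005-05-15"},
    ]


def transform(row):
    """Turn one row into N-Triples lines (no literal escaping, for brevity)."""
    subject = "<" + row["uri"] + ">"
    return [
        subject + " <" + DC_TITLE + '> "' + row["title"] + '" .',
        subject + " <" + PUBLISH_DATE + '> "' + row["published"] + '" .',
    ]


def load(triples, graph):
    """Stand-in for writing triples to the RDF store."""
    graph.update(triples)


def run_update(graph, workers=4):
    """Transform rows in a thread pool, then load the results sequentially."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for triples in pool.map(transform, extract()):
            load(triples, graph)
    return graph


graph = run_update(set())
```

Loading sequentially while transforming in parallel keeps writes to the store simple, which is one possible design choice among several.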
27. Documentation
Documentation Web Portal: TYPO3 CMS
Sections:
URI construction guidelines
Example queries
Output formats
Ontology documentation
etc.
28. Non-Functional Requirements
Response time
Cache system, Profiling
Security & privacy
Different views and access levels of RDF Graph
Others
Internationalization
Accessibility
Use of standards
29. Optional: Data visualization
Prototype tool: LODViz (Linked Open Data Visualization)
Based on HTML5 (pattern library)
Work in progress
http://www.weso.es/lodviz/
31. Results
Public Dataset Catalogs Faceted Browser - CTIC Foundation
Five stars Linked Open Data
32. Conclusions
First stage finished
> 300,000 norms exported
≈ 8 million triples, ≈ 27 triples per norm
200 to 400 triples added each day
3 tools in development
WESO DESH - Linked data frontend
WESO RUD – RDF Updater
LODVIZ – Linked Open Data Visualization
Proposed methodology of Linked Open Data
33. Future Work
Library of Congress of Chile
More datasets: Biographies, Geographical data
History of Law
Improve documentation
WESO Research group
Semantic search engine
Entity extraction & reconciliation in text
Resource Recommendation
Provenance & graph views
34. The End
http://www.weso.es
More Information
http://www.bcn.cl
35. Main Team
Francisco Cifuentes
Member of WESO Research Group and Library of Congress of Chile
http://www.weso.es/~fcifuentes
José María Álvarez
Member of WESO Research Group
http://josemalvarez.es
Christian Sifaqui
Head of Systems and Network information services
Library of Congress of Chile
http://sifaqui.blogspot.com/
José Emilio Labra
Associate Professor of University of Oviedo and
Head of WESO Research Group
http://www.di.uniovi.es/~labra/
36. Credits
Most of the pictures were obtained from the Internet.
Transparency image: http://2.bp.blogspot.com/--wFwsKwMgAg/TjSDXOLCTzI/AAAAAAAAOzQ/qvBtbShckdI/s1600/11.2.bmp
Euros: Minuto digital. http://www.minutodigital.com/wp-content/uploads/euros-300x196.jpg
Library: http://ffernandez.files.wordpress.com/2010/04/biblioteca.jpg
FRBR: http://cucataloging.blogspot.com/
Contextualization: http://tentblogger.com/right-advertisers/
Documentation: http://susops.blogspot.com/2010/07/power-of-documentation.html