Linking the world with Python and Semantics
Introduction on how to use open data and Python, with examples of RDFLib, SuRF and RDF-Alchemy.

http://softwarelivre.org/fisl13


Usage Rights: CC Attribution-NonCommercial License

  • FWIW, here's a post of mine linking to your slides : http://www-public.telecom-sudparis.eu/~berger_o/weblog/2014/05/14/using-rdfalchemy-together-with-rdflibs-sparqlstore-to-query-dbpedia-and-process-resources-in-oo-way/
  • This is past my level of semantic *and* dev knowledge. Could you recommend a slightly more basic intro on how to add results of SPARQL queries to a web page using the simplest server-side scripting?

    I've made some basic SPARQL queries using Snorql, and I'm assuming this is an appropriate first step. My end goal is to have a portable, relatively simple php/python solution of making queries to datasets such as dbpedia, for output to a webpage.

    Any advice would be much appreciated.
  • @tatiana: nice!!
  • @EvstifeevRoman thanks! Although the slides are in English, this talk was given in Portuguese, during FISL 13 (Forum Internacional de Software Livre), in Brazil. The recording is available: http://hemingway.softwarelivre.org/fisl13/high/41f/sala41f-high-201207251201.ogg
  • +subscribe

Linking the world with Python and Semantics Presentation Transcript

  • 1. Linking the world with Python and Semantics, @tati_alchueyr (Globo.com), 25th July 2012, FISL 13
  • 2. how do you store your data?
  • 3. how do you store your data?
    [ ] data... what data?!
    [ ] raw files (csv, json, xml)
    [ ] database (eg. Relational Data Base)
    [ ] graphs (eg. Resource Description Framework)
    [ ] other...
  • 4-7. how do you search for...?
    Apartments near English-Portuguese bilingual childcare in Rio de Janeiro state.
    ERP service providers with offices in São Paulo and New York.
    Researchers working on artificial intelligence in Southeast of Brazil.
    GNU GPL software for image processing developed from 2009 to 2010 authored also by Brazilian developers
  • 8. what do ^ have in common?
  • 9. linked open data in 2007
  • 10. linked open data in 2008
  • 11. linked open data in 2009
  • 12. linked open data in 2011
  • 13. traditional RDBMS
  • 14. linked data graph
  • 15. linked data modelling
  • 16. modelling
  • 17. modelling
  • 18. querying RDB
        select bookID, authorName
        from books, authors
        where books.aid = authors.aid
          and books.isbn = '006251587X';
  • 19. querying RDF
        select ?authName ?authEmail
        where {
          <amazon:book#006251587X> <amazon:hasAuthor> <foaf:name#TimBerners-Lee> .
          <foaf:name#TimBerners-Lee> <foaf:name> ?authName .
          <foaf:name#TimBerners-Lee> <foaf:email> ?authEmail
        }
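The RDF query on slide 19 can be read as pattern matching over triples: every `?variable` must bind to a value that makes all patterns hold at once. Below is a minimal sketch in plain Python, assuming a toy list of triples. The e-mail value, the extra `?author` variable and the helper names `match` and `select` are hypothetical, and this is not how a real SPARQL engine is implemented.

```python
# Toy data: (subject, predicate, object) triples. The e-mail literal is made up.
TRIPLES = [
    ("amazon:book#006251587X", "amazon:hasAuthor", "foaf:name#TimBerners-Lee"),
    ("foaf:name#TimBerners-Lee", "foaf:name", "Tim Berners-Lee"),
    ("foaf:name#TimBerners-Lee", "foaf:email", "timbl@example.org"),
]


def is_var(term):
    """Variables are written as in SPARQL: with a leading question mark."""
    return term.startswith("?")


def match(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            if new.setdefault(p_term, t_term) != t_term:
                return None  # variable already bound to a different value
        elif p_term != t_term:
            return None  # constant term does not match
    return new


def select(patterns, triples):
    """Return every set of bindings that satisfies all patterns (a join)."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            new
            for bindings in solutions
            for triple in triples
            for new in [match(pattern, triple, bindings)]
            if new is not None
        ]
    return solutions


query = [
    ("amazon:book#006251587X", "amazon:hasAuthor", "?author"),
    ("?author", "foaf:name", "?authName"),
    ("?author", "foaf:email", "?authEmail"),
]
rows = select(query, TRIPLES)
for row in rows:
    print(row["?authName"], row["?authEmail"])
```

Unlike the SQL version on slide 18, no join column has to be named: the shared variable `?author` is the join.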
  • 20. globo.com developers before using web semantics
  • 21. globo.com developers while learning web semantics (?w ?t ?f)
  • 22. globo.com developers after using web semantics
  • 23. Sample hard to test code
  • 24. approach #1: query isolation
  • 25. approach #2: data as object (DAO)
  • 26. Y U NO make SPARQL queries?!
  • 27. Y U NO make data access easy?!
  • 28. Y U NO make things testable?!
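Slides 24 to 28 name two approaches (isolating queries, wrapping data access in an object) without showing code. One possible shape is sketched below, assuming the query runner is passed in so that tests can replace it. The class `BandDAO`, the function `fake_run_query` and the `Band_A`/`Band_B` URIs are hypothetical illustrations, not code from the talk.

```python
# Approach 1: keep the SPARQL text in one place instead of scattering it.
METAL_BANDS_QUERY = """
PREFIX db: <http://dbpedia.org/resource/>
PREFIX dbonto: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?who WHERE { ?who dbonto:genre db:Metal . }
"""


class BandDAO:
    """Approach 2: a data access object hiding the query and result format.

    `run_query` is any callable that takes a SPARQL string and returns a
    SPARQL JSON results dict; injecting it keeps the class testable.
    """

    def __init__(self, run_query):
        self._run_query = run_query

    def metal_bands(self):
        results = self._run_query(METAL_BANDS_QUERY)
        return [b["who"]["value"] for b in results["results"]["bindings"]]


def fake_run_query(query):
    """Stand-in for a real endpoint call; returns canned data for tests."""
    return {
        "results": {
            "bindings": [
                {"who": {"type": "uri", "value": "http://dbpedia.org/resource/Band_A"}},
                {"who": {"type": "uri", "value": "http://dbpedia.org/resource/Band_B"}},
            ]
        }
    }


dao = BandDAO(fake_run_query)
print(dao.metal_bands())
```

In production the same class could be given a function that calls a real endpoint; the calling code would not change.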
  • 29. product developers evaluating web semantics
  • 30. fact 1: we don't have an out-of-the-box solution
  • 31. fact 2: but we do have some options
  • 32. some options
    #1: create a solution from scratch
    #2: study existing solutions and then
        [ ] contribute to them
        [ ] develop on top of them
        [ ] goto #1
  • 33. the final decision is not only ours
  • 34. but we chose to start from #2: study existing solutions and then (...)
  • 35. ok, lmgfy
  • 36. a few results from google
    ActiveRDF, PyRdfa, active-semantic, pysparql, Django4Store, RDFAlchemy, Django-RDF, RdfLib, Django-RDFAlchemy, Redland, Djubby, semantic-django, EasyRDF, SPARQLWrapper, Jena, FuXi, Sparrow, Oort, Sparta, Pymantic, SuRF
  • 37. click to know more (tool list as on slide 36)
  • 38. { ?project :by_author ?author . ?author :works_at :globocom . } (tool list as on slide 36)
  • 39. { ?project :use_language :python . } (tool list as on slide 36)
  • 40. { ?project :use_language :python ; :last_commit ?commit . FILTER (?commit >= "2011-12-01"^^xsd:date) } (tool list as on slide 36)
  • 41. relation between these tools
  • 42. team filtering (tool list as on slide 36)
  • 43. SPARQLWrapper
    problem: list all predicates of a class
        from SPARQLWrapper import SPARQLWrapper, JSON

        # List all predicates of dbonto:Band
        # (OPTION (TRANSITIVE, ...) is a Virtuoso extension, not standard SPARQL)
        query = """
            SELECT distinct ?subject
            FROM <http://dbpedia.org>
            {
              ?subject rdfs:domain ?object .
              <http://dbpedia.org/ontology/Band> rdfs:subClassOf ?object
                OPTION (TRANSITIVE, t_distinct, t_step('step_no') as ?n, t_min (0) ).
            }"""

        # slide annotation: http://live.dbpedia.org/sparql
        sparql = SPARQLWrapper("http://dbpedia.org/sparql")
        sparql.setQuery(query)
        sparql.setReturnFormat(JSON)
        results = sparql.query().convert()

        for result in results["results"]["bindings"]:
            print(result["subject"]["value"])
  • 44. SPARQLWrapper (same code as slide 43)
    abstract endpoint, returns dict
  • 45. SPARQLWrapper: OK, not different from what we have...
  • 46. SPARQLWrapper: just a wrapper around a SPARQL server; well tested ;)
  • 47. SPARQLWrapper
    problem: list all subjects given ?p ?o
        from SPARQLWrapper import SPARQLWrapper, JSON

        # List all instances (eg. bands) with genre Metal
        query = """
            PREFIX db: <http://dbpedia.org/resource/>
            PREFIX dbonto: <http://dbpedia.org/ontology/>
            SELECT DISTINCT ?who
            FROM <http://dbpedia.org>
            WHERE {
              ?who dbonto:genre db:Metal .
            }
        """

        sparql = SPARQLWrapper("http://dbpedia.org/sparql")
        sparql.setQuery(query)
        sparql.setReturnFormat(JSON)
        results = sparql.query().convert()

        for result in results["results"]["bindings"]:
            print(result["who"]["value"])
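The "returns dict" point on the SPARQLWrapper slides refers to the standard SPARQL JSON results layout that `results["results"]["bindings"]` walks through. The sketch below shows that layout offline with the standard library only. The sample document, its `example.org` URIs and the helper `rows` are assumptions for illustration; no endpoint is contacted.

```python
import json

# A hand-written sample in the SPARQL JSON results layout (values are made up).
SAMPLE = """
{
  "head": {"vars": ["who"]},
  "results": {
    "bindings": [
      {"who": {"type": "uri", "value": "http://example.org/resource/BandA"}},
      {"who": {"type": "uri", "value": "http://example.org/resource/BandB"}}
    ]
  }
}
"""


def rows(results):
    """Flatten a SPARQL JSON results dict into a list of plain dicts.

    An unbound variable is simply missing from a binding, hence .get().
    """
    variables = results["head"]["vars"]
    return [
        {var: binding.get(var, {}).get("value") for var in variables}
        for binding in results["results"]["bindings"]
    ]


results = json.loads(SAMPLE)
for row in rows(results):
    print(row["who"])
```

A helper like this is one way to keep the nested `["results"]["bindings"][...]["value"]` indexing out of application code.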
  • 48. RdfLib
    problem: list all subjects given ?p ?o
        import rdflib
        import rdfextras.store.SPARQL

        # SPARQL endpoint setup
        endpoint = "http://dbpedia.org/sparql"
        store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
        graph = rdflib.Graph(store)

        # Definitions
        genre = rdflib.URIRef("http://dbpedia.org/ontology/genre")
        metal = rdflib.URIRef("http://dbpedia.org/resource/Metal")

        # Query
        for label in graph.subjects(genre, metal):
            print(label)
  • 49. RdfLib
    abstract endpoint, returns dict, namespace
        import rdflib
        import rdfextras.store.SPARQL

        # SPARQL endpoint setup
        endpoint = "http://dbpedia.org/sparql"
        store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
        graph = rdflib.Graph(store)

        # Namespaces to clear up definitions
        DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
        DB = rdflib.Namespace("http://dbpedia.org/resource/")

        # Query
        for label in graph.subjects(DBONTO.genre, DB.Metal):
            print(label)
  • 50. RdfLib (same code as slide 49)
    abstract endpoint, returns dict, namespace
    graph methods: subjects, predicates, objects, subject_predicates, subject_objects, predicate_objects
  • 51. RdfLib
    abstract endpoint, returns dict, namespace
        import rdflib
        import rdfextras.store.SPARQL

        # SPARQL endpoint setup
        endpoint = "http://dbpedia.org/sparql"
        store = rdfextras.store.SPARQL.SPARQLStore(endpoint)
        graph = rdflib.Graph(store)

        # Namespaces to clear up definitions
        DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
        DB = rdflib.Namespace("http://dbpedia.org/resource/")

        # Using triples
        for musician, _, _ in graph.triples((None, DBONTO.genre, DB.Metal)):
            print(musician)
  • 52. RdfLib (same code as slide 49)
    abstract endpoint, returns dict, namespace, query by triples
  • 53. RdfLib
    abstract endpoint, returns dict, namespace, query by triples, add / remove
        import rdflib
        import rdfextras.store.SPARQL

        # N-Triples fixture file
        graph = rdflib.Graph()
        graph.parse("fixture_genre_metal.nt", format="nt")

        # Namespace
        DBONTO = rdflib.Namespace("http://dbpedia.org/ontology/")
        DB = rdflib.Namespace("http://dbpedia.org/resource/")

        # Add nodes
        graph.add((DB.AndrewsMedina, DBONTO.genre, DB.Metal))
        graph.add((DB.Siminino, DBONTO.genre, DB.Metal))
        graph.add((DB.Herman, DBONTO.genre, DB.Metal))

        # Remove nodes
        graph.remove((DB.AndrewsMedina, DBONTO.genre, DB.Metal))
  • 54. RdfLib concentrates on providing the core RDF types and interfaces, through a plugin interface
  • 55. RdfLib makes testing simple, allowing fixtures using n3 files, add triples and remove triples
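The testing idea on slides 51 to 55 (load known triples, add and remove a few, then query with `None` as a wildcard) can be tried without any endpoint or library. The class below is a toy stand-in written for this note, not rdflib itself; its method names only imitate the ones shown on the slides, and the prefixed strings replace real URIs for brevity.

```python
class ToyGraph:
    """A tiny in-memory triple store; a stand-in for illustration, not rdflib."""

    def __init__(self, triples=()):
        self._triples = set(triples)

    def add(self, triple):
        self._triples.add(triple)

    def remove(self, triple):
        self._triples.discard(triple)

    def triples(self, pattern):
        """Yield triples matching (s, p, o); None acts as a wildcard."""
        for triple in sorted(self._triples):
            if all(want is None or want == got
                   for want, got in zip(pattern, triple)):
                yield triple

    def subjects(self, predicate=None, obj=None):
        """Yield the subject of every triple matching predicate and object."""
        for s, _, _ in self.triples((None, predicate, obj)):
            yield s


GENRE, METAL = "dbonto:genre", "db:Metal"

# Fixture: known triples loaded before the code under test runs.
graph = ToyGraph([("db:Herman", GENRE, METAL)])
graph.add(("db:AndrewsMedina", GENRE, METAL))
graph.add(("db:Siminino", GENRE, METAL))
graph.remove(("db:AndrewsMedina", GENRE, METAL))

print(list(graph.subjects(GENRE, METAL)))  # ['db:Herman', 'db:Siminino']
```

Because the data is local and fixed, assertions on the result are deterministic, which is the property the slides attribute to fixture-based tests.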
  • 56. RdfLib: but... each triple query requires a new connection to SPARQL
  • 57. RdfLib: therefore, too many accesses to the SPARQL endpoint
  • 58. RdfLib: and... doesn't provide an ORM (object-relational mapping)
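Slides 56 and 57 describe one round-trip to the endpoint per triple query. One common mitigation, which the talk does not propose and which is shown here only as an assumption, is to memoise identical queries. In this sketch `fetch_subjects` is a hypothetical stand-in for the network call, and `CALLS` merely counts simulated round-trips.

```python
from functools import lru_cache

CALLS = []  # records every simulated round-trip, for demonstration only


def fetch_subjects(predicate, obj):
    """Hypothetical stand-in for one HTTP request to a SPARQL endpoint."""
    CALLS.append((predicate, obj))
    return ("db:Herman", "db:Siminino")


@lru_cache(maxsize=128)
def cached_subjects(predicate, obj):
    """Memoised wrapper: identical triple queries reuse the first answer."""
    return fetch_subjects(predicate, obj)


for _ in range(3):
    bands = cached_subjects("dbonto:genre", "db:Metal")

print(bands, len(CALLS))
```

The trade-off is staleness: a cached answer will not reflect triples added to the endpoint afterwards, so a cache like this suits read-mostly data.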
  • 59. SuRF
    abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM
        from surf import Store, Session, ns, query

        store = Store(reader="sparql_protocol", endpoint="http://dbpedia.org/sparql")
        session = Session(store, {})
        session.enable_logging = False

        ns.register(db="http://dbpedia.org/resource/")
        ns.register(dbonto="http://dbpedia.org/ontology/")

        MusicalArtist = session.get_class(ns.DB["MusicalArtist"])
        artistas_metal = MusicalArtist.get_by(dbonto_genre=ns.DB["Metal"])
        print(artistas_metal)
  • 60. SuRF
    problem: list all subjects given ?p ?o
    composed ORM queries
        from surf import Store, Session, ns, query

        store = Store(reader="sparql_protocol", endpoint="http://dbpedia.org/sparql")
        session = Session(store, {})

        ns.register(db="http://dbpedia.org/resource/")
        ns.register(dbonto="http://dbpedia.org/ontology/")

        query_surf = query.select("?who").distinct()
        query_surf.where(("?who", ns.DBONTO.genre, ns.DB.Metal))
        metal_bands = session.default_store.execute(query_surf)

        for band in metal_bands:
            print(band)
  • 61. SuRF: various approaches (ORM, programmatically)
  • 62. SuRF: simple ORM, no need to redeclare TTL definitions
  • 63. SuRF: "complex" queries using lazy evaluation
  • 64. SuRF: documentation & community
  • 65. SuRF: but... no django-style models
  • 66. SuRF: verbose syntax
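Slide 63 mentions lazy evaluation: a query object is composed step by step and nothing is sent until the results are actually consumed. The class below sketches that idea in plain Python. It is not SuRF's implementation; `LazyQuery` and `fake_execute` are hypothetical names, and the executor only records the generated text instead of using HTTP.

```python
class LazyQuery:
    """Collects triple patterns and only runs the query when iterated."""

    def __init__(self, execute, variable):
        self._execute = execute  # callable taking a SPARQL string
        self._variable = variable
        self._patterns = []
        self._distinct = False

    def distinct(self):
        self._distinct = True
        return self  # returning self allows chaining

    def where(self, subject, predicate, obj):
        self._patterns.append((subject, predicate, obj))
        return self

    def to_sparql(self):
        head = "SELECT DISTINCT" if self._distinct else "SELECT"
        body = " ".join("%s %s %s ." % p for p in self._patterns)
        return "%s %s WHERE { %s }" % (head, self._variable, body)

    def __iter__(self):
        # Nothing is sent until someone actually loops over the query.
        return iter(self._execute(self.to_sparql()))


sent = []


def fake_execute(sparql):
    """Hypothetical executor that records the query instead of using HTTP."""
    sent.append(sparql)
    return ["db:Herman"]


query = LazyQuery(fake_execute, "?who").distinct().where(
    "?who", "dbonto:genre", "db:Metal")
built_without_sending = (sent == [])  # nothing has been executed yet
bands = list(query)                   # iteration triggers the execution
print(built_without_sending, bands, sent)
```

Deferring execution lets several `where` calls be merged into one request, at the cost of errors surfacing only when the query is finally iterated.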
  • 67. RDFAlchemy
    problem: list all subjects given ?p ?o
        from rdfalchemy.sparql import SPARQLGraph
        from rdflib import Namespace

        endpoint = "http://dbpedia.org/sparql"
        graph = SPARQLGraph(endpoint)

        DB = Namespace("http://dbpedia.org/resource/")
        DBONTO = Namespace("http://dbpedia.org/ontology/")

        metal_bands = graph.subjects(predicate=DBONTO.genre, object=DB.Metal)
        for band in metal_bands:
            print(band)
  • 68. RDFAlchemy
    abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM, django-like
        from rdfalchemy.sparql import SPARQLGraph
        from rdfalchemy import rdfSubject, rdfSingle
        from rdflib import Namespace

        DB = Namespace("http://dbpedia.org/resource/")
        DBONTO = Namespace("http://dbpedia.org/ontology/")
        RDFS = Namespace("http://www.w3.org/2000/01/rdf-schema#")

        endpoint = "http://live.dbpedia.org/sparql"
        graph = SPARQLGraph(endpoint)
        rdfSubject.db = graph

        class MusicalArtist(rdfSubject):
            rdfs_label = rdfSingle(RDFS.label, "label")
            genre = rdfSingle(DBONTO.genre, "genre")

        metal_artists = MusicalArtist.filter_by(genre=DB.Metal)
        for band in metal_artists:
            print(band)
  • 69. RDFAlchemy django-like models
  • 70. RDFAlchemy simple syntax
  • 71. RDFAlchemy: but... non-lazy
  • 72. RDFAlchemy: we have to declare all data already described in TTL files as python classes
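The django-like model style on slides 68 and 69 maps class attributes to predicates. A rough picture of how such a mapping can work is sketched below using Python descriptors over a toy triple list. This is an assumption-laden illustration, not RDFAlchemy's code: `Single`, `Subject` and the sample triples (including the `db:Jazz` value) are invented for the example.

```python
# Toy data standing in for an endpoint; values are made up.
TRIPLES = [
    ("db:Herman", "rdfs:label", "Herman"),
    ("db:Herman", "dbonto:genre", "db:Metal"),
    ("db:Siminino", "rdfs:label", "Siminino"),
    ("db:Siminino", "dbonto:genre", "db:Jazz"),
]


class Single:
    """Descriptor mapping one attribute to one predicate (first value wins)."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __get__(self, instance, owner):
        if instance is None:
            return self
        for s, p, o in TRIPLES:
            if s == instance.uri and p == self.predicate:
                return o
        return None


class Subject:
    """Base class: an object identified by a URI, looked up in TRIPLES."""

    def __init__(self, uri):
        self.uri = uri

    @classmethod
    def filter_by(cls, **criteria):
        """Yield instances whose mapped attributes equal the given values."""
        seen = set()
        for s, _, _ in TRIPLES:
            if s in seen:
                continue
            seen.add(s)
            candidate = cls(s)
            if all(getattr(candidate, k) == v for k, v in criteria.items()):
                yield candidate


class MusicalArtist(Subject):
    label = Single("rdfs:label")
    genre = Single("dbonto:genre")


for artist in MusicalArtist.filter_by(genre="db:Metal"):
    print(artist.uri, artist.label)
```

The sketch also makes the drawback from slide 72 visible: every predicate already described in the ontology has to be declared again as a class attribute.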
  • 73. semantic-django
    abstract endpoint, returns dict, namespace, query by triples, add / remove, ORM, django-like
        # Classes similar to django models are created from TTL
        # files (using manage.py)
        class BaseLugar(BaseEntidade):
            latitude = models.UriField()
            longitude = models.UriField()
            geonameid = models.UriField()
            tem_mapa = models.UriField()
            apelido = models.UriField()
            ImagemMapa = models.UriField()
            genero_gramatical = models.UriField()

            class Meta:
                semantic_graph = "http://semantica.globo.com/base/Lugar"
  • 74. semantic-django: https://github.com/rfloriano/semantic-django
  • 75. semantic-django: dream of many product developers
  • 76. semantic-django: but... just started to be developed
  • 77. study existing solutions, and now?
    [ ] contribute to them
    [ ] develop on top of them
    [ ] create a solution from scratch
    [ ] other, _________________
  • 78. grab your post-it, it's review time!
    [table; columns: =) | =( | comments; rows: SuRF, RDFAlchemy, RDFlib, semantic-django, (...)]
    cell text (layout lost): shows no my query models favorite not my nice models lazy choice API name low space layer django just like started
  • 79. any questions...? @tati_alchueyr
  • 80. casting by (click to know more about each meme)