SuRF – Tapping Into The Web Of Data

SuRF is an Object-RDF Mapper based on the popular rdflib Python library. It exposes RDF triple sets as sets of resources and integrates them into Python's object-oriented paradigm, much as ActiveRDF does for Ruby.

Transcript

  • 1. SuRF – Tapping into the Web of Data
    Cosmin Basca
    Digital Enterprise Research Institute, Galway
    cosmin.basca@gmail.com
    Special thanks to: Benjamin Heitman and Uldis Bojars
    Digital Enterprise Research Institute, Galway
    firstname.lastname@deri.org
  • 2. Outline
    About DERI
    Why Semantic Web?
    Linked Open Data (LOD)
    RDF (Resource Description Framework)
    SPARQL
    O-RDF Mapping (ActiveRDF / SuRF)
    How?
    Architecture
    Installation
    Examples
    Simple: access DBpedia (Semantic Wikipedia)
    More complex: create a blog on top of RDF
  • 3. DERI – http://www.deri.ie/
    Digital Enterprise Research Institute (DERI):
    http://www.deri.ie/
    main goal: enabling networked knowledge
    research about the future of the Web
    biggest Semantic Web research institute in the world
    120 people
    part of the National University of Ireland, Galway
  • 5. Why?
    Develop Web applications that allow
    Data Integration
    Flexibility
    Schema definition and modeling
    Schema evolution
    Robustness
    Support for new data sources and types
  • 6. There is a Wealth of (RDF) data out there
  • 7. Popular Semantic Web Vocabularies
    FOAF = for describing people and social network connections between them   http://xmlns.com/foaf/spec/
    SIOC = for describing Social Web content created by people   http://sioc-project.org/
    DOAP = for describing software projects   http://trac.usefulinc.com/doap
    used by PyPI
  • 8. Linked Open Data - Growth
  • 9. Linked Open Data - Growth
  • 10. Linked Open Data - Growth
  • 11. The data model
    The traditional approach uses the relational model
    Usually leads to big, ugly schemas
  • 12. The RDF (Graph) Data model
    Flexible
    Support for both schema and data evolution during runtime
    Simple model
    Relations are represented explicitly
    Schema is a graph
    Can integrate data – union of two graphs
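The "union of two graphs" point can be illustrated without any RDF library, assuming a graph is modelled as a plain Python set of (subject, predicate, object) tuples. The example.org URIs and the two small graphs below are made up for illustration only.

```python
# A graph modelled as a set of (subject, predicate, object) tuples.
# All names and URIs here are illustrative, not taken from a real dataset.
EX = "http://example.org/"

people = {
    (EX + "eric", EX + "is_a", EX + "Person"),
    (EX + "eric", EX + "full_name", "Eric Miller"),
}

contacts = {
    (EX + "eric", EX + "full_name", "Eric Miller"),  # also present in `people`
    (EX + "eric", EX + "mailbox", "mailto:em@w3.org"),
}

# Integration is set union: shared triples collapse, new ones are added.
merged = people | contacts

for s, p, o in sorted(merged):
    print(s, p, o)
```

Because both graphs use the same identifier for the same node, no join keys or schema migration are needed; the duplicate triple simply appears once in the result.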
  • 13. The RDF (Graph) Data model
    A triple: (Subject, Predicate, Object)
    Example: (Eric, is a, Person)
  • 14. Example RDF graph describing Eric Miller (RDF Primer) – human readable format
    (Eric, is a, Person)
    (Eric, has full name, Eric Miller)
    (Eric, has e-mail, em@w3.org)
    (Eric, has personal title, Dr.)
  • 15. Example RDF graph describing Eric Miller (RDF Primer) – machine readable format
    Subject of all triples: http://www.w3.org/People/EM/contact#me
    http://www.w3.org/1999/02/22-rdf-syntax-ns#type → http://www.w3.org/2000/10/swap/pim/contact#Person
    http://www.w3.org/2000/10/swap/pim/contact#fullName → Eric Miller
    http://www.w3.org/2000/10/swap/pim/contact#mailbox → mailto:em@w3.org
    http://www.w3.org/2000/10/swap/pim/contact#personalTitle → Dr.
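As a rough sketch, the machine-readable graph can be written down as Python tuples using the URIs from the RDF Primer example. The helper name `objects` is hypothetical and only serves to show how a value is looked up by subject and predicate.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"
ERIC = "http://www.w3.org/People/EM/contact#me"

# Each entry is one (subject, predicate, object) triple.
triples = [
    (ERIC, RDF_TYPE, CONTACT + "Person"),
    (ERIC, CONTACT + "fullName", "Eric Miller"),
    (ERIC, CONTACT + "mailbox", "mailto:em@w3.org"),
    (ERIC, CONTACT + "personalTitle", "Dr."),
]


def objects(graph, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


print(objects(triples, ERIC, CONTACT + "fullName"))
```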
  • 16. The RDF (Graph) Data model – Identification
    URIs provide strong references
    The URIref is an unambiguous pointer to something of meaning
    Nodes (“Subjects”) connect via Links (“Predicates”) to Objects
    Objects can be Nodes or Literals (plain or typed strings)
  • 17. SPARQL – Querying the Semantic Web
    SPARQL is to RDF what SQL is to Relational tables
    Expressive, designed with the Graph data model in mind
    Example graph:
    (Carrie Fisher, starred_in, Star Wars)
    (Harrison Ford, starred_in, Star Wars)
    (Harrison Ford, starred_in, Blade Runner)
    (Daryl Hannah, starred_in, Blade Runner)
    SELECT ?actor ?movie WHERE {
    ?actor starred_in ?movie
    }
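To make the query semantics concrete, here is a minimal sketch of triple-pattern matching in plain Python. It is not a SPARQL engine: the `match` function is a hypothetical helper that handles a single pattern, treating terms that start with `?` as variables, which is enough to mimic the query on this slide.

```python
# The example graph as (subject, predicate, object) tuples.
graph = [
    ("Carrie Fisher", "starred_in", "Star Wars"),
    ("Harrison Ford", "starred_in", "Star Wars"),
    ("Harrison Ford", "starred_in", "Blade Runner"),
    ("Daryl Hannah", "starred_in", "Blade Runner"),
]


def match(graph, pattern):
    """Yield one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


rows = list(match(graph, ("?actor", "starred_in", "?movie")))
for row in rows:
    print(row["?actor"], "->", row["?movie"])
```

Fixing a term narrows the result: `match(graph, ("?actor", "starred_in", "Blade Runner"))` yields only the two actors linked to that film.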
  • 18. Levels of Data abstraction
    Direct SPARQL access
    O-RDF Mapper
    SuRF
  • 19. O-RDF Mapper, Why?
    Clean OO design
    Increased productivity
    model is free from persistence constraints
    Separation of concerns and specialization
    ORMs often reduce the amount of code needed to be written, making the software more robust
    20% to 30% less code needs to be written
    Less code – less testing – fewer errors
  • 20. O-RDF Mapper, How?
    How do we see RDF data?
    As a SET of triples?
    As a SET of resources?
    The resource view is more suitable for the OO model
    How do we define an RDF resource?
    All triples <S,P,O> with the same subject (ActiveRDF, SuRF)
    And all triples <O,P,S>, where the resource appears as the object (SuRF)
    Apply Open World principles
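The resource view can be sketched as a grouping step over a set of triples. The function name `resource_view` and the `ex:`/`foaf:` sample data are assumptions for illustration; this is not how SuRF itself is implemented.

```python
from collections import defaultdict


def resource_view(triples, uri):
    """Collect the triples that describe `uri`.

    direct:  predicate -> objects, from triples where uri is the subject
    inverse: predicate -> subjects, from triples where uri is the object
    """
    direct, inverse = defaultdict(list), defaultdict(list)
    for s, p, o in triples:
        if s == uri:
            direct[p].append(o)
        if o == uri:
            inverse[p].append(s)
    return dict(direct), dict(inverse)


# Illustrative data only.
triples = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:carol", "foaf:knows", "ex:alice"),
]

direct, inverse = resource_view(triples, "ex:alice")
print(direct)
print(inverse)
```

The direct part corresponds to the "same subject" definition, and the inverse part to the additional <O,P,S> triples that the slide attributes to SuRF.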
  • 22. SuRF – Semantic Resource Framework
    Inspired by ActiveRDF (developed at DERI for Ruby)
    Expose RDF as sets of resources
    Semantic attributes exposed as a “virtual API”, generated through introspection.
    Naming convention:
    instance.namespace_attribute
    cosmin.foaf_knows
    Finder methods
    Retrieve resources by type or by attributes
    The session keeps track of resources; on session.commit(), only dirty (modified) resources are persisted
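The naming convention and the dirty-tracking session can be illustrated with a toy model. Everything below is a simplified assumption written for Python 3: the `Resource` and `Session` classes are stand-ins rather than SuRF's real classes, and the "store" is just a list.

```python
class Resource(object):
    """Toy resource: attributes named <prefix>_<name> map to predicates."""

    def __init__(self, subject, session):
        # Bypass __setattr__ for internal bookkeeping fields.
        object.__setattr__(self, "subject", subject)
        object.__setattr__(self, "values", {})
        object.__setattr__(self, "dirty", False)
        session.resources.append(self)

    def __getattr__(self, name):
        # Called only when normal lookup fails, e.g. for foaf_knows.
        prefix, sep, local = name.partition("_")
        if not (prefix and sep and local):
            raise AttributeError(name)
        return self.values.get((prefix, local), [])

    def __setattr__(self, name, value):
        prefix, sep, local = name.partition("_")
        if not (prefix and sep and local):
            raise AttributeError(name)
        self.values[(prefix, local)] = list(value)
        object.__setattr__(self, "dirty", True)


class Session(object):
    """Toy session: remembers resources and persists only the dirty ones."""

    def __init__(self):
        self.resources = []
        self.persisted = []  # stands in for a real RDF store

    def commit(self):
        for resource in self.resources:
            if resource.dirty:
                self.persisted.append(resource.subject)
                object.__setattr__(resource, "dirty", False)


session = Session()
cosmin = Resource("http://example.org/cosmin", session)
other = Resource("http://example.org/other", session)

cosmin.foaf_knows = ["http://example.org/other"]
session.commit()
print(session.persisted)
```

Only the modified resource reaches the stand-in store; the untouched one is skipped, and a second `commit()` with no changes writes nothing.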
  • 23. SuRF – Architecture
  • 24. SuRF – Architecture – Currently supported plugins
    Add your own plugins, extend:
    surf.store.plugins.RDFReader
    surf.store.plugins.RDFWriter
    Redefine the __type__ attribute
    This is the plugin identifier
    To install plugins
    import my_plugin
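One way an `import` can be enough to install a plugin is a registry keyed by the `__type__` identifier. The sketch below is an assumption about the general pattern, written for Python 3, and not SuRF's actual mechanism: the `registry` attribute, the `make_reader` helper and the `MyReader` class are all hypothetical.

```python
class RDFReader(object):
    """Hypothetical base class; subclasses register under their __type__."""

    __type__ = None
    registry = {}  # shared by all subclasses: identifier -> plugin class

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.__type__ is not None:
            RDFReader.registry[cls.__type__] = cls


# In a real project this class would live in my_plugin.py; importing that
# module runs the class statement, which is what triggers registration.
class MyReader(RDFReader):
    __type__ = "my-reader"

    def read(self, query):
        return "results for %s" % query


def make_reader(identifier):
    """Look up a reader class by its identifier and instantiate it."""
    return RDFReader.registry[identifier]()


reader = make_reader("my-reader")
print(reader.read("SELECT * WHERE { ?s ?p ?o }"))
```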
  • 25. SuRF - installation
    Available on PyPI
    easy_install -U surf (to get the latest)
    Open source, available on Google Code under the BSD licence
    http://code.google.com/p/surfrdf/
  • 27. SuRF – simple example
    DBpedia public SPARQL endpoint - read-only
    Create the store proxy
        from surf import *
        store = Store(reader='sparql-protocol',
                      endpoint='http://dbpedia.org/sparql',
                      default_graph='http://dbpedia.org')
    Create the SuRF session
        print('Create the session')
        session = Session(store, {})
    Map a DBpedia concept to an internal class
        PhilCollinsAlbums = session.get_class(ns.YAGO['PhilCollinsAlbums'])
  • 28. SuRF – simple example
    DBpedia public SPARQL endpoint - read-only
    Get all Phil Collins albums
        all_albums = PhilCollinsAlbums.all()
    Do something with the albums (display the links to their covers)
        print('All covers')
        for a in all_albums:
            if a.dbpedia_name:
                print(' Cover %s for "%s"' % (a.dbpedia_cover, a.dbpedia_name))
  • 30. SuRF – integrate into Pylons
    Create a blog on top of an RDF database
    Replace SQLAlchemy with SuRF
    Download and install either AllegroGraph Free Edition (preferred) or Sesame2
    http://www.franz.com/downloads/clp/ag_survey
    Free for up to 50,000,000 triples (records)
    Install Pylons: easy_install pylons
    Install SuRF: easy_install surf
    Create a pylons application:
    paster create -t pylons MyBlog
    cd MyBlog
  • 31. SuRF – Pylons Blog
    ~/MyBlog/development.ini: in the [app:main] section add
        rdf_store = localhost
        rdf_store_port = 6789
        rdf_repository = tagbuilder
        rdf_catalog = repositories
    ~/MyBlog/myblog/config/environment.py
        from surf import *
        rdf_store = Store(reader='sparql-sesame2-api',
                          writer='sesame2-api',
                          server=config['rdf_store'],
                          port=config['rdf_store_port'],
                          catalog=config['rdf_catalog'],
                          repository=config['rdf_repository'])
        rdf_session = Session(rdf_store, {})
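For readers who want to check the settings outside Pylons, the same section can be read with Python's standard `configparser`. This is only an illustration: Pylons loads the file through its own machinery, and the `store_settings` dictionary and its key names are hypothetical. The point worth noting is that .ini values arrive as strings, so the port needs an explicit conversion wherever an integer is expected.

```python
import configparser

# Same keys as the [app:main] additions shown on the slide.
INI_TEXT = """
[app:main]
rdf_store = localhost
rdf_store_port = 6789
rdf_repository = tagbuilder
rdf_catalog = repositories
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
section = parser["app:main"]

# .ini values are strings; convert the port explicitly if an int is needed.
store_settings = {
    "server": section["rdf_store"],
    "port": section.getint("rdf_store_port"),
    "catalog": section["rdf_catalog"],
    "repository": section["rdf_repository"],
}
print(store_settings)
```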
  • 32. SuRF – Pylons Blog
    ~/MyBlog/myblog/model/__init__.py
        from surf import *
        def init_model(session):
            # Blog is declared global so controllers can reach it as model.Blog
            global rdf_session, Blog
            rdf_session = session
            # register a namespace for the concepts in my blog
            ns.register(myblog='http://example.url/myblog/namespace#')
            Blog = rdf_session.get_class(ns.MYBLOG['Blog'])
    Create the blog controller: paster controller blog
    ~/MyBlog/myblog/controllers/blog.py
        import logging
        from myblog.lib.base import *
        log = logging.getLogger(__name__)
        class BlogController(BaseController):
            def index(self):
                c.posts = model.Blog.all(0, 5)
                return render("/blog/index.html")
  • 33. SuRF – Pylons Blog
    Create the template directory: mkdir ~/MyBlog/myblog/templates/blog
    ~/MyBlog/myblog/templates/blog/index.html
    <%inherit file="site.html" />
    <%def name="title()">MyBlog Home</%def>
    <p>${len(c.posts)} new blog posts!</p>
    % for post in c.posts:
    <p class="content" style="border-style:solid;border-width:1px">
    <span class="h3"> ${post.myblog_title} </span>
    <span class="h4">Posted on: ${post.myblog_date} by ${post.myblog_author}</span>
    <br> ${post.myblog_content}
    </p>
    % endfor
    ~/MyBlog/myblog/templates/blog/site.html
    Start the built-in development server:
    paster serve --reload development.ini
  • 34. SuRF – Tapping into the Web of Data
    Can tap into the web of Data
    SPARQL endpoints
    Local or remote RDF Stores
    Plugin framework, allows for more access protocols to be defined
    Code is generated dynamically (pragmatic bottom up approach):
    Introspection, meta-programming,
    exposing a virtual API (defined by the data and the schema) to the developer
    Can easily be integrated into popular Python frameworks
    Pylons
  • 35. exit()
    cosmin.basca@deri.org
    http://code.google.com/p/surfrdf/
    easy_install -U surf