Getting Started With The Talis Platform



Developer training session providing an overview of the core features and services of the Talis Platform. Includes a basic overview of REST and RDF.

Published in: Education, Technology

    1. 1. Getting Started with the Talis Platform
        • Leigh Dodds
        • Platform Programme Manager
        • Talis
        • December 2008
    2. 2. Agenda
        • Platform Overview
        • Core Concepts
        • Review of the RDF Model
        • Managing binary data
        • Managing structured metadata
        • Exploring RDF data with SPARQL
        • Extra Features
        • Store Administration
        • Summary
    3. 3. Platform Overview
    4. 4. Software as a Service Multi-Tenant Data Storage Service
    5. 5. Unstructured Data Storage e.g. binary files, including images, documents, etc
    6. 6. Structured Data Storage RDF metadata
    7. 7. Access Control All data is open (to read) by default Configurable access options
    8. 8. Full-Text Searching and Querying
    9. 9. Standards Compliance RDF, SPARQL, HTTP
    10. 10. Platform Architecture Web API Metabox Contentbox
    11. 11. REST, RDF Authentication & Authorization Content Negotiation Core Concepts aka “The Science Bit”
    12. 12. REST Representational State Transfer Correct Use of HTTP
    13. 13. Resource-Centric API Everything has a unique URI
    14. 14. Interact with resources using HTTP GET = read PUT = write POST = update/modify DELETE = delete
    15. 15. Use HTTP Response Codes 200 = OK 201 = Created (new resource) 202 = Accepted (for processing) 400 = Bad Request 500 = Server Error
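The method and status-code conventions on the two slides above can be captured as a small lookup. This is only a sketch: the two tables repeat what the slides say, while the `describe` helper is a hypothetical name introduced for illustration.

```python
# Mappings copied from the slides; the helper function is illustrative only.
METHOD_MEANING = {
    "GET": "read",
    "PUT": "write",
    "POST": "update/modify",
    "DELETE": "delete",
}

STATUS_MEANING = {
    200: "OK",
    201: "Created (new resource)",
    202: "Accepted (for processing)",
    400: "Bad Request",
    500: "Server Error",
}


def describe(method, status):
    """Summarise one request/response exchange in plain text."""
    verb = method.upper()
    action = METHOD_MEANING.get(verb, "unknown")
    outcome = STATUS_MEANING.get(status, "not covered in this session")
    return f"{verb} ({action}) -> {status} {outcome}"
```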
    16. 16. MIME Types Used to identify content & meaning of request and response body
    17. 17. Content Negotiation Majority of services support multiple output options, list varies by resource Accept header output parameter
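The two negotiation routes named on this slide (the Accept header and an output parameter) might look as follows from a Python client. This is a sketch under assumptions: the host `api.example.org` and the parameter value `json` are placeholders, and the requests are only constructed, not sent.

```python
import urllib.parse
import urllib.request


def negotiate_by_header(url, mime_type):
    """Ask for a representation through the HTTP Accept header."""
    return urllib.request.Request(url, headers={"Accept": mime_type})


def negotiate_by_parameter(url, output):
    """Ask for a representation through an 'output' query parameter."""
    separator = "&" if "?" in url else "?"
    return url + separator + urllib.parse.urlencode({"output": output})


# Hypothetical store URL, used only to show the two call styles.
request = negotiate_by_header(
    "http://api.example.org/storename/meta", "application/rdf+xml"
)
link = negotiate_by_parameter("http://api.example.org/storename/meta?about=x", "json")
```

The header route keeps URLs clean; the parameter route is convenient when the URL is pasted into a browser, which cannot easily change its Accept header.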
    18. 18. Our Service Checklist
        • Consistent URI structure
        • Every service has human interface
        • Plain text error messages for easy debugging
        • Cacheable
        • … etc
    19. 19. Authentication HTTP Digest Authentication
    20. 20. Authentication Example
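A minimal sketch of HTTP Digest authentication from a Python client, using only the standard library. The store URL and credentials are hypothetical placeholders; the opener is built but no request is sent here.

```python
import urllib.request


def build_digest_opener(store_url, username, password):
    """Return an opener that answers Digest challenges for store_url."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None means "use these credentials for whatever realm the server names".
    passwords.add_password(None, store_url, username, password)
    handler = urllib.request.HTTPDigestAuthHandler(passwords)
    return urllib.request.build_opener(handler)


# Hypothetical values, for illustration only.
opener = build_digest_opener("http://api.example.org/storename", "user", "secret")
```

Calling `opener.open(request)` would then retry automatically with an Authorization header when the server responds with a 401 Digest challenge.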
    21. 21. Authorization By default stores are world-readable, Store owner writable Customisable roles and privileges per-Store
    22. 22. Review of the RDF Model
    23. 23. Apollo 11 was launched from Cape Canaveral
    24. 24. Apollo 11 was launched from Cape Canaveral Subject Predicate Object
    25. 25. <…> <…> <…> .
    26. 26. space:spacecraft/apollo-11 space:launchsite space:launchsite/capecanaveral .
    27. 27. space:spacecraft/apollo-11 space:launchsite space:launchsite/capecanaveral .
        space:spacecraft/apollo-11 rdfs:label "Apollo 11" .
        space:launchsite/capecanaveral rdfs:label "Cape Canaveral" .
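The three statements above can be modelled as plain (subject, predicate, object) tuples, which is a handy way to see that an RDF graph is just a set of triples. This is a sketch: the `space` namespace URI below is a hypothetical stand-in, and the distinction between URI objects and literal objects is glossed over.

```python
SPACE = "http://example.org/space/"  # hypothetical namespace URI
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

APOLLO_11 = SPACE + "spacecraft/apollo-11"
CAPE_CANAVERAL = SPACE + "launchsite/capecanaveral"

# A graph is a set of (subject, predicate, object) triples.
graph = {
    (APOLLO_11, SPACE + "launchsite", CAPE_CANAVERAL),
    (APOLLO_11, RDFS_LABEL, "Apollo 11"),
    (CAPE_CANAVERAL, RDFS_LABEL, "Cape Canaveral"),
}


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}
```

For example, `objects(APOLLO_11, RDFS_LABEL)` yields the label of the spacecraft, and following the `launchsite` object to its own label walks one step across the graph.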
    28. 30. Benefits of RDF?
    29. 31. Good for Semi-structured Data “Schema-Free” Very Flexible
    30. 32. Extensible New properties New resources New types of resource New statements
    31. 33. Encourages Convergence Reuse of vocabularies (i.e. properties) Reuse of identifiers (i.e. talk about the same things)
    32. 34. Simplifies Data Integration and Aggregation Shared identifiers Common data model Common query language Common data formats
    33. 35. Several Different Ways to Serialize RDF Optimized for different purposes
    34. 36. Turtle Simple to read and hand-author Used in SPARQL query language
    35. 37. @prefix rdf: <…> .
        @prefix space: <…> .
        @prefix dc: <…> .
        <…> rdf:type <…> ;
            dc:description "Apollo 11 was…" ;
            space:agency "United States" .
    36. 38. RDF/XML Best for data interchange Harder to read
    37. 39. <rdf:RDF xmlns:j.0="…" xmlns:rdf="…" xmlns:space="…" xmlns:dc="…" xml:base="…">
          <rdf:Description rdf:about="/spacecraft/1969-059A">
            <dc:description>Apollo 11 was…</dc:description>
            <rdf:type rdf:resource="…"/>
            <space:agency>United States</space:agency>
          </rdf:Description>
        </rdf:RDF>
    38. 40. The Content Box Managing unstructured, binary data
    39. 41. Store any stream of binary data Images, documents, Javascript, etc
    40. 42. Full HTTP Caching Support ETags Efficient retrieval Conditional updates
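ETag-based retrieval and conditional updates rely on standard HTTP headers. The sketch below builds (but does not send) such requests in Python; the URL and ETag value are hypothetical, and the assumption that the Contentbox honours `If-Match` on updates is inferred from the slide rather than confirmed by it.

```python
import urllib.request


def conditional_get(url, etag):
    """GET that lets the server reply 304 Not Modified if the ETag still matches."""
    return urllib.request.Request(url, headers={"If-None-Match": etag})


def conditional_put(url, body, etag):
    """PUT that should only be applied if the stored item still has this ETag."""
    return urllib.request.Request(
        url, data=body, method="PUT", headers={"If-Match": etag}
    )


# Hypothetical item URL and ETag.
item = "http://api.example.org/storename/items/logo"
read = conditional_get(item, '"abc123"')
write = conditional_put(item, b"new bytes", '"abc123"')
```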
    41. 43. Server or Client Assignment of Identifiers Provides full control over how URIs assigned
    42. 44. ContentBox URLs
        • /storename/items: the Contentbox container
        • /storename/items/<id>: an individual item
    43. 45. Adding Content
    44. 46. Deleting Content
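Adding and deleting content map onto the two Contentbox URLs listed earlier. The sketch below assumes the usual REST convention (POST to the container for a server-assigned URI, DELETE on the item itself); the host name and helper names are hypothetical, and the requests are constructed without being sent.

```python
import urllib.request

BASE = "http://api.example.org"  # hypothetical host


def add_item(store, data, mime_type):
    """POST bytes to the container; the server would assign the item URI."""
    return urllib.request.Request(
        f"{BASE}/{store}/items",
        data=data,
        method="POST",
        headers={"Content-Type": mime_type},
    )


def delete_item(store, item_id):
    """DELETE an individual item by its identifier."""
    return urllib.request.Request(f"{BASE}/{store}/items/{item_id}", method="DELETE")
```

On success, the slides' status-code table suggests a 201 Created response for the POST, with the new item's URI supplied by the server.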
    45. 47. Metadata for Contentbox Resources Minimum is URI and ETag Extract height & width of images … more metadata extraction in future
    46. 48. The Meta Box Managing structured metadata
    47. 49. Full RDF Data Storage Create, read, update, delete RDF resources Query RDF data
    48. 50. Configurable Full Text Indexing of RDF Indexes updated whenever new metadata added
    49. 51. Versioned and Un-Versioned Updates By submitting data to separate resources Maintain audit trail
    50. 52. Can be Divided into Sub-Graphs Separate access control options
    51. 53. Metabox URLs
        • /storename/meta: the metabox
        • /storename/meta/changesets: the collection of changesets associated with this metabox
        • /storename/meta/graphs: the collection of sub-graphs
        • /storename/meta/graphs/{id}: a sub-graph
        • /storename/meta/graphs/{id}/changesets: the collection of changesets associated with a sub-graph
        • /storename/services/sparql: SPARQL endpoint for metabox
        • /storename/services/multisparql: SPARQL endpoint for querying across all sub-graphs
    52. 54. Storing RDF POST application/rdf+xml Changes saved immediately Search indexing asynchronous
    53. 55. Triples are Merged into Store Can catch out the unwary Updates happen through separate mechanism
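Why merging "can catch out the unwary" is easiest to see with a toy model. In the sketch below a store is a Python set of triples and a plain RDF POST is a set union; the names are hypothetical shorthand, not Platform identifiers.

```python
def post_rdf(store, new_triples):
    """Model of a plain RDF POST: triples are added, nothing is removed."""
    return store | set(new_triples)


store = {("apollo-11", "launched", "1969-07-16")}

# Posting a "corrected" value does not replace the old one...
store = post_rdf(store, [("apollo-11", "launched", "1969-07-16T13:32:00")])

# ...so the resource now has two launch values.
launch_values = sorted(o for (s, p, o) in store if p == "launched")
```

Removing the stale value needs the separate update mechanism described on the following slides.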
    54. 56. Retrieving Metadata /meta?about=…URI… Can select RDF serialization
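Because the value of `about` is itself a URI, it has to be percent-encoded before it goes into the query string. A minimal Python sketch, with a hypothetical host and resource URI:

```python
import urllib.parse


def describe_url(store_base, resource_uri):
    """Build the /meta?about=... URL for one resource."""
    return f"{store_base}/meta?" + urllib.parse.urlencode({"about": resource_uri})


url = describe_url(
    "http://api.example.org/storename",
    "http://example.org/space/spacecraft/apollo-11",
)
```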
    55. 57. Updating Resources POST application/vnd.talis.changeset+xml
    56. 58. ChangeSets Vocabulary that specifies removals/additions to an RDF graph
    57. 59. <rdf:RDF xmlns:rdf="…" xmlns:cs="…">
          <cs:ChangeSet rdf:about="…">
            <cs:subjectOfChange rdf:resource="…"/>
            <cs:createdDate>2008-12-08T00:00:00Z</cs:createdDate>
            <cs:creatorName>Leigh Dodds</cs:creatorName>
            <cs:changeReason>More accurate launch time</cs:changeReason>
            <cs:removal>
              <rdf:Statement>
                <rdf:subject rdf:resource="…"/>
                <rdf:predicate rdf:resource="…"/>
                <rdf:object>1969-07-16</rdf:object>
              </rdf:Statement>
            </cs:removal>
            <cs:addition>
              <rdf:Statement>
                <rdf:subject rdf:resource="…"/>
                <rdf:predicate rdf:resource="…"/>
                <rdf:object>1969-07-16T13:32:00</rdf:object>
              </rdf:Statement>
            </cs:addition>
          </cs:ChangeSet>
        </rdf:RDF>
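The effect of a changeset on a graph can be sketched with set operations: take out the removals, then put in the additions. This is a simplified model in Python with hypothetical shorthand names; it says nothing about how the Platform validates or stores changesets.

```python
def apply_changeset(graph, removals, additions):
    """Return a new graph with removals taken out and additions put in."""
    return (graph - set(removals)) | set(additions)


before = {
    ("apollo-11", "launched", "1969-07-16"),
    ("apollo-11", "agency", "United States"),
}

after = apply_changeset(
    before,
    removals=[("apollo-11", "launched", "1969-07-16")],
    additions=[("apollo-11", "launched", "1969-07-16T13:32:00")],
)
```

Unlike a plain POST of RDF, the old launch date is gone afterwards, and unrelated triples such as the agency are left untouched.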
    60. 62. Versioned Updates POST to /meta/changesets Applies the update and stores the changeset for later retrieval
    61. 63. Batch Updates Combine several changesets into single POST Linked together to define ordering
    62. 64. <rdf:RDF xmlns:rdf="…" xmlns:cs="…">
          <cs:ChangeSet rdf:about="…">
            <cs:subjectOfChange rdf:resource="…"/>
            <cs:changeReason>More accurate launch time</cs:changeReason>
            <cs:precedingChangeset rdf:resource="…"/>
            <!-- changes -->
          </cs:ChangeSet>
          <cs:ChangeSet rdf:about="…">
            <cs:subjectOfChange rdf:resource="…"/>
            <cs:precedingChangeset rdf:resource="…"/>
            <!-- changes -->
          </cs:ChangeSet>
          <cs:ChangeSet rdf:about="…">
            <cs:subjectOfChange rdf:resource="…"/>
            <!-- changes -->
            ...
          </cs:ChangeSet>
        </rdf:RDF>
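One way to read the `cs:precedingChangeset` links is as a chain that fixes the order of application: the changeset with no predecessor comes first, and each later one names the changeset before it. The Python sketch below recovers that order; it assumes a single unbranched chain, and the changeset identifiers are hypothetical.

```python
def application_order(preceding):
    """Order changesets from a mapping of id -> preceding id (None for the first)."""
    starts = [cs for cs, prev in preceding.items() if prev is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one changeset without a predecessor")
    # Invert the links so each changeset points at the one that follows it.
    followers = {prev: cs for cs, prev in preceding.items() if prev is not None}
    order = [starts[0]]
    while order[-1] in followers:
        order.append(followers[order[-1]])
    return order


# Mirrors the slide: the first changeset listed points at the second,
# the second at the third, and the third has no predecessor.
links = {"cs-1": "cs-2", "cs-2": "cs-3", "cs-3": None}
ordered = application_order(links)
```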
    63. 65. Data Extraction & Exploration with SPARQL
    64. 66. SPARQL RDF query language; HTTP protocol; Results format 4 different forms of query
    65. 67. ASK Test whether the graph contains some data of interest
    66. 68. # Was there a launch on 16th July 1969?
        PREFIX space: <…>
        PREFIX xsd: <…>
        ASK WHERE {
          ?launch space:launched "1969-07-16"^^xsd:date .
        }
    67. 69. <?xml version="1.0"?>
        <sparql xmlns="…">
          <head> </head>
          <boolean>true</boolean>
        </sparql>
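A query like the ASK example is sent to the endpoint over the SPARQL protocol, which carries the query text in a `query` parameter. The Python sketch below builds such a GET URL for the `/storename/services/sparql` endpoint listed earlier; the host and the `space` namespace URI are hypothetical placeholders.

```python
import urllib.parse

ASK_QUERY = """\
PREFIX space: <http://example.org/space/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
ASK WHERE { ?launch space:launched "1969-07-16"^^xsd:date . }
"""


def sparql_url(store_base, query):
    """Build a SPARQL protocol GET URL with the query percent-encoded."""
    return f"{store_base}/services/sparql?" + urllib.parse.urlencode({"query": query})


url = sparql_url("http://api.example.org/storename", ASK_QUERY)
```

Encoding matters here: the `#` in the `xsd` namespace would otherwise be treated as the start of a URL fragment and silently cut the query short.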
    68. 70. DESCRIBE Generate an RDF description of one or more resources
    69. 71. # Describe launch(es) that occurred on 16th July 1969
        PREFIX space: <…>
        PREFIX xsd: <…>
        DESCRIBE ?launch WHERE {
          ?launch space:launched "1969-07-16"^^xsd:date .
        }
    70. 72. # Describe spacecraft launched on 16th July 1969
        PREFIX space: <…>
        PREFIX xsd: <…>
        DESCRIBE ?spacecraft WHERE {
          ?launch space:launched "1969-07-16"^^xsd:date .
          ?spacecraft space:launch ?launch .
        }
    71. 73. CONSTRUCT Create a custom RDF graph based on query criteria
    72. 74. PREFIX space: <…>
        PREFIX xsd: <…>
        PREFIX foaf: <…>
        CONSTRUCT {
          ?spacecraft foaf:name ?name ;
                      space:agency ?agency ;
                      space:mass ?mass .
        }
        WHERE {
          ?launch space:launched "1969-07-16"^^xsd:date .
          ?spacecraft space:launch ?launch ;
                      foaf:name ?name ;
                      space:agency ?agency ;
                      space:mass ?mass .
        }
    73. 75. SELECT SQL style result set retrieval
    74. 76. PREFIX space: <…>
        PREFIX xsd: <…>
        PREFIX foaf: <…>
        SELECT ?name ?agency ?mass
        WHERE {
          ?launch space:launched "1969-07-16"^^xsd:date .
          ?spacecraft space:launch ?launch ;
                      foaf:name ?name ;
                      space:agency ?agency ;
                      space:mass ?mass .
        }
    75. 77. …as XML
        <?xml version="1.0"?>
        <sparql xmlns:rdf="…" xmlns="…">
          <head>
            <variable name="name"/>
            <variable name="agency"/>
            <variable name="mass"/>
          </head>
          <results>
            <result>
              <binding name="name">
                <literal>Apollo 11 Command and Service Module (CSM)</literal>
              </binding>
              <binding name="agency">
                <literal>United States</literal>
              </binding>
              <binding name="mass">
                <literal>28801.0</literal>
              </binding>
            </result>
            <!-- more results -->
          </results>
        </sparql>
    76. 78. …as JSON
        {
          "head": { "vars": [ "name", "agency", "mass" ] },
          "results": {
            "bindings": [
              {
                "name": { "type": "literal", "value": "Apollo 11 Command and Service Module (CSM)" },
                "agency": { "type": "literal", "value": "United States" },
                "mass": { "type": "literal", "value": "28801.0" }
              },
              {
                "name": { "type": "literal", "value": "Apollo 11 SIVB" },
                "agency": { "type": "literal", "value": "United States" },
                "mass": { "type": "literal", "value": "13300.0" }
              },
              {
                "name": { "type": "literal", "value": "Apollo 11 Lunar Module / EASEP" },
                "agency": { "type": "literal", "value": "United States" },
                "mass": { "type": "literal", "value": "15065.0" }
              }
            ]
          }
        }
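The JSON results format is straightforward to consume: `head.vars` names the columns and each entry in `results.bindings` is one row. A minimal Python sketch, using a shortened copy of the result above as sample data (the `rows` helper is a hypothetical name):

```python
import json

SAMPLE = """
{
  "head": { "vars": [ "name", "agency", "mass" ] },
  "results": { "bindings": [
    { "name":   { "type": "literal", "value": "Apollo 11 Command and Service Module (CSM)" },
      "agency": { "type": "literal", "value": "United States" },
      "mass":   { "type": "literal", "value": "28801.0" } },
    { "name":   { "type": "literal", "value": "Apollo 11 SIVB" },
      "agency": { "type": "literal", "value": "United States" },
      "mass":   { "type": "literal", "value": "13300.0" } }
  ] }
}
"""


def rows(results_json):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        # A variable may be unbound in a given row, so skip missing keys.
        {var: binding[var]["value"] for var in variables if var in binding}
        for binding in doc["results"]["bindings"]
    ]


table = rows(SAMPLE)
```

Note that the values arrive as strings; a mass such as "28801.0" needs an explicit `float(...)` before arithmetic.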
    77. 79. Tour of Extra Features Searching, browsing, augmentation
    78. 80. Searching Full text index over RDF literals Configurable indexing options
    79. 81. /items?query=[query] &max=[10] &offset=[0] &sort=[comma-separated fieldnames] &xsl=[XSLT stylesheet] &content-type=[mimetype for XSLT results]
    80. 82. Query Syntax
        • lunar
        • luna*
        • "apollo 11"
        • lunar OR apollo
        • name:apollo
        • (lunar OR apollo) AND agency:united states
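Putting the search parameters and the query syntax together, a search URL can be assembled as below. This Python sketch uses the `query`, `max`, `offset` and `sort` parameters from the slide; the host, store name and helper name are hypothetical.

```python
import urllib.parse


def search_url(store_base, query, max_results=10, offset=0, sort=None):
    """Build a full-text search URL against a store's /items resource."""
    params = {"query": query, "max": max_results, "offset": offset}
    if sort:
        # The slide describes sort as comma-separated field names.
        params["sort"] = ",".join(sort)
    return f"{store_base}/items?" + urllib.parse.urlencode(params)


first_page = search_url("http://api.example.org/space", "lunar OR apollo")
second_page = search_url(
    "http://api.example.org/space", "name:apollo", offset=10, sort=["name", "agency"]
)
```

Paging is then a matter of stepping `offset` by `max` until the total reported in the results is reached.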
    81. 83. Query Results RSS 1.0 feed OpenSearch extensions (paging, relevance) Full description of each resource
    82. 84. <rdf:RDF xmlns="…" xmlns:foaf="…" xmlns:relevance="…" xmlns:rdf="…" xmlns:os="…" xmlns:ns.1="…">
          <channel rdf:about="…">
            <title>lunar</title>
            <link>…</link>
            <description>Results of a search for lunar on space</description>
            <items>
              <rdf:Seq rdf:about="urn:uuid:eae4ead8-ca6a-4b12-b714-fe631d38e447">
                <rdf:li resource="…" />
              </rdf:Seq>
            </items>
            <os:startIndex>0</os:startIndex>
            <os:itemsPerPage>10</os:itemsPerPage>
            <os:totalResults>118</os:totalResults>
          </channel>
          <item rdf:about="…">
            <title>Item</title>
            <link></link>
            <relevance:score>1.0</relevance:score>
            <foaf:name>Lunar-A</foaf:name>
            <space:mass>520.0</space:mass>
            <space:internationalDesignator>LUNAR-A</space:internationalDesignator>
          </item>
        </rdf:RDF>
    83. 85. Facetted Search Similar to Amazon product search, etc Group search results by specific fields
    84. 86. /services/facet?query=[query] &fields=[comma-separated fieldnames] &top=[10] &format=[xml|html]
    85. 87. <facet-results xmlns="…">
          <head>
            <query>name:luna*</query>
            <fields>agency</fields>
            <top>10</top>
            <output>xml</output>
          </head>
          <fields>
            <field name="agency">
              <term value="U.S.S.R" number="25" facet-uri="…" search-uri="…"/>
              <term value="United States" number="9" facet-uri="…" search-uri="…"/>
              <term value="Japan" number="1" facet-uri="…" search-uri="…"/>
              <term value="India" number="1" facet-uri="…" search-uri="…"/>
            </field>
          </fields>
        </facet-results>
    86. 88. Augmentation Annotate an RSS 1.0 feed against a store Automatically add a description of each referenced resource
    87. 89. Store Administration Job Control, Store Configuration
    88. 90. Field Predicate Map Associate a short name to a RDF property Properties in field predicate map are indexed for searching Short name used in query syntax, sort order, etc
    89. 91. <rdf:RDF xmlns:rdf="…" xmlns:rdfs="…" xmlns:bf="…" xmlns:frm="…" xml:base="…">
          <bf:FieldPredicateMap rdf:about="/indexes/default/fpmaps/default">
            <frm:mappedDatatypeProperty>
              <rdf:Description rdf:about="/indexes/default/fpmaps/default#agency">
                <frm:property rdf:resource="…"/>
                <frm:name>agency</frm:name>
              </rdf:Description>
            </frm:mappedDatatypeProperty>
          </bf:FieldPredicateMap>
        </rdf:RDF>
    90. 92. Query Profile Assign weightings to fields for searching
    91. 93. <rdf:RDF xmlns:rdf="…" xmlns:rdfs="…" xmlns:bf="…" xmlns:frm="…" xml:base="…">
          <bf:QueryProfile rdf:about="">
            <bf:fieldWeight>
              <rdf:Description rdf:about="/indexes/default/queryprofiles/default#name">
                <bf:weight>10.0</bf:weight>
                <frm:name>name</frm:name>
              </rdf:Description>
            </bf:fieldWeight>
            <bf:fieldWeight>
              <rdf:Description rdf:about="/indexes/default/queryprofiles/default#agency">
                <bf:weight>5.0</bf:weight>
                <frm:name>agency</frm:name>
              </rdf:Description>
            </bf:fieldWeight>
          </bf:QueryProfile>
        </rdf:RDF>
    92. 94. Job Control Reindex, Reset, Snapshot, Restore POST Job Request to /jobs
    93. 95. <rdf:RDF xmlns:rdf="…" xmlns:rdfs="…" xmlns:bf="…">
          <bf:JobRequest>
            <rdfs:label>Reset the data in my store</rdfs:label>
            <bf:jobType rdf:resource="…"/>
            <bf:startTime>2008-12-01T15:10:00Z</bf:startTime>
          </bf:JobRequest>
        </rdf:RDF>
    94. 96. Jobs Each job is a resource, with a URI GET to monitor status, DELETE to remove
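The `bf:startTime` value in the job request is a UTC timestamp with a trailing `Z`. A small Python sketch for producing that shape from a timezone-aware datetime follows; the helper name is hypothetical, and whether the Platform accepts other timestamp forms is not covered by the slides.

```python
from datetime import datetime, timezone


def format_start_time(moment):
    """Render a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


start = format_start_time(datetime(2008, 12, 1, 15, 10, tzinfo=timezone.utc))
```

Converting to UTC before formatting keeps a job scheduled from a machine in another timezone from running at an unexpected hour.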
    95. 97. Summing Up Summary, Additional Resources
    96. 98. The Talis Platform…
        • Provides a standards compliant storage infrastructure for structured and unstructured metadata
        • Uses RDF to support widest possible variety of data models and integration options
        • Allows data assets to be managed through simple web APIs
        • Offers a range of data extraction options including full-text searching, SPARQL, RSS augmentation
        • Can be tailored to individual applications using the API
        • Can be driven by scheduling jobs to perform data management tasks
        • Is constantly evolving…
    97. 99. Additional Resources
        • API Reference
        • Mailing List
        • Blog
    98. 100. Client Libraries (in various states of development)
        • Moriarty
        • Javascript/JQuery
        • Ruby Client
        • Java Client
    99. 101. shared innovation