The Talis Platform: A Linked Data Engine. Leigh Dodds, Platform Programme Manager. SemTech, June 2010. http://creativecommons.org/licenses/by/2.0/uk/
Agenda: Platform Overview; Managing RDF in the Platform; Data Extraction Features; Current & Recent Projects
 
Platform Overview
Software as a Service: Multi-tenant data storage service
Platform Stores: Self-contained data stores with services that operate on their contents
Unstructured Data Storage: Store any binary content
Structured Data Storage: RDF triple store
Access Control: Stores are world-readable by default; configurable access options; HTTP Digest Authentication
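As a rough illustration of the HTTP Digest Authentication mentioned above, a client could prepare an authenticated opener with Python's standard library. The store URL and credentials here are hypothetical placeholders, not real Platform values, and nothing is sent over the network.

```python
import urllib.request

def digest_opener(store_url, username, password):
    """Return an opener that answers HTTP Digest challenges for store_url."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means "use these credentials whatever realm the server names"
    password_mgr.add_password(None, store_url, username, password)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)

# Hypothetical store URL and credentials, for illustration only
opener = digest_opener("http://example.org/stores/my-store", "alice", "secret")
```

Requests made with `opener.open(...)` would then retry with digest credentials when the server responds with a 401 challenge.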
Job Control: Trigger or schedule store management jobs; Reset, Snapshot, Restore, Reindex (future feature: Bulk Load)
RESTful APIs: Generic services that operate on any kind of data; aim for design consistency via service checklist
Standards Compliance: RDF, SPARQL, HTTP; where there are no standards we create open specifications
Branded Linked Data Hosting: Domain hosting; surfacing of platform services
The Meta Box: Managing structured metadata
Web-accessible RDF triplestore: Create, read, update, delete RDF resources
Partition data into sub-graphs: Public/private application data; separate access control options; future feature: API for managing graphs
Store data with HTTP POST: Supports RDF/XML, Turtle, N-Triples
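A minimal sketch of storing data with HTTP POST, assuming a hypothetical Metabox URL. The media types are the currently registered ones for each syntax and are an assumption here; the Platform may have expected different values. The request is only constructed, not sent.

```python
import urllib.request

# Assumed media types for the three supported syntaxes
RDF_MEDIA_TYPES = {
    "rdfxml": "application/rdf+xml",
    "turtle": "text/turtle",
    "ntriples": "application/n-triples",
}

def build_store_request(metabox_url, rdf_bytes, syntax="turtle"):
    """Build (but do not send) a POST request carrying an RDF document."""
    return urllib.request.Request(
        metabox_url,
        data=rdf_bytes,
        method="POST",
        headers={"Content-Type": RDF_MEDIA_TYPES[syntax]},
    )

# Hypothetical Metabox URL and a one-triple Turtle document
doc = b'<http://example.org/id/1> <http://purl.org/dc/terms/title> "Example" .\n'
req = build_store_request("http://example.org/stores/my-store/meta", doc)
```

Passing `req` to `urllib.request.urlopen` (or to an opener with credentials) would perform the upload.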
Updates using Changesets: Vocabulary and protocol for describing changes to RDF triple stores
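To give a feel for what a changeset describes, the sketch below renders one as Turtle: a resource being changed, plus reified statements to remove and to add. The property names follow the published Changeset vocabulary as best understood here and should be checked against the specification; the helper names, the sample values, and the choice of Turtle as the serialisation are assumptions made for illustration.

```python
CS = "http://purl.org/vocab/changeset/schema#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def turtle_literal(value):
    """Quote a plain string as a Turtle literal (handles backslash and quote only)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def statement(subject, predicate, value):
    """Render one reified rdf:Statement with a literal object."""
    return (f"[ a rdf:Statement ; rdf:subject <{subject}> ; "
            f"rdf:predicate <{predicate}> ; rdf:object {turtle_literal(value)} ]")

def changeset_turtle(subject, predicate, old_value, new_value, reason, creator, created):
    """Describe replacing one literal value on subject as a single changeset."""
    return "\n".join([
        f"@prefix cs: <{CS}> .",
        f"@prefix rdf: <{RDF}> .",
        "",
        "[] a cs:ChangeSet ;",
        f"   cs:subjectOfChange <{subject}> ;",
        f"   cs:createdDate {turtle_literal(created)} ;",
        f"   cs:creatorName {turtle_literal(creator)} ;",
        f"   cs:changeReason {turtle_literal(reason)} ;",
        f"   cs:removal {statement(subject, predicate, old_value)} ;",
        f"   cs:addition {statement(subject, predicate, new_value)} .",
        "",
    ])

# Hypothetical resource and values
text = changeset_turtle(
    "http://example.org/id/1", "http://purl.org/dc/terms/title",
    "Old title", "New title", "Fix title", "Editor", "2010-06-01T00:00:00Z",
)
```

Because each changeset names both the old and the new statements, a store can apply it, reject it if the old statement is absent, or keep it as an audit record.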
Support for Versioned Updates: Maintain audit trail of changes to Metabox
Batch Update Mechanism: Combine several changesets into a single request; applied atomically
Data Extraction Features: Searching; SPARQL; Augmentation
Every Store has a Search Index: Full-text index over RDF literals in the Metabox; configurable indexing options
Standard Search Engine features: Paging, sorting, relevance ranking; flexible query syntax (fielded and boolean searches)
/items?query=[query]&max=[10]&offset=[0]&sort=[comma-separated fieldnames]&xsl=[XSLT stylesheet]&content-type=[mimetype for XSLT results]
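The search URL template above can be filled in programmatically. This is a small sketch that uses only the parameters shown on the slide; the store base URL is a hypothetical placeholder.

```python
from urllib.parse import urlencode

STORE_BASE = "http://example.org/stores/my-store"  # hypothetical store URL

def search_url(query, max_results=10, offset=0, sort=None, base=STORE_BASE):
    """Build a search URL for the store's /items service."""
    params = {"query": query, "max": max_results, "offset": offset}
    if sort:
        # sort takes a comma-separated list of field names
        params["sort"] = ",".join(sort)
    return f"{base}/items?{urlencode(params)}"

first_page = search_url("linked data", sort=["title", "date"])
second_page = search_url("linked data", offset=10)
```

Paging is then a matter of increasing `offset` by `max_results` for each successive request.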
Search Results are RSS 1.0: Includes OpenSearch extensions (paging, relevance ranking); includes full description of each RDF resource
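Since RSS 1.0 is itself RDF/XML, search results can be read with any XML parser. The sketch below parses a hand-written sample feed, not actual Platform output; the OpenSearch 1.1 namespace and the element layout are assumptions about what such a response might look like.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "os": "http://a9.com/-/spec/opensearch/1.1/",  # assumed OpenSearch namespace
}
RDF_ABOUT = "{%s}about" % NS["rdf"]

# Hand-written sample for illustration only
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
  <channel rdf:about="http://example.org/stores/my-store/items?query=linked">
    <title>linked</title>
    <os:totalResults>2</os:totalResults>
    <os:startIndex>0</os:startIndex>
    <os:itemsPerPage>10</os:itemsPerPage>
  </channel>
  <item rdf:about="http://example.org/id/1"><title>First</title></item>
  <item rdf:about="http://example.org/id/2"><title>Second</title></item>
</rdf:RDF>"""

def parse_results(rss_text):
    """Return (total result count or None, list of {uri, title} dicts)."""
    root = ET.fromstring(rss_text)
    total = root.findtext("rss:channel/os:totalResults", namespaces=NS)
    items = [
        {"uri": item.get(RDF_ABOUT), "title": item.findtext("rss:title", namespaces=NS)}
        for item in root.findall("rss:item", NS)
    ]
    return (int(total) if total is not None else None), items

total, items = parse_results(SAMPLE)
```

An RDF-aware library would be the better choice when the full description of each resource is needed, since the items may carry arbitrary additional properties.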
Facetted Search: Group search results by specific fields; simple XML response format
/services/facet?query=[query]&fields=[comma-separated fieldnames]&top=[10]&format=[xml|html]
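A facet request can be assembled the same way as a search request. This sketch covers only the parameters listed in the template above; the store base URL and the field names in the example are hypothetical.

```python
from urllib.parse import urlencode

STORE_BASE = "http://example.org/stores/my-store"  # hypothetical store URL

def facet_url(query, fields, top=10, fmt="xml", base=STORE_BASE):
    """Build a URL asking the facet service to group results by fields."""
    if fmt not in ("xml", "html"):
        raise ValueError("format must be 'xml' or 'html'")
    params = {"query": query, "fields": ",".join(fields), "top": top, "format": fmt}
    return f"{base}/services/facet?{urlencode(params)}"

url = facet_url("linked data", ["type", "creator"])
```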
Augmentation: Automatic data annotation; pipe an RSS 1.0 feed through a Store and enrich it with available data
[Diagram: RSS 1.0 feed in, Augmenter drawing on the MetaBox, RSS 1.0 feed out]
/services/augment?data-uri=[url-of-RSS-feed]&xsl=[XSLT stylesheet]&content-type=[mimetype for XSLT results]
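The one detail worth showing for the augmentation service is that the feed address travels as a query parameter, so it has to be percent-encoded. A minimal sketch, with a hypothetical store base URL and feed URL:

```python
from urllib.parse import urlencode

STORE_BASE = "http://example.org/stores/my-store"  # hypothetical store URL

def augment_url(feed_url, xsl=None, content_type=None, base=STORE_BASE):
    """Build a URL asking the store to augment the RSS 1.0 feed at feed_url."""
    params = {"data-uri": feed_url}
    # The optional parameters apply an XSLT stylesheet to the augmented feed
    if xsl:
        params["xsl"] = xsl
    if content_type:
        params["content-type"] = content_type
    return f"{base}/services/augment?{urlencode(params)}"

url = augment_url("http://example.org/feed.rss")
```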
SPARQL Query API: Standards-compliant SPARQL 1.0 service; early access to draft SPARQL 1.1 features
/services/sparql?query=[query]&output=[syntax (xml, rdf, json)]
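A SPARQL query contains spaces, braces and question marks, so it needs encoding before it goes into the URL template above. This sketch assumes a hypothetical store base URL and accepts only the three output values listed on the slide.

```python
from urllib.parse import urlencode

STORE_BASE = "http://example.org/stores/my-store"  # hypothetical store URL
OUTPUT_SYNTAXES = {"xml", "rdf", "json"}

def sparql_url(query, output="xml", base=STORE_BASE):
    """Build a GET URL for the store's SPARQL service."""
    if output not in OUTPUT_SYNTAXES:
        raise ValueError(f"output must be one of {sorted(OUTPUT_SYNTAXES)}")
    params = {"query": query, "output": output}
    return f"{base}/services/sparql?{urlencode(params)}"

QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
url = sparql_url(QUERY, output="json")
```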
Current Projects: Quick tour of current & recent projects
BBC: Crawling and hosting Linked Data from bbc.co.uk; public SPARQL endpoint
fanhu.bz: Community annotation of Linked Data using Twitter; based on BBC Linked Data
data.gov.uk: Linked Data from UK Government; domain hosting; public SPARQL and Search APIs
BIS Research Explorer: Explore UK research project funding; http://bis.clients.talis.com
Ordnance Survey: Linked Data UK geography and gazetteer; domain hosting; public SPARQL and Search APIs
UK & EU Research Projects: Linked Data from EU Government and UK cultural heritage
Talis Connected Commons: Free use of the Platform for Public Domain data; http://www.talis.com/cc
Your Organisation?: Help explore potential of Linked Data; developer workshops, training, data conversions
Summing Up: Summary, Additional Resources
The Talis Platform provides: Cloud-based data storage; simple API for managing data; flexible data extraction features; Linked Data publishing platform
Additional Resources: API Reference http://n2.talis.com/wiki/Platform_API; Mailing List http://groups.google.com/group/n2-dev; Blog http://blogs.talis.com/n2/; Support Desk http://talisplatform.zendesk.com

Editor's Notes

  • #6 Multiple users. Zero install. SaaS model: instant access to features. We worry about the data management, but leave you in control. All new versions of Talis applications are now built on the same Platform, e.g. Engage. So you can not only build new apps on the service, but also access the data underlying existing services.
  • #7 i.e. similar to Amazon S3. Can upload and store any kinds of data. May be web site assets, e.g. images, CSS, javascript, etc. May be other documents or collateral.
  • #8 i.e. similar to Amazon S3. Can upload and store any kinds of data. May be web site assets, e.g. images, CSS, javascript, etc. May be other documents or collateral.
  • #9 The main set of features is around structured data storage: management of RDF (Resource Description Framework) metadata. More later, but basically RDF is a means to capture, in a highly structured and flexible way, metadata about anything.
  • #10 As you remain in control of your data, you obviously want to control who has access to it. By default on the platform we allow public read access, but this can be changed. Each store has its own set of access control options, i.e. which platform users can access which features. There is a useful set of defaults, i.e. public read, with admin access required to add, update or modify configuration.
  • #12 Roy Fielding’s thesis describes a formal basis for the web architecture; Fielding also co-authored several of the core Internet RFCs, including HTTP. There is growing agreement that following these architectural principles is the best way to build internet-scale applications, whether that means web sites or web APIs. Anything else means you’re working against the web architecture, and so using a sub-optimal solution. This is why the Talis Platform follows these principles rather than using, say, SOAP or some other web services API. REST, essentially, involves using HTTP correctly. It is about understanding and using the HTTP protocol to its fullest extent, because in doing so you allow web browsers, proxy servers, search engines, etc. to all interact with your application correctly and in a way that scales massively.
  • #13 And all of this is made available through a standards-compliant framework. With essentially one exception (which I’ll point out later), everything that we’ll look at is based on open internet standards. The technologies like HTTP, RDF and SPARQL, and all of the data formats we generate, are open standards. This is part of the Talis ethos. We don’t believe in proprietary software. We use and create a lot of open source software ourselves and believe this is the only viable way for internet services to develop.
  • #14 As well as following the REST architectural guidelines, within the Platform team we have our own set of best practices that apply to the design of new services. The service checklist is online as part of the API documentation, and includes things like ensuring that we have a consistent URL structure, that there’s a human interface to every API to make it easy to play with the system, that error messages are human-readable, etc.
  • #17 E.g. public data, but also private authentication data.
  • #18 E.g. public data, but also private authentication data.
  • #19 E.g. public data, but also private authentication data.
  • #22 Want to review some basic concepts and technologies that underpin the design and implementation of the Platform. How many people already understand the terms REST, Content Negotiation, RDF?
  • #32 E.g. public data, but also private authentication data.