SemTech 2010: Pelorus Platform

Pelorus Platform is a suite of tools for building Linked Data and Semantic Web applications.

Published in: Technology, Education

Transcript

  • 1. Pelorus: A Semantic Web Application Platform 2010 Semantic Technology Conference Michael Grove Director of Software Development Clark & Parsia, LLC. mike@clarkparsia.com http://clarkparsia.com -- http://www.twitter.com/candp
  • 2. Who are we? Clark & Parsia is a Semantic software startup founded in 2005 Offices in DC and Cambridge, MA Software products for end-user and OEM use Provides software development and integration services Specializing in Semantic Web, web services, and advanced AI technologies for federal and enterprise customers.
  • 3. Where do we start? No, literally, where do we start? Enterprise increasingly wants to utilize semweb tech to manage information Lack of in-house SemWeb expertise So what's the first step in these cases? It's hard to get a project off the ground without expertise In many cases, you just want to get a prototype running ASAP to evaluate the approach An integrated platform to rapidly prototype and assess semweb tech, which also scales to production, is crucial
  • 4. The Pelorus Platform Pelorus Platform aims to ease this situation It's a standards-based application development stack geared toward enterprise information integration via RDF, SPARQL and OWL. Provides a collection of software designed to take you from ontology (or data) to application Based on years of customer engagements learning what parts are the same for everyone, and what parts are customized by everyone--and facilitating both. Minimal or no human-in-the-loop steps are required to get a barebones application running From there, it's just UI customization
  • 5. Ingredients PelletServer RESTful server-side component powered by Pellet Provides: Reasoning Semantic Search Integrity constraints Query services Machine Learning ... and Planning too! Semantic ETL Toolkit for transforming existing data into RDF Support for most common formats, XML, CSV, Excel, relational, etc. Conversion driven from domain ontology
  • 6. More Ingredients Annex - A linked data server Publishes your RDF as linked data Works in-place against any RDF database No files to parse and directory structure to fill out Javascript module and pluggable template API for rendering resources CRUD workflow support for maintaining your data
  • 7. More Ingredients Machine Learning Suite Bootstrap ontologies from existing data Provides capabilities for learning ETL transformations from existing data, decreasing by-hand mapping burden Automatically create Pelorus models for browsing Analysis support, clustering, classification, and more. Pelorus Faceted browsing via SPARQL for RDF data.
  • 8. So What Now? Intent of Platform is to take either your existing data, or an existing ontology, as input and provide as output a working skeleton application. This is the Staples Easy button for the Semantic Web. Some minimal configuration and UI customization may be required. The goal is to Just Add Data and get back a working, full-service, modern app that's optimized for data integration and analysis.
  • 9. Getting Started Legacy data in a series of databases, XML files, etc. This is a maintenance nightmare. How do you search this data, analyze it, or verify its correctness? If we could get the data out of these legacy formats and integrate them, then we could do something useful...
  • 10. 1. Integrate Legacy Data Ontology Bootstrapping via ML We can learn the basic ontology from our existing data Feed data to a ML process that will produce our ontology Semantic ETL Using our ontology, and some additional ML, we can generate mappings from the source data to the ontology Automatically convert our legacy data into RDF
  • 11. 2. Publish Integrated Data Now that we have RDF, we'd like to publish it as Linked Data Annex Linked Data server takes any RDF database and exposes its contents as Linked Data. Customizable template framework Javascript API to access original RDF database We'd also like to maintain our data Using Empire, we can generate Java beans to represent our domain ontology. Annex provides generic CRUD templates driven from standard Java beans, using JPA as a persistence mechanism. By virtue of simply having RDF in a database, we've got publication as Linked Data, and maintenance via simple CRUD pages for free.
  • 12. 3. Browse & Search & Query We've published our RDF, but clicking around pages looking for a particular resource is not ideal Having a simple interface to browse the data would be great. Pelorus is served via Annex Facet model is generated dynamically via more ML Uses same Javascript template framework for custom display of RDF content.
  • 13. Step 4: Analyze & Plan & Act We can use OWL reasoning via Pellet to learn new things about the data; for example: which products should we sell to which customers? which products should we sell to which prospects? why do we make these recommendations? We can use Machine Learning to learn new things, too: which customers are like others? (similarity) which groups do our customers fall into? (clustering) which employees are liaisons between parts of the company? (social network analysis) which employees are most likely to retire in the next year? (classification) We can use Automated Planning to: build actionable plans/workflows based on these analyses
  • 14. Interlude: Pelorus Demos http://pelorus.clarkparsia.com/ -- American baseball http://nasa.clarkparsia.com/ -- NASA Space Program http://datagov.clarkparsia.com/ -- data.gov data catalog
  • 15. What's the point? Getting to step 4 (and beyond) is the point, that's where the real ROI lives... You want to get there sooner & cheaper But many times steps 1-3 are a hurdle If you've got limited time and/or budget to prove value in step 4, you don't want to waste it on the drudgery of getting off the ground This is the key to semantic technology's value proposition
  • 16. Questions?
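
As a rough illustration of steps 1 and 3 in the slides (turning legacy tabular data into RDF, then deriving facets for browsing), here is a minimal Python sketch. It is not the Pelorus, Semantic ETL, or Annex API, none of which are shown in the slides. The `http://example.org/` namespace, the sample CSV, and every function name are hypothetical. The slides describe mappings and facet models as learned via ML and queried via SPARQL; this sketch swaps in a fixed column-to-predicate mapping and a plain in-memory count instead.

```python
import csv
import io
from collections import Counter

# Hypothetical namespace; the slides do not name one.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical legacy data standing in for a database export.
SAMPLE = """id,name,team
1,Ann,Orioles
2,Bob,Nationals
3,Cy,Orioles
"""


def literal(value):
    """Serialize a plain string as an N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def csv_to_triples(text, id_column, class_name):
    """Map each CSV row to one typed resource with one triple per column.

    Triples are (subject IRI, predicate IRI, serialized object) tuples.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"{BASE}{class_name.lower()}/{row[id_column]}"
        triples.append((subject, RDF_TYPE, f"<{BASE}{class_name}>"))
        for column, value in row.items():
            # Skip the key column and empty cells.
            if column != id_column and value:
                triples.append((subject, BASE + column, literal(value)))
    return triples


def to_ntriples(triples):
    """Render the triples in N-Triples syntax, one statement per line."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


def facet_counts(triples, predicate):
    """Count distinct object values for one predicate (a single facet)."""
    return Counter(o for _, p, o in triples if p == predicate)


triples = csv_to_triples(SAMPLE, "id", "Player")
print(to_ntriples(triples))
print(facet_counts(triples, BASE + "team"))
```

In a real deployment the triples would be loaded into an RDF database, and the facet counts would come from a SPARQL aggregate query rather than a Python loop.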