A Semantic Web Application
2010 Semantic Technology
Director of Software Development
Clark & Parsia, LLC.
http://clarkparsia.com -- http://www.twitter.com/candp
Who are we?
Clark & Parsia is a Semantic software startup founded in
Offices in DC and Cambridge, MA
Software products for end-user and OEM use
Provides software development and integration services
Specializing in Semantic Web, web services, and
advanced AI technologies for federal and enterprise
Where do we start?
No, literally, where do we start?
Enterprise increasingly wants to utilize semweb tech to
Lack of in-house SemWeb expertise
So what's the first step in these cases?
It's hard to get a project off the ground without
In many cases, you just want to get a prototype
running ASAP to evaluate the approach
An integrated platform to rapidly prototype and assess
semweb tech, which also scales to production, is crucial
The Pelorus Platform
Pelorus Platform aims to ease this situation
It's a standards-based application development stack
geared toward enterprise information integration via RDF,
SPARQL and OWL.
Provides a collection of software designed to take you
from ontology (or data) to application
Based on years of customer engagements learning
what parts are the same for everyone, and what parts
are customized by everyone--and facilitating both.
Few or no human-in-the-loop steps are required to get
a barebones application running
From there, it's just UI customization
RESTful server-side component powered by Pellet
Machine Learning ... and Planning too!
Toolkit for transforming existing data into RDF
Support for most common formats: XML, CSV,
Excel, relational, etc.
Conversion driven from domain ontology
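As a rough illustration of ontology-driven conversion, here is a minimal sketch that turns CSV rows into N-Triples using a small mapping table. The namespace, the `MAPPING` structure, and the function names are all hypothetical stand-ins for illustration; they are not the toolkit's actual API.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace and mapping; in practice the mapping would be
# derived from the domain ontology rather than written by hand.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MAPPING = {
    "class": BASE + "ontology#Customer",
    "id_column": "id",
    "properties": {
        "name": BASE + "ontology#name",
        "city": BASE + "ontology#city",
    },
}


def escape_literal(value):
    """Escape the characters that N-Triples string literals require."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def csv_to_ntriples(csv_text, mapping):
    """Convert CSV text to a list of N-Triples lines, one subject per row."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        local_id = quote(row[mapping["id_column"]], safe="")
        subject = f"<{BASE}resource/{local_id}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{mapping['class']}> .")
        for column, predicate in mapping["properties"].items():
            value = row.get(column) or ""
            if value:  # skip empty cells instead of emitting empty literals
                lines.append(f'{subject} <{predicate}> "{escape_literal(value)}" .')
    return lines
```

Calling `csv_to_ntriples(text, MAPPING)` on a file with `id`, `name`, and `city` columns yields one type triple per row plus one triple per non-empty mapped cell.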
Annex - A linked data server
Publishes your RDF as linked data
Works in-place against any RDF database
No files to parse or directory structure to fill out
CRUD workflow support for maintaining your data
Machine Learning Suite
Bootstrap ontologies from existing data
Provides capabilities for learning ETL transformations
from existing data, decreasing by-hand mapping
Automatically create Pelorus models for browsing
Analysis support, clustering, classification, and more.
Faceted browsing via SPARQL for RDF data.
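To make faceted browsing via SPARQL concrete, the sketch below builds a query that counts the values of one facet among the items matching the facets already selected. The function name and the assumption that every selected value is an IRI are illustrative only; this is not how Pelorus itself generates queries.

```python
def facet_query(selected, facet_property, limit=20):
    """Build a SPARQL 1.1 query that counts values of `facet_property`.

    `selected` maps property IRIs to value IRIs already chosen by the user
    (assumed here to be IRIs, not literals).
    """
    patterns = [
        f"?item <{prop}> <{value}> ." for prop, value in sorted(selected.items())
    ]
    patterns.append(f"?item <{facet_property}> ?value .")
    body = "\n  ".join(patterns)
    return (
        "SELECT ?value (COUNT(DISTINCT ?item) AS ?count)\n"
        "WHERE {\n  " + body + "\n}\n"
        "GROUP BY ?value\n"
        "ORDER BY DESC(?count)\n"
        f"LIMIT {int(limit)}"
    )
```

Each click on a facet value adds one entry to `selected`, so the same function can be used to recompute the counts for every remaining facet.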
So What Now?
The intent of the platform is to take either your existing data or an
existing ontology, as input and provide as output a
working skeleton application.
This is the Staples Easy button for the Semantic Web
Some minimal configuration and UI customization may be
The goal is to Just Add Data and get back a working, full-
service, modern app that's optimized for data integration
Legacy data in a series of databases, XML files, etc.
This is a maintenance nightmare
How do you search this data, analyze it, or verify it's
If we could get the data out of these legacy formats and
integrate them, then we could do something useful...
1. Integrate Legacy Data
Ontology Bootstrapping via ML
We can learn the basic ontology from our existing data
Feed data to a ML process that will produce our
Using our ontology, and some additional ML, we can
generate mappings from the source data to the
Automatically convert our legacy data into RDF
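The bootstrapping step above can be sketched in a very reduced form: inspect the existing records and propose a class with one datatype property per column. Note that this uses simple type-inference heuristics as a stand-in for the ML process the slides describe, and the function names and output structure are hypothetical.

```python
def infer_datatype(values):
    """Return the narrowest XSD datatype that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "xsd:string"
    for xsd_type, parser in (("xsd:integer", int), ("xsd:double", float)):
        try:
            for value in non_empty:
                parser(value)
            return xsd_type
        except ValueError:
            continue  # at least one value did not parse; try the next type
    return "xsd:string"


def bootstrap_ontology(class_name, rows):
    """Propose a class and datatype properties from a list of dict records."""
    columns = list(rows[0].keys()) if rows else []
    return {
        "class": class_name,
        "properties": {
            column: infer_datatype([row.get(column, "") for row in rows])
            for column in columns
        },
    }
```

The resulting dictionary is the kind of skeleton that a mapping step could then use to convert each source column into RDF.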
2. Publish Integrated Data
Now that we have RDF, we'd like to publish it as Linked
Annex Linked Data server takes any RDF database
and exposes its contents as Linked Data.
Customizable template framework
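The core idea of a linked data server, looking up a resource URI in the RDF store and returning its description in the format the client asked for, can be sketched as follows. The in-memory `STORE`, the `dereference` function, and the naive `Accept` handling (no q-values) are assumptions made for illustration and do not reflect Annex's actual implementation.

```python
from html import escape

# Hypothetical in-memory store; terms are kept as N-Triples term strings.
STORE = [
    ("<http://example.org/resource/1>", "<http://example.org/ontology#name>", '"Acme"'),
    ("<http://example.org/resource/1>", "<http://example.org/ontology#city>", '"Boston"'),
]


def dereference(store, uri, accept="text/html"):
    """Return (status, content_type, body) describing the resource at `uri`."""
    subject = f"<{uri}>"
    triples = [t for t in store if t[0] == subject]
    if not triples:
        return 404, "text/plain", "Not found"
    if "application/n-triples" in accept:
        # Machine clients get the raw triples.
        body = "\n".join(f"{s} {p} {o} ." for s, p, o in triples)
        return 200, "application/n-triples", body
    # Browsers get a simple HTML table; a template framework would go here.
    rows = "".join(
        f"<tr><td>{escape(p)}</td><td>{escape(o)}</td></tr>" for _, p, o in triples
    )
    return 200, "text/html", f"<h1>{escape(uri)}</h1><table>{rows}</table>"
```

Because the lookup runs directly against the store, there are no exported files to keep in sync with the data.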
We'd also like to maintain our data
Using Empire, we can generate Java beans to
represent our domain ontology.
Annex provides generic CRUD templates driven from
standard Java beans, using JPA as a persistence
By virtue of simply having RDF in a database, we've got
publication as Linked Data, and maintenance via simple
CRUD pages for free.
3. Browse & Search & Query
We've published our RDF, but clicking around pages
looking for a particular resource is not ideal
Having a simple interface to browse the data would be
Pelorus is served via Annex
Facet model is generated dynamically via more ML
display of RDF content.
4. Analyze & Plan & Act
We can use OWL reasoning via Pellet to learn new things
about the data; for example:
which products should we sell to which customers?
which products should we sell to which prospects?
why do we make these recommendations?
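A toy version of the recommendation questions above: follow the subclass hierarchy from a customer's type, match it against the segment each product targets, and keep the chain as the explanation. This is a plain-Python stand-in, not Pellet's API, and the class names, data, and function names are invented for illustration. It assumes the subclass hierarchy has no cycles.

```python
# Hypothetical ontology fragments (assumed acyclic).
SUBCLASS_OF = {"SmallBusiness": "Business", "Business": "Customer"}
CUSTOMER_TYPE = {"acme": "SmallBusiness"}
PRODUCT_TARGETS = {"payroll-suite": "Business", "home-edition": "Household"}


def superclasses(cls):
    """Return cls followed by all of its ancestors, nearest first."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def recommend(customer):
    """Return (product, explanation) pairs inferred for `customer`."""
    chain = superclasses(CUSTOMER_TYPE[customer])
    results = []
    for product, target in sorted(PRODUCT_TARGETS.items()):
        if target in chain:
            path = chain[: chain.index(target) + 1]
            why = (
                f"{customer} is a "
                + ", which is a ".join(path)
                + f"; {product} targets {target}"
            )
            results.append((product, why))
    return results
```

A real reasoner covers far more of OWL than subclass chains, but the shape is the same: each inferred answer comes with the axioms that justify it.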
We can use Machine Learning to learn new things, too:
which customers are like others? (similarity)
which groups do our customers fall into? (clustering)
which employees are liaisons between parts of the
company (social network analysis)
which employees are most likely to retire in the next
We can use Automated Planning to:
build actionable plans/workflows based on these
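The customer-similarity question above can be illustrated with a minimal sketch that ranks customers by the Jaccard overlap of their purchase sets. The choice of Jaccard similarity, the data layout, and the function names are assumptions for illustration; the slides do not say which similarity measure the platform uses.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # define similarity of two empty sets as zero
    return len(a & b) / len(a | b)


def most_similar(purchases, customer):
    """Rank all other customers by similarity to `customer`, best first."""
    scores = [
        (jaccard(purchases[customer], items), other)
        for other, items in purchases.items()
        if other != customer
    ]
    # Sort by descending score, then by name for a stable order.
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))
```

The same pairwise scores could feed a clustering step to answer the grouping question.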
Interlude: Pelorus Demos
http://pelorus.clarkparsia.com/ -- American baseball
http://nasa.clarkparsia.com/ -- NASA Space Program
http://datagov.clarkparsia.com/ -- data.gov data catalog
What's the point?
Getting to step 4 (and beyond) is the point; that's where
the real ROI lives...
You want to get there sooner & cheaper
But many times steps 1-3 are a hurdle
If you've got limited time and/or budget to prove
value in step 4, you don't want to waste it on the
drudgery of getting off the ground
This is the key to semantic technology's value