Stardog 1.1: Easier, Smarter, Faster RDF Database

stardog.com

Stardog 1.1
An Easier, Smarter, Faster RDF Database
Michael Grove, Clark & Parsia LLC
mike@clarkparsia.com
@mikegrovesoft, @stardog_db, @candp

About C&P

• We build semantic technology tools for enterprise solutions

• Proud bootstrappers since 2005
• Offices in DC and Cambridge, MA
• Government & enterprise customers

What is Stardog?

• a pure Java RDF database
• full-service, feature rich
• focus on query performance
• standards compliant
• scalable (up first, out next)

History
• Development started summer 2010
• Stardog 0.5 alpha - 2 May 2011
• Stardog 1.0 final - 19 June 2012
• Total of 32 releases, ~500 tickets, 100s of emails on the mailing list

• Stardog 1.0.7 presently
• Stardog 1.1 real soon now...

Easier.

What is easy?
• What’s “easy” in an RDF database?
• Configuration
• Maintenance
• User Experience
• i.e., rationally predictable
• Easier for whom? Not a simple question.

Configuration
• Convention, not configuration
• “Quick Start” is the shortest page in the docs

• 4 steps to querying
• Predictable, sane defaults throughout
• Adapted to Java, Unix, Semtech cultures
• Culture is key to convention
• Very good (!) documentation

Maintenance
• Nothing is easier than doing nothing
• RDF & OWL are ideally schema flexible

• Job scheduler: search, indexes, etc.
• Data migration tools since < 1.0
• Multi-tenancy, online & offline DBs
• Just add data... Automatic data quality*

• NoSQL == Anti-jobs program for DBAs

Except that...
• Every DB has to be admin’d & maintained

• Matter of degree, not kind
• Stardog Enterprise Server Management
• audit logging
• JMX monitoring
• web console
• online backups (coming soon!)

User Experience
• Client-server & Embeddable
• Jena, Sesame, SNARL, HTTP
• SPARQL query simplifications
• ACID transactions
• Idiomatic Java & Unix interfaces
• Great CLI & shell…
• Windows has gotten much better! :>
• Rich security model

Smarter.

Okay...that’s BS.
• “Smarter” is marketing speak
• But Stardog 1.1 has a rich feature set
• Reasoning, including UDR
• Integrity Constraint Validation (ICV)
• Semantic Search
• Security
• Spring
• Linked Data Platform

Reasoning
• OWL 2 DL, QL, EL, and RL
• Query-time, no materialization
• Only pay for what you eat
• Embarrassingly parallel in part
• Pellet 3 embedded for OWL 2 DL schema reasoning only

• Very flexible re: NGs & schemas
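
As a rough illustration of query-time reasoning without materialization, the sketch below answers "instances of a class" by expanding the query over the subclass hierarchy instead of storing inferred triples. The data, the predicate names, and the `instances_of` helper are hypothetical; this is not Stardog's planner, only the general idea.

```python
# Toy triple store: only asserted triples are stored, nothing is materialized.
TRIPLES = {
    ("Manager", "subClassOf", "Employee"),
    ("Employee", "subClassOf", "Person"),
    ("alice", "type", "Manager"),
    ("bob", "type", "Employee"),
    ("carol", "type", "Person"),
}


def subclasses(cls, triples):
    """Return cls plus every class reachable through subClassOf edges."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p == "subClassOf" and o == current and s not in found:
                found.add(s)
                frontier.append(s)
    return found


def instances_of(cls, triples):
    """Answer the query at query time by rewriting it over all subclasses."""
    targets = subclasses(cls, triples)
    return {s for s, p, o in triples if p == "type" and o in targets}


print(sorted(instances_of("Employee", TRIPLES)))
```

Because nothing is precomputed, a query only pays for the part of the hierarchy it touches, which is the sense of "only pay for what you eat" here.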

User-defined Rules
• New in 1.1!
• Using SWRL syntax
• Including all SWRL builtins
• Which are also available to SPARQL
• Recently added new individual builtin
• Create new individuals in your rules
• Beware of non-termination!
• Executed at query time like everything else
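
To see why rules that create new individuals can fail to terminate, here is a minimal sketch. It assumes a hypothetical rule "every Person has a parent who is also a Person", where the parent is a freshly created individual. For simplicity it applies the rule round by round (forward chaining) with a cap on rounds; that is an illustration device, not how Stardog evaluates rules at query time.

```python
from itertools import count


def apply_rule(people, fresh_ids):
    """Hypothetical rule: Person(x) -> hasParent(x, new) and Person(new)."""
    return {f"_:parent{next(fresh_ids)}" for _ in people}


def chase(seed, max_rounds):
    """Apply the rule round by round; stop at a fixpoint or at the cap."""
    fresh_ids = count(1)
    known = set(seed)
    frontier = set(seed)
    for round_no in range(1, max_rounds + 1):
        frontier = apply_rule(frontier, fresh_ids) - known
        if not frontier:
            return known, round_no, True   # fixpoint reached
        known |= frontier
    return known, max_rounds, False        # cap hit, no fixpoint


known, rounds, terminated = chase({"alice"}, max_rounds=5)
print(len(known), rounds, terminated)
```

Each round produces a new individual that triggers the rule again, so raising `max_rounds` only grows the result; no fixpoint is ever reached.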

ICV?
• Integrity Constraint Validation
• Automated data quality
• Closed world semantics
• Transactional
• High-level & declarative
• ICs can be OWL, SWRL, or SPARQL

Example...
Only employees who are US citizens can
work on projects that receive funding from a
US government agency.

Class:
Project and
(receivesFundsFrom some USGovAgency)
SubClassOf:
inverse(worksOn) only
(Employee and nationality value "US")

More examples: http://stardog.com/docs/
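
The constraint above can be read as a closed-world check over the data. The following is a minimal sketch of that reading in plain Python, using tuples as stand-ins for RDF triples; the data and the `violations` helper are hypothetical, and this is not Stardog's ICV implementation.

```python
# Hypothetical data: plain tuples standing in for RDF triples.
TRIPLES = {
    ("nasa", "type", "USGovAgency"),
    ("apollo", "type", "Project"),
    ("apollo", "receivesFundsFrom", "nasa"),
    ("alice", "type", "Employee"),
    ("alice", "nationality", "US"),
    ("alice", "worksOn", "apollo"),
    ("bob", "type", "Employee"),
    ("bob", "worksOn", "apollo"),  # no nationality recorded for bob
}


def violations(triples):
    """Return (person, project) pairs that break the constraint."""
    agencies = {s for s, p, o in triples if p == "type" and o == "USGovAgency"}
    projects = {s for s, p, o in triples if p == "type" and o == "Project"}
    funded = {s for s, p, o in triples
              if p == "receivesFundsFrom" and s in projects and o in agencies}
    employees = {s for s, p, o in triples if p == "type" and o == "Employee"}
    us_nationals = {s for s, p, o in triples
                    if p == "nationality" and o == "US"}
    allowed = employees & us_nationals
    return {(s, o) for s, p, o in triples
            if p == "worksOn" and o in funded and s not in allowed}


print(sorted(violations(TRIPLES)))
```

Under closed-world semantics the missing nationality for `bob` counts as a violation; under ordinary open-world OWL semantics the same axiom would not flag it.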

Semantic Search
• Uses Waldo, our deep adaptation of Lucene

• Text index from RDF literals
• Search for resources or literals
• Integrated with SPARQL query evaluation

• Auto-managed search indexes
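
A text index built from RDF literals can be pictured as an inverted index from tokens to the resources whose literals contain them. The sketch below is a toy version of that idea in plain Python; it is not Lucene or Waldo, and the data, tokenizer, and function names are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical triples whose objects are all string literals.
TRIPLES = [
    ("ex:stardog", "rdfs:label", "Stardog RDF database"),
    ("ex:pellet", "rdfs:label", "Pellet OWL reasoner"),
    ("ex:stardog", "rdfs:comment", "Fast SPARQL query engine"),
]


def build_index(triples):
    """Map each lowercased token to the subjects whose literals contain it."""
    index = defaultdict(set)
    for subject, _predicate, literal in triples:
        for token in re.findall(r"\w+", literal.lower()):
            index[token].add(subject)
    return index


def search(index, text):
    """Return subjects whose literals together contain every token in text."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return set()
    return set.intersection(*(index.get(token, set()) for token in tokens))


index = build_index(TRIPLES)
print(sorted(search(index, "owl reasoner")))
```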

Security

• Based on standard RBAC model
• Applies at database-level
• Will extend to Named Graphs in 1.x
• Easy CLI admin tools (& Java API)
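
For readers unfamiliar with RBAC, here is a minimal sketch of a database-level permission check: users hold roles, and roles grant actions on databases. The users, roles, database name, and `is_allowed` function are all hypothetical and do not reflect Stardog's actual security API.

```python
# Hypothetical users, roles, and permissions; names are illustrative only.
ROLE_PERMISSIONS = {
    "reader": {("read", "catalog")},
    "writer": {("read", "catalog"), ("write", "catalog")},
}
USER_ROLES = {
    "alice": {"writer"},
    "bob": {"reader"},
}


def is_allowed(user, action, database):
    """A user may act on a database if any of their roles grants it."""
    return any(
        (action, database) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


print(is_allowed("alice", "write", "catalog"))
print(is_allowed("bob", "write", "catalog"))
```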

Spring
• Love it or not, Spring isn’t going away
• Support Batch, Data Import, etc.
• Open Source: http://github.com/clark-parsia/spring-stardog

• Developed by an early adopter who needed it; supported/maintained by C&P

Linked Data
• Stardog fills a hole in our Linked Data Platform
• HTML5, pure JS, client side web framework (based on backbone.js)
• Linked Data publishing suite
• Stardog Linked Data Catalog... Enterprise Linked Data management app

Faster.

Finally...
• Now we can talk about something that’s objective, context-free, and measurable

• Yes!
• But no…#include <std_disclaim.h>
• Your data & your queries are the only things that really matter

That said...
• Two de facto benchmarks for SPARQL:
• BSBM, OLTP-style, measured in query mixes per hour (QMpH; queries per hour = QMpH · 25)
• SP2B, OLAP-style (torture test), a set of queries run within a timeout, T, at a data size D

SP2B
• Stardog completes SP2B at 5M, 10M, and 25M (except q5a)
• No other RDF database completes > 5M. (As of the most recent report. Things change.)
• Considerable performance differential

• Pushing this out to 100M+ in 1.x

BSBM
• A throughput test, primarily. Not necessarily simple queries
• On a modest machine, 255 clients, 10M triples, we sustain 7M queries per hour (277k QMpH)
• At 100M, 255 clients, we sustain 3M queries per hour (125k QMpH)
• Among the top 2 or 3 RDF DBs for BSBM performance

• We will tackle BSBM BI next...
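
The relationship between the QMpH figures and the queries-per-hour figures is simple arithmetic, assuming 25 queries per mix as the earlier benchmarks slide indicates. A small sketch that checks the rounded numbers quoted above:

```python
QUERIES_PER_MIX = 25  # assumption taken from the "QMpH · 25" note


def queries_per_hour(qmph):
    """Convert query mixes per hour into individual queries per hour."""
    return qmph * QUERIES_PER_MIX


for triples, qmph in [("10M", 277_000), ("100M", 125_000)]:
    print(f"{triples}: {queries_per_hour(qmph):,} queries/hour")
```

So 277k QMpH is about 6.9M queries per hour (quoted as 7M) and 125k QMpH is about 3.1M (quoted as 3M).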

Data Loading
• Two indexing modes
  • Triples only indexing
    • Faster loading, slower NG query
    • Up to 250,000 triples per second
  • Quads indexing
    • Slower loading, faster NG query
    • Up to 150,000 triples per second
• More improvements coming in the future
  • Customized RDF parser
  • Will look at user-defined index subsets
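
To put the loading rates in perspective, the sketch below computes the best-case load time for a given dataset size in each indexing mode. Since the quoted rates are "up to" figures, the results are lower bounds, not measurements; the dataset sizes are illustrative assumptions.

```python
# Peak rates quoted on the slide, in triples per second.
RATES = {"triples only": 250_000, "quads": 150_000}


def min_load_seconds(triple_count, rate):
    """Best-case load time in seconds at a sustained peak rate."""
    return triple_count / rate


for mode, rate in RATES.items():
    for size in (10_000_000, 100_000_000):
        seconds = min_load_seconds(size, rate)
        print(f"{mode}, {size:,} triples: at least {seconds:.0f} s")
```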

What’s new in 1.1

• Aforementioned user defined rules
• But most notably, SPARQL 1.1
• Our most requested feature in a survey

• Oh, we also made it faster

SPARQL 1.1
• Latest revision of the SPARQL query language
• Put off implementing until spec finalized
• It’s still in flux, but we decided to go for it
• Adds useful new features to SPARQL
• Aggregates, grouping, sub-query, negation

• Oh, and the entailment regimes

SPARQL 1.1
• Rewrite of query planner & engine for 1.0.5
• Changes needed to support SPARQL 1.1
• Tested by users for the past 3 releases
• With great power comes great responsibility...
• New features are not without cost
• Query planning & optimization more crucial than ever

• Majority of development time

Roadmap
1. Transitivity & equality
2. GeoSPARQL
3. Web Console
4. Statement identifiers
5. Stored procedures & database triggers
6. “Stardocs”: doc/blob storage & NLP analytics
7. Graph Traversals, Algorithms & query langs
8. Statistical inference & machine learning
9. Stardog 2.0: Distributed Cluster Super Cloud Thingie!

Summary
Easier.
Smarter.
Faster.
Pick all three!

Thanks!

Licensing

Feature Rich
• Support for RDFS, OWL2 profiles (EL, RL, QL) & OWL2 DL via schema only queries

• Semantic Search

• ICV

• Transactions


• Support for major APIs

• Jena & Sesame, and our own SNARL

• SPARQL HTTP protocol, Graph Store protocol

• Also includes a CLI & Shell environment
