Roberto Franchini
@robfrankie
The 2nd generation of
(multi model) NoSQL
whoami(1)
More (and more) than 15 years of experience
Software craftsman
Pragmatic problem solver
Remote worker
Member of OrientDB team, tech lead for full-text & spatial indexes, maintainer of
JDBC driver and Docker images
JUG-Torino co-lead
Member of Toastmasters
Agenda
Managing relationships with a graph database
Data modelling and schema
Query
Live queries (reactive)
Full text and spatial search
Deployment scenario
Twitter property graph and demo
Meet OrientDB
The First Ever Multi-Model
Database Combining Flexibility
of Documents with
Connectedness of Graphs
What is OrientDB
Multi-Model Database (Document, Graph, Spatial, FullText)
Tables -> Classes
Extended SQL
JOIN -> Physical Pointers
Schema, No-Schema, Hybrid
HTTP + Binary protocols
Stand-alone, embedded, distributed Multi-Master
Graph databases
[Figure: unconnected records: Order #134 (Order), John (Provider), Bruno (Provider), Frank (Customer), Commodore Amiga 1200 (Product), Monitor 40” (Product), Mouse (Product)]
Just Data
Data by itself has little value; it is the relationships between data that give it incredible value
[Figure: Order #134 (Order), John (Provider), Bruno (Provider), Frank (Customer), Commodore Amiga 1200 (Product), Monitor 40” (Product) and Mouse (Product), now connected by edges labelled Sells, Has and Makes]
Data and relationships
Every developer knows
the Relational Model,
but who knows the
Graph one?
Back to school:
Graph Theory crash course
Vertex Frank (#13:55): out = #22:11
Edge Lives (#22:11, Since: 1993): out = #13:55, in = #15:99
Vertex Turin (#15:99): in = #22:11
Connections use persistent pointers
Each element in the graph has its own immutable Record ID
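The pointer layout on this slide can be sketched in a few lines of Java. This is a hypothetical in-memory model, not the OrientDB API: the class and method names (`RecordIdSketch`, `addEdge`, `outNames`) are invented for illustration, and Record IDs are kept as plain strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model for illustration; this is not the OrientDB API.
class RecordIdSketch {
    // A vertex keeps the Record IDs of the edges that leave it (out) and enter it (in).
    static final class Vertex {
        final String rid;
        final String name;
        final List<String> out = new ArrayList<>();
        final List<String> in = new ArrayList<>();

        Vertex(String rid, String name) {
            this.rid = rid;
            this.name = name;
        }
    }

    // An edge keeps the Record IDs of its source (out) and target (in) vertices.
    static final class Edge {
        final String rid;
        final String label;
        final String out;
        final String in;

        Edge(String rid, String label, String out, String in) {
            this.rid = rid;
            this.label = label;
            this.out = out;
            this.in = in;
        }
    }

    final Map<String, Vertex> vertices = new HashMap<>();
    final Map<String, Edge> edges = new HashMap<>();

    void addVertex(String rid, String name) {
        vertices.put(rid, new Vertex(rid, name));
    }

    // Creating an edge updates the pointers on both of its endpoints.
    void addEdge(String rid, String label, String fromRid, String toRid) {
        edges.put(rid, new Edge(rid, label, fromRid, toRid));
        vertices.get(fromRid).out.add(rid);
        vertices.get(toRid).in.add(rid);
    }

    // Follow every outgoing edge of a vertex and collect the names of the vertices reached.
    List<String> outNames(String startRid) {
        List<String> names = new ArrayList<>();
        for (String edgeRid : vertices.get(startRid).out) {
            names.add(vertices.get(edges.get(edgeRid).in).name);
        }
        return names;
    }

    // The example from the slide: Frank (#13:55) -Lives-> Turin (#15:99) via edge #22:11.
    static RecordIdSketch frankLivesInTurin() {
        RecordIdSketch graph = new RecordIdSketch();
        graph.addVertex("#13:55", "Frank");
        graph.addVertex("#15:99", "Turin");
        graph.addEdge("#22:11", "Lives", "#13:55", "#15:99");
        return graph;
    }

    public static void main(String[] args) {
        RecordIdSketch graph = frankLivesInTurin();
        System.out.println("Frank -> " + graph.outNames("#13:55"));
    }
}
```

The maps here only stand in for storage; in the sketch, as on the slide, each side of a connection stores the Record ID of the other side.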
Vertices and Edges are Documents
{ "@rid": "12:382",
  "@class": "Customer",
  "name": "Frank",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial" }
}
[Figure: vertex Frank connected to vertex Order by the edge Makes]
General purpose solution:
• JSON
• Schema-less
• Schema-full
• Schema-hybrid
• Nested documents
• Rich indexing and querying
• Developer friendly
No joins
The Index-Free
Adjacency
Solution
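[Figure: clusters labelled Customers, Special Customers, Orders and Stocks; records Smith, Doe, Green (#12:468), Order 2332, Order 8834 and White Soap, each with its own Record ID (others shown: #12:243, #12:974, #13:10, #13:231, #13:432, #13:765, #15:4334, #15:19345, #15:49602) and linked by persistent pointers]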
Persistent Pointers
SELECT expand( out() )
FROM #12:468
SELECT expand( out() )
FROM Customer
WHERE name = 'Green'
This uses an index to retrieve the starting vertex (#12:468)
Traversing the Graph
[Figure: customer Green (#12:468), Order 2332, Order 8834 and White Soap, with Record IDs #15:19345, #15:49602 and #15:4334]
SELECT expand( out().out() )
FROM Customer
WHERE name = 'Green'
SELECT expand( out().out() )
FROM #12:468
[Figure: Record IDs #15:19345, #15:49602, #15:4334]
Traversing the Graph
SELECT expand( in().in() )
FROM #15:49602
SELECT expand( in().in() )
FROM Product
WHERE name = 'White Soap'
Traversing the Graph
Index-free adjacency is O(1) per hop: traversal time is constant, no matter the database size
vs
An index-based approach is O(log N) per lookup: traversal speed depends on the database size, so the bigger the database, the slower the traversal
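The difference in cost can be made concrete with a small Java sketch. It is an illustration under simplifying assumptions, not a benchmark of OrientDB: the "index" is a binary search over a sorted array, and all names (`AdjacencyCostSketch`, `stepsViaPointer`, `stepsViaIndex`) are hypothetical.

```java
// Hypothetical cost model for illustration; it does not measure OrientDB itself.
class AdjacencyCostSketch {
    static final class Node {
        final int id;
        Node neighbour;

        Node(int id) {
            this.id = id;
        }
    }

    // Index-free adjacency: the neighbour is a stored reference, so one hop is one step.
    static int stepsViaPointer(Node from) {
        Node reached = from.neighbour;
        return reached == null ? 0 : 1;
    }

    // Index lookup: binary search over sorted keys; returns the number of probes made.
    static int stepsViaIndex(int[] sortedKeys, int target) {
        int lo = 0;
        int hi = sortedKeys.length - 1;
        int probes = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            probes++;
            if (sortedKeys[mid] == target) {
                return probes;
            }
            if (sortedKeys[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return probes;
    }

    // Builds the sorted key set 0..size-1.
    static int[] keys(int size) {
        int[] result = new int[size];
        for (int i = 0; i < size; i++) {
            result[i] = i;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int size : new int[] {1_000, 1_000_000}) {
            int probes = stepsViaIndex(keys(size), size - 1);
            System.out.println(size + " records: index probes = " + probes + ", pointer steps = 1");
        }
    }
}
```

The number of probes grows with the logarithm of the key count, while following a stored reference costs the same single step at any size.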
Data modelling
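[Figure: class hierarchy. Vertex classes (under V): Person, with subclasses Customer and Provider, plus Product and Order. Edge classes (under E): Purchase, ProvidedBy, MadeOf]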
Polymorphic domain schema
All vertex classes extend the “V” class
All edge classes extend the “E” class
Polymorphic Queries
SELECT * FROM Customer
SELECT * FROM Provider
SELECT * FROM Person
[Figure: vertices Brown (Provider), Smith (Provider) and Green (Customer); each query returns the vertices of the named class and of its subclasses]
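Polymorphic queries behave much like an `instanceof` filter over a class hierarchy. The sketch below models that idea in plain Java; the classes and the `selectFrom` helper are hypothetical stand-ins, not OrientDB types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of polymorphic selection; not the OrientDB API.
class PolymorphicQuerySketch {
    static class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    static class Customer extends Person {
        Customer(String name) {
            super(name);
        }
    }

    static class Provider extends Person {
        Provider(String name) {
            super(name);
        }
    }

    // "SELECT * FROM <type>": keep every record whose class is the type or one of its subclasses.
    static List<String> selectFrom(Class<? extends Person> type, List<Person> records) {
        List<String> names = new ArrayList<>();
        for (Person record : records) {
            if (type.isInstance(record)) {
                names.add(record.name);
            }
        }
        return names;
    }

    // The three vertices named on the slide.
    static List<Person> sample() {
        return Arrays.asList(new Provider("Brown"), new Customer("Green"), new Provider("Smith"));
    }

    public static void main(String[] args) {
        List<Person> records = sample();
        System.out.println("Customer: " + selectFrom(Customer.class, records));
        System.out.println("Provider: " + selectFrom(Provider.class, records));
        System.out.println("Person:   " + selectFrom(Person.class, records));
    }
}
```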
Schema
Property types
STRING, DATE, DATETIME, BYTE, BOOLEAN, SHORT, BINARY
Constraints
MANDATORY, NOTNULL, MIN, MAX, READONLY, REGEX
Indexes on a single property or on multiple properties
UNIQUE, NOTUNIQUE, FULLTEXT (Lucene), SPATIAL (Lucene)
Schema
Create a class (table) with constraints
CREATE CLASS User EXTENDS V
CREATE PROPERTY User.userId LONG
CREATE PROPERTY User.description STRING
CREATE INDEX User.userId ON User(userId) UNIQUE
CREATE INDEX User.description
ON User(description) FULLTEXT ENGINE LUCENE
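To show how constraints such as MANDATORY, NOTNULL and REGEX interact with a schema-hybrid class, here is a minimal Java sketch. It is an assumption-laden model, not how OrientDB validates records internally: `Property` and `validate` are hypothetical names, and fields that the schema does not declare are simply accepted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical validator for illustration; not the OrientDB implementation.
class SchemaConstraintSketch {
    static final class Property {
        final String name;
        final boolean mandatory;
        final boolean notNull;
        final Pattern regex; // null when no REGEX constraint is set

        Property(String name, boolean mandatory, boolean notNull, String regex) {
            this.name = name;
            this.mandatory = mandatory;
            this.notNull = notNull;
            this.regex = regex == null ? null : Pattern.compile(regex);
        }
    }

    // Checks declared properties only; undeclared fields pass through (schema-hybrid).
    static List<String> validate(Map<String, Object> document, List<Property> schema) {
        List<String> errors = new ArrayList<>();
        for (Property property : schema) {
            if (!document.containsKey(property.name)) {
                if (property.mandatory) {
                    errors.add(property.name + " is mandatory");
                }
                continue;
            }
            Object value = document.get(property.name);
            if (value == null) {
                if (property.notNull) {
                    errors.add(property.name + " must not be null");
                }
                continue;
            }
            if (property.regex != null && !property.regex.matcher(value.toString()).matches()) {
                errors.add(property.name + " does not match " + property.regex.pattern());
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Property> schema = Arrays.asList(
            new Property("userId", true, true, null),
            new Property("description", false, false, "[a-z ]+"));
        Map<String, Object> user = new HashMap<>();
        user.put("userId", 42L);
        user.put("nickname", "rob"); // not declared in the schema, still accepted
        System.out.println("errors = " + validate(user, schema));
    }
}
```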
Data retrieval
Query
OrientDB supports SQL as a query language with some differences
SELECT city, sum(salary) AS salary FROM Employee
GROUP BY city
HAVING salary > 1000
Query
Get all the vertices reached through outgoing edges with label (class) "Eats" or
"Favorited" from all the Restaurant vertices in Rome
SELECT out('Eats', 'Favorited')
FROM Restaurant
WHERE city = 'Rome'
Traverse
In a social network-like domain, a user profile is connected to friends through links.
TRAVERSE out("Friend")
FROM #10:1234 WHILE $depth <= 3
STRATEGY BREADTH_FIRST
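A breadth-first traversal with a depth limit can be sketched in plain Java as follows. This is an in-memory illustration of the idea behind the statement above, not OrientDB code: the adjacency map and the `traverse` method are hypothetical, and the starting record is returned at depth 0.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory traversal; not the OrientDB API.
class TraverseSketch {
    // Visits records breadth-first, following outgoing "Friend" links while depth <= maxDepth.
    static List<String> traverse(Map<String, List<String>> friends, String start, int maxDepth) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();

        queue.add(start);
        seen.add(start);
        depth.put(start, 0);

        while (!queue.isEmpty()) {
            String current = queue.poll();
            visited.add(current);
            int currentDepth = depth.get(current);
            if (currentDepth == maxDepth) {
                continue; // do not expand past the depth limit
            }
            for (String next : friends.getOrDefault(current, Collections.emptyList())) {
                if (seen.add(next)) { // each record is visited once
                    depth.put(next, currentDepth + 1);
                    queue.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = new HashMap<>();
        friends.put("a", Arrays.asList("b", "c"));
        friends.put("b", Arrays.asList("d"));
        friends.put("d", Arrays.asList("e"));
        System.out.println(traverse(friends, "a", 3));
    }
}
```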
Pattern Matching
MATCH
{class: Person, WHERE: (name = 'Luigi'), AS: me}
-Friend->{}-Friend->{AS: foaf}, {AS: me}-Friend->{AS: foaf}
RETURN me.name AS myName, foaf.name AS foafName
[Figure: Me -Friend-> F -Friend-> FoaF, and Me -Friend-> FoaF]
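The pattern asks for people who are both a direct friend and a friend of a friend. A plain Java sketch of that logic, over a hypothetical in-memory friendship map rather than OrientDB, could look like this (the names other than Luigi are invented sample data):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory version of the friend-of-a-friend pattern; not the OrientDB API.
class MatchSketch {
    // Finds every foaf such that me -Friend-> f -Friend-> foaf and also me -Friend-> foaf.
    static Set<String> friendsOfFriendsWhoAreFriends(Map<String, Set<String>> friend, String me) {
        Set<String> direct = friend.getOrDefault(me, Collections.emptySet());
        Set<String> result = new TreeSet<>();
        for (String f : direct) {
            for (String foaf : friend.getOrDefault(f, Collections.emptySet())) {
                if (direct.contains(foaf)) {
                    result.add(foaf);
                }
            }
        }
        return result;
    }

    // Invented sample data: Luigi knows Anna and Marco, Anna knows Marco and Sara.
    static Map<String, Set<String>> sample() {
        Map<String, Set<String>> friend = new HashMap<>();
        friend.put("Luigi", new HashSet<>(Arrays.asList("Anna", "Marco")));
        friend.put("Anna", new HashSet<>(Arrays.asList("Marco", "Sara")));
        return friend;
    }

    public static void main(String[] args) {
        System.out.println(friendsOfFriendsWhoAreFriends(sample(), "Luigi"));
    }
}
```

Sara is a friend of a friend but not a direct friend of Luigi, so only Marco satisfies both branches of the pattern.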
Reactive Model
What happens when you’re looking for updates
Reactive Model
By using the reactive model, you don’t poll the database, but
rather you subscribe to changes and OrientDB will push
updates:
LIVE SELECT FROM Order
WHERE status = 'approved'
Supported natively by the Java and JS APIs
Reactive Model
What happens when you register OrientDB for updates
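The subscribe-and-push idea can be illustrated with a small in-memory Java sketch. It deliberately avoids the real OrientDB live query API: `LiveQuerySketch`, `liveSelect` and `save` are hypothetical names, and the "database" is just a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical in-memory store that pushes matching changes to subscribers; not the OrientDB API.
class LiveQuerySketch {
    static final class Order {
        final int id;
        final String status;

        Order(int id, String status) {
            this.id = id;
            this.status = status;
        }
    }

    private static final class Subscription {
        final Predicate<Order> where;
        final Consumer<Order> listener;

        Subscription(Predicate<Order> where, Consumer<Order> listener) {
            this.where = where;
            this.listener = listener;
        }
    }

    private final List<Order> orders = new ArrayList<>();
    private final List<Subscription> subscriptions = new ArrayList<>();

    // Registers interest once; no polling is needed afterwards.
    void liveSelect(Predicate<Order> where, Consumer<Order> listener) {
        subscriptions.add(new Subscription(where, listener));
    }

    // Every write is checked against the registered filters and pushed to matching listeners.
    void save(Order order) {
        orders.add(order);
        for (Subscription subscription : subscriptions) {
            if (subscription.where.test(order)) {
                subscription.listener.accept(order);
            }
        }
    }

    public static void main(String[] args) {
        LiveQuerySketch store = new LiveQuerySketch();
        store.liveSelect(o -> o.status.equals("approved"),
                         o -> System.out.println("pushed order " + o.id));
        store.save(new Order(1, "pending"));
        store.save(new Order(2, "approved"));
    }
}
```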
Searching
Search what: full text support
Based on Lucene, which provides Java-based indexing and search technology, as well as
spellchecking, hit highlighting and advanced analysis/tokenization capabilities.
Configurable: analyzers, stopwords, query parser behaviour, index writer tuning
Features: term and phrase queries, numeric and date range queries
Multi-field search
CREATE CLASS City EXTENDS V
CREATE PROPERTY City.name STRING
CREATE PROPERTY City.description STRING
CREATE PROPERTY City.size INTEGER
CREATE INDEX City.name_description_size
ON City(name, description,size) FULLTEXT ENGINE LUCENE METADATA {...}
SELECT FROM City
WHERE
SEARCH_CLASS("name:cas* AND description:piemonte AND size:[20000 TO 40000]") = true
Search where: spatial module
External module, just add the jar and restart the server
Lucene, Spatial4J, JTS
Geometry data (OrientDB additional types)
Point, Line, Polygon, Multiline, Multipolygon
Functions
Follow the Open Geospatial Consortium (OGC) standard for extending SQL to
support spatial data.
Implement a subset of SQL-MM functions with the ST prefix (Spatial Type)
Spatial search
Add location
CREATE CLASS Restaurant
CREATE PROPERTY Restaurant.name STRING
CREATE PROPERTY Restaurant.location EMBEDDED OPoint
Insert
INSERT INTO Restaurant SET name = 'Dar Poeta',
location = {"@class": "OPoint","coordinates" : [12.4684635,41.8914114]}
Spatial search
WKT support
INSERT INTO Restaurant SET name = 'Dar Poeta',
location = ST_GeomFromText("POINT (12.4684635 41.8914114)")
Create index
CREATE INDEX Restaurant.location ON Restaurant(location) SPATIAL ENGINE LUCENE
Search index
SELECT FROM Restaurant
WHERE ST_WITHIN(location, ST_Buffer(ST_GeomFromText( 'POINT(50 50)' ), 30)) = true
Function without index
SELECT ST_Intersects(ST_GeomFromText('POINT(0 0)'),
ST_GeomFromText('LINESTRING ( 2 0, 0 2 )'));
Result → (false)
SELECT ST_Disjoint(ST_GeomFromText('POINT(0 0)'),
ST_GeomFromText('LINESTRING ( 2 0, 0 2 )'));
Result → (true)
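The two results above can be checked by hand with a small Java sketch that uses only the JDK. It is not the OrientDB spatial module: `SpatialPredicateSketch` and its methods are hypothetical, and a point is treated as intersecting a segment when its distance to the segment is below a small tolerance.

```java
import java.awt.geom.Line2D;

// Hypothetical point-versus-segment predicates using JDK geometry; not the OrientDB spatial module.
class SpatialPredicateSketch {
    // Tolerance for floating-point comparison (an assumption of this sketch).
    static final double EPSILON = 1e-9;

    // True when point (px, py) lies on the segment (x1, y1)-(x2, y2).
    static boolean intersects(double px, double py, double x1, double y1, double x2, double y2) {
        return Line2D.ptSegDist(x1, y1, x2, y2, px, py) < EPSILON;
    }

    // Disjoint is the negation of intersects.
    static boolean disjoint(double px, double py, double x1, double y1, double x2, double y2) {
        return !intersects(px, py, x1, y1, x2, y2);
    }

    public static void main(String[] args) {
        // POINT(0 0) against LINESTRING(2 0, 0 2)
        System.out.println("intersects = " + intersects(0, 0, 2, 0, 0, 2));
        System.out.println("disjoint   = " + disjoint(0, 0, 2, 0, 0, 2));
    }
}
```

The point (0, 0) lies off the segment from (2, 0) to (0, 2), which agrees with the false and true results shown on the slide; a point such as (1, 1) sits on the segment and would intersect it.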
Java
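import com.orientechnologies.orient.core.db.*;
import com.orientechnologies.orient.core.sql.executor.OResultSet;

OrientDB orientDB = new OrientDB("remote:localhost", "root", "root", OrientDBConfig.defaultConfig());
ODatabasePool pool = new ODatabasePool(orientDB, "demodb", "admin", "admin");
ODatabaseSession db = pool.acquire();
// Full-text search on the Name field of ArchaeologicalSites
OResultSet result = db.query(
    "SELECT FROM ArchaeologicalSites WHERE search_fields(['Name'], 'foro') = true");
result.vertexStream().forEach(v -> System.out.println("v = " + v.toJSON()));
// Release resources in reverse order of acquisition
result.close();
db.close();
pool.close();
orientDB.close();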
Deploying
Availability and Integrity
Atomic, Consistent, Isolated and Durable (ACID) multi-statement transactions
[Figure: two master nodes linked by multi-master replication, with several clients (C) connected to them]
Scalability and Performance
Multi-Master Replication, Sharding and Auto-Discovery to Simplify Ops
[Figure: two master nodes and an auto-discovered node, with several clients (C) connected to them]
Deployment scenarios
Single, stand-alone node
Embedded (in-process) DB
Multi-Master Replica
Mixed
[Figure: Application and DB arrangements for each scenario, including DB replicas (replica N)]
Whole picture
[Figure: graph of Snow Patrol (Band), Indie (Genre), Luca (Account), Jill (Account) and 123, 1st Street, Austin, TX (Location)]
Graphs
{
  "@rid": "12:382",
  "@class": "Customer",
  "name": "Jill",
  "surname": "Raggio",
  "phone": "+39 33123212",
  "details": {
    "city": "London",
    "tags": "millennial"
  }
}
Schema-less structures
Object Oriented
Key-Value pairs
Geo-Spatial
Full-Text
[Figure: overlapping models: Graph, Document, Object, Key/Value, Full-Text, Spatial]
Multi-Model represents the intersection of multiple models in just one product
Multi-model
Additional tools and features
CLI console
Sequences
Server side functions in SQL, Java and JS
ETL tool: csv, json, jdbc
Teleporter: RDBMS to OrientDB
Neo4j Importer
API & Standards
REST and HTTP/JSON support
Support for TinkerPop standard for Graph DB: Gremlin
language and Blueprints API
JDBC driver to connect any BI tool
Drivers in Java, Node.js, Python, PHP, .NET, Perl, C/C++,
Elixir and more
Spring Data
Spark connector(s) (community)
Demo time
Download and unpack the server from the OrientDB site
Launch the server
./bin/server.sh
Point your browser to
http://localhost:2480/
Play around!
Get Started for Free
OrientDB Community Edition is FREE for any
purpose (Apache 2 license)
Udemy Getting Started Training is ★★★★★
OrientDB Enterprise Edition if you want more
OrientDB
Multi-Model DBMS with a Graph-Engine
Open Source Apache2 license
Data Models are built into the core engine
Schema-less, Schema-full and Schema-mixed
Written in Java (runs on every platform)
Zero-config HA
Useful links
Main site http://orientdb.com/
Documentation http://orientdb.com/docs/
GameOfGraph http://gog.orientdb.com/
GitHub https://github.com/orientechnologies/orientdb
Twitter demo https://github.com/robfrank/orientdb-twitter
Geospatial demo https://github.com/luigidellaquila/geospatial-demo
Thanks!
ROME 18-19 MARCH 2016
http://www.orientdb.com
@robfrankie
r.franchini@orientdb.com

OrientDB - The 2nd generation of (multi-model) NoSQL