Objectivity/DB - A Multi-Purpose
NoSQL Database

Leon Guzenda and Lenny Hoffman

Lunch and Learn - December 12, 2013
© Objectivity, Inc. 2013
Agenda
• Objectivity
• NoSQL?
• Objectivity/DB
• Typical Configurations
• Placement Manager

© Objectivity, Inc. 2013
Objectivity, Inc.
• A privately held, profitable company founded in 1988 to
provide object database software for engineering, scientific
and defense applications.
• Defence and Intelligence Community customers include:
US DoD/DoE/DHS, BaE, Boeing, General Dynamics,
L-3, Lockheed Martin, NGC, Raytheon, Titan...
• Equipment manufacturing customers include: Alcatel, Ciena,
Emerson, Ericsson, Iridium, NEC, Qualcomm, Siemens...
• “Big Science” customers include Berkeley NL, Brookhaven NL, CERN,
Lawrence Livermore NL, FermiLab, JHU Sloan Digital Sky Survey, Los
Alamos NL, Max Planck...

© Objectivity, Inc. 2013
Typical Deployments...
Objectivity/DB is used to
aggregate knowledge and
disseminate it back into
the field

HUMINT

© Objectivity, Inc. 2013
...Typical Deployments...
Astronomy

© Objectivity, Inc. 2013
...Typical Deployments

Zeus Anesthesia

Base Station Controller

Iridium LEO Network

High Energy Physics

Process Control
© Objectivity, Inc. 2013
ODBMS Evolution
• 1980s
– “Performance, Performance, Performance!”
– Primarily scientific and engineering applications

• 1990s
– Embedded applications
– Reliability and Scalability
– New languages and Operating Systems

DATA
MANIPULATION
Applications tended
to generate data and
relationships

– Large deployments in the scientific domain

• 2000s

RELATIONSHIP
ANALYTICS

– Ease of use and instrumentation

– Query languages

Applications ingest
and correlate data
and relationships

– Performance and scalability
– Grids and Clouds
– Embedded systems, government and more...

© Objectivity, Inc. 2013
High Ingest Systems
• 2000
– System handles 1 billion objects per day
– Correlates them with existing knowledge
– Raises alerts when new information is found
– Services about 100 analysts at one agency

• 2010
– System handles about 50 trillion objects every day
– Services 1000+ analysts at six agencies

• 2013
– A multi-sensor platform collects 2 Petabytes per day...

© Objectivity, Inc. 2013
NoSQL?

© Objectivity, Inc. 2013
Not Only SQL – A group of 4 primary technologies
Object Database

Key-Value Store

Highly
Interconnected

Simple
Leon Guzenda at ICOODB 2010

© Objectivity, Inc. 2010
© Objectivity, Inc. 2013
NoSQL?
•All of these features could have been obtained
from “Commercial Off The Shelf” ODBMSs:
• Unique object/document IDs

• Flexible object clustering
• Effectivity (data in a relationship)
Shared-Nothing
• Geospatial and multi-dimensional indexing
Fully distributed
• Hash table (key-value) lookups
“No-lock” and novel transaction modes
• Hyperspace = = single logical view of a federation
Iterators and fast, predictable traversals
• Multi-way replication
Fast scans
• High Availability
Optimization for random access
• Text searching

• Sharding
•
•
•
•
•
•

• In Memory Database configurability

Source: ICOODB 2011
© Objectivity, Inc. 2013
Not Only SQL – Is There Actually Anything New?
Other NoSQL

Objectivity/DB

Rediscovered Database Technologies

Scalable Collections:
• Hash Maps
• Plus Trees, Collections, B-Tree
indices and fast navigation

Key-Value Store

Named Groups of Objects:
• Document, Volume, Chapter,
Paragraph, Sentence,
Word…
• Or JSON, XML etc.

Objectivity/SQL++:
• SQL with object extensions
• Every row has a ROWID

© Objectivity, Inc. 2013
OBJECTIVITY/DB

© Objectivity, Inc. 2013
Objects
•
•
•
•
•
•
•
•
•

Accessible via a unique identifier (OID)
Can be linked by named relationships
Can be indexed in multiple ways
Can be included in multiple collections
Can be named in multiple scopes
Can be dynamically clustered
Can be versioned
Can be transient or persistent
Interoperable across multiple languages
(including SQL++) on multiple platforms
© Objectivity, Inc. 2013
Architectural Features
• Single Logical View
• Simple, Distributed Servers
• Replaceable Components (plug-ins)

© Objectivity, Inc. 2013
SINGLE LOGICAL VIEW

© Objectivity, Inc. 2013
SINGLE LOGICAL VIEW

Single Logical View of Millions of Exabytes

© Objectivity, Inc. 2013
Distributed Architecture
Client

Simple, Distributed Servers

Application

Lock Server
Lock Server

Smart
Cache

Lock Server
Query Server
Lock Server
Data Server

Objectivity/DB

Enhances scalability and availability
© Objectivity, Inc. 2013
Parallel Query Engine
Application

Lock Server
Lock Server

Smart
Cache

Lock Server
Query Server
Query Server

Objectivity/DB

PQE

Lock Server
Data Server

Task
Splitter

The Task Splitter aims queries at specific
databases and containers

Filter/Gat
eway

Filters can run complex qualification methods.
Gateways can access other databases or search engines.

Replaceable components for smarter optimization
© Objectivity, Inc. 2013
Replaceable Components

© Objectivity, Inc. 2013
Faster Navigation
PROBLEM: Find all (N) of the Suspects linked to a chosen Incident

Incident_Table

Relational solution:
N * 2 B-Tree lookups
N * 2 logical reads
B-Trees

© Objectivity, Inc. 2013

Join_Table

Suspect_Table
Faster Navigation
PROBLEM: Find all (N) of the Suspects linked to a chosen Incident

Incident_Table

Join_Table

Suspect_Table

Relational solution:
N * 2 B-Tree lookups
N * 2 logical reads
B-Trees

Incident

Objectivity/DB solution:
1
B-Tree lookup
1 + N logical reads

7

© Objectivity, Inc. 2013

Suspects
...Relationship Analytics...
Relational Database

Try constructing the SQL query to find the shortest path between these two rows

© Objectivity, Inc. 2013
...Relationship Analytics...
Relational Database

Object Database

With Objectivity/DB it’s a
few lines of code
© Objectivity, Inc. 2013
Typical Configurations

© Objectivity, Inc. 2013
Configuration Considerations
Processor Power

System Latency

Input/Output Volumes

Concurrency

Data Locality

Data Usage (random/scan,
volatility, query profiles...)

Storage media limitations

Geography and Mobility

Network Bandwidth

Security

Caching

Availability

© Objectivity, Inc. 2013
Standalone

Multithreaded process(es) run on a laptop
or workstation using the local disk(s).

© Objectivity, Inc. 2013
Network of Workstations

© Objectivity, Inc. 2013
In-Memory Database

The reference data is read into a large
cache that is preserved across
transactions, eliminating costly I/Os.

© Objectivity, Inc. 2013
Client-Server

© Objectivity, Inc. 2013
ODBC Server

Allows industry standard tools to access the
federation.

© Objectivity, Inc. 2013
Service-Oriented Architecture

Particularly useful where the link between
the client and the data is very slow.

© Objectivity, Inc. 2013
Parallel Query Engine

© Objectivity, Inc. 2013
High Performance Cluster
BaBAR Project
• “Data Intensive

Computing”
• 2000 CPU Farm
• 40 Terabytes of disk
• 1 Petabyte of tapes

First “public” project to
build a Petabyte+ database

© Objectivity, Inc. 2013
High Availability Cluster

© Objectivity, Inc. 2013
Geographically Distributed

© Objectivity, Inc. 2013
Grid/Cloud Deployment

Grid/Cloud

© Objectivity, Inc. 2013
Logger/Archiver

© Objectivity, Inc. 2013
Hybrid…

© Objectivity, Inc. 2013
…Hybrid
Sources

Ingest

Correlation & Analytics

© Objectivity, Inc. 2013

Visual Analytics

Actions
Flexibility Rules!
• The Single Logical View, Simple Distributed
Servers, Replaceable Components and Small
Footprint.....
• ... Provide Many Deployment Options

© Objectivity, Inc. 2013
Flexibility Rules!
• The Single Logical View, Simple Distributed
Servers, Replaceable Components and Small
Footprint.....
• ... Provide Many Deployment Options
Let's look at physical placement of data in more detail.

© Objectivity, Inc. 2013
www.Objectivity.com

Flexible
Placement
Lenny Hoffman
Chief Architect, Objectivity
Placement
• The vast majority of DBMSs place data
– On disk using some type of organization.
– On different disks.
– On different nodes.

• A lot of what differentiates DBMSs is how they place
data.
• The way data is placed greatly affects the applicability
of the DBMS for certain models (e.g. relational,
document, graph) as well as different use cases.
– Grouping by sets is good for relational, but not so good for
document or graph.
Model Influences Placement
• Most DBMSs are model specific:
– Relational – Key-Value – Document – Graph

• They naturally layout (place) data in ways that best support
those models and the use cases they tend to attract.
– Relational DBMSs tend to place like type data together as that
performs best given their set based access.
– Key-Value DBMSs tend to place data according to key values
(hashed), as that makes looking up by key fast.
– Document DBMSs tend to place data in hierarchies that represent
documents to make working with a document fast.
– Graph DBMSs tend to place data according to its connectivity with
other data as that makes navigation fast.
Placement Configurability
• Performance is the main reason behind placement
choices. Orders of magnitude differences can occur
between efficient and inefficient data placements.
• DBMSs place data based on their understanding and
experience with the use cases that fit with the model
and their target markets.
• Knowing that they can’t in a general way place data
that is best for every application they usually make
placement configurable to some degree.
Typical Placement Configurability
• DBMSs usually provide configurability within the
general framework of placement they do:
– Table spaces, Fragmenting, partitioning, sharding.
– Node assignments.
– Limited clustering (e.g. across joins).

• Most do not provide the ability to change the
framework, in whole or in part.
– When you model or use data differently.
– When some of your data does not fit well with general
assumptions made by the framework.
Fully Configurable Placement
• Model independent.
• Choosing which objects (e.g. records/rows) are placed together
(clustering).
– Because they are going to be used together.

• Which ones are placed separately.
– Because they are going to be used separately and may be distributed
separately.

• Which ones you base a primary, secondary, tertiary, etc. organization on.
• Having different organizations for different data.
– Different types of data.
– The same types but used differently.
– Data used in different modes, e.g. demo vs. production.

• Distributing data based on what is used together and locality of use.
Objectivity Placement System
Placement Model
• The Placement Model is the means by which a database designer
expresses a placement design and what is used by the
Placement Manager when placing objects or helping the query
system look them up.
• A Placement Model primarily consists of rules and placers:
– Rule -- expresses the conditions that are to lead to a particular object
placer.
– Object Placer -- responsible for placing objects into containers and
keeping track of where they where placed.
– Container Placer -- responsible for placing containers into databases
and keeping track of where they were placed.
– Database Placer -- responsible for placing database files (including
those for its externalized containers) into storage locations (host/path
combinations) and keeping track where they were placed.
Rule-Placer Delegation
Place According To Type
Place According to Hierarchy
Place According to Connectivity
Place According to Related (Partition)
Place According to Value (Partition)
A

D

3

D

E

8

F

H

12

J

K

55

P

Q

106

S

Z

197
Placement Summary
• Placement has a huge influence on performance.
• You can mix and match all possible placement
arrangements.
• You can use different placement schemes for different parts
of your data.
• We place it, therefore we find it (efficiently).
• It is easy to try different placement schemes to see what
works best for you.
• You can have different placement schemes for demo vs.
production, etc.
• You can evolve your placement as models are versioned.
Thank you!
Q&A
Follow us on twitter: @infinitegraph
and @objectivitydb

Objectivity/DB: A Multipurpose NoSQL Database

  • 1.
    Objectivity/DB - AMulti-Purpose NoSQL Database Leon Guzenda and Lenny Hoffman Lunch and Learn - December 12, 2013 © Objectivity, Inc. 2013
  • 2.
    Agenda • Objectivity • NoSQL? •Objectivity/DB • Typical Configurations • Placement Manager © Objectivity, Inc. 2013
  • 3.
    Objectivity, Inc. • Aprivately held, profitable company founded in 1988 to provide object database software for engineering, scientific and defense applications. • Defence and Intelligence Community customers include: US DoD/DoE/DHS, BaE, Boeing, General Dynamics, L-3, Lockheed Martin, NGC, Raytheon, Titan... • Equipment manufacturing customers include: Alcatel, Ciena, Emerson, Ericsson, Iridium, NEC, Qualcomm, Siemens... • “Big Science” customers include Berkeley NL, Brookhaven NL, CERN, Lawrence Livermore NL, FermiLab, JHU Sloan Digital Sky Survey, Los Alamos NL, Max Planck... © Objectivity, Inc. 2013
  • 4.
    Typical Deployments... Objectivity/DB isused to aggregate knowledge and disseminate it back into the field HUMINT © Objectivity, Inc. 2013
  • 5.
  • 6.
    ...Typical Deployments Zeus Anesthesia BaseStation Controller Iridium LEO Network High Energy Physics Process Control © Objectivity, Inc. 2013
  • 7.
    ODBMS Evolution • 1980s –“Performance, Performance, Performance!” – Primarily scientific and engineering applications • 1990s – Embedded applications – Reliability and Scalability – New languages and Operating Systems DATA MANIPULATION Applications tended to generate data and relationships – Large deployments in the scientific domain • 2000s RELATIONSHIP ANALYTICS – Ease of use and instrumentation – Query languages Applications ingest and correlate data and relationships – Performance and scalability – Grids and Clouds – Embedded systems, government and more... © Objectivity, Inc. 2013
  • 8.
    High Ingest Systems •2000 – System handles 1 billion objects per day – Correlates them with existing knowledge – Raises alerts when new information is found – Services about 100 analysts at one agency • 2010 – System handles about 50 trillion objects every day – Services 1000+ analysts at six agencies • 2013 – A multi-sensor platform collects 2 Petabytes per day... © Objectivity, Inc. 2013
  • 9.
  • 10.
    Not Only SQL– A group of 4 primary technologies Object Database Key-Value Store Highly Interconnected Simple Leon Guzenda at ICOODB 2010 © Objectivity, Inc. 2010 © Objectivity, Inc. 2013
  • 11.
    NoSQL? •All of thesefeatures could have been obtained from “Commercial Off The Shelf” ODBMSs: • Unique object/document IDs • Flexible object clustering • Effectivity (data in a relationship) Shared-Nothing • Geospatial and multi-dimensional indexing Fully distributed • Hash table (key-value) lookups “No-lock” and novel transaction modes • Hyperspace = = single logical view of a federation Iterators and fast, predictable traversals • Multi-way replication Fast scans • High Availability Optimization for random access • Text searching • Sharding • • • • • • • In Memory Database configurability Source: ICOODB 2011 © Objectivity, Inc. 2013
  • 12.
    Not Only SQL– Is There Actually Anything New? Other NoSQL Objectivity/DB Rediscovered Database Technologies Scalable Collections: • Hash Maps • Plus Trees, Collections, B-Tree indices and fast navigation Key-Value Store Named Groups of Objects: • Document, Volume, Chapter, Paragraph, Sentence, Word… • Or JSON, XML etc. Objectivity/SQL++: • SQL with object extensions • Every row has a ROWID © Objectivity, Inc. 2013
  • 13.
  • 14.
    Objects • • • • • • • • • Accessible via aunique identifier (OID) Can be linked by named relationships Can be indexed in multiple ways Can be included in multiple collections Can be named in multiple scopes Can be dynamically clustered Can be versioned Can be transient or persistent Interoperable across multiple languages (including SQL++) on multiple platforms © Objectivity, Inc. 2013
  • 15.
    Architectural Features • SingleLogical View • Simple, Distributed Servers • Replaceable Components (plug-ins) © Objectivity, Inc. 2013
  • 16.
    SINGLE LOGICAL VIEW ©Objectivity, Inc. 2013
  • 17.
    SINGLE LOGICAL VIEW SingleLogical View of Millions of Exabytes © Objectivity, Inc. 2013
  • 18.
    Distributed Architecture Client Simple, DistributedServers Application Lock Server Lock Server Smart Cache Lock Server Query Server Lock Server Data Server Objectivity/DB Enhances scalability and availability © Objectivity, Inc. 2013
  • 19.
    Parallel Query Engine Application LockServer Lock Server Smart Cache Lock Server Query Server Query Server Objectivity/DB PQE Lock Server Data Server Task Splitter The Task Splitter aims queries at specific databases and containers Filter/Gat eway Filters can run complex qualification methods. Gateways can access other databases or search engines. Replaceable components for smarter optimization © Objectivity, Inc. 2013
  • 20.
  • 21.
    Faster Navigation PROBLEM: Findall (N) of the Suspects linked to a chosen Incident Incident_Table Relational solution: N * 2 B-Tree lookups N * 2 logical reads B-Trees © Objectivity, Inc. 2013 Join_Table Suspect_Table
  • 22.
    Faster Navigation PROBLEM: Findall (N) of the Suspects linked to a chosen Incident Incident_Table Join_Table Suspect_Table Relational solution: N * 2 B-Tree lookups N * 2 logical reads B-Trees Incident Objectivity/DB solution: 1 B-Tree lookup 1 + N logical reads 7 © Objectivity, Inc. 2013 Suspects
  • 23.
    ...Relationship Analytics... Relational Database Tryconstructing the SQL query to find the shortest path between these two rows © Objectivity, Inc. 2013
  • 24.
    ...Relationship Analytics... Relational Database ObjectDatabase With Objectivity/DB it’s a few lines of code © Objectivity, Inc. 2013
  • 25.
  • 26.
    Configuration Considerations Processor Power SystemLatency Input/Output Volumes Concurrency Data Locality Data Usage (random/scan, volatility, query profiles...) Storage media limitations Geography and Mobility Network Bandwidth Security Caching Availability © Objectivity, Inc. 2013
  • 27.
    Standalone Multithreaded process(es) runon a laptop or workstation using the local disk(s). © Objectivity, Inc. 2013
  • 28.
    Network of Workstations ©Objectivity, Inc. 2013
  • 29.
    In-Memory Database The referencedata is read into a large cache that is preserved across transactions, eliminating costly I/Os. © Objectivity, Inc. 2013
  • 30.
  • 31.
    ODBC Server Allows industrystandard tools to access the federation. © Objectivity, Inc. 2013
  • 32.
    Service-Oriented Architecture Particularly usefulwhere the link between the client and the data is very slow. © Objectivity, Inc. 2013
  • 33.
    Parallel Query Engine ©Objectivity, Inc. 2013
  • 34.
    High Performance Cluster BaBARProject • “Data Intensive Computing” • 2000 CPU Farm • 40 Terabytes of disk • 1 Petabyte of tapes First “public” project to build a Petabyte+ database © Objectivity, Inc. 2013
  • 35.
    High Availability Cluster ©Objectivity, Inc. 2013
  • 36.
  • 37.
  • 38.
  • 39.
  • 40.
    …Hybrid Sources Ingest Correlation & Analytics ©Objectivity, Inc. 2013 Visual Analytics Actions
  • 41.
    Flexibility Rules! • TheSingle Logical View, Simple Distributed Servers, Replaceable Components and Small Footprint..... • ... Provide Many Deployment Options © Objectivity, Inc. 2013
  • 42.
    Flexibility Rules! • TheSingle Logical View, Simple Distributed Servers, Replaceable Components and Small Footprint..... • ... Provide Many Deployment Options Let's look at physical placement of data in more detail. © Objectivity, Inc. 2013
  • 43.
  • 44.
    Placement • The vastmajority of DBMSs place data – On disk using some type of organization. – On different disks. – On different nodes. • A lot of what differentiates DBMSs is how they place data. • The way data is placed greatly affects the applicability of the DBMS for certain models (e.g. relational, document, graph) as well as different use cases. – Grouping by sets is good for relational, but not so good for document or graph.
  • 45.
    Model Influences Placement •Most DBMSs are model specific: – Relational – Key-Value – Document – Graph • They naturally layout (place) data in ways that best support those models and the use cases they tend to attract. – Relational DBMSs tend to place like type data together as that performs best given their set based access. – Key-Value DBMSs tend to place data according to key values (hashed), as that makes looking up by key fast. – Document DBMSs tend to place data in hierarchies that represent documents to make working with a document fast. – Graph DBMSs tend to place data according to its connectivity with other data as that makes navigation fast.
  • 46.
    Placement Configurability • Performanceis the main reason behind placement choices. Orders of magnitude differences can occur between efficient and inefficient data placements. • DBMSs place data based on their understanding and experience with the use cases that fit with the model and their target markets. • Knowing that they can’t in a general way place data that is best for every application they usually make placement configurable to some degree.
  • 47.
    Typical Placement Configurability •DBMSs usually provide configurability within the general framework of placement they do: – Table spaces, Fragmenting, partitioning, sharding. – Node assignments. – Limited clustering (e.g. across joins). • Most do not provide the ability to change the framework, in whole or in part. – When you model or use data differently. – When some of your data does not fit well with general assumptions made by the framework.
  • 48.
    Fully Configurable Placement •Model independent. • Choosing which objects (e.g. records/rows) are placed together (clustering). – Because they are going to be used together. • Which ones are placed separately. – Because they are going to be used separately and may be distributed separately. • Which ones you base a primary, secondary, tertiary, etc. organization on. • Having different organizations for different data. – Different types of data. – The same types but used differently. – Data used in different modes, e.g. demo vs. production. • Distributing data based on what is used together and locality of use.
  • 49.
  • 50.
    Placement Model • ThePlacement Model is the means by which a database designer expresses a placement design and what is used by the Placement Manager when placing objects or helping the query system look them up. • A Placement Model primarily consists of rules and placers: – Rule -- expresses the conditions that are to lead to a particular object placer. – Object Placer -- responsible for placing objects into containers and keeping track of where they where placed. – Container Placer -- responsible for placing containers into databases and keeping track of where they were placed. – Database Placer -- responsible for placing database files (including those for its externalized containers) into storage locations (host/path combinations) and keeping track where they were placed.
  • 51.
  • 52.
  • 53.
  • 54.
    Place According toConnectivity
  • 55.
    Place According toRelated (Partition)
  • 56.
    Place According toValue (Partition) A D 3 D E 8 F H 12 J K 55 P Q 106 S Z 197
  • 57.
    Placement Summary • Placementhas a huge influence on performance. • You can mix and match all possible placement arrangements. • You can use different placement schemes for different parts of your data. • We place it, therefore we find it (efficiently). • It is easy to try different placement schemes to see what works best for you. • You can have different placement schemes for demo vs. production, etc. • You can evolve your placement as models are versioned.
  • 58.
    Thank you! Q&A Follow uson twitter: @infinitegraph and @objectivitydb