This presentation gives an overview of the OrientDB database project. It explains OrientDB in terms of its functionality, indexing and architecture. It also examines the ETL functionality and the available UI.
Links for further information and connecting
http://www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
https://nz.linkedin.com/pub/mike-frampton/20/630/385
https://open-source-systems.blogspot.com/
1. What Is OrientDB?
● A NoSQL database
● Uses a multi-model approach and supports these models
– Graph, document, key/value, object
– Reactive, Geospatial
● Open source Apache 2.0 license
● Offers scalability and high performance
● Supports polyglot persistence, i.e.
– The idea that different kinds of data benefit from being stored in different formats
2. What Is OrientDB?
● Offered from orientdb.com as
– Community and enterprise editions
● Supports use of TinkerPop 2.x and 3.x
● Has multi-master and sharded architecture
● Offers Gremlin and extended SQL interfaces
● Supports ACID transactions
● Offers record level security
● Supports schema-full, schema-less and schema-mixed modes
● Written in Java
3. OrientDB Indexes
● SB-Tree Index
– Default, durable, transactional and supports range queries
● Hash Index
– Fast lookup, light on disk usage, no range queries
● Auto Sharding Index
– Durable and transactional, no range queries
● Lucene Full Text Index
– Durable and transactional, text only, supports range queries
● Lucene Spatial Index
– Durable and transactional, spatial only, supports range queries
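The difference between an index that supports range queries and one that does not can be illustrated with a small sketch. This is not OrientDB's implementation: a Python dict stands in for a hash-style index and a sorted list stands in for a tree-style index. The keys, record ids and function name are made up for illustration.

```python
import bisect

# Toy data: (key, record id) pairs. The record ids are invented strings.
rows = [(30, "#12:0"), (25, "#12:1"), (41, "#12:2"), (35, "#12:3")]

# Hash-style index: key -> record id. Exact lookups are fast, but the keys
# carry no order, so a range query would have to scan every entry.
hash_index = {key: rid for key, rid in rows}

# Tree-style index, modelled here as a list of (key, rid) pairs sorted by key.
tree_index = sorted(rows)
tree_keys = [key for key, _ in tree_index]


def range_query(low, high):
    """Return record ids whose key lies in [low, high], using binary search."""
    start = bisect.bisect_left(tree_keys, low)
    end = bisect.bisect_right(tree_keys, high)
    return [rid for _, rid in tree_index[start:end]]


print(hash_index[41])        # exact-match lookup
print(range_query(26, 40))   # ordered range scan
```

The ordered structure answers the range query by locating two positions and returning the slice between them, which is the property the hash-style structure lacks.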
6. OrientDB Integration
● Uses an ETL tool (JSON configuration) for data import
● Compatible with most RDBMS that have a JDBC driver
– Tested with Oracle, SQL Server, MySQL, PostgreSQL, HyperSQL
● Has an Apache Spark connector (2.2.7+)
● Has a Neo4j importer
– Uses Neo4j Java API to extract graph
– Uses OrientDB Java API to import graph
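As a rough illustration of the JSON-configured ETL tool mentioned above, the sketch below builds a minimal configuration in Python and prints it as JSON. The section names (source, extractor, transformers, loader) follow the layout of the OrientDB ETL documentation as I understand it. The file path, database URL and the Person class are hypothetical, and the exact keys should be checked against the documentation for the OrientDB version in use.

```python
import json

# Hypothetical ETL configuration: read a CSV file and load each row
# as a vertex of class Person into a local graph database.
etl_config = {
    "source": {"file": {"path": "/tmp/people.csv"}},      # assumed input file
    "extractor": {"csv": {}},                              # parse rows as CSV
    "transformers": [{"vertex": {"class": "Person"}}],    # one vertex per row
    "loader": {
        "orientdb": {
            "dbURL": "plocal:/tmp/databases/people",      # assumed target database
            "dbType": "graph",
            "classes": [{"name": "Person", "extends": "V"}],
        }
    },
}


def render_config(config):
    """Serialise the configuration dict to the JSON text the ETL tool reads."""
    return json.dumps(config, indent=2)


print(render_config(etl_config))
```

The resulting JSON would typically be saved to a file and passed to the ETL script shipped in the OrientDB bin directory.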
7. OrientDB Backups
● Possible to export database in JSON format but
– No locking so possibly not an exact replica
● Backups lock database and create an exact replica
– Database in read only mode
– Concurrent writes blocked
● Distributed cluster allows (during backup)
– Read / write
– Snapshots
8. OrientDB Cluster
● Supports a distributed architecture
● Uses Hazelcast for auto discovery of nodes
● Has a multi-master system
● Supports REPLICA nodes (read only)
● Records can be created in distributed mode
● Supports distributed transactions
● Cannot import DB in distributed mode
9. OrientDB Teleporter
● Uses JDBC to import RDBMS database
● Enterprise version has a synchronisation function
● Tested with Oracle, SQL Server, MySQL, PostgreSQL, HyperSQL
● Execution involves
– Source DB Schema Building
– Graph Model Building
– OrientDB Schema Writing
– OrientDB importing
● Default admin, reader, writer accounts created
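The first three execution steps listed above can be sketched conceptually as follows. This is not Teleporter's code. It assumes a simple mapping in which each table becomes a vertex class and each foreign key becomes an edge class. The tables, the naming rule for edges and all function names are hypothetical. The final import step is omitted.

```python
# Step 1: a hypothetical source schema, as it might be read from JDBC metadata.
source_schema = {
    "tables": {
        "customer": ["id", "name"],
        "orders": ["id", "customer_id", "total"],
    },
    "foreign_keys": [
        # (child table, child column, parent table)
        ("orders", "customer_id", "customer"),
    ],
}


def build_graph_model(schema):
    """Step 2: map each table to a vertex class and each foreign key to an edge class."""
    vertices = {table.capitalize(): cols for table, cols in schema["tables"].items()}
    edges = [
        {
            "name": f"Has{parent.capitalize()}",  # illustrative naming rule only
            "from": child.capitalize(),
            "to": parent.capitalize(),
        }
        for child, _column, parent in schema["foreign_keys"]
    ]
    return {"vertices": vertices, "edges": edges}


def schema_statements(model):
    """Step 3: render the graph model as OrientDB SQL class definitions."""
    stmts = [f"CREATE CLASS {name} EXTENDS V" for name in model["vertices"]]
    stmts += [f"CREATE CLASS {edge['name']} EXTENDS E" for edge in model["edges"]]
    return stmts


model = build_graph_model(source_schema)
for statement in schema_statements(model):
    print(statement)
```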
12. Available Books
● See “Big Data Made Easy”
– Apress Jan 2015
● See “Mastering Apache Spark”
– Packt Oct 2015
● See “Complete Guide to Open Source Big Data Stack”
– Apress Jan 2018
● Find the author on Amazon
– www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
● Connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
13. Connect
● Feel free to connect on LinkedIn
– www.linkedin.com/in/mike-frampton-38563020
● See my open source blog at
– open-source-systems.blogspot.com/
● I am always interested in
– New technology
– Opportunities
– Technology based issues
– Big data integration