Oracle's Take On
NoSQL
Alexander Shopov
<ash@kambanaria.org>
[ash@edge ~]$ whoami
By day: Software Engineer at Cisco
By night: OSS contributor
Coordinator of Bulgarian Gnome TP
Contac...
Please Learn And Share

License: CC-BY v4.0
Contents
●

What is NoSQL?

●

Has it beaten SQL?

●

What is Oracle's take on it?

●

Do these have any foothold with us?
The NoSQL Story
What is NoSQL
●
●

Hardest question we will have to answer today
Simplest definition is that NoSQL databases are
a set (no...
Simple, yet true?
●

●

Simple answers may be simple, though they
are not necessarily correct
Especially with NoSQL becaus...
●

NoSQL isn't (something)
●

NoSQL isn't (something)

●

NoSQL is (NOT something)
In particular
●

NoSQL are data stores that are NOT relational
databases

●

They are something else

●

Thus it follows t...
NoSQL = Without SQL?
●

●

●

Perhaps a datastore without a SQL dialect of its
own?
Well no – some NoSQL solutions do have...
Not Only SQL
●
●

Which is fair enough, but a subtle point
Not in terms of black and white, cool – not cool,
works – sucks...
How did the term NoSQL catch on
then?
●

●
●

●

You should know the answer – many people
hate SQL
And SQL is easy to hat...
SQL – Love or Hate?
●

●

●

It is not a single language, rather it is a family of
dialects by competing companies
Have yo...
As Academic As You Can Get
●

SQL is not even relational

●

It is a language of bags, rather than sets
If you can make the last point –
you do not hate SQL, you have
grown used to it
What else is NoSQL not?
●

Martin Fowler says
NoSQL should be
called NoDBA
because developers
use NoSQL to run
around trad...
Dear DBAs,
Do your developers love you?
Do they hate you?
The term NoSQL
●

●

●

Was coined in 1998
by Carlo Strozzi
Lightweight relational
database that lacks
SQL dialect – NoSQL...
The current usage of the term is a
#tag
●

●

●

Started in 2009 when
Eric Evans who worked
at Rackspace
He proposed NoSQL...
NoSQL stems from needs that are
●

Hard

●

Impossible

●

Or even worse – prohibitively expensive to fulfil
with a tradit...
Examples
●

●

●

●

Not-structured data or hard to model in a relational
way
Big data - generated by interuser interactio...
NoSQL comes from the need to
scale out cheaply
Origins of Scale
●

●

●

Towards Robust
Distributed Systems –
Symposium on
Principles of Distributed
Computing - 2000
Eri...
Proven as theorem in 2002 by
Nancy Lynch and Seth Gilbert
The Fall Of the Triad
2 Out of 3

Consistency Availability

Tolerance to
network
partitions
The Fall Of the Triad
2 Out of 3

Consistency Availability

Tolerance to
network
partitions

You cannot
have full
availa...
The Fall Of the Triad
2 Out of 3
While keeping
consistency all nodes have
the same data
(not the same
as C in ACID)


Co...
The Fall Of the Triad
2 Out of 3
While keeping
consistency all nodes have
the same data
(not the same
as C in ACID)


Co...
A Somewhat Better Representation
100%
consistency

Impossible
to achieve

100%
availability
100% partition
tolerance
A Somewhat Better Representation
100%
consistency

Impossible
to achieve

The Whole
Volume Is
Interesting
100%
availabilit...
Do Not Forget
100%
consistency

Impossible
to achieve

100%
availability
100% partition
tolerance

These are actually
mult...
A Single System Can Wander In
The Space
100%
consistency

Impossible
to achieve

100%
availability
100% partition
toleranc...
Or Have Data Operations In
Different Points At The Same Time
100%
consistency

Impossible
to achieve

100%
availability
10...
CAP is easy to prove
●

●

●

●

Think of two nodes on opposite sides of a partition
Allowing at least one node to update ...
ACID vs. BASE
●

●

Brewer called these
BASE: Basically
Available, Soft state,
Eventually consistent
to pun the pun of
Jim...
A Typical NoSQL Taxonomy
●

Key-value stores
A Typical NoSQL Taxonomy
●

Key-value stores

●

Document databases
A Typical NoSQL Taxonomy
●

Key-value stores

●

Document databases

●

Column family stores
A Typical NoSQL Taxonomy
●

Key-value stores

●

Column family stores

●

Document databases

●

Graph databases
A Typical NoSQL Taxonomy
●

Key-value stores

●

Column family stores

●

Document databases

●

Graph databases
Not True Taxonomy
●

These are folk taxonomies

●

What happens to exist currently

●

No family relations – no speciation...
How do you use these NoSQLs?
●

Get one or few values out of the store

●

Either modify and store

●

Or go on looking fo...
The Other Way to Use Is MapReduce
●

●

●

Similar to the way we process garbage for
recycling
Make heaps of garbage, make...
NoSQL are Less Capable than
RDBMS
●

●

●

Do not expect similar behavior or even
capabilities – even when you have seen s...
Less Sophisticated Than RDBMS
●

Less to learn

●

Less to administer

●

Easy to start using

●

●

Similar to pointer an...
Loved By Developers? What About
Admins And Operations?
●

Ad Hoc Data Fixing – how?

●

Ad Hoc Data Querying – how?

●

Da...
So will NoSQL beat SQL?
●

●

●

First – why do we ask this? Hype and
fanboyism, tradition and rut all have their answer
F...
NoSQL is changing, so does SQL
●

Champions of NoSQL like Google are moving
closer to SQL and RDBMS
–

–

Transactions hel...
O, champion of NoSQL – Where Art
Thou Now?
●

Google
–

–

●

Spanner – ACID, SQL, schematized tables,
PAXOS, descendant o...
Michael Stonebraker
●

Ingres

●

Postgres

●

Informix

●

Vertica

●

VoltDB

●

Next 5 slides –
quoting him
SQL is so last millennium, there is
NewSQL
●

●

The variety in NoSQL and competition among
RDBMS are pushing traditional S...
OLAP/DW
●

Moving to column stores rather than traditional row
oriented stores – 50-100 faster
–
–

Better compression

–
...
Current OLTP

Buffer pool ≈ 24%
Locking ≈ 24%
Latching ≈ 24%
Recovery ≈ 24%
Useful work ≈ 4%
Ideal OLTP
Buffer pool ≈ 0%
Locking ≈ 0%
Latching ≈ 0%
Recovery ≈ 0%
Useful work ≈ 100%
How to get this ideal OLTP?
●

●

●

●

Latching – due to multithreading. Go single
threaded, each core – like a single t...
The Oracle Story
Who Is This Guy?
Who Is This Guy?
●

Designed and
implemented Unix
Who Is This Guy?
●

●

Designed and
implemented Unix
UTF-8
Who Is This Guy?
●

●
●

Designed and
implemented Unix
UTF-8
B – the direct
predecessor of C

●

Go, Plan 9

●

Early Rege...
Ken Thompson
●

Please do read:
Reflections on
Trusting Trust –
Turing Award
Lecture
Back in 1979
●

●

As part of Unix he
also wrote DBM
(database manager)
Basically a hashtable
backed by disk
storage
To Put Things Into Perspective
[Timeline axis: 1970 to 1981]
To Put Things Into Perspective
A Relational
Model of Data
for Large
Shared
Data Banks
June 1970

1970 1971 1970 1973 1970 1...
To Put Things Into Perspective
A Relational
Model of Data
for Large
Shared
Data Banks
June 1970

System R
first
research
pr...
To Put Things Into Perspective
A Relational
Model of Data
for Large
Shared
Data Banks
June 1970

System R
first
research
pr...
To Put Things Into Perspective
A Relational
Model of Data
for Large
Shared
Data Banks
June 1970

System R
first
research
pr...
Edward
Oates
Edward
Oates

Bruce Scott,
1st employee
Edward
Oates

Bruce Scott,
1st employee

Robert
Miner
Edward
Oates

Bruce Scott,
1st employee

Robert
Miner

Lawrence
Joseph
Lawrence
Joseph
Ellison
But this was yet to come
Back to the past
Unix Went To College – Berkeley

[Timeline axis: 1986 to 1997]
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

1986
1970 1987 1970 1989 1970 1991 1970 1993 1970 ...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Lawsuit –
in 1992, ended
in 1994. Effort to
rewrit...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Linus
Torvalds
Started
Linux

Lawsuit –
in 1992, e...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Linus
Torvalds
Started
Linux

Lawsuit –
in 1992, e...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Linus
Torvalds
Started
Linux

Lawsuit –
in 1992, e...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Lawsuit –
in 1992, ended
in 1994. Effort to
rewrit...
Unix Went To College – Berkeley
DBM became
ndbm –
new database
manager

Lawsuit –
in 1992, ended
in 1994. Effort to
rewrit...
Berkeley DB by Sleepycat Software

[Timeline axis: 1996 to 2007]
Berkeley DB by Sleepycat Software
Sleepycat
Software

1996
1970 1997 1970 1999 1970 2001 1970 2003 1970 2005 1970 2007
199...
Berkeley DB by Sleepycat Software
Sleepycat
Software

Berkeley DB
2.0 –
transactions

1996
1970 1997 1970 1999 1970 2001 197...
Berkeley DB by Sleepycat Software
Sleepycat
Software

Berkeley DB
2.0 –
transactions

Berkeley DB
3.0 – API

1996
1970 1997 19...
Berkeley DB by Sleepycat Software
Sleepycat
Software

Berkeley DB
2.0 –
transactions

Berkeley DB
3.0 – API

Berkeley DB
4.0 –
H...
Berkeley DB by Sleepycat Software
Sleepycat
Software

Berkeley DB
2.0 –
transactions

Berkeley DB
3.0 – API

Berkeley DB
4.0 –
H...
Berkeley DB by Oracle
Sleepycat
Software

Berkeley DB
2.0 –
transactions

Berkeley DB
3.0 – API

Berkeley DB
4.0 –
HA single
mast...
And Then Everybody And Their Dog
Were Creating Databases
Google
FileSystem
2003

Google
MapReduce
2004

Google
BigTable
20...
And Then Everybody And Their Dog
Were Creating Databases
Google
FileSystem
2003

Google
MapReduce
2004

Google
BigTable
20...
Vive La Révolution!
●

●

●

●

Just 4 months later on Oracle OpenWorld in start
of October 2011 Oracle announced they wer...
General Architecture
Storage Nodes
Replication Node – DB of key value
pairs
Divide keyspace into shards
Populate
Storage Nodes
Shard 1, Replication node 1, master
Shard 1, Replication node 1, master
+ replicas
Shard 1, 2, 3, Replication node 1,
master
Shard 1, 2, 3, Replication node 1,
master + replicas
Shard 1, 2, 3, RN 1, MR + REP
Shard 1, 2, 3, RN 2, MR
Shard 1, 2, 3, RN 1, MR + REP
Shard 1, 2, 3, RN 2, MR + REP
Add last masters
Add All Data
Clients With Clever Drivers – Fewer
Round Trips
Initialize Handle
private KVStore store;
private KVStoreConfig config;
public void getHandle() {
String[] hosts = {"localh...
Release Resources

public void release() {
store.close();
}
But Let's Do It the Java 7 Way

public void getHandleJava7() {
try (KVStore store1 = KVStoreFactory.getStore(config)) {
//...
How to Write
public void writeKeyValue() throws
UnsupportedEncodingException {
List<String> major = new ArrayList<>();
maj...
How to Delete

public void deleteKeyValue(){
List<String> major = Arrays.asList("Muffin", "Man");
List<String> minor = Arr...
How to Read – 1

public void readARecord() throws
UnsupportedEncodingException{
List<String> major = Arrays.asList("Muffin...
How to Read – 2
public void readFullMajor1Go(){
List<String> major = Arrays.asList("Muffin", "Man");
Key k = Key.createKey...
How to Read – 3
public void readFullMajorManyGoes(){
List<String> major = Arrays.asList("Muffin", "Man");
Key k = Key.crea...
How to Read – 4
public void readPartialMatch() {
List<String> major = Arrays.asList("Muffin");
Key k = Key.createKey(major...
Key ranges

public void prepareKeyRange() {
// Bowerick Wowbagger the Infinitely Prolonged
// Hitchhikers Guide To the Gal...
Sequence of Operations - TX
public void sequence(){
OperationFactory of = store.getOperationFactory();
List<Operation> ops...
Apache Avro
Avro JSON Schemas
{
"type": "record",
"namespace": "bgoug",
"name": "Developer",
"fields": [
{ "name": "name",
"type": "st...
Prepare Avro

public void prepareSchemas() throws IOException{
Map<String, Schema> schemas = new HashMap<>();
Schema.Parse...
Generic Avro
public void genericAvro(){
AvroCatalog catalog = store.getAvroCatalog();
GenericAvroBinding binding =
catalog...
Specific Avro
public void specificAvro(){
AvroCatalog catalog = store.getAvroCatalog();
SpecificAvroBinding binding =
cata...
JSON Avro
public void jsonAvro(){
AvroCatalog catalog = store.getAvroCatalog();
JsonAvroBinding binding =
catalog.getJsonM...
Licensing
●

Community Edition – FLOSS, AGPLv3

●

Enterprise edition:
–

SNMP, Oracle RDBMS compatibility, JMX

–

$40/us...
Further General Resources
●

●

●

●

●

Martin Fowler: NoSQL Distilled to an hour
http://vimeo.com/66052102
Martin Fowler...
Further Resources on CAP
●

●

●

●

Eric Brewer: Towards Robust Distributed Systems
http://www.cs.berkeley.edu/~brewer/cs...
Further Oracle NoSQL Resources
●

●

Oracle's Product Page:
http://www.oracle.com/technetwork/products/no
sqldb/overview/i...
Code examples
●

https://github.com/alshopov/OracleNoSQLExam
ples
Oracle's Take On NoSQL

History of NoSQL, architecture of Oracle's NoSQL database, examples for using it in Java

Published in: Technology
  • @al_shopov You're welcome :)
  • @issildur - Thanks for the link. I have missed putting in the presentation. Plus it is easer to read it in Google docs than actually search torrent sites for it.
  • Address for the oracle whitepaper https://docs.google.com/a/nertim.com/viewer?a=v&q=cache:G4pI4ZOkzWYJ:www.oracle.com/technetwork/database/debunking-nosql-twp-399992.pdf+oracle+debunking+nosql&hl=en&gl=uk&pid=bl&srcid=ADGEESiaUPuEdyJ9cnDc_GzgsfsNq6UytDZeO5f0pgDJyUeo7x-xfe2W091nseq4s1cIl9lZ79jmGT0TRpE5PF8svROWbJSjcbrm6TXb2AWfM2TaAa6Z80dEupN3oSFzZG6y9mWBsgTd&sig=AHIEtbSXOrH6n87xP4yC4bqqMaLHSMBBNg&pli=1

Oracle's Take On NoSQL

  1. 1. Oracle's Take On NoSQL Alexander Shopov <ash@kambanaria.org>
  2. 2. [ash@edge ~]$ whoami By day: Software Engineer at Cisco By night: OSS contributor Coordinator of Bulgarian Gnome TP Contacts: E-mail: ash@kambanaria.org Jabber: al_shopov@jabber.minus273.org LinkedIn: http://www.linkedin.com/in/alshopov SlideShare: http://www.slideshare.net/al_shopov Web: Just search “al_shopov”
  3. 3. Please Learn And Share License: CC-BY v4.0
  4. 4. Contents ● What is NoSQL? ● Has it beaten SQL? ● What is Oracle's take on it? ● Do these have any foothold with us?
  5. 5. The NoSQL Story
  6. 6. What is NoSQL ● Hardest question we will have to answer today ● Simplest definition is that NoSQL databases are a set (not even a family) of mechanisms for storage and retrieval of data that try to be: – highly available – able to scale horizontally
  7. 7. Simple, yet true? ● Simple answers may be simple, though they are not necessarily correct ● Especially with NoSQL because:
  8. 8. ● NoSQL isn't (something)
  9. 9. ● NoSQL isn't (something) ● NoSQL is (NOT something)
  10. 10. In particular ● NoSQL are data stores that are NOT relational databases ● They are something else ● Thus it follows that they are not relational ● And some say they are not even true databases
  11. 11. NoSQL = Without SQL? ● Perhaps a datastore without a SQL dialect of its own? ● Well no – some NoSQL solutions do have SQL or SQL-ish dialects ● So NoSQL is not even NO to SQL
  12. 12. Not Only SQL ● Which is fair enough, but a subtle point ● Not in terms of black and white, cool – not cool, works – sucks binary viewpoints ● And then it should be NOSQL, but it is NoSQL ● But this still does not explain things
  13. 13. How did the term NoSQL catch on then? ● You should know the answer – many people hate SQL ● And SQL is easy to hate! ● It is large, Large, LARGE, LARGE – Oracle SQL Language reference for 12.1 is 1826 pages. ● Have you actually seen the whole rail-road diagram for Oracle's SELECT anywhere?
  14. 14. SQL – Love or Hate? ● It is not a single language, rather it is a family of dialects by competing companies ● Have you seen the standard (9 parts, 10th under way, more than 4000 pages)? ● Do you even care about the standard?
  15. 15. As Academic As You Can Get ● SQL is not even relational ● It is a language of bags, rather than sets
  16. 16. If you can make the last point – you do not hate SQL, you have grown used to it
  17. 17. What else is NoSQL not? ● Martin Fowler says NoSQL should be called NoDBA because developers use NoSQL to run around traditional databases with their DBAs and bureaucracies.
  18. 18. Dear DBAs, Do your developers love you? Do they hate you?
  19. 19. The term NoSQL ● Was coined in 1998 by Carlo Strozzi ● Lightweight relational database that lacks SQL dialect – NoSQL ● NoSQL should be NoREL
  20. 20. The current usage of the term is a #tag ● Started in 2009 with Eric Evans, who worked at Rackspace ● He proposed NoSQL as a Twitter #tag for a conference for the existing distributed databases ● The term stayed and gained popularity
  21. 21. NoSQL stems from needs that are ● Hard ● Impossible ● Or even worse – prohibitively expensive to fulfil with a traditional relational database
  22. 22. Examples ● Unstructured data, or data that is hard to model in a relational way ● Big data - generated by inter-user interaction (Facebook), imported from external sources (WWW) ● Bringing structure to otherwise unstructured data – what we usually model as LOBs or BLOBs in RDBMS ● Graphs – Hierarchies in RDBMS (even bi-directional). Storing vertices and edges in a table and then modeling paths with joins – like the Entity Attribute Value anti-pattern
  23. 23. NoSQL comes from the need to scale out cheaply
  24. 24. NoSQL comes from the need to scale out cheaply
  25. 25. NoSQL comes from the need to scale out cheaply
  26. 26. Origins of Scale ● Towards Robust Distributed Systems – Symposium on Principles of Distributed Computing - 2000 ● Eric Brewer then at Inktomi ● Called his conjecture – the CAP theorem
  27. 27. Proven as theorem in 2002 by Nancy Lynch and Seth Gilbert
  28. 28. The Fall Of the Triad: 2 Out of 3 – Consistency, Availability, Tolerance to network partitions
  29. 29. The Fall Of the Triad: 2 Out of 3 – Consistency, Availability, Tolerance to network partitions. You cannot have full availability – all operations can proceed, even writes
  30. 30. The Fall Of the Triad: 2 Out of 3 – Consistency, Availability, Tolerance to network partitions. You cannot have full availability – all operations can proceed, even writes. While keeping consistency – all nodes have the same data (not the same as C in ACID)
  31. 31. The Fall Of the Triad: 2 Out of 3 – Consistency, Availability, Tolerance to network partitions. You cannot have full availability – all operations can proceed, even writes. While keeping consistency – all nodes have the same data (not the same as C in ACID). When you have partitions – machines that cannot communicate
  32. 32. A Somewhat Better Representation 100% consistency Impossible to achieve 100% availability 100% partition tolerance
  33. 33. A Somewhat Better Representation 100% consistency Impossible to achieve The Whole Volume Is Interesting 100% availability 100% partition tolerance
  34. 34. Do Not Forget 100% consistency Impossible to achieve 100% availability 100% partition tolerance These are actually multidimensional
  35. 35. A Single System Can Wander In The Space 100% consistency Impossible to achieve 100% availability 100% partition tolerance
  36. 36. Or Have Data Operations In Different Points At The Same Time 100% consistency Impossible to achieve 100% availability 100% partition tolerance
  37. 37. CAP is easy to prove ● Think of two nodes on opposite sides of a partition ● Allowing at least one node to update state will cause the nodes to become inconsistent, thus forfeiting C. ● If we preserve consistency, one side of the partition must act as if it is unavailable, thus forfeiting A. ● Only when nodes communicate is it possible to preserve both consistency and availability, thereby forfeiting P.
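The two-node argument on this slide can be played out in a few lines of plain Java. This is only a toy sketch: `CapSketch`, `Node`, `partitioned` and `preferConsistency` are hypothetical names invented for the illustration, and no real database is modelled.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of two replicas that may be separated by a network partition. */
class CapSketch {

    /** A replica holding its own local copy of the data. */
    static class Node {
        final Map<String, String> data = new HashMap<>();
    }

    final Node left = new Node();
    final Node right = new Node();
    boolean partitioned = false;       // true when the nodes cannot communicate
    final boolean preferConsistency;   // true: give up A, false: give up C

    CapSketch(boolean preferConsistency) {
        this.preferConsistency = preferConsistency;
    }

    /** Writes through one node; returns false when the write is refused. */
    boolean write(Node target, String key, String value) {
        if (partitioned) {
            if (preferConsistency) {
                return false;              // forfeit A: refuse rather than diverge
            }
            target.data.put(key, value);   // forfeit C: only one side sees the write
            return true;
        }
        left.data.put(key, value);         // no partition: both copies are updated
        right.data.put(key, value);
        return true;
    }

    /** True when both replicas hold exactly the same data. */
    boolean consistent() {
        return left.data.equals(right.data);
    }
}
```

With `partitioned` set to true, a store built with `preferConsistency = false` accepts the write and the replicas diverge, while one built with `preferConsistency = true` stays consistent by refusing the write.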
  38. 38. ACID vs. BASE ● Brewer called these BASE: Basically Available, Soft state, Eventually consistent – to pun on the pun of Jim Gray ● But NoSQL caught on
  39. 39. A Typical NoSQL Taxonomy ● Key-value stores
  40. 40. A Typical NoSQL Taxonomy ● Key-value stores ● Document databases
  41. 41. A Typical NoSQL Taxonomy ● Key-value stores ● Document databases ● Column family stores
  42. 42. A Typical NoSQL Taxonomy ● Key-value stores ● Column family stores ● Document databases ● Graph databases
  43. 43. A Typical NoSQL Taxonomy ● Key-value stores ● Column family stores ● Document databases ● Graph databases
  44. 44. Not True Taxonomy ● These are folk taxonomies ● What happens to exist currently ● No family relations – no speciation ● Even putting them in four corners is visually lying – some key-value stores are very close to some document databases, while graph databases look like the odd man out and stand on their own
  45. 45. How do you use these NoSQLs? ● Get one or few values out of the store ● Either modify and store ● Or go on looking for other values ● It is like pointer chasing ● p->p1->p2->p3...
  46. 46. The Other Way to Use Is MapReduce ● Similar to the way we process garbage for recycling ● Make heaps of garbage, make many teams sort each out (map) ● Aggregate iron, plastics, paper, glass from each team (reduce) ● Very efficient batch processing ● But it is batch processing
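The recycling analogy maps directly onto code. The sketch below uses only the Java standard library and is not tied to Hadoop or any NoSQL product; the class `RecyclingMapReduce` and its method names are made up for this illustration. Each heap is sorted independently (map), then the per-team counts are merged (reduce).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** The garbage-sorting analogy as a tiny in-process map/reduce. */
class RecyclingMapReduce {

    /** Map step: one team sorts its own heap and counts items per material. */
    static Map<String, Long> map(List<String> heap) {
        return heap.stream()
                .collect(Collectors.groupingBy(item -> item, TreeMap::new, Collectors.counting()));
    }

    /** Reduce step: merge the per-team counts into totals per material. */
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((material, count) -> totals.merge(material, count, Long::sum));
        }
        return totals;
    }

    /** Runs the map step for every heap (possibly in parallel), then reduces. */
    static Map<String, Long> run(List<List<String>> heaps) {
        List<Map<String, Long>> partials = heaps.parallelStream()
                .map(RecyclingMapReduce::map)
                .collect(Collectors.toList());
        return reduce(partials);
    }
}
```

The map step needs no coordination between teams, which is what makes the approach scale out; the price is that the totals exist only after the whole batch has finished.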
  47. 47. NoSQL are Less Capable than RDBMS ● Do not expect similar behavior or even capabilities – even when it looks so at first glance in the documentation ● The maturity of the RDBMS ecosystem and your expectations may do you a disservice in the wild west of NoSQL ● Do not assume – double check
  48. 48. Less Sophisticated Than RDBMS ● Less to learn ● Less to administer ● Easy to start using ● Similar to pointer and reference programming models ● Programmers like them
  49. 49. Loved By Developers? What About Admins And Operations? ● Ad Hoc Data Fixing – how? ● Ad Hoc Data Querying – how? ● Data Export – how?
  50. 50. So will NoSQL beat SQL? ● First – why do we ask this? Hype and fanboyism, tradition and rut all have their answer ● For some NoSQL has already beaten SQL (no matter what NoSQL and SQL mean) ● For others NoSQL is way too young and not providing even a part of what SQL does (similarly - no matter what NoSQL and SQL mean)
  51. 51. NoSQL is changing, so is SQL ● Champions of NoSQL like Google are moving closer to SQL and RDBMS – Transactions help developers reason about what is happening, developers also like it – No ACID in DB means it is maddeningly hard to do ACID on application level – Declarativeness of SQL is fine and actually developers like it – Speed is not everything – it is just part of the equation ● Move from batch to online processing
  52. 52. O, champion of NoSQL – Where Art Thou Now? ● Google – Spanner – ACID, SQL, schematized tables, Paxos, descendant of Megastore rather than BigTable – F1 – General transactions, Paxos, relational schema + extensions hierarchy, rich data types ● Facebook – Presto – standard SQL, window functions, ad hoc queries
  53. 53. Michael Stonebraker ● Ingres ● Postgres ● Informix ● Vertica ● VoltDB ● Next 5 slides – quoting him
  54. 54. SQL is so last millennium, there is NewSQL ● The variety in NoSQL and competition among RDBMS are pushing traditional SQL engines to differentiate more strongly ● No more – One size fits all
  55. 55. OLAP/DW ● Moving to column stores rather than traditional row oriented stores – 50-100 times faster – Better compression – No header per row – IO much better for sparsely filled wide tables when running aggregates on several columns ● IBM DB2 (10.5, June 2013), Oracle (some in 11g2 2009 Exadata, more in 12c), MS SQL Server (some in 2012/2014 CTP1, June 2013), SAP HANA, MySQL
  56. 56. Current OLTP: Buffer pool ≈ 24%, Locking ≈ 24%, Latching ≈ 24%, Recovery ≈ 24%, Useful work ≈ 4%
  57. 57. Ideal OLTP: Buffer pool ≈ 0%, Locking ≈ 0%, Latching ≈ 0%, Recovery ≈ 0%, Useful work ≈ 100%
  58. 58. How to get this ideal OLTP? ● Latching – due to multithreading. Go single threaded, each core – like a single thread, divide memory or remove all shared data ● Buffer pool – go into main memory, use anticaching ● Row level locking – MVCC, timestamp ordering, lightweight locking ● Recovery – replication rather than rely on ARIES (Algorithms for Recovery and Isolation Exploiting Semantics), replicate via command logging
  59. 59. The Oracle Story
  60. 60. Who Is This Guy?
  61. 61. Who Is This Guy? ● Designed and implemented Unix
  62. 62. Who Is This Guy? ● Designed and implemented Unix ● UTF-8
  63. 63. Who Is This Guy? ● Designed and implemented Unix ● UTF-8 ● B – the direct predecessor of C ● Go, Plan 9 ● Early Regex ● Turing Award
  64. 64. Ken Thompson ● Please do read: Reflections on Trusting Trust – Turing Award Lecture
  65. 65. Back in 1979 ● As part of Unix he also wrote DBM (database manager) ● Basically a hashtable backed by disk storage
  66. 66. To Put Things Into Perspective (timeline 1970 to 1981)
  67. 67. To Put Things Into Perspective (timeline 1970 to 1981): A Relational Model of Data for Large Shared Data Banks, June 1970
  68. 68. To Put Things Into Perspective (timeline 1970 to 1981): A Relational Model of Data for Large Shared Data Banks, June 1970; System R first research prototype, 1974
  69. 69. To Put Things Into Perspective (timeline 1970 to 1981): A Relational Model of Data for Large Shared Data Banks, June 1970; System R first research prototype, 1974; First IBM commercial product SQL/DS, 1981
  70. 70. To Put Things Into Perspective (timeline 1970 to 1981): A Relational Model of Data for Large Shared Data Banks, June 1970; System R first research prototype, 1974; First IBM commercial product SQL/DS, 1981; Beaten to the market by a smaller firm, RSI (Relational Software, Inc.), founded as Software Development Laboratories (SDL) in 1977 by these guys
  71. 71. Edward Oates
  72. 72. Edward Oates Bruce Scott, 1st employee
  73. 73. Edward Oates Bruce Scott, 1st employee Robert Miner
  74. 74. Edward Oates Bruce Scott, 1st employee Robert Miner Lawrence Joseph
  75. 75. Lawrence Joseph Ellison
  76. 76. But this was yet to come Back to the past
  77. 77. Unix Went To College – Berkeley (timeline 1986 to 1997)
  78. 78. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager
  79. 79. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities.
  80. 80. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Linus Torvalds started Linux; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities.
  81. 81. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Linus Torvalds started Linux; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities; Keith Bostic designed the API
  82. 82. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Linus Torvalds started Linux; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities; Keith Bostic designed the API; Michael Olson – Btree impl.
  83. 83. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities; Linus Torvalds started Linux; Keith Bostic designed the API; Michael Olson – Btree impl.; Db 1.85 part of 4.4 BSD
  84. 84. Unix Went To College – Berkeley (timeline 1986 to 1997): DBM became ndbm – new database manager; Lawsuit – in 1992, ended in 1994. Effort to rewrite AT&T copyrighted utilities; Linus Torvalds started Linux; Keith Bostic designed the API; Michael Olson – Btree impl.; Db 1.85 part of 4.4 BSD; Margo Seltzer – paper on TX variant
  85. 85. Berkeley DB by Sleepycat Software (timeline 1996 to 2007)
  86. 86. Berkeley DB by Sleepycat Software (timeline 1996 to 2007): Sleepycat Software
  87. 87. Berkeley DB by Sleepycat Software (timeline 1996 to 2007): Sleepycat Software; Berkeley DB 2.0 – transactions
  88. 88. Berkeley DB by Sleepycat Software (timeline 1996 to 2007): Sleepycat Software; Berkeley DB 2.0 – transactions; Berkeley DB 3.0 – API
  89. 89. Berkeley DB by Sleepycat Software (timeline 1996 to 2007): Sleepycat Software; Berkeley DB 2.0 – transactions; Berkeley DB 3.0 – API; Berkeley DB 4.0 – HA single master, multiple reader
  90. 90. Berkeley DB by Sleepycat Software (timeline 1996 to 2007): Sleepycat Software; Berkeley DB 2.0 – transactions; Berkeley DB 3.0 – API; Berkeley DB 4.0 – HA single master, multiple reader; Berkeley DB Java Edition – pure Java impl.
  91. 91. Berkeley DB by Oracle (timeline 1996 to 2007): Sleepycat Software; Berkeley DB 2.0 – transactions; Berkeley DB 3.0 – API; Berkeley DB 4.0 – HA single master, multiple reader; Berkeley DB Java Edition – pure Java impl.; Oracle bought Sleepycat – embedded DB
  92. 92. And Then Everybody And Their Dog Were Creating Databases: Google FileSystem (2003), Google MapReduce (2004), Google BigTable (2006), Apache CouchDB (2005), Apache Hadoop (2007), Amazon Dynamo (2007), Facebook Cassandra (2008), Yahoo! PNUTS (2008), Basho Riak (2009), 10gen MongoDB (2009), VMWare Redis (2009), LinkedIn Voldemort (2009), Twitter FlockDB (2010)
  93. 93. And Then Everybody And Their Dog Were Creating Databases: Google FileSystem (2003), Google MapReduce (2004), Google BigTable (2006), Apache CouchDB (2005), Apache Hadoop (2007), Amazon Dynamo (2007), Facebook Cassandra (2008), Yahoo! PNUTS (2008), Basho Riak (2009), 10gen MongoDB (2009), VMWare Redis (2009), LinkedIn Voldemort (2009), Twitter FlockDB (2010). Overlay: Oracle whitepaper "Debunking the NoSQL hype", May 2011. Now available only on torrent sites and Internet caches
  94. 94. Vive La Révolution! ● Just 4 months later, at Oracle OpenWorld at the start of October 2011, Oracle announced they were working on a NoSQL solution, availability – end of October 2011 ● December 2011 – version 1.2.x ● December 2012 – Oracle NoSQL Database 2.0, 11gR2 (11.2.x) ● The Old Dog Learns The New Tricks – VERY, VERY FAST
  95. 95. General Architecture Storage Nodes
  96. 96. Replication Node – DB of key value pairs
  97. 97. Divide keyspace into shards
  98. 98. Populate Storage Nodes
  99. 99. Shard 1, Replication node 1, master
  100. 100. Shard 1, Replication node 1, master + replicas
  101. 101. Shard 1, 2, 3, Replication node 1, master
  102. 102. Shard 1, 2, 3, Replication node 1, master + replicas
  103. 103. Shard 1, 2, 3, RN 1, MR + REP Shard 1, 2, 3, RN 2, MR
  104. 104. Shard 1, 2, 3, RN 1, MR + REP Shard 1, 2, 3, RN 2, MR + REP
  105. 105. Add last masters
  106. 106. Add All Data
  107. 107. Clients With Clever Drivers – Fewer Round Trips
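The architecture slides (keyspace divided into shards, one master per shard, clients with clever drivers) can be sketched in plain Java. This is only an illustration under stated assumptions, not the Oracle NoSQL driver: `ShardRouterSketch`, `shardFor` and `masterFor` are hypothetical names, and the hash is a stand-in because the slides do not show the one the product uses. The idea it shows is that a driver which knows the shard layout can compute the target node locally and contact it directly, which is where the saved round trips come from.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of client-side routing; not the Oracle NoSQL driver. */
final class ShardRouterSketch {
    private final List<String> masters; // assumed: one master node name per shard

    ShardRouterSketch(List<String> masters) {
        this.masters = masters;
    }

    /** Hashes only the major key path, so all minor keys under it land on one shard. */
    int shardFor(List<String> majorPath) {
        int h = 17;
        for (String part : majorPath) {
            h = 31 * h + Arrays.hashCode(part.getBytes(StandardCharsets.UTF_8));
        }
        // floorMod keeps the result in [0, shard count) even for negative hashes
        return Math.floorMod(h, masters.size());
    }

    /** The node a write for this major path would be sent to, without asking a proxy. */
    String masterFor(List<String> majorPath) {
        return masters.get(shardFor(majorPath));
    }
}
```

Because only the major path is hashed, every record under `("Muffin", "Man")` resolves to the same shard regardless of its minor path.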
  108. 108. Initialize Handle
      import java.util.concurrent.TimeUnit;
      import oracle.kv.KVStore;
      import oracle.kv.KVStoreConfig;
      import oracle.kv.KVStoreFactory;

      private KVStore store;
      private KVStoreConfig config;

      public void getHandle() {
          // helper host:port pairs used to locate the store
          String[] hosts = {"localhost:5000", "127.0.0.1:5000"};
          config = new KVStoreConfig("example", hosts);
          // set default time out
          config.setRequestTimeout(50, TimeUnit.MILLISECONDS);
          // can set consistency, durability (or use the set*Void() variants)
          // config.setConsistency(Consistency.ABSOLUTE)
          //       .setDurability(Durability.COMMIT_SYNC);
          store = KVStoreFactory.getStore(config);
      }
  109. 109. Release Resources
      public void release() {
          store.close();
      }
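  110. 110. But Let's Do It the Java 7 Way
      public void getHandleJava7() {
          // try-with-resources closes the store automatically
          try (KVStore store1 = KVStoreFactory.getStore(config)) {
              // MEAT GOES HERE
          } catch (Exception e) {
              // handle or log the failure here; an empty catch hides errors
          } finally {
              // any extra cleanup; store1 is already closed at this point
          }
      }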
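  111. 111. How to Write
      public void writeKeyValue() throws UnsupportedEncodingException {
          List<String> major = new ArrayList<>();
          major.add("Muffin");
          major.add("Man");
          List<String> minor = new ArrayList<>();
          minor.add("address");
          Key k = Key.createKey(major, minor);
          String address = "Drury Lane";
          Value v = Value.createValue(address.getBytes("UTF-8"));
          store.put(k, v);                 // unconditional write
          store.putIfAbsent(k, v);         // only if the key does not exist yet
          store.putIfPresent(k, v);        // only if the key already exists
          // only if the stored version matches; null is a placeholder here,
          // real code passes a Version obtained from an earlier get or put
          store.putIfVersion(k, v, null);
      }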
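To make the difference between the four write calls concrete, here is a minimal in-memory model of their conditions. It is a sketch under assumptions, not the Oracle NoSQL implementation: `ConditionalPutSketch` and its methods are hypothetical, a plain `long` counter stands in for the store's version object, and a return value of `-1` marks a write that was refused.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of conditional writes; not the Oracle NoSQL API. */
final class ConditionalPutSketch {
    /** A value together with the version assigned when it was written. */
    private static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> data = new HashMap<>();
    private long nextVersion = 1;

    /** Unconditional write; returns the new version. */
    long put(String key, String value) {
        long version = nextVersion++;
        data.put(key, new Versioned(value, version));
        return version;
    }

    /** Writes only when the key is missing; -1 means the write was refused. */
    long putIfAbsent(String key, String value) {
        return data.containsKey(key) ? -1 : put(key, value);
    }

    /** Writes only when the key exists; -1 means the write was refused. */
    long putIfPresent(String key, String value) {
        return data.containsKey(key) ? put(key, value) : -1;
    }

    /** Writes only when the stored version equals the expected one (optimistic locking). */
    long putIfVersion(String key, String value, long expectedVersion) {
        Versioned current = data.get(key);
        boolean matches = current != null && current.version == expectedVersion;
        return matches ? put(key, value) : -1;
    }

    String get(String key) {
        Versioned current = data.get(key);
        return current == null ? null : current.value;
    }
}
```

The version check is what lets two writers detect that they raced: the one holding a stale version is refused and has to re-read before trying again.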
  112. 112. How to Delete
      public void deleteKeyValue() {
          List<String> major = Arrays.asList("Muffin", "Man");
          List<String> minor = Arrays.asList("address");
          Key k = Key.createKey(major, minor);
          store.delete(k);                 // delete a single record
          // delete every record under the major path (no KeyRange, default Depth)
          store.multiDelete(Key.createKey(major), null, null);
      }
  113. 113. How to Read – 1
      public void readARecord() throws UnsupportedEncodingException {
          List<String> major = Arrays.asList("Muffin", "Man");
          List<String> minor = Arrays.asList("address");
          Key k = Key.createKey(major, minor);
          ValueVersion vv = store.get(k);  // null if the key is absent
          Value v = vv.getValue();
          String result = new String(v.getValue(), "UTF-8");
          result.equals("Drury Lane");     // comparison shown for illustration; result unused
      }
  114. 114. How to Read – 2
      public void readFullMajor1Go() {
          List<String> major = Arrays.asList("Muffin", "Man");
          Key k = Key.createKey(major);
          // Single operation
          SortedMap<Key, ValueVersion> records = store.multiGet(k, null, null);
          for (Map.Entry<Key, ValueVersion> entry : records.entrySet()) {
              Key key = entry.getKey();
              List<String> minor = key.getMinorPath();
              ValueVersion vv = entry.getValue();
              Value v = vv.getValue();
              // Do some work with the Value here
          }
      }
  115. 115. How to Read – 3
      public void readFullMajorManyGoes() {
          List<String> major = Arrays.asList("Muffin", "Man");
          Key k = Key.createKey(major);
          // Non atomic
          Iterator<KeyValueVersion> it = store.multiGetIterator(
              Direction.FORWARD, // or REVERSE
              0,                 // Batch size, 0 - use default
              k,                 // the key
              null,              // KeyRange
              null);             // Depth - CHILDREN_ONLY, PARENT_AND_CHILDREN,
                                 // DESCENDANTS_ONLY, PARENT_AND_DESCENDANTS
          while (it.hasNext()) {
              Value v = it.next().getValue();
              // Do some work with the Value here
          }
      }
  116. 116. How to Read – 4
      public void readPartialMatch() {
          List<String> major = Arrays.asList("Muffin"); // partial major path
          Key k = Key.createKey(major);
          // Non atomic, read large part of DB
          Iterator<KeyValueVersion> it = store.storeIterator(
              Direction.UNORDERED, // other values: FORWARD, REVERSE (check which your release supports)
              0,                   // Batch size, 0 - use default
              k,                   // the key
              null,                // KeyRange
              null);               // Depth - CHILDREN_ONLY, PARENT_AND_CHILDREN,
                                   // DESCENDANTS_ONLY, PARENT_AND_DESCENDANTS
          while (it.hasNext()) {
              Value v = it.next().getValue();
              // Do some work with the Value here
          }
      }
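The read slides distinguish a lookup under a complete major path (one operation) from a scan under a partial major path (many operations over a large part of the store). A small model with a sorted map shows why both reduce to prefix scans. Everything here is an assumption made for illustration: `MajorMinorSketch`, `multiGet` and `storeScan` are hypothetical, and keys are flattened to strings of the form `/major1/major2/-/minor1`, with `-` separating the major from the minor path.

```java
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical model of major/minor key lookups; not the Oracle NoSQL API. */
final class MajorMinorSketch {
    private final TreeMap<String, String> data = new TreeMap<>();

    /** Renders a major path as "/a/b". */
    static String majorPrefix(List<String> major) {
        StringBuilder sb = new StringBuilder();
        for (String part : major) {
            sb.append('/').append(part);
        }
        return sb.toString();
    }

    void put(List<String> major, List<String> minor, String value) {
        StringBuilder sb = new StringBuilder(majorPrefix(major)).append("/-");
        for (String part : minor) {
            sb.append('/').append(part);
        }
        data.put(sb.toString(), value);
    }

    /** All entries whose flattened key starts with the prefix, in key order. */
    private SortedMap<String, String> scan(String prefix) {
        return data.subMap(prefix, true, prefix + Character.MAX_VALUE, false);
    }

    /** Complete major path: every record under it, as one sorted result. */
    SortedMap<String, String> multiGet(List<String> fullMajor) {
        return scan(majorPrefix(fullMajor) + "/-");
    }

    /** Partial major path: every record whose major path continues from it. */
    SortedMap<String, String> storeScan(List<String> partialMajor) {
        return scan(majorPrefix(partialMajor) + "/");
    }
}
```

In the model both calls are local scans; in a sharded store the first can be answered by a single shard, while the second may have to visit all of them, which is why the slides call it non atomic.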
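  117. 117. Key ranges
      public void prepareKeyRange() {
          // Bowerick Wowbagger the Infinitely Prolonged
          // Hitchhikers Guide To the Galaxy
          // Arthur Philip Dent - You are a jerk
          // Caution: by plain String ordering the slug below sorts before
          // "Arthur Philip Dent" ('-' < 'r'), so real code needs start <= end.
          KeyRange kr = new KeyRange(
              "Arthur Philip Dent",   // start
              true,                   // inclusive? true is "[", false is "("
              "A-Rth-Urp-Hil-Ipdenu", // slug (end)
              true);                  // inclusive? true is "]", false is ")"
      }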
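The two inclusive flags behave like the brackets of an interval. A sketch over a sorted set makes that visible, along with the ordering pitfall: plain string comparison is by character code, not by the alphabetical order the Wowbagger joke relies on. `KeyRangeSketch` and `select` are hypothetical helpers, not part of the Oracle NoSQL API.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

/** Hypothetical illustration of range bounds over sorted keys. */
final class KeyRangeSketch {
    /** Returns the keys between start and end, honouring both inclusive flags. */
    static NavigableSet<String> select(NavigableSet<String> keys,
                                       String start, boolean startInclusive,
                                       String end, boolean endInclusive) {
        if (start.compareTo(end) > 0) {
            throw new IllegalArgumentException("start sorts after end");
        }
        return keys.subSet(start, startInclusive, end, endInclusive);
    }

    /** Small sample key set used for the demonstration. */
    static NavigableSet<String> sampleKeys() {
        NavigableSet<String> keys = new TreeSet<>();
        keys.add("Arthur Dent");
        keys.add("Ford Prefect");
        keys.add("Zaphod Beeblebrox");
        return keys;
    }
}
```

With the sample keys, `select(keys, "Arthur Dent", true, "Ford Prefect", false)` corresponds to the interval `[Arthur Dent, Ford Prefect)`, while passing the slide's two strings in their original order is rejected because the slug compares lower than the name.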
  118. 118. Sequence of Operations - TX
      public void sequence() {
          OperationFactory of = store.getOperationFactory();
          List<Operation> ops = new ArrayList<>();
          // placeholders; in real code all keys share one major path
          // and each key appears in the list only once
          Key k = null;
          Value v = null;
          ops.add(of.createDelete(k));
          ops.add(of.createPut(k, v));
          // of.createDeleteIfVersion(); of.createPutIfAbsent();
          // of.createPutIfPresent(); of.createPutIfVersion()
          try {
              store.execute(ops);
          } catch (OperationExecutionException |  // cannot exec
                   DurabilityException |          // durability not met
                   IllegalArgumentException |     // list is ∅, null
                   RequestTimeoutException e) {   // timeout
          } catch (FaultException e) {            // sth else
          }
      }
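The point of executing a sequence is that it is applied as a unit. The sketch below models that all-or-nothing behaviour in memory, under stated assumptions: `AtomicBatchSketch`, `Op` and `Kind` are hypothetical, a failed step is reported by returning `false` rather than by an exception, and every failed step aborts the batch, which is a simplification of what a real store may offer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical all-or-nothing batch over an in-memory map; not the Oracle NoSQL API. */
final class AtomicBatchSketch {
    enum Kind { PUT, PUT_IF_ABSENT, DELETE }

    static final class Op {
        final Kind kind;
        final String key;
        final String value; // ignored for DELETE

        Op(Kind kind, String key, String value) {
            this.kind = kind;
            this.key = key;
            this.value = value;
        }
    }

    private final Map<String, String> data = new HashMap<>();

    /** Applies every operation or none; returns false if any step could not be applied. */
    boolean execute(List<Op> ops) {
        if (ops == null || ops.isEmpty()) {
            throw new IllegalArgumentException("operation list is null or empty");
        }
        Set<String> seen = new HashSet<>();
        for (Op op : ops) {
            if (!seen.add(op.key)) {
                throw new IllegalArgumentException("key used twice: " + op.key);
            }
        }
        // Work on a copy so that a failure leaves the real data untouched.
        Map<String, String> scratch = new HashMap<>(data);
        for (Op op : ops) {
            switch (op.kind) {
                case PUT:
                    scratch.put(op.key, op.value);
                    break;
                case PUT_IF_ABSENT:
                    if (scratch.containsKey(op.key)) {
                        return false;
                    }
                    scratch.put(op.key, op.value);
                    break;
                case DELETE:
                    if (scratch.remove(op.key) == null) {
                        return false;
                    }
                    break;
            }
        }
        data.clear();
        data.putAll(scratch);
        return true;
    }

    String get(String key) {
        return data.get(key);
    }
}
```

Copying the whole map is only acceptable in a toy; the design point is that readers never observe a half-applied batch.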
  119. 119. Apache Avro
  120. 120. Avro JSON Schemas
      {
        "type": "record",
        "namespace": "bgoug",
        "name": "Developer",
        "fields": [
          { "name": "name", "type": "string", "default": "NONE" },
          { "name": "age", "type": "int", "default": 0 },
          { "name": "language", "type": "string", "default": "Java" }
        ]
      }
  121. 121. Prepare Avro
      public void prepareSchemas() throws IOException {
          Map<String, Schema> schemas = new HashMap<>();
          Schema.Parser parser = new Schema.Parser();
          Schema developerSchema = parser.parse(new File("DeveloperSchema.avsc"));
          schemas.put(developerSchema.getFullName(), developerSchema);
          Schema dbAdminSchema = parser.parse(new File("DbAdminSchema.avsc"));
          schemas.put(dbAdminSchema.getFullName(), dbAdminSchema);
      }
  122. 122. Generic Avro
      public void genericAvro() {
          AvroCatalog catalog = store.getAvroCatalog();
          // schemas and developerSchema come from the "Prepare Avro" slide
          GenericAvroBinding binding = catalog.getGenericMultiBinding(schemas);
          GenericRecord dev = new GenericData.Record(developerSchema);
          dev.put("name", "Sam A. Hacker");
          dev.put("age", 37);
          dev.put("language", "Java");
          Key k = null; // placeholder, build with Key.createKey
          store.put(k, binding.toValue(dev));
          Value v = store.get(k).getValue();
          // the multi-binding can return a record of any registered schema
          GenericRecord dbAdmin = binding.toObject(v);
          dbAdmin.get("name");
      }
  123. 123. Specific Avro
      public void specificAvro() {
          AvroCatalog catalog = store.getAvroCatalog();
          SpecificAvroBinding<SpecificRecord> binding = catalog.getSpecificMultiBinding();
          // Developer is generated via the provided ant task
          // org.apache.avro.compiler.specific.SchemaTask
          Developer dev = new Developer();
          dev.setName("Sam A. Hacker");
          dev.setAge(37);
          dev.setLanguage("Java");
          Key k = null; // placeholder, build with Key.createKey
          store.put(k, binding.toValue(dev));
          Value v = store.get(k).getValue();
          SpecificRecord sr = binding.toObject(v);
          // pick the concrete class by the schema's full name
          if (sr.getSchema().getFullName().equals("dba")) {
              DbAdmin dbAdmin = (DbAdmin) sr;
          }
      }
  124. 124. JSON Avro
      public void jsonAvro() throws IOException {
          AvroCatalog catalog = store.getAvroCatalog();
          JsonAvroBinding binding = catalog.getJsonMultiBinding(schemas);
          String jsonText = "{\"name\": \"Sam A. Hacker\","
                          + " \"age\": 34, \"language\": \"Java\"}";
          ObjectMapper jsonMapper = new ObjectMapper();
          JsonNode json = jsonMapper.readTree(jsonText); // may throw IOException
          JsonRecord dev = new JsonRecord(json, developerSchema);
          Key k = null; // placeholder, build with Key.createKey
          store.put(k, binding.toValue(dev));
          Value v = store.get(k).getValue();
          JsonRecord jr = binding.toObject(v);
          if (jr.getSchema().getFullName().equals("dba")) {
              JsonNode dbAdmin = jr.getJsonNode();
              dbAdmin.get("db");
          }
      }
  125. 125. Licensing ● Community Edition – FLOSS, AGPLv3 ● Enterprise edition: – SNMP, Oracle RDBMS compatibility, JMX – $40/user/year (min. 25), $2000/processor/year – RDBMS Standard Edition One ≤ NoSQL ≤ RDBMS Standard Edition
  126. 126. Further General Resources ● Martin Fowler: NoSQL Distilled to an hour http://vimeo.com/66052102 ● Martin Fowler: NoSQL Distilled http://martinfowler.com/nosql.html ● Ilya Katsov: NoSQL Data Modelling Techniques http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/ ● Christof Strauch: NoSQL Databases http://www.christof-strauch.de/nosqldbs.pdf ● Michael Stonebraker http://slideshot.epfl.ch/play/suri_stonebraker
  127. 127. Further Resources on CAP ● Eric Brewer: Towards Robust Distributed Systems http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf ● Eric Brewer: NoSQL: Past, Present, Future http://www.infoq.com/presentations/NoSQL-History ● Eric Brewer: CAP Twelve Years Later: How the "Rules" Have Changed http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed ● Nancy Lynch, Seth Gilbert: Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services http://lpd.epfl.ch/sgilbert/pubs/BrewersConjecture-SigAct.pdf
  128. 128. Further Oracle NoSQL Resources ● Oracle's Product Page: http://www.oracle.com/technetwork/products/nosqldb/overview/index.html ● Good Documentation: http://docs.oracle.com/cd/NOSQL/html/index.html
  129. 129. Code examples ● https://github.com/alshopov/OracleNoSQLExamples
