NoSQL Databases and Polyglot Applications

Agustín Magaña Falconi

AGENDA

•  A bit of history

•  Definition of NoSQL

•  Types of NoSQL

•  Why polyglot applications?

•  Example using Redis

Edgar Frank "Ted" Codd

Trend: Less uniformity

Sparse data: relational databases

Trend: Exponential data growth

Not Only SQL

•  Do not use SQL as the primary query language

•  Do not require fixed structures such as tables

•  No JOINs

•  ACID is not guaranteed

•  Scale well horizontally

Types of NoSQL Databases

•  Key/Value Stores

•  Document Databases

•  Graph Databases

•  Column Oriented Databases

Key/Value Stores

•  Apache Cassandra

•  Google BigTable

•  Amazon Dynamo

•  LinkedIn Voldemort

•  Memcached

•  Oracle NoSQL Database

•  Redis

Graph Databases

•  Neo4j

•  DEX

•  AllegroGraph

•  OrientDB

•  InfiniteGraph

•  InfoGraph

•  HyperGraphDB

Document Databases

•  MongoDB

•  CouchDB

•  BaseX

•  Djondb

•  SimpleDB

•  Terrastore

Column Oriented Databases

•  Cassandra

•  BigTable

•  HBase

•  Hypertable

Polyglot Applications?

Food To Go Architecture
[Diagram: the Consumer uses Order Taking and the Restaurant Owner uses Restaurant Management; both are backed by a MySQL database]

Limitations of Relational Databases

•  Scalability

•  Distribution

•  Schema modification

•  O/R impedance mismatch

•  Handling of semi-structured data

Solution: Spend a lot of money

Solution: Use NoSQL

Benefits

•  High performance

•  High scalability

•  Rich data model

•  Schema-less

Drawbacks

•  Limited transactions

•  Limited queries

•  Consistency not guaranteed

•  Data without constraints

Redis

•  Advanced key-value store

•  Very fast (100K reqs/sec)

•  Optional persistence

•  Transactions with optimistic locking

•  Master/slave data replication

Sorted sets
Key: myset
Value: a (score 5.0), b (score 10.0)

Members are sorted by score

Adding members to a sorted set
Command (key, score, value)    Sorted set on the Redis server

zadd myset 5.0 a               myset: a (5.0)

zadd myset 10.0 b              myset: a (5.0), b (10.0)

zadd myset 1.0 c               myset: c (1.0), a (5.0), b (10.0)

Retrieving members by index range
Command (key, start index, end index):

zrange myset 0 1

Redis server: myset: c (1.0), a (5.0), b (10.0)
Result: c, a

Retrieving members by score
Command (key, min value, max value):

zrangebyscore myset 1 6

Redis server: myset: c (1.0), a (5.0), b (10.0)
Result: c, a
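
The three sorted-set commands shown on these slides can be imitated in plain Java to make their semantics concrete. This is a minimal in-memory sketch, not a Redis client: the class name SortedSetSketch and its method names are hypothetical, it sorts on every read for simplicity, and it only handles non-negative indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a single Redis sorted set.
class SortedSetSketch {
    // member -> score; re-adding a member replaces its score, as ZADD does
    private final Map<String, Double> scores = new HashMap<>();

    // Like "zadd key score member"
    public void zadd(double score, String member) {
        scores.put(member, score);
    }

    // Members ordered by score, ties broken by member name
    private List<String> membersByScore() {
        List<String> members = new ArrayList<>(scores.keySet());
        members.sort(Comparator
                .comparingDouble((String m) -> scores.get(m))
                .thenComparing(Comparator.<String>naturalOrder()));
        return members;
    }

    // Like "zrange key start end": both indexes inclusive (non-negative only here)
    public List<String> zrange(int start, int end) {
        List<String> members = membersByScore();
        List<String> result = new ArrayList<>();
        for (int i = Math.max(start, 0); i <= end && i < members.size(); i++) {
            result.add(members.get(i));
        }
        return result;
    }

    // Like "zrangebyscore key min max": both bounds inclusive
    public List<String> zrangeByScore(double min, double max) {
        List<String> result = new ArrayList<>();
        for (String member : membersByScore()) {
            double score = scores.get(member);
            if (min <= score && score <= max) {
                result.add(member);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SortedSetSketch myset = new SortedSetSketch();
        myset.zadd(5.0, "a");
        myset.zadd(10.0, "b");
        myset.zadd(1.0, "c");
        System.out.println(myset.zrange(0, 1));        // [c, a]
        System.out.println(myset.zrangeByScore(1, 6)); // [c, a]
    }
}
```

Both queries return c and a, matching the slides: the first because they occupy indexes 0 and 1, the second because their scores fall between 1 and 6.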

Redis is good, but it has drawbacks

•  Lookups only by primary key

•  Limited transaction model:

   •  Reads first, then executes the updates as a batch.

•  The data must fit in memory

•  Lacks access-control features

Caching with Redis
[Diagram: the Consumer uses Order Taking and the Restaurant Owner uses Restaurant Management; the application queries the Redis cache first and the MySQL database second]
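
The "cache first, database second" lookup order in this architecture could be sketched as follows. This is only an illustration under stated assumptions: a HashMap stands in for Redis, a Function stands in for the MySQL query, and the class name CacheAsideSketch and its members are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache-aside lookup: try the cache first, then the database.
class CacheAsideSketch {
    private final Map<String, String> cache = new HashMap<>(); // stands in for Redis
    private final Function<String, String> database;           // stands in for a MySQL query
    public int databaseReads = 0;                               // counts cache misses

    public CacheAsideSketch(Function<String, String> database) {
        this.database = database;
    }

    public String find(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                  // first: cache hit
        }
        databaseReads++;
        String value = database.apply(key); // second: fall back to the database
        if (value != null) {
            cache.put(key, value);          // remember the result for next time
        }
        return value;
    }

    public static void main(String[] args) {
        CacheAsideSketch repo = new CacheAsideSketch(key -> "data for " + key);
        repo.find("restaurant:1");
        repo.find("restaurant:1");
        System.out.println(repo.databaseReads); // 1
    }
}
```

The second lookup is served from the cache, so the database is read only once.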

Domain object to key-value mapping?
[Diagram: the Restaurant domain object, with its TimeRange, MenuItem and ServiceArea parts, on one side; key-value pairs (K1 → V1, K2 → V2, ...) on the other]

Finding available restaurants
Available restaurants =
Serve the zip code of the delivery address
AND
Are open at the delivery time

public interface AvailableRestaurantRepository {

  List<AvailableRestaurant> findAvailableRestaurants(Address deliveryAddress, Date deliveryTime);
  ...
}

Food to Go – Domain model (partial)
class Restaurant {
  long id;
  String name;
  Set<String> serviceArea;
  Set<TimeRange> openingHours;
  List<MenuItem> menuItems;
}

class TimeRange {
  long id;
  int dayOfWeek;
  int openTime;
  int closeTime;
}

class MenuItem {
  String name;
  double price;
}

Database schema

RESTAURANT table
ID  Name               …
1   Ajanta
2   Montclair Eggshop

RESTAURANT_ZIPCODE table
Restaurant_id  zipcode
1              94707
1              94619
2              94611
2              94619

RESTAURANT_TIME_RANGE table
Restaurant_id  dayOfWeek  openTime  closeTime
1              Monday     1130      1430
1              Monday     1730      2130
2              Tuesday    1130      …

Finding available restaurants on Monday, 6:15 pm
for zip code 94619
Straightforward three-way join

select r.*
from restaurant r
inner join restaurant_time_range tr
on r.id = tr.restaurant_id
inner join restaurant_zipcode sa
on r.id = sa.restaurant_id
where '94619' = sa.zip_code
and tr.day_of_week = 'monday'
and tr.openingtime <= 1815
and 1815 <= tr.closingtime

BUT how to implement findAvailableRestaurants()
with Redis?!

[Diagram: the three-way join query on one side, Redis key-value pairs (K1 → V1, K2 → V2, ...) on the other, with a question mark between them]

Where we need to be
ZRANGEBYSCORE myset 1 6

=

select value, score
from sorted_set
where key = 'myset'
and score >= 1
and score <= 6

(sorted_set columns: key, value, score)

We need to denormalize

Think materialized view

Simplification #1: Denormalization
Restaurant_id  Day_of_week  Open_time  Close_time  Zip_code

1              Monday       1130       1430        94707
1              Monday       1130       1430        94619
1              Monday       1730       2130        94707
1              Monday       1730       2130        94619
2              Monday       0700       1430        94619
…

SELECT restaurant_id
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time
AND open_time < 1815

Simpler query:
•  No joins
•  Two = and two <
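
The denormalized table is the cross product of each restaurant's opening hours and the zip codes it serves. The sketch below shows one way that table and the join-free query could look in Java. The class and method names are hypothetical, the day of week is kept as a String for readability (the domain model uses an int), and the data is taken from the rows shown on the slide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the denormalized time_range_zip_code table.
class DenormalizeSketch {
    static class TimeRange {
        final String dayOfWeek; final int openTime; final int closeTime;
        TimeRange(String dayOfWeek, int openTime, int closeTime) {
            this.dayOfWeek = dayOfWeek; this.openTime = openTime; this.closeTime = closeTime;
        }
    }

    // One denormalized row: a time range repeated for one zip code
    static class Row {
        final long restaurantId; final TimeRange range; final String zipCode;
        Row(long restaurantId, TimeRange range, String zipCode) {
            this.restaurantId = restaurantId; this.range = range; this.zipCode = zipCode;
        }
    }

    // Cross product of opening hours and served zip codes
    static List<Row> denormalize(long restaurantId, List<TimeRange> hours, List<String> zipCodes) {
        List<Row> rows = new ArrayList<>();
        for (TimeRange range : hours) {
            for (String zip : zipCodes) {
                rows.add(new Row(restaurantId, range, zip));
            }
        }
        return rows;
    }

    // The join-free query: two equality tests and two range tests
    static List<Long> findOpen(List<Row> rows, String day, String zip, int time) {
        List<Long> ids = new ArrayList<>();
        for (Row row : rows) {
            if (row.range.dayOfWeek.equals(day) && row.zipCode.equals(zip)
                    && row.range.openTime < time && time < row.range.closeTime) {
                ids.add(row.restaurantId);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.addAll(denormalize(1,
                Arrays.asList(new TimeRange("Monday", 1130, 1430), new TimeRange("Monday", 1730, 2130)),
                Arrays.asList("94707", "94619")));
        // 700 rather than 0700: a leading zero would make a Java int literal octal
        rows.addAll(denormalize(2,
                Arrays.asList(new TimeRange("Monday", 700, 1430)),
                Arrays.asList("94619")));
        System.out.println(findOpen(rows, "Monday", "94619", 1815)); // [1]
    }
}
```

Restaurant 1 contributes four rows (two time ranges times two zip codes), and only its 1730 to 2130 range for 94619 matches a Monday 18:15 delivery.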

Simplification #2: Application filtering
SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE day_of_week = 'Monday'
AND zip_code = 94619
AND 1815 < close_time

Even simpler query:
•  No joins
•  Two = and one <

The application then filters the returned rows on open_time.

Simplification #3: Eliminate multiple ='s with concatenation
Restaurant_id  Zip_dow       Open_time  Close_time

1              94707:Monday  1130       1430
1              94619:Monday  1130       1430
1              94707:Monday  1730       2130
1              94619:Monday  1730       2130
2              94619:Monday  0700       1430
…

SELECT restaurant_id, open_time
FROM time_range_zip_code
WHERE zip_code_day_of_week = '94619:Monday'    (key)
AND 1815 < close_time                          (range)

Simplification #4: Eliminate multiple RETURN VALUES with concatenation
zip_dow       open_time_restaurant_id  close_time
94707:Monday  1130_1                   1430
94619:Monday  1130_1                   1430
94707:Monday  1730_1                   2130
94619:Monday  1730_1                   2130
94619:Monday  0700_2                   1430
...

SELECT open_time_restaurant_id
FROM time_range_zip_code
WHERE zip_code_day_of_week = '94619:Monday'
AND 1815 < close_time

Using a Redis sorted set as an index
zip_dow open_time_restaurant_id close_time
94707:Monday 1130_1 1430
94619:Monday 1130_1 1430
94707:Monday 1730_1 2130
94619:Monday 1730_1 2130
94619:Monday 0700_2 1430
...

Key Sorted Set [ Entry:Score, …]

94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday [1130_1:1430, 1730_1:2130]

Querying with ZRANGEBYSCORE
Key Sorted Set [ Entry:Score, …]

94619:Monday [0700_2:1430, 1130_1:1430, 1730_1:2130]

94707:Monday [1130_1:1430, 1730_1:2130]

ZRANGEBYSCORE 94619:Monday 1815 2359
(key: delivery zip and day; min score: delivery time)

Result: {1730_1}

1730 is before 1815, so Ajanta is open
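
Putting the pieces together, findAvailableRestaurants() could be approximated as below. This is a sketch under assumptions, not the presenter's implementation: nested maps stand in for Redis sorted sets, the class RestaurantIndexSketch and its method signatures are hypothetical, and the zip code, day and time are passed as plain values instead of Address and Date objects.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory version of the sorted-set index.
class RestaurantIndexSketch {
    // key "zip:day" -> (member "openTime_restaurantId" -> score closeTime)
    private final Map<String, Map<String, Integer>> index = new HashMap<>();

    // Comparable to: ZADD zip:day closeTime openTime_restaurantId
    public void addOpeningHours(long restaurantId, String zip, String day, int open, int close) {
        String member = String.format("%04d_%d", open, restaurantId);
        index.computeIfAbsent(zip + ":" + day, k -> new HashMap<>()).put(member, close);
    }

    public List<Long> findAvailableRestaurants(String zip, String day, int deliveryTime) {
        Map<String, Integer> set = index.getOrDefault(zip + ":" + day, Collections.emptyMap());
        List<Long> ids = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : set.entrySet()) {
            int close = entry.getValue();
            // Comparable to: ZRANGEBYSCORE zip:day deliveryTime 2359
            if (close < deliveryTime || close > 2359) {
                continue;
            }
            // Application-side filter on the open time encoded in the member
            String[] parts = entry.getKey().split("_");
            if (Integer.parseInt(parts[0]) <= deliveryTime) {
                ids.add(Long.parseLong(parts[1]));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        RestaurantIndexSketch index = new RestaurantIndexSketch();
        index.addOpeningHours(2, "94619", "Monday", 700, 1430);
        index.addOpeningHours(1, "94619", "Monday", 1130, 1430);
        index.addOpeningHours(1, "94619", "Monday", 1730, 2130);
        index.addOpeningHours(1, "94707", "Monday", 1130, 1430);
        index.addOpeningHours(1, "94707", "Monday", 1730, 2130);
        System.out.println(index.findAvailableRestaurants("94619", "Monday", 1815)); // [1]
    }
}
```

Only the entry 1730_1 has a closing time of 1815 or later, and its opening time 1730 is before 1815, so restaurant 1 (Ajanta) is returned.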

The future is polyglot

e.g. Netflix
• RDBMS
• SimpleDB
• Cassandra
• Hadoop/HBase

IEEE Software Sept/October 2010 - Debasish Ghosh / Twitter @debasishg
