4. Installs DB: Zoom in
● App installs -> writer: 2K write IOPS
● App events -> enrichments: 180K read IOPS
5. We need to replace Couchbase
● Scale out is out of control (45 r3.4xl instances)
● XDCR can’t keep up
● Daily backups take more than a day
● Paid support is not good enough
Chosen solution: Aerospike
6. Rules of the game: starting point
● # of records: ~ 2,000,000,000 (repl. factor 3)
● Read intensive: 180K read + 2K write IOPS
● # of machines: 45 r3.4xlarge
● Cost: ~ $30,000 per month
● Time frame: a few months.
Mission critical
7. Why is data migration harder than in a relational DB?
                | Relational DB | NoSQL DB
Data types      | conventional  | blob, JSON, specialized bins
Vendors         | a few         | 100+
Query language  | SQL           | N1QL, SQL, JavaScript, HTTP/REST (creating views, MapReduce style)
Schema          | Yes           | Maybe
Migration tools | Yes           | DIY
11. Easy part (Cont.)
● Write new installs/updates to NEW DB
● Read from BOTH DBs
● A fraction of records (not VIP)
● Do NOT delete yet!
● Have metrics on misses
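The dual-read stage above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the two `Map` objects stand in for the old and new databases, and `writeInstall`, `readInstall`, and the `metrics` counters are hypothetical names. Gating the rollout to a fraction of records is left out for brevity.

```javascript
// Hypothetical in-memory stand-ins for the OLD and NEW databases.
const oldDb = new Map([
  ["install:1", { app: "a" }],
  ["install:2", { app: "b" }],
]);
const newDb = new Map([["install:2", { app: "b" }]]);

// Counters for the "metrics on misses" bullet.
const metrics = { newHits: 0, newMisses: 0, oldHits: 0, notFound: 0 };

// New installs/updates go to the NEW DB only.
function writeInstall(key, record) {
  newDb.set(key, record);
}

// Reads try the NEW DB first and fall back to the OLD DB on a miss.
// Nothing is deleted from the OLD DB at this stage.
function readInstall(key) {
  if (newDb.has(key)) {
    metrics.newHits += 1;
    return newDb.get(key);
  }
  metrics.newMisses += 1;
  if (oldDb.has(key)) {
    metrics.oldHits += 1;
    return oldDb.get(key);
  }
  metrics.notFound += 1;
  return undefined;
}

writeInstall("install:3", { app: "c" });
const results = ["install:1", "install:2", "install:3", "install:4"].map(readInstall);
console.log(metrics);
```

A miss rate on the NEW DB that trends toward zero is one possible signal that the remaining data is either migrated or no longer read.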
13. Is that it?
If your data is short-lived, you are mostly done.
Otherwise...
14. `Manual` part: migrating the long tail
● Dumper: view (query) -> dump keys/records
● Kafka as a buffer
● Loader: write values (do not overwrite)
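The dumper/buffer/loader flow can be sketched as below. This is an assumption-laden illustration: an array stands in for the Kafka topic, `Map` objects stand in for the two databases, and `dump`, `load`, and `stats` are hypothetical names. The check-then-write here is not atomic; a real loader would presumably rely on a create-only write in the target DB.

```javascript
// Hypothetical in-memory stand-ins; a real setup would use the old DB's
// view query, a Kafka topic, and the new DB's client instead.
const oldDb = new Map([
  ["k1", { v: "old-1" }],
  ["k2", { v: "old-2" }],
  ["k3", { v: "old-3" }],
]);
// k2 was already written by live traffic, so the new DB holds fresher data.
const newDb = new Map([["k2", { v: "fresh-2" }]]);
const buffer = []; // stands in for the Kafka topic

// Dumper: walk the old DB and push key/record pairs into the buffer.
function dump() {
  for (const [key, record] of oldDb) {
    buffer.push({ key, record });
  }
}

// Loader: drain the buffer and write only keys that the new DB lacks.
function load() {
  const stats = { written: 0, skipped: 0 };
  while (buffer.length > 0) {
    const { key, record } = buffer.shift();
    if (newDb.has(key)) {
      stats.skipped += 1; // do not overwrite fresher data
    } else {
      newDb.set(key, record);
      stats.written += 1;
    }
  }
  return stats;
}

dump();
const stats = load();
console.log(stats, newDb.get("k2"));
```

Skipping existing keys is what keeps the backfill from clobbering records that live traffic has already updated in the new DB.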
15. Detour: DB views
Map function (JS):
function (doc, meta) {
  emit(meta.expiration, doc.id);  // key (sorting), value
}
● Record itself is NOT a part of the view
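To make the key/value roles concrete, here is a small simulation of how such a view could be built. It is only a sketch: `buildView` and the sample `records` are hypothetical, and `emit` is passed in as a third argument so the snippet runs outside the database (in the slide's map function it is a global).

```javascript
// Hypothetical documents with metadata, as a map function would receive them.
const records = [
  { meta: { id: "a", expiration: 300 }, doc: { id: "a", payload: "large blob A" } },
  { meta: { id: "b", expiration: 100 }, doc: { id: "b", payload: "large blob B" } },
  { meta: { id: "c", expiration: 200 }, doc: { id: "c", payload: "large blob C" } },
];

// Same body as the slide's map function, with emit injected for testing.
function mapFn(doc, meta, emit) {
  emit(meta.expiration, doc.id);
}

// Run the map function over every record and sort the emitted rows by key.
function buildView(items, fn) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const { doc, meta } of items) {
    fn(doc, meta, emit);
  }
  rows.sort((x, y) => x.key - y.key);
  return rows;
}

const rows = buildView(records, mapFn);
console.log(rows);
```

The rows carry only the emitted key and value, so a dumper that needs the full record still has to fetch it by ID.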
17. New Cluster
● # of records: ~ 2,000,000,000
● 2 AZs, repl. factor 2 in each one
● IOPS, same and growing
● # of machines: 2 x 5 i3.4xlarge
● Cost: ~ $10,000 per month (⅓)
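The ⅓ figure follows from the numbers on the starting-point slide (45 machines, ~30,000 $ per month) and this one. A quick check, using only those approximate figures and hypothetical variable names:

```javascript
// Approximate figures taken from the slides.
const oldCluster = { machines: 45, monthlyCostUsd: 30000 };
const newCluster = { machines: 2 * 5, monthlyCostUsd: 10000 };

// Cost of the new cluster as a fraction of the old one (about one third).
const costRatio = newCluster.monthlyCostUsd / oldCluster.monthlyCostUsd;

// Machine count of the new cluster as a fraction of the old one.
const machineRatio = newCluster.machines / oldCluster.machines;

// Absolute monthly difference in USD.
const monthlySavingsUsd = oldCluster.monthlyCostUsd - newCluster.monthlyCostUsd;

console.log(costRatio.toFixed(2), machineRatio.toFixed(2), monthlySavingsUsd);
```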
18. What have we learned?
1. Plan, Plan, Plan.
2. Go easy: test a sample before you start (1 app / API / customer).
3. Breathe (read: test) between stages.
4. Do NOT delete until you have to.
5. Slower -> safer (-> $$$).