What's new in Redis v3.2

What's new in v3.2
#RedisTLV, Jan 21st 2016

v3.2RC1 - TL;DR
~/src/redis$ git rev-list 3.0..3.2 --count
1606
~/src/redis$ git diff 3.0..3.2 --shortstat
262 files changed, 46931 insertions(+), 28720 deletions(-)

Today's meetup: ~6.379 topics
1. Geospatial indices
2. Quadtree & redimension
3. Internals deep dive
4. Effects-based replication
5. Security & `protected-mode`
6. Redis Lua debugger

About our sponsor
- Home of open source Redis
- The commercial provider for managed and
downloadable Redis solutions
- HQ @ Mountain View, R&D @ Italy & Israel
- ~70 employees & growing fast - we're hiring :)

The newsletter: Redis Watch
http://bit.ly/RedisWatch

Geospatial indices redhistory
OSS crossover:
Redis 2.x -> ardb -> 2D spatial index ->
Matt Stancliff @mattsta ->
Salvatore Sanfilippo @antirez ->
Redis 3.2

The Geohash
- A geocoding system, hierarchical spatial grid
- The hash value maps to a location (lon, lat)
- Is usually base-32 encoded (e.g. sv8y8v66bt0)
- By Gustavo Niemeyer, in the public domain
since 28 Feb. 2009

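[Figure: world map split into geohash quadrants labelled with 2-bit prefixes (e.g. 10, 11). Longitude runs from -180° to 180° (prime meridian = 0°), latitude from -85.05112878° to 85.05112878° (equator = 0°).]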

Geospatial Indices in Redis
- Redis' Geohashes are 52-bit integers (~0.6m resolution)
- Redis' Sorted Sets' scores are IEEE 754 double-precision
floats, i.e. 53 bits of integer precision…
BINGO!
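Why 52 bits is a "BINGO!" can be checked with a small sketch. It assumes 26 bits per axis and the coordinate limits from the previous slide; the bit order and the names `quantize`, `interleave` and `geohash52` are illustrative assumptions, not Redis' actual code.

```python
# Assumed limits (from the geohash slide) and 26 bits per axis = 52 bits in total.
LON_MIN, LON_MAX = -180.0, 180.0
LAT_MIN, LAT_MAX = -85.05112878, 85.05112878
STEPS = 26


def quantize(value, lo, hi, steps=STEPS):
    """Map value in [lo, hi] to an integer cell index in [0, 2**steps)."""
    cell = int((value - lo) / (hi - lo) * (1 << steps))
    return min(cell, (1 << steps) - 1)  # clamp the upper edge into the last cell


def interleave(x, y, steps=STEPS):
    """Interleave bits: x takes the even positions, y the odd ones."""
    out = 0
    for i in range(steps):
        out |= ((x >> i) & 1) << (2 * i)
        out |= ((y >> i) & 1) << (2 * i + 1)
    return out


def geohash52(lon, lat):
    """Hypothetical helper: 52-bit integer hash of a (lon, lat) pair."""
    return interleave(quantize(lat, LAT_MIN, LAT_MAX),
                      quantize(lon, LON_MIN, LON_MAX))


h = geohash52(34.84076, 32.10942)
fits_in_double = int(float(h)) == h                 # no precision lost as a score
too_big_survives = int(float(2**53 + 1)) == 2**53 + 1  # beyond 53 bits it breaks
cell_width_m = 40_075_000 / 2**STEPS                # equator length / longitude cells
```

With 26 longitude steps, one cell at the equator is roughly 0.6m wide, which is where the slide's resolution figure comes from.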

New! GEO API, part 1
Add a point:
GEOADD key longitude latitude member [...]
Get longitude & latitude / geohash:
GEOPOS|GEOHASH key member [...]
Get the distance between two points:
GEODIST key member1 member2 [unit]

GEO API - "Demo"
127.0.0.1:6379> GEOADD g 34.84076 32.10942 RL@TLV
(integer) 1
127.0.0.1:6379> GEOADD g -122.0678325 37.3775256 RL@MV
(integer) 1
127.0.0.1:6379> GEODIST g RL@TLV RL@MV km
"11928.692170353959"
127.0.0.1:6379> GEOADD g 34.8380433 32.1098095 Hudson
(integer) 1
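The GEODIST reply can be sanity-checked offline with the haversine formula. This is a sketch: the Earth radius constant is an assumption about what the server uses, and the result is only expected to land close to the demo's reply, not to match it digit for digit.

```python
import math

# Assumption: a mean Earth radius in meters; the exact constant Redis uses
# is not shown in the slides.
EARTH_RADIUS_M = 6372797.560856


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    u = math.sin((lat2 - lat1) / 2)
    v = math.sin((lon2 - lon1) / 2)
    a = u * u + math.cos(lat1) * math.cos(lat2) * v * v
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) / 1000


# Same coordinates as RL@TLV and RL@MV in the demo above.
distance_km = haversine_km(34.84076, 32.10942, -122.0678325, 37.3775256)
```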

Size (almost) doesn't matter
127.0.0.1:6379> GEOHASH g RL@TLV Hudson
1) "sv8y8v66bt0"
2) "sv8y8v2m1n0"
- Shorter hashes -> "same" location, bigger area
- Close spatial proximity usually means a
shared hash prefix
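The prefix property is easy to see with a textbook geohash encoder. This sketch uses the standard algorithm (latitude range -90° to 90°, alternating longitude and latitude bits); the helper names are made up, and the last characters may differ from what GEOHASH prints because Redis only has 52 bits to work with.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lon, lat, length=10):
    """Textbook geohash: bisect lon/lat alternately, 5 bits per character."""
    lon_rng, lat_rng = [-180.0, 180.0], [-90.0, 90.0]
    bits, chars, use_lon = [], [], True
    while len(chars) < length:
        rng, val = (lon_rng, lon) if use_lon else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        if len(bits) == 5:
            chars.append(BASE32[int("".join(map(str, bits)), 2)])
            bits = []
    return "".join(chars)


def common_prefix(a, b):
    """Number of leading characters two hashes share."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n


tlv = geohash(34.84076, 32.10942)        # RL@TLV
hudson = geohash(34.8380433, 32.1098095)  # a few hundred meters away
mv = geohash(-122.0678325, 37.3775256)    # RL@MV, another continent
```

Comparing `common_prefix(tlv, hudson)` with `common_prefix(tlv, mv)` shows the effect: neighbours share a long prefix, far-away points share none. "Usually" matters, though: two points on opposite sides of a cell border can be close and still share a short prefix.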

New! GEO API, part 2
Search for members in a radial area:
GEORADIUS key longitude latitude radius unit ...
GEORADIUSBYMEMBER key member radius unit ...
Overthrows ZREVRANGEBYSCORE!!! #RedisTrivia
Delete a point - no GEOREM for you:
ZREM key member [...]

Multi-dimensional queries
SELECT id
FROM users
WHERE age BETWEEN 25 AND 35 AND
salary BETWEEN 250 AND 350
http://stackoverflow.com/questions/32911604/intersection-of-two-or-more-sorted-sets

The Redis way - "ZQUERY"
ZUNIONSTORE t 1 age WEIGHTS 1
ZREMRANGEBYSCORE t -inf (25
ZREMRANGEBYSCORE t (35 +inf
ZINTERSTORE t 2 t salary WEIGHTS 0 1
ZRANGEBYSCORE t 250 350
DEL t
Works, but not too efficient.
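To see what each ZQUERY step does (and why it touches far more data than it returns), here is a plain-Python model with dicts standing in for the Sorted Sets. The function name and sample data are hypothetical.

```python
def zquery(age, salary, age_min, age_max, sal_min, sal_max):
    """Model of the ZQUERY steps; `age` and `salary` map member -> score."""
    # ZUNIONSTORE t 1 age WEIGHTS 1: copy the whole age index.
    t = dict(age)
    # ZREMRANGEBYSCORE t -inf (min / t (max +inf: keep min <= age <= max.
    t = {m: s for m, s in t.items() if age_min <= s <= age_max}
    # ZINTERSTORE t 2 t salary WEIGHTS 0 1: common members, score = salary.
    t = {m: salary[m] for m in t if m in salary}
    # ZRANGEBYSCORE t min max: members ordered by score, then by name.
    ordered = sorted(t.items(), key=lambda kv: (kv[1], kv[0]))
    return [m for m, s in ordered if sal_min <= s <= sal_max]


age = {"u1": 30, "u2": 40, "u3": 26, "u4": 35}
salary = {"u1": 300, "u2": 300, "u3": 200, "u4": 250}
hits = zquery(age, salary, 25, 35, 250, 350)
```

The first step alone copies every member of the age index, regardless of how selective the query is.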

Would indexing the data help?
rqtih.lua: Another Redis Way
https://gist.github.com/itamarhaber/c1ffda42d86b314ea701
rqtih.lua is about 32.5 times faster than
ZQUERY on 100K users (age & salary)

rqtih.lua?!? A PoC for
R - Redis, duh
QT - Quadtree
IH - In Hash
.LUA - "object oriented", JiT reads, delayed
writes

Trillustration
[Figure: points a to i on a recursively subdivided plane, next to the resulting tree with levels (a), (d, b, f) and (h, g, c, e, i)]
{
/ : x, y, w, h, {a}
/00/ : x, y, w, h, {d}
/01/ : x, y, w, h, {b}
...}
* Node capacity = 1
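The "quadtree in a hash" idea can be sketched in a few lines. This is a toy model under assumptions, not rqtih.lua itself: a Python dict plays the Redis Hash, node paths follow the `/00/` style shown above, and the class and field names are invented.

```python
class QuadtreeInHash:
    """Toy quadtree kept as a flat dict: path -> node (default capacity 1)."""

    def __init__(self, x, y, w, h, capacity=1):
        self.capacity = capacity
        self.nodes = {"/": self._node(x, y, w, h)}

    @staticmethod
    def _node(x, y, w, h):
        return {"x": x, "y": y, "w": w, "h": h, "points": {}, "leaf": True}

    def _split(self, path):
        """Create the four children of a node, one per quadrant."""
        n = self.nodes[path]
        hw, hh = n["w"] / 2, n["h"] / 2
        for bx in (0, 1):
            for by in (0, 1):
                self.nodes[f"{path}{bx}{by}/"] = self._node(
                    n["x"] + bx * hw, n["y"] + by * hh, hw, hh)
        n["leaf"] = False

    def insert(self, name, px, py):
        """Store the point in the first node on its path that has room."""
        path = "/"
        while True:
            n = self.nodes[path]
            if len(n["points"]) < self.capacity:
                n["points"][name] = (px, py)
                return path
            if n["leaf"]:
                self._split(path)
            bx = "1" if px >= n["x"] + n["w"] / 2 else "0"
            by = "1" if py >= n["y"] + n["h"] / 2 else "0"
            path = f"{path}{bx}{by}/"

    def query(self, qx, qy, qw, qh, path="/"):
        """Names of points inside the rectangle, skipping non-overlapping nodes."""
        n = self.nodes[path]
        if (n["x"] > qx + qw or n["x"] + n["w"] < qx or
                n["y"] > qy + qh or n["y"] + n["h"] < qy):
            return []
        found = [name for name, (px, py) in n["points"].items()
                 if qx <= px <= qx + qw and qy <= py <= qy + qh]
        if not n["leaf"]:
            for suffix in ("00", "01", "10", "11"):
                found += self.query(qx, qy, qw, qh, f"{path}{suffix}/")
        return found


qt = QuadtreeInHash(0, 0, 100, 100)
paths = [qt.insert("a", 10, 10), qt.insert("d", 20, 20), qt.insert("b", 20, 80)]
nearby = qt.query(0, 0, 30, 30)
```

A range query only descends into quadrants that overlap the search rectangle, which is where the speedup over ZQUERY comes from.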

A new Redis data structure?
- Discussions in proximity to Redis Developers
Day 2015 (London)
- k-d tree: similar principles for k dimensions,
but complex complexity
- Outcomes: topics/indexes & experimental API
that uses existing data types (Zorted & Hash)

Redimension: k-d query API
@antirez's idea: interleave the dimensions, store
"score"+data in a Zorted Set for lexicographical
ranges, maintain a Hash for lookups
redimension.rb - implementation by @antirez
redimension.lua - port by @itamarhaber

Redimension "Demo"
~/src/lua-redimension$ redis-cli SCRIPT LOAD "$(cat redimension.lua)"
"4abdad23c459145cbd658c991c0c8ad93d984d91"
~/src/lua-redimension$ redis-cli EVALSHA 4abdad23c459145cbd658c991c0c8ad93d984d91 0
1) "KEYS[1] - index sorted set key"
2) "KEYS[2] - index hash key"
3) "ARGV[1] - command. Can be:"

Redimension "Demo", 2
4) " create - create an index with ARGV[2] as dimension and ARGV[3] as precision"
5) " drop - drops an index"
6) " index - index an element ARGV[2] with ARGV[3]..ARGV[3+dimension] values"
7) " unindex - unindex an element ARGV[2] with ARGV[3]..ARGV[3+dimension] values"

Redimension "Demo", 3
9) " update - update an element ARGV[2] with ARGV[3]..ARGV[3+dimension] values"
10) " query - query using ranges ARGV[2], ARGV[3]..ARGV[2+dimension-1], ARGV[2+dimension]"
11) " fuzzy_test - fuzzily tests the library on ARGV[2] dimension with ARGV[3] items using ARGV[4] queries"

redimension.next()
- Currently just an experiment
- Many improvements still needed
- Planned to become a part of the core project
- Need more feedback WRT functionality & API
- Any ideas?

Internals deep dive
Oran Agra @RedisLabs

changes that made it (or didn’t) to OSS redis
● merged into 3.0
○ Fix a race condition in processCommand() with freeMemoryIfNeeded()
○ diskless replication fixes
○ psync fixes
○ fixes in LRU eviction (dict random keys during rehashing)
● merged into 3.2
○ sds optimizations
○ jemalloc size class optimization
● changes not merged yet
○ diskless slave replication
○ dict.c improvements
● other changes i didn’t get to push yet

nothing is user facing.
only optimizations and fixes 8-(

diskless replication
● how normal replication works.
master->fork->rdb on disk->main process streams to slave
slave->save to disk while serving clients->flushdb->load rdb
● disadvantages of diskless replication
○ slaves must connect together
○ slave side flush before RDB was fully received
○ on slow network, longer fork duration
● a word about fork() and CoW?

diskless replication benchmark (replication time)
two instances of r3.2xl (60GB ram, with 160GB SSD),
4,000,000 string keys of 1k random data.
(consuming 52GB of RAM), 19GB RDB file.
fully disk based: 513 seconds
only master diskless: 365 seconds
fully diskless: 231 seconds
only slave is diskless: 360 seconds

● we all know what fragmentation is
● history: on the search for the ultimate allocator
● how an allocator works (bins) to overcome that
● a word about virtual address space vs OS pages
○ RSS = VM pages mapped to physical RAM
● what’s internal fragmentation / used_memory (maxmemory) includes internal frag
● RSS = used_memory (+external frag)
○ external frag are unused bins, and pages

allocators
[Figure: a pool of 16-byte bins and a pool of 32-byte bins; requests of 22, 18, 17, 30 and 28 bytes each land in a 32-byte bin, and the unused remainder of each bin is internal fragmentation]

adding a 24-byte bin pool to jemalloc
used by: dictEntry, listNode, etc
redis-cli debug populate 10000000
original code's used_memory: 1,254,709,872
with patch used_memory: 1,094,714,048
memory optimization: ~13% less memory (the original uses ~14.6% more than the patched build)
size classes: 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, …
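The effect of the extra bin can be estimated with a tiny model of size classes. One assumption to flag: the "before" list below is the slide's list with the 24, 40 and 56 classes removed, which is a guess at the unpatched layout; the function names are invented.

```python
import bisect


def bin_for(size, classes):
    """Smallest size class that can hold `size` bytes (classes must be sorted)."""
    return classes[bisect.bisect_left(classes, size)]


def internal_frag(sizes, classes):
    """Total bytes wasted inside bins for the given allocation sizes."""
    return sum(bin_for(s, classes) - s for s in sizes)


# Assumption: size classes without the 8-byte-step additions.
before = [8, 16, 32, 48, 64, 80, 96]
# Size classes as listed on the slide.
after = [8, 16, 24, 32, 40, 48, 56, 64, 80, 96]

requests = [22, 18, 17, 30, 28]  # the allocation sizes from the bins figure
wasted_before = internal_frag(requests, before)
wasted_after = internal_frag(requests, after)
```

Only requests of 17 to 24 bytes benefit from the new bin, but structures such as dictEntry and listNode fall right into that range, so the saving adds up across millions of keys. Sizes above the largest listed class are outside this toy model.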

[Figure: physical RAM (4k pages) holding kernel pages, FS cache, pages of process 1 (p1), pages of process 2 (p2) and pages shared by both (p1+p2); next to it, the virtual address spaces of process 1 and process 2, in which many ranges are unmapped]

[Figure: a row of 4k pages, one of them marked "can be returned to os (won’t be rss anymore)"]

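old sds header
[Figure: memory layout of len (4 bytes, used), free (4 bytes, unused) and the buffer "ABCDEF" plus terminator; the char* points at the start of the buffer]
● grows in place (sometimes no need for realloc)
○ although realloc itself may be a no-op instead of returning a new pointer and doing a memcpy
● no need for strlen (search for null terminator)
● can be used in normal string functions like printf
struct sdshdr {
    unsigned int len;   /* used bytes */
    unsigned int free;  /* unused bytes available after the string */
    char buf[];
};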

new sds header
[Figure: header layouts, each followed by the buffer "ABCDEF". old sds header: 4 bytes (used) + 4 bytes (free). type5bit: 5 bits (used) + 3 bits. type8bit: 1 byte (used) + 1 byte (allocated) + 1 byte. type16bit: 2 bytes + 2 bytes + 1 byte. type32bit and type64bit are listed without sizes.]
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
/* sdshdr16, sdshdr32 and sdshdr64 have the same fields as sdshdr8, with wider len and alloc */

sds size classes
debug populate 10000000
used_memory of original code: 1,254,709,872
used_memory with new code: 1,078,723,024
memory optimization: ~14% less memory (the original uses ~16% more than the new code)
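Where the saving comes from can be sketched by picking the smallest header that fits a string. The thresholds and the header sizes for the 32-bit and 64-bit types are assumptions extrapolated from the packed layouts above (two length fields plus one flags byte); the function name is made up.

```python
OLD_HEADER_BYTES = 8  # old sdshdr: two 4-byte unsigned ints (len, free)


def sds_header(length):
    """Return (type name, header size in bytes) for a string of `length` bytes."""
    if length < 1 << 5:
        return "sdshdr5", 1    # length packed into the flags byte
    if length < 1 << 8:
        return "sdshdr8", 3    # 1 + 1 + 1
    if length < 1 << 16:
        return "sdshdr16", 5   # 2 + 2 + 1
    if length < 1 << 32:
        return "sdshdr32", 9   # assumed: 4 + 4 + 1
    return "sdshdr64", 17      # assumed: 8 + 8 + 1


name, header_bytes = sds_header(len("ABCDEF"))
saved_per_string = OLD_HEADER_BYTES - header_bytes
```

For short keys and values, which dominate a `debug populate` dataset, the header shrinks from 8 bytes to 1 or 3, and the smaller total can also drop the allocation into a smaller bin.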

Script replication before v3.2
Lua scripts are pushed down to the slaves for
local execution. This reduces wire traffic in
cases such as:
for i = 1, 1000000 do
redis.call('LPUSH', KEYS[1], i)
end

Script replication caveats
Compute-intensive scripts (e.g. ZQUERY) waste
CPU time because they are run:
- 1+number of slaves times: wasteful
- When recovering from AOF: really bad
And then there's also...

Free will
vs.
> EVAL "redis.call('SET', KEYS[1], redis.call('TIME')[1])" 1 foo
(error) ... Write commands not allowed after non deterministic commands

Script replication in v3.2
- Same defaults
- NEW! redis.replicate_commands()
causes the script's effects to be replicated
- NEW! redis.set_repl(...)
redis.REPL_[ALL|NONE|AOF|SLAVE]
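The difference between replicating the script and replicating its effects can be shown with a toy model. Nothing here is Redis code: `Node`, its two commands and the clock values are hypothetical stand-ins, chosen to mirror the TIME example above.

```python
class Node:
    """Toy key-value node whose TIME command depends on the local clock."""

    def __init__(self, now):
        self.now = now
        self.data = {}

    def call(self, command, *args):
        if command == "TIME":
            return self.now
        if command == "SET":
            key, value = args
            self.data[key] = value
            return "OK"
        raise ValueError(f"unsupported command: {command}")


def script(node, key, log=None):
    """Equivalent of SET key <current time>; writes are optionally logged."""
    value = node.call("TIME")
    node.call("SET", key, value)
    if log is not None:
        log.append(("SET", key, value))


master = Node(now=100)
replica_a, replica_b = Node(now=105), Node(now=105)

# Script replication: the replica re-runs the script with its own clock.
script(master, "foo")
script(replica_a, "foo")

# Effects replication: the replica applies only the writes the master produced.
effects = []
script(master, "foo", log=effects)
for command, *args in effects:
    replica_b.call(command, *args)
```

`replica_a` ends up with a different value than the master, while `replica_b` matches it, and the replica that received effects never had to repeat the script's computation.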

Effect-based replication uses
Any ideas?

"A few things about Redis security"
"The Redis security model is: it’s
totally insecure to let untrusted
clients access the system, please
protect it from the outside world yourself...
Let’s crack Redis for fun and no profit…"
HOWTO: http://antirez.com/news/96

The totally unexpected result
Script kiddies, cybercriminals and white hackers

3 critical points about security
Honesty is always the best option. That said:
1. Never leave an unprotected server open to
the outside world
2. If your server has been compromised, burn it
3. Always read the documentation

NEW! protected-mode directive
Enabled by default -> a breaking upgrade!
When (protected-mode && !requirepass && !bind):
- Allow only 127.0.0.1, ::1 or socket connections
- DENY (with the longest message ever!) others
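The rule above fits in a few lines. This is a model of the decision as stated on the slide, not the server's code; the function and parameter names are invented, and only the two loopback addresses named on the slide are treated as local.

```python
LOOPBACK = {"127.0.0.1", "::1"}  # the addresses named on the slide


def accepts_connection(client_addr, protected_mode=True, requirepass=None,
                       bind=None, unix_socket=False):
    """Return True if the connection would be served, False if it gets -DENIED."""
    restricted = protected_mode and not requirepass and not bind
    if not restricted:
        return True  # any one of the three settings lifts the restriction
    return unix_socket or client_addr in LOOPBACK


default_remote = accepts_connection("10.0.0.5")
default_local = accepts_connection("127.0.0.1")
with_password = accepts_connection("10.0.0.5", requirepass="s3cret")
```

Note the "breaking upgrade" part: a server that previously ran with no bind and no password and served remote clients starts denying them after the upgrade, until one of the three settings changes.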

Protection in Action - "Demo"
-DENIED Redis is running in protected mode because protected mode is enabled, no
bind address was specified, no authentication password is requested to clients. In
this mode connections are only accepted from the loopback interface. If you want to
connect from external computers to Redis you may adopt one of the following
solutions: 1) Just disable protected mode sending the command 'CONFIG SET
protected-mode no' from the loopback interface by connecting to Redis from the same host the
server is running, however MAKE SURE Redis is not publicly accessible from internet
if you do so. Use CONFIG REWRITE to make this change permanent. 2) Alternatively you
can just disable the protected mode by editing the Redis configuration file, and
setting the protected mode option to 'no', and then restarting the server. 3) If you
started the server manually just for testing, restart it with the '--protected-mode
no' option. 4) Setup a bind address or an authentication password. NOTE: You only
need to do one of the above things in order for the server to start accepting
connections from the outside.

NEW! Integrated Lua debugger
- Step-by-step journey through history
- LDB: SCRIPT DEBUG yes/sync/no
- Demo: redis-cli, ZeroBrane Studio IDE plugin
http://redis.io/topics/ldb
https://redislabs.com/blog/zerobrane-studio-plugin-for-redis-lua-scripts
