Aerospike Data Modeling - Meetup Dec 2019

This presentation was given by Ronen Botzer, Director of Product at Aerospike, at the IronSource meetup in Israel (December 2019).

In the talk, Ronen explains how to use nested CDTs and bitwise operations to model user segmentation data.

  1. 1. Aerospike Modeling User Segmentation with Maps and Bitfields
  2. 2. 2 Proprietary & Confidential | All rights reserved. © 2019 Aerospike Inc.
  3. 3. tl;dr
     ▪ Cassandra (C*) databases, including derivatives such as ScyllaDB, have a needle-in-a-haystack problem
     ▪ In C*, each (user ID, segment ID) pair is stored in its own row
     ▪ This hurts performance when you need low-latency key-value operations
     ▪ In Aerospike, all of a user's segments are kept together in a single record
  4. 4. User Profile Stores
     ▪ In digital advertising, user profile stores assist with audience segmentation
     ▪ The goal is to pull the segments for a specific user as fast as possible
     ▪ The modeling for this use case applies generally to other forms of online personalization
  5. 5. Modeling in Cassandra
     CREATE TABLE userspace.user_segments (
       user_id uuid,
       segment_id int,
       attr smallint,
       attr2 smallint,
       PRIMARY KEY ((user_id, segment_id))
     )
     ▪ On average 1000 segments per profile
     ▪ 50 billion cookies means 50 trillion rows
     ▪ High latency to find the 1000 segments of a user among a huge number of rows
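
The row count on this slide follows from a single multiplication. A quick check in Python, using only the figures quoted above (the variable names are illustrative):

```python
# Figures quoted on the slide.
cookies = 50_000_000_000        # 50 billion user profiles (cookies)
segments_per_profile = 1_000    # average number of segments per profile

# One row per (user_id, segment_id) pair in the Cassandra model.
cassandra_rows = cookies * segments_per_profile

# One record per user in the single-record model the talk proposes.
aerospike_records = cookies

print(f"{cassandra_rows:,} rows vs {aerospike_records:,} records")
```
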
  6. 6. Modeling in Aerospike
     {segmentID: [segment-TTL, {attr1, attr2}]}
     {
       8457: [8889*, {}],
       12845: [8889, {}],
       42199: [8889, {}],
       43696: [8889, {}],
     }
     ▪ * Segment TTL uses a local epoch (hours since that epoch)
     ▪ The map ordering options are UNORDERED, K-ORDERED and KV-ORDERED
     ▪ Choosing K-ORDERED gives the best performance for data on SSD
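
A minimal pure-Python sketch of this record layout, with no Aerospike client involved. The slide does not say which date the local epoch is, so `LOCAL_EPOCH` below is a hypothetical choice, picked only so that the example reproduces the value 8889 shown above. The helper names are likewise illustrative.

```python
from datetime import datetime, timezone
from typing import Optional

# Hypothetical local epoch; the slide does not state the actual date.
LOCAL_EPOCH = datetime(2019, 1, 1, tzinfo=timezone.utc)

def hours_since_local_epoch(moment: datetime) -> int:
    """Whole hours elapsed between LOCAL_EPOCH and `moment`."""
    return int((moment - LOCAL_EPOCH).total_seconds() // 3600)

def make_segment_entry(expires_at: datetime, attrs: Optional[dict] = None) -> list:
    """Build the [segment-TTL, {attrs}] value stored under a segment ID."""
    return [hours_since_local_epoch(expires_at), dict(attrs or {})]

# One user's segments: {segmentID: [segment-TTL, {attrs}]}
expiry = datetime(2020, 1, 6, 9, tzinfo=timezone.utc)
user_segments = {
    8457: make_segment_entry(expiry),
    12845: make_segment_entry(expiry),
}
print(user_segments)
```

Storing hours since a recent epoch rather than a Unix timestamp keeps each TTL a small integer, which matters when a record holds around a thousand entries.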
  7. 7. Advantages
     ▪ We can easily upsert new user segments into the map as they are processed (https://github.com/aerospike-examples/modeling-user-segmentation)
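
The upsert semantics can be modeled with a plain Python dict. This is a sketch of the behavior only, not the client API: in Aerospike the equivalent is a map put operation executed on the server, and `upsert_segment` is a hypothetical helper name.

```python
def upsert_segment(segments: dict, segment_id: int, ttl_hours: int, attrs=None) -> None:
    """Insert the segment if it is new, otherwise overwrite its TTL and attributes."""
    segments[segment_id] = [ttl_hours, dict(attrs or {})]

profile = {8457: [8889, {}], 12845: [8889, {}]}

upsert_segment(profile, 42199, 8889)   # new segment: inserted
upsert_segment(profile, 8457, 8913)    # existing segment: TTL refreshed

# A K-ORDERED map keeps entries sorted by key; sorted() stands in for that here.
for segment_id in sorted(profile):
    print(segment_id, profile[segment_id])
```
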
  8. 8. Advantages
     ▪ We can use the map get_by_value_interval operation to filter segments by a specific ‘freshness’
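
A pure-Python stand-in for the freshness filter. It assumes the interval is inclusive at the start and exclusive at the end, and it compares only the TTL element of each `[segment-TTL, {attrs}]` value, which is a simplification of comparing whole list values. The function below is a local model that borrows the operation's name, not the client call.

```python
def get_by_value_interval(segments: dict, ttl_begin: int, ttl_end: int) -> dict:
    """Return the entries whose TTL satisfies ttl_begin <= TTL < ttl_end."""
    return {
        segment_id: value
        for segment_id, value in segments.items()
        if ttl_begin <= value[0] < ttl_end
    }

profile = {
    8457: [8889, {}],
    12845: [8700, {}],   # older than the window below
    42199: [9000, {}],
}

# Keep only segments whose TTL is at or after hour 8889.
fresh = get_by_value_interval(profile, 8889, 9001)
print(sorted(fresh))
```
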
  9. 9. Advantages
     ▪ We can use the map remove_by_value_interval operation to trim expired segments
     ▪ Above all, this makes retrieving a user's segments from the user profile store orders of magnitude faster. Just get the record.
  10. 10. Advantages
      ▪ We can use the map remove_by_value_interval operation to trim expired segments, invoked as a background scan operation (>= 4.7)
      ▪ Above all, this makes retrieving a user's segments from the user profile store orders of magnitude faster
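
A sketch of the trimming step in plain Python, assuming the stored TTL is the hour at which a segment expires. The loop over `profiles` merely stands in for a background scan applying the operation to every record; the user IDs and the helper are hypothetical, and the function is a local model rather than the client call.

```python
def remove_by_value_interval(segments: dict, ttl_begin: int, ttl_end: int) -> int:
    """Delete entries with ttl_begin <= TTL < ttl_end and return how many were removed."""
    expired = [
        segment_id
        for segment_id, (ttl, _attrs) in segments.items()
        if ttl_begin <= ttl < ttl_end
    ]
    for segment_id in expired:
        del segments[segment_id]
    return len(expired)

# Hypothetical profiles keyed by user ID.
profiles = {
    "user-a": {8457: [8889, {}], 12845: [8700, {}]},
    "user-b": {42199: [8650, {}], 43696: [8888, {}]},
}

now_hours = 8889  # current time, in hours since the local epoch

# Trim everything that expired before now, one record at a time.
removed = {
    user_id: remove_by_value_interval(segments, 0, now_hours)
    for user_id, segments in profiles.items()
}
print(removed)
```
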
  11. 11. Bitwise Operations
      Bitwise operations supported by the server. Method names in the clients might be different.
      • General write flags: create_only, update_only, no_fail, partial
      • resize()
      • insert(), remove(), set()
      • or(), and(), xor(), not()
      • lshift(), rshift()
      • add(), subtract(), set-integer()
      • get(), count()
      • lscan(), rscan()
      • get-integer()
  12. 12. Modeling with Bitfields
      ▪ Represent the segments as one contiguous bitfield
      ▪ Each integer segment ID is a bit position. Set the bit for every segment the user is in
      ▪ Use bitwise operations to check server-side whether the user is in multiple segments
      ▪ Compresses extremely well in Enterprise Edition
      ▪ Caveat: can't apply a TTL to individual segments
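
The bitfield model can be sketched with a Python `bytearray`. This is a local model, not the server-side operations: the field size `MAX_SEGMENTS`, the helper names, and the choice to count bit offsets from the most significant bit of the first byte are all assumptions of this sketch.

```python
MAX_SEGMENTS = 65_536  # hypothetical upper bound on segment IDs

def new_bitfield() -> bytearray:
    """An all-zero bitfield with one bit per possible segment ID."""
    return bytearray(MAX_SEGMENTS // 8)

def set_segment(bits: bytearray, segment_id: int) -> None:
    """Set the bit at offset `segment_id`, counting from the leftmost bit."""
    byte_index, bit_in_byte = divmod(segment_id, 8)
    bits[byte_index] |= 0x80 >> bit_in_byte

def in_segment(bits: bytearray, segment_id: int) -> bool:
    """True if the bit for `segment_id` is set."""
    byte_index, bit_in_byte = divmod(segment_id, 8)
    return bool(bits[byte_index] & (0x80 >> bit_in_byte))

def in_all_segments(bits: bytearray, segment_ids) -> bool:
    """AND the user's bitfield with a mask and check that every mask bit survives."""
    mask = new_bitfield()
    for segment_id in segment_ids:
        set_segment(mask, segment_id)
    return all((b & m) == m for b, m in zip(bits, mask))

def count_segments(bits: bytearray) -> int:
    """Number of set bits, i.e. the number of segments the user is in."""
    return sum(bin(b).count("1") for b in bits)

user_bits = new_bitfield()
for segment_id in (8457, 12845, 42199):
    set_segment(user_bits, segment_id)

print(in_all_segments(user_bits, [8457, 42199]), count_segments(user_bits))
```

A user with three segments occupies the same 8 KiB here as a user with a thousand, which is why the compression point on the slide matters for sparse profiles.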
  13. 13. More material you can explore:
      List & Map API
      ▪ https://www.aerospike.com/docs/guide/cdt-list.html
      ▪ https://www.aerospike.com/docs/guide/cdt-map.html
      ▪ https://www.aerospike.com/docs/guide/cdt-context.html
      ▪ https://www.aerospike.com/docs/guide/cdt-ordering.html
      ▪ https://aerospike-python-client.readthedocs.io/en/latest/aerospike_helpers.operations.html
      ▪ https://www.aerospike.com/apidocs/java/com/aerospike/client/cdt/ListOperation.html
      Code Samples
      ▪ https://github.com/aerospike-examples/modeling-user-segmentation
      Aerospike Training
      ▪ https://www.aerospike.com/training/
  14. 14. Thank You! Any questions? ronen@aerospike.com
