
An introduction to Pincaster

An introduction to the Pincaster nosql data store.

Published in: Technology

  • An introduction to Pincaster

    1. This is just a kiwi, what were you thinking about?
    2. What is Pincaster?
       • Not Only SQL (well, not at all)
       • Not only a database for geolocalized apps
       • Not only a key/value store
       • Not only a persistent database
       • Not only a non-relational database
       • Not only a database.
    3. Speaks HTTP / JSON
       • Adds a slight overhead over a binary protocol, but makes it dead simple to use in almost any language and environment.
       • Pincaster is fast and lightweight no matter what. And there's plenty of room for optimization. HTTP keep alive is supported.
       • Written in C. Runs on OSX / Linux / *BSD and doesn't require any external dependency. Valgrind doesn't cry. BSD license.
       • Event-driven model with asynchronous workers. Powered by the awesome libevent2 library.
       • And shamelessly reuses (in a different implementation) some of the nice concepts from Redis. Why not?
    4. Threads
       [Diagram labels: HTTP server, OpReply queue, Op queue, Domain handler, Workers, Zero copy, Journal rewriter (fork()ed)]
    5. Layers
       • A layer is like a database, identified by a unique name.
       • A layer contains a set of records.
       • Layers are independent and can have different settings.
       • Layers can be created / deleted online.
       • Layers can mix different types of records.
    6. Void records
       • Just unique keys.
       • Fast and memory efficient.
       • Serialized data can be embedded in keys.
       • Useful as flags and in range queries.
    7. Hashes
       [Diagram: a key mapped to several properties, each with a binary-safe value]
    8. Atomic operations
       No transaction, but multiple changes can be combined as a single atomic operation:
       • Add new properties
       • Update properties
       • Delete properties
       • Change special properties
       • Increment/decrement counters which are automatically created if needed.
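
The all-or-nothing behaviour described on the atomic operations slide can be pictured with a small in-memory model. This is only a sketch of the semantics, not Pincaster's implementation or its HTTP API: the function name and the `set` / `delete` / `incr` operation names are hypothetical.

```python
def apply_atomically(record, changes):
    """Apply a batch of changes to a record as one all-or-nothing step.

    `changes` is a list of (op, name, value) tuples. The op names are
    hypothetical and only mirror the bullet points of the slide.
    """
    draft = dict(record)  # work on a copy so a failure leaves `record` untouched
    for op, name, value in changes:
        if op == "set":        # add a new property or update an existing one
            draft[name] = value
        elif op == "delete":   # deleting a missing property is a no-op here
            draft.pop(name, None)
        elif op == "incr":     # counters are created on first use
            draft[name] = int(draft.get(name, 0)) + value
        else:
            raise ValueError(f"unknown operation: {op}")
    return draft  # the caller swaps this in as the new version of the record


record = {"name": "Donald Duck", "tmp": "x"}
updated = apply_atomically(record, [
    ("set", "city", "Duckburg"),
    ("delete", "tmp", None),
    ("incr", "visits", 1),
    ("incr", "visits", 2),
])
```

Either every change in the batch shows up in `updated`, or an exception is raised and the original record is left as it was.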
    9. Points
       [Diagram: a key mapped to a location, given as latitude, longitude or x, y]
       Location is set through the special _loc property.
       Indexed with quad-trees.
       Space efficient: points are grouped into buckets.
       Designed for dynamic data like geolocalized applications.
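
To make "quad-trees with buckets" concrete, here is a minimal point quadtree in which each leaf holds a small bucket of points and splits into four quadrants when the bucket overflows. It is a textbook-style sketch, not Pincaster's C implementation; the class name, `bucket_size` and `MAX_DEPTH` are assumptions made for the example.

```python
MAX_DEPTH = 16  # stop splitting here so many identical points cannot recurse forever


class QuadTree:
    """Point quadtree whose leaves hold small buckets of points."""

    def __init__(self, x0, y0, x1, y1, bucket_size=4, depth=0):
        self.bounds = (x0, y0, x1, y1)
        self.bucket_size = bucket_size
        self.depth = depth
        self.points = []      # (x, y, key) tuples while this node is a leaf
        self.children = None  # four sub-quadrants once the bucket overflows

    def insert(self, x, y, key):
        x0, y0, x1, y1 = self.bounds
        if not (x0 <= x <= x1 and y0 <= y <= y1):
            return False  # the point does not belong to this quadrant
        if self.children is None:
            self.points.append((x, y, key))
            if len(self.points) > self.bucket_size and self.depth < MAX_DEPTH:
                self._split()
            return True
        # any() stops at the first child that accepts the point
        return any(child.insert(x, y, key) for child in self.children)

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        args = (self.bucket_size, self.depth + 1)
        self.children = [
            QuadTree(x0, y0, mx, my, *args), QuadTree(mx, y0, x1, my, *args),
            QuadTree(x0, my, mx, y1, *args), QuadTree(mx, my, x1, y1, *args),
        ]
        points, self.points = self.points, []
        for point in points:  # push the bucket's content down one level
            self.insert(*point)

    def query_rect(self, qx0, qy0, qx1, qy1):
        x0, y0, x1, y1 = self.bounds
        if qx1 < x0 or qx0 > x1 or qy1 < y0 or qy0 > y1:
            return []  # the query rectangle misses this quadrant entirely
        if self.children is None:
            return [k for (x, y, k) in self.points
                    if qx0 <= x <= qx1 and qy0 <= y <= qy1]
        found = []
        for child in self.children:
            found.extend(child.query_rect(qx0, qy0, qx1, qy1))
        return found
```

Grouping points into buckets keeps the tree shallow: a quadrant is only subdivided once it actually holds more points than one bucket can store.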
    10. Layer types
        • Flat: rectangular area (x0, y0) - (x1, y1)
        • Flatwrap: rectangular area with wrapping.
        • Spherical / geoidal - WGS84 - GPS and map services friendly. Handle corner cases.
        • Pick your function for non-Euclidean distance computation: rhomboid, fast, great circle or haversine.
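
As an illustration of two of the distance notions above, the sketch below shows the standard haversine formula for a spherical layer and a Euclidean distance on a wrapping rectangle. The Earth radius constant and the reading of "wrapping" as "leaving one edge re-enters on the opposite edge" are assumptions; Pincaster's own functions may differ.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption, not Pincaster's constant


def haversine_m(lat0, lon0, lat1, lon1):
    """Great-circle distance in meters between two points given in degrees."""
    phi0, phi1 = math.radians(lat0), math.radians(lat1)
    dphi = phi1 - phi0
    dlmb = math.radians(lon1 - lon0)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi0) * math.cos(phi1) * math.sin(dlmb / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def flatwrap_distance(x0, y0, x1, y1, width, height):
    """Euclidean distance on a width x height rectangle that wraps on both axes."""
    dx = abs(x1 - x0) % width
    dy = abs(y1 - y0) % height
    dx = min(dx, width - dx)    # going around the edge may be shorter
    dy = min(dy, height - dy)
    return math.hypot(dx, dy)
```

On a flatwrap layer 10 units wide, for instance, the points x=1 and x=9 are 2 units apart rather than 8.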
    11. Simple spatial queries
        • Find points within a radius (Euclidean distance for flat/flatwrap or meters for spherical/geoidal layers).
        • Find points within a rectangle. Wraparound is properly handled. You can directly query according to the Google Maps viewing area.
        • Overflow is reported.
        • Clustering is on the way.
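
A rectangle query with wraparound can be sketched as follows. The example assumes that a rectangle whose west edge is greater than its east edge crosses the antimeridian, and that "overflow" means more matches than a result limit; both the `limit` parameter and the returned flag are hypothetical names, not Pincaster's response format.

```python
def in_rect(lat, lon, south, west, north, east):
    """True if (lat, lon) lies in the rectangle.

    west > east is taken to mean that the rectangle crosses the antimeridian.
    """
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lon <= east
    return lon >= west or lon <= east


def find_in_rect(points, south, west, north, east, limit=100):
    """Return (matching keys, overflow flag) for a dict of key -> (lat, lon)."""
    matches = [key for key, (lat, lon) in points.items()
               if in_rect(lat, lon, south, west, north, east)]
    return matches[:limit], len(matches) > limit


points = {
    "fiji": (-17.7, 178.0),
    "samoa": (-13.8, -172.1),
    "paris": (48.85, 2.35),
}
# A viewing area that spans the antimeridian: west edge 170, east edge -160.
keys, overflow = find_in_rect(points, -30, 170, 0, -160)
```

A linear scan is used here only to keep the example short; an index such as a quadtree would prune most of the points first.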
    12. Points+hashes
        [Diagram: a key with a location plus several properties, each with a binary-safe value]
        Spatial queries can optionally return properties.
    13. Expirable records
        • Records can automatically expire by setting an _expires_at property.
        • Expiration dates can be changed, removed and re-added at will.
        • Can act like a memcache speaking HTTP / JSON with a set of properties per record. Also useful for ephemeral geo data (e.g. when storing location of online users).
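
The expiration rule can be modelled in a few lines. The sketch assumes that `_expires_at` holds a Unix timestamp in seconds, which the slides do not state; the helper names are hypothetical.

```python
import time


def with_ttl(record, ttl_seconds, now=None):
    """Return a copy of the record that expires `ttl_seconds` from now."""
    now = time.time() if now is None else now
    updated = dict(record)
    updated["_expires_at"] = int(now + ttl_seconds)  # assumed: Unix timestamp
    return updated


def without_expiration(record):
    """Return a copy of the record with its expiration date removed."""
    return {k: v for k, v in record.items() if k != "_expires_at"}


def is_expired(record, now=None):
    """A record without `_expires_at` never expires."""
    expires_at = record.get("_expires_at")
    if expires_at is None:
        return False
    now = time.time() if now is None else now
    return now >= float(expires_at)
```

For ephemeral geo data, a client would refresh the expiration date each time an online user reports a new location, so that stale positions disappear on their own.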
    14. Range queries
        • Keys are lexically sorted (red-black tree).
        • Hence, range queries are cheap.
        • Pincaster currently offers prefix-matching.
        • Results of range queries can include keys, keys + overview or keys + overview + properties.
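
Why sorted keys make prefix matching cheap can be shown with a sorted list and a binary search. Note the substitution: Pincaster uses a red-black tree, while this sketch uses Python's `bisect` on a sorted list, which gives the same "jump to the first match, then walk forward" behaviour.

```python
import bisect


def keys_with_prefix(sorted_keys, prefix):
    """Return every key starting with `prefix` from a lexically sorted list."""
    # All keys sharing a prefix are contiguous, starting at the first key >= prefix.
    lo = bisect.bisect_left(sorted_keys, prefix)
    hi = lo
    while hi < len(sorted_keys) and sorted_keys[hi].startswith(prefix):
        hi += 1
    return sorted_keys[lo:hi]


keys = sorted(["user:1", "user:2", "rest:9", "usher", "user"])
matches = keys_with_prefix(keys, "user:")
```

The cost is one binary search plus one step per returned key, independent of how many unrelated keys the layer holds.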
    15. Linked records
        • Symbolic links to other records with special properties starting with $link: .
        • Implements N:1 relations but 1:N and N:N can be represented by multiple links. Records can have any number of links.
        • No referential integrity.
        • Useful for easy retrieval of related records. But nowhere near an alternative to a graph database.
        • Just add link=1 to any query in order to traverse links and retrieve related records (duplicate records and loops are tracked).
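
Link traversal with duplicate and loop tracking can be sketched on an in-memory dictionary of records. This models the behaviour described above rather than Pincaster's code; the function name and the sample link names other than `$link:favorite restaurant` are made up for the example.

```python
LINK_PREFIX = "$link:"


def fetch_with_links(store, key):
    """Return {key: record} for `key` and everything reachable through its links."""
    seen = {}
    pending = [key]
    while pending:
        current = pending.pop()
        if current in seen or current not in store:
            # Already returned (duplicate or loop), or a dangling link:
            # there is no referential integrity, so targets may be missing.
            continue
        record = store[current]
        seen[current] = record
        for name, target in record.items():
            if name.startswith(LINK_PREFIX):
                pending.append(target)
    return seen


store = {
    "Donald": {"Name": "Donald Duck", "$link:favorite restaurant": "Rest1234"},
    "Rest1234": {
        "Description": "Mac Donald's",
        "$link:best customer": "Donald",     # hypothetical link forming a loop
        "$link:closed branch": "Rest0000",   # hypothetical dangling link
    },
}
related = fetch_with_links(store, "Donald")
```

Each record is returned once, even when several links point to it or when links form a cycle.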
    16. [Diagram: linked records]
        Donald: Name = Donald Duck; Location; $link:favorite restaurant = Rest1234
        Rest1234: Description = Mac Donald's; Location
    17. Public HTTP service
        • Public data can be embedded in records, through $content and $content_type properties.
        • Especially useful to store JSON data and HTML partials that can be directly served to browsers. You can also think about it like memcache with an embedded web server.
        • Should be used behind a proxy or a filtering load balancer, though.
    18. Expiration
        [Diagram] Donald: Location; Name = Donald Duck; $content = <html>Quack quack!</html>; $content_type = text/html
        http://host/api/1.0/records/users/Donald.json
        Public: http://host/public/users/Donald.html (serves "Quack quack!")
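
The two URLs shown on the slide follow a visible pattern (layer, then key, then an extension). A small helper that rebuilds them might look like the sketch below. The function names are hypothetical, and the idea that the public extension can be chosen freely is an assumption: the slide only shows `.html` together with a `text/html` content type.

```python
from urllib.parse import quote


def record_url(host, layer, key):
    """URL of a record in the JSON API, following the pattern on the slide."""
    return f"http://{host}/api/1.0/records/{quote(layer, safe='')}/{quote(key, safe='')}.json"


def public_url(host, layer, key, extension="html"):
    """URL under which a record's $content would be publicly served."""
    return f"http://{host}/public/{quote(layer, safe='')}/{quote(key, safe='')}.{extension}"


api = record_url("host", "users", "Donald")
public = public_url("host", "users", "Donald")
```

Percent-encoding the layer and key keeps characters such as spaces or slashes from changing the path.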
    19. Durability
        • Similar to Redis AOF, but with a non-binary (human readable and tweakable) journal.
        • Data set and indices are kept in memory.
        • But an append-only journal can log every query that is going to change a database. Timestamps allow point-in-time rollback.
        • Configurable fsync() policy: after every commit, never or after x seconds.
        • Combines efficiency of an in-memory database with durability.
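
The idea of an append-only, timestamped journal with point-in-time rollback can be sketched as follows. The journal format used here (one JSON object per line, with `ts`, `key` and `set` fields) is invented for the example and is not Pincaster's actual format; the journal is kept in a list instead of a file, so the fsync() policy is not modelled.

```python
import json


def append_entry(journal, timestamp, key, properties):
    """Log a change as one human-readable line; entries are never modified."""
    journal.append(json.dumps({"ts": timestamp, "key": key, "set": properties}))


def replay(journal, until=None):
    """Rebuild the in-memory data set; stop at `until` to roll back in time."""
    data = {}
    for line in journal:
        entry = json.loads(line)
        if until is not None and entry["ts"] > until:
            break  # entries are appended in chronological order
        data.setdefault(entry["key"], {}).update(entry["set"])
    return data


journal = []
append_entry(journal, 10, "Donald", {"Name": "Donald Duck"})
append_entry(journal, 20, "Donald", {"City": "Duckburg"})
append_entry(journal, 30, "Donald", {"City": "Paris"})
current = replay(journal)
earlier = replay(journal, until=20)
```

Because every line is plain text, a journal of this kind can be inspected or truncated by hand before being replayed.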
    20. Journal rewrite
        • Reduces startup time.
        • Constructs a new journal with only needed operations to reconstruct the data set.
        • Background operation happening in a child process with a low priority.
        • Takes advantage of copy-on-write through fork() and appends missing records afterwards.
        • The new journal atomically replaces the previous one after successful completion.
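
Two of the steps above, building a minimal journal and swapping it in atomically, can be sketched in Python. The example leaves out the fork()ed child process and the catch-up of records written during the rewrite, and it assumes an invented journal format (one JSON object per line with `key` and `set` fields), so it is an illustration of the technique rather than Pincaster's procedure.

```python
import json
import os
import tempfile


def rewrite_journal(path):
    """Replace the journal at `path` with one entry per surviving key."""
    data = {}
    with open(path) as old:
        for line in old:
            entry = json.loads(line)
            data.setdefault(entry["key"], {}).update(entry["set"])

    # Write the compact journal next to the old one, then rename it into place.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as new:
        for key, properties in data.items():
            new.write(json.dumps({"key": key, "set": properties}) + "\n")
        new.flush()
        os.fsync(new.fileno())  # make sure the data is on disk before the swap
    os.replace(tmp_path, path)  # atomic when both paths are on the same filesystem
    return len(data)
```

Readers of the journal see either the complete old file or the complete new one, never a half-written mix, which is the point of replacing it with a single rename.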
    21. Coming soon
        • Replication
        • Clustering of spatial results (partly implemented).
        • Optimization (distance computation, finer grained rwlocks, use slabs in more areas).
        • Spidermonkey integration.
        • Automatically move expired records to another layer.
        • Observers (push a notification over an HTTP channel when there's an update in a geographic zone).
        • Client libraries and possibly a decent web site (help needed!)
    22. http://github.com/jedisct1/Pincaster/