Unique ID generation in distributed systems

A run-through of the various options available for generating unique IDs.

Published in: Technology, Business

  1. ID generation. PHP London, 2012-08-02. @davegardnerisme
  2. @davegardnerisme. hailoapp.com/dave (for a £5 discount)
  3. MySQL auto increment. [Diagram: DC 1 with a web app backed by a single MySQL server issuing IDs 1, 2, 3, 4…]
  4. MySQL auto increment
       • Numeric IDs
       • Go up with time
       • Not resilient
  5. MySQL multi-master replication. [Diagram: DC 1 with a web app backed by two MySQL servers, one issuing IDs 1, 3, 5, 7… and the other 2, 4, 6, 8…]
  6. MySQL multi-master replication
       • Numeric IDs
       • Do not go up with time
       • Some resilience
  7. Going global… [Diagram: six data centres, DC 1 to DC 6]
  8. MySQL in multi DC setup. [Diagram: DC 1 has a web app and a MySQL server issuing IDs 1, 2, 3…; DC 2 has only a web app, joined to DC 1 by a WAN link, with a question mark over where its IDs come from.]
  9. Flickr MySQL ticket server. [Diagram: DC 1 has a web app and a ticket server issuing 1, 3, 5…; DC 2 has a web app and a ticket server issuing 4, 6, 8…; the two DCs are joined by a WAN link.] WAN link not required to generate an ID.
  10. Flickr MySQL ticket server
       • Numeric IDs
       • Do not go up with time
       • Resilient and distributed
       • ID generation separated from data store
  11. The anatomy of a ticket server. [Diagram: one DC with four web apps sharing a single ticket server]
  12. Making things simpler. [Diagram: one DC with four web apps, each with its own ID generator]
  13. UUIDs
       • 128 bits
       • Could use type 4 (random) or type 1 (MAC address with time component)
       • Can generate on each machine with no co-ordination
  14. Type 4 – random
       • Format: xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx, where 4 is the version and y is the variant (8, 9, A or B)
       • Example: f47ac10b-58cc-4372-a567-0e02b2c3d479
  15. 5.3 × 10^36 possible values for a type 4 UUID
  16. 1.1 × 10^19 UUIDs we could generate per second since the Universe began
  17. 2.1 × 10^27 Olympic swimming pools filled if each possible value contributed a millilitre
  18. Type 1 – MAC address. Example: 51063800-dc76-11e1-9fae-001c42000009
       • Time component is based on 100 nanosecond intervals since October 15, 1582
       • Most significant bits of timestamp shifted to least significant bits of UUID
  19. Type 1 – MAC address
       • The address (MAC) of the computer that generated the ID is encoded into it
       • Lexical ordering essentially meaningless
       • Deterministically unique
  20. There are some other options…
  21. What we want:
       • No co-ordination needed
       • Deterministically unique
       • K-ordered (time-ordered lexically)
  22. Twitter Snowflake
       • Under 64 bits
       • No co-ordination (after startup)
       • K-ordered
       • Scala service, Thrift interface, uses Zookeeper for configuration
  23. Twitter Snowflake
       • 41 bits: timestamp (millisecond precision, bespoke epoch)
       • 10 bits: configured machine ID
       • 12 bits: sequence number
  24. Twitter Snowflake: 77669839702851584 = (timestamp << 22) | (machine << 12) | sequence
  25. Boundary Flake
       • 128 bits
       • No co-ordination at all
       • K-ordered
       • Erlang service
  26. Boundary Flake
       • 64 bits: timestamp (millisecond precision, 1970 epoch)
       • 48 bits: MAC address
       • 16 bits: sequence number
  27. PHP Cruftflake
       • Based on Twitter Snowflake
       • No co-ordination (after startup)
       • K-ordered
       • PHP, ZeroMQ interface, uses Zookeeper for configuration
  28. Questions?
  29. References
       • Flickr distributed ticket server: http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
       • UUIDs: http://tools.ietf.org/html/rfc4122
       • How random are random UUIDs?: http://stackoverflow.com/a/2514722/15318
       • Twitter Snowflake: https://github.com/twitter/snowflake
       • Boundary Flake: https://github.com/boundary/flake
       • PHP Cruftflake: https://github.com/davegardnerisme/cruftflake
  30. private function mintId64($timestamp, $machine, $sequence)
      {
          // 64-bit PHP: the whole ID fits in one native integer
          $timestamp = (int)$timestamp;
          $value = ($timestamp << 22) | ($machine << 12) | $sequence;
          return (string)$value;
      }

      private function mintId32($timestamp, $machine, $sequence)
      {
          // 32-bit PHP: build the ID as two 32-bit words.
          // The (int) cast on $lo is intended to keep only the low word and
          // depends on how the platform truncates an out-of-range float.
          $hi = (int)($timestamp / pow(2, 10));
          $lo = (int)($timestamp * pow(2, 22));
          // stick in the machine + sequence to the low bit
          $lo = $lo | ($machine << 12) | $sequence;
          // reconstruct into a string of numbers
          $hex = pack('N2', $hi, $lo);
          $unpacked = unpack('H*', $hex);
          $value = $this->hexdec($unpacked[1]);
          return (string)$value;
      }
  31. public function generate()
      {
          // milliseconds elapsed since the bespoke epoch
          $t = floor($this->timer->getUnixTimestamp() - $this->epoch);
          if ($t !== $this->lastTime) {
              // new millisecond: restart the sequence
              $this->sequence = 0;
              $this->lastTime = $t;
          } else {
              // same millisecond: the sequence has 12 bits, so 4095 is the maximum
              $this->sequence++;
              if ($this->sequence > 4095) {
                  throw new OverflowException('Sequence overflow');
              }
          }
          if (PHP_INT_SIZE === 4) {
              return $this->mintId32($t, $this->machine, $this->sequence);
          } else {
              return $this->mintId64($t, $this->machine, $this->sequence);
          }
      }
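
Slides 5 and 9 show each MySQL server handing out every second number. The sketch below simulates that in plain PHP, assuming two servers that share a step of 2 and start from different offsets (MySQL exposes these settings as `auto_increment_increment` and `auto_increment_offset`). The helper name `ticketSequence` is made up for this illustration, and nothing here talks to a database.

```php
<?php
// Hypothetical helper: simulate the IDs one server hands out.
function ticketSequence(int $offset, int $increment, int $count): array
{
    $ids = [];
    for ($i = 0; $i < $count; $i++) {
        $ids[] = $offset + $i * $increment;
    }
    return $ids;
}

$serverA = ticketSequence(1, 2, 4); // 1, 3, 5, 7
$serverB = ticketSequence(2, 2, 4); // 2, 4, 6, 8

// The two ranges never collide, so neither server needs to ask the other.
$overlap = array_intersect($serverA, $serverB);
```

Because each server advances independently, an ID of 8 from one server may be issued before an ID of 3 from the other, which is why slides 6 and 10 say these IDs do not go up with time.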
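
The type 4 layout on slide 14 can be produced with a few lines of PHP. This is a minimal sketch, assuming PHP 7 or later for `random_bytes()`; the function name `uuid4` is chosen here and is not from the talk.

```php
<?php
// Hypothetical helper: build a random (type 4) UUID string.
function uuid4(): string
{
    $bytes = random_bytes(16); // 128 random bits

    // Force the version nibble to 4 (the "4" in the third group).
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40);
    // Force the two variant bits to 10, so the "y" digit is 8, 9, a or b.
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80);

    // 32 hex digits split into 8 chunks of 4, regrouped as 8-4-4-4-12.
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}

$id = uuid4();
```

Six of the 128 bits are fixed by the version and variant, leaving 122 random bits, which is where the 5.3 × 10^36 (2^122) figure on slide 15 comes from.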
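
Slide 18 says the type 1 timestamp counts 100 nanosecond intervals since October 15, 1582, with its most significant bits stored last. The sketch below reassembles that timestamp from the example UUID on the slide. It assumes 64-bit PHP, and the function name `uuid1ToUnixSeconds` is invented for this illustration. The constant 122192928000000000 is the number of 100 nanosecond intervals between the 1582 epoch and the Unix epoch.

```php
<?php
// Hypothetical helper: recover the Unix time (whole seconds) from a type 1 UUID.
function uuid1ToUnixSeconds(string $uuid): int
{
    // The first three groups are time_low, time_mid and time_hi_and_version.
    [$timeLow, $timeMid, $timeHiAndVersion] = explode('-', $uuid);

    // Drop the version nibble, then put the fields back in significance order.
    $timeHi = hexdec($timeHiAndVersion) & 0x0FFF;
    $ticks = ($timeHi << 48) | (hexdec($timeMid) << 32) | hexdec($timeLow);

    // Shift from the 1582 epoch to the Unix epoch and convert to seconds.
    return intdiv($ticks - 122192928000000000, 10000000);
}

$seconds = uuid1ToUnixSeconds('51063800-dc76-11e1-9fae-001c42000009');
$when = gmdate('Y-m-d H:i:s', $seconds); // 2012-08-02 07:47:32 (UTC)
```

The fastest-changing part of the timestamp sits in the first group, so sorting these strings does not sort them by time. That is the "lexical ordering essentially meaningless" point on slide 19.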
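
The formula on slide 24 can be run in both directions. The sketch below composes and decomposes a Snowflake-style ID using the bit widths from slide 23. It assumes 64-bit PHP, and the function and constant names are chosen here rather than taken from Snowflake or Cruftflake.

```php
<?php
// Bit widths from slide 23: 41-bit timestamp, 10-bit machine ID, 12-bit sequence.
const SEQUENCE_BITS = 12;
const MACHINE_BITS = 10;

// Hypothetical helper: pack the three fields into one integer.
function composeSnowflake(int $timestamp, int $machine, int $sequence): int
{
    return ($timestamp << (MACHINE_BITS + SEQUENCE_BITS))
        | ($machine << SEQUENCE_BITS)
        | $sequence;
}

// Hypothetical helper: split an ID back into its fields.
function decomposeSnowflake(int $id): array
{
    return [
        'timestamp' => $id >> (MACHINE_BITS + SEQUENCE_BITS),
        'machine'   => ($id >> SEQUENCE_BITS) & ((1 << MACHINE_BITS) - 1),
        'sequence'  => $id & ((1 << SEQUENCE_BITS) - 1),
    ];
}

// The example ID from slide 24.
$parts = decomposeSnowflake(77669839702851584);
// timestamp 18517932821 (ms since the bespoke epoch), machine 0, sequence 0
```

Because the timestamp occupies the highest bits, comparing two IDs as integers compares their timestamps first, which gives the K-ordering listed on slide 22.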
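
The Boundary Flake layout on slide 26 is wider than a native integer, so one way to illustrate it in PHP is as a 32-digit hex string. This is a sketch under assumptions: 64-bit PHP, a made-up helper name `flakeHex`, an arbitrary sample timestamp, and the MAC taken from the node field of the UUID on slide 18. It is not how the Erlang service itself encodes IDs.

```php
<?php
// Hypothetical helper: lay out the three fields from slide 26 as 32 hex digits
// (64-bit timestamp, 48-bit MAC address, 16-bit sequence).
function flakeHex(int $timestampMs, int $mac, int $sequence): string
{
    return sprintf(
        '%016x%012x%04x',
        $timestampMs,
        $mac & 0xFFFFFFFFFFFF, // keep 48 bits
        $sequence & 0xFFFF     // keep 16 bits
    );
}

// Sample values, for illustration only.
$earlier = flakeHex(1343893652143, 0x001c42000009, 7);
$later   = flakeHex(1343893652144, 0x001c42000009, 0);

// Fixed-width hex with the timestamp first sorts lexically by time.
$inOrder = strcmp($earlier, $later) < 0;
```

Using the MAC address instead of a configured machine ID is what lets Flake drop the startup co-ordination that Snowflake needs, at the cost of a 128-bit ID.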