Successfully reported this slideshow.
Your SlideShare is downloading. ×

Unique ID generation in distributed systems

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Upcoming SlideShare
qcon
qcon
Loading in …3
×

Check these out next

1 of 31 Ad

Unique ID generation in distributed systems

Download to read offline

A run through of the various options available for generating unique IDs

A run through of the various options available for generating unique IDs

Advertisement
Advertisement

More Related Content

Advertisement
Advertisement

Unique ID generation in distributed systems

  1. 1. ID generation PHP London 2012-08-02 @davegardnerisme
  2. 2. @davegardnerisme hailoapp.com/dave (for a £5 discount)
  3. 3. MySQL auto increment DC 1 1,2,3,4… MySQL Web App
  4. 4. MySQL auto increment • Numeric IDs • Go up with time • Not resilient
  5. 5. MySQL multi-master replication DC 1 1,3,5,7… MySQL Web App 2,4,6,8… MySQL
  6. 6. MySQL multi-master replication • Numeric IDs • Do not go up with time • Some resilience
  7. 7. DC 1 DC 2 DC 4 DC 5 DC 6 DC 3 Going global…
  8. 8. MySQL in multi DC setup DC 1 1,2,3… Web MySQL App WAN LINK DC 2 ? Web App
  9. 9. Flickr MySQL ticket server DC 1 Ticket 1,3,5… Web WAN link not required to Server App generate an ID WAN LINK DC 2 Ticket 4,6,8… Web Server App
  10. 10. Flickr MySQL ticket server • Numeric IDs • Do not go up with time • Resilient and distributed • ID generation separated from data store
  11. 11. The anatomy of a ticket server DC Web Web Web Web App App App App Ticket Server
  12. 12. Making things simpler DC Web Web Web Web App App App App ID gen ID gen ID gen ID gen
  13. 13. UUIDs • 128 bits • Could use type 4 (Random) or type 1 (MAC address with time component) • Can generate on each machine with no co-ordination
  14. 14. Type 4 – random version xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx variant (8, 9, A or B) f47ac10b-58cc-4372-a567-0e02b2c3d479
  15. 15. 5.3 x 1036 possible values for a type 4 UUID
  16. 16. 1.1 x 1019 UUIDs we could generate per second since the Universe began
  17. 17. 2.1 x 1027 Olympic swimming pools filled if each possible value contributed a millilitre
  18. 18. Type 1 – MAC address 51063800-dc76-11e1-9fae-001c42000009 • Time component is based on 100 nanosecond intervals since October 15, 1582 • Most significant bits of timestamp shifted to least significant bits of UUID
  19. 19. Type 1 – MAC address • The address (MAC) of the computer that generated the ID is encoded into it • Lexical ordering essentially meaningless • Deterministically unique
  20. 20. There are some other options…
  21. 21. No co-ordination needed Deterministically unique K-ordered (time-ordered lexically)
  22. 22. Twitter Snowflake • Under 64 bits • No co-ordination (after startup) • K-ordered • Scala service, Thrift interface, uses Zookeeper for configuration
  23. 23. Twitter Snowflake 41 bits Timestamp millisecond precision, bespoke epoch 10 bits Configured machine ID 12 bits Sequence number
  24. 24. Twitter Snowflake 77669839702851584 = (timestamp << 22) | (machine << 12) | sequence
  25. 25. Boundary Flake • 128 bits • No co-ordination at all • K-ordered • Erlang service
  26. 26. Boundary Flake 64 bits Timestamp millisecond precision, 1970 epoch 48 bits MAC address 16 bits Sequence number
  27. 27. PHP Cruftflake • Based on Twitter Snowflake • No co-ordination (after startup) • K-ordered • PHP, ZeroMQ interface, uses Zookeeper for configuration
  28. 28. Questions?
  29. 29. References Flickr distributed ticket server http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on- the-cheap/ UUIDs http://tools.ietf.org/html/rfc4122 How random are random UUIDs? http://stackoverflow.com/a/2514722/15318 Twitter Snowflake https://github.com/twitter/snowflake Boundary Flake https://github.com/boundary/flake PHP Cruftflake https://github.com/davegardnerisme/cruftflake
  30. 30. private function mintId64($timestamp, $machine, $sequence) { $timestamp = (int)$timestamp; $value = ($timestamp << 22) | ($machine << 12) | $sequence; return (string)$value; } private function mintId32($timestamp, $machine, $sequence) { $hi = (int)($timestamp / pow(2,10)); $lo = (int)($timestamp * pow(2, 22)); // stick in the machine + sequence to the low bit $lo = $lo | ($machine << 12) | $sequence; // reconstruct into a string of numbers $hex = pack('N2', $hi, $lo); $unpacked = unpack('H*', $hex); $value = $this->hexdec($unpacked[1]); return (string)$value; }
  31. 31. public function generate() { $t = floor($this->timer->getUnixTimestamp() - $this->epoch); if ($t !== $this->lastTime) { $this->sequence = 0; $this->lastTime = $t; } else { $this->sequence++; if ($this->sequence > 4095) { throw new OverflowException('Sequence overflow'); } } if (PHP_INT_SIZE === 4) { return $this->mintId32($t, $this->machine, $this->sequence); } else { return $this->mintId64($t, $this->machine, $this->sequence); } }

×