Hashing for Fun and Profit

An impromptu lightning talk I gave at Code4Lib North 2013

Published in: Technology, Education

  1. HASHING FOR FUN AND PROFIT
     Mat Trudel, @mattrudel
  2. HASHING
     • A one-way mathematical function that reduces a string of data into a fixed-length number
     • Easy to compute, hard to reverse
     • Collision resistant: no two files should have the same hash
     • Like a fingerprint, basically
  3. SHA-1
  4. SHA-1
     160 bits (40 hex chars)
  5. SHA-1
     ff4f25dfc62c9df4478549444e9eb364841c9391
  6. ff4f25dfc62c9df4478549444e9eb364841c9391
  7. WEBCITATION.ORG
  8. Unicorns! Unicorns! Unicorns! Unicorns! Unicorns! Unicorns!
  9. Unicorns! Unicorns! Unicorns! Unicorns!
     ff4f25dfc62c9df4478549444e9eb364841c9391
     ff4f25dfc62c9df4478549444e9eb364841c9391
     ff4f25dfc62c9df4478549444e9eb364841c9391
  10. ASSET STORAGE IS DEAD SIMPLE
      ff4f25dfc62c9df4478549444e9eb364841c9391.jpg
  11. COST OF A DUPLICATE COPY IS A DB ROW OF METADATA
      They both point to the same data on-disk
  12. EVERY COPY OF [image] IS THE SAME
      ff4f25dfc62c9df4478549444e9eb364841c9391
  13. TONS OF OTHER USEFUL PROPERTIES
      • Content Addressable - essentially a URN
      • Useful for detecting file changes (intentional or not)
      • Can be computed using just the file itself (it’s just math)
      • Indispensable part of many tools (git, CDNs, TLS)
  14. • fin •
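
The SHA-1 slides can be tried out in a few lines of Python. This is only a minimal sketch using the standard library's `hashlib`; the helper name `sha1_hex` and the sample bytes are made up for illustration, and the digests it prints will not match the one shown on the slides, since the file hashed in the talk is not available here.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical sample inputs (not the file hashed in the talk).
original = b"Unicorns! Unicorns! Unicorns!"
copy = bytes(original)      # a byte-for-byte duplicate
edited = original + b"!"    # the same content with one extra byte

digest = sha1_hex(original)

# SHA-1 always yields 160 bits, written as 40 hex characters.
print(digest, len(digest))

# Identical content gives an identical digest ...
print(sha1_hex(copy) == digest)

# ... while even a one-byte change gives a different one.
print(sha1_hex(edited) == digest)
```

Because the digest depends only on the bytes, anyone holding a copy of the file can recompute it and compare, with no lookup service involved.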

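The asset-storage slides describe naming each stored file after its hash, so that a duplicate upload costs only a row of metadata. The sketch below illustrates that idea under stated assumptions: a local directory stands in for the asset store, a plain list stands in for the database table, and `AssetStore`, `put`, `is_intact`, and the row fields are hypothetical names, not the implementation behind the talk.

```python
import hashlib
import tempfile
from pathlib import Path


class AssetStore:
    """Toy content-addressed store: files are named by their SHA-1 digest."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)
        # Stand-in for a database table: one metadata row per upload.
        self.rows = []

    def put(self, data: bytes, original_name: str) -> Path:
        """Store `data` under its digest and record a metadata row."""
        digest = hashlib.sha1(data).hexdigest()
        suffix = Path(original_name).suffix  # e.g. ".jpg"
        path = self.root / f"{digest}{suffix}"
        if not path.exists():
            # Only the first copy of a given content touches the disk.
            path.write_bytes(data)
        self.rows.append({"name": original_name, "sha1": digest})
        return path

    @staticmethod
    def is_intact(path: Path) -> bool:
        """Recompute the digest and compare it with the file name."""
        return hashlib.sha1(path.read_bytes()).hexdigest() == path.stem


with tempfile.TemporaryDirectory() as tmp:
    store = AssetStore(Path(tmp) / "assets")

    # Two uploads of the same (fake) image bytes under different names.
    first = store.put(b"fake image bytes", "unicorn.jpg")
    second = store.put(b"fake image bytes", "unicorn-copy.jpg")

    same_path = first == second
    files_on_disk = len(list(store.root.iterdir()))
    row_count = len(store.rows)
    intact_before = AssetStore.is_intact(first)

    # Simulate corruption or tampering: the name no longer matches.
    first.write_bytes(b"tampered bytes")
    intact_after = AssetStore.is_intact(first)

print(same_path, files_on_disk, row_count, intact_before, intact_after)
```

Keeping the file extension in the stored name follows the `.jpg` example on the slide; a real system would need to decide what to do when the same bytes arrive with different extensions.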