3. Disclaimer
● Views and opinions shared here are our
own and not those of our employers, past,
present, or (obviously) future.
4. Who We Are
● biosshadow - Fearless leader
● Benson - Resident code monkey
● Matt - Security guy
5. We would like to Thank
● Travis McCrea - Designer of our website
● Justin Elze - sysadmin and ideas
● Ashleigh Baumgardner - stats advice
● Mike Kelly of Spiderlabs - access to leaks
● Anyone who provided data and cracked
passwords for us.
9. But...
● It's still quite useful
● Unique as a leak clearinghouse
● We can work around some of the issues
(more on this later)
10. The Project in 4 Bullet Points
● Automate Collection of Leaks via
Pastebin and Twitter
● Clean and remove all data that is not
emails or passwords
● Enter the data in a centralized database
● Run analytics on the database to find
interesting patterns
11. The process
● Collecting leaks
● Cleaning the passwords
● Importing the data
● Run Analysis
● Find patterns
● ???
● Profit?
12. Collecting Passwords
● Data collected via Twitter API and
scraping Pastebin
● Plan to add the top 5 leak pastebins
● And eventually as many as we can find
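As a rough illustration of the collection step, the sketch below pulls Pastebin paste IDs out of tweet text that has already been fetched. It does not call the Twitter API or Pastebin itself. The pattern, the 8-character ID length, and the function name `extract_paste_ids` are assumptions made for this example, not the project's actual code.

```python
import re

# Assumed pattern: pastebin.com links (optionally the /raw/ form) followed by
# an 8-character alphanumeric paste ID. The ID length is an assumption.
PASTEBIN_RE = re.compile(
    r"https?://(?:www\.)?pastebin\.com/(?:raw/)?([A-Za-z0-9]{8})\b"
)


def extract_paste_ids(tweets):
    """Return unique paste IDs found in the tweet texts, in first-seen order."""
    seen = []
    for text in tweets:
        for paste_id in PASTEBIN_RE.findall(text):
            if paste_id not in seen:
                seen.append(paste_id)
    return seen


if __name__ == "__main__":
    sample = [
        "new leak http://pastebin.com/AbCd1234 #leak",
        "mirror https://pastebin.com/raw/AbCd1234 and https://pastebin.com/ZZzz9999",
        "no links here",
    ]
    print(extract_paste_ids(sample))
```

Deduplicating by paste ID keeps the same leak from being queued twice when several accounts tweet the same link.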
13. Cleaning The Data
● Leaks contain information that is private
and/or unneeded by the project (address,
full names, and phone numbers)
● We remove all data besides passwords,
hashes, and emails
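A minimal sketch of the cleaning rule above, assuming each leak record has already been parsed into a dictionary with named fields. The field names (`email`, `password`, `hash`) and the helper `scrub` are hypothetical. Real dumps are non-standard, so getting to this parsed form is the hard part.

```python
# Fields the project keeps; everything else (names, addresses, phone numbers)
# is dropped. These field names are assumptions made for the example.
KEEP_FIELDS = ("email", "password", "hash")


def scrub(record):
    """Return a copy of the record containing only non-empty kept fields."""
    return {
        key: value
        for key, value in record.items()
        if key in KEEP_FIELDS and value
    }


def scrub_all(records):
    """Scrub every record and discard those with nothing left to keep."""
    cleaned = (scrub(record) for record in records)
    return [record for record in cleaned if record]


if __name__ == "__main__":
    raw = [
        {"email": "a@example.com", "password": "hunter2",
         "full_name": "Alice A", "phone": "555-0100"},
        {"full_name": "Bob B", "address": "1 Main St"},
    ]
    print(scrub_all(raw))
```

Using an allow-list rather than a block-list means an unexpected private field in a new dump is dropped by default.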
14. Automation is key
● There is a LOT of data to go through
● Script ALL the things!
● Profit ???
● The problem is non-standard dumps
15. Importing Data
● Handcrafted CSV files
● Rake task to introduce them to rails env
● Calculate leak-specific stats
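The project does its import with a Rake task inside Rails. The sketch below is a stand-in in Python, not that task. It reads one cleaned CSV and computes a few per-leak statistics. The column names (`email`, `password`) and the particular statistics are assumptions chosen for illustration.

```python
import csv
import io
from collections import Counter


def leak_stats(csv_text):
    """Compute simple per-leak statistics from CSV text with a header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Ignore rows whose password column is missing or empty.
    passwords = [row["password"] for row in rows if row.get("password")]
    counts = Counter(passwords)
    return {
        "records": len(rows),
        "unique_passwords": len(counts),
        "mean_length": (
            sum(len(p) for p in passwords) / len(passwords) if passwords else 0.0
        ),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


if __name__ == "__main__":
    sample = (
        "email,password\n"
        "a@example.com,123456\n"
        "b@example.com,password\n"
        "c@example.com,123456\n"
    )
    print(leak_stats(sample))
```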
16. Run Analysis and Find patterns
● Analysis run en masse and leak by leak
● We let the data tell the story
18. ???
● Automate bruteforcing
○ Dedicated server or EC2
○ GPU goodness with oclHashcat
● Add more leak sources
● An interactive dataset viewer
● More data, faster
19. ??? contd.
● IRCbot to find links dropped by
Anonymous and other similar groups
● Reports - quarterly for anyone to use to
help their company or clients
20. Profit?
● No plans to monetize anything
● All donations, monetary or otherwise, go
into the project
21. Data
● Most interesting attribute is "strength"
● How hard is it to crack?
○ Length
○ Presence in dictionary
○ Complexity of character set
22. Calculating Strength
● First crack at it: complexity ^ length
● Resulting strength value is unmanageably large
● log(complexity ^ length)
○ Still monotonically increasing with strength
○ Log lets you graph it nicely
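Since log(complexity ^ length) = length * log(complexity), the logged score can be computed without ever forming the huge power. The sketch below is one possible reading of that formula, not the project's implementation. The slides do not define complexity or the log base. Here complexity is assumed to be the combined size of the character classes the password uses, and the log is assumed to be base 2.

```python
import math
import string

# Assumed character classes and their sizes: 26 + 26 + 10 + 32.
CHAR_CLASSES = (
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    string.punctuation,
)


def charset_size(password):
    """Sum the sizes of the character classes that appear in the password."""
    return sum(
        len(chars)
        for chars in CHAR_CLASSES
        if any(c in chars for c in password)
    )


def strength(password):
    """Return length * log2(complexity), i.e. log2(complexity ** length)."""
    size = charset_size(password)
    if size == 0:
        # Empty password, or only characters outside the assumed classes.
        return 0.0
    return len(password) * math.log2(size)


if __name__ == "__main__":
    for pw in ("password", "Passw0rd!"):
        print(pw, round(strength(pw), 2))
```

Because the logarithm is monotonic, passwords rank in the same order as under the raw complexity ^ length value.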
30. How to help/contact us
Jacob @biosshadow / biosshadow@biosshadow.com
Benson @bensonk42 / bensonk42@gmail.com
Matt @undeadsecurity / matt@zonbi.org
31. How You can Help the Project
● Requests
○ Features
○ Analytics
● Notify us of leaks, big and small
● Help with our code - Github pull requests are welcome