Analyzing Pwned Passwords with Spark and Scala

Kelley Robinson
@kelleyrobinson
Developer Evangelist

Apache Spark aims to solve the problem of working with large-scale distributed data, and with access to over 500 million leaked passwords, we have a lot of data to dig through.
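
To give a sense of what digging through this data involves, here is a minimal Scala sketch with no Spark dependency. It assumes the downloadable Pwned Passwords format, where each line holds an uppercase SHA-1 hash and a breach count separated by a colon. The names `PwnedLookup`, `sha1Hex`, `parseLine`, and `timesSeen` are hypothetical, and the count in the sample line is a made-up placeholder, not a real figure.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object PwnedLookup {
  // Hex-encode the SHA-1 digest of a password, uppercase to match the assumed file format.
  def sha1Hex(password: String): String =
    MessageDigest.getInstance("SHA-1")
      .digest(password.getBytes(StandardCharsets.UTF_8))
      .map(b => "%02X".format(b))
      .mkString

  // Parse one "HASH:COUNT" line; malformed lines yield None.
  def parseLine(line: String): Option[(String, Long)] =
    line.trim.split(":") match {
      case Array(hash, count) if count.nonEmpty && count.forall(_.isDigit) =>
        Some(hash -> count.toLong)
      case _ => None
    }

  // Made-up sample lines in the assumed format; the count is a placeholder.
  val sampleLines: Seq[String] = Seq(
    "5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8:3",
    "not a valid line"
  )

  // Build a hash -> count index, skipping lines that fail to parse.
  val index: Map[String, Long] = sampleLines.flatMap(parseLine).toMap

  // How often a candidate password appears in the sample (0 if absent).
  def timesSeen(password: String): Long =
    index.getOrElse(sha1Hex(password), 0L)

  def main(args: Array[String]): Unit = {
    println(timesSeen("password"))
    println(timesSeen("correct horse battery staple"))
  }
}
```

An in-memory `Map` is only workable for a small sample. At hundreds of millions of lines, the same parse-then-aggregate shape is what one would hand to a distributed engine such as Spark.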

  1. Analyzing Pwned Passwords with Spark. Kelley Robinson, @kelleyrobinson, Developer Evangelist
  2. +
  3. BIG DATA & SECURITY @KELLEYROBINSON • Spark: then and now • The state of passwords • Spark in action • Big Data ∩ Security
  6. Apache Spark Ecosystem
  7. Spark Abstractions • Then: RDD (Resilient Distributed Dataset) • Now: DataFrames / Datasets
  8. RDDs • Immutable & distributed collection • Unstructured data • Low-level transformation and control (https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html)
  9. https://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
  10. Datasets • Structured data • Strongly typed • Fast (https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html)
  11. Datasets • Structured data • Strongly typed • Fast • SQL DSLs
  12. Apache Spark Ecosystem
  13. Scala has the most robust language API
  14. https://www.slideshare.net/databricks/composable-parallel-processing-in-apache-spark-and-weld
  15. https://twitter.com/CamJo89/status/996497423621996544
  16. Spark: then and now • The state of passwords • Spark in action • Big Data ∩ Security
  17. Spark: then and now • The state of passwords • Spark in action • Big Data ∩ Security
  18. https://twitter.com/dog_rates/status/986762231290490881
  19. Benefits • Fast • Flexible • Good for exploration • Proven for large systems
  20. Challenges • Opaque error messages • Operationalizing • Documentation (http://heather.miller.am/blog/launching-a-spark-cluster-part-1.html)
  21. 👍💯 The missing Spark documentation (https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/)
  22. Spark: then and now • The state of passwords • Spark in action • Big Data ∩ Security
  26. THANK YOU! @kelleyrobinson
  27. Spark Resources • Apache Spark • Jacek's Spark Documentation • Zeppelin • RDDs vs. Datasets • Running Spark on a Cluster. Security Resources • Pwned Passwords • Reverse SHA1 hashes • LastPass and 1Password • 2FA Guides
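
The "then and now" contrast between RDDs and Datasets, and the linked advice to prefer `reduceByKey` over `groupByKey`, can be illustrated with a short sketch. This is not code from the talk. It assumes Spark 2.x or later with `spark-sql` on the classpath and a local master; `PwnedEntry`, `ThenAndNow`, and `totals` are hypothetical names, and the hash prefixes and counts are made-up sample data.

```scala
import org.apache.spark.sql.SparkSession

// Defined at top level so Spark can derive an encoder for it.
final case class PwnedEntry(hash: String, timesSeen: Long)

object ThenAndNow {
  // Sum counts per key three ways and return each result as a local Map.
  def totals(
      spark: SparkSession,
      pairs: Seq[(String, Long)]
  ): (Map[String, Long], Map[String, Long], Map[String, Long]) = {
    import spark.implicits._

    // Then: a low-level RDD of key/value pairs.
    val rdd = spark.sparkContext.parallelize(pairs)

    // reduceByKey combines values within each partition before the shuffle.
    val viaReduce = rdd.reduceByKey(_ + _).collect().toMap

    // groupByKey shuffles every value first, then sums; same answer, more data moved.
    val viaGroup = rdd.groupByKey().mapValues(_.sum).collect().toMap

    // Now: a strongly typed Dataset queried with the SQL-style DSL.
    val ds = pairs.map { case (h, c) => PwnedEntry(h, c) }.toDS()
    val viaDataset = ds
      .groupBy("hash")
      .sum("timesSeen")
      .as[(String, Long)]
      .collect()
      .toMap

    (viaReduce, viaGroup, viaDataset)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("then-and-now")
      .master("local[*]") // assumption: single-machine run for illustration
      .getOrCreate()

    // Made-up (hash prefix, count) pairs.
    val sample = Seq(("5BAA6", 3L), ("5BAA6", 2L), ("7C4A8", 1L))
    val (a, b, c) = totals(spark, sample)
    println(a)
    println(b)
    println(c)

    spark.stop()
  }
}
```

All three approaches should agree on the totals. The difference is in how much data crosses the network and how much the optimizer can see: the Dataset version expresses the aggregation declaratively, which leaves Spark free to plan it.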
