
Rv defcon25 burner phone challenge - dakota nelson

http://reconvillage.org/burner-phone-challenge/

Published in: Technology

  1. 1. Burner Phone Challenge Dakota Nelson
  2. 2. Welcome!
  3. 3. If someone switches phones, can we tell it’s still them?
  4. 4. Overview ● Why this problem? ● Get set up ● Basic data analysis (graphs!) ● Basic machine learning ● The Challenge
  5. 5. Hemisphere “It’s a contract service which AT&T provides law enforcement agencies, mostly drug enforcement, for investigating telephone numbers. It’s a bulk metadata surveillance program. Hemisphere includes a database of telephone call metadata with trillions of records, the primary AT&T proprietary database, against which the company cross-references telephone numbers provided by police for investigation.” Kenneth Lipp, https://medium.com/@kennethlipp/hemisphere-so-sensitive-lets-use-it-all-the-time-58b8cedbedf2
  6. 6. Let’s say a narcotics task force is trying to maintain a wiretap on a drug dealer, but the drug dealer uses prepaid “burner” phones and is constantly changing his number. However, from each new phone number, this drug dealer makes calls to many of the same people — his mom, girlfriend, a supplier — and also makes most of his calls from a limited range of locations. Because AT&T can analyze so many call-detail records with such well-developed custom software, it can easily track these patterns to infer which unknown number is really the suspect. Since it can do this in real time by mining its live data stream, by the time the suspect burns an old phone and makes a few calls from his new one, he’s back in the net.
  10. 10. We’re going to replicate this
  11. 11. Just, y’know… poorly
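
A minimal sketch of the "calls many of the same people" idea from the Hemisphere description, not the method AT&T uses and not the challenge solution. It assumes call records reduce to (caller, callee) pairs and scores candidate numbers by Jaccard overlap of the contacts they called. The phone labels, the `calls` list, and the helper names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee).
# Only call initiation is used; received calls are ignored.
calls = [
    ("old-burner", "mom"),
    ("old-burner", "girlfriend"),
    ("old-burner", "supplier"),
    ("new-A", "mom"),
    ("new-A", "supplier"),
    ("new-A", "pizza"),
    ("new-B", "dentist"),
    ("new-B", "pizza"),
]


def contacts_by_caller(records):
    """Map each calling number to the set of numbers it called."""
    contacts = defaultdict(set)
    for caller, callee in records:
        contacts[caller].add(callee)
    return contacts


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(known, candidates, contacts):
    """Return the candidate whose contact set overlaps most with `known`."""
    return max(candidates, key=lambda c: jaccard(contacts[known], contacts[c]))


contacts = contacts_by_caller(calls)
match = best_match("old-burner", ["new-A", "new-B"], contacts)
print(match)
```

Contact overlap alone is a weak signal on real data (lots of people call the same pizza place), which is part of why the later slides lean so hard on feature engineering.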
  12. 12. The data: I made it myself. It’s… not great (plz help). Things to know: - Only call initiation matters (receiving calls means nothing) - A lot of things are… wrong
  13. 13. https://notebooks.strikersecurity.com Authenticate, then wait once you see a green button. (Or install Jupyter yourself [using Anaconda], I’m not your boss.)
  14. 14. Jupyter: You got RCE on AWS today!
  15. 15. Let’s explore this data a little, and learn about Jupyter and Python in the process. You might like: https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/
  16. 16. Street Fighting Machine Learning: the art of educated guessing and opportunistic problem solving (blatantly stolen from Sanjoy Mahajan)
  17. 17. Disclaimers
  18. 18. Feature engineering Algorithm selection Training
  20. 20. Coming up with features is difficult, time-consuming, requires expert knowledge. "Applied machine learning" is basically feature engineering. — Andrew Ng
  21. 21. What is feature engineering? Pick some attributes (“features”) of the data that you’ll feed into the model
  22. 22. “A feature is a piece of information that might be useful for prediction. Any attribute could be a feature, as long as it is useful to the model.” - Wikipedia
  23. 23. Feature features - Meaningful
  24. 24. Feature features - Meaningful - Independent
  25. 25. Feature features - Meaningful - Independent - Accurate
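
To make "pick some attributes" concrete, here is a rough sketch of turning raw call records into one feature row per phone. The record layout (caller, callee, ISO timestamp), the phone numbers, and the three features are assumptions for illustration; the real challenge data may look different.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (caller, callee, ISO timestamp of call start).
records = [
    ("555-0001", "555-1000", "2017-07-28T09:15:00"),
    ("555-0001", "555-1000", "2017-07-28T21:40:00"),
    ("555-0001", "555-2000", "2017-07-29T10:05:00"),
    ("555-0002", "555-3000", "2017-07-28T23:30:00"),
]


def extract_features(records):
    """Build a dict of simple per-caller features from outgoing calls."""
    by_caller = defaultdict(list)
    for caller, callee, ts in records:
        by_caller[caller].append((callee, datetime.fromisoformat(ts)))

    features = {}
    for caller, made in by_caller.items():
        callees = {callee for callee, _ in made}
        # "Night" is an arbitrary illustrative choice: 20:00 through 05:59.
        night = sum(1 for _, when in made if when.hour >= 20 or when.hour < 6)
        features[caller] = {
            "calls_made": len(made),
            "distinct_callees": len(callees),
            "night_fraction": night / len(made),
        }
    return features


features = extract_features(records)
for phone, row in features.items():
    print(phone, row)
```

Note that `calls_made` and `distinct_callees` will usually be correlated, so this set is not a great example of "independent" features; it is only meant to show the mechanics.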
  26. 26. Wait, back up… why bother selecting? Simpler is better: - Easier to understand
  27. 27. Wait, back up… why bother selecting? Simpler is better: - Easier to understand - Less compute power
  28. 28. Wait, back up… why bother selecting? Simpler is better: - Easier to understand - Less compute power - “Curse of dimensionality”
  29. 29. Wait, back up… why bother selecting? Simpler is better: - Easier to understand - Less compute power - “Curse of dimensionality” - Easier to generalize
  30. 30. This is the most important step.
  31. 31. … so let’s try it!
  32. 32. Feature engineering Algorithm selection Training
  33. 33. Classification vs. Regression vs. Clustering
  34. 34. Unsupervised vs. Supervised
  35. 35. Unsupervised: Clustering Detecting similarity What’s similarity? Well…
  36. 36. Unsupervised: Clustering Detecting similarity What’s similarity? Well… distance (“on graph”)
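
A small sketch of similarity as distance "on graph": treat each phone's features as a point and measure straight-line (Euclidean) distance between points. The two-feature vectors below are invented for illustration.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Hypothetical feature vectors: (distinct_callees, night_fraction)
phone_a = (3.0, 0.25)
phone_b = (3.0, 0.30)
phone_c = (9.0, 0.90)

d_ab = euclidean(phone_a, phone_b)
d_ac = euclidean(phone_a, phone_c)

# A smaller distance means the phones look more alike on these features.
print(d_ab < d_ac)
```

Because the features are on different scales here, the callee count dominates the distance; that is one more reason feature choice matters.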
  39. 39. Mean Shift Two step program: 0. Put some ‘centroids’ on your graph Repeat until done: 1. Compute the “kernel density” over a window around each centroid 2. Move each centroid to the maximum density
  40. 40. Mean Shift Two step program: 0. Put some ‘centroids’ on your graph Repeat until done (convergence): 1. Compute the “kernel density” over a window around each centroid 2. Move each centroid to the maximum density
  41. 41. Mean Shift Two step program: 0. Put some ‘centroids’ on your graph Repeat until done (convergence: things stopped changing): 1. Compute the “kernel density” over a window around each centroid 2. Move each centroid to the maximum density
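
The loop above can be sketched in plain Python. This is a simplified version under stated assumptions: it uses a flat kernel (every point inside the window counts equally), seeds one centroid at each data point, and moves each centroid to the mean of the points in its window, which is the usual way mean shift climbs toward denser regions. The `bandwidth`, `tol`, and sample points are made up; in practice a library implementation such as scikit-learn's `MeanShift` is the more sensible choice.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_of(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def mean_shift(points, bandwidth, tol=1e-6, max_iter=100):
    # Step 0: put some centroids on the graph (here, one per data point).
    centroids = [tuple(p) for p in points]

    for _ in range(max_iter):
        moved = 0.0
        new_centroids = []
        for c in centroids:
            # Step 1: look at the points inside the window around the centroid.
            window = [p for p in points if euclidean(p, c) <= bandwidth]
            # Step 2: shift the centroid to the mean of that window.
            target = mean_of(window) if window else c
            moved = max(moved, euclidean(c, target))
            new_centroids.append(target)
        centroids = new_centroids
        # Convergence: things stopped changing.
        if moved < tol:
            break

    # Centroids that ended up close together are treated as the same cluster.
    modes = []
    for c in centroids:
        if not any(euclidean(c, m) < bandwidth / 2 for m in modes):
            modes.append(c)
    return modes


points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
modes = mean_shift(points, bandwidth=1.0)
print(len(modes), "clusters found")
```

Unlike K-means, the number of clusters is not chosen up front; it falls out of the window size, so `bandwidth` is the parameter to experiment with.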
  42. 42. Visualizations: https://www.naftaliharris.com/blog/visualizing-k-means-clustering/ http://stanford.edu/class/ee103/visualizations/kmeans/kmeans.html (note these are of K-means, not mean shift - same basic process, easier to visualize)
  43. 43. But keep in mind... It’s all about feature engineering.
  44. 44. But keep in mind... It’s all about feature engineering. And data.
  45. 45. Feature engineering Algorithm selection Training
  46. 46. Let’s just do it!
  47. 47. Resources http://greenteapress.com/thinkstats/ http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ https://github.com/hangtwenty/dive-into-machine-learning http://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
  48. 48. And now, it’s your turn.
  49. 49. Thank you! No, seriously.
  50. 50. One last thing...
  51. 51. Dakota Nelson @jerkota dakota@strikersecurity.com
