Detecting a Hacked Tweet with Machine Learning


Full article: http://primaryobjects.com/CMS/Article158.aspx

On April 23, 2013, the stock market experienced one of its biggest flash-crash drops of the year, with the Dow Jones Industrial Average falling 143 points (over 1%) in a matter of minutes. Unlike the 2012 stock market blip, this one wasn't caused by an individual trade, but by a single tweet from the Associated Press (AP) account on Twitter. The tweet, of course, wasn't written by AP, but by an imposter who had temporarily gained control of the account. Considering the impact of real-time messaging services such as Twitter, what if it were possible to detect that the tweet was hacked?

In this presentation, we'll discuss how to use machine learning and "big data" analysis to mine large amounts of information and extract meaningful relationships from it. In particular, we'll walk through a prototype machine learning example that attempts to classify tweets as having been authored by AP or not. We'll examine learning curves to see how they help validate machine learning algorithms and models. As a final test, we'll run the program on the hacked tweet and see whether it correctly classifies the tweet as authentic or hacked.
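
To make the learning-curve idea concrete, here is a minimal sketch in Python. It is not the presentation's code: the nearest-centroid classifier and the synthetic two-cluster data are stand-ins chosen so the example runs with the standard library only, and every function name (`make_blobs`, `nearest_centroid_fit`, `learning_curve`, and so on) is hypothetical.

```python
import random


def make_blobs(n_per_class, seed):
    """Synthetic stand-in data: two 2-D Gaussian clusters labeled 0 and 1."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 4.0)):
        for _ in range(n_per_class):
            point = [rng.gauss(center, 1.0), rng.gauss(center, 1.0)]
            data.append((point, label))
    rng.shuffle(data)
    X = [row for row, _ in data]
    y = [label for _, label in data]
    return X, y


def nearest_centroid_fit(X, y):
    """Toy classifier (assumption, not an SVM): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def dist2(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)


def learning_curve(X_train, y_train, X_cv, y_cv, sizes):
    """For each training size m, fit on the first m examples and record
    (m, training accuracy, cross-validation accuracy)."""
    points = []
    for m in sizes:
        model = nearest_centroid_fit(X_train[:m], y_train[:m])
        train_acc = accuracy(model, X_train[:m], y_train[:m])
        cv_acc = accuracy(model, X_cv, y_cv)
        points.append((m, train_acc, cv_acc))
    return points


X_train, y_train = make_blobs(100, seed=0)
X_cv, y_cv = make_blobs(50, seed=1)
for m, train_acc, cv_acc in learning_curve(X_train, y_train, X_cv, y_cv, [10, 50, 200]):
    print(f"m={m:3d}  training={train_acc:.2%}  cv={cv_acc:.2%}")
```

Read this way, a large and persistent gap between training and cross-validation accuracy hints at over-fitting, while two curves that converge at a low accuracy hint at under-fitting.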

Published in: Technology, Education

Detecting a Hacked Tweet with Machine Learning

  1. DETECTING A HACKED TWEET with Machine Learning and Artificial Intelligence
     Sponsored by
  2. AI IS GOOD
  3. APRIL 23, 2013 1:15PM: 143 POINT DROP
  4. MACHINE LEARNING ALGORITHMS
     • Supervised
       • Linear Regression
       • Logistic Regression
       • Support Vector Machines
       • Neural Networks
     • Unsupervised
       • K-Means Clustering
       • Principal Component Analysis (Dimensionality Reduction)
  5. LINEAR REGRESSION
  6. LOGISTIC REGRESSION
  7. LOGISTIC REGRESSION: Linear Classification
  8. SUPPORT VECTOR MACHINE: Non-Linear Classification
  9. SUPPORT VECTOR MACHINE: Gaussian Kernel
  10. POP QUIZ!
  11. QUESTION 1: SUPERVISED OR UNSUPERVISED?
     • You are designing an agent for The Matrix.
     • Its task is to classify people that are threats to the system.
     • Feature Set:
       • IQ
       • Level of Education
       • Age
       • # of Times They Watched the Movie The Matrix
     • Training Set of 100,000 people: 50k threats, 50k non-threats
  12. QUESTION 2: SUPERVISED OR UNSUPERVISED?
     • You are designing the brain of a battle robot.
     • Its primary attack is hand-to-hand combat. Your task is to find the most effective move combos.
     • Feature Set:
       • # of Punches
       • # of Head-butts
       • # of Kicks
       • # of Leg Sweeps
     • Training Set of 100,000 winning battles
  13. TWEET ANALYSIS PROJECT SETUP
     • Accord .NET
     • Support Vector Machine
     • 2 Classes
     • Gaussian Kernel, Sigma = 2
     • Let’s get down to business!
  14. ALL YOUR DATA ARE BELONG TO US
     • Extract Tweets with TweetSharp
     • Create Document Corpus (6,054 tweets)
     • Create Vocabulary (2,225 words)
     • Digitize Corpus
       • Term Frequency Inverse Document Frequency (TF*IDF)
       • Porter-Stemmer (“talking” => “talk”, “explosion” => “explos”)
       • Word Existence
     • Vector Size = Vocabulary Size | Matrix = double[6054][2225]
  15. ACCURACY: 99.74% TRAINING, 96.22% CROSS VALIDATION
  16. CAN WE DO BETTER THAN THAT?
     • Changed from Gaussian to Linear Kernel
  17. ACCURACY: 100% TRAINING, 97.21% CV, 95.10% TEST
  18. LOOKIN’ PRETTY GOOD
  19. ACCURACY: 100% TRAINING, 97.38% CV, 96.23% TEST
  20. SO .. DID IT WORK OR WHAT?
     • Initial training set contained random tweets from AP and non-AP
     • Correctly classified AP tweets, but failed on hacked tweet
     • Let’s try a teensy-weensie bit of over-fitting
       • “-from:AP obama”
       • “-from:AP breaking”
       • “-from:AP explosions”
  21. ITTTTTTTTT’S DEMO TIME!
  22. COLOR CLASSIFICATION
     • Red
     • Green
     • Blue
  23. COUNTING PIXELS: 91.0% TRAINING, 93.6% CV
     NOT BAD, BUT .. =)
  24. SVM: 100% TRAINING, 94.2% CV
  25. NEURAL NETWORK: 100% TRAINING, 97.6% CV
  26. ITTTTTTTTT’S DEMO TIME!
  27. CONCLUSION
     • Image Classification
     • Sentiment Analysis
     • Prediction
     • Recommender Systems
     • Clustering
     • Deep Learning
     • Surpassing human intelligence?
     Detecting a Hacked Tweet with Machine Learning: http://primaryobjects.com/CMS/Article158.aspx
     An Intelligent Approach to Image Classification By Color: http://primaryobjects.com/CMS/Article154.aspx
     Self-Programming Artificial Intelligence: http://primaryobjects.com/CMS/Article149.aspx
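
The digitization step on slide 14 (build a vocabulary, then turn each tweet into a vector as long as the vocabulary) can be sketched roughly as below. This is an illustrative Python sketch, not the Accord.NET/TweetSharp code behind the slides: the tokenizer, the unsmoothed `tf * log(N / df)` weighting, and all function names are assumptions, and the Porter stemming step is omitted for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (assumed tokenizer, no stemming)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(corpus):
    """Map each distinct token in the corpus to a column index."""
    words = sorted({tok for doc in corpus for tok in tokenize(doc)})
    return {word: index for index, word in enumerate(words)}


def existence_vector(doc, vocab):
    """Word-existence encoding: 1.0 if the word occurs in the document, else 0.0."""
    present = set(tokenize(doc))
    return [1.0 if word in present else 0.0 for word in vocab]


def tfidf_matrix(corpus, vocab):
    """One row per document, one column per vocabulary word.
    Each cell is tf * idf with tf = count / document length and
    idf = log(N / document frequency); a word found in every
    document therefore gets weight 0."""
    tokenized = [tokenize(doc) for doc in corpus]
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    n_docs = len(corpus)
    rows = []
    for tokens in tokenized:
        row = [0.0] * len(vocab)
        for word, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[word])
            row[vocab[word]] = tf * idf
        rows.append(row)
    return rows


# Made-up example tweets, not taken from the presentation's corpus.
corpus = [
    "Breaking: explosions reported",
    "Obama speaks today",
    "Breaking news today",
]
vocab = build_vocabulary(corpus)
matrix = tfidf_matrix(corpus, vocab)
print(len(matrix), "documents x", len(vocab), "vocabulary words")
```

With the deck's figures, the same layout would give 6,054 rows of 2,225 values each, matching the `double[6054][2225]` matrix on the slide.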
