Birds, bots and machines:
      Detecting fraud in Twitter using Machine Learning




Vicente Díaz
Senior Security Analyst,
Global Research and Analysis
Expectations vs reality
Why Twitter?
Spam - email
[Chart: vertical axis from 0.00 to 90.00]
Using hacked accounts
Anything else interesting?




          #PalabrasNeciasMovistarSorda
Getting profiles
A random campaign
Lifespan of bots
Detour – A few words on privacy
Tracking
Advanced tracking
Identify the user:
  • Passive data: headers, plugins, browser, OS
  • JS: screen resolution, custom resource detection via the Plugins API (e.g. printers via PDF, fonts via Flash)
Track ID:
  • Cookies, Flash cookies (allow cross-domain references), HTML5 storage, Silverlight
  • Java: own download cache; applets can read embedded resource streams

Future? Apps and games in social networks.
Let's play
Experiment

• 3 months of tracking
• 36 malicious campaigns
• 13,490 profiles
• 195,801 tweets
• 6,519,247 relationships
Machine Learning in 60 seconds
• Supervised learning
• Training – adaptive models
• Classification

• Key: choose the right attributes
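
The slide does not say which learning algorithm was used, so the following is only a minimal sketch of the supervised train-then-classify loop, using a nearest-centroid classifier as a stand-in. The two features and all the numbers are hypothetical, hand-made values for illustration, not data from the experiment.

```python
from math import dist
from statistics import fmean


def train(samples, labels):
    """Training: learn one centroid (mean feature vector) per class label."""
    by_label = {}
    for features, label in zip(samples, labels):
        by_label.setdefault(label, []).append(features)
    return {
        label: [fmean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def classify(model, features):
    """Classification: pick the label whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical labelled profiles:
# [share of tweets with a link, following/followers ratio]
training_samples = [
    [0.95, 20.0], [0.90, 35.0], [0.85, 15.0],  # labelled "bot"
    [0.10, 1.2], [0.25, 0.8], [0.05, 2.0],     # labelled "human"
]
training_labels = ["bot", "bot", "bot", "human", "human", "human"]

model = train(training_samples, training_labels)
prediction = classify(model, [0.80, 18.0])
print(prediction)
```

The features are left unscaled here to keep the sketch short; with real data the attribute with the largest range would dominate the distance, which is one reason the choice of attributes matters.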
Feature selection

Twitter attributes: username, coordinates, description, profileImg, lang, followingCount, followersCount, url, createdAt, tweetsCount, fullName, timeZone, verified, following, followers, numberOfProfileTweets, protected, text, possiblySensitive, source, location

Derived features: meanTimeBetweenTweets, friendFollowerRatio, tweetsKnownRecv, tweetsUnknownRecv, percFollowingFollowers, percProfileTweetsWithLink, percProfileTweetsToSomeone, percProfileTweetsRT, numberOfViasUsed

• Curse of dimensionality
• No new knowledge is generated: choose the right features!
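
The slide names the derived features but does not define them, so the sketch below shows one plausible way a few of them could be computed from raw profile data. The formulas are assumptions: the ratio is taken as following over followers, a "tweet to someone" is one that starts with "@", and a retweet is one that starts with "RT @". The function name `derive_features` and the sample tweets are hypothetical.

```python
from datetime import datetime
from statistics import fmean


def derive_features(following_count, followers_count, tweets):
    """Compute a few derived features; tweets is a list of (datetime, text)."""
    times = sorted(when for when, _ in tweets)
    gaps = [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
    texts = [text for _, text in tweets]
    total = len(texts)

    def share(predicate):
        """Percentage of tweets for which predicate(text) is true."""
        if total == 0:
            return 0.0
        return 100.0 * sum(1 for text in texts if predicate(text)) / total

    return {
        # Assumed definition: accounts followed per follower.
        "friendFollowerRatio": following_count / max(followers_count, 1),
        # Mean gap between consecutive tweets, in seconds.
        "meanTimeBetweenTweets": fmean(gaps) if gaps else 0.0,
        "percProfileTweetsWithLink": share(
            lambda s: "http://" in s or "https://" in s),
        "percProfileTweetsToSomeone": share(lambda s: s.startswith("@")),
        "percProfileTweetsRT": share(lambda s: s.startswith("RT @")),
    }


# Hypothetical profile: four tweets, one minute apart.
sample_tweets = [
    (datetime(2013, 1, 1, 10, 0), "@alice check this http://example.com/a"),
    (datetime(2013, 1, 1, 10, 1), "RT @bob: so hot"),
    (datetime(2013, 1, 1, 10, 2), "hot summer"),
    (datetime(2013, 1, 1, 10, 3), "@carol http://example.com/b"),
]
features = derive_features(160, 40, sample_tweets)
print(features)
```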
Mean time between tweets
Tweets to someone

After some testing and feature-selection algorithms:

numberOfVias
tweetsToSomeone
tweetsWithLink
followingFollowers
friendFollowerRatio
tweetsKnownReceiver
tweetsUnknownReceiver
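
The slide does not name the feature-selection algorithms that produced this list. As a rough illustration of the general idea only, the sketch below ranks features with a simple filter score: the gap between the class means, scaled by the overall spread. The score, the function names and the sample values are all assumptions made for this example.

```python
from statistics import fmean, pstdev


def separation_score(values, labels):
    """|mean(bots) - mean(humans)| divided by the overall standard deviation."""
    bots = [v for v, y in zip(values, labels) if y == 1]
    humans = [v for v, y in zip(values, labels) if y == 0]
    spread = pstdev(values)
    return abs(fmean(bots) - fmean(humans)) / spread if spread else 0.0


def rank_features(rows, labels):
    """Return (feature name, score) pairs, best-separating feature first."""
    scores = {
        name: separation_score([row[name] for row in rows], labels)
        for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: 1 = bot, 0 = human.
rows = [
    {"tweetsWithLink": 90, "tweetsCount": 100},
    {"tweetsWithLink": 95, "tweetsCount": 300},
    {"tweetsWithLink": 85, "tweetsCount": 200},
    {"tweetsWithLink": 10, "tweetsCount": 150},
    {"tweetsWithLink": 20, "tweetsCount": 250},
    {"tweetsWithLink": 5, "tweetsCount": 200},
]
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_features(rows, labels)
print(ranking)
```

In this made-up data tweetsCount has the same mean in both classes, so it scores zero and would be dropped, while tweetsWithLink separates the classes and is kept.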
Avoiding detection



         You are doing it wrong!
Avoiding semantic analysis
• if its do you me your my do it my be find is but on are its rt
  that was

• I a me at get out your they on rt if I get rt can a

• u you rt find in I that that your my my find one you so is is
  my you this but get all a one its it

• they with its your get me of I
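
The sample tweets above consist almost entirely of very common words. One way such text might be flagged, offered purely as an illustrative sketch and not as something the talk describes, is to measure the share of tokens that appear in a stop-word list. The word list and the function name `stop_word_ratio` are assumptions.

```python
# Hypothetical, deliberately small stop-word list for illustration.
STOP_WORDS = {
    "a", "all", "are", "at", "be", "but", "can", "do", "i", "if", "in",
    "is", "it", "its", "me", "my", "of", "on", "out", "so", "that",
    "they", "this", "was", "with", "you", "your",
}


def stop_word_ratio(text):
    """Fraction of whitespace-separated tokens that are stop words."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in STOP_WORDS for token in tokens) / len(tokens)


filler_tweet = "they with its your get me of I"
ordinary_tweet = "Mmmm hot chocolate with cream"

print(stop_word_ratio(filler_tweet))
print(stop_word_ratio(ordinary_tweet))
```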
Avoiding relationship checks

Or just overflow with fake profiles …
DIY
Finding malicious profiles
• Not so hard …
1 week later…
AdrianaDickson7
MyrtleTerry11
PatricaFitzpat6
RobertP97792514
RochelleBeasle8
ShannonMunoz13

• 5,200 profiles in this campaign
• Around 250 new profiles created every day
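
The six usernames listed share a visible shape: a capitalised first name, a capitalised (possibly truncated) surname, and trailing digits. As a sketch only, a regular expression can test for that shape. The pattern is an assumption inferred from these six examples, not a rule given in the talk, and it would also match many legitimate accounts.

```python
import re

# Assumed shape: Capitalised word, capital letter plus optional lowercase
# letters, then one or more digits (e.g. "MyrtleTerry11").
GENERATED_NAME = re.compile(r"[A-Z][a-z]+[A-Z][a-z]*\d+")


def looks_generated(username):
    """True if the whole username matches the assumed pattern."""
    return GENERATED_NAME.fullmatch(username) is not None


campaign_names = [
    "AdrianaDickson7", "MyrtleTerry11", "PatricaFitzpat6",
    "RobertP97792514", "RochelleBeasle8", "ShannonMunoz13",
]
matches = [looks_generated(name) for name in campaign_names]
print(matches)
print(looks_generated("trompi"))
```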
Following
[Chart: Following, vertical axis 0 to 180, horizontal axis 0 to 300]
Followers
[Chart: Followers, vertical axis 0 to 200, horizontal axis 0 to 300]
Top tweets sent (1,800 different tweets)
• Mmmm hot chocolate with cream
• Beyonce looks so hot in her new ad
• So Hot
• Spain !! Too hot
• hot summer
• a hot bubble bath is much needed
• Tea water supposed to hot ya now
• Air conditioner-laying on the bed-naked-relax-heaven! So hot tonight!
• playing piano and guitar r the only things i can do right in life does this make me hot enough for a boyfriend yet
• Austin mahone is just like another justin beiber..he is hot tho!
Not only limited to Twitter
Conclusions
          It is relatively easy to find anomalies

Bots are there for different reasons, mostly fraud-related

          Machine learning: lots of resources!
Thank you
               Questions?


Vicente Díaz          @trompi


Senior Security Analyst,
Global Research and Analysis

Birds, Bots and Machines - fraud in Twitter and machine learning

  • 1. Birds, bots and machines: Detecting fraud in Twitter using Machine Learning Vicente Díaz Senior Security Analyst, Global Research and Analysis
  • 3.
  • 6.
  • 7.
  • 8.
  • 11. Anything else interesting? #PalabrasNeciasMovistarSorda
  • 12. Anything else interesting? #PalabrasNeciasMovistarSorda
  • 13.
  • 18.
  • 19.
  • 21.
  • 22.
  • 23.
  • 24.
  • 25. Detour – A few words on privacy
  • 27. Identify the user: Advanced tracking Passive data: headers, plugins, browser, OS JS: screen resolution, custom resource detection via Plugins API (i.e. printers via PDF, fonts via Flash, etc.) Track ID Cookies, Flash cookies (allow cross-domain references), HTML5 storage, silverlight Java: own download cache, applets can read embedded resource streams Future? Apps and games in social networks.
  • 28.
  • 30. Experiment • 3 months of tracking • 36 malicious campaigns • 13,490 profiles • 195,801 tweets • 6,519,247 relationships
  • 31. Machine Learning in 60 seconds • Supervised learning • Training – adaptative models • Classification • Key: choose the right attributes
  • 33. Feature selection • Curse of dimensionality • No new knowledge is generated: choose the right features! • Twitter attributes: username, fullName, description, location, url, lang, timeZone, createdAt, verified, protected, profileImg, coordinates, followingCount, followersCount, tweetsCount, following, followers, text, source, possiblySensitive • Derived attributes: meanTimeBetweenTweets, friendFollowerRatio, tweetsKnownRecv, tweetsUnknownRecv, percFollowingFollowers, numberOfProfileTweets, percProfileTweetsWithLink, percProfileTweetsToSomeone, percProfileTweetsRT, numberOfViasUsed
  • 36. Tweets to someone • After some testing and feature-selection algorithms: numberOfVias, tweetsToSomeone, tweetsWithLink, followingFollowers, friendFollowerRatio, tweetsKnownReceiver, tweetsUnknownReceiver
  • 39. Avoiding detection You are doing it wrong!
  • 40. Avoiding semantic analysis • if its do you me your my do it my be find is but on are its rt that was • I a me at get out your they on rt if I get rt can a • u you rt find in I that that your my my find one you so is is my you this but get all a one its it • they with its your get me of I
  • 42. Avoiding relationship checks Or just overflow with fake profiles …
  • 43. DIY
  • 46. 1 week later… • 5200 profiles in this campaign • Around 250 new profiles created every day • Example profiles: AdrianaDickson7, MyrtleTerry11, PatricaFitzpat6, RobertP97792514, RochelleBeasle8, ShannonMunoz13
  • 47. [Chart: Following]
  • 48. [Chart: Followers]
  • 49. Top tweets sent (1800 different tweets) • Mmmm hot chocolate with cream • Beyonce looks so hot in her new ad • So Hot • Spain !! Too hot • hot summer • a hot bubble bath is much needed • Tea water supposed to hot ya now • Air conditioner-laying on the bed-naked-relax-heaven! So hot tonight! • playing piano and guitar r the only things i can do right in life does this make me hot enough for a boyfriend yet • Austin mahone is just like another justin beiber..he is hot tho!
  • 52. Not only limited to Twitter
  • 55. Conclusions • It is relatively easy to find anomalies • Bots are there for different reasons, mostly fraud-related • Machine learning: lots of resources!
  • 59. Thank you Questions? Vicente Díaz @trompi Senior Security Analyst, Global Research and Analysis
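
The derived attributes on slides 33 and 36 are listed by name only, so the deck does not give their exact definitions. As a minimal sketch, assuming plausible definitions (for example, that friendFollowerRatio is following divided by followers, and that a "known" receiver is an account the profile already has a relationship with), they could be computed from a profile's most recent tweets as below. The `Tweet` and `Profile` classes and the `derived_features` function are hypothetical names for illustration, not the author's code and not the Twitter API. The default window of 20 tweets follows the experiment described in the speaker notes.

```python
from dataclasses import dataclass, field
from datetime import datetime
from statistics import mean


@dataclass
class Tweet:
    text: str
    created_at: datetime
    source: str  # client used to post the tweet (the "via")


@dataclass
class Profile:
    followers_count: int
    following_count: int
    tweets: list = field(default_factory=list)


def derived_features(profile, known_handles, window=20):
    """Derive illustrative attributes from the last `window` tweets.

    Every definition below is an assumption; the slides only give names.
    `known_handles` is a set of lowercase handles the profile is related to.
    """
    tweets = sorted(profile.tweets, key=lambda t: t.created_at)[-window:]
    n = len(tweets)

    # Seconds elapsed between consecutive tweets.
    gaps = [(later.created_at - earlier.created_at).total_seconds()
            for earlier, later in zip(tweets, tweets[1:])]

    # Assumption: a tweet "to someone" is one that starts with an @mention.
    receivers = [t.text.split()[0][1:].rstrip(":,").lower()
                 for t in tweets if t.text.startswith("@")]
    known = sum(1 for r in receivers if r in known_handles)

    def share(count):
        return count / n if n else 0.0

    return {
        # Assumption: "friends" are the accounts this profile follows.
        "friendFollowerRatio": profile.following_count / max(profile.followers_count, 1),
        "meanTimeBetweenTweets": mean(gaps) if gaps else 0.0,
        "percProfileTweetsWithLink": share(sum("http" in t.text for t in tweets)),
        "percProfileTweetsToSomeone": share(len(receivers)),
        "percProfileTweetsRT": share(sum(t.text.startswith("RT ") for t in tweets)),
        "tweetsKnownRecv": known,
        "tweetsUnknownRecv": len(receivers) - known,
        "numberOfViasUsed": len({t.source for t in tweets}),
    }
```

The resulting dictionary is the kind of feature vector a supervised classifier would be trained on. Ratios and percentages are used rather than raw counts so that profiles with different activity levels stay comparable.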

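Slides 46 and 49 hint at simpler anomaly checks that need no trained model: the handles in the campaign share a shape, and the tweets share a word. The sketch below illustrates three such checks under stated assumptions. The regular expression is only a guess at the handle shape seen on slide 46 (capitalised first name, capitalised surname fragment, trailing digits), and `looks_generated`, `keyword_share` and `group_by_description` are hypothetical helper names, not the author's tooling.

```python
import re
from collections import defaultdict

# Assumed handle shape: "AdrianaDickson7", "RobertP97792514", ...
HANDLE_PATTERN = re.compile(r"^[A-Z][a-z]+[A-Z][A-Za-z]*\d+$")


def looks_generated(handle):
    """Return True if the handle matches the assumed campaign naming shape."""
    return bool(HANDLE_PATTERN.match(handle))


def keyword_share(tweets, keyword):
    """Fraction of tweets containing `keyword` as a whole word, ignoring case."""
    if not tweets:
        return 0.0
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return sum(1 for text in tweets if pattern.search(text)) / len(tweets)


def group_by_description(profiles):
    """Group handles whose profile descriptions are identical after normalising.

    `profiles` is an iterable of (handle, description) pairs. Only groups
    with at least two handles are returned.
    """
    groups = defaultdict(list)
    for handle, description in profiles:
        key = " ".join(description.lower().split())
        groups[key].append(handle)
    return {key: handles for key, handles in groups.items() if len(handles) > 1}
```

A naming pattern alone will also match some legitimate accounts, so checks like these are better treated as hints for finding candidate profiles than as a verdict.
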
Editor's Notes

  1. Today I'm going to talk about fraud in Twitter and machine learning. The historical problem with ML, a field of AI, is expectations. In our collective imagination we envision The Terminator, The Matrix and Ghost in the Shell. What I want to do with this presentation is show how AI may be used in a much simpler way for the daily problems that we, as researchers, face. Using AI in security is not new, but for some reason I think we underuse it. I hope that after this talk everybody will be more interested in this topic and will learn how to use it on a regular basis.
  2. I just want to stress that we often use these techniques in security for very interesting stuff. But it always looks like something big and difficult to apply to more mundane problems. That's where I hope this talk can help everybody, through an example.
  3. OK, so let's start. I've already said we will apply machine learning to detect fraud. In this case, we detect fraud in Twitter. Why Twitter? I don't think it is necessary to stress why Twitter is relevant these days; here you can see some numbers about how big it is. But Twitter also has some other interesting features for a researcher: all the data is public and easily accessible, and information about profiles is easy to obtain through a convenient API (note this changed last year and now you should use OAuth). Also, Twitter messages are short, which helps in case you want to analyze the contents.
  4. So what's the problem with Twitter? Where is the fraud? One of the problems is the level of spam that social networks are reaching. Playing with big numbers is never easy and we only have a partial view of the big picture, but from our data we see how the level of email spam started decreasing a couple of years ago. At the same time, the level of spam increased in social networks. The reasons are understandable: we learned how to detect spam in email messages and we ignore them. We also have a lot of software doing the filtering for us. But in the case of social networks, spammers get a better ROI, as people still get confused and open everything that is sent to them. Protection mechanisms are also not so well established. I remember reading something about how old email spammers were moving to buying legitimate ads in social networks, as the ROI was bigger.
  5. We have some data from Twitter about their levels of spam. This is always a bit tricky, because their figures are what they detect, so we may think that either they improved their detection mechanisms or failed to detect new stuff. No fresh data is available yet, but we see how Twitter started to get serious about that. Still, the problem exists, and keep in mind that 1% of 175 million tweets is 1.75 million spam messages a day! As for the media, this is what I meant when I said nobody has all the data. Still, why send this spam? Are people buying Viagra through Twitter? We will talk a bit about this later.
  6. Well, we have many examples of spam being sent on Twitter, but not so many of malware. Why not? Spam is still a grey area. We will see that in the examples later, but many times it is very difficult to say whether a campaign is malicious or not, or legal or not, so it's not easy for the social network to shut down the account. But when it comes to malware distribution, everything is clear: security researchers and the AV industry are quick to investigate and shut everything down. We should understand that there are basically two techniques for this: creating new bots or hijacking existing accounts. Both methods have pros and cons for attackers and researchers.
  8. Hacked accounts may have many uses, but one of them is to get more accounts!
  10. Well, what else do we have in Twitter? Has anyone ever heard of what is called digital marketing? Basically, it consists of a bunch of people whose work is to create strange ratios based on followers, trends and stuff, and show them to their bosses. One of their most used tools is Twitter. They try to get as many followers as possible to spread the company's message. But they also know how important the influence of other people in social networks is for promoting their message. In the past it was the marketing guy going to Wikipedia's website to change what other people said about them; to their shame, we could see that through the IPs they used. Nowadays a lot of digital marketing companies promote their message using fake profiles in social networks, like in this example. In this case accounts are not hijacked, as this would be big trouble for the company behind it. Also, these profiles may not be against the Terms of Service of the social network, but they are totally against its interests: the networks want real people, and that's why they have lately been asking for complementary data (such as the telephone number in Google) and shutting down fake profiles (in Facebook).
  12. Some accounts may be abused by hacktivists as well, as when defacing a website. These cases are rarer, the accounts are hijacked in a more selective and unique way and, as such, this is not very interesting from a global perspective. Still, their impact may be very significant if the account is not quickly detected as malicious!
  13. We have seen how it is interesting for attackers both to create fake accounts and to hijack legitimate ones. So, how to create trouble in Twitter? There are different methods for malicious activity. Basically, attackers can create new profiles or hack existing ones. For this last method they can: steal it, as with any victim of any malware; brute-force the account; delete the hash from Twitter.
  16. So let me show you some examples and details on how this works with a real campaign.
  17. In this case it all started last summer, when I sent a tweet about Battlefield and got a reply from this nice girl, which usually never happens to me. I immediately got suspicious and started looking into it. I discovered several other profiles, basically all of them consisting of nice girls sending messages like the one I received to guys like me. These messages were on different topics: Xbox, iPhone, MacBook Pro, Victoria's Secret, etc. Here you can see a collage I created with some of the profile pictures I found.
  19. One of the most interesting things to notice is that all of these were one-shot bots. The lifespan of almost half of them was less than 45 minutes! Another interesting thing to notice is that these bots were doing some semantic analysis of their victims. That's a real improvement for Twitter bots compared to email spam bots, where you have no knowledge of the victim. In this case you can try to get the attention of your victim by offering him something related to his interests.
  20. Well, basically, after some redirections (the first through a fake blog) you landed on a page like this one where, depending on your IP and your ZIP code, you were asked for your email to enter a lottery. This is not the typical Viagra website, so I was wondering: how was the spammer making money here? (Explain the campaign on the following slides and how the ad industry works.)
  21. So let me say a few words on what is happening with our privacy these days and how tracking works.
  22. Flash cookies are used to rewrite cookies. From "How Unique Is Your Web Browser?" we get 86% unique fingerprints. Sometimes plugins are used to bypass the content of blocked sites. JS code simulates user interaction to bypass third-party cookie restrictions. Related to this, cross-domain postMessage support is used to pass cookies between coordinating sites and store them in localStorage! Initiatives such as DoNotTrack (an HTTP header) are being completely ignored.
  23. The same applies at an enterprise level: do we know who else Google provides access to our data to? It is a multibillion industry; data is cross-checked with real-life information and finally sold. To whom? What about governments?
  24. Ok, so once we know how it works I decided to check whether it was possible to detect all these nasty malicious campaigns and I decided to do my own experiment.
  25. There is also a surprisingly good result for detecting hacked accounts. In this case I believe the reason is that many of the features are related to the tweets sent and, in my experiment, I only considered the last 20 tweets sent, so it's a limited time window.
  27. So we have seen that the key is choosing the right features for machine learning to be effective and detect the malicious profiles. Keep in mind this is an experiment, and in real life we might find that our training subset does not cover all possible variables. However, creators of fake profiles keep this in mind in order to avoid being detected. Many features may be easily forged to avoid detection; that's why derived features, such as the ones that have to do with relationships with other profiles, are more solid, but they can still be evaded. Let's take a look at what attackers are doing in order to avoid detection.
  28. Creating neighborhoods, however, has some implicit risk: it is easy to shut down the whole group once detected. I haven't yet seen hacked accounts being used to create relationships with bots in order to make campaigns harder to detect; maybe we will see this in the future.
  30. Usually bots use the same messages and URLs. URLs are difficult to search for because of shorteners, but in some cases all of the bots use the same phrases, and this way you can locate them. As you can see here, it is common for them to reuse the same profile picture many times, which is another way to detect them. You can just search in Twitter (using the API or the website).
  31. In this example you can see how reused profile pictures are another trick for finding them. You can also see how they use the same profile description here, so it is quite easy to get many other profiles of this campaign just by looking for the email they use in it.
  32. We can also see how the non-deleted bots are reused in different campaigns, changing some parameters to adjust for the new one and also to avoid detection. There are more pictures, but I got tired… Also, some other accounts were suspended during this time. Another hint for suspicious profiles is in the name of the profile itself. You can see how all of them here follow a given pattern.
  33. Analyzing these profiles, we see that they are extremely easy to detect. Still, they survive thanks to brute force.
  35. Do you see any pattern? 1800 tweets include the word "hot".
  37. This is one of my favourites sites to use to find stuf