Shady Paths: Leveraging Surfing Crowds to Detect Malicious Web Pages

Slides of my talk at CCS 2013

Transcript

  • 1. Shady Paths: Leveraging Surfing Crowds to Detect Malicious Web Pages
    Gianluca Stringhini, Christopher Kruegel, and Giovanni Vigna
    University of California, Santa Barbara
  • 2. The Web is a Dangerous Place
      • Drive-by downloads
      • Social engineering
  • 3. Current Detection Techniques
      • Static analysis: suspicious elements in URLs, JavaScript, Flash
      • Dynamic analysis: visit the web page (honeyclients) and look for signs of exploitation
      • Obfuscation, cloaking
    Can only detect attacks that exploit vulnerabilities!
  • 4. Our Technique
  • 5. Redirection Graphs
    No need to analyze the final page!
    By analyzing the characteristics of the set of visitors and of the redirection graph, we can determine if the destination page is malicious
  • 6. Legitimate Uses of Redirections
      • Inform that a web page has moved
      • Login functionalities
      • Advertisements
    We cannot flag all redirections as malicious
    Luckily, malicious redirection graphs look different
  • 7. Malicious Redirection Graphs
    Uniform software configuration
  • 8. Malicious Redirection Graphs
    Cross-domain redirections (evil.co.cc, malicious.ru)
  • 9. Malicious Redirection Graphs
    “Hubs” to aggregate traffic
  • 10. Malicious Redirection Graphs
    “Infected” websites
  • 11. System Overview
  • 12. Our System: SpiderWeb
    We leverage the differences between legitimate and malicious redirection graphs for detection
    Three components:
      • Data collection
      • Creation of redirection graphs
      • Classification component
  • 13. Data Collection
    SpiderWeb needs a set of navigation data from a diverse population of users
    Dataset obtained from a large AV vendor
      • Users of a browser security tool
      • Data collection was opt-in only
      • Data was anonymized
  • 14. Creation of Redirection Graphs
    (Figure: example redirection chains among a.com, b.com, c.com, and d.com)
    When we specify the final page, we allow wildcards (e.g., malicious.com/*) → Groupings
    We need to discard groupings that are too general
  • 15. Classification Component
    Five categories of features
      • Client features (3 features)
      • Referrer features (4 features)
      • Landing page features (4 features)
      • Final page features (5 features)
        (how diverse are these elements: distinct URLs, parameters, TLD, domain is an IP)
      • Redirection graph features (12 features)
        (length of chains, same country across referrer and final page, intra-domain redirections, hubs)
    We use Support Vector Machines for classification
  • 16. Evaluation
  • 17. Evaluation Dataset
    388,098 redirection chains, collected over two months
      • 34,011 final URLs
      • 13,780 distinct user IP addresses per week
      • 145 countries
    Labeled dataset for training
      • 2,533 redirection chains leading to 1,854 malicious URLs
      • 2,466 redirection chains leading to 510 legitimate URLs
  • 18. Analysis of the Classifier
    SpiderWeb’s performance depends on the redirection graph complexity
      • Complexity ≥ 6 causes no false positives (FPs) and no false negatives (FNs)
      • Our dataset is limited → we discard graphs with complexity < 4
    We need to accept a certain amount of FPs and FNs
    Full URL grouping: 1.2% FP rate, 17% FN rate
    Redirection-graph specific features are the most important: without them, FNs rise to 67%
  • 19. Detection in the Wild
    3,549 redirection graphs with complexity ≥ 4
    564 flagged as malicious → 3,368 URLs
    778 URLs undetected by the AV vendor
      • We could not confirm 1.5% of them
      • Effectively complements state of the art
  • 20. Comparison with Previous Work
    A few previous systems leverage redirection information to detect malicious web pages
    These systems also use other types of information
      • WarningBird: uses Twitter profile information
      • SURF: SEO specific
    If this additional information is not present, SpiderWeb outperforms previous systems
  • 21. Possible Use Cases
    Offline detection (blacklist)
    Online detection
      • Users get infected until the required “complexity” is reached
      • We performed a chronological experiment
      • SpiderWeb would have protected 93% of users
  • 22. Discussion
    Limitations
      • Graphs with high complexity are required
      • Groupings are not perfect
      • Attackers might redirect users to legitimate pages
    Attackers might make their redirections look legitimate
      • Stop using cloaking (easier to detect by previous work)
      • Stop using hubs (raises the bar)
  • 23. Conclusions
      • We showed that malicious and legitimate redirection graphs differ
      • We presented a system that analyzes redirection graphs to detect malicious web pages
      • We showed that our system is effective, and complements existing systems
  • 24. Questions?
    gianluca@cs.ucsb.edu
    @gianlucaSB
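
Slide 14 describes merging redirection chains that lead to the same final page, with wildcards allowed when specifying that page. The Python sketch below illustrates one possible version of that grouping step. The host-plus-`/*` grouping key, the function names, and the example URLs are assumptions made for illustration; the slides do not say how SpiderWeb builds its groupings or how it discards groupings that are too general.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def grouping_key(final_url):
    """Hypothetical grouping: keep the host and replace the path with a wildcard."""
    return f"{urlsplit(final_url).hostname}/*"


def build_graphs(chains):
    """Merge chains whose final pages share a grouping key into one graph.

    Each chain is a list of URLs ordered from referrer to final page.
    Each graph is stored as a set of (source, destination) edges.
    """
    graphs = defaultdict(set)
    for chain in chains:
        key = grouping_key(chain[-1])
        # Consecutive URLs in a chain form the directed edges of the graph.
        graphs[key].update(zip(chain, chain[1:]))
    return dict(graphs)


# Made-up chains that all end on d.com, loosely following the slide's figure.
chains = [
    ["http://a.com/", "http://c.com/", "http://d.com/x"],
    ["http://b.com/", "http://c.com/", "http://d.com/y"],
    ["http://c.com/", "http://d.com/z"],
]
graphs = build_graphs(chains)
for key, edges in graphs.items():
    print(key, len(edges), "edges")
```

With this key, the three chains fall into a single grouping even though their final URLs differ, which is the effect the wildcard on slide 14 is meant to have.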
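
Slide 15 names feature categories and a few examples but does not define them precisely. The sketch below computes a handful of toy features in the spirit of those examples (distinct URLs, TLD, domain is an IP, length of chains, intra-domain redirections, hubs). The feature names, the hub rule, and the sample chains are hypothetical and are not SpiderWeb's actual features; the resulting values would still have to be passed to a classifier such as an SVM.

```python
import ipaddress
from collections import Counter
from urllib.parse import urlsplit


def host(url):
    return urlsplit(url).hostname or ""


def is_ip(hostname):
    try:
        ipaddress.ip_address(hostname)
        return True
    except ValueError:
        return False


def toy_features(chains):
    """Compute illustrative features for one group of redirection chains."""
    finals = [chain[-1] for chain in chains]
    final_hosts = [host(url) for url in finals]
    # Count, per intermediate host, how many chains pass through it.
    through = Counter()
    for chain in chains:
        for h in {host(url) for url in chain[1:-1]}:
            through[h] += 1
    hops = [(host(a), host(b)) for chain in chains for a, b in zip(chain, chain[1:])]
    return {
        "distinct_final_urls": len(set(finals)),
        # TLD diversity is only meaningful for named hosts, so IPs are skipped.
        "distinct_final_tlds": len(
            {h.rsplit(".", 1)[-1] for h in final_hosts if not is_ip(h)}
        ),
        "final_domain_is_ip": any(is_ip(h) for h in final_hosts),
        "max_chain_length": max(len(chain) for chain in chains),  # in URLs
        "intra_domain_redirections": sum(a == b for a, b in hops),
        # Assumed hub rule: an intermediate host that every chain traverses.
        "has_hub": any(n == len(chains) for n in through.values()),
    }


chains = [
    ["http://blog.example/a", "http://hub.example/r?id=1", "http://evil.example/x"],
    ["http://shop.example/b", "http://hub.example/r?id=2", "http://203.0.113.7/y"],
    ["http://news.example/c", "http://news.example/go",
     "http://hub.example/r?id=3", "http://evil.example/z"],
]
features = toy_features(chains)
print(features)
```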
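
The FP and FN rates on slide 18 can be read with the usual definitions: false positives divided by all legitimate samples, and false negatives divided by all malicious samples. The sketch below assumes those standard definitions, which the slides do not spell out, and uses made-up labels rather than the talk's data.

```python
def error_rates(labels, predictions):
    """Return (fp_rate, fn_rate); True means malicious in both lists."""
    pairs = list(zip(labels, predictions))
    false_pos = sum(1 for truth, pred in pairs if not truth and pred)
    false_neg = sum(1 for truth, pred in pairs if truth and not pred)
    legitimate = sum(1 for truth in labels if not truth)
    malicious = sum(1 for truth in labels if truth)
    return false_pos / legitimate, false_neg / malicious


# Made-up ground truth: 4 malicious graphs followed by 5 legitimate ones.
labels = [True] * 4 + [False] * 5
predictions = [True, True, True, False, False, False, False, False, True]
fp_rate, fn_rate = error_rates(labels, predictions)
print(f"FP rate: {fp_rate:.0%}, FN rate: {fn_rate:.0%}")
```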
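
Slide 21 notes that, in online detection, users keep getting infected until a graph reaches the required complexity. The sketch below replays visits to one malicious page in chronological order and counts the visits that arrive after that point. The slides do not define complexity, so the number of distinct referrer hosts seen so far is used here purely as a stand-in; the function name and the visit data are also hypothetical.

```python
def protected_fraction(visits, required_complexity):
    """Fraction of visits that happen after the page could have been flagged.

    visits: chronological list of (user, referrer_host) for one malicious page.
    Stand-in assumption: complexity = number of distinct referrer hosts so far.
    """
    seen_referrers = set()
    protected = 0
    for _user, referrer in visits:
        if len(seen_referrers) >= required_complexity:
            protected += 1  # threshold reached before this visit
        seen_referrers.add(referrer)
    return protected / len(visits)


visits = [
    ("u1", "a.example"), ("u2", "b.example"), ("u3", "a.example"),
    ("u4", "c.example"), ("u5", "d.example"), ("u6", "d.example"),
    ("u7", "e.example"), ("u8", "a.example"),
]
print(protected_fraction(visits, required_complexity=4))
```

In this toy run the first five visitors are exposed while the graph is still too small, which mirrors the trade-off the slide describes between a high complexity threshold and early protection.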