Two Years of Short URL Internet Measurement: security threats and countermeasures

I presented the results of this research work at #www2013.

Abstract: URL shortening services have become extremely popular. However, it is still unclear whether they are an effective and reliable tool that can be leveraged to hide malicious URLs, and to what extent these abuses can impact the end users. With these questions in mind, we first analyzed existing countermeasures adopted by popular shortening services. Surprisingly, we found such countermeasures to be ineffective and trivial to bypass. This first measurement motivated us to proceed further with a large-scale collection of the HTTP interactions that originate when web users access live pages that contain short URLs. To this end, we monitored 622 distinct URL shortening services between March 2010 and April 2012, and collected 24,953,881 distinct short URLs. With this large dataset, we studied the abuse of short URLs. Although short URLs are a significant new security risk, in accordance with the reports resulting from the observation of the overall phishing and spamming activity, we found that only a relatively small fraction of users ever encountered malicious short URLs. Interestingly, during the second year of measurement, we noticed an increased percentage of short URLs being abused for drive-by download campaigns and a decreased percentage of short URLs being abused for spam campaigns. In addition to these security-related findings, our unique monitoring infrastructure and large dataset allowed us to complement previous research on short URLs and analyze these web services from the user's perspective.

Published in: Technology, Business

  1. Two Years of Short URL Internet Measurement • http://flic.kr/phretor • Federico Maggi, Alessandro Frossi, Stefano Zanero, Gianluca Stringhini, Brett Stone-Gross, Christopher Kruegel, Giovanni Vigna
  2. How short URLs work: creating an alias. The user submits a long URL (http://example.com/very/long/?url=to&the=landing-page) to the URL shortening service ("make me shorter"). A random suffix is generated and the short URL (http://ab.cd/d73fYfz) is returned.
  3. How short URLs work: resolving an alias. The browser requests the short URL (http://ab.cd/d73fYfz) from the URL shortening service, whose redirection mechanism (HTTP 302, HTML META refresh, JavaScript, or ActionScript) points to the long URL (http://example.com/very/long/...). Redirection happens on the browser's side.
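
The two steps on slides 2 and 3 can be illustrated with a minimal in-memory sketch. It is not the implementation of any real service: the class `ToyShortener`, its method names, the 7-character suffix length, and the 404 answer for unknown aliases are all assumptions made for illustration. Only the random suffix, the mapping table, and the HTTP 302 redirect come from the slides.

```python
import secrets
import string

# Characters allowed in the random suffix (an assumption; real services vary).
ALPHABET = string.ascii_letters + string.digits


class ToyShortener:
    """Hypothetical in-memory stand-in for a URL shortening service."""

    def __init__(self, domain="ab.cd", suffix_len=7):
        self.domain = domain
        self.suffix_len = suffix_len
        self.aliases = {}  # mapping table: suffix -> long URL

    def shorten(self, long_url):
        # Draw random suffixes until an unused one is found, then store the alias.
        while True:
            suffix = "".join(
                secrets.choice(ALPHABET) for _ in range(self.suffix_len)
            )
            if suffix not in self.aliases:
                break
        self.aliases[suffix] = long_url
        return f"http://{self.domain}/{suffix}"

    def resolve(self, short_url):
        # Answer as an HTTP 302 redirect would: a status plus a Location header.
        suffix = short_url.rsplit("/", 1)[-1]
        long_url = self.aliases.get(suffix)
        if long_url is None:
            return 404, {}
        return 302, {"Location": long_url}
```

The browser, not the service, follows the `Location` header, which is why the slide notes that redirection happens on the browser's side.
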
  4. From a security point of view: a perfect means for masquerading malicious URLs • Trivially evade naïve filters • Robust to email clients that break long URLs into multiple lines • Trendy effect (e.g., Twitter, Facebook) • Users have grown accustomed to short URLs • Example: http://i.am/harmless hides http://evil.com/very/long/and/suspicous/path
  5. Do shortening services take security measures?
  6. Do shortening services take security measures? • Prepared lists of malicious long URLs: malware (e.g., drive-by download) from Wepawet, spam from Spamhaus, phishing from PhishTank • Performed 3 small measurement experiments: creation, resolution, periodic checks
  7. Creation of malicious short URLs • Submitted them for shortening to the top 6 services (as of 2010): bit.ly, durl.me, goo.gl, is.gd, migre.me, tinyurl.com • Do they accept malicious long URLs?
  8. Diagram: the long URL (http://example.com/very/long/?url=to&the=evil.com) is submitted to the URL shortening service ("make me shorter"). Does the service check a blacklist before answering "sure, here is your short URL"?
  9. Malicious long URLs accepted by top services
                      Malware          Phishing         Spam
     Service          #       %        #       %        #       %
     bit.ly           2,000   100.0    2,000   100.0    2,000   100.0
     durl.me          1,999    99.9    1,987    99.4    1,976    98.8
     goo.gl*          2000     99.9      994    99.4    1,000   100.0
     is.gd            1,854    92.7    1,834    91.7      364    18.2
     migre.me         1,738    86.9    1,266    63.3    1,634    81.7
     tinyurl.com      1,959    99.5    1,935    96.8      587    29.4
     Overall          9,550    95.5    9,022    90.2    6,561    65.6
  10. Resolution of malicious short URLs • Resolved the malicious short URLs with the same services • Do they resolve malicious URLs?
  11. Diagram: the short URL (http://i.am/so-evil) is submitted to the URL shortening service ("please resolve this"). The service consults its mapping table or blacklist and answers "sure, here is your long URL".
  12. Malicious short URLs resolved by top services
      Service        Malware   Phishing   Spam
      bit.ly          0.05      11.3       0.0
      durl.me         0.0        0.0       0.0
      goo.gl*        66.4       96.9      78.7
      is.gd           1.08       2.27      0.8
      migre.me        0.86      14.0       0.0
      tinyurl.com     0.66       0.7       2.04
      Overall        21.45      26.39     31.38
  13. Periodic checks on existing short URLs • Created a dynamic landing page (i.e., long URL) • Submission time (day zero): serve benign content (e.g., text & images) • ...
  14. Diagram: the dynamic long URL (http://our.server/dynamic-page.php) is submitted to the URL shortening service ("make me shorter"). The blacklist check ("is this long URL malicious?") answers "No!", and the service returns http://ab.cd/good-today.
  15. ...we changed the content of the page after 1 week • ... • Resolution time (each subsequent week): serve malicious content • Do services periodically check and sanitize existing short URLs?
  16. Diagram: the short URL (http://ab.cd/good-today) is submitted to the URL shortening service ("please resolve this"). The check against the aliases DB ("is the long URL malicious?") still answers "No!", and the service returns http://our.server/dynamic-page.php.
  17. Deferred malicious short URLs survived
      Threat      Shortened   Blocked   Not Blocked
      Malware      5,000      20%       80%
      Phishing     5,000      20%       80%
      Spam         5,000      20%       80%
      Overall     15,000      20%       80%
      Slide label: durl.me
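
The gap probed by the periodic-check experiment can be sketched as follows. This is a hypothetical model, not the logic of any measured service: `CheckedShortener`, `sweep`, and the `is_malicious` callable (standing in for a blacklist lookup or a content check) are names invented here. The sketch only shows why a check performed at creation time misses a landing page that turns malicious later, and what a periodic re-check of existing aliases would add.

```python
class CheckedShortener:
    """Hypothetical shortener that checks long URLs against a verdict source."""

    def __init__(self, is_malicious):
        self.is_malicious = is_malicious  # callable: long URL -> bool (assumed)
        self.aliases = {}                 # suffix -> long URL
        self.blocked = set()              # suffixes disabled by a later sweep

    def shorten(self, suffix, long_url):
        # Creation-time check: refuse URLs that are malicious right now.
        if self.is_malicious(long_url):
            return None
        self.aliases[suffix] = long_url
        return suffix

    def resolve(self, suffix):
        # Resolution performs no fresh check; it only honors earlier sweeps.
        if suffix in self.blocked:
            return None
        return self.aliases.get(suffix)

    def sweep(self):
        # Periodic re-check of every existing alias.
        newly_blocked = {
            suffix
            for suffix, long_url in self.aliases.items()
            if suffix not in self.blocked and self.is_malicious(long_url)
        }
        self.blocked |= newly_blocked
        return newly_blocked
```

In the scenario of slides 13 to 16, the page is benign at day zero, so `shorten` accepts it. Once the page turns malicious, `resolve` keeps returning it until `sweep` runs.
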
  18. This motivated us to start a measurement experiment to collect short URLs
  19. A different perspective: what is the impact on users? • Most previous work used Twitter as a source of short URLs • Types of questions • Do users stumble upon malicious short URLs very often? • Are users aware of malicious short URLs? • What kind of short URLs do users typically encounter? • User-centered measurement.
  20. Measurement infrastructure. Diagram elements: collectors (users), JS, our resolver, short URLs (http://ab.cd/4jaYas), landing page.
  21. Diagram: a container page (http://example.com/container) contains several short URLs: http://ab.cd/sfb4Ac, http://ab.cd/asd31A, http://ab.cd/5aD3B9, http://ab.cd/419E9s.
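
To make the notion of a container page concrete, here is a minimal sketch of how short URLs could be pulled out of a page's HTML. It is an assumption-laden illustration, not the collectors' actual code (which the slides only describe as JS running on the users' side). The shortener list, the class `ShortUrlExtractor`, and the choice to look only at `<a href>` attributes are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Small illustrative list; "ab.cd" is the placeholder domain used on the slides.
KNOWN_SHORTENERS = {"bit.ly", "t.co", "tinyurl.com", "goo.gl", "ab.cd"}


class ShortUrlExtractor(HTMLParser):
    """Collects href values of <a> tags whose host is a known shortener."""

    def __init__(self, shorteners):
        super().__init__()
        self.shorteners = shorteners
        self.found = []  # distinct short URLs, in order of first appearance

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                host = urlparse(value).hostname or ""
                if host in self.shorteners and value not in self.found:
                    self.found.append(value)


def extract_short_urls(html, shorteners=KNOWN_SHORTENERS):
    parser = ShortUrlExtractor(shorteners)
    parser.feed(html)
    return parser.found
```

Each extracted short URL, paired with the address of the container page, would correspond to one log entry in the sense of the dataset statistics.
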
  22. Biased measurements? • We do not ask a user to become a collector • We want users to subscribe as collectors spontaneously • How? We provide a useful service!
  23. Collected data between 2010 and 2012 • Total 7,000 distinct users (estimate from 1,370,277 distinct IPs) • about 500 to 1,000 active users per day • about 20,000 to 50,000 short URLs sent each day (100,000 peaks) • 24,953,881 distinct short URLs collected overall
  24. Dataset statistics (geolocation of the submitters)
  25. Dataset statistics (top services)
      Distinct URLs               Log entries
      10,069,846  bit.ly          24,818,239  bit.ly
       4,725,125  t.co            12,054,996  t.co
       1,418,418  tinyurl.com      5,649,043  tinyurl.com
         816,744  ow.ly            2,188,619  goo.gl
         800,761  goo.gl           2,053,575  ow.ly
         638,483  tumblr.com       1,214,705  j.mp
         597,167  fb.me            1,159,536  fb.me
         584,377  4sq.com          1,116,514  4sq.com
         517,965  j.mp             1,066,325  tumblr.com
         464,875  tl.gd            1,045,380  is.gd
  26. Do users stumble upon malicious short URLs very often?
  27. Malicious short URLs encountered by submitters
      Blacklist        Phishing   Malware   Spam
      Spamhaus         -          -         10,306
      Phishtank        7          -         -
      Wepawet          -          6,057     -
      Safe Browsing    913        2,405     -
      Figure: top sources (social networks, blogs, news, overall baseline, chatrooms, webmails, online IMs).
  28. Aliasing of malicious URLs. Diagram: several short URLs (http://ab.cd/sfb4Ac, http://ab.cd/asd31A, http://ab.cd/5aD3B9, http://ab.cd/419E9s, ...) point to the same malicious page.
  29. Aliasing of malicious pages using short URLs. x = #distinct short URL aliases per landing page. Figure: CDFs for benign, spam, and malware URLs; (a) distinct short URLs per distinct malicious or benign landing URL from April 2010 to April 2011; (b) the same from April 2010 to April 2012. Slide annotations for the panels "Until first year" and "Until second year": spam pages more aliased than benign pages; spam pages not aliased as much as before.
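
The quantity plotted on slide 29 can be computed as sketched below. The input format (pairs of short URL and resolved landing URL) and both function names are assumptions, and the tiny sample in the usage note is made up.

```python
from collections import defaultdict


def aliases_per_landing_page(pairs):
    """Count distinct short URLs per landing URL.

    pairs: iterable of (short_url, landing_url) tuples (assumed input format).
    """
    aliases = defaultdict(set)
    for short_url, landing_url in pairs:
        aliases[landing_url].add(short_url)
    return {landing: len(shorts) for landing, shorts in aliases.items()}


def empirical_cdf(values):
    """Return sorted (x, fraction of values <= x) points, one per distinct x."""
    ordered = sorted(values)
    n = len(ordered)
    cdf = []
    for i, v in enumerate(ordered, start=1):
        if cdf and cdf[-1][0] == v:
            cdf[-1] = (v, i / n)  # same x: keep the highest cumulative fraction
        else:
            cdf.append((v, i / n))
    return cdf
```

For example, `empirical_cdf(aliases_per_landing_page(pairs).values())` yields the points of one curve. Running it separately on benign, spam, and malware landing pages would give the three curves of each panel.
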
  30. Dissemination of malicious short URLs. Diagram: the same short URL (http://ab.cd/sfb4Ac) appears on container page 1, container page 2, ..., container page N, alongside other short URLs.
  31. Spreading of malicious short URLs. x = #distinct container pages per short URL. Figure: CDFs for benign, spam, and malware URLs; (a) distinct short URLs per distinct malicious or benign landing URL from April 2010 to April 2011; (b) the same from April 2010 to April 2012; (c) distinct containers per distinct malicious or benign short URL from April 2010 to April 2011; (d) the same from April 2010 to April 2012. Slide annotations for the panels "Until first year" and "Until second year": spam short URLs almost indistinguishable from benign ones; spam short URLs spread on more container pages than benign ones.
  32. Total lifespan of a malicious short URL. x = #hours between the first and latest occurrence of a short URL. Figure: CDF of the lifespan (hours between first and latest occurrence of a short URL in our dataset) for benign short URLs, malicious short URLs, and malicious short URLs excluding the 3-month Apr-Jun spam campaign. • Malicious short URLs typically survive longer than benign ones. • We found a spam campaign (Storm botnet?) with 1,806 tinyurl.com short URLs (April 1st to June 30, 2010), deleted by tinyurl.com's admins after 3 months.
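
The lifespan metric on slide 32 (hours between the first and the latest occurrence of a short URL) can be derived from timestamped log entries as in this sketch. The log format of `(short_url, datetime)` pairs and the function name are assumptions.

```python
from datetime import datetime


def lifespans_in_hours(log_entries):
    """Hours between first and latest sighting of each short URL.

    log_entries: iterable of (short_url, seen_at) with seen_at a datetime
    (assumed log format).
    """
    first, last = {}, {}
    for short_url, seen_at in log_entries:
        if short_url not in first or seen_at < first[short_url]:
            first[short_url] = seen_at
        if short_url not in last or seen_at > last[short_url]:
            last[short_url] = seen_at
    return {
        url: (last[url] - first[url]).total_seconds() / 3600
        for url in first
    }
```

A short URL seen only once gets a lifespan of 0 hours, so this measures how long a URL kept appearing in the dataset, not how long the alias existed at the service.
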
  33. Conclusions • Do shortening services take enough countermeasures to protect the users? Some of them use blacklists but do not proactively check existing aliases. • Do users stumble upon malicious short URLs very often? Not very often: about 0.1% overall. • Are users aware of malicious short URLs? Not much: almost no one clicked on our "+flag as malicious" link. Also confirmed by [Onarlioglu et al., NDSS 2012].
  34. Questions? Federico Maggi • fede@maggi.cc • @phretor
  35. What happens when users click on a short URL?
  36. Ratio of inbound short URLs per category: $\rho = \frac{In(cat)}{In(cat) + Out(cat)}$. $\rho \to 0$: many outbound short URLs (aggregators, e.g., Twitter). $\rho \to 1$: many inbound short URLs (landing pages, e.g., news, blogs).
      r     Category            r     Category        r     Category           r     Category
      0.00  naturism            0.18  artnudes        0.36  weapons            0.75  shopping
      0.01  personalfinance     0.21  antispyware     0.36  cleaning           0.78  games
      0.01  do-it-yourself      0.23  drinks          0.37  dating             0.80  news
      0.03  pets                0.25  medical         0.39  vacation           0.82  government
      0.04  gardening           0.25  weather         0.40  religion           0.88  chat
      0.07  clothing            0.30  onlinegames     0.42  culinary           0.90  blog
      0.07  mail                0.32  jobsearch       0.45  filehosting        0.91  socialnetworking
      0.09  banking             0.33  sportnews       0.52  kidstimewasting    1.00  contraception
      0.12  abortion            0.33  gambling        0.55  ecommerce          1.00  childcare
      0.12  instantmessaging    0.36  drugs           0.67  adult              1.00  astrology
      0.13  jewelry             0.36  searchengines   0.68  audio-video        1.00  cellphones
      0.18  hacking             0.36  weapons         0.69  sports             1.00  onlineauctions
                                                                               1.00  onlinepayment
  37. Are the shortening services similar in terms of the content they alias?
  38. Content-specific vs. general-purpose services. Most popular shorteners are also general-purpose and cover a wide variety of categories. Excerpt from the paper shown on the slide: "[...] data we created a weighted digraph with 48 nodes, each corresponding to a category. The weights are the frequencies of change, calculated between each pair of categories, and averaged over all the short URLs and pages within each category; weights are between 10.19 and 39.41% and distributed as shown in Fig. 8. We then calculate the average weight of incoming, In(cat), and outgoing, Out(cat), edges for each category cat, and finally derive the ratio $r(cat) = \frac{In(cat)}{In(cat) + Out(cat)}$. When $r \to 0$, the category has a majority of outgoing short URLs (i.e., many container pages of such category), whereas $r \to 1$ [...]" Figure 7. Frequency of change of category (median with 25- and 75-percent quantiles) and number of categories covered (size of black dot) of the top 50 services. The most popular, general-purpose shortening services highlighted are characterized by an ample set of categories (close to 48, which [...] Axes: median % category drift (0 to 100); #categories covered (min. 0, max. 48). Services shown: trunc.it, om.ly, slidesha.re, moby.to, migre.me, wp.me, tumblr.com, lnk.ms, cot.ag, flic.kr, icio.us, tiny.ly, amzn.to, post.ly, youtu.be, p.tl, tcrn.ch, tl.gd, nyti.ms, ht.ly, alturl.com, tinysong.com, dld.bz, su.pr, ustre.am, t.co, j.mp, dlvr.it, fb.me, twurl.nl, goo.gl, ow.ly, ff.im, ping.fm, is.gd, tiny.cc, bit.ly, tinyurl.com, mash.to, ur1.ca, sqze.it, cort.as, shar.es, 4sq.com.
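
The ratio r(cat) described in the excerpt can be sketched as follows, taking In(cat) and Out(cat) to be the average weights of a category's incoming and outgoing edges. The edge representation (a dict keyed by category pairs), the function name, and the handling of categories with no edges on one side are assumptions. The category names and weights in the usage note are made up and are not taken from the dataset.

```python
from statistics import mean


def category_ratios(edges):
    """Compute r(cat) = In(cat) / (In(cat) + Out(cat)) for every category.

    edges: dict mapping (source_category, destination_category) -> weight,
    a hypothetical encoding of the weighted digraph.
    """
    incoming, outgoing = {}, {}
    for (src, dst), weight in edges.items():
        outgoing.setdefault(src, []).append(weight)
        incoming.setdefault(dst, []).append(weight)

    ratios = {}
    for cat in set(incoming) | set(outgoing):
        # A category with no edges on one side contributes 0 for that side.
        avg_in = mean(incoming[cat]) if cat in incoming else 0.0
        avg_out = mean(outgoing[cat]) if cat in outgoing else 0.0
        total = avg_in + avg_out
        ratios[cat] = avg_in / total if total else None
    return ratios
```

With the made-up edges `{("aggregator", "landing"): 10.0, ("aggregator", "mixed"): 30.0, ("mixed", "landing"): 10.0}`, the function gives 0.0 for `aggregator` (only outgoing edges), 1.0 for `landing` (only incoming edges), and 0.75 for `mixed`.
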
  39. Zoom on part of Figure 7 (frequency of change of category; axis: median % category drift). Services shown: ht.ly, alturl.com, tinysong.com, dld.bz, su.pr, ustre.am, t.co, j.mp, dlvr.it, fb.me, twurl.nl, goo.gl, ow.ly, ff.im, ping.fm, is.gd, tiny.cc, bit.ly, tinyurl.com, mash.to, ur1.ca, sqze.it, cort.as, shar.es, 4sq.com. Most popular shorteners are also general-purpose and cover a wide variety of categories.
  40. Zoom on part of Figure 7 (axes: median % category drift, 0 to 100; #categories covered, min. 0, max. 48). Services shown: trunc.it, om.ly, slidesha.re, moby.to, migre.me, wp.me, tumblr.com, lnk.ms, cot.ag, flic.kr, icio.us, tiny.ly, amzn.to, post.ly, youtu.be, p.tl, tcrn.ch, tl.gd, nyti.ms.
