Ban Spam? Yes we can!
Simple techniques to keep comment spam at bay.

                 Andrew Hedges
          http://andrew.hedges.name/

             December 26, 2008
You lock your bike, right?
•   Even a Kryptonite™ lock can be defeated

•   The point is to prevent “crimes of opportunity”

•   For this, simple techniques are as effective as complicated ones

Photo credit: thewashcycle.com
How do spammers work?
•   Itʼs an arms race; what prevents comment spam now
    might not work later

•   Automated form-submission ʼbots are dumb; they
    “succeed” by spamming thousands of sites

•   Human spammers: paid per submission, not likely to
    spend much time on sites with non-obvious barriers
Common Defenses
•   CAPTCHA

•   Bayesian filters

•   Registration/login

•   Comment moderation

•   Tricky JavaScript
                         Copyright 2003 by Randy Glasbergen
CAPTCHAs Suck
•   CAPTCHAs are annoying

•   Ones good enough to defeat computers defeat humans, too

•   They require workarounds to be accessible

Image: Facebook.com CAPTCHA, circa December 2008
Bayesian Filters Suck
•   Fuzzy logic is needed to determine whether a comment is spam, so it is less than 100% accurate

•   Akismet is probably the best-of-breed, but even it returns false positives

    “[T]he probability that an email is spam, given that it has certain words in it, is equal to the
    probability of finding those certain words in spam email, times the probability that any
    email is spam, divided by the probability of finding those words in any email…”

    Source: en.wikipedia.org/wiki/Bayesian_spam_filtering
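The quoted rule is Bayes' theorem, P(spam | words) = P(words | spam) × P(spam) / P(words). A minimal Python sketch of that arithmetic follows; the function name and the probabilities are made up for illustration and do not come from Akismet or any real filter.

```python
def spam_probability(p_words_given_spam: float, p_spam: float, p_words: float) -> float:
    """Bayes' rule: P(spam | words) = P(words | spam) * P(spam) / P(words)."""
    if p_words <= 0:
        raise ValueError("p_words must be positive")
    return p_words_given_spam * p_spam / p_words


# Illustrative numbers only (assumptions, not measurements):
# 60% of spam contains these words, half of all mail is spam,
# and 40% of all mail contains these words.
posterior = spam_probability(p_words_given_spam=0.6, p_spam=0.5, p_words=0.4)
print(round(posterior, 2))  # about 0.75, so roughly 1 in 4 such messages is not spam
```

Even with a fairly strong signal, the result is a probability rather than a verdict, which is where the false positives come from.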
Registering Sucks
•   I have no illusions about my popularity; one-time visitors are
    not going to register to comment on my blog

Source: attentionmax.com
Moderation Sucks
•   Penalizes real humans who want to see their pithy
    comment in pixels as soon as it is submitted

Source: thinplace.com
Relying on JavaScript Sucks
•   Some mobile user agents do not support JavaScript

•   Some Firefox users have the NoScript extension installed,
    especially my blogʼs target demographic: geeks

Source: noscript.net
My Ideal System
Balance between preventing spam and allowing unmoderated comments

•   No CAPTCHA

•   No Bayesian anything

•   No registration/login

•   No moderation

•   No reliance on JavaScript

•   No false positives, no false negatives

Source: zenlogistics.net
My Production System
•   Honeypot CAPTCHA

•   Hidden timestamp

•   Clearly state that links will be tagged with rel="nofollow"

•   Close comments after 15 days

As of December 2008, this system has been 100% effective. No false negatives. No false positives.

      See it in action at andrew.hedges.name/blog
Honeypot CAPTCHA
•   Hidden from human users

•   Sometimes filled in by ʼbots, sometimes filled in by human spammers

•   Reject the comment if any value is submitted for the field

    <style type="text/css">
    .captcha {display: none}
    </style>
    <div class="captcha">
      What is 5 + 3?
      <input type="text" name="captcha">
    </div>
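The server-side half of the honeypot can be sketched in a few lines of Python. The field name "captcha" comes from the markup on this slide; the function name and the idea of receiving the submitted form as a dict of strings are assumptions, since the slides do not show the author's server code.

```python
def honeypot_triggered(form: dict, field: str = "captcha") -> bool:
    """Return True when the hidden honeypot field carries any value.

    `form` is assumed to map submitted field names to string values.
    A missing field is treated as empty, i.e. not triggered.
    """
    return form.get(field, "") != ""


# A real visitor never sees the field, so it arrives empty.
print(honeypot_triggered({"comment": "Nice post!", "captcha": ""}))   # False
# A bot that fills in every input gives itself away.
print(honeypot_triggered({"comment": "Buy now", "captcha": "8"}))     # True
```

Note that even the correct answer, 8, is grounds for rejection: the check is whether anything was entered at all.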
Hidden Timestamp
•    Automated spam ʼbots either submit comment forms
     very quickly or cache them and spam repeatedly

•    Reject comments posted fewer than 30 seconds or
     more than 24 hours after the form was served


    <input type="hidden" name="when" value="<?=time()?>">
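A sketch of the matching server-side check, in Python. The 30-second and 24-hour limits are the ones stated on this slide and "when" is the field from the markup; the function name and the injectable `now` parameter are assumptions added for illustration.

```python
import time
from typing import Optional

MIN_AGE_SECONDS = 30             # from the slide: reject if faster than this
MAX_AGE_SECONDS = 24 * 60 * 60   # from the slide: reject if older than this


def timestamp_ok(when: str, now: Optional[float] = None) -> bool:
    """Return True if the form was served between 30 seconds and 24 hours ago.

    `when` is the raw value of the hidden field (Unix seconds as text).
    `now` can be passed in for testing; it defaults to the current time.
    """
    try:
        served_at = int(when)
    except (TypeError, ValueError):
        return False  # missing or tampered value
    if now is None:
        now = time.time()
    age = now - served_at
    return MIN_AGE_SECONDS <= age <= MAX_AGE_SECONDS


print(timestamp_ok("1000", now=1010))  # False: submitted after 10 seconds
print(timestamp_ok("1000", now=1090))  # True: submitted after 90 seconds
```

A plain timestamp can be rewritten by a determined spammer before submitting, so this only stops the unsophisticated cases the slide describes.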
rel="nofollow"
•   Clearly state that links will be tagged with rel="nofollow"

•   Not a deterrent to real people who have something to say

          If you spam for a living, please be aware that all links in comments will be
          tagged with rel="nofollow". This means spamming my blog will not help
          your Google PageRank. Spam kills. Just say no.

    <a rel="nofollow" href="http://example.com">V1@gr@!</a>
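One way to produce links like the one above is to build every comment link through a single helper that always adds the attribute. This is a minimal Python sketch; the helper name is hypothetical and the slides do not say how the author's blog generates its markup.

```python
from html import escape


def nofollow_link(url: str, text: str) -> str:
    """Render a comment link that search engines are asked not to follow.

    Both the URL and the link text are HTML-escaped so that a comment
    cannot break out of the attribute or inject markup.
    """
    return '<a rel="nofollow" href="{}">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )


print(nofollow_link("http://example.com", "V1@gr@!"))
# <a rel="nofollow" href="http://example.com">V1@gr@!</a>
```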
Close comments after 15 days
 •   Prevents blog posts from becoming comment spam
     graveyards and presents fewer targets for spammers

             Comments close in 15 days.

             Comments close in 5 days. Dawdle not!

             Comments closed. Have something to say? Drop me a line!
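The three messages above could be generated from the post date along these lines. The 15-day window is from the slide; the 5-day threshold for adding "Dawdle not!" is an assumption inferred from the sample text, and the function name is hypothetical.

```python
from datetime import date

COMMENT_WINDOW_DAYS = 15   # from the slide
NUDGE_THRESHOLD_DAYS = 5   # assumption: when "Dawdle not!" starts to appear


def comment_status(published: date, today: date) -> str:
    """Return the notice to show under a post published on `published`."""
    days_left = COMMENT_WINDOW_DAYS - (today - published).days
    if days_left <= 0:
        return "Comments closed. Have something to say? Drop me a line!"
    message = "Comments close in {} day{}.".format(
        days_left, "" if days_left == 1 else "s"
    )
    if days_left <= NUDGE_THRESHOLD_DAYS:
        message += " Dawdle not!"
    return message


print(comment_status(date(2008, 12, 1), date(2008, 12, 11)))
# Comments close in 5 days. Dawdle not!
```

The same `days_left <= 0` test would also be applied when a comment is submitted, so that a cached copy of an old form cannot be used to post.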
A little sugar on top…
•   Donʼt tell the spammers their post has been rejected,
    just that itʼs been “moderated”

•   Help real humans avoid being moderated by using
    JavaScript to enable the submit button only when itʼs
    legal to post

•   My system emails me with each successful comment
    submission so I can catch false negatives quickly
Next steps
•   Did I mention itʼs an arms race?

•   Expect your system to be defeated; be ready with next
    steps

•   Gibberish form field names? Hash of timestamp + entry
    ID + salt? Something else?
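The slide only floats the hash idea as a question, so the following is one possible reading rather than the author's implementation. It swaps the plain "hash of timestamp + entry ID + salt" for a keyed HMAC-SHA256 from Python's standard library, which serves the same purpose: the server can tell whether the hidden timestamp was altered. The function names and the secret value are hypothetical.

```python
import hashlib
import hmac

SECRET_SALT = b"change-me"  # hypothetical server-side secret, never sent to the browser


def sign_form(timestamp: int, entry_id: int, salt: bytes = SECRET_SALT) -> str:
    """Return a token to embed in the form next to the hidden timestamp."""
    message = "{}:{}".format(timestamp, entry_id).encode("utf-8")
    return hmac.new(salt, message, hashlib.sha256).hexdigest()


def token_valid(token: str, timestamp: int, entry_id: int,
                salt: bytes = SECRET_SALT) -> bool:
    """Check that the submitted token matches the submitted timestamp and entry."""
    expected = sign_form(timestamp, entry_id, salt)
    # Constant-time comparison; both sides are encoded to bytes first.
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))


token = sign_form(1230000000, 42)
print(token_valid(token, 1230000000, 42))  # True: untouched form
print(token_valid(token, 1230000060, 42))  # False: timestamp was changed
```

Because the token covers the entry ID as well, a form harvested from one post cannot be replayed against another.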
Summary
•   Comment spam is a “crime of opportunity,” that is,
    spammers go for easy targets first

•   Most strategies and tactics currently used on
    commercial blog software suck because they either
    deter humans or sometimes let spam through

•   Simple techniques such as honeypot CAPTCHAs and
    hidden timestamps appear to be highly effective in
    combatting comment spam…for now
Is it progress?
•   I welcome your feedback on my strategy and tactics at
    andrew@hedges.name

•   I wasnʼt the first to think of these ideas. Here are some
    of my sources of inspiration:

    •   http://nedbatchelder.com/text/stopbots.html

    •   http://haacked.com/archive/2007/09/11/honeypot-captcha.aspx

Defeating Comment Spam
