TLScompare.org - Crowdsourcing Rules for HTTPS Everywhere
Presentation from the Workshop on Empirical Research Methods in Information Security (ERMIS), 2016


  1. TLScompare: Crowdsourcing Rules for HTTPS Everywhere
     Wilfried Mayer, Martin Schmiedecker
  2. Introduction
     • HTTPS: not used – although deployed
       ◦ Services do not enforce it
       ◦ Service operators could
         − TCP Port redirect
         − HSTS
     • HTTPS Everywhere: tackles this problem
       ◦ browser extension (client side)
       ◦ prefers HTTPS over HTTP
       ◦ depends on manually crafted rules
  3. HTTPS Everywhere Rules
     • Currently: Manually crafted
       ◦ <target host="example.org" />
       ◦ <rule from="^http:" to="https:" />
       ◦ <rule from="^http://([^/:@.]+\.)?torproject\.org/" to="https://$1torproject.org/" />
     • Rule validation is not trivial
       ◦ Equality
       ◦ Same-same, but different
  4. Similarity
     Dynamic content, different ads, ...
  5. Current Rules
     Files with 1 rule               13,871
     Files with 2–10 rules            4,496
     Files with more than 10 rules       44
     Total rules                     26,522
     Trivial rules                    7,528
     Without reference               19,267
  6. Automated Rule Generation
     “Is it possible to automatically generate these rules?” → Trivial
     “Are two pages similar?” → Complex
     Matching algorithms
     Crowdsourcing
  7. Methodologies
     [Diagram with three columns: Algorithm-based solution, Reality, Empirical test (Crowdsourcing); scale values 1.0, 0.5, 0.0]
  8. Use Cases for Empirical Testing
     • Validating algorithms
     • Validating existing rules
     • Evaluating edge cases
  9. TLScompare.org
  10. TLScompare.org
  11. TLScompare.org
      Admin panel
  12. TLScompare.org
      • Identify domains
        ◦ Internet-wide scanning techniques
        ◦ Filter HSTS or server-side redirects
        ◦ Alexa Top 1 million ranking
      • Test
        ◦ Store IP, Datetime, User Agent, result
        ◦ Session ID to filter bogus attempts
      • Evaluate results
      • If feasible: create minimal ruleset
  13. Results
      Equal                169
      Not Equal (Total)    358
      Dataset for existing HTTPS Everywhere rules

      Total attempts     2,600
      Total results      2,267
      Equal              1,688
      Not Equal            579
      Dataset for similar Alexa Top 10k domains
  14. Results
      Combination  Results  %
      0            842      66%
      1            443      34%
      00           612      53%
      10           148      13%
      11           394      34%
      000          67       58%
      100          6        14% (100 and 110 combined)
      110          10
      111          32       28%
      ...          ...      ...
      Multiple results to ensure data quality
  15. Discussion
      • Tool for manual creation
      • Commercial crowdsourcing platforms
      • Data quality
      • Reproducibility
  16. Discussion cont.
      • Future work
        ◦ Different algorithms
        ◦ Integrated algorithms
        ◦ Improve selection
        ◦ Fully automated generation
      • Valid HTTPS Everywhere rules
      • Increase HTTPS usage
  17. Contact
      wmayer@sba-research.org
      Images: Sean MacEntee, http://flickr.com/photos/18090920@N07/15944989872 (cc-by-2.0)
      EFF, https://www.eff.org/https-everywhere
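
To illustrate how a rewrite rule of the kind shown on slide 3 behaves, here is a minimal Python sketch that applies the torproject.org rule to a few URLs. It is not the extension's implementation. The function name `apply_rule` and the sample URLs are hypothetical, and the escaped dots in the pattern are an assumption, since the slide text appears to have lost its backslashes.

```python
import re

# "from" pattern of the torproject.org rule on slide 3.
# Assumption: the dots are escaped (\.), restored from the flattened slide text.
RULE_FROM = r"^http://([^/:@.]+\.)?torproject\.org/"

# "to" template of the same rule; "$1" in the rule file is written "\1" in Python.
RULE_TO = r"https://\1torproject.org/"


def apply_rule(url: str) -> str:
    """Hypothetical helper: rewrite the URL if the rule matches, else return it unchanged."""
    # An optional group that did not match is substituted as an empty string (Python 3.5+).
    return re.sub(RULE_FROM, RULE_TO, url, count=1)


if __name__ == "__main__":
    for sample in (
        "http://www.torproject.org/download/",
        "http://torproject.org/",
        "http://example.org/",
    ):
        print(sample, "->", apply_rule(sample))
```

The rule only rewrites the scheme and keeps the optional subdomain and the path. Whether the HTTP and HTTPS versions of a page actually show the same content is the separate question the deck addresses with crowdsourced comparisons.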
