Contents
NPD: Repeat/ Unique Visitor Identification
1. Product Intro & Goal
2. Who’s it for?
3. Why Build It
4. Desired Output of the product
5. Preconditions
6. Background & Short Description
7. Success Scenario for the product
Implementation Methodology
8. Attributes for Fingerprinting
8.1 User Agent String (HTTP Header)
8.2 HTTP Requests header
8.3 JavaScript Display Data
8.4 Plugin Data
8.5 HTML Canvas fingerprint
8.6 WebGL Rendering
8.7 System Fonts
8.8 Do Not Track Request
8.9 DNS/ TCP (network)
8.10 Timezone
9. Scenarios for the model
10. Observations with different devices, browsers/OS & Network
10.1 With a given browser and OS but different network connections
10.2 Same device but different browsers/ OS
11. Estimating weights for each attribute for the model
12. Suggested Methodology
13. Conclusion
14. Corner Case
15. Experimental
16. References for further reading
NPD: Repeat/ Unique Visitor Identification
1. Product Intro & Goal
A web-based solution that detects whether a given visitor to your website has visited it before.
2. Who’s it for?
Partner NBFCs (non-banking financial companies) with whom prospective borrowers fill out loan applications
3. Why Build It
This solution aims to detect, within the pre-login journey itself, whether a visitor is a unique visitor or a repeat
visitor. It can be used to detect loan application fraud, and by ad servers to identify users uniquely even after they
flush their cookies.
4. Desired Output of the product
Confidence score of a visitor being a re-visitor
List of previous visits with time-stamp for a repeat visitor
5. Preconditions
Cookies not allowed, outside compliance
6. Background & Short Description
When a client device connects to the web server hosting the website, it goes through several handshakes and
ends up sending network and application data. From this data the application server can gather and interpret
the device's geolocation, network connection, browser, operating system and hardware details to uniquely
identify the client (device).
Client-server connection flow
A load balancer directs HTTP/HTTPS requests to different server instances; it may or may not be content-aware.
During the negotiation between the originating browser and the hosting server, several request headers are
passed so that the server can respond with appropriate content. Network metadata can be processed at the
network level, or passed as a connection attribute to the hosting server and processed along with the HTML/CSS
and JavaScript data.
7. Success Scenario for the product
a. The product should predict, with a high confidence score, whether a user is a new visitor or
a returning user (H0: each user is a new visitor), with minimum Type I and Type II errors
b. Minimum false positives: a new visitor that the system predicts to be a repeat visitor is a failure
c. Minimum false negatives: a returning visitor that the system predicts to be a new visitor is also a failure
d. Minimum test time (the check should run quickly)
---------------------------------------------------------------------------------------------------------------------------------------------------
Implementation Methodology
8. Attributes for Fingerprinting
8.1 User Agent String (HTTP Header)
Identifies information regarding browser & operating system
A typical user-agent string looks like:
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"
8.2 HTTP Requests header
Accept request-header fields
Cookie device=d7834267-37fd-42ac-aa8c-1373aeebcf92; JSESSIONID=...
Host noc.to
Accept text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
If-None-Match "d7834267-37fd-42ac-aa8c-1373aeebcf92/2017-01-13-09:43:54.513"
Upgrade-Insecure-Requests 1
Accept-Language en-US,en;q=0.8
User-Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/53
Connection keep-alive
Accept-Encoding gzip, deflate, sdch
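As a minimal sketch of how the request headers above could be turned into a fingerprint on the server, the snippet below hashes a handful of headers. The choice of headers in `STABLE_HEADERS` and the function name `header_fingerprint` are assumptions made for illustration, not part of the original design; per-session headers such as Cookie are left out on purpose.

```python
import hashlib

# Assumption: these headers are treated as stable across visits.
# Per-session headers (Cookie, If-None-Match) are deliberately ignored.
STABLE_HEADERS = ("user-agent", "accept", "accept-language", "accept-encoding")


def header_fingerprint(headers: dict[str, str]) -> str:
    """Hash the stable request headers into one hex digest."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    # A fixed order and an explicit marker for missing headers keep the hash deterministic.
    parts = [f"{name}={lowered.get(name, '<missing>')}" for name in STABLE_HEADERS]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()


request = {
    "Host": "noc.to",
    "Accept-Language": "en-US,en;q=0.8",
    "Accept-Encoding": "gzip, deflate, sdch",
    "Connection": "keep-alive",
}
print(header_fingerprint(request))
```

Two requests that differ only in cookies or header-name casing map to the same digest, while any change in one of the selected headers produces a different one.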
8.3 JavaScript Display Data
Available if the browser can run JavaScript (enabled by default)
Screen W x H 1366 x 768
Available W x H 1366 x 728
Color Depth 24
Pixel Ratio 1
8.4 Plugin Data
Flash Plugin Data
Plugin Version WIN 24,0,0,194
Plugin Manufacturer Google Pepper
Language en
Operating System Windows 10
CPU Architecture x86
Supports 32-Bit Processes yes
Supports 64-Bit Processes yes
Screen Resolution 1366 x 768
LSO Storage Test passed
Pixel Aspect Ratio 1
Screen DPI 72
AV Hardware Disabled no
File Read Disabled no
Has Printing yes
Has Accessibility yes
Has Audio yes
Has MP3 yes
Has Embedded Video yes
Has Screen Broadcast no
Has Screen Playback no
Has Streaming Audio yes
Has Streaming Video yes
Has Audio Encoder yes
Has Video Encoder yes
Has Input Method Editor yes
Max Level IDC 5.1
Player Type PlugIn
Is Debugger no
Has Transport Layer Security yes
Navigator Plugin List
Name Chrome PDF Viewer
Filename mhjfbmdgcfjbbpaeojofohoefgiehjai
Name Chrome PDF Viewer
Description Portable Document Format
Filename internal-pdf-viewer
Name Native Client
Filename internal-nacl-plugin
Name Shockwave Flash
Description Shockwave Flash 24.0 r0
Filename pepflashplayer.dll
Name Widevine Content Decryption Module
Description Enables Widevine licenses for playback of HTML audio/video content. (version: 1.4.8.903)
Filename widevinecdmadapter.dll
Other variables:
8.5 HTML Canvas fingerprint
With the HTML5 canvas API, text and images are rendered differently on devices that differ in OS, font library,
graphics card, graphics driver and browser version. For example, a client running Chrome v55 on Windows 10 x64
with the renderer "ANGLE (Intel(R) HD Graphics 5500 Direct3D11 vs_5_0 ps_5_0)" produces its own characteristic pixel map.
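A minimal sketch of the server side of canvas fingerprinting follows. It assumes the page has drawn its test text on a canvas and posted the string returned by the canvas `toDataURL()` call; how that string reaches the server, and the function name `canvas_hash`, are assumptions for illustration.

```python
import base64
import hashlib


def canvas_hash(data_url: str) -> str:
    """Return a SHA-256 digest of the image bytes inside a base64 data URL."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:image/") or not header.endswith(";base64"):
        raise ValueError("expected a base64 image data URL")
    # Decode the encoded image (typically PNG) and hash its bytes.
    image_bytes = base64.b64decode(payload, validate=True)
    return hashlib.sha256(image_bytes).hexdigest()
```

Hashing the decoded bytes rather than the full string keeps the stored value short and fixed-length; two clients that render the text identically produce the same digest.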
8.6 WebGL Rendering
Browsers that support WebGL and WebGL 2.0 render output that depends on the GPU context, so a hash of the rendered output is, to an extent, unique to the device
8.7 System Fonts
The list of system fonts, extracted via Flash and JavaScript, can give a unique fingerprint on some machines, in
addition to browser-specific web font capability.
8.8 Do Not Track Request
The browser's Do Not Track flag, as read via JavaScript
8.9 DNS/ TCP (network)
The TCP packets sent by the client when negotiating a connection carry values that different OS types and
versions set differently. For example, the TTL in the IP header and the TCP window size differ between OS types.
In case of User-Agent string spoofing, the claimed OS can be cross-verified against this network data, which is
harder to spoof and less frequently spoofed. DNS data, such as the DNS version, can be used to tie a user to a particular DNS.
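The cross-check described above can be sketched as follows. The sketch assumes the commonly seen default initial TTLs (64 for Linux, Android, macOS and iOS, 128 for Windows), which administrators can reconfigure, and assumes the packet crossed fewer than 64 hops. All function names are hypothetical, and the User-Agent matching is deliberately crude.

```python
# Assumption: typical default initial TTL values, which can be changed by the user.
COMMON_INITIAL_TTLS = (64, 128, 255)
TTL_TO_FAMILY = {64: "unix-like", 128: "windows", 255: "other"}


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round the observed TTL up to the nearest common initial value."""
    if not 0 < observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    raise AssertionError("unreachable")


def os_family_from_user_agent(user_agent: str) -> str:
    """Very rough OS family guess from the User-Agent string."""
    ua = user_agent.lower()
    if "windows" in ua:
        return "windows"
    if any(token in ua for token in ("linux", "android", "mac os x", "iphone")):
        return "unix-like"
    return "unknown"


def user_agent_looks_spoofed(user_agent: str, observed_ttl: int) -> bool:
    """Flag a mismatch between the claimed OS and the OS family implied by the TTL."""
    claimed = os_family_from_user_agent(user_agent)
    implied = TTL_TO_FAMILY[guess_initial_ttl(observed_ttl)]
    return claimed != "unknown" and claimed != implied
```

A mismatch is only a hint: proxies, VPNs and tuned network stacks can all change the observed TTL, so the result would feed the confidence score rather than decide on its own.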
8.10 Timezone
Client timezone and timestamp can be obtained by executing JS code on browser
9. Scenarios for the model
The model aims to identify whether users who access the website are first-time or repeat users. Since IP
addresses are dynamic and, in cases like corporate offices and business parks, shared across a LAN with
one gateway to the WAN, it is difficult to narrow a visitor down with the IP address alone. With proxies and VPNs
being quite prevalent, relying on IP address fingerprinting alone is no longer a good idea.
Quite simply we want to detect whether:
a. A browser is a returning browser or not. If yes, then we would want to narrow it down to the
device details using other fingerprinting methods.
b. If it’s not a returning browser but a new browser, even then we would want to cross-match with
device fingerprinting using canvas/GPU data etc.
The rarer the browser or device OS/ device hardware the easier it is to uniquely track a visitor.
Decision Scenarios
User | Device | OS | Browser | Model Scope
Same user | Device 1 | Same OS | Different browsers (Edge/IE/Tor/Firefox/Chrome etc.) | Should detect
Same user | Same device 1 | Different OS (Win/OSX/Android/Linux) | - | Should detect (rarer device is better)
Same user | New device 2 | Same OS as row 1 (exact match) | Same browser as row 1 (exact match) | Cannot detect the user since device is new
Different user | Device 1 | Same OS as row 1 above | Different browser from row 1 | Should detect (rarer the browser/OS the better)
Different user | Device 1 | Different OS from row 1 above | Any browser | Yes
Hence the success of the algorithm lies in uniquely identifying a device from variables such as the user
agent, Flash data and canvas data. A repeat user visit here means repeat access by the same client device,
using the same or a different browser, and the output should give the timestamps of each previous visit.
10. Observations with different devices, browsers/OS & Network
On testing with https://panopticlick.eff.org; www.amiunique.org; https://browserleaks.com and
www.letmetrackyou.org following behaviour of attributes was seen.
10.1 With a given browser and OS but different network connections
Connecting via Home Wi-Fi, company Wi-Fi, dongle, ISP shows the following behaviour in the attributes
No. Attribute Behaviour
1 Accept header data Doesn’t change
2 User-Agent Doesn’t change, unless updated
3 DNT Doesn’t change
4 Touch Support Doesn’t change
5 Platform Doesn’t change
6 Language Doesn’t change
7 Cookies enabled Doesn’t change
8 Screen resolution Doesn’t change
9 Timezone Doesn’t change
10 Plugin versions Doesn’t change, unless updated or removed
11 Font List (all) Order does not change
12 Canvas Hash Doesn’t change
13 WebGL Hash Doesn’t change
From this we can conclude that the network connection has little to no impact on these variables
10.2 Same device but different browsers/ OS
Across Edge, IE, Chrome and Firefox, and across Android and Windows, the attributes below show the following behaviour
No. Attribute Behaviour
1 Accept header Changes
2 User-Agent Changes
3 DNT Changes
4 Touch Support Doesn’t change
5 Platform Doesn’t change
6 Language Doesn’t change
7 Cookies enabled Doesn’t change
8 Screen resolution Doesn’t change
9 Timezone Doesn’t change
10 Plugin version Changes
11 Font List (all) Changes
12 Canvas Hash Changes
13 WebGL Hash Changes
11. Estimating weights for each attribute for the model
From the results observed on https://panopticlick.eff.org, it can be inferred that the lower the probability of
a particular attribute value, the rarer it is on the Internet, and hence the higher the chance of uniquely
identifying the visitor and the higher the confidence score
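The link between rarity and identifying power can be made concrete with the surprisal measure from information theory (see reference 1): a value shared by one in N browsers carries log2(N) bits of identifying information. The sketch below is illustrative only; the function names are hypothetical, and adding bits across attributes assumes the attributes are independent, which section 12 notes may not hold.

```python
import math


def surprisal_bits(one_in_n: float) -> float:
    """Bits of identifying information for a value seen in one of every N browsers."""
    if one_in_n < 1:
        raise ValueError("one_in_n must be at least 1")
    return math.log2(one_in_n)


def combined_bits(one_in_n_values: list[float]) -> float:
    """Total bits across attributes, assuming they are statistically independent."""
    return sum(surprisal_bits(value) for value in one_in_n_values)
```

For example, a value shared by one in 8 browsers contributes 3 bits, while a value shared by every browser contributes none.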
No. Attribute Weight
1 Accept header data High
2 User-Agent High
3 DNT Low
4 Touch Support Low
5 Platform Low
6 Language Low
7 Cookies enabled Low
8 Screen resolution Medium
9 Timezone Low
10 Plugin version High
11 Font List (all) High
12 Canvas Hash High
13 WebGL Hash High
12. Suggested Methodology
Each visitor’s network and application data attributes will be lifted from the browser's request headers and by
running JS code in the browser. Each of these attributes can be fingerprinted and its probability
computed. Further, canvas fingerprinting is possible by rendering a simple text on the client machine through the
canvas API and extracting the resulting pixel map. For WebGL fingerprinting, we can have the client’s graphics card
draw an image and then extract it; this gives a unique fingerprint, which can then be stored against the detected
hardware data obtained from Flash for further use.
Combining the fingerprints of these attributes results in a single user fingerprint, which can then
be statistically validated as a unique print at a given significance level (p-value).
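One possible way to combine the per-attribute values into a single fingerprint is sketched below. The canonical JSON encoding and the name `combined_fingerprint` are assumptions for illustration, not the method prescribed by this document.

```python
import hashlib
import json


def combined_fingerprint(attributes: dict[str, str]) -> str:
    """Hash all attribute values into one digest, independent of key order."""
    # Sorting keys and fixing separators gives one canonical text form per attribute set.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A single hash only supports exact matching: if any one attribute changes, the digest changes completely, so the individual attribute values still need to be stored for the weighted comparison described in this section.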
With enough test data, we can derive the coefficients of all 13 variables above and the error constant in the
equation below. With time the model can improve itself. The final output will be the test statistic, which can be
compared with the mean and standard deviation to arrive at the confidence interval.
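\[ y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{13} x_{13} + \varepsilon \]
where x_1, ..., x_13 are the values of the 13 attributes listed in section 11, \beta_1, ..., \beta_13 are their coefficients and \varepsilon is the error constant.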
The mean of the above equation over n test samples y_1, ..., y_n can be calculated from
\[ \bar{y} = \frac{1}{n} \sum_{j=1}^{n} y_j \]
Knowing the output y, the sample mean \bar{y} and the sample standard deviation s, the z-value
\[ z = \frac{y - \bar{y}}{s} \]
can be calculated, and the confidence interval can be taken within 1 to 3 standard deviations.
For example, samples more than 2 standard deviations above the mean can be treated as new visitors, and those
below that 95% level can be treated as repeat users, with the timestamps of their previous visits shown.
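The z-value rule above can be sketched as follows, assuming a list of model outputs from earlier test samples is available. The function names and the default threshold of 2 standard deviations are illustrative assumptions taken from the example in the text.

```python
import statistics


def z_score(y: float, sample: list[float]) -> float:
    """Standardise y against the sample mean and sample standard deviation."""
    mean = statistics.mean(sample)
    stdev = statistics.stdev(sample)  # needs at least two data points
    if stdev == 0:
        raise ValueError("sample has no spread, z-score is undefined")
    return (y - mean) / stdev


def classify(y: float, sample: list[float], threshold: float = 2.0) -> str:
    """Apply the rule from the text: above the threshold is new, otherwise repeat."""
    return "new visitor" if z_score(y, sample) > threshold else "repeat visitor"
```

The threshold is the tuning knob for the Type I and Type II error trade-off named in section 7: raising it labels fewer visitors as new, lowering it labels more.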
Case: same device, different browsers. Weights are assigned from section 11 and the probabilities for the
variables are lifted from https://panopticlick.eff.org
No. | Attribute | Weight | Percentage | Edge (1/Prob) | IE (1/Prob) | Chrome (1/Prob)
1 | Accept header data | High | 11% | 31462.83 | 31462.83 | 8.41
2 | User-Agent | High | 13% | 1716.15 | 522.93 | 35.17
3 | DNT | Low | 4% | 2.34 | 2.34 | 2.34
4 | Touch Support | Low | 3% | 1.41 | 1.41 | 1.41
5 | Platform | Low | 5% | 2.44 | 2.44 | 2.44
6 | Language | Low | 4% | 5243.81 | 5243.81 | 2.01
7 | Cookies enabled | Low | 4% | 1.13 | 1.13 | 1.13
8 | Screen resolution & color | Medium | 8% | 8.37 | 8.37 | 8.37
9 | Timezone | Low | 3% | 92.27 | 92.27 | 92.27
10 | Browser Plugin version | High | 11% | 3.17 | 188777 | 13.5
11 | Font List (all) | High | 12% | 4016.53 | 188777 | 188775
12 | Canvas Hash | High | 11% | 20975.22 | 1165.29 | 319.42
13 | WebGL Hash | High | 11% | 17161.55 | 293.13 | 21.19
Output using the above equation: Edge 0.116112455, IE 0.078871, Chrome 0.130456
As per Panopticlick, Edge and IE had the same entropy, while Chrome had slightly higher entropy. From the above
we can see that the output for Chrome is higher than for Edge and IE, but the outputs for Edge and IE differ when
they should have been similar. Without test data, the weights of the above attributes are not validated.
Multicollinearity: some of the variables are expected to be highly correlated with each other and may not be
suitable for regression (Canvas and WebGL hash, and UA and Accept header, are expected to be highly correlated,
though this is unsubstantiated without a test sample). Also, a linear regression is assumed for simplification
purposes.
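Once test data is available, the suspected multicollinearity could be checked with a pairwise Pearson correlation, as in the sketch below. The helper name `pearson` is hypothetical, and each input list is assumed to hold one numeric value per test visitor (for example, the 1/Prob figure for that attribute).

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return covariance / math.sqrt(var_x * var_y)
```

A coefficient close to 1 or -1 for a pair such as Canvas hash and WebGL hash would support dropping or merging one of the two variables before fitting the regression.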
13. Conclusion
Online financial services tend to reduce the information required from clients during on-boarding. They
simplify the process to what is necessary for complying with regulations, and thus open the door to abuse and
fraud. Moreover, borrowers tend to fill in multiple loan application forms, with the same and with different
lenders, in case they get rejected by one.
Hence preventing loan application and credit card fraud is the biggest use case for such a model, apart from
allowing ad networks and DSPs to track visitors uniquely when users have flushed their cookies or are browsing
incognito. With network attributes like DNS and TCP we can also cross-verify whether the user has manipulated their
browser’s user agent string.
14. Corner Case
a. Over time, a user’s fingerprint may change due to OS/browser upgrades. Since these values won’t
deprecate, provisions for this can be made in the algorithm.
b. Two devices with exactly the same hardware, OS and browser versions will always give the same fingerprint
and may not be distinguishable by any of these methods. In such instances we have to think of some
other attributes.
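One possible provision for corner case (a) is to compare fingerprints attribute by attribute instead of requiring an exact match. The sketch below is an assumption for illustration, including the function names and the 0.75 threshold, and it does not help with corner case (b).

```python
def similarity(old: dict[str, str], new: dict[str, str]) -> float:
    """Fraction of attributes (from either visit) whose values are identical."""
    keys = set(old) | set(new)
    if not keys:
        return 0.0
    same = sum(1 for key in keys if key in old and key in new and old[key] == new[key])
    return same / len(keys)


def probably_same_device(old: dict[str, str], new: dict[str, str], threshold: float = 0.75) -> bool:
    """Treat two visits as the same device when enough attributes still match."""
    return similarity(old, new) >= threshold
```

With four stored attributes, a browser upgrade that changes only the User-Agent still leaves three of four matching, so the visit is treated as a repeat under the assumed threshold.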
15. Experimental
a. MAC collection: a remote web server does not normally see the client's MAC address, because it is not
carried beyond the local network segment and browsers do not expose it to scripts. Server-side code such as
PHP runs on the server rather than the client end, so it can obtain the MAC address only for clients on the
same local network.
b. Keyboard fingerprinting: a user’s typing speed and word usage can also be used in case other methods
fail to determine the visitor conclusively.
c. Audio fingerprinting: the HTML5 AudioContext API can be used to fingerprint website visitors
d. Battery fingerprinting: the HTML5 Battery Status API lets a page see what percentage of battery life is
left in the device
16. References for further reading
1. https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
2. https://zyan.scripts.mit.edu/presentations/toorcon2015.pdf
3. http://letmetrackyou.org/identify.php
4. https://panopticlick.eff.org/about
5. https://github.com/brave/browser-laptop/issues/2259
6. https://github.com/brave/browser-laptop/issues/242
7. https://www.threatmetrix.com/digital-identity-blog/fraud-prevention/advancing-beyond-device-fingerprinting-prevent-loan-fraud/

Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 

Browser fingerprinting without cookies

NPD: Repeat/Unique Visitor Identification

1. Product Intro & Goal
A web-based solution to detect whether a given user visiting your website has visited it before.

2. Who’s it for?
Partner NBFCs (non-banking financial companies) with whom prospective borrowers fill out loan applications.

3. Why Build It
This solution aims to detect, in the pre-login journey itself, whether a visitor is a unique visitor or a repeat visitor. It can be used to detect loan application fraud, and in ad servers to identify users uniquely even after they flush their cookies.

4. Desired Output of the product
a. Confidence score of a visitor being a re-visitor
b. List of previous visits, with timestamps, for a repeat visitor

5. Preconditions
Cookies not allowed, outside compliance

6. Background & Short Description
When a client device connects to the web server hosting the website, it goes through several handshakes and ends up sending network and application data. From this data the application server can gather and interpret the device geolocation, network connection, browser, operating system and hardware details in order to uniquely identify a client (device).

Client-server connection flow
The load balancer directs HTTP/HTTPS requests to different server instances. It may or may not be content aware. During the negotiation between the originating browser and the hosting server, several request headers are passed so that the server can respond with appropriate content. Network metadata can be processed at the network level, or passed as a connection attribute to the hosting server to be processed along with the HTML/CSS and JavaScript data.
7. Success Scenario for the product
a. The product should predict with a high confidence score whether a user is a new visitor or a returning user (H0: each user is a new visitor), with minimum Type I and Type II errors.
b. Minimum false positives: the user was a new visitor but the system predicted a repeat visitor. This counts as a failure.
c. Minimum false negatives: the user was not a new visitor but the system predicted a new visitor.
d. Minimum test time.
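The two error types above can be measured once labelled test visits are available. The following is a minimal sketch, not part of the original proposal: the function name `error_rates` and the boolean encoding (True means "repeat visitor") are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ErrorRates:
    false_positive_rate: float  # new visitors wrongly flagged as repeat (Type I under H0)
    false_negative_rate: float  # repeat visitors wrongly treated as new (Type II)


def error_rates(actual_repeat: Iterable[bool], predicted_repeat: Iterable[bool]) -> ErrorRates:
    """Compare ground truth with predictions; True means 'repeat visitor'."""
    false_pos = false_neg = new_total = repeat_total = 0
    for actual, predicted in zip(actual_repeat, predicted_repeat):
        if actual:
            repeat_total += 1
            if not predicted:
                false_neg += 1
        else:
            new_total += 1
            if predicted:
                false_pos += 1
    # Guard against empty classes so the sketch never divides by zero.
    fpr = false_pos / new_total if new_total else 0.0
    fnr = false_neg / repeat_total if repeat_total else 0.0
    return ErrorRates(fpr, fnr)


if __name__ == "__main__":
    truth = [False, False, True, True]
    guess = [True, False, False, True]
    print(error_rates(truth, guess))
```

Reporting the two rates separately keeps the asymmetry in item b visible: a new visitor flagged as a repeat visitor is the costlier mistake.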
Implementation Methodology

8. Attributes for Fingerprinting

8.1 User Agent String (HTTP Header)
Identifies the browser and operating system. A typical user-agent string looks like:
"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36"

8.2 HTTP Requests header
Accept request-header fields:
Cookie: device=d7834267-37fd-42ac-aa8c-1373aeebcf92; JSESSIONID=...
Host: noc.to
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
If-None-Match: "d7834267-37fd-42ac-aa8c-1373aeebcf92/2017-01-13-09:43:54.513"
Upgrade-Insecure-Requests: 1
Accept-Language: en-US,en;q=0.8
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/53
Connection: keep-alive
Accept-Encoding: gzip, deflate, sdch

8.3 Javascript Display Data
Available if the browser can run JavaScript (yes by default):
Screen W x H: 1366 x 768
Available W x H: 1366 x 728
Color Depth: 24
Pixel Ratio: 1

8.4 Plugin Data
Flash Plugin Data
Plugin Version: WIN 24,0,0,194
Plugin Manufacturer: Google Pepper
Language: en
Operating System: Windows 10
CPU Architecture: x86
Supports 32-Bit Processes: yes
Supports 64-Bit Processes: yes
Screen Resolution: 1366 x 768
LSO Storage Test: passed
Pixel Aspect Ratio: 1
Screen DPI: 72
AV Hardware Disabled: no
File Read Disabled: no
Has Printing: yes
Has Accessibility: yes
Has Audio: yes
Has MP3: yes
Has Embedded Video: yes
Has Screen Broadcast: no
Has Screen Playback: no
Has Streaming Audio: yes
Has Streaming Video: yes
Has Audio Encoder: yes
Has Video Encoder: yes
Has Input Method Editor: yes
Max Level IDC: 5.1
Player Type: PlugIn
Is Debugger: no
Has Transport Layer Security: yes

Navigator Plugin List
Name: Chrome PDF Viewer
  Filename: mhjfbmdgcfjbbpaeojofohoefgiehjai
Name: Chrome PDF Viewer
  Description: Portable Document Format
  Filename: internal-pdf-viewer
Name: Native Client
  Filename: internal-nacl-plugin
Name: Shockwave Flash
  Description: Shockwave Flash 24.0 r0
  Filename: pepflashplayer.dll
Name: Widevine Content Decryption Module
  Description: Enables Widevine licenses for playback of HTML audio/video content. (version: 1.4.8.903)
  Filename: widevinecdmadapter.dll

Other variables:

8.5 HTML Canvas fingerprint
With the HTML5 canvas API, text and images are rendered differently on devices with varying OS, font library, graphics card, graphics driver and browser version. Example: the pixel map produced by a client running Chrome v55 on Windows 10 (x64) with "ANGLE (Intel(R) HD Graphics 5500 Direct3D11 vs_5_0 ps_5_0)".

8.6 WebGL Rendering
Browsers supporting WebGL and WebGL 2.0 give a hash that is unique to an extent, depending on the GPU context.
8.7 System Fonts
The installed font list, extracted via Flash and JavaScript, can give a unique fingerprint on some machines, apart from the browser-specific web font capability.

8.8 Do Not Track Request
The Do Not Track flag, read via JavaScript.

8.9 DNS/TCP (network)
TCP packets sent by the client when negotiating a connection have different values set by different OS types and versions. For example, the TTL in the IP header and the TCP window size differ between OS types. In case of User-Agent string spoofing, the claimed OS can be cross-verified against network data, which is harder to spoof and less frequently spoofed. DNS data, such as the DNS version, can be used to tie a user to a particular DNS.

8.10 Timezone
The client timezone and timestamp can be obtained by executing JS code in the browser.

9. Scenarios for the model
The model aims to identify whether users who access the website are first-time or repeat users. Since IP addresses are dynamic, and in cases like corporate offices and business parks one address is shared across a LAN with a single gateway to the WAN, it is difficult to narrow a visitor down with an IP fingerprint only. With proxies and VPNs being quite prevalent, it is no longer a good idea to rely on IP address fingerprinting alone.

Quite simply, we want to detect whether:
a. A browser is a returning browser or not. If yes, we would want to narrow it down to the device details using other fingerprinting methods.
b. If it’s not a returning browser but a new browser, even then we would want to cross-match with device fingerprinting using canvas/GPU data etc.

The rarer the browser, device OS or device hardware, the easier it is to uniquely track a visitor.
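The two checks in a. and b. can be sketched as two hashes per visit: one over browser-level attributes and one over device-level attributes. This is only an illustrative sketch under stated assumptions. The attribute names, the split into `BROWSER_KEYS` and `DEVICE_KEYS`, and the helper names are hypothetical. A real device-level fingerprint would need stronger signals (for example the canvas/GPU data mentioned above) than the three low-entropy fields used here.

```python
import hashlib
import json

# Hypothetical grouping: which attributes count as browser-level vs
# device-level is an assumption made for this sketch.
BROWSER_KEYS = ("user_agent", "accept", "accept_language", "plugins")
DEVICE_KEYS = ("platform", "screen", "timezone")


def fingerprint(attributes: dict, keys: tuple) -> str:
    """SHA-256 hex digest over the selected attributes, in a canonical order."""
    subset = {key: attributes.get(key) for key in keys}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def remember(visit: dict, seen_browsers: set, seen_devices: set) -> None:
    """Record both fingerprints of a visit for later lookups."""
    seen_browsers.add(fingerprint(visit, BROWSER_KEYS))
    seen_devices.add(fingerprint(visit, DEVICE_KEYS))


def classify_visit(visit: dict, seen_browsers: set, seen_devices: set) -> str:
    """Scenario a: known browser. Scenario b: new browser, cross-match the device."""
    if fingerprint(visit, BROWSER_KEYS) in seen_browsers:
        return "returning browser"
    if fingerprint(visit, DEVICE_KEYS) in seen_devices:
        return "new browser, known device"
    return "new visitor"


if __name__ == "__main__":
    browsers, devices = set(), set()
    first = {"user_agent": "UA-1", "accept": "text/html", "accept_language": "en-US",
             "plugins": ["pdf"], "platform": "Win32", "screen": "1366x768x24", "timezone": -330}
    print(classify_visit(first, browsers, devices))
    remember(first, browsers, devices)
    second = dict(first, user_agent="UA-2", plugins=[])
    print(classify_visit(second, browsers, devices))
```

Exact-match hashing is the simplest possible comparison. It breaks as soon as one attribute changes, which is why the document goes on to weight attributes and score them instead.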
Decision Scenarios
User | Device | OS | Browser | Model Scope
Same user | Device 1 | Same OS | Different browsers (Edge/IE/Tor/Firefox/Chrome etc.) | Should detect
Same user | Device 1 | Different OS (Win/OSX/Android/Linux) | - | Should detect (rarer device is better)
Same user | New device 2 | Same OS as row 1 (exact match) | Same browser as row 1 (exact match) | Cannot detect the user since the device is new
Different user | Device 1 | Same OS as row 1 | Different browser from row 1 | Should detect (the rarer the browser/OS the better)
Different user | Device 1 | Different OS from row 1 | Any browser | Yes

Hence the success of the algorithm lies in uniquely identifying a device from variables such as user agent, Flash and canvas data. A repeat user visit here means repeat access by the same client device, using the same or a different browser. The model should give the timestamps of each previous visit.

10. Observations with different devices, browsers/OS & Network
On testing with https://panopticlick.eff.org, www.amiunique.org, https://browserleaks.com and www.letmetrackyou.org, the following behaviour of attributes was seen.

10.1 With a given browser and OS but different network connections
Connecting via home Wi-Fi, company Wi-Fi, dongle or ISP shows the following behaviour in the attributes:
No. | Attribute | Behaviour
1 | Accept header data | Doesn’t change
2 | User-Agent | Doesn’t change, unless updated
3 | DNT | Doesn’t change
4 | Touch Support | Doesn’t change
5 | Platform | Doesn’t change
6 | Language | Doesn’t change
7 | Cookies enabled | Doesn’t change
8 | Screen resolution | Doesn’t change
9 | Timezone | Doesn’t change
10 | Plugin versions | Doesn’t change, unless updated or removed
11 | Font List (all) | Order does not change
12 | Canvas Hash | Doesn’t change
13 | WebGL Hash | Doesn’t change
From this we can conclude that the network has the least impact on each of these variables.

10.2 Same device but different browsers/OS
Using Edge, IE, Chrome, Firefox, Android and Windows, the attributes show the following behaviour:
No. | Attribute | Behaviour
1 | Accept header | Changes
2 | User-Agent | Changes
3 | DNT | Changes
4 | Touch Support | Doesn’t change
5 | Platform | Doesn’t change
6 | Language | Doesn’t change
7 | Cookies enabled | Doesn’t change
8 | Screen resolution | Doesn’t change
9 | Timezone | Doesn’t change
10 | Plugin version | Changes
11 | Font List (all) | Changes
12 | Canvas Hash | Changes
13 | WebGL Hash | Changes

11. Estimating weights for each attribute for the model
From the https://panopticlick.eff.org results it can be inferred that the lower the probability of a particular attribute value, the rarer it is on the Internet, and hence the higher the chance of uniquely identifying the visitor and the higher the confidence score.
No. | Attribute | Weight
1 | Accept header data | High
2 | User-Agent | High
3 | DNT | Low
4 | Touch Support | Low
5 | Platform | Low
6 | Language | Low
7 | Cookies enabled | Low
8 | Screen resolution | Medium
9 | Timezone | Low
10 | Plugin version | High
11 | Font List (all) | High
12 | Canvas Hash | High
13 | WebGL Hash | High
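The link between rarity and identifying power can be made concrete with the self-information (surprisal) measure used in information-theoretic treatments of fingerprinting: a value shared by one in x browsers carries log2(x) bits. The sketch below is illustrative only. The function names and the sample inputs are assumptions, and summing bits across attributes is valid only if the attributes are independent.

```python
import math
from typing import Iterable


def surprisal_bits(one_in_x: float) -> float:
    """Bits of identifying information for a value seen in one of `one_in_x` browsers."""
    if one_in_x < 1:
        raise ValueError("one_in_x must be at least 1 (probability cannot exceed 1)")
    return math.log2(one_in_x)


def combined_bits(one_in_x_values: Iterable[float]) -> float:
    """Sum of per-attribute bits.

    This assumes the attributes are statistically independent. Correlated
    attributes would be double counted, so treat the result as an upper bound.
    """
    return sum(surprisal_bits(value) for value in one_in_x_values)


if __name__ == "__main__":
    # Hypothetical rarities: a common value (1 in 2) and a rare one (1 in 1024).
    for rarity in (2, 1024):
        print(rarity, surprisal_bits(rarity))
    print(combined_bits([2, 1024]))
```

A High weight in the table corresponds to attributes whose values tend to be rare (many bits), and a Low weight to attributes whose values are shared by most browsers (few bits).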
12. Suggested Methodology
Each visitor’s network and application data attributes will be lifted out of the browser request headers and by running JS code in the browser. Each of these attributes can be fingerprinted and its probability computed.

Further, canvas fingerprinting is possible by rendering a simple text on the client machine through the canvas API and extracting the resulting pixel map. For WebGL fingerprinting, we can have the client’s graphics card draw an image and then extract it. This gives a unique fingerprint, which can then be stored against the hardware data obtained from Flash for further use.

The combined fingerprints of these attributes result in a single user fingerprint, which can then be statistically validated as a unique print at a given significance level (p-value).

With enough test data, we can derive the coefficients of all 13 variables above and the error constant in the equation below. With time the model can better itself. The final output will be the test statistic, which can be compared with the mean and standard deviation to arrive at the confidence interval.

$$y = \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_{13} x_{13} + \varepsilon$$

Here $x_1, \dots, x_{13}$ are the 13 attributes listed in section 11, $\beta_1, \dots, \beta_{13}$ are their coefficients and $\varepsilon$ is the error constant.

The mean of the above equation can be calculated from the sample outputs. Knowing the output y, the sample mean and the standard deviation, the z-value can be calculated, and the confidence interval can be taken within 1 to 3 standard deviations. For example, samples above 2 standard deviations can be treated as new visitors, and those below the 95% significance level can be treated as repeat users, with the timestamps of previous visits shown.

Case: Same device, different browser. Assigning weights from section 11 and lifting probabilities for the variables from https://panopticlick.eff.org
No. | Attribute | Weight | Percentage | Edge (1/Prob) | IE (1/Prob) | Chrome (1/Prob)
1 | Accept header data | High | 11% | 31462.83 | 31462.83 | 8.41
2 | User-Agent | High | 13% | 1716.15 | 522.93 | 35.17
3 | DNT | Low | 4% | 2.34 | 2.34 | 2.34
4 | Touch Support | Low | 3% | 1.41 | 1.41 | 1.41
5 | Platform | Low | 5% | 2.44 | 2.44 | 2.44
6 | Language | Low | 4% | 5243.81 | 5243.81 | 2.01
7 | Cookies enabled | Low | 4% | 1.13 | 1.13 | 1.13
8 | Screen resolution & color | Medium | 8% | 8.37 | 8.37 | 8.37
9 | Timezone | Low | 3% | 92.27 | 92.27 | 92.27
10 | Browser Plugin version | High | 11% | 3.17 | 188777 | 13.5
11 | Font List (all) | High | 12% | 4016.53 | 188777 | 188775
12 | Canvas Hash | High | 11% | 20975.22 | 1165.29 | 319.42
13 | WebGL Hash | High | 11% | 17161.55 | 293.13 | 21.19
Using above eq | | | | 0.116112455 | 0.078871 | 0.130456

As per panopticlick, Edge and IE had the same entropy, while Chrome had slightly higher entropy. From the above we can see that the output for Chrome is higher than for Edge and IE (the outputs of these two should have been similar). Without test data, however, the weights of the above attributes are not validated.

Multicollinearity: some of the variables show a high degree of correlation with each other and may not be suitable for regression. Canvas and WebGL hash, and UA and Accept header, appear highly correlated, although this is unsubstantiated without a test sample. The model is also assumed to be a linear regression for simplification purposes.

13. Conclusion
Online financial services tend to reduce the information required from clients during on-boarding. They simplify the process to what is necessary for complying with regulations, and thus open the door to abuse and fraud. Moreover, borrowers tend to fill in multiple loan application forms, with the same and different lenders, in case they get rejected by one. Hence preventing loan application and credit card fraud is the biggest use case for such a model, apart from allowing ad networks and DSPs to track visitors uniquely when users have flushed their cookies or are browsing incognito. With network attributes like DNS and TCP we can also cross-verify whether the user has manipulated their browser’s user agent string.

14. Corner Case
a. Over time a user’s fingerprint may change due to OS/browser upgrades. Since these values do not regress, provisions for this can be made in the algorithm.
b. Two devices with exactly the same hardware, OS and browser versions will always give the same fingerprint and may not be distinguishable by any method. In such instances we have to think of some other attributes.

15. Experimental
a. MAC collection: some languages provide the remote server with the client MAC address, but as of now this is limited to PHP, which can run the script on the client end.
b. Keyboard fingerprinting: the user’s typing speed and word usage can also be used in case other methods fail to determine conclusively.
c. Audio fingerprinting: the new HTML5 AudioContext API can be used to fingerprint website visitors.
d. Battery fingerprinting: a new HTML5 API will allow the browser to see what percentage of battery life is left in the device.

16. References for further reading
1. https://www.eff.org/deeplinks/2010/01/primer-information-theory-and-privacy
2. https://zyan.scripts.mit.edu/presentations/toorcon2015.pdf
3. http://letmetrackyou.org/identify.php
4. https://panopticlick.eff.org/about
5. https://github.com/brave/browser-laptop/issues/2259
6. https://github.com/brave/browser-laptop/issues/242
7. https://www.threatmetrix.com/digital-identity-blog/fraud-prevention/advancing-beyond-device-fingerprinting-prevent-loan-fraud/
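As a closing sketch of the scoring described in section 12, the linear model and the z-value comparison could look like the following. It is an illustration under stated assumptions, not the validated model: the function names, the use of the sample standard deviation and the cutoff of 2 are choices made here. The coefficients would have to come from regression on real test data, which the document notes is not yet available.

```python
from statistics import mean, stdev
from typing import Sequence


def linear_score(values: Sequence[float], coefficients: Sequence[float],
                 error_constant: float = 0.0) -> float:
    """y = sum(beta_i * x_i) + error constant, one term per attribute."""
    if len(values) != len(coefficients):
        raise ValueError("one coefficient is needed per attribute value")
    return sum(b * x for b, x in zip(coefficients, values)) + error_constant


def z_value(score: float, sample_scores: Sequence[float]) -> float:
    """Standardise a score against previously observed scores."""
    sigma = stdev(sample_scores)  # sample standard deviation, needs >= 2 scores
    if sigma == 0:
        raise ValueError("sample scores have zero variance")
    return (score - mean(sample_scores)) / sigma


def classify_score(score: float, sample_scores: Sequence[float],
                   threshold: float = 2.0) -> str:
    """Scores more than `threshold` standard deviations above the mean count as new."""
    if z_value(score, sample_scores) > threshold:
        return "new visitor"
    return "repeat visitor"


if __name__ == "__main__":
    history = [1.0, 2.0, 3.0]  # hypothetical scores from earlier visits
    y = linear_score([4.0, 2.0], [1.0, 0.5])
    print(y, z_value(y, history), classify_score(y, history))
```

Keeping the threshold as a parameter makes it easy to trade false positives against false negatives, the two quantities the success scenario in section 7 asks to minimise.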