This document summarizes an AoIR Digital Methods Workshop on tracking technologies on the web. It introduces different types of trackers like cookies, widgets and advertising trackers that collect data as users browse websites. The workshop demonstrates the Tracker Tracker tool to analyze which trackers are present on lists of websites and identify connections between sites and trackers. An example project analyzing social media platform trackers on the 1000 most visited websites found widespread tracking by Facebook and other companies. The workshop provides methods for analyzing tracker prevalence and visualizing results to study the invisible infrastructures of the web.
AoIR 2016 Digital Methods Workshop - Tracking the Trackers
4. Tracking
“For every explicit action of a user, there are
probably 100+ implicit data points from usage;
whether that is a page visit, a scroll etc.”
(Berry 2011: 152)
5. Tracking technologies
Every time a web user requests a website, a series of
tracking features are enabled: cookies, widgets,
advertising trackers, analytics, beacons etc.
First party (from website) vs. third party tracker (e.g.
Facebook, Twitter, Google).
Purpose: From functionality to profiling.
6. Tracker blocking
Ghostery: Browser plugin which detects the ‘invisible’ web,
lets users block it and so limits their ‘digital
footprint’.
Detection by matching page source against a library of
tracker code snippets (regular expressions).
Detects around 2,295 trackers.
Not uncontroversial: started as NGO, then bought by
analytics company Evidon in 2010.
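The detection step described above can be sketched as pattern matching over a page's source code. This is only a minimal illustration, not Ghostery's actual implementation: the two patterns and the `detect_trackers` helper are assumptions made for the example, whereas the real library holds the roughly 2,295 entries mentioned above.

```python
import re

# Hypothetical, hand-written patterns for illustration only;
# the real tracker library is far larger and maintained by Ghostery.
TRACKER_PATTERNS = {
    "Google Analytics": re.compile(r"google-analytics\.com/(ga|analytics)\.js"),
    "Facebook Connect": re.compile(r"connect\.facebook\.net/[^/]+/(all|sdk)\.js"),
}


def detect_trackers(html):
    """Return the sorted names of all trackers whose pattern occurs in html."""
    return sorted(
        name for name, pattern in TRACKER_PATTERNS.items() if pattern.search(html)
    )


page = '<script src="https://connect.facebook.net/en_US/all.js"></script>'
print(detect_trackers(page))
```

Matching on source code means a tracker is only found if its snippet is in the library, which is one of the limits of repurposing a privacy tool for research.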
8. DMI Tracker Tracker
Tracker Tracker: tool built on top of
Ghostery by the Digital Methods
Initiative (2012).
Detects which trackers are
present on lists of websites &
creates a network view.
“Repurposing analytical capacities” of
privacy app: digital research
methods paired with platform &
software studies.
9. 3. Example project: Like Economy
Gerlitz, Carolin, and Anne Helmond. 2013. “The Like Economy: Social Buttons and the Data-
Intensive Web.” New Media & Society 15 (8): 1348–65. doi:10.1177/1461444812472322.
10. Like Economy
Starting point: social media widgets place
cookies (Gerlitz & Helmond 2013).
These cookies track both platform users and
anyone else on the web.
All web users potentially feed data into
platforms through cookies.
RQ: How pervasive are platform cookies on
the most visited websites of the web?
11. Like Economy: Method
1. Create a collection of 1000 most-visited
websites based on Alexa.com data.
2. Input into the Tracker Tracker tool.
3. Visualise results with Gephi.
4. Colour-code based on platform.
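The research question about pervasiveness comes down to a simple share: the fraction of sites in the collection that carry at least one tracker from a given platform. A minimal sketch, assuming the tool's results have already been loaded into a mapping from site to detected tracker names (the `site_trackers` structure, the sample data and the name-prefix matching are all assumptions for illustration):

```python
def platform_share(site_trackers, platform):
    """Fraction of sites with at least one tracker whose name starts with platform."""
    if not site_trackers:
        return 0.0
    hits = sum(
        any(tracker.startswith(platform) for tracker in trackers)
        for trackers in site_trackers.values()
    )
    return hits / len(site_trackers)


# Made-up sample data, not results from the study.
sample = {
    "news.example": ["Facebook Social Plugins", "Google Analytics"],
    "shop.example": ["Facebook Connect"],
    "blog.example": ["Google Analytics"],
    "wiki.example": [],
}
print(f"{platform_share(sample, 'Facebook'):.0%}")  # prints 50%
```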
14. Methodological summary
1. Research question: type of tracker & sites
2. Website (URL) collection making: existing
expert list (e.g. alexa.com)
3. Input collection into Tracker Tracker tool
4. Visualise results with Gephi
5. Analyse results + add layers
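Steps 3 and 4 rest on a bipartite network: websites on one side, trackers on the other, with an edge wherever a tracker was found on a site. The sketch below writes such a network as GEXF using only the standard library. It is an assumption about what a suitable file could look like, not a description of the Tracker Tracker tool's own output; the `to_gexf` helper and the `type` node attribute are hypothetical names.

```python
import xml.etree.ElementTree as ET


def to_gexf(site_trackers):
    """Serialise a site -> trackers mapping as a bipartite GEXF 1.2 graph."""
    gexf = ET.Element("gexf", xmlns="http://www.gexf.net/1.2draft", version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    # Declare one node attribute, "type", so nodes can be coloured by it.
    attrs = ET.SubElement(graph, "attributes", {"class": "node"})
    ET.SubElement(attrs, "attribute", id="0", title="type", type="string")
    nodes = ET.SubElement(graph, "nodes")
    edges = ET.SubElement(graph, "edges")
    seen = set()

    def add_node(name, kind):
        # Each site or tracker becomes a node once; names double as ids,
        # so this sketch assumes no site shares a name with a tracker.
        if name in seen:
            return
        seen.add(name)
        node = ET.SubElement(nodes, "node", id=name, label=name)
        values = ET.SubElement(node, "attvalues")
        ET.SubElement(values, "attvalue", {"for": "0", "value": kind})

    edge_id = 0
    for site, trackers in site_trackers.items():
        add_node(site, "site")
        for tracker in trackers:
            add_node(tracker, "tracker")
            ET.SubElement(edges, "edge", id=str(edge_id), source=site, target=tracker)
            edge_id += 1
    return ET.tostring(gexf, encoding="unicode")


xml_text = to_gexf({"a.example": ["T1", "T2"], "b.example": ["T1"]})
print(xml_text)
```

In such a graph a tracker's degree equals the number of sites it appears on, which is why sizing nodes by degree makes widespread trackers stand out.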
15. Tracking exercise
1. What kind of sites do you want to study?
2. Get access to the collections made with
Alexa.com: http://tiny.cc/TrackURLs.
3. Enter the list into the Tracker Tracker tool.
Settings: Only look at specified pages.
4. Save > Output > GEXF (Gephi).
a. Alternative: Save > Output > CSV exh
5. Open in Gephi, use colour settings to visually
distinguish between different tracking
services/types.
a. Alternative: visualize CSV (e.g. bar
graphs) with Google Sheets.
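For the CSV alternative, the counting behind a bar graph can also be done in a few lines. This is a sketch under stated assumptions: the column names `site`, `tracker` and `type` are hypothetical and may not match the headers of the tool's actual CSV export, so adjust them to the file you get.

```python
import csv
import io
from collections import Counter


def count_by_type(csv_text, column="type"):
    """Count rows per value of the given column (assumed header name)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Made-up sample rows with assumed column names.
sample_csv = (
    "site,tracker,type\n"
    "news.example,Google Analytics,analytics\n"
    "news.example,Facebook Social Plugins,widget\n"
    "shop.example,Facebook Connect,widget\n"
)
for tracker_type, count in count_by_type(sample_csv).most_common():
    print(tracker_type, count)
```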
16. Tracking exercise
Gephi instructions*:
1. New Project > Open Graph File > OK
2. Layout > Choose a Layout > Force Atlas 2
a. Scaling: 30
b. Dissuade Hubs: yes
c. Prevent Overlap: yes
3. Appearance > Nodes > Size > Attribute > Degree > Min size: 5
Max size: 30 (you can play with these settings).
4. Show Node labels. Scale node labels to node size
5. Layout > Choose a Layout > Label adjust
6. Color > Nodes > Attribute > Type
7. Preview > Presets > Default Straight
a. Node Labels: Arial 10 > Refresh
8. Export > SVG/PDF/PNG
9. Data visualization interpretation
*These settings work well for the top 25 adult sites. All Gephi settings depend on the graph (e.g. number of
nodes, type of algorithm needed for analysis). There are no “universal” settings.
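Step 3 maps each node's degree onto a size between the chosen minimum and maximum. A minimal sketch of that mapping, assuming plain linear interpolation (Gephi's own ranking may apply a different curve, and the `scale_sizes` helper is a hypothetical name):

```python
def scale_sizes(degrees, min_size=5.0, max_size=30.0):
    """Linearly map node degrees onto the range [min_size, max_size]."""
    if not degrees:
        return {}
    lowest, highest = min(degrees.values()), max(degrees.values())
    if highest == lowest:
        # All nodes have the same degree, so give them all the minimum size.
        return {node: min_size for node in degrees}
    span = highest - lowest
    return {
        node: min_size + (degree - lowest) / span * (max_size - min_size)
        for node, degree in degrees.items()
    }


print(scale_sizes({"a": 1, "b": 3, "c": 5}))
```

Because the mapping is relative to the smallest and largest degree in the graph at hand, the same Min/Max values look different on every network, which is one reason the settings do not transfer between graphs.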
20. Historical tracking analysis using the Internet Archive
Studying the website as an ecosystem embedded in
techno-commercial configurations over time through its
archived source code (Helmond 2015)
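A historical analysis of this kind can be sketched as running tracker detection over a series of archived snapshots of the same site. The example below is an illustration only: it assumes the archived source code has already been retrieved, uses the Wayback Machine's common `web.archive.org/web/<timestamp>/<url>` address form, and relies on two hand-written patterns; `snapshot_url`, `tracker_timeline` and the sample snapshots are hypothetical.

```python
import re

WAYBACK = "https://web.archive.org/web/{timestamp}/{url}"

# Hypothetical patterns for illustration; a real study would use a full library.
PATTERNS = {
    "Facebook": re.compile(r"connect\.facebook\.net"),
    "Google Analytics": re.compile(r"google-analytics\.com"),
}


def snapshot_url(url, timestamp):
    """Build the address of an archived snapshot (timestamp as YYYYMMDDhhmmss)."""
    return WAYBACK.format(timestamp=timestamp, url=url)


def tracker_timeline(snapshots):
    """Map each snapshot timestamp to the tracker names found in its source."""
    return {
        timestamp: sorted(
            name for name, pattern in PATTERNS.items() if pattern.search(html)
        )
        for timestamp, html in sorted(snapshots.items())
    }


# Made-up archived source code for two points in time.
snapshots = {
    "20120101000000": '<script src="//connect.facebook.net/en_US/all.js"></script>',
    "20060101000000": '<script src="//www.google-analytics.com/urchin.js"></script>',
}
print(snapshot_url("http://example.com/", "20120101000000"))
print(tracker_timeline(snapshots))
```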
25. Key questions
Limits of repurposing analytical capacities of
existing devices.
What data is actually being collected?
Study invisible participation in data flows.
Study media concentration.
Alternative spatialities of the web - tracker
origins and national ecologies.
Insights into invisible infrastructures of the
web.
26. End! Thank you.
Anne Helmond, University of Amsterdam.
Carolin Gerlitz, University of Siegen.
Fernando van der Vlist, University of Siegen.
Esther Weltevrede, University of Amsterdam.
https://digitalmethods.net
27. References
Gerlitz, Carolin, and Anne Helmond. “The Like Economy: Social Buttons and the Data-Intensive
Web.” New Media & Society 15.8 (2013): 1348–1365.
<http://nms.sagepub.com/content/15/8/1348>.
Helmond, Anne. “Historical Website Ecology. Analyzing Past States of the Web Using Archived
Source Code.” Web 25: Histories from the First 25 Years of the World Wide Web. Ed. Niels
Brügger. New York: Peter Lang Publishing, forthcoming. See Dropbox.
Helmond, Anne. “Website Ecologies: Redrawing the Boundaries of a Website.” The Web as
Platform: Data Flows in Social Media. PhD thesis. Amsterdam: University of Amsterdam, 2015.
132–165. <http://dare.uva.nl/record/1/485895>.
van der Velden, Lonneke. “The Third Party Diary: Tracking the Trackers on Dutch Governmental
Websites.” NECSUS. European Journal of Media Studies 3.1 (2014): 195–217.
<http://www.necsus-ejms.org/third-party-diary-tracking-trackers-dutch-governmental-websites-2/>
Editor's Notes
What are these elements that track our online behaviour?
What data do they collect?
What happens with the data?
What does that teach us about the data-intensive web?
In response to the proliferation of trackers, a variety of anti-tracker devices has emerged.
Ghostery makes money from donations & GhostRank. “We take that information, add our analysis, and sell it to brands and websites to help them evaluate their relationships with their marketing partners. Some ad tech companies use the data to compare themselves to their competition, while other research firms buy it to learn more about the industry.
We also provide data to consumer advocacy groups like the Better Business Bureau (BBB), journalists writing stories about privacy, and students and activists involved in related projects and papers.”
Makes use of ghostery database, code snippet, matching.
To detect the relative presence of Facebook tracking tools in the web, we took the top 1000
global websites according to Alexa and identified fingerprints of different tracking devices such as analytics, ad programs, widgets, social plugins. The first map highlights the websites using Facebook Social Plugins and Facebook Connect.
In our sample from 2012, around 18% of all websites feature at least one of these connections to Facebook, allowing users to engage with their content via Facebook features and enabling multiple data flows in the back end.
>>> Interesting to test if presence of platform trackers has grown.
The example project shows one particular approach.
>> Walkthrough starts
>> Walkthrough, see Google Doc - These settings really depend on your graph - I have adjusted the settings to work with the top25 adult sites.
With all that in mind we turn to a small exercise.
How do alternative actors participate in the global advertising economy?
Develop a method to study the ecosystem of a website over time, a website is embedded in a techno-commercial configuration - in which trackers play a role.
Media concentration?
Historical trackers - INTERNET ARCHIVE
Further developing the notion of how websites are informed by third parties - we created this guide - biological
Here we focused on who owns these trackers, what kind of data they collect (based on privacy policies etc.) and which countries the trackers are located in. The US plays a prominent role, raising questions about where data is stored.
Add references
Historical trackers Anne
Lonneke’s paper