Bug_Busters_Hackathon_AICoE_UniversityofDelaware.pptx

Bug Busters
A Study on Understanding People's Browsing Behaviours
Prerana Khatiwada, Miguel Angel Torres Sanchez, Aaron Liu,
Aman Sawhney, Nabiha Syed

Introducing "Bug Buster": Expanding Horizons for the
Community Comms Project
(Spotting Online News: A Mixed Methods Study of News Browsing
Behaviors to Inform Misinformation Interventions)
University of Delaware Computer & Information Sciences

The Problem
The internet has become an integral part of
our daily lives, with billions of users engaging
in various online activities.
Understanding user browsing behavior is
important for businesses, marketers, and
website designers who want to optimize their
platforms, detect and flag false information,
and provide a better user experience.

The Goal
The primary objective is to gain a comprehensive understanding of user browsing
behavior and reconstruct their online journey. Additionally, we aim to raise users'
awareness of their news consumption habits, focusing on the types of sites and
domains they predominantly engage with.
Our work primarily focused on the visualization aspects and some exploratory
analysis.

Data Collection
Special thanks to the Community Comms team of Sensify Lab at UDel for generously sharing the
data with us. The dataset consists of real-world browsing logs collected from participants
who took part in the two-week formative study using the passive logging version of
the Chrome plugin.

Our data: 152,501 rows × 7 columns

Steps/Challenges
1. Downloaded the raw data from Firebase.
2. Parsed it from Firebase using the service API key.
3. Preprocessed the dataset.
4. Tried to create a graph visualization from the structure of users' domain “tabbing”.
5. Several major issues occurred during the process. For instance, some users
appeared to have opened over 100 tabs simultaneously, all sharing the same tab
ID. Ideally, each time a user switches tabs, the IDs should be distinct. This
problem, likely caused by a bug in the data capture tool, meant that such data
had to be excluded from graph generation.
6. Lack of time.
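
As an illustration of step 4, here is a minimal sketch of how domain-to-domain edges could be counted from tab activity. It is not the team's actual code: the event layout (timestamp, tab_id, domain) and the function name domain_transitions are assumptions made for this example, since the slides do not list the seven columns of the dataset.

```python
from collections import Counter, defaultdict


def domain_transitions(events):
    """Count directed domain-to-domain transitions within each tab.

    `events` is an iterable of (timestamp, tab_id, domain) tuples.
    This layout is an assumption; the real dataset columns are not
    given in the slides.
    """
    # Group the visited domains by tab, in chronological order.
    by_tab = defaultdict(list)
    for _ts, tab_id, domain in sorted(events):
        by_tab[tab_id].append(domain)

    # Each pair of consecutive, different domains in a tab is one edge.
    edges = Counter()
    for domains in by_tab.values():
        for src, dst in zip(domains, domains[1:]):
            if src != dst:
                edges[(src, dst)] += 1
    return edges


# Hypothetical events for two tabs.
sample_events = [
    (1, "t1", "google.com"),
    (2, "t1", "cnn.com"),
    (3, "t2", "google.com"),
    (4, "t1", "bbc.com"),
    (5, "t2", "cnn.com"),
    (6, "t2", "cnn.com"),
]
edges = domain_transitions(sample_events)
for (src, dst), count in sorted(edges.items()):
    print(f"{src} -> {dst}: {count}")
```

The resulting edge counts can be handed to any graph library for drawing. Tabs whose IDs are unreliable (see step 5) would need to be filtered out before this step.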

Targeted Goals (Contd…)
We used Clickup for managing our tasks.
As of day 2, this was the progress we had accomplished.

Targeted Goals (Contd…)
As of day 2, this was the progress we accomplished.

How is a “Session” calculated?
1. The session perspective is based on the user's browsing activity.
2. A gap of more than 10 minutes between events indicates the start of a new
session.
3. Within a session, there can be multiple tabs open concurrently.
4. Each tab represents a sequence of URLs that the user follows during their
browsing session.
5. We used Python to call the Webshrinker API. The API has a usage limit of
100 hits per account, so we created several accounts.
6. We send the API a URL and it classifies that URL into one of 400 categories.

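
The 10-minute rule in point 2 can be sketched as follows. This is a minimal example that assumes each event carries a Python datetime timestamp. The function name split_sessions and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=10)  # threshold stated in the slides


def split_sessions(timestamps, gap=SESSION_GAP):
    """Group event timestamps into sessions.

    A new session starts whenever the time since the previous event
    is strictly greater than `gap`.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)  # still inside the current session
        else:
            sessions.append([ts])    # gap exceeded (or first event)
    return sessions


# Hypothetical event times for one user.
sample_times = [
    datetime(2023, 1, 1, 9, 0),
    datetime(2023, 1, 1, 9, 4),
    datetime(2023, 1, 1, 9, 30),  # 26 minutes after the previous event
    datetime(2023, 1, 1, 9, 35),
    datetime(2023, 1, 1, 9, 45),  # exactly 10 minutes: same session
]
sessions = split_sessions(sample_times)
print([len(s) for s in sessions])
```

A gap of exactly 10 minutes stays in the same session here, because the slides say a new session starts after "more than" 10 minutes.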
Session Graph: Heterogeneous Edges

How did we categorize articles/websites?
We used Python to interact with the Webshrinker API, which, unfortunately, has a
usage limit. To work within this constraint, we created multiple accounts, since a single
account only allows 100 hits to websites.
Our primary task involved providing URLs to the API, which then classified these URLs
into various categories. With an extensive range of 400 categories available, we were
able to efficiently categorize the URLs for our analysis.
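
With a limit of 100 requests per account, one way to stretch the quota is to classify each distinct domain only once. The sketch below illustrates that idea and is not the team's code. The classify argument is a hypothetical stand-in for whatever function actually calls the categorization API, and the names unique_domains and categorize are made up for this example.

```python
from urllib.parse import urlparse


def unique_domains(urls):
    """Return the distinct hostnames in `urls`, in first-seen order."""
    seen = {}
    for url in urls:
        host = urlparse(url).hostname or ""  # hostname is already lowercased
        if host.startswith("www."):
            host = host[4:]
        if host:
            seen.setdefault(host, None)
    return list(seen)


def categorize(urls, classify):
    """Map each distinct domain to a category, calling `classify` once per domain.

    `classify` is a placeholder for the real API call (assumption).
    """
    return {domain: classify(domain) for domain in unique_domains(urls)}


# Hypothetical URLs and a fake classifier that records how often it is called.
sample_urls = [
    "https://www.cnn.com/a",
    "https://cnn.com/b",
    "https://www.bbc.co.uk/news",
    "http://CNN.com/c",
]
calls = []


def fake_classify(domain):
    calls.append(domain)
    return "news"


categories = categorize(sample_urls, fake_classify)
print(categories)
print(f"{len(sample_urls)} URLs needed {len(calls)} classification calls")
```

Classifying by domain assumes that every page on a site belongs to the same category, which may not hold for large, mixed-content sites.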

Future Outlooks
● Examine how users encounter news articles, their motivations for accessing them, and the factors that led
them to those specific articles.
● Investigate the speed of misinformation spread and analyze users' sequential reading patterns of articles.
● Generate a model without relying on human resources or external hiring, possibly creating a simulator for
the process.
● Explore the generation of heterogeneous graphs and study their properties in the context of user journeys.
Hopefully we will be able to fully create a model of this user experience.
We would also like to credit the creators of the icons used in these slides, which were sourced from the Noun Project.

1. RQ1: Determine the periods when the browser is actively focused and in use.
2. RQ2: Identify the domains on which the browser is primarily focused.
3. RQ3: Analyze the specific time intervals during which users are actively
engaged in browsing activities.
Kovacs, Geza. "Reconstructing detailed browsing activities from browser history." arXiv preprint arXiv:2102.03742 (2021).
The impact of this project could be
helping application developers to explore
recommendation algorithms and how
interventions in browsing patterns might
improve media literacy.
