Bug Busters
A Study on Understanding People's Browsing Behaviours
Prerana Khatiwada, Miguel Angel Torres Sanchez, Aaron Liu,
Aman Sawhney, Nabiha Syed
Introducing "Bug Buster": Expanding Horizons for the
Community Comms Project
(Spotting Online News: A Mixed Methods Study of News Browsing
Behaviors to Inform Misinformation Interventions)
University of Delaware Computer & Information Sciences
The Problem
The internet has become an integral part of
our daily lives, with billions of users engaging
in various online activities.
Understanding user browsing behavior helps
businesses, marketers, and website designers
optimize their platforms, detect and flag
false information, and provide a better
user experience.
The Goal
The primary objective is to gain a comprehensive understanding of user browsing
behavior and reconstruct their online journey. Additionally, we aim to raise users'
awareness of their news consumption habits, focusing on the types of sites and
domains they predominantly engage with.
Our work primarily focused on the visualization aspects and some exploratory
analysis.
Data Collection
Special thanks to the Community Comm team of Sensify Lab at UDel for generously sharing the
data with us. The dataset consists of real-world browsing data collected from participants
in a two-week formative study through the passive-logging version of
the Chrome plugin.
Our Data: 152,501 rows × 7 columns
Steps/Challenges
1. Downloaded the raw data from Firebase.
2. Parsed it from Firebase using the service API key.
3. Preprocessed the dataset.
4. Tried to create a graph visualization from the structure of users' domain “tabbing”.
5. Several major issues occurred during the process. For instance, some users
appeared to have opened over 100 tabs simultaneously, all sharing the same tab
ID. Ideally, the ID should be distinct each time a user switches tabs. This problem
was likely caused by a bug in the data capture tool, and the affected data had to
be excluded from graph generation.
6. Lack of time.
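One way to exclude the records described in item 5 is to flag any tab ID that is tied to an implausible number of distinct domains. The Python sketch below only illustrates that idea and is not the filter actually used: the field names `user`, `tab_id`, and `domain`, the function name `drop_suspect_tabs`, and the cut-off of 100 are all assumptions and are not taken from the dataset's real schema.

```python
from collections import defaultdict

# Hypothetical cut-off, echoing the "over 100 tabs" observation.
MAX_DOMAINS_PER_TAB = 100


def drop_suspect_tabs(rows, max_domains=MAX_DOMAINS_PER_TAB):
    """Keep rows whose (user, tab_id) pair maps to at most `max_domains` distinct domains."""
    domains = defaultdict(set)
    for row in rows:
        domains[(row["user"], row["tab_id"])].add(row["domain"])
    # A (user, tab_id) pair is suspect when too many domains share it.
    suspect = {key for key, seen in domains.items() if len(seen) > max_domains}
    return [row for row in rows if (row["user"], row["tab_id"]) not in suspect]


# Hypothetical rows: user "u1" has one tab ID shared by 150 domains.
rows = [{"user": "u1", "tab_id": 7, "domain": f"site{i}.example"} for i in range(150)]
rows += [{"user": "u2", "tab_id": 3, "domain": "news.example"}]
clean = drop_suspect_tabs(rows)
```

Filtering on the (user, tab ID) pair rather than on the tab ID alone avoids discarding a healthy tab that happens to reuse the same number for a different user.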
Targeted Goals (Contd…)
We used ClickUp to manage our tasks.
As of day 2, this was the progress we had made.
How Is a “Session” Calculated?
1. The session perspective is based on the user's browsing activity.
2. A gap of more than 10 minutes between events indicates the start of a new
session.
3. Within a session, there can be multiple tabs open concurrently.
4. Each tab represents a sequence of URLs that the user follows during their
browsing session.
5. We used Python to call the Webshrinker API. The API has a limit: one account
allows only 100 lookups, so we created several accounts.
6. We send it URLs, and it classifies those URLs into 400 categories.
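The 10-minute rule in item 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the project: the function name `split_sessions` and the assumption that each event is a `(timestamp, url)` pair belonging to a single user are made up for the example.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=10)  # threshold stated on the slide


def split_sessions(events, gap=SESSION_GAP):
    """Group one user's (timestamp, url) events into sessions.

    A new session starts whenever the time since the previous event
    is strictly greater than `gap`.
    """
    sessions = []
    previous_time = None
    for timestamp, url in sorted(events, key=lambda event: event[0]):
        if previous_time is None or timestamp - previous_time > gap:
            sessions.append([])  # start a new session
        sessions[-1].append((timestamp, url))
        previous_time = timestamp
    return sessions


# Hypothetical events for a single user.
events = [
    (datetime(2023, 5, 1, 9, 0), "https://example.com/a"),
    (datetime(2023, 5, 1, 9, 4), "https://example.com/b"),
    (datetime(2023, 5, 1, 9, 30), "https://example.org/c"),
]
sessions = split_sessions(events)
```

Here the first two events fall into one session and the third, arriving 26 minutes later, opens a second one. Events from several tabs can sit in the same session, which matches item 3.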
Session Graph: Heterogeneous Edges
How Did We Categorize Articles/Websites?
We used Python to interact with the Webshrinker API, which limits a single
account to 100 lookups. To work around this constraint, we created multiple
accounts.
Our primary task involved providing URLs to the API, which then classified these URLs
into categories. With 400 categories available, we were able to categorize the
URLs for our analysis.
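Because the lookup limit is small, it helps to classify each domain only once and reuse the result for every URL on that domain. The Python sketch below shows that idea under stated assumptions: `classify` is a hypothetical stand-in for whatever function wraps the Webshrinker request (the real endpoint and response format are not shown here), `categorize_urls` is an illustrative name, and the default budget of 100 is the per-account limit mentioned above.

```python
from urllib.parse import urlparse


def categorize_urls(urls, classify, budget=100, cache=None):
    """Map each URL to a category, calling `classify` at most `budget` times.

    `classify` takes a domain string and returns a category label.
    URLs whose domain cannot be looked up within the budget map to None.
    """
    cache = {} if cache is None else cache
    calls = 0
    result = {}
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain not in cache:
            if calls >= budget:
                result[url] = None  # out of quota; leave for a later run
                continue
            cache[domain] = classify(domain)
            calls += 1
        result[url] = cache[domain]
    return result


# Hypothetical classifier used only to exercise the function.
def fake_classify(domain):
    return "news" if "news" in domain else "other"


urls = [
    "https://news.example.com/story-1",
    "https://news.example.com/story-2",
    "https://shop.example.com/item",
]
categories = categorize_urls(urls, fake_classify, budget=1)
```

With a budget of one lookup, both stories on `news.example.com` share a single call, and the third URL is left as `None` until more quota is available. Passing the same `cache` dictionary into later runs would carry results across batches.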
Data Insights
Future Outlooks
● Examine how users encounter news articles, their motivations for accessing them, and the factors that led
them to those specific articles.
● Investigate the speed of misinformation spread and analyze users' sequential reading patterns of articles.
● Generate a model without relying on human resources or external hiring, possibly creating a simulator for
the process.
● Explore the generation of heterogeneous graphs and study their properties in the context of user journeys.
Hopefully we will be able to fully create a model of this user experience.
We would also like to credit the creators of the icons used in these slides, which were sourced from the Noun Project.
1. RQ1: Determine the periods when the browser is actively focused and in use.
2. RQ2: Identify the domains on which the browser is primarily focused.
3. RQ3: Analyze the specific time intervals during which users are actively
engaged in browsing activities.
Kovacs, Geza. "Reconstructing detailed browsing activities from browser history." arXiv preprint arXiv:2102.03742 (2021).
This project could help application
developers explore recommendation
algorithms and understand how
interventions in browsing patterns might
improve media literacy.
Q&A

Bug_Busters_Hackathon_AICoE_UniversityofDelaware.pptx
