http://www.swell-project.net
Collecting a dataset of information behaviour in context
Maya Sappelli, TNO & Radboud University Nijmegen
Suzan Verberne, Radboud University Nijmegen
Saskia Koldijk, TNO & Radboud University Nijmegen
Wessel Kraaij, TNO & Radboud University Nijmegen
Supported by the Dutch National Program:
Information behaviour in context
But…
• Controlled search experiment
  - Lacks context for search
  - Unnatural motive/behaviour
• Uncontrolled data collection
  - Privacy issues
  - Noise
Data Collection
Data Labeling:
Event Stream to Event Blocks
[Figure: example of an event block (Outlook, Inbox)]
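The slide does not spell out how the event stream is segmented, so the following is only a minimal sketch under one assumption: an event block is a run of consecutive events that share the same application and window title. The `Event` fields, the `to_event_blocks` helper, and the sample stream are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass
class Event:
    # All field names are hypothetical; the real logger output may differ.
    timestamp: float      # seconds since the start of the recording
    app: str              # application in focus, e.g. "Outlook"
    window_title: str     # title of the active window, e.g. "Inbox"
    action: str           # e.g. "click" or "keypress"


def to_event_blocks(events):
    """Group consecutive events with the same (app, window_title) into blocks.

    Assumption: a block ends as soon as the application or window title changes.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)
    blocks = []
    for (app, title), run in groupby(ordered, key=lambda e: (e.app, e.window_title)):
        run = list(run)
        blocks.append({
            "app": app,
            "window_title": title,
            "start": run[0].timestamp,
            "end": run[-1].timestamp,
            "n_events": len(run),
        })
    return blocks


# Made-up stream: two Outlook events, one browser event, then Outlook again.
stream = [
    Event(0.0, "Outlook", "Inbox", "click"),
    Event(1.5, "Outlook", "Inbox", "keypress"),
    Event(4.0, "Chrome", "Google", "keypress"),
    Event(6.0, "Outlook", "Inbox", "click"),
]
blocks = to_event_blocks(stream)
for block in blocks:
    print(block)
```

Under this assumption, returning to the same window later starts a new block rather than extending the earlier one, which keeps each block contiguous in time.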
Data Labeling:
presenting Event Blocks
• Mechanical Turk
• 9416 event blocks
• Cohen’s kappa 0.78
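For readers unfamiliar with the agreement statistic, here is a minimal sketch of how Cohen's kappa is computed for two annotators who label the same items. The function name and the toy labels are illustrative assumptions; the toy data is made up and does not reproduce the 0.78 reported on the slide.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from each annotator's label
    frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (hypothetical, not from the dataset).
annotator_1 = ["Stress", "Stress", "Privacy", "Perth", "Perth", "No Label", "Einstein", "Einstein"]
annotator_2 = ["Stress", "Privacy", "Privacy", "Perth", "Perth", "No Label", "Einstein", "Perth"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for chance, so it is lower than the plain proportion of matching labels whenever some labels are frequent.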
Data Labeling: result
Distribution of labels (chart categories): Einstein, Information Overload, Stress, Healthy, Privacy, Perth, Roadtrip, Napoleon, Indeterminable, No Label
Total no. of event blocks: 9416
Average no. of event blocks per participant: 377
Examples of analyses with the data
• Stress-related behavioural research
• Information-related behavioural research
1. system-oriented
2. user-oriented
System-oriented analysis
• Total # queries: 980
• Of which followed by a click on a URL: 732
• Of which followed by a switch to Word/PowerPoint: 125
• Of which with Ctrl+C: 15; with a dwell time of >= 30 seconds: 44
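To put the counts above in proportion, the short sketch below expresses each one as a share of the 980 queries. Every share is taken relative to the total because the slide does not make clear which count the last two figures are a subset of; the variable names are illustrative only.

```python
# Counts as reported on the slide; labels are shortened for readability.
funnel = [
    ("queries", 980),
    ("followed by a click on a URL", 732),
    ("followed by a switch to Word/PowerPoint", 125),
    ("with Ctrl+C", 15),
    ("with a dwell time >= 30 s", 44),
]

total = funnel[0][1]
# Share of all queries, in percent, rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in funnel}

for name, count in funnel:
    print(f"{name:42s} {count:4d}  {shares[name]:5.1f}% of all queries")
```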
User-oriented analysis
Discussion: challenges
• Combining data from multiple sources is not trivial
• Incomplete queries logged due to Google’s query suggestions
• Clicks without change in Window title (esp. Google Images)
• Noise from browser logging
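One possible way to handle the incomplete queries mentioned above is sketched here. It assumes the log contains a sequence of partial queries, each extending the previous one as the user types, and keeps only the last query of every such run. This is an illustrative heuristic, not the authors' method; the function name and the sample log are hypothetical.

```python
def collapse_incremental_queries(queries):
    """Keep the final query of each run in which every query extends the previous one.

    Assumption: a logged query that starts with the previously logged query
    (case-insensitive) is a longer version of the same query, not a new search.
    """
    collapsed = []
    for query in queries:
        query = query.strip()
        if not query:
            continue  # skip empty log entries
        if collapsed and query.lower().startswith(collapsed[-1].lower()):
            collapsed[-1] = query  # same run: replace the shorter prefix
        else:
            collapsed.append(query)  # a new run starts here
    return collapsed


# Made-up log of partial queries as they might be recorded while typing.
logged = ["nap", "napol", "napoleon", "napoleon bonaparte", "perth", "perth weather"]
cleaned = collapse_incremental_queries(logged)
print(cleaned)
```

A limitation of this heuristic is that a genuine refinement which merely appends words to the previous query is merged as well, so timestamps would likely be needed to tell typing apart from reformulation.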
Conclusions
• Dataset of information behaviour of knowledge workers
• Main contributions of the dataset:
1. Combination of data types
2. Natural information seeking behaviour
3. In-context recordings
Nucleophilic Addition of carbonyl compounds.pptx
SSR02
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 

Recently uploaded (20)

Medical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptxMedical Orthopedic PowerPoint Templates.pptx
Medical Orthopedic PowerPoint Templates.pptx
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
 
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
3D Hybrid PIC simulation of the plasma expansion (ISSS-14)
 
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...
 
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốtmô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
mô tả các thí nghiệm về đánh giá tác động dòng khí hóa sau đốt
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.What is greenhouse gasses and how many gasses are there to affect the Earth.
What is greenhouse gasses and how many gasses are there to affect the Earth.
 
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Shallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptxShallowest Oil Discovery of Turkiye.pptx
Shallowest Oil Discovery of Turkiye.pptx
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.Micronuclei test.M.sc.zoology.fisheries.
Micronuclei test.M.sc.zoology.fisheries.
 
Nucleophilic Addition of carbonyl compounds.pptx
Nucleophilic Addition of carbonyl  compounds.pptxNucleophilic Addition of carbonyl  compounds.pptx
Nucleophilic Addition of carbonyl compounds.pptx
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 

Collecting a dataset of information behaviour in context

Editor's Notes

  1. Figure 4 shows an example of a behavioural analysis: a transition graph for the workers’ information behaviour. The size of each state represents its relative frequency. The state ‘query’ represents events where the active application is the web browser, showing a Google query and its results. The state ‘Google’ represents events where a Google page is active without a query. The state ‘OtherURL’ represents events where the active application is the web browser, with a URL other than Google. The transition probability between states S1 and S2 was calculated as count(S1 -> S2)/count(S2). Only transitions with a probability > 0.1 are shown. The graph shows that when users are asked to write reports or prepare presentations on a relatively new topic, they spend more time on web pages than in the report they are writing, and they switch frequently between URLs and the report in order to gather the relevant information. The graph also shows the relatively frequent interruptions by e-mail, which are known to be very common for knowledge workers (S. Whittaker and C. Sidner, 1996).
  2. Combining the data from the keylogging software and the browser history software was not trivial, even with exactly matching timestamps. First, the user could have multiple tabs active in the browser, and not all tab titles were separately recorded by the keylogging software. Second, users clicking one of Google’s query suggestions sometimes led to incomplete queries and missing URLs. For example, we found that the query ‘napol’ led to the URL http://nl.wikipedia.org/wiki/Napoleon_Dynamite. We suspect that this happened because the user selected the suggested query ‘napoleon dynamite’ after typing the prefix ‘napol’, and then clicked the Wikipedia URL. Third, in some cases the window title of the browser did not change when a user clicked on a result (especially when the click was on a result from Google Images), which caused the clicked URL to be included in the same event block as the query, so that dwell time was missing for this particular URL. Fourth, the browser logging resulted in a lot of noise. We had to filter out a large number of on-page social media plugins, advertisements and icons. In addition, browsing the Google domain leads to many additional URLs. An extreme example was 25 occurrences of the Google Maps URL http://maps.google.nl/maps?hl=nl&q=usa&bav=on.2 in one event block.
  3. We collected and preprocessed a dataset of information behaviour of knowledge workers in the context of realistic work tasks. The dataset is relatively small in terms of the number of participants, but large in terms of the types of information collected. The contributions of this dataset are: (a) it includes different types of data, including keylogging data, desktop video recordings and browser history; (b) the information seeking behaviour is completely natural because it results from the recording of user behaviour during report writing, e-mail reading and presentation preparation; (c) the search activities have been recorded together with the context of these other tasks, which allows for future research in context-aware information retrieval.
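
The transition probabilities described in note 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the `normalize_by` parameter and the example sequence are assumptions made for this sketch. Note 1 as written divides by count(S2), the destination state; the more conventional estimate divides by count(S1), the source state, so the sketch supports both and leaves the choice to the reader. State counts are taken over all events, so with source normalization the outgoing probabilities of the final state in the sequence can sum to slightly less than 1.

```python
from collections import Counter


def transition_probabilities(states, normalize_by="target", threshold=0.1):
    """Estimate transition probabilities from an ordered sequence of state labels.

    normalize_by="target" divides count(S1 -> S2) by count(S2), as note 1 states.
    normalize_by="source" divides by count(S1), the conventional estimate.
    Only transitions with a probability strictly above `threshold` are returned.
    """
    if normalize_by not in ("source", "target"):
        raise ValueError("normalize_by must be 'source' or 'target'")

    # Count each consecutive pair (S1, S2) and each individual state.
    pair_counts = Counter(zip(states, states[1:]))
    state_counts = Counter(states)

    probabilities = {}
    for (src, dst), n in pair_counts.items():
        denominator = state_counts[src if normalize_by == "source" else dst]
        p = n / denominator
        if p > threshold:
            probabilities[(src, dst)] = p
    return probabilities


if __name__ == "__main__":
    # Hypothetical event sequence using the state names from note 1.
    events = ["Google", "query", "OtherURL", "query", "OtherURL", "Google", "query"]
    for (src, dst), p in sorted(transition_probabilities(events).items()):
        print(f"{src} -> {dst}: {p:.2f}")
```

The 0.1 default mirrors the cut-off used for drawing the graph; raising it gives a sparser graph with only the dominant transitions.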