SlideShare a Scribd company logo
1 of 16
Download to read offline
HIGH-VALUE DATASETS
FROM PUBLICATION TO IMPACT
Elena Simperl
@esimperl
National Open Data Conference
December 3, 2020
How do people search, make sense of, and use
open data?
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
Frameworks
and models
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
Methods and guidance
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
Tools
HUMAN DATA INTERACTION
FRAMEWORKS, METHODS, TOOLS
Analysis
Analysis
HIGH-VALUE DATASETS
UNDERSTANDING USE THROUGH BEHAVIOUR ANALYSIS
TO PROVIDE GUIDANCE TO PUBLISHERS
Open
government
data
portals
• Search logs
• Data requests
Data
science
platforms
• Activity logs
2018 STUDY
ANALYSIS OF LOGS AND REQUESTS
Four national open government data portals, 2.2 million
queries from 2013 to 2016, 1500 data requests.
Data search is a work-related activity.
Shorter queries, include time and location, with varying
levels of granularity.
Explorative search, using keywords and filters.
Native and external queries topically different.
Data requests describe the data through boundaries and
restrictions on location, time, data type, granularity.
Kacprzak, E., Koesten, L., Ibáñez, L.D., Blount, T., Tennison, J. and Simperl, E., 2018. Characterising dataset search—An analysis of search logs and data requests. Journal of Web Semantics.
2020 STUDY
ANALYSIS OF LOGS
844,343 user sessions
(April 18 to June 20)
Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020
TOOLS TO FIND DATA
EDP is used to find datasets, but it is not the only tool
people use. 60% of sessions arrive to the EDP from the
web.
Changes in portal design impact traffic (and user
experience).
Covid-19 datasets were in high demand in 2020.
A large majority of users visit only one section of the
portal at a time.
Content on the site needs to be better interlinked (both
data and other pages). When links exist, people use them.
SEARCH APPROACHES AND AFFORDANCES
Filters are important: 60% of native sessions use
filters-only; 15-20% use keywords and filters.
Common search strategies: single-filter; keywords
first, then one or more filters.
Popular filters are country and category. Less so:
format, license.
Keyword queries are short, less use of time,
format and data attributes, more use of location.
SUCCESS IN DATASET SEARCH
20-40% of native queries and 8-25% of
external queries are successful.
Keywords + filters seem to work better. Might
also be a proxy for seasoned users.
IMPLICATIONS FOR PUBLISHERS
SEO strategy Filter
affordances
Granular
location data
Dataset
retrieval
Dataset
preview
pages
Links between
content and
datasets
WHAT’S NEXT
Studies on users and their
information needs.
Granular activity data captured
and shared by portals for new
studies.
Portals publish lots of data. They
now need to do more to become
data communities.
PUBLICATIONS
Talking Datasets — understanding data sensemaking behaviours. L Koesten, K
Gregory, P Groth, E Simperl. Currently under review at the International Journal
of Human-Computer Studies. 2020
Everything You Always Wanted to Know about a Dataset: Studies in Data
Summarisation. L Koesten, E Simperl, E Kacprzak, T Blount, J Tennison.
International Journal of Human-Computer Studies. 2019
Collaborative Practices with Structured Data: Do Tools Support what Users Need?
L Koesten, E Kacprzak, E Simperl, J Tennison; ACM CHI Conference on Human
Factors in Computing Systems, CHI 2019.
Dataset search: a survey. A Chapman, E Simperl, L Koesten, G Konstantinidis,
LD Ibáñez, E Kacprzak, P Groth. The International Journal on Very Large Data
Bases, 2019.
Characterising dataset search — An analysis of search logs and data requests. E
Kacprzak, L Koesten, LD Ibáñez, T Blount, J Tennison, E Simperl; Journal of Web
Semantics, 2018
Characterising Dataset Search on the European Data Portal: An Analysis of
Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal,
Analytical Report 18, 2020
The Trials and Tribulations of Working with Structured Data - a Study on
Information Seeking Behaviour. L Koesten, E Kacprzak, J Tennison, E Simperl.
Proceedings of ACM CHI Conference on Human Factors in Computing Systems,
CHI 2017.
Dataset Reuse: Toward Translating Principles to Practice. L Koesten, P Vougiouklis,
E Simperl, P Groth - Patterns, 2020
Pie Chart or Pizza: Identifying Chart Types and Their Virality on Twitter - P
Vougiouklis, L Carr, E Simperl - Proceedings of the International AAAI Conference
on Web and Social Media, 2020

More Related Content

What's hot

Crowdsourcing and citizen engagement for people-centric smart cities
Crowdsourcing and citizen engagement for people-centric smart citiesCrowdsourcing and citizen engagement for people-centric smart cities
Crowdsourcing and citizen engagement for people-centric smart citiesElena Simperl
 
Loops of humans and bots in Wikidata
Loops of humans and bots in WikidataLoops of humans and bots in Wikidata
Loops of humans and bots in WikidataElena Simperl
 
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...Istituto nazionale di statistica
 
Franck Rebillard, Professeur Université Paris 3
Franck Rebillard, Professeur Université Paris 3Franck Rebillard, Professeur Université Paris 3
Franck Rebillard, Professeur Université Paris 3SMCFrance
 
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet”
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet” DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet”
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet” Daniel X. O'Neil
 
Tweets are Not Created Equal. Intersecting Devices in the 1% Sample
Tweets are Not Created Equal. Intersecting Devices in the 1% SampleTweets are Not Created Equal. Intersecting Devices in the 1% Sample
Tweets are Not Created Equal. Intersecting Devices in the 1% SampleBernhard Rieder
 
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...Bernhard Rieder
 
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020P2Pvalue
 
Biosurveillance2.0 ranck digihealth feb 25
Biosurveillance2.0 ranck digihealth feb 25Biosurveillance2.0 ranck digihealth feb 25
Biosurveillance2.0 ranck digihealth feb 25Jody Ranck
 
The GIS Guide to Public Domain Data
The GIS Guide to Public Domain DataThe GIS Guide to Public Domain Data
The GIS Guide to Public Domain DataEsri
 
Tfsc disc 2014 si proposal (30 june2014)
Tfsc disc 2014 si proposal (30 june2014)Tfsc disc 2014 si proposal (30 june2014)
Tfsc disc 2014 si proposal (30 june2014)Han Woo PARK
 
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...Brittne Kakulla, Ph.D.
 
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Bernhard Rieder
 
Platforms and Analytical Gestures
Platforms and Analytical GesturesPlatforms and Analytical Gestures
Platforms and Analytical GesturesBernhard Rieder
 

What's hot (20)

Crowdsourcing and citizen engagement for people-centric smart cities
Crowdsourcing and citizen engagement for people-centric smart citiesCrowdsourcing and citizen engagement for people-centric smart cities
Crowdsourcing and citizen engagement for people-centric smart cities
 
Loops of humans and bots in Wikidata
Loops of humans and bots in WikidataLoops of humans and bots in Wikidata
Loops of humans and bots in Wikidata
 
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...
Giorgio Alleva, Data Innovation in Official Statistics: the Leading Role of O...
 
Innovations in Data for Decision Making
Innovations in Data for Decision MakingInnovations in Data for Decision Making
Innovations in Data for Decision Making
 
GI Management Transformation: from geometry to databased relationships
GI Management Transformation: from geometry to databased relationshipsGI Management Transformation: from geometry to databased relationships
GI Management Transformation: from geometry to databased relationships
 
Homelessness Data Discussion
Homelessness Data DiscussionHomelessness Data Discussion
Homelessness Data Discussion
 
Franck Rebillard, Professeur Université Paris 3
Franck Rebillard, Professeur Université Paris 3Franck Rebillard, Professeur Université Paris 3
Franck Rebillard, Professeur Université Paris 3
 
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet”
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet” DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet”
DXO On Big Data, Open Data, and the Perils of “Democracy by Spreadsheet”
 
Data Power
Data PowerData Power
Data Power
 
Ongoing Research in Data Studies
Ongoing Research in Data StudiesOngoing Research in Data Studies
Ongoing Research in Data Studies
 
Tweets are Not Created Equal. Intersecting Devices in the 1% Sample
Tweets are Not Created Equal. Intersecting Devices in the 1% SampleTweets are Not Created Equal. Intersecting Devices in the 1% Sample
Tweets are Not Created Equal. Intersecting Devices in the 1% Sample
 
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...
Analyzing Social Media with Digital Methods. Possibilities, Requirements, and...
 
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
#P2Pvalue at Share and inspire: Infoday on CAPS in Horizon 2020
 
Biosurveillance2.0 ranck digihealth feb 25
Biosurveillance2.0 ranck digihealth feb 25Biosurveillance2.0 ranck digihealth feb 25
Biosurveillance2.0 ranck digihealth feb 25
 
The GIS Guide to Public Domain Data
The GIS Guide to Public Domain DataThe GIS Guide to Public Domain Data
The GIS Guide to Public Domain Data
 
Community Data Program Submitted letter to Open Government Partneship
Community Data Program Submitted letter to Open Government PartneshipCommunity Data Program Submitted letter to Open Government Partneship
Community Data Program Submitted letter to Open Government Partneship
 
Tfsc disc 2014 si proposal (30 june2014)
Tfsc disc 2014 si proposal (30 june2014)Tfsc disc 2014 si proposal (30 june2014)
Tfsc disc 2014 si proposal (30 june2014)
 
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
No More Half Fast: Improving US Broadband Download Speed. Georgetown Universi...
 
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.Engines of Order. Social Media and the Rise of Algorithmic Knowing.
Engines of Order. Social Media and the Rise of Algorithmic Knowing.
 
Platforms and Analytical Gestures
Platforms and Analytical GesturesPlatforms and Analytical Gestures
Platforms and Analytical Gestures
 

Similar to High-value datasets: from publication to impact

The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so farElena Simperl
 
Open government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactOpen government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactElena Simperl
 
Data Discovery and Visualization
Data Discovery and VisualizationData Discovery and Visualization
Data Discovery and VisualizationDr. Neil Brittliff
 
Talk straps: Interactivity between Human and Artificial Intelligence
Talk straps: Interactivity between Human and Artificial IntelligenceTalk straps: Interactivity between Human and Artificial Intelligence
Talk straps: Interactivity between Human and Artificial IntelligenceGenoveva Vargas-Solar
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...michellep
 
An Introduction to Data Science.pptx learn
An Introduction to Data Science.pptx learnAn Introduction to Data Science.pptx learn
An Introduction to Data Science.pptx learnPavankalayankusetty
 
Emcien overview v6 01282013
Emcien overview v6 01282013Emcien overview v6 01282013
Emcien overview v6 01282013WCJones6348
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )IJDKP
 
Presentation emerging tecnology
Presentation  emerging tecnologyPresentation  emerging tecnology
Presentation emerging tecnologyAmalAltarge
 
Zeng marcia ifla-subjectaccesssmartdatadh
Zeng marcia ifla-subjectaccesssmartdatadhZeng marcia ifla-subjectaccesssmartdatadh
Zeng marcia ifla-subjectaccesssmartdatadhMarcia Zeng
 
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...Andrew Stott
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )IJDKP
 
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...John Makridis
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Fernando de Assis Rodrigues
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )IJDKP
 
Data Science Skills Study 2019 by AIM And Imarticus Learning
Data Science Skills Study 2019 by AIM And Imarticus LearningData Science Skills Study 2019 by AIM And Imarticus Learning
Data Science Skills Study 2019 by AIM And Imarticus LearningPraj H
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )IJDKP
 

Similar to High-value datasets: from publication to impact (20)

The web of data: how are we doing so far
The web of data: how are we doing so farThe web of data: how are we doing so far
The web of data: how are we doing so far
 
Open government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impactOpen government data portals: from publishing to use and impact
Open government data portals: from publishing to use and impact
 
Data Discovery and Visualization
Data Discovery and VisualizationData Discovery and Visualization
Data Discovery and Visualization
 
Talk straps: Interactivity between Human and Artificial Intelligence
Talk straps: Interactivity between Human and Artificial IntelligenceTalk straps: Interactivity between Human and Artificial Intelligence
Talk straps: Interactivity between Human and Artificial Intelligence
 
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
David Nicholas, Ciber: Audience Analysis and Modelling, the case of CIBER and...
 
An Introduction to Data Science.pptx learn
An Introduction to Data Science.pptx learnAn Introduction to Data Science.pptx learn
An Introduction to Data Science.pptx learn
 
Emcien overview v6 01282013
Emcien overview v6 01282013Emcien overview v6 01282013
Emcien overview v6 01282013
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Presentation emerging tecnology
Presentation  emerging tecnologyPresentation  emerging tecnology
Presentation emerging tecnology
 
Zeng marcia ifla-subjectaccesssmartdatadh
Zeng marcia ifla-subjectaccesssmartdatadhZeng marcia ifla-subjectaccesssmartdatadh
Zeng marcia ifla-subjectaccesssmartdatadh
 
Big Data
Big DataBig Data
Big Data
 
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...
Open Data in Practice: Five Years of Lessons Learned and Best Practice in ac...
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...
Big Data Analytics and Knowledge Discovery through Location-Based Social Netw...
 
data, big data, open data
data, big data, open datadata, big data, open data
data, big data, open data
 
Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...Identifying semantics characteristics of user’s interactions datasets through...
Identifying semantics characteristics of user’s interactions datasets through...
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Data Science Skills Study 2019 by AIM And Imarticus Learning
Data Science Skills Study 2019 by AIM And Imarticus LearningData Science Skills Study 2019 by AIM And Imarticus Learning
Data Science Skills Study 2019 by AIM And Imarticus Learning
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 

More from Elena Simperl

This talk was not generated with ChatGPT: how AI is changing science
This talk was not generated with ChatGPT: how AI is changing scienceThis talk was not generated with ChatGPT: how AI is changing science
This talk was not generated with ChatGPT: how AI is changing scienceElena Simperl
 
Knowledge graph use cases in natural language generation
Knowledge graph use cases in natural language generationKnowledge graph use cases in natural language generation
Knowledge graph use cases in natural language generationElena Simperl
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backElena Simperl
 
What Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringWhat Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringElena Simperl
 
Ten myths about knowledge graphs.pdf
Ten myths about knowledge graphs.pdfTen myths about knowledge graphs.pdf
Ten myths about knowledge graphs.pdfElena Simperl
 
What Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringWhat Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringElena Simperl
 
Data commons and their role in fighting misinformation.pdf
Data commons and their role in fighting misinformation.pdfData commons and their role in fighting misinformation.pdf
Data commons and their role in fighting misinformation.pdfElena Simperl
 
Are our knowledge graphs trustworthy?
Are our knowledge graphs trustworthy?Are our knowledge graphs trustworthy?
Are our knowledge graphs trustworthy?Elena Simperl
 
Qrowd and the city: designing people-centric smart cities
Qrowd and the city: designing people-centric smart citiesQrowd and the city: designing people-centric smart cities
Qrowd and the city: designing people-centric smart citiesElena Simperl
 
Inclusive cities: a crowdsourcing approach
Inclusive cities: a crowdsourcing approachInclusive cities: a crowdsourcing approach
Inclusive cities: a crowdsourcing approachElena Simperl
 
Making transport smarter, leveraging the human factor
Making transport smarter, leveraging the human factorMaking transport smarter, leveraging the human factor
Making transport smarter, leveraging the human factorElena Simperl
 
Quality and collaboration in Wikidata
Quality and collaboration in WikidataQuality and collaboration in Wikidata
Quality and collaboration in WikidataElena Simperl
 
Beyond monetary incentives: experiments with paid microtasks
Beyond monetary incentives: experiments with paid microtasksBeyond monetary incentives: experiments with paid microtasks
Beyond monetary incentives: experiments with paid microtasksElena Simperl
 
The business of open data
The business of open dataThe business of open data
The business of open dataElena Simperl
 
Open data – are we done
Open data – are we doneOpen data – are we done
Open data – are we doneElena Simperl
 

More from Elena Simperl (18)

This talk was not generated with ChatGPT: how AI is changing science
This talk was not generated with ChatGPT: how AI is changing scienceThis talk was not generated with ChatGPT: how AI is changing science
This talk was not generated with ChatGPT: how AI is changing science
 
Knowledge graph use cases in natural language generation
Knowledge graph use cases in natural language generationKnowledge graph use cases in natural language generation
Knowledge graph use cases in natural language generation
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
What Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringWhat Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineering
 
Ten myths about knowledge graphs.pdf
Ten myths about knowledge graphs.pdfTen myths about knowledge graphs.pdf
Ten myths about knowledge graphs.pdf
 
What Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineeringWhat Wikidata teaches us about knowledge engineering
What Wikidata teaches us about knowledge engineering
 
Data commons and their role in fighting misinformation.pdf
Data commons and their role in fighting misinformation.pdfData commons and their role in fighting misinformation.pdf
Data commons and their role in fighting misinformation.pdf
 
Are our knowledge graphs trustworthy?
Are our knowledge graphs trustworthy?Are our knowledge graphs trustworthy?
Are our knowledge graphs trustworthy?
 
Qrowd and the city: designing people-centric smart cities
Qrowd and the city: designing people-centric smart citiesQrowd and the city: designing people-centric smart cities
Qrowd and the city: designing people-centric smart cities
 
Qrowd and the city
Qrowd and the cityQrowd and the city
Qrowd and the city
 
Inclusive cities: a crowdsourcing approach
Inclusive cities: a crowdsourcing approachInclusive cities: a crowdsourcing approach
Inclusive cities: a crowdsourcing approach
 
Making transport smarter, leveraging the human factor
Making transport smarter, leveraging the human factorMaking transport smarter, leveraging the human factor
Making transport smarter, leveraging the human factor
 
Data storytelling
Data storytelling Data storytelling
Data storytelling
 
Quality and collaboration in Wikidata
Quality and collaboration in WikidataQuality and collaboration in Wikidata
Quality and collaboration in Wikidata
 
Beyond monetary incentives: experiments with paid microtasks
Beyond monetary incentives: experiments with paid microtasksBeyond monetary incentives: experiments with paid microtasks
Beyond monetary incentives: experiments with paid microtasks
 
The Data Pitch call
The Data Pitch callThe Data Pitch call
The Data Pitch call
 
The business of open data
The business of open dataThe business of open data
The business of open data
 
Open data – are we done
Open data – are we doneOpen data – are we done
Open data – are we done
 

Recently uploaded

Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 

Recently uploaded (20)

E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts Service
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 

High-value datasets: from publication to impact

  • 1. HIGH-VALUE DATASETS FROM PUBLICATION TO IMPACT Elena Simperl @esimperl National Open Data Conference December 3, 2020
  • 2. How do people search, make sense of, and use open data?
  • 4. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Frameworks and models
  • 5. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Methods and guidance
  • 6. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Tools
  • 7. HUMAN DATA INTERACTION FRAMEWORKS, METHODS, TOOLS Analysis Analysis
  • 8. HIGH-VALUE DATASETS UNDERSTANDING USE THROUGH BEHAVIOUR ANALYSIS TO PROVIDE GUIDANCE TO PUBLISHERS Open government data portals • Search logs • Data requests Data science platforms • Activity logs
  • 9. 2018 STUDY ANALYSIS OF LOGS AND REQUESTS Four national open government data portals, 2.2 million queries from 2013 to 2016, 1500 data requests. Data search is a work-related activity. Shorter queries, include time and location, with varying levels of granularity. Explorative search, using keywords and filters. Native and external queries topically different. Data requests describe the data through boundaries and restrictions on location, time, data type, granularity. Kacprzak, E., Koesten, L., Ibáñez, L.D., Blount, T., Tennison, J. and Simperl, E., 2018. Characterising dataset search—An analysis of search logs and data requests. Journal of Web Semantics.
  • 10. 2020 STUDY ANALYSIS OF LOGS 844,343 user sessions (April 18 to June 20) Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020
  • 11. TOOLS TO FIND DATA EDP is used to find datasets, but it is not the only tool people use. 60% of sessions arrive to the EDP from the web. Changes in portal design impact traffic (and user experience). Covid-19 datasets were in high demand in 2020. A large majority of users visit only one section of the portal at a time. Content on the site needs to be better interlinked (both data and other pages). When links exist, people use them.
  • 12. SEARCH APPROACHES AND AFFORDANCES Filters are important: 60% of native sessions use filters-only; 15-20% use keywords and filters. Common search strategies: single-filter; keywords first, then one or more filters. Popular filters are country and category. Less so: format, license. Keyword queries are short, less use of time, format and data attributes, more use of location.
  • 13. SUCCESS IN DATASET SEARCH 20-40% of native queries and 8-25% of external queries are successful. Keywords + filters seem to work better. Might also be a proxy for seasoned users.
  • 14. IMPLICATIONS FOR PUBLISHERS SEO strategy Filter affordances Granular location data Dataset retrieval Dataset preview pages Links between content and datasets
  • 15. WHAT’S NEXT Studies on users and their information needs. Granular activity data captured and shared by portals for new studies. Portals publish lots of data. They now need to do more to become data communities.
  • 16. PUBLICATIONS Talking Datasets — understanding data sensemaking behaviours. L Koesten, K Gregory, P Groth, E Simperl. Currently under review at the International Journal of Human-Computer Studies. 2020 Everything You Always Wanted to Know about a Dataset: Studies in Data Summarisation. L Koesten, E Simperl, E Kacprzak, T Blount, J Tennison. International Journal of Human-Computer Studies. 2019 Collaborative Practices with Structured Data: Do Tools Support what Users Need? L Koesten, E Kacprzak, E Simperl, J Tennison; ACM CHI Conference on Human Factors in Computing Systems, CHI 2019. Dataset search: a survey. A Chapman, E Simperl, L Koesten, G Konstantinidis, LD Ibáñez, E Kacprzak, P Groth. The International Journal on Very Large Data Bases, 2019. Characterising dataset search — An analysis of search logs and data requests. E Kacprzak, L Koesten, LD Ibáñez, T Blount, J Tennison, E Simperl; Journal of Web Semantics, 2018 Characterising Dataset Search on the European Data Portal: An Analysis of Search Logs. LD Ibáñez, E Kacprzak, L Koesten, E Simperl. European Data Portal, Analytical Report 18, 2020 The Trials and Tribulations of Working with Structured Data - a Study on Information Seeking Behaviour. L Koesten, E Kacprzak, J Tennison, E Simperl. Proceedings of ACM CHI Conference on Human Factors in Computing Systems, CHI 2017. Dataset Reuse: Toward Translating Principles to Practice. L Koesten, P Vougiouklis, E Simperl, P Groth - Patterns, 2020 Pie Chart or Pizza: Identifying Chart Types and Their Virality on Twitter - P Vougiouklis, L Carr, E Simperl - Proceedings of the International AAAI Conference on Web and Social Media, 2020