Editing Behavior over Time
Power vs. Standard Wikidata Editors
Cristina Sarasua*, Alessandro Checco, Gianluca Demartini, Djellel E. Difallah, Michael Feldman, Lydia Pintscher
sarasua@ifi.uzh.ch
@csarasuagar
WikidataCon 2017
6.8K - 8.7K
Active Editors
Source: 08.2016 - 08.2017 The Wikidata Revolution, Lydia Pintscher, Wikimania 2017
Ultimate Goal:
Help these editors find valuable work
to do in Wikidata
Editor Knowledge Base
1. Understand the differences in behaviour between power editors and standard editors
2. Be able to identify whether an editor will be a “power” or a “standard” editor
3. Provide a method that helps interested standard editors find their editing mission
Data-driven study
Discussion
Editor Types Evolution
Session-based: S1 S2 S3 S4
Month-based: M1 M2 M3 M4
# edits (volume)
● High
● Low
# months (lifespan)
● Long
● Short
Our Task
Editing Behaviour Over Time
Short or long lifespan?
Contribution
Participation
Diversity
High or low volume of edits?
What does the related work say?
“Wikipedians are born, not made. They don’t do more over time
and they maintain a high and constant level of participation.”
[Panciera et al. 2009, Data-driven study]
“‘Wikidatians’ acquire a higher sense of responsibility for their work, interact more with the community, take on more advanced tasks, and use a wider range of tools.”
[Piscopo et al. 2017, Interviews]
“There are different functional roles among editors: reference editor, item
editor, item creator, item expert, property editor, and property engineer.”
[Müller-Birn et al. 2015, Data-driven study]
Methodology
Data
● 139+K editors, 32+M edits, 7+M items (human edits, item pages, without tools)
● Grouped in sessions
Descriptive statistics
Statistical model to see trends among different editors
Classification method to predict the lifespan and the number of edits that an editor will have
What did we find?
Edit sessions
F1. Shorter times between edits than in Wikipedia, and a longer session cutoff (4.37 hours) than the one used for Wikipedia [Geiger et al. 2013]
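Grouping edits into sessions can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: it assumes that the 4.37 hours on the slide is an inactivity cutoff (a pause longer than that starts a new session), and the names `group_into_sessions` and `session_seconds` are hypothetical. The session duration it computes corresponds to the "# seconds spent (session)" indicator mentioned later in the deck.

```python
from datetime import datetime, timedelta

# Assumption: the 4.37 hours reported on the slide is used as an inactivity cutoff.
SESSION_GAP = timedelta(hours=4.37)


def group_into_sessions(timestamps, gap=SESSION_GAP):
    """Split one editor's edit timestamps into sessions.

    A new session starts whenever the pause since the previous edit exceeds `gap`.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # pause too long: open a new session
    return sessions


def session_seconds(session):
    """Seconds between the first and the last edit of a session."""
    return (session[-1] - session[0]).total_seconds()


# Made-up timestamps for one hypothetical editor.
edits = [
    datetime(2017, 1, 1, 9, 0),
    datetime(2017, 1, 1, 9, 5),
    datetime(2017, 1, 1, 12, 0),
    datetime(2017, 1, 1, 18, 0),
    datetime(2017, 1, 2, 8, 0),
]
sessions = group_into_sessions(edits)
print([len(s) for s in sessions])  # [3, 1, 1]
```

With a shorter cutoff (for example one hour), the same edits would split into more sessions, which is why the choice of cutoff matters for every session-based indicator.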
Editors and Items
F2. Few editors with many edits (and vice versa), few items with many editors (and vice versa)
Lifespan
F3. Few editors were active over almost 4 years; no linear relation between edit count and lifespan
F4. CONTRIBUTION
# edits (session, month)
# edits per item (session, month)
# items edited (session, month)
Editors with a longer lifespan tend to maintain a constant contribution. Others don’t.
Editors with a higher volume of edits tend to maintain a constant contribution. Others don’t (not as clear).
[Plots: indicator i1 (month-based), by lifespan and by editcount]
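The three contribution indicators can be computed per month along these lines. This is a sketch under assumptions: edits are represented as (timestamp, item id) pairs, the function name `monthly_contribution` is hypothetical, and the sample data is made up. The same aggregation would apply per session instead of per month.

```python
from collections import defaultdict
from datetime import datetime


def monthly_contribution(edits):
    """Aggregate one editor's edits per calendar month.

    `edits` is an iterable of (timestamp, item_id) pairs.
    Returns {(year, month): {"edits", "items", "edits_per_item"}}.
    """
    items_by_month = defaultdict(list)
    for ts, item_id in edits:
        items_by_month[(ts.year, ts.month)].append(item_id)

    result = {}
    for month, items in sorted(items_by_month.items()):
        distinct_items = len(set(items))
        result[month] = {
            "edits": len(items),                            # edits in the month
            "items": distinct_items,                        # items edited in the month
            "edits_per_item": len(items) / distinct_items,  # average edits per item
        }
    return result


# Made-up edits for one hypothetical editor.
edits = [
    (datetime(2017, 1, 3), "Q1"),
    (datetime(2017, 1, 3), "Q1"),
    (datetime(2017, 1, 20), "Q2"),
    (datetime(2017, 2, 1), "Q3"),
]
stats = monthly_contribution(edits)
print(stats[(2017, 1)])
```

An editor with a "constant contribution" in the sense of F4 would show similar values from month to month.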
F5. PARTICIPATION
# seconds spent (session)
Editors with a long lifespan maintain a constant participation. Others don’t.
Some editors with a high volume of edits maintain a constant participation.
[Plots: indicator i4 (session-based), by lifespan and by editcount]
F6. DIVERSITY
Entropy of the type of edit (session, month)
Editors with a long lifespan tend to increase the diversity of the type of their edits (month-based).
For the others, some increase and others decrease.
[Plots: indicator i5 (month-based), by lifespan and by editcount]
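Diversity as the entropy of edit types can be illustrated with Shannon entropy. This is a sketch with assumptions: the logarithm base (2) and the edit type names ("label", "claim", and so on) are not taken from the slides, and `edit_type_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def edit_type_entropy(edit_types):
    """Shannon entropy (in bits) of the distribution of edit types.

    0.0 means all edits are of one type; higher values mean more diverse editing.
    """
    counts = Counter(edit_types)
    if len(counts) <= 1:
        return 0.0  # no edits, or a single edit type: no diversity
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up months of one hypothetical editor; type names are illustrative only.
month_1 = ["label", "label", "label", "label"]
month_2 = ["label", "label", "claim", "claim"]
month_3 = ["label", "claim", "reference", "sitelink"]

print(edit_type_entropy(month_1))  # 0.0
print(edit_type_entropy(month_2))  # 1.0
print(edit_type_entropy(month_3))  # 2.0
```

An editor whose monthly values rise like this would count as increasing the diversity of their edits over time.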
Identifying power and standard editors
Lifespan prediction: F1-score for Random Forest and Logistic Classifier, predicting from different numbers of sessions
Volume of edits prediction: F1-score for Random Forest and Logistic Classifier, predicting from different numbers of sessions
15 months
100 edits
● Lifespan is predicted better than volume of edits.
● We can predict volume of edits better for standard editors than for power editors (both in session- and month-based evolution). For lifespan, prediction is better for power editors.
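How a per-class F1-score separates "power" from "standard" prediction quality can be sketched as below. Several things here are assumptions: that the slide labels "15 months" and "100 edits" are the cutoffs between the two groups, that the comparison is "greater than or equal", and all sample values and predictions, which are made up. The classifiers themselves (Random Forest, logistic classifier) are not reproduced; the predictions are simply given as a list.

```python
# Assumed cutoffs, read off the slide labels "15 months" and "100 edits".
LIFESPAN_CUTOFF_MONTHS = 15
VOLUME_CUTOFF_EDITS = 100


def label_lifespan(months):
    """Label an editor by lifespan (assumed rule: >= cutoff means power)."""
    return "power" if months >= LIFESPAN_CUTOFF_MONTHS else "standard"


def label_volume(edits):
    """Label an editor by volume of edits (assumed rule: >= cutoff means power)."""
    return "power" if edits >= VOLUME_CUTOFF_EDITS else "standard"


def f1_score(y_true, y_pred, positive):
    """F1-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up lifespans (in months) and made-up classifier output.
lifespans = [20, 16, 3, 1]
y_true = [label_lifespan(m) for m in lifespans]
y_pred = ["power", "standard", "standard", "standard"]

print(round(f1_score(y_true, y_pred, "power"), 3))
print(round(f1_score(y_true, y_pred, "standard"), 3))
```

Reporting the score per class is what makes statements such as "better for standard editors than for power editors" possible; a single overall accuracy would hide that difference.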
Conclusions from this research
● Skewed distribution in volume of edits.
● 46% of editors are presumably “gone”.
● Power editors (in contrast to standard editors) tend to have habits and be constant in contribution and participation.
● Power editors tend to increase the diversity of their types of actions over months.
How do we help standard users to
have editing habits that suit them?
Proposal
● Define intentions, resolutions
● Identify with roles and missions
● Publish calls for actions
● Define data needs
Standard Editors
? Power Editors, Data Providers
Individual / social missions Best practices dissemination
Method & Tool
Focus
Routines
@ Researchers, Developers
● Related theories to consider?
● What Wikidata tools to integrate in the process?
@ Editors, Community Managers
● Are there people who are overwhelmed and don’t know how to contribute best?
● How do we collect and disseminate tips and tricks about deciding what to edit?
● How can we enable 1:1 collaboration between power editors / data providers and standard users?
Big thanks!
Sponsors & supporters Wikidata community
References
Katherine Panciera, Aaron Halfaker, and Loren Terveen. 2009. Wikipedians are born, not made: a study of power editors on Wikipedia. In Proceedings of the ACM 2009 international conference on Supporting group work (GROUP '09). ACM, New York, NY, USA, 51-60. DOI: http://dx.doi.org/10.1145/1531674.1531682
Alessandro Piscopo, Christopher Phethean, and Elena Simperl. 2017. Wikidatians are born: paths to full participation in a collaborative structured knowledge base. In Proceedings of the 50th Hawaii International Conference on System Sciences. University of Hawaii, 4354-4363. DOI: 10.24251/HICSS.2017.527
Claudia Müller-Birn, Benjamin Karran, Janette Lehmann, and Markus Luczak-Rösch. 2015. Peer-production system or collaborative ontology engineering effort: what is Wikidata? In Proceedings of the 11th International Symposium on Open Collaboration (OpenSym '15). ACM, New York, NY, USA, Article 20, 10 pages. DOI: https://doi.org/10.1145/2788993.2789836
R. Stuart Geiger and Aaron Halfaker. 2013. Using edit sessions to measure participation in Wikipedia. In Proceedings of the 2013 conference on Computer supported cooperative work (CSCW '13). ACM, New York, NY, USA, 861-870. DOI: https://doi.org/10.1145/2441776.2441873
Image sources
Slide 6 Attribution Nalex.25 - Creative Commons Attribution-Share Alike 4.0 International
Slide 8 CC0 https://pixabay.com/en/books-education-school-literature-484766/
https://pixabay.com/en/hourglass-sand-watch-time-glass-1046841/
Slide 9 https://pixabay.com/en/question-mark-pile-question-mark-2492009/
CC0 https://pixabay.com/en/business-success-winning-chart-163464/
https://pixabay.com/en/code-technology-monitor-computer-2588957/
Slide CC0 https://pixabay.com/en/user-person-people-profile-account-1633249/
Slide 24 CC0 https://pixabay.com/en/user-group-icon-person-business-1275780/ https://pixabay.com/en/man-woman-question-mark-problems-2814937/
https://pixabay.com/en/map-travel-compass-magnifying-glass-2685795/
Slide 25 https://pixabay.com/en/protest-models-art-artist-2265287/
Slide 26 https://blog.wikimedia.de/2012/04/04/meet-the-wikidata-team/ photo by Phillip Wilke. CC-BY-SA-3.0
Slide 26 Group photo of Wikimania 2017 attendees. Photo by Victor Grigas/Wikimedia Foundation, CC BY-SA 4.0.

Editing Behavior over Time: Power vs. Standard Wikidata Editors

  • 1. Editing Behavior over Time: Power vs. Standard Wikidata Editors. Cristina Sarasua*, Alessandro Checco, Gianluca Demartini, Djellel E. Difallah, Michael Feldman, Lydia Pintscher. sarasua@ifi.uzh.ch @csarasuagar WikidataCon 2017
  • 2. 6.8K - 8.7K Active Editors Source: 08.2016 - 08.2017 The Wikidata Revolution, Lydia Pintscher, Wikimania 2017
  • 5. Ultimate Goal: Help these editors find valuable work to do in Wikidata
  • 6. Ultimate Goal: Help these editors find valuable work to do in Wikidata Editor Knowledge Base
  • 7. 1. Understand differences in the behaviour between power editors and standard editors. 2. Be able to identify whether an editor will be a “power” or “standard” editor. 3. Provide a method that helps interested standard editors find their editing mission. (Data-driven study; Discussion)
  • 8. Editor Types: # edits (volume): High or Low; # months (lifespan): Long or Short. Evolution: session-based (S1, S2, S3, S4) and month-based (M1, M2, M3, M4).
  • 9. Our Task: Editing Behaviour Over Time (Contribution, Participation, Diversity). Short or long lifespan?
  • 10. Our Task: Editing Behaviour Over Time (Contribution, Participation, Diversity). High or low volume of edits?
  • 11. What does the related work say? “Wikipedians are born, not made. They don’t do more over time and they maintain a high and constant level of participation.” [Panciera et al. 2009, Data-driven study] “Wikidatians acquire a higher sense of responsibility for their work, interact more with the community, take on more advanced tasks, and use a wider range of tools.” [Piscopo et al. 2017, Interviews] “There are different functional roles among editors: reference editor, item editor, item creator, item expert, property editor, and property engineer.” [Müller-Birn et al. 2015, Data-driven study]
  • 12. Methodology. Data: 139+K editors, 32+M edits, 7+M items (human edits, item pages, without tools), grouped in sessions. Descriptive statistics. Statistical model to see trends among different editors. Classification method to predict the lifespan and the volume of edits that an editor will have.
  • 13. What did we find?
  • 14. Edit sessions. F1. Shorter times between edits, and a longer session cutoff (4.37 hours) than in Wikipedia [Wikipedia: Geiger and Halfaker 2013]
  • 15. Editors and Items. F2. Few editors with many edits (and vice versa), few items with many editors (and vice versa)
  • 16. Lifespan. F3. Few editors worked for almost 4 years; no linear relation between edit count and lifespan
  • 17. F4. CONTRIBUTION: # edits (session, month), # edits per item (s, m), # items edited (s, m). Editors with longer lifespan tend to maintain a constant contribution. Others don’t. Editors with higher volume tend to maintain a constant contribution. Others don’t (not as clear). [Plots: i1, month-based, by lifespan and by edit count]
  • 18. F5. PARTICIPATION: # seconds spent (session). Editors with a long lifespan maintain a constant participation. Others don’t. Some editors with high volume of edits maintain a constant participation. [Plots: i4, session-based, by lifespan and by edit count]
  • 19. F6. DIVERSITY: entropy of type of edit (s, m). Editors with long lifespan tend to increase the diversity of the type of their edits (m). For the others, some increase and others decrease. [Plots: i5, month-based, by lifespan and by edit count]
  • 20. Identifying power and standard editors ● Lifespan is predicted better than volume of edits. [Plots: lifespan prediction and volume of edits prediction, each showing the F1-score for Random Forest and Logistic Classifier when predicting from different numbers of sessions. Plot labels: 15 months, 100 edits]
  • 21. Identifying power and standard editors ● Lifespan is predicted better than volume of edits. ● We can predict volume of edits better for standard editors than for power editors (both in session- and month-based evolution). Lifespan, in contrast, is predicted better for power editors. [Plots: lifespan prediction and volume of edits prediction, each showing the F1-score for Random Forest and Logistic Classifier when predicting from different numbers of sessions. Plot labels: 15 months, 100 edits]
  • 22. Conclusions from this research ● Skewed distribution in volume of edits. ● 46% of editors are presumably “gone”. ● Power editors (in contrast to standard editors) tend to have habits and be constant in contribution and participation. ● Power editors tend to increase the diversity of their types of actions over months.
  • 23. How do we help standard users to have editing habits that suit them?
  • 24. How do we help standard users to have editing habits that suit them?
  • 25. Proposal ● Define intentions, resolutions ● Identify with roles and missions ● Publish calls for actions ● Define data needs Standard Editors ? Power Editors, Data Providers Individual / social missions Best practices dissemination Method & Tool Focus Routines
  • 26. @ Researchers, Developers ● Related theories to consider? ● What Wikidata tools to integrate in the process? @ Editors, Community Managers ● Are there people overwhelmed who don’t know how to contribute best? ● How do we collect and disseminate tips and tricks about deciding what to edit? ● How can we enable 1:1 collaboration between power editors / data providers and standard users?
  • 27. Big thanks! Sponsors & supporters Wikidata community
  • 28. References
      Katherine Panciera, Aaron Halfaker, and Loren Terveen. 2009. Wikipedians are born, not made: a study of power editors on Wikipedia. In Proceedings of the ACM 2009 international conference on Supporting group work (GROUP '09). ACM, New York, NY, USA, 51-60. DOI: http://dx.doi.org/10.1145/1531674.1531682
      Alessandro Piscopo, Christopher Phethean, and Elena Simperl. 2017. Wikidatians are born: paths to full participation in a collaborative structured knowledge base. In Proceedings of the 50th Hawaii International Conference on System Sciences. University of Hawaii, 4354-4363. DOI: 10.24251/HICSS.2017.527
      Claudia Müller-Birn, Benjamin Karran, Janette Lehmann, and Markus Luczak-Rösch. 2015. Peer-production system or collaborative ontology engineering effort: what is Wikidata? In Proceedings of the 11th International Symposium on Open Collaboration (OpenSym '15). ACM, New York, NY, USA, Article 20, 10 pages. DOI: https://doi.org/10.1145/2788993.2789836
      R. Stuart Geiger and Aaron Halfaker. 2013. Using edit sessions to measure participation in Wikipedia. In Proceedings of the 2013 conference on Computer supported cooperative work (CSCW '13). ACM, New York, NY, USA, 861-870. DOI: https://doi.org/10.1145/2441776.2441873
  • 29. Image sources Slide 6 Attribution Nalex.25 - Creative Commons Attribution-Share Alike 4.0 International Slide 8 CC0 https://pixabay.com/en/books-education-school-literature-484766/ https://pixabay.com/en/hourglass-sand-watch-time-glass-1046841/ Slide 9 https://pixabay.com/en/question-mark-pile-question-mark-2492009/ CC0 https://pixabay.com/en/business-success-winning-chart-163464/ https://pixabay.com/en/code-technology-monitor-computer-2588957/ Slide CC0 https://pixabay.com/en/user-person-people-profile-account-1633249/ Slide 24 CC0 https://pixabay.com/en/user-group-icon-person-business-1275780/ https://pixabay.com/en/man-woman-question-mark-problems-2814937/ https://pixabay.com/en/map-travel-compass-magnifying-glass-2685795/ Slide 25 https://pixabay.com/en/protest-models-art-artist-2265287/ Slide 26 https://blog.wikimedia.de/2012/04/04/meet-the-wikidata-team/ photo by Phillip Wilke. CC-BY-SA-3.0 Slide 26 Group photo of Wikimania 2017 attendees. Photo by Victor Grigas/Wikimedia Foundation, CC BY-SA 4.0.
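
The session and diversity measures on slides 14, 18, and 19 can be illustrated with a short Python sketch. It is not the authors' code. It assumes that the 4.37 hours on slide 14 is the maximum gap between two consecutive edits of the same session, and that "entropy of type of edit" means Shannon entropy in bits over the edit types in a session. The function names and the edit type labels (`add_claim`, `set_label`) are hypothetical.

```python
import math
from collections import Counter
from datetime import datetime, timedelta

# Assumption: 4.37 hours (slide 14) is the inter-edit gap that ends a session.
SESSION_CUTOFF = timedelta(hours=4.37)


def group_into_sessions(edits, cutoff=SESSION_CUTOFF):
    """Split one editor's (timestamp, edit_type) pairs into sessions.

    A new session starts whenever the gap to the previous edit exceeds cutoff.
    """
    sessions = []
    for ts, edit_type in sorted(edits):
        if sessions and ts - sessions[-1][-1][0] <= cutoff:
            sessions[-1].append((ts, edit_type))
        else:
            sessions.append([(ts, edit_type)])
    return sessions


def session_seconds(session):
    """Seconds between the first and last edit of a session (participation)."""
    return (session[-1][0] - session[0][0]).total_seconds()


def edit_type_entropy(session):
    """Shannon entropy in bits of the edit types in a session (diversity)."""
    counts = Counter(edit_type for _, edit_type in session)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical edit log for a single editor.
edits = [
    (datetime(2017, 1, 1, 10, 0), "add_claim"),
    (datetime(2017, 1, 1, 10, 5), "set_label"),
    (datetime(2017, 1, 1, 20, 0), "add_claim"),
    (datetime(2017, 1, 1, 21, 0), "add_claim"),
]

for number, session in enumerate(group_into_sessions(edits), start=1):
    print(number, len(session), session_seconds(session), edit_type_entropy(session))
```

In this toy log the gap of almost ten hours between the second and third edit starts a second session. The first session mixes two edit types equally, while the second repeats a single type, so it has zero diversity under this definition.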