Wikipedia: the encyclopedia
anyone can edit…
even with Python
Miguel-Angel Monjas
mmonjas@gmail.com
miguelangelmonjas@wikimedia.es
Meetup Python Madrid, ICAI-ICADE, 2018-04-12
About me
Miguel-Angel Monjas:
● Telecom engineer, data scientist,
innovation coach… at Ericsson Spain
● Proud wikipedian since December 2004
● Member of Wikimedia Spain (wikimedia.es)
Author’s portrait © 2016-2018 Miguel-Angel Monjas
@mmonjaspro
http://www.linkedin.com/in/mmonjas
https://github.com/miguel-angel-monjas/
Wikipedia?
Old stuff but…
Pages
Pages:
history
Pages:
Wikitext
Categories
Users
Images
Sites
Wikidata
Items
A Python bot framework:
pywikibot
Concepts and how-to
Bots:
A definition
A bot is a software application
that performs automated tasks.
Typically, bots perform tasks
that are both simple and
structurally repetitive, at a
much higher rate than would
be possible for a human alone.
In the Wikimedia projects, a
bot is any software application
that modifies any element of
the project, for example by
uploading pictures or updating
a page, whether in bulk or not.
pywikibot
● Python library and
collection of tools that
automate work on
MediaWiki sites.
● https://www.mediawiki.org/wiki/Manual:Pywikibot
● https://doc.wikimedia.org/pywikibot/api_ref/pywikibot.html
pywikibot: Site
pywikibot: Page
pywikibot: FilePage
pywikibot: Category
pywikibot: User
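The Site, Page, FilePage, Category and User slides were screenshots whose content is not preserved in this transcript. As a rough sketch of how these Pywikibot classes are commonly used (the page title, category and user name below are placeholders chosen for illustration, summarize_page is a hypothetical helper, and a configured Pywikibot installation is assumed for demo()):

```python
def summarize_page(page):
    """Return basic facts about a Page-like object.

    Works with pywikibot.Page and with any object offering the same
    title(), exists() and text members (handy for offline experiments).
    """
    exists = page.exists()
    return {
        "title": page.title(),
        "exists": exists,
        "length": len(page.text) if exists else 0,
    }


def demo():
    """Query live wikis; needs Pywikibot installed and configured."""
    import pywikibot  # imported here so summarize_page works without it

    site = pywikibot.Site("es", "wikipedia")       # Spanish Wikipedia
    page = pywikibot.Page(site, "Madrid")          # placeholder title
    print(summarize_page(page))

    commons = pywikibot.Site("commons", "commons")
    image = pywikibot.FilePage(commons, "File:Madrid - Puerta Alcala 01.jpg")
    print(image.get_file_url())                    # direct URL of the file

    category = pywikibot.Category(site, "Categoría:Madrid")  # placeholder
    for article in category.articles(total=5):     # at most five members
        print(article.title())

    user = pywikibot.User(site, "ExampleUser")     # placeholder user name
    print(user.editCount())
```

Calling demo() performs network requests, so it is kept separate from the pure helper.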
Editing in Wikipedia
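The editing slides were screenshots that are not preserved here. Assuming they showed Pywikibot's usual edit cycle (read page.text, change it, call page.save()), a hedged sketch could look like this; append_category and save_if_changed are hypothetical helper names, not Pywikibot functions:

```python
def append_category(wikitext, category):
    """Return wikitext with a category link appended, unless already present."""
    link = f"[[Category:{category}]]"
    if link in wikitext:
        return wikitext
    return wikitext.rstrip("\n") + "\n" + link + "\n"


def save_if_changed(page, new_text, summary):
    """Store new_text on a Page-like object; return True if an edit was saved."""
    if page.text == new_text:
        return False             # nothing changed, so skip the edit
    page.text = new_text
    page.save(summary=summary)   # pywikibot.Page.save() publishes the edit
    return True


# Possible use against a sandbox page (requires a logged-in Pywikibot setup):
#   import pywikibot
#   site = pywikibot.Site("test", "wikipedia")
#   page = pywikibot.Page(site, "User:YOURUSERNAME/sandbox")
#   save_if_changed(page, append_category(page.text, "Example"), "Add category")
```

Keeping the text transformation separate from the save step makes it possible to check the new wikitext before anything is published.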
MediaWiki API
Another way to interact with the projects
The MediaWiki API
● The MediaWiki API is a web
service that provides access
to wiki features, data, and
meta-data over HTTP, via a
URL usually at api.php
● https://www.mediawiki.org/wiki/API:Main_page
● https://www.mediawiki.org/wiki/API:FAQ
URL elements
https://commons.wikimedia.org/w/api.php ?
action=query &
format=json &
titles=File:Madrid - Puerta Alcala 01.jpg &
prop=globalusage &
guprop=url|namespace &
gulimit=500
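● End point: https://commons.wikimedia.org/w/api.php
● Action: action=query
● Format: format=json
● Query-specific parameters: titles, prop, guprop, gulimit
● Method: POST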
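The request above can be assembled with the Python standard library alone. A minimal sketch follows; globalusage_params and build_request are illustrative names, not part of any MediaWiki client:

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://commons.wikimedia.org/w/api.php"


def globalusage_params(file_title, limit=500):
    """Parameters for a prop=globalusage query on one file."""
    return {
        "action": "query",          # action
        "format": "json",           # response format
        "titles": file_title,       # page(s) to query
        "prop": "globalusage",      # query type
        "guprop": "url|namespace",  # extra fields per usage
        "gulimit": str(limit),      # maximum usages returned
    }


def build_request(params, endpoint=API_ENDPOINT):
    """Return (url, body): a GET-style URL and the equivalent POST body."""
    body = urlencode(params)        # percent-encodes spaces, ':' and '|'
    return f"{endpoint}?{body}", body


url, body = build_request(
    globalusage_params("File:Madrid - Puerta Alcala 01.jpg")
)
print(url)
```

The encoded string can be appended to the end point or sent as the body of a POST request, which is the method named on the slide.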
[Figures: the request annotated with its query type; the response annotated with image title, page, namespace and project]
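The four annotated elements (image title, page, namespace, project) can be pulled out of a JSON response with a few lines of Python. This is only a sketch: the field names title, wiki, url and ns are my understanding of the globalusage output and should be checked against a live response, and the sample below is hand-written, not real API output.

```python
def list_global_usage(response):
    """Flatten a response into (image, page, namespace, project) tuples."""
    rows = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for use in page.get("globalusage", []):
            rows.append(
                (page.get("title"), use.get("title"), use.get("ns"), use.get("wiki"))
            )
    return rows


# Hand-written sample in the assumed response shape (illustrative values).
sample = {
    "query": {
        "pages": {
            "123": {
                "title": "File:Madrid - Puerta Alcala 01.jpg",
                "globalusage": [
                    {
                        "title": "Puerta_de_Alcalá",
                        "wiki": "es.wikipedia.org",
                        "url": "https://es.wikipedia.org/wiki/Puerta_de_Alcal%C3%A1",
                        "ns": "0",
                    }
                ],
            }
        }
    }
}

for row in list_global_usage(sample):
    print(row)
```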
Extension:Kartographer
Map capabilities in Wikimedia projects
Extension:
Kartographer
● A MediaWiki extension that
adds map capabilities to
Wikimedia projects
● https://www.mediawiki.org/wiki/Help:Extension:Kartographer
● Based on OpenStreetMap
● Enabled by the <mapframe>
tag within Wikimedia pages
● Content must be valid
GeoJSON
(http://geojson.org/)
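To make the last two points concrete, the sketch below builds a GeoJSON point feature and wraps it in a <mapframe> tag. The helper names are hypothetical, the coordinates are approximate values for the Puerta de Alcalá, and the attribute and property names (latitude, longitude, zoom, width, height, title, description, marker-symbol) follow the Kartographer help page as I understand it.

```python
import json


def point_feature(lon, lat, title, description="", symbol="marker"):
    """Build one GeoJSON Feature of type Point."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [lon, lat],  # GeoJSON order is longitude, latitude
        },
        "properties": {
            "title": title,
            "description": description,  # may hold wikitext, e.g. an image link
            "marker-symbol": symbol,
        },
    }


def mapframe(features, lat, lon, zoom=15, width=400, height=300):
    """Return wikitext for a <mapframe> tag containing the given features."""
    geojson = {"type": "FeatureCollection", "features": features}
    body = json.dumps(geojson, ensure_ascii=False, indent=2)
    return (
        f'<mapframe latitude="{lat}" longitude="{lon}" zoom="{zoom}" '
        f'width="{width}" height="{height}">\n{body}\n</mapframe>'
    )


tag = mapframe(
    [point_feature(-3.6887, 40.42, "Puerta de Alcalá")],
    lat=40.42,
    lon=-3.6887,
)
print(tag)
```

Generating the JSON with json.dumps rather than by hand helps keep the content valid, which the extension requires.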
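[Figure: rendered map with an image, a point and a link]
[Figure: wikitext of the <mapframe> tag, annotated with tag, map center and map size]
[Figure: GeoJSON content, annotated with feature type (point), coordinates, image, link and marker features]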
PAWS (Pywikibot: A Web Shell)
A local Pywikibot deployment is no
longer needed
● PAWS is a Jupyter Notebook
Server provided by the
Wikimedia Cloud Services.
● Pre-integrated with Pywikibot
and many other Python
packages
● pip, Git available…
● Authenticated (OAuth) with
your Wikimedia account.
● https://paws.wmflabs.org
● http://paws-public.wmflabs.org/paws-public/User:YOURUSERNAME/
The home directory (minus
secret credential files) is public
by default
Attribution-ShareAlike 4.0
(CC BY-SA 4.0)
Except where otherwise noted, this work by Miguel-Angel Monjas is licensed under
https://creativecommons.org/licenses/by-sa/4.0/
Thank you!
mmonjas@gmail.com
miguelangelmonjas@wikimedia.es