This document summarizes the state of open research data by outlining its evolution over time. It begins with centralized data centers in the 1960s and progresses to more collaborative models of data sharing through community agreements and online supplementary materials. The benefits of open data are discussed, including increased reproducibility and citation advantages for authors who share. While fully open data is the ideal, achieving three stars on the 5-star scheme is presented as a realistic current target. The future may bring stricter funding and publishing requirements to encourage more widespread data sharing.
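The 5-star scheme referred to above is Tim Berners-Lee's cumulative rating for open data, where each star requires all the previous ones. A minimal sketch of that logic follows; the function and variable names are illustrative, not part of the talk:

```python
# Tim Berners-Lee's 5-star open data scheme, as invoked in the summary above.
FIVE_STAR_SCHEME = {
    1: "on the web under an open licence, in any format",
    2: "as machine-readable structured data (e.g. a spreadsheet, not a scan)",
    3: "in a non-proprietary open format (e.g. CSV rather than XLS)",
    4: "using URIs to identify things, so others can point at the data",
    5: "linked to other data, to provide context",
}

def star_rating(open_licence, structured, open_format, uses_uris, linked):
    """Return the cumulative star rating implied by a dataset's properties.

    Each star requires all the previous ones, so the rating stops at the
    first missing property. (Illustrative sketch; names are hypothetical.)
    """
    rating = 0
    for achieved in (open_licence, structured, open_format, uses_uris, linked):
        if not achieved:
            break
        rating += 1
    return rating

# A CSV file under CC0 on a plain web server, with no URIs or links: 3 stars.
print(star_rating(True, True, True, False, False))
```

This makes the talk's claim concrete: three stars demand only an open licence plus a structured, non-proprietary format, with no linked-data infrastructure required.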
Open Research Data: Licensing | Standards | Future, by Ross Mounce
This document provides an overview of open research data, including definitions, licensing, standards, and history. It defines open data as data that anyone can freely access, use, modify, and share with few restrictions. For data to be truly open, it recommends using a CC0 public domain waiver or an attribution-only license. It discusses issues with non-commercial and no-derivatives restrictions. The document also provides guidance on technical aspects like recommended file formats and standards. It briefly summarizes the history of data sharing, from centralized data centers to online supplementary data to emerging data paper journals. The key messages are that data should be FAIR (Findable, Accessible, Interoperable, Reusable) and that open data benefits both the authors who share and the wider research community.
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...", by Jonathan Tedds
This document discusses open access to research data and peer review of data publications. It notes that as a first step, data underpinning journal articles should be made concurrently available in accessible databases. The Royal Society report in 2012 advocated for all science literature and data to be online and interoperable. Key issues in linking data to the scientific record are data persistence, quality, attribution, and credit. The document provides examples from astronomy of data reuse leading to new publications, and cites a study finding that the reproducibility of ecological studies worsens over time as the underlying data sets become unavailable. It outlines different levels of research data, from raw to processed to published, and discusses initiatives for open data publication and peer review.
The document discusses the future of research communications and some of the drivers shaping it, including technological advances, funding levels, and the needs of the scientific community. It touches on a history of scientific publishing and culture. The core idea is that successful future systems will function as interconnected ecosystems where people, technologies, publications, and data all work together to support argument, evidence, and the scientific social process.
Published on Jan 29, 2016 by PMR
Keynote talk to LEARN (LERU/H2020 project) on research data management. Emphasizes that the problems are cultural, not technical. Promotes modern approaches such as Git and continuous integration, and announces DAT. Asserts that the Right to Read is the Right to Mine. Calls for widespread development of content mining (TDM).
Specimen-level mining: bringing knowledge back 'home' to the Natural History ..., by Ross Mounce
A talk given at the Geological Society of London, UK on 2016/03/09 as part of the Lyell meeting on Palaeoinformatics. http://www.geolsoc.org.uk/lyell16 #lyell16
NPG Scientific Data; SSP, Boston, May 2014: http://www.sspnet.org/events/annu..., by Susanna-Assunta Sansone
This document outlines the services provided by Scientific Data, a publication from Nature that helps authors publish, discover, and reuse research data. It provides structured metadata and a narrative component for Data Descriptors, which describe datasets in detail without new scientific findings. The publication works with over 50 repositories and provides submission assistance and semantic annotation to help authors find appropriate data archiving locations.
Liberating facts from the scientific literature - Jisc Digifest 2016, by TheContentMine
Published on Mar 4, 2016 by PMR
Text and data mining (TDM) techniques can be applied to a wide range of materials, from published research papers, books and theses, to cultural heritage materials, digitised collections, administrative and management reports and documentation, etc. Use cases include academic research, resource discovery and business intelligence.
This workshop will show the value and benefits of TDM techniques and demonstrate how ContentMine aims to liberate 100,000,000 facts from the scientific literature, and ContentMine will provide a hands on demo on a topical and accessible scientific/medical subject.
This document summarizes a presentation given by Peter Murray-Rust on open science. Some key points from the presentation include:
- Open science and open data are essential for young researchers and students to have the freedom to conduct research and change the world.
- Content mining scientific literature is important but publishers are attempting to control open data and restrict access, which hampers research progress.
- Past student movements have fought for openness and freedom in research, and new approaches may be needed now to change laws and policies to allow content mining while making all research outputs openly available.
This document discusses challenges with the current scientific publishing system and proposes a vision for next generation scientific publishing (NGSP). Some key problems include retractions due to misconduct, lack of reproducibility, and non-reusable data and methods. NGSP would feature transparent and computable data and methods, open annotation of narratives and objects, and no restrictions on text mining or remixing. It would move information more quickly and allow verification through an open, service-oriented system without walled gardens. Taking NGSP forward will require collaboration across stakeholders in research communications.
Talk to the EBI Industry group on open software for the chemical and pharmaceutical sciences. Covers examples from chemistry, with demos, and argues that all public knowledge should be openly accessible.
Enabling Precise Identification and Citability of Dynamic Data: Recommendations of the RDA Working Group, by LEARN Project
Enabling Precise Identification and Citability of Dynamic Data: Recommendations of the RDA Working Group, by Andreas Rauber – 2nd LEARN Workshop, Vienna, 6th April 2016
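The core of the RDA Working Group's approach is to version and timestamp the data, then cite a stored query together with its execution time and a fingerprint of its result set, so the exact subset can be re-identified and verified later. A minimal sketch of that idea follows; the function and field names are my own, not from the recommendations:

```python
import hashlib
import json
from datetime import datetime, timezone

def cite_dynamic_query(records, query_text):
    """Build a citation record for a query over dynamic data (sketch).

    Stores the query text, the execution timestamp, and an
    order-independent hash of the result set, so a later re-execution
    against the versioned data can be checked against the hash.
    """
    # Canonicalise the result set: sort records and keys so the hash does
    # not depend on the order in which rows were returned.
    canonical = json.dumps(
        sorted(records, key=lambda r: json.dumps(r, sort_keys=True)),
        sort_keys=True,
    )
    return {
        "query": query_text,
        "executed_at": datetime.now(timezone.utc).isoformat(),
        "result_hash": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }

citation = cite_dynamic_query(
    [{"id": 2, "value": "b"}, {"id": 1, "value": "a"}],
    "SELECT * FROM observations WHERE value IN ('a', 'b')",
)
```

In the full recommendations the stored query would also receive a persistent identifier, making the dynamic subset citable like any fixed dataset.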
The document discusses the MESUR (Making Use and Sense of Scholarly Usage Data) project which aims to develop new metrics for scholarly impact and prestige based on usage data from digital scholarly resources rather than just citations. The key points are:
1) MESUR analyzes over 1 billion usage events of scholarly articles and develops network-based metrics from usage patterns to map the structure of science.
2) Preliminary results show relevant structure in usage-based network maps that correlate with traditional citation-based metrics.
3) MESUR has produced a variety of usage and citation-based metrics and developed online tools for exploring these metrics.
Automatic Extraction of Knowledge from the Literature, by petermurrayrust
ContentMine tools (and the Harvest alliance) can be used to search the literature for knowledge, especially in biomedicine. All tools are open, and shortly we shall be indexing the complete daily scholarly literature.
FAIRDOM - FAIR Asset management and sharing experiences in Systems and Synthetic Biology, by Carole Goble
Over the past five years we have seen a change in expectations for the management of all the outcomes of research – that is, the "assets" of data, models, code, SOPs and so forth. Don't stop reading: data management isn't likely to win anyone a Nobel prize, but publications should be supported and accompanied by data, methods, procedures, etc. to ensure reproducibility of results. Funding agencies expect data (and increasingly software) management, retention and access plans as part of the proposal process for projects to be funded. Journals are raising their expectations of the availability of data and code for pre- and post-publication review. The multi-component, multi-disciplinary nature of Systems Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
The FAIR Guiding Principles for scientific data management and stewardship (http://www.nature.com/articles/sdata201618) have been an effective rallying cry for EU and US research infrastructures. The FAIRDOM (Findable, Accessible, Interoperable, Reusable Data, Operations and Models) Initiative has eight years of experience of asset sharing and data infrastructure, ranging across European programmes (SysMO and EraSysAPP ERANets), national initiatives (de.NBI, the German Virtual Liver Network, UK SynBio centres) and PIs' labs. It aims to support Systems and Synthetic Biology researchers with data and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety.
This talk will use the FAIRDOM Initiative to discuss the FAIR management of data, SOPs, and models for Systems Biology, highlighting the challenges of and approaches to sharing, credit, citation and asset infrastructures in practice. I'll also highlight recent experiments in influencing sharing through behavioural interventions.
http://www.fair-dom.org
http://www.fairdomhub.org
http://www.seek4science.org
Presented at COMBINE 2016, Newcastle, 19 September.
http://co.mbine.org/events/COMBINE_2016
NPG Scientific Data - Metabolomics Society meeting, Tsuruoka, Japan, 2014, by Susanna-Assunta Sansone
This document provides information about Scientific Data, an online publication from Nature Research that publishes peer-reviewed descriptions of scientifically valuable datasets. It summarizes the goals of Scientific Data, which are to promote data sharing, reuse, and reproducibility. The document outlines the structured format for Data Descriptors, which include both a narrative component and experimental metadata. It describes the peer review process, which focuses on data quality, completeness of description, and potential for reuse rather than novelty of findings. Finally, it provides examples of diverse current content and encourages collaboration with data repositories.
The document describes the SFX framework for context-sensitive reference linking, which allows a user accessing a citation to be redirected to an appropriate full text or service based on their context. The framework uses an OpenURL standard to pass citation metadata from a link source to a parsing server, which then sends the metadata to a linking server to determine the most relevant services and create dynamic links to them based on the user's access and the available library collections and resources. The goal is to provide context-sensitive services to users based on their access and the cited item metadata rather than relying on pre-computed static links.
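The linking flow described above can be illustrated by constructing an OpenURL 0.1-style link: the citing source encodes the citation metadata as key-value pairs on the institution's resolver URL, and the resolver decides, per user context, which full-text or service links to offer. The resolver address below is hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical resolver; in practice each institution runs its own base URL.
RESOLVER_BASE = "https://resolver.example.edu/sfx"

def openurl_link(metadata):
    """Encode citation metadata as an OpenURL-style query string (sketch).

    The resolver parses these key-value pairs and, based on the user's
    context and the library's collections, generates dynamic links to the
    most relevant services rather than a pre-computed static link.
    """
    return RESOLVER_BASE + "?" + urlencode(metadata)

link = openurl_link({
    "genre": "article",
    "aulast": "Van de Sompel",
    "date": "1999",
    "issn": "1082-9873",
    "volume": "5",
    "spage": "1",
})
```

The key design point is that the same citation yields different links for different users, because resolution happens at the user's institution rather than at the publisher.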
Research Data Sharing: A Basic Framework, by Paul Groth
Some thoughts on thinking about data sharing. Prepared for the 2016 LERU Doctoral Summer School - Data Stewardship for Scientific Discovery and Innovation.
http://www.dtls.nl/fair-data/fair-data-training/leru-summer-school/
The document discusses best practices for preparing data for open publication. It recommends thinking openly and planning early by creating detailed data management plans. It provides examples of repositories like GenBank, ClinicalTrials.gov, FlyBase, Figshare, and Dryad that accept different types of data. The document emphasizes documenting data thoroughly with metadata and standards and following ethical guidelines for sharing and preserving data in the long term.
Sources of Change in Modern Knowledge Organization Systems, by Paul Groth
Talk covering how knowledge graphs are making us rethink how change occurs in Knowledge Organization Systems. Based on https://arxiv.org/abs/1611.00217
Open access for researchers, policy makers and research managers - Short ver..., by Iryna Kuchma
Presented at Open Access: Maximising Research Impact, 23 April 2009, New Bulgarian University Library, Sofia. Open access for researchers: an enlarged audience, citation impact, tenure and promotion. Open access for policy makers and research managers: new tools to manage a university's image and impact. How to maximise the visibility of research publications, improve the impact and influence of the work, disseminate research results, showcase the quality of research in universities and research institutions, better measure and manage research in the institution, collect and curate digital outputs, generate new knowledge from existing findings, enable and encourage collaboration, and bring savings to the higher education sector with a better return on investment. What are the key functions for research libraries?
We describe current work in federating data from institutional research profiling systems – providing single-point access to substantial numbers of investigators through concept-driven search, visualization of the relationships among those investigators, and the ability to interlink systems into a single information ecosystem.
1. The document discusses research networking profiles created by the Clinical and Translational Science Institute at the University of California, San Francisco (CTSI at UCSF).
2. It notes that most universities have their own research networking profiles, like LinkedIn for researchers, to provide credibility and allow customization.
3. However, the document advocates connecting local profiles into a global research network using Linked Open Data, OAuth authentication, and OpenSocial technologies to facilitate collaboration between researchers across institutions.
The document discusses the W3C Open Annotation Data Model group and their work developing an interoperable data model for annotations. It aims to allow annotations of digital resources to be portable, aggregated, and shared across different applications and platforms. The group brings together the Annotation Ontology and Open Annotation Collaboration efforts to define a common model. The model defines the basic components of an annotation - body and target - and provides examples of use cases like bookmarking, commenting, and annotating text fragments and media.
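The body/target structure described above can be sketched in a few lines. The example below uses terms from the W3C Web Annotation vocabulary that grew out of the Open Annotation work; the URLs and annotation text are made up for illustration:

```python
import json

def make_annotation(body_text, target_url, exact=None):
    """Build a minimal body/target annotation (illustrative sketch).

    The body carries the annotation content (here, a textual comment);
    the target identifies the annotated resource, optionally narrowed to
    a text fragment via a TextQuoteSelector.
    """
    target = {"source": target_url}
    if exact is not None:
        target["selector"] = {"type": "TextQuoteSelector", "exact": exact}
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": body_text},
        "target": target,
    }

anno = make_annotation(
    "This claim needs a citation.",
    "http://example.org/articles/paper1",
    exact="data sharing improves reproducibility",
)
print(json.dumps(anno, indent=2))
```

Because the model is plain structured data rather than application markup, the same annotation can be stored, aggregated, and rendered by different platforms, which is the interoperability goal the group set out to achieve.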
This document provides an overview of open science and how to practice open science. It defines open science as research carried out and communicated in a way that allows others to contribute and collaborate. The benefits of open science include increased visibility, citations, and economic benefits when data is freely available. It recommends publishing papers through open access routes, sharing data and code openly with permissive licenses, and depositing outputs in repositories to practice open science. The document provides guidance on choosing file formats, metadata standards, and repositories to openly share research outputs.
This document discusses open science and research. It defines open science as making research transparent and accessible at all stages of the research process through open access, open data, open source code and open notebooks. It outlines the key elements of open science like open access publishing, open data repositories, open source software, citizen science and more. It also discusses open science initiatives in Europe, Africa and South Africa and the need for urgent policy actions to promote open science.
- Research infrastructures enable better science by building a common vision, allowing scientists to seamlessly share resources, applying economies of scale, and constructing new resources from combinations of shared ones.
- Open science means broader access to publicly funded research results through open access publications, data, software, methodologies, and more. This helps build on previous work, avoid duplication, speed innovation, and involve citizens.
- The European Commission's open access mandate requires beneficiaries to make publications and underlying data openly available, with possible sanctions for non-compliance like payment suspensions. Research infrastructures and open science publishing aim to increase transparency, reproducibility, and reuse of research outputs.
An introduction to open science, why it's important and how to do it. This presentation was given at the European Medical Students Association (EMSA) event, 'Open Access in Action' in Berlin on 14th-15th September 2015
Open science refers to making scientific research and data accessible to all. It includes open access to publications, open data, open source software, open notebooks, and citizen science. The European Union supports open science to increase transparency, collaboration and innovation in research. A workshop was held in South Africa to help develop an open science policy, with feedback that the policy will be finalized in September 2018 after additional workshops with European Union involvement. Open science aims to make the entire research process publicly available and reusable to maximize scientific progress.
The Culture of Research Data, by Peter Murray-Rust (LEARN Project)
1st LEARN Workshop. Embedding Research Data as part of the research cycle. 29 Jan 2016. Presentation by Peter Murray-Rust, ContentMine.org and University of Cambridge
The document discusses open science, which aims to make scientific research, data, and communication accessible to all levels of society. Open science includes practices like publishing open research, advocating for open access, and making it easier to share scientific knowledge. It involves transparency in methodology, public availability and reusability of data, and using online tools to facilitate collaboration. The document outlines some challenges in the current scientific community like high publication and subscription costs and closed databases. It also discusses directions for open science like open databases and platforms, publications, methodology, and software as well as open notebook science.
Keynote talk to LEARN (LERU/H2020 project) on research data management. Emphasizes that the problems are cultural, not technical. Promotes modern approaches such as Git and continuous integration, and announces DAT. Asserts that the Right to Read is the Right to Mine. Calls for widespread development of content mining (TDM).
Being an Open Scholar in a Connected World (Stian Håklev)
This document discusses the benefits of open scholarship in a connected world. It argues that open access to research articles makes information more accessible to broader audiences, including the general public and students. When data and research notes are openly shared online, it can enable unexpected reuse and collaboration. However, the current academic publishing and reward systems may not fully incentivize open scholarship. The document calls for exploring new models of peer review, metrics of impact, and ways of publishing research to make the scholarly process more transparent and collaborative.
Open Science: Openness in Scientific Research (pedjac)
The document discusses open science and defines it as research carried out and communicated in a manner that allows others to contribute, collaborate, and build upon the work by making all data, results, and protocols freely available. It outlines traditional norms of science like communalism and skepticism, but also notes counter-norms like secrecy and dogmatism that have emerged. The rise of intellectual property rights and industry-science relations are discussed in relation to debates around open versus closed models of science. The benefits of openness include efficiency, accessibility, avoiding duplications, and enabling new collaborations. However, motivations for individual researchers may not always align with openness due to the current research evaluation system. A survey of researchers found concerns around
Open Research comprises open access to the broad range of research outputs, from journal articles and the underlying data to protocols, results (including negative results), software and tools. Open Research increases inclusivity and collaboration, improves transparency and reproducibility of research and underpins research integrity.
This workshop focuses on the benefits of practicing open research for you as a researcher, to improve discoverability and maximise access to your work and to raise your professional profile.
By the end of the session you will:
• Have an understanding of the principles of Open Research
• Understand open licences and how they apply to publications, data and software
• Be able to apply key tools and techniques to increase the visibility of yourself and your research, including repositories, ORCID, social media and altmetrics
• Describe the different ways of making research and data available open access
This review demonstrates that using these websites can provide researchers with valuable sources of data and research, facilitating access to current literature and specialized scientific content. For optimal results, diversifying sources of research and using multiple search engines based on need and specialization is recommended.
Slides describing Force11 Work and background of several of the speakers, used for talks to University of Lethbridge, Carnegie Mellon and to Elsevier internally
An open science presentation focusing on the benefits to be gained and basic practices to follow. This was given on behalf of FOSTER at the Open Science Boos(t)camp event at KU Leuven on 24th October 2014.
The world of research data: when should data be closed, shared or open (heila1)
That research data should be shared with the rest of the world has become almost evangelical in nature. This paper will try to answer the following questions:
• What are the (real) reasons for ‘forcing’ scientists to open their data, even if they are not ready to do so?
• What right have non-scientists (and scientists) to push indiscriminately for the sharing of data without taking the nuances of research into consideration?
• Physical characteristics of research data before it can be shared
• Modes of data sharing
• Case study: public humiliation in the name of Open Science
• Advantages and disadvantages of sharing research data
• AI to the rescue of open research articles?
• In conclusion
OSFair2017 Training | Best practice in Open Science (Open Science Fair)
Iryna Kuchma talks about best practices in Open Science.
Workshop title: Fostering the practical implementation of Open Science in Horizon 2020 and beyond
Workshop overview:
This workshop will showcase some of the elements required for the transition to Open Science: services and tools, policies as guidance for good practices, and the roles of the respective actors and their networks.
DAY 2 - PARALLEL SESSION 4 & 5
Open Data (and Software, and other Research Artefacts): proper management (Oscar Corcho)
Presentation at the event "Let's do it together: How to implement Open Science Practices in Research Projects" (29/11/2019), organised by Universidad Politécnica de Madrid, where we discuss the need to take into account not only open access and open research data, but also all the other artefacts that result from our research processes.
This document discusses challenges around scholarly data, including fragmented and poorly described data. It emphasizes the importance of experimental details, data availability, and data publication for reproducibility. Springer Nature's Scientific Data is highlighted as a new open-access journal for detailed data descriptors. The Scientific Data ISA-explorer is presented as a web application for discovering, exploring and visualizing data descriptors.
7. Doing research: Literature review
- Authors are not paid for the papers they publish!
- Editorial boards (often) receive no monetary compensation for reviewing.
- Research projects are (mostly) publicly funded.
- Public institutions have to pay to access research articles.
- Content is created by research scientists.
We need a solution, please!
8. Doing research: Literature review
OPEN ACCESS (OA)*: "literature is digital, online, free of charge, and free of most copyright and licensing restrictions".
*Suber, P. Open Access; MIT Press: Cambridge, MA, USA, 2012; Chapter 1. Available online: http://mitpress.mit.edu/books/open-access
9. Doing research: Literature review
Cost of Open Access publishing:
*Table 4: H. Morrison, J. Salhab, A. Calvé-Genest, and T. Horava, "Open Access Article Processing Charges: DOAJ survey May 2014," Publications, vol. 3, no. 1, pp. 1–16, 2015.
10. Doing research: Literature review
2 types of OA:
Gold Open Access*: peer-reviewed journals that carry out peer review and often charge authors for publication.
Green Open Access*: repositories that host pre-prints and are free to access (e.g., arXiv).
*Suber, P. Knowledge Unbound; MIT Press: Cambridge, MA, USA, 2016; https://mitpress.mit.edu/books/knowledge-unbound
11. Doing research: Literature review
Open Access criticism:
- "Double dipping": charging both subscriptions and OA fees, a business model adopted by some hybrid-access journals.
- Watering down science by encouraging the publication of low-quality work (predatory Open Access publishers: pay to publish).
- Misinterpretation of research findings.
12. Doing research: Literature review
Checklist:
✓ Research papers
- [locked item, proceed to unlock]
- [locked item, proceed to unlock]
The Student is happy!
14. Doing research: Experimentation
- Research studies require data collection and analysis for hypothesis testing and the investigation of novel methods.
- Research equipment is expensive and requires maintenance and upgrades.
- Laboratory budgets cannot afford access to multiple commercial datasets.
15. Doing research: Data
OPEN DATA*: "Open Data is research data that is freely available on the internet permitting any user to download, copy, analyze, re-process, pass to software or use for any other purpose without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself."
* https://sparcopen.org/open-data/
16. Doing research: Data
Open Data repositories
- Community-driven project.
- Public data sources in 30 subjects:
● Agriculture
● Biology
● Climate + Weather
● Earth Science
● Economics
● ...
https://github.com/awesomedata/awesome-public-datasets
17. Doing research: Data
Open Data repositories
- EU-funded project.
- General-purpose OA repository:
○ Up to 50 GB of free space per dataset.
○ All research output accepted.
https://zenodo.org
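Beyond the web interface, Zenodo also exposes a public REST search API under /api/records. As a hedged sketch (the endpoint and the parameter names "q", "size" and "type" are assumptions to verify against Zenodo's current API documentation), here is how a search URL could be assembled without issuing any network request:

```python
from urllib.parse import urlencode

# Assumed Zenodo record-search endpoint; check the live API docs before use.
ZENODO_API = "https://zenodo.org/api/records"

def zenodo_search_url(query, size=10, resource_type="dataset"):
    """Build a Zenodo search URL for `query`, limited to `size` results.

    Parameter names are assumptions based on Zenodo's documented REST API.
    """
    params = urlencode({"q": query, "size": size, "type": resource_type})
    return f"{ZENODO_API}?{params}"

url = zenodo_search_url("EEG motor imagery")
print(url)
```

The resulting URL can then be fetched with any HTTP client; the JSON response would contain the matching record metadata.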
18. Doing research: Data
Open Data criticism:
The Skeptic
- Gaining profit from the labour of scientists.
- Privacy concerns.
- Misinterpretation and misuse of shared data.
21. Doing research: Software for Data Analysis
- Proprietary research software licenses are expensive.
- Reinventing the wheel: re-implementing existing data analysis methods cuts into the time available for significant research.
- Shortage of software developers in research laboratories.
22. Doing research: Source code
OPEN SOURCE*: "programmers or users can read, modify and redistribute the source code of a piece of software."
*S. Sonnenburg et al., "The Need for Open Source Software in Machine Learning," Journal of Machine Learning Research, vol. 8, pp. 2443–2466, 2007.
23. Doing research: Source code
Open Source software licences
S. Sonnenburg et al., "The Need for Open Source Software in Machine Learning," Journal of Machine Learning Research, vol. 8, pp. 2443–2466, 2007.
24. Doing research: Source code
Open Source hosting services
- Version control system (VCS) based services: GitHub, GitLab, Bitbucket.
- Private hosts: institutions' dedicated servers, e.g. this-uni.edu/lab/repo/project
- Research oriented: RunMyCode, which allows sharing the source code and data associated with a research publication.
25. Doing research: Source code
Open Source for science
- Python programming language ecosystem
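To give a minimal, hedged taste of that ecosystem, the sketch below uses NumPy alone (one representative building block of the stack the slide refers to) to fit a straight line to noisy synthetic measurements:

```python
import numpy as np

# Reproducible random numbers from NumPy's Generator API.
rng = np.random.default_rng(seed=0)

# Synthetic measurements: y = 3x + 1 plus Gaussian noise.
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 1.0 + rng.normal(0.0, 0.5, x.size)

# Ordinary least squares via np.linalg.lstsq on the design matrix [x, 1].
A = np.vstack([x, np.ones_like(x)]).T
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The recovered slope and intercept land close to the true values of 3 and 1, which is the whole pitch of the open scientific Python stack: a few lines replace what would otherwise require a proprietary analysis package.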
26. Doing research: Source code
Open Source scientific computing environment
- Anaconda: an open source, freemium scientific Python distribution that includes:
- Python 2.x & 3.x interpreters + a package manager.
- 1,000+ Anaconda-curated and community packages.
- IDEs: Jupyter, Spyder, RStudio.
- 6+ million users.
27. Doing research: Source code
Sponsoring Open Source
NumFOCUS (non-profit organization):
- Provides fiscal sponsorship for open source scientific data projects.
- Sponsored by: IBM, Microsoft, Bloomberg, Anaconda, Intel, ...
- 21 projects sponsored (so far).
28. Doing research: Source code
Open Source criticism:
The Skeptic
- Software companies benefit from Open Source software at zero cost and with zero compensation to its developers.
- Bad documentation and weak support.
- Issues with Open Source licence compatibility.
29. Doing research: Data analysis
Checklist:
✓ Research papers
✓ Dataset
✓ Source code
The Student is happy!
32. Open Science
S. Bartling and S. Friesike, "Open Science: One Term, Five Schools of Thought", in Opening Science: The Evolving Guide on How the Internet is Changing Research, Collaboration and Scholarly Publishing, 2014.
34. Open Science
Reproducibility, what for?
*Hugo Larochelle, "Some Opinions on Reproducibility in ML", Reproducibility Workshop, ICML 2017.
- Validating research findings: "If it isn't reproducible, it might just as well not exist!"*
- Accelerating novel research discoveries by reducing the time spent reproducing experiments.
- Measuring the impact of papers beyond citations.
- Overcoming the replication crisis and fighting research misconduct (e.g., p-hacking).
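The time-saving point above can be made concrete with a toy sketch (standard library only; the "experiment" is just a random-sampling stand-in, not any real study): pinning the random seed makes a computational result repeatable bit-for-bit, which is the first step toward letting others verify it.

```python
import random

def run_experiment(seed):
    """Toy 'experiment': estimate the mean of 1000 random draws.

    Passing an explicit seed makes the run fully reproducible; a local
    Random instance avoids hidden global state shared with other code.
    """
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    return sum(samples) / len(samples)

# Two runs with the same seed agree exactly; a different seed differs.
a = run_experiment(seed=42)
b = run_experiment(seed=42)
c = run_experiment(seed=7)
print(a == b, a == c)
```

Publishing the seed alongside code and data is a cheap habit that spares reviewers and re-users from chasing run-to-run variation.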
35. Open Science
MOABB (Mother of All Brain-Computer Interface Benchmarks):
Open Science project (work in progress)
- BCI benchmarking tool: compare different BCI algorithms for different EEG paradigms on different datasets.
- Created by the NeuroTechX community.
- Built on open source Python packages: MNE, scikit-learn, ...
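MOABB itself handles dataset download, paradigm definition, and cross-validated evaluation; the standard-library-only sketch below mimics only the core benchmarking idea, scoring two toy classifiers on the same held-out data. All names here are illustrative and are not MOABB's API.

```python
import random

# Illustrative benchmark loop in the spirit of what MOABB automates:
# evaluate several methods on identical data and report accuracy.
rng = random.Random(0)

# Synthetic 1-D two-class data: class 0 centred at -1, class 1 at +1.
data = [(rng.gauss(-1.0, 1.0), 0) for _ in range(200)] + \
       [(rng.gauss(+1.0, 1.0), 1) for _ in range(200)]
rng.shuffle(data)
train, test = data[:300], data[300:]

def nearest_mean(train):
    """Fit a nearest-class-mean classifier; return a predict function."""
    means = {}
    for label in (0, 1):
        xs = [x for x, y in train if y == label]
        means[label] = sum(xs) / len(xs)
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def majority(train):
    """Baseline: always predict the most common training label."""
    counts = {0: 0, 1: 0}
    for _, y in train:
        counts[y] += 1
    top = max(counts, key=counts.get)
    return lambda x: top

results = {}
for name, fit in [("nearest-mean", nearest_mean), ("majority", majority)]:
    predict = fit(train)
    acc = sum(predict(x) == y for x, y in test) / len(test)
    results[name] = acc
    print(f"{name}: accuracy = {acc:.2f}")
```

A real MOABB run swaps the synthetic data for published EEG datasets and the toy classifiers for scikit-learn pipelines, which is exactly why a shared, open benchmark is more trustworthy than per-paper evaluations.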