- The sbv IMPROVER project is a crowdsourcing platform led by PMI R&D to verify methods in industrial research through challenges in data science, biology and medicine. It aims to provide quality control of company research.
- Challenges follow five stages: preparation, launch, running the challenge, ranking submissions, and sharing results. Defining precise questions helps obtain focused solutions.
- Challenges engage crowds of young researchers interested in machine learning and data science. Advertising occurs through social media, conferences, and directly engaging previous participants.
- Benefits include scientific publications, learning, and driving innovation through crowdsourced verification of methods. Maintaining the platform requires significant communication effort, but each challenge engages a challenge-specific crowd.
Crowdsourcing à la sbv IMPROVER: the challenge of being your own client
1. www.sbvimprover.com
Dr. Adrian Stan
Senior Scientist
PMI R&D, Philip Morris Products S.A.,
Quai Jeanrenaud 5, CH-2000,
Neuchatel, Switzerland
CrowdSourcing Week Global, September 12th, 2019, San Francisco, USA
2. WHAT TYPE OF CROWDSOURCING IS SBV?
Data Science
Biology
Medicine
Found at the intersection of these disciplines, sbv IMPROVER
aims to provide a measure of quality control of industrial research
and development by verifying the methods being used.
The sbv IMPROVER project is a collaborative effort led and funded
by PMI Research and Development.
3. SOME HISTORY…
Diagnostic Signature Challenge
Network Verification Challenge I
Species Translation Challenge
Network Verification Challenge II
Systems Toxicology Challenge I
Singapore Datathon
Israel Epigenomics Challenge
Japan Biological Interpretation Datathon
Microbiomics Challenge
Systems Toxicology Challenge II
Metagenomics Diagnosis for IBD Challenge
Timeline: 2012 to 2019
7. HOW DOES IT WORK FOR THE COMPANY?
Project pipeline: Formulation, Delivery 1, Delivery 2, Delivery 3, Delivery 4
Benchmark/Validate: A step in the pipeline of the project that needs validation or benchmarking.
For example, this step requires the use of a machine learning method.
8. THE FIVE STAGES OF A CHALLENGE
• Prepare: Define a scientific question, gather data, formulate the problem, and prepare the website.
• Launch: Launch the website and advertise to the community - use social media, advertisement companies, etc.
• Run: Organise webinars, answer questions from participants, and maintain the website. Keep advertising.
• Rank: Gather the solutions, score them, and rank the participants.
• Share: Publish the scientific outcome of the challenge, promote it at conferences.
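The Rank stage (gather, score, rank) can be sketched in a few lines. This is only an illustration, not the scoring procedure used by sbv IMPROVER: the team names, the scores, and the `rank_participants` helper are hypothetical, and the sketch assumes that higher scores are better and that tied teams share a rank.

```python
from typing import Dict, List, Tuple


def rank_participants(scores: Dict[str, float]) -> List[Tuple[int, str, float]]:
    """Return (rank, team, score) tuples, best score first.

    Assumption: higher scores are better and tied teams share a rank
    ("1, 2, 2, 4" style). The slides do not describe the actual rule.
    """
    # Sort by descending score; break ties alphabetically for a stable order.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    ranking: List[Tuple[int, str, float]] = []
    for position, (team, score) in enumerate(ordered, start=1):
        if ranking and ranking[-1][2] == score:
            rank = ranking[-1][0]  # tie: reuse the previous team's rank
        else:
            rank = position
        ranking.append((rank, team, score))
    return ranking


# Hypothetical scores for three submissions.
for rank, team, score in rank_participants(
    {"team_a": 0.81, "team_b": 0.92, "team_c": 0.81}
):
    print(rank, team, score)
```

Sharing ranks on ties is one possible design choice; a real challenge might instead break ties with a secondary metric.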
9. THE FIVE STAGES OF A CHALLENGE
TIME CONSUMING
Prepare Launch Run Rank Share
• Over the years, we have learned to define the scope
of challenges such that the scoring part has become
more efficient. In general, broader questions lead to
a very broad range of answers.
• Because of the emphasis on verification of the sbv
IMPROVER platform, we can define precise
questions and even offer template files for results.
[Figure: matrix relating the scope of the question (narrow or broad) to how fast or slow the Prepare and Rank stages are.]
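The point about template files can be illustrated with a small check that rejects malformed submissions before any scoring happens, which is one reason narrow questions make ranking faster. This is a sketch under stated assumptions: the column names (`sample_id`, `predicted_class`), the allowed labels, and the `validate_submission` helper are hypothetical and not taken from an actual sbv IMPROVER template.

```python
import csv
import io
from typing import List

# Hypothetical template; the real challenge templates are not shown in the slides.
EXPECTED_COLUMNS = ["sample_id", "predicted_class"]
ALLOWED_CLASSES = {"case", "control"}


def validate_submission(csv_text: str, expected_ids: List[str]) -> List[str]:
    """Return a list of problems found; an empty list means the file is valid."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems: List[str] = []
    seen = set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        sample, label = row["sample_id"], row["predicted_class"]
        if sample in seen:
            problems.append(f"line {line_no}: duplicate sample {sample}")
        seen.add(sample)
        if label not in ALLOWED_CLASSES:
            problems.append(f"line {line_no}: unknown class {label!r}")
    # Every sample in the test set must receive exactly one prediction.
    missing = sorted(set(expected_ids) - seen)
    if missing:
        problems.append(f"missing samples: {missing}")
    return problems
```

With a fixed template, every valid submission can be passed to the same scoring function without manual inspection.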
10. PARTICIPANTS & MOTIVATION
Because we are addressing a
small population pool, the
communication strategy and
channels are the most
important components to
ensure the success of a given
challenge.
Data Source: World Development Indicators (data.worldbank.org)
Percentage of R&D Scientists in the Population
• Young researchers and professionals with a science background,
• Between 25 and 40 years old,
• Already working with machine learning, software, and data,
• From academia: Post-Doc, PhD,
• From companies: Data Science, Machine Learning, Bioinformatics.
12. LET’S TALK ABOUT THE BENEFITS
For participants:
• Peer recognition and self-esteem
• Co-author publications
• Be part of a pioneering community, working together to improve how scientific research is verified
• Monetary incentives
• Access a network of experts
• Continuous learning
• Contribute and work towards consensus in the scientific community
For the company:
• Scientific publications
• Complement peer review
• Drive innovation in science with crowdsourced solutions
• Scientific outreach
13. PUBLICATIONS (16/6)
• Belcastro, V. et al. The sbv IMPROVER Systems Toxicology computational challenge: Identification of human and species-independent blood response markers as predictors of smoking exposure and cessation status. Computational Toxicology (2017).
• Poussin, C. et al. Crowd-Sourced Verification of Computational Methods and Data in Systems Toxicology: A Case Study with a Heat-Not-Burn Candidate Modified Risk Tobacco Product. Chemical Research in Toxicology 30, 934-945 (2017).
• sbv IMPROVER project team et al. Community-Reviewed Biological Network Models for Toxicology and Drug Discovery Applications. Gene Regulation and Systems Biology 10, 51-66 (2016).
• sbv IMPROVER project team et al. Reputation-based collaborative network biology. Pacific Symposium on Biocomputing, 270-281 (2015).
• sbv IMPROVER project team et al. Enhancement of COPD biological networks using a web-based collaboration interface. F1000Research 4, 32 (2015).
• Rhrissorrakrai, K. et al. Understanding the limits of animal models as predictors of human biology: lessons learned from the sbv IMPROVER Species Translation Challenge. Bioinformatics 31, 471-483 (2015).
• Hoeng, J., Peitsch, M. C., Meyer, P. & Jurisica, I. Where are we at regarding species translation? A review of the sbv IMPROVER challenge. Bioinformatics 31, 451-452 (2015).
• Boue, S. et al. Enhancement of COPD biological networks using a web-based collaboration interface. F1000Research 4 (2015).
• Binder, J. et al. Reputation-based collaborative network biology. Pacific Symposium on Biocomputing, 270-281 (2015).
• Bilal, E. et al. A crowd-sourcing approach for the construction of species-specific cell signaling networks. Bioinformatics 31, 484-491 (2015).
• Poussin, C. et al. The species translation challenge-a systems biology perspective on human and rat bronchial epithelial cells. Scientific Data 1, 140009 (2014).
• Tarca, A. L. et al. Strengths and limitations of microarray-based phenotype prediction: lessons learned from the IMPROVER Diagnostic Signature Challenge. Bioinformatics 29, 2892-2899 (2013).
• Ansari, S. et al. On crowd-verification of biological networks. Bioinformatics and Biology Insights 7 (2013).
• sbv IMPROVER project team et al. On Crowd-verification of Biological Networks. Bioinformatics and Biology Insights 7, 307-325 (2013).
• Meyer, P. et al. Industrial methodology for process verification in research (IMPROVER): toward systems biology verification. Bioinformatics 28, 1193-1201 (2012).
• Meyer, P. et al. Verification of systems biology research in the age of collaborative competition. Nature Biotechnology 29, 811-815 (2011).
14. LOOKING FOR A NEW DIAGNOSTIC TOOL
URL:
https://www.sbvimprover.com/challenge-5
E-mail:
Sbvimprover.RD@pmi.com
15. WHAT IS IT ABOUT?
This challenge aims at finding the best classification algorithm that can be used for diagnosing
Inflammatory Bowel Disease with data obtained from non-invasive clinical samples.
[Figure: The global prevalence of IBD in 2015. Source: Kaplan, G. G. The global burden of IBD: from 2015 to 2025, Nat. Rev. Gastroenterol. Hepatol. (2015)]
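To make "best classification algorithm" concrete, a two-class diagnostic submission can be scored against held-out labels. The slides do not state which metric the challenge uses; the sketch below assumes sensitivity, specificity, and the Matthews correlation coefficient (MCC), and the class labels and the `score_predictions` helper are hypothetical.

```python
import math
from typing import Dict, Sequence


def score_predictions(truth: Sequence[str], predicted: Sequence[str],
                      positive: str = "IBD") -> Dict[str, float]:
    """Score binary predictions against the true labels.

    Assumption: any label other than `positive` counts as the negative class.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    # Confusion-matrix counts.
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
```

MCC is chosen here because it stays informative when the two classes are unbalanced, which is common in clinical data sets; any other agreed metric could be substituted.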
16. THE STRUCTURE OF THE CHALLENGE
We wanted to capture solutions from a wider audience, so we split the challenge into two sub-challenges.
The two sub-challenges address different crowds: Sub-challenge 1 addresses the Bioinformatics crowd, while Sub-challenge 2 addresses a Data Science or Machine Learning crowd.
[Figure: Schematic representation of the challenge and its two sub-challenges.]
17. CONCLUSIONS
• It is possible for a company to organise and maintain its own crowdsourcing platform, one that aims at verification in an industrial setting.
• For an R&D centre, crowdsourcing can mean an increase in the scientific output.
• Scientific transparency follows from crowdsourcing that aims at verification, because it adds an extra layer of review.
• The communication effort per challenge is significant, but it engages a challenge-specific crowd.
• Challenge-specific crowds are easier to communicate with and can bring significantly more pertinent solutions. They drive innovation faster.
The sbv IMPROVER project, the websites, and the Symposia are part of a collaborative project designed to enable scientists to learn about and contribute to the development of a new crowdsourcing method for verification of scientific data and results. The project is led and funded by Philip Morris International. For more information on the focus of Philip Morris International’s research, please visit www.pmiscience.com.