www.sbvimprover.com
Crowdsourcing à la
sbv IMPROVER
The challenge of being your own client.
Dr. Adrian Stan
Senior Scientist
PMI R&D, Philip Morris Products S.A.,
Quai Jeanrenaud 5, CH-2000,
Neuchatel, Switzerland
CrowdSourcing Week Global, September 12th, 2019, San Francisco, USA
WHAT TYPE OF CROWDSOURCING IS SBV?
Data Science
Biology
Medicine
Found at the intersection of these disciplines, sbv IMPROVER
aims to provide a measure of quality control of industrial research
and development by verifying the methods being used.
The sbv IMPROVER project is a collaborative effort led and funded
by PMI Research and Development.
SOME HISTORY…
Timeline of challenges, 2012 to 2019:
• Diagnostic Signature Challenge
• Network Verification Challenge I
• Species Translation Challenge
• Network Verification Challenge II
• Systems Toxicology Challenge I
• Singapore Datathon
• Israel Epigenomics Challenge
• Japan Biological Interpretation Datathon
• Microbiomics Challenge
• Systems Toxicology Challenge II
• Metagenomics Diagnosis for IBD Challenge
HOW DOES IT WORK FOR THE COMPANY?
Project pipeline (shown built up over four slides): Formulation → Delivery 1 → Delivery 2 → Delivery 3 → Delivery 4, with Benchmark/Validate steps along the way.
A Benchmark/Validate step is a step in the pipeline of the project that needs validation or benchmarking; for example, a step that requires the use of a machine learning method.
THE FIVE STAGES OF A CHALLENGE
• Prepare: Define a scientific question, gather data, formulate the problem, and prepare the website.
• Launch: Launch the website and advertise to the community; use social media, advertisement companies, etc.
• Run: Organise webinars, answer questions from participants, and maintain the website. Keep advertising.
• Rank: Gather the solutions, score them, and rank the participants.
• Share: Publish the scientific outcome of the challenge and promote it at conferences.
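To make the Rank stage concrete, here is a minimal sketch of how submitted predictions could be scored against gold-standard labels and turned into a leaderboard. The file layout, the accuracy metric, and all names are illustrative assumptions, not the actual sbv IMPROVER scoring pipeline.

# Minimal sketch of the Rank stage (illustrative assumptions only):
# each submission is a CSV of "sample_id,label" rows, scored against a
# gold-standard CSV with the same layout, and teams are ranked by accuracy.
import csv

def load_labels(path):
    with open(path, newline="") as f:
        return {row[0]: row[1] for row in csv.reader(f)}

def score_submission(prediction_path, gold):
    predictions = load_labels(prediction_path)
    correct = sum(1 for sample_id, label in gold.items()
                  if predictions.get(sample_id) == label)
    return correct / len(gold)  # fraction of gold samples predicted correctly

def rank_participants(submissions, gold_path):
    gold = load_labels(gold_path)
    scores = {team: score_submission(path, gold) for team, path in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical file names for two teams and the organisers' gold standard.
    leaderboard = rank_participants(
        {"team_a": "team_a_predictions.csv", "team_b": "team_b_predictions.csv"},
        "gold_standard.csv",
    )
    for position, (team, score) in enumerate(leaderboard, start=1):
        print(f"{position}. {team}: {score:.3f}")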
THE FIVE STAGES OF A CHALLENGE
TIME CONSUMING
Of the five stages (Prepare, Launch, Run, Rank, Share), Prepare and Rank are the most time-consuming.
• Over the years, we have learned to define the scope
of challenges such that the scoring part has become
more efficient. In general, broader questions lead to
a very broad range of answers.
• Because of the sbv IMPROVER platform's emphasis on verification, we can define precise questions and even offer template files for results (a hypothetical example of such a template check follows the table below).
Question scope versus effort per stage:
           Narrow question   Broad question
Prepare    Slow              Fast
Rank       Fast              Slow
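The second bullet above mentions offering template files for results. As an illustration, a hypothetical template might fix the column names and allowed class labels of a submission so that every entry can be checked automatically before scoring; the header, labels, and file name below are assumptions, not the actual sbv IMPROVER templates.

# Hypothetical results template and checker (illustrative assumptions only):
# submissions are CSV files with a fixed header and a restricted label set.
import csv

EXPECTED_HEADER = ["sample_id", "predicted_class", "confidence"]
ALLOWED_CLASSES = {"case", "control"}   # placeholder class names

def validate_submission(path):
    errors = []
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header != EXPECTED_HEADER:
            errors.append(f"header must be {EXPECTED_HEADER}, got {header}")
        for line_number, row in enumerate(reader, start=2):
            if len(row) != len(EXPECTED_HEADER):
                errors.append(f"line {line_number}: expected {len(EXPECTED_HEADER)} columns")
                continue
            _, predicted_class, confidence = row
            if predicted_class not in ALLOWED_CLASSES:
                errors.append(f"line {line_number}: unknown class '{predicted_class}'")
            try:
                if not 0.0 <= float(confidence) <= 1.0:
                    errors.append(f"line {line_number}: confidence must be in [0, 1]")
            except ValueError:
                errors.append(f"line {line_number}: confidence is not a number")
    return errors

if __name__ == "__main__":
    for problem in validate_submission("submission.csv"):
        print(problem)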
PARTICIPANTS & MOTIVATION
Because we are addressing a
small population pool, the
communication strategy and
channels are the most
important components to
ensure the success of a given
challenge.
Figure: Percentage of R&D Scientists in the Population. Data source: World Development Indicators (data.worldbank.org).
• Young researchers and professionals with a science background,
• Between 25 and 40 years old,
• Already working with machine learning, software, and data,
• From academia: Post-Docs, PhDs,
• From companies: Data Science, Machine Learning, Bioinformatics.
ADVERTISING
Channels: previous participants, direct outreach, internal scientific engagement, blogs, conferences, job fairs, ambassadors, press releases, social media posts, and partner companies.
LET’S TALK ABOUT THE BENEFITS
For participants:
• Peer recognition and self-esteem
• Co-author publications
• Be part of a pioneering community, working together to improve how scientific research is verified
• Monetary incentives
• Access a network of experts
• Continuous learning
• Contribute and work towards consensus in the scientific community

For the company:
• Scientific publications
• Complement peer review
• Drive innovation in science with crowdsourced solutions
• Scientific outreach
PUBLICATIONS (16/6)
• Belcastro, V. et al. The sbv IMPROVER Systems Toxicology computational challenge: Identification of human and species-independent blood response markers as predictors of smoking exposure and cessation status. Computational Toxicology, (2017).
• Poussin, C. et al. Crowd-Sourced Verification of Computational Methods and Data in Systems Toxicology: A Case Study with a Heat-Not-Burn
Candidate Modified Risk Tobacco Product. Chemical research in toxicology 30, 934-945, (2017).
• sbv IMPROVER project team et al. Community-Reviewed Biological Network Models for Toxicology and Drug Discovery Applications. Gene regulation
and systems biology 10, 51-66, (2016).
• sbv IMPROVER project team et al. Reputation-based collaborative network biology. Pacific Symposium on Biocomputing. Pacific Symposium on
Biocomputing, 270-281 (2015).
• sbv IMPROVER project team et al. Enhancement of COPD biological networks using a web-based collaboration interface. F1000Res. 4, 32, (2015).
• Rhrissorrakrai, K. et al. Understanding the limits of animal models as predictors of human biology: lessons learned from the sbv IMPROVER Species
Translation Challenge. Bioinformatics 31, 471-483, (2015).
• Hoeng, J., Peitsch, M. C., Meyer, P. & Jurisica, I. Where are we at regarding species translation? A review of the sbv IMPROVER challenge.
Bioinformatics 31, 451-452, (2015).
• Boue, S. et al. Enhancement of COPD biological networks using a web-based collaboration interface. F1000Research 4 (2015).
• Binder, J. et al. Reputation-based collaborative network biology. Pacific Symposium on Biocomputing, 270-281 (2015).
• Bilal, E. et al. A crowd-sourcing approach for the construction of species-specific cell signaling networks. Bioinformatics 31, 484-491, (2015).
• Poussin, C. et al. The species translation challenge-a systems biology perspective on human and rat bronchial epithelial cells. Scientific data 1, 140009, (2014).
• Tarca, A. L. et al. Strengths and limitations of microarray-based phenotype prediction: lessons learned from the IMPROVER Diagnostic Signature
Challenge. Bioinformatics 29, 2892-2899, (2013).
• Ansari, S. et al. On crowd-verification of biological networks. Bioinformatics and biology insights 7 (2013).
• sbv IMPROVER project team et al. On Crowd-verification of Biological Networks. Bioinformatics and biology insights 7, 307-325, (2013).
• Meyer, P. et al. Industrial methodology for process verification in research (IMPROVER): toward systems biology verification. Bioinformatics 28,
1193-1201, (2012).
• Meyer, P. et al. Verification of systems biology research in the age of collaborative competition. Nature biotechnology 29, 811-815, (2011).
LOOKING FOR A NEW DIAGNOSTIC TOOL
URL: https://www.sbvimprover.com/challenge-5
E-mail: Sbvimprover.RD@pmi.com
WHAT IS IT ABOUT?
This challenge aims to find the best classification algorithm for diagnosing Inflammatory Bowel Disease (IBD) with data obtained from non-invasive clinical samples.
Figure: The global prevalence of IBD in 2015. Source: Kaplan, G. G. The global burden of IBD: from 2015 to 2025. Nat. Rev. Gastroenterol. Hepatol. (2015).
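For the machine-learning side of such a challenge, the sketch below shows the kind of baseline a participant might start from: a simple classifier evaluated by cross-validated AUC on a samples-by-features abundance table. The synthetic data, feature count, and model choice are assumptions made for illustration and are not part of the challenge specification.

# Baseline sketch (illustrative assumptions only): logistic regression with
# cross-validated AUC on a synthetic stand-in for abundance data derived
# from non-invasive samples.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_samples, n_features = 200, 50                     # e.g. taxa per sample
X = rng.lognormal(size=(n_samples, n_features))     # synthetic abundance table
y = rng.integers(0, 2, size=n_samples)              # 1 = IBD, 0 = non-IBD (synthetic)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
auc_scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"Cross-validated AUC: {auc_scores.mean():.2f} +/- {auc_scores.std():.2f}")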
THE STRUCTURE OF THE CHALLENGE
Schematic representation of the challenge and its two sub-challenges.
The two sub-challenges address different crowds: Sub-challenge 1 addresses the Bioinformatics
crowd, while Sub-challenge 2 addresses a Data Science or Machine Learning crowd.
We wanted to capture solutions from a wider audience, so we split the challenge into two sub-challenges.
CONCLUSIONS
• It is possible for a company to organise and maintain its own crowdsourcing platform, one that aims at verification in an industrial setting.
• For an R&D centre, crowdsourcing can mean an increase in scientific output.
• Scientific transparency is a consequence of verification-oriented crowdsourcing, because it adds an extra layer of review.
• The communication effort per challenge is significant, but it engages a challenge-specific crowd.
• Challenge-specific crowds are easier to communicate with and can bring significantly more pertinent solutions; they drive innovation faster.
The sbv IMPROVER project, the websites, and the Symposia are part of a collaborative project designed to enable scientists to learn about and contribute to the development of a new crowdsourcing method for verification of scientific data and results. The project is led and funded by Philip Morris International. For more information on the focus of Philip Morris International’s research, please visit www.pmiscience.com.
Thank you!
