Center for Data Science
Paris-Saclay
BALÁZS KÉGL, DR / CNRS, LAL & LRI, CNRS & University Paris-Sud
CÉCILE GERMAIN, Pr / UPSud, LRI
ALEXANDRE GRAMFORT, MdC / Telecom ParisTech, LTCI
AKIN KAZAKÇI, MdC / Mines ParisTech, CGS
ARNAK DALALYAN, Pr / ENSAE, Laboratoire de Statistique
Meta
I will not talk about science
I will talk about management (of) (data) science
WHERE DOES IT COME FROM?
• My eight years of experience interfacing between high-energy physics and data science
• Our one-year experience of running PSCDS
• Extensive discussions with management scientists and the MI/Mastodons/MaDICS for the last year
UNIVERSITÉ PARIS-SACLAY
19 founding partners
UNIVERSITÉ PARIS-SACLAY
+ horizontal multi-disciplinary and multi-partner initiatives (“lidex”) to create cohesion
250 researchers in 35 laboratories
Biology & bioinformatics: IBISC/UEvry, LRI/UPSud, Hepatinov, CESP/UPSud-UVSQ-Inserm, IGM-I2BC/UPSud, MIA/Agro, MIAj-MIG/INRA, LMAS/Centrale
Chemistry: EA4041/UPSud
Earth sciences: LATMOS/UVSQ, GEOPS/UPSud, IPSL/UVSQ, LSCE/UVSQ, LMD/Polytechnique
Economy: LM/ENSAE, RITM/UPSud, LFA/ENSAE
Neuroscience: UNICOG/Inserm, U1000/Inserm, NeuroSpin/CEA
Particle physics, astrophysics & cosmology: LPP/Polytechnique, DMPH/ONERA, CosmoStat/CEA, IAS/UPSud, AIM/CEA, LAL/UPSud
Machine learning: LRI/UPSud, LTCI/Telecom, CMLA/Cachan, LS/ENSAE, LIX/Polytechnique, MIA/Agro, CMA/Polytechnique, LSS/Supélec, CVN/Centrale, LMAS/Centrale, DTIM/ONERA, IBISC/UEvry
Visualization: INRIA, LIMSI
Signal processing: LTCI/Telecom, CMA/Polytechnique, CVN/Centrale, LSS/Supélec, CMLA/Cachan, LIMSI, DTIM/ONERA
Statistics: LMO/UPSud, LS/ENSAE, LSS/Supélec, CMA/Polytechnique, LMAS/Centrale, MIA/AgroParisTech
machine learning, information retrieval, signal processing, data visualization, databases
Domain science: human society, life, brain, earth, universe
Tool building: software engineering, clouds/grids, high-performance computing, optimization
Domain scientist, Software engineer
datascience-paris-saclay.fr
Center for Data Science
Paris-Saclay
A multi-disciplinary initiative to define, structure, and manage
the data science ecosystem at the Université Paris-Saclay
http://www.datascience-paris-saclay.fr/
DATA SCIENCE
Design of automated methods to analyze massive and complex data to extract useful information
CENTER FOR DATA SCIENCE ≠ DATA CENTER
We are focusing on inference: data → knowledge
Interfacing with HPC, cloud, storage, production
PARAMETERS
• 2 years: April 2014 - June 2016, 1.2M€
• +1 year, conditional on evaluation
• Light management
  • executive committee of 17 members
  • work groups
    • management, strategy (around objectives)
    • thematic (around scientific themes)
    • open to everyone to propose and to participate
THE DATA SCIENCE LANDSCAPE
Domain science: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
Tool building: software engineering, clouds/grids, high-performance computing, optimization
Data scientist, Data trainer, Applied scientist, Domain scientist, Software engineer, Data engineer
CHALLENGES
• Manpower
  • especially at the interfaces
  • industrial brain-drain
• Incentives
  • data scientists are not incentivized to work on domain science
  • scientists are not incentivized to work on tools
• Access
  • no well-developed channels to identify the right experts for a given problem
• Tools
  • few tools that can help domain scientists and data scientists to collaborate efficiently
TOOLS
We are designing and learning to manage tools to accompany data science projects with different needs
TOOLS: LANDSCAPE TO ECOSYSTEM
Data scientist, Data trainer, Applied scientist, Domain expert, Software engineer, Data engineer
Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
Tool building: software engineering, clouds/grids, high-performance computing, optimization
Data domains: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
• interdisciplinary projects
• matchmaking tool
• design and innovation strategy workshops
• data challenges
• coding sprints
• Open Software Initiative
• code consolidator and engineering projects
• data science bootcamps
• IT platform for linked data
• annotation tool
• SaaS data science platform
POSTDOCS, THESES, SABBATICALS
• Common selection criteria	

• scientific quality	

• expected results both in domain science and data science	

• relevance and feasibility	

• (real) scientific data, available at the start of the project	

• interdisciplinarity (PIs both from domain and data sciences, different LABEXs)	

• community building	

• organizing and participating in thematic days, bootcamps, workshops
ENGINEERING AND CODE CONSOLIDATING PROJECTS
• No research
• implementation, maintenance, integration
• 1-year engineering projects
• 3-6 month code consolidating projects
• a postdoc or PhD student sets research aside for the duration of the project and implements his/her research code as professional software, or integrates it into a toolbox
IT PLATFORM FOR LINKED DATA
http://io.datascience-paris-saclay.fr/
• A window to open data at Paris-Saclay
• We are not storing or handling existing large data sets
• Rather indexing, linking, and mapping, embedding in the worldwide linked data (RDF) ecosystem
• Storing small data sets of small teams is possible
• Subsets of large sets for prototyping
• Or simply store metadata plus pointer
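The "metadata plus pointer" idea for the linked data platform can be illustrated with a minimal sketch. It is not the platform's actual schema: the record fields, the example.org URIs, and the choice of the Dublin Core and DCAT vocabulary terms are assumptions made here for illustration. The record is serialized as RDF N-Triples, so the data itself stays where it is and only a description and a link are indexed.

```python
from dataclasses import dataclass

# Well-known vocabulary terms (RDF, DCAT, Dublin Core); using them here is
# an illustrative choice, not something stated in the slides.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT_DATASET = "http://www.w3.org/ns/dcat#Dataset"
DCT_TITLE = "http://purl.org/dc/terms/title"
DCAT_LANDING_PAGE = "http://www.w3.org/ns/dcat#landingPage"


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical index entry: metadata plus a pointer, no data."""
    uri: str      # identifier of the data set in the index
    title: str    # human-readable metadata
    pointer: str  # where the data actually lives


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(record: DatasetRecord) -> list[str]:
    """Serialize the record as three N-Triples statements."""
    return [
        f"<{record.uri}> <{RDF_TYPE}> <{DCAT_DATASET}> .",
        f'<{record.uri}> <{DCT_TITLE}> "{escape_literal(record.title)}" .',
        f"<{record.uri}> <{DCAT_LANDING_PAGE}> <{record.pointer}> .",
    ]


if __name__ == "__main__":
    # Hypothetical data set hosted elsewhere; only its description is indexed.
    demo = DatasetRecord(
        uri="http://example.org/dataset/demo",
        title="Prototype subset of a large data set",
        pointer="http://example.org/files/demo",
    )
    print("\n".join(to_ntriples(demo)))
```

Because each statement is a plain subject-predicate-object triple, records produced this way can be merged with other linked data sources without a shared database.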
BOOTCAMPS
• Single-day coding sessions	

• 20-30 participants	

• Goals	

• training PhD students, postdocs, engineers, senior researchers in hands-on data science (problem types, tools)
• solving (prototyping) real data science problems
• networking, getting to know each other
DATA CHALLENGES
• A data challenge is a recently developed unconventional dissemination and communication tool
  • a scientific or industrial data producer arrives with a well-defined problem and a corresponding annotated data set
  • defines a quantitative goal
  • makes the problem and part of the data set (the training set) public on a dedicated site
  • data science experts then take the public training data and submit solutions for a test set with hidden annotations
  • submissions are evaluated numerically using the quantitative measure
  • contestants are listed on a leaderboard
  • after a predefined time, typically a couple of months, the final results are revealed and the winners are awarded
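The evaluation loop of a data challenge can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scoring code of any real platform: plain accuracy stands in for the quantitative measure, and the team names and labels are made up.

```python
def accuracy(predictions, hidden_labels):
    """Fraction of test items whose prediction matches the hidden annotation."""
    if len(predictions) != len(hidden_labels):
        raise ValueError("a submission must cover the whole test set")
    correct = sum(p == y for p, y in zip(predictions, hidden_labels))
    return correct / len(hidden_labels)


def leaderboard(submissions, hidden_labels):
    """Score every submission and rank teams, best score first.

    `submissions` maps a team name to its list of predictions.
    Ties are broken alphabetically by team name.
    """
    scores = {
        team: accuracy(predictions, hidden_labels)
        for team, predictions in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Only the organizer sees the annotations of the test set.
    hidden = [1, 0, 1, 1]
    # Hypothetical submissions from three teams.
    received = {
        "team_a": [1, 0, 0, 1],
        "team_b": [1, 0, 1, 1],
        "team_c": [0, 1, 0, 0],
    }
    for rank, (team, score) in enumerate(leaderboard(received, hidden), start=1):
        print(rank, team, score)
```

Contestants only ever see their score, never the hidden labels, which is what keeps the comparison between submissions fair.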
DATA CHALLENGES
• The HiggsML challenge on Kaggle
  • https://www.kaggle.com/c/higgs-boson
  • 1785 teams, huge publicity
  • significant improvement on baseline
  • 18-month preparation, yet partially missing the target
DATA CHALLENGES
• Challenges are useful for
  • generating visibility in the data science community about novel application domains
  • benchmarking state-of-the-art techniques in a fair way on well-defined problems
  • finding talented data scientists
• Limitations
  • not necessarily adapted to solving complex and open-ended data science problems in realistic environments
  • emphasizes competition
DESIGN AND INNOVATION STRATEGY WORKSHOPS
• Putting domain scientists, data scientists, and management scientists in the same room
• Getting them to understand each other
• Keeping them collectively creative
• The goal: identifying and defining projects
  • low-hanging fruits
  • breakthrough projects
  • long-term vision
DESIGN AND INNOVATION STRATEGY WORKSHOPS
C/K design theory
innovative design = interaction and joint expansion of concepts and knowledge
DESIGN AND INNOVATION STRATEGY WORKSHOPS
DKCP process: linearizing C-K dynamics
Initialisation, [K] Knowledge sharing workshops, [C] IFM-Design workshops, [P] Project building, [RUN]
TAKE HOME MESSAGE NO1
If you are interested in adapting any of these tools to your project/site, feel free to contact us; we would be happy to share our experience
WHAT CNRS CAN DO
• We need data engineers and trainers to support research
  • ideally: 75% research scientists, 25% research engineers
  • We are lucky in France that the position exists in public research
• Tasks
  • integrating research code into general-purpose professional software (e.g., scikit-learn, Torch)
  • providing an interface between computational infrastructure (e.g., clouds) and scientists
  • training scientists to use the tools
WHAT CNRS CAN DO
Incentives
Most data scientists, like other scientists, are trained and incentivized to do research in highly specialized domains. They seek scientific visibility in their international community, which is equally highly specialized, because their career advancement is almost entirely based on peer-reviewed publications. Even when they have the expertise, they have little incentive to venture into the tool builder (data engineer) role, since software authorship has little value in their evaluation and can only serve them implicitly through the visibility they gain in the community of tool users. By the same token, they have little incentive to venture into domain sciences and to tackle economic or societal challenges. It is possible that a domain science or industrial project requires new techniques which can then be published in data science venues, but this is not guaranteed at all. It usually takes a heavy investment of time and effort to understand domain problems, so excursions into domain sciences are highly risky. Even when such collaborations are established, the data scientist has a strong prior to use his/her favorite methodology, which is not necessarily the best solution for a given problem. Finally, the data scientist has little incentive to bring the project to full fruition, and he/she often “runs away” with an abstract data science problem (and solution) extracted from the project.

Symmetrically, domain scientists and industrial domain experts have no incentive to advance data science and to develop and publish new techniques, as long as their data science problems get solved. When they venture into tool development, they have little incentive to develop general-purpose tools.
TAKE HOME MESSAGE NO2
Affirmative action
• Overweight (e.g., double count) out-of-domain papers in every evaluation
• Make software tool ownership count
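As a purely illustrative sketch of the arithmetic behind this message, an evaluation score could weight the three kinds of output separately. The function, its parameter names, and the tool weight are hypothetical; only the "double count" factor of 2 for out-of-domain papers comes from the slide.

```python
def evaluation_score(in_domain_papers, out_of_domain_papers, maintained_tools,
                     out_of_domain_weight=2.0, tool_weight=1.0):
    """Weighted output count for a researcher's evaluation.

    out_of_domain_weight=2.0 mirrors the "double count" suggestion;
    tool_weight=1.0 is an arbitrary placeholder, since the slide only says
    that software tool ownership should count.
    """
    return (in_domain_papers
            + out_of_domain_weight * out_of_domain_papers
            + tool_weight * maintained_tools)


if __name__ == "__main__":
    # Hypothetical researcher: 10 in-domain papers, 3 out-of-domain papers,
    # 2 maintained software tools.
    print(evaluation_score(10, 3, 2))
```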
TAKE HOME MESSAGE NO3
Co-locality is important.
It would be desirable that data science for scientific
data in France be grouped into 5-6 sites of similar size.
Resources and experience can and should be shared.
THANK YOU!

DATA COMMS-NETWORKS YR2 lecture 08 NAT & CLOUD.docx
 
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
06-12-2024-BudapestDataForum-BuildingReal-timePipelineswithFLaNK AIM
 
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
一比一原版多伦多大学毕业证(UofT毕业证书)学历如何办理
 
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
Discovering Digital Process Twins for What-if Analysis: a Process Mining Appr...
 
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
一比一原版英国赫特福德大学毕业证(hertfordshire毕业证书)如何办理
 
一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理一比一原版悉尼大学毕业证如何办理
一比一原版悉尼大学毕业证如何办理
 
DSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelinesDSSML24_tspann_CodelessGenerativeAIPipelines
DSSML24_tspann_CodelessGenerativeAIPipelines
 

The Paris-Saclay Center for Data Science

  • 1. Center for Data Science Paris-Saclay (CNRS & University Paris-Sud)
      • Balázs Kégl, DR / CNRS, LAL & LRI
      • Cécile Germain, Pr / UPSud, LRI
      • Alexandre Gramfort, MdC / Telecom ParisTech, LTCI
      • Akin Kazakçi, MdC / Mines ParisTech, CGS
      • Arnak Dalalyan, Pr / ENSAE, Laboratoire de Statistique
  • 3. I will not talk about science
  • 4. I will talk about management (of) (data) science
  • 5. WHERE DOES IT COME FROM?
      • My eight years of experience interfacing between high-energy physics and data science
      • Our one year of experience running PSCDS
      • Extensive discussions with management scientists and the MI/Mastodons/MaDICS over the last year
  • 6. UNIVERSITÉ PARIS-SACLAY: 19 founding partners
  • 7. UNIVERSITÉ PARIS-SACLAY: + horizontal multi-disciplinary and multi-partner initiatives (“lidex”) to create cohesion
  • 8. 250 researchers in 35 laboratories
      • Biology & bioinformatics: IBISC/UEvry, LRI/UPSud, Hepatinov, CESP/UPSud-UVSQ-Inserm, IGM-I2BC/UPSud, MIA/Agro, MIAj-MIG/INRA, LMAS/Centrale
      • Chemistry: EA4041/UPSud
      • Earth sciences: LATMOS/UVSQ, GEOPS/UPSud, IPSL/UVSQ, LSCE/UVSQ, LMD/Polytechnique
      • Economy: LM/ENSAE, RITM/UPSud, LFA/ENSAE
      • Neuroscience: UNICOG/Inserm, U1000/Inserm, NeuroSpin/CEA
      • Particle physics, astrophysics & cosmology: LPP/Polytechnique, DMPH/ONERA, CosmoStat/CEA, IAS/UPSud, AIM/CEA, LAL/UPSud
      • Machine learning: LRI/UPSud, LTCI/Telecom, CMLA/Cachan, LS/ENSAE, LIX/Polytechnique, MIA/Agro, CMA/Polytechnique, LSS/Supélec, CVN/Centrale, LMAS/Centrale, DTIM/ONERA, IBISC/UEvry
      • Visualization: INRIA, LIMSI
      • Signal processing: LTCI/Telecom, CMA/Polytechnique, CVN/Centrale, LSS/Supélec, CMLA/Cachan, LIMSI, DTIM/ONERA
      • Statistics: LMO/UPSud, LS/ENSAE, LSS/Supélec, CMA/Polytechnique, LMAS/Centrale, MIA/AgroParisTech
      • Diagram labels: machine learning, information retrieval, signal processing, data visualization, databases
      • Domain science: human society, life, brain, earth, universe
      • Tool building: software engineering, clouds/grids, high-performance computing, optimization
      • Roles: Domain scientist, Software engineer
      • A multi-disciplinary initiative to define, structure, and manage the data science ecosystem at the Université Paris-Saclay
      • http://www.datascience-paris-saclay.fr/
  • 9. DATA SCIENCE: Design of automated methods to analyze massive and complex data to extract useful information
  • 10. CENTER FOR DATA SCIENCE ≠ DATA CENTER
      • We are focusing on inference: data → knowledge
      • Interfacing with HPC, cloud, storage, production
  • 11. PARAMETERS
      • 2 years: April 2014 - June 2016, 1.2M€
      • +1 year, conditional on evaluation
      • Light management
          • executive committee of 17 members
          • work groups
              • management, strategy (around objectives)
              • thematic (around scientific themes)
              • open to everyone to propose and to participate
  • 12. THE DATA SCIENCE LANDSCAPE
      • Domain science: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
      • Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
      • Tool building: software engineering, clouds/grids, high-performance computing, optimization
      • Roles: Data scientist, Data trainer, Applied scientist, Domain scientist, Software engineer, Data engineer
  • 13. CHALLENGES
      • Manpower
          • especially at the interfaces
          • industrial brain-drain
      • Incentives
          • data scientists are not incentivized to work on domain science
          • scientists are not incentivized to work on tools
      • Access
          • no well-developed channels to identify the right experts for a given problem
      • Tools
          • few tools that can help domain scientists and data scientists to collaborate efficiently
  • 14. TOOLS: We are designing and learning to manage tools to accompany data science projects with different needs
  • 15. TOOLS: LANDSCAPE TO ECOSYSTEM
      • Roles: Data scientist, Data trainer, Applied scientist, Domain expert, Software engineer, Data engineer
      • Data science: statistics, machine learning, information retrieval, signal processing, data visualization, databases
      • Tool building: software engineering, clouds/grids, high-performance computing, optimization
      • Data domains: energy and physical sciences, health and life sciences, Earth and environment, economy and society, brain
      • Tools
          • interdisciplinary projects
          • matchmaking tool
          • design and innovation strategy workshops
          • data challenges
          • coding sprints
          • Open Software Initiative
          • code consolidator and engineering projects
          • data science bootcamps
          • IT platform for linked data
          • annotation tool
          • SaaS data science platform
  • 16. POSTDOCS, THESES, SABBATICALS
      • Common selection criteria
          • scientific quality
          • expected results both in domain science and data science
          • relevance and feasibility
          • (real) scientific data, available at the start of the project
          • interdisciplinarity (PIs both from domain and data sciences, different LABEXs)
          • community building
              • organizing and participating in thematic days, bootcamps, workshops
  • 17. ENGINEERING AND CODE CONSOLIDATING PROJECTS
      • No research
          • implementation, maintenance, integration
      • 1-year engineering projects
      • 3-6 month code consolidating projects
          • a postdoc or PhD student sets research aside for the duration of the project and either implements their research code as professional software or integrates it into a toolbox
  • 18. IT PLATFORM FOR LINKED DATA
      • A window to open data at Paris-Saclay
      • We are not storing or handling existing large data sets
      • Rather indexing, linking, and mapping, embedding in the worldwide linked data (RDF) ecosystem
      • Storing small data sets of small teams is possible
      • Subsets of large sets for prototyping
      • Or simply store metadata plus pointer
      • http://io.datascience-paris-saclay.fr/
  • 19. IT PLATFORM FOR LINKED DATA
  • 20. BOOTCAMPS
      • Single-day coding sessions
      • 20-30 participants
      • Goals
          • training PhD students, postdocs, engineers, and senior researchers in hands-on data science (problem types, tools)
          • solving (prototyping) real data science problems
          • networking, getting to know each other
  • 22. BOOTCAMPS
  • 25. DATA CHALLENGES
      • A data challenge is a recently developed unconventional dissemination and communication tool
          • a scientific or industrial data producer arrives with a well-defined problem and a corresponding annotated data set
          • defines a quantitative goal
          • makes the problem and part of the data set (the training set) public on a dedicated site
          • data science experts then take the public training data and submit solutions for a test set with hidden annotations
          • submissions are evaluated numerically using the quantitative measure
          • contestants are listed on a leaderboard
          • after a predefined time, typically a couple of months, the final results are revealed and the winners are awarded
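The challenge workflow on slide 25 can be illustrated with a minimal sketch. It is not the scoring code of any real challenge platform: the function names (`score_submission`, `build_leaderboard`) are hypothetical, and plain classification accuracy stands in for whatever quantitative measure the data producer actually defines.

```python
def score_submission(predictions, hidden_labels):
    """Evaluate one submission against the hidden test annotations.

    Accuracy is used here only as a stand-in for the challenge's own
    quantitative measure (an assumption made for illustration).
    """
    if len(predictions) != len(hidden_labels):
        raise ValueError("submission must cover every test example")
    correct = sum(p == y for p, y in zip(predictions, hidden_labels))
    return correct / len(hidden_labels)


def build_leaderboard(submissions, hidden_labels):
    """Rank teams by score, best first.

    `submissions` maps a team name to its list of test-set predictions.
    Contestants never see `hidden_labels`; only the organizer does.
    """
    scores = {
        team: score_submission(preds, hidden_labels)
        for team, preds in submissions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    hidden = [1, 0, 1, 1]  # test annotations kept by the organizer
    entries = {
        "team_a": [1, 0, 1, 1],
        "team_b": [0, 0, 1, 1],
        "team_c": [0, 1, 0, 0],
    }
    for rank, (team, score) in enumerate(build_leaderboard(entries, hidden), 1):
        print(rank, team, score)
```

The organizer holds the hidden annotations and publishes only the ranked scores, which is what keeps the comparison between contestants fair.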
  • 26. DATA CHALLENGES
      • The HiggsML challenge on Kaggle
          • https://www.kaggle.com/c/higgs-boson
          • 1785 teams, huge publicity
          • significant improvement on baseline
          • 18-month preparation, yet partially missing the target
  • 27. DATA CHALLENGES
      • Challenges are useful for
          • generating visibility in the data science community about novel application domains
          • benchmarking state-of-the-art techniques fairly on well-defined problems
          • finding talented data scientists
      • Limitations
          • not necessarily suited to solving complex and open-ended data science problems in realistic environments
          • emphasizes competition
  • 28. DESIGN AND INNOVATION STRATEGY WORKSHOPS
  • 29. DESIGN AND INNOVATION STRATEGY WORKSHOPS
      • Putting domain scientists, data scientists, and management scientists in the same room
      • Getting them to understand each other
      • Keeping them collectively creative
      • The goal: identifying and defining projects
          • low-hanging fruit
          • breakthrough projects
          • long-term vision
  • 30. DESIGN AND INNOVATION STRATEGY WORKSHOPS
      • C/K design theory
      • innovative design = interaction and joint expansion of concepts and knowledge
  • 31. DESIGN AND INNOVATION STRATEGY WORKSHOPS
      • DKCP process: linearizing C-K dynamics
      • Figure labels: [P] Project building; Initialisation; [K] Knowledge sharing workshops; [C] IFM-Design workshops; [RUN]
  • 32. TAKE HOME MESSAGE NO1: If you are interested in adapting any of these tools to your project or site, feel free to contact us; we would be happy to share our experience
  • 33. WHAT CNRS CAN DO
      • We need data engineers and trainers to support research
          • ideally: 75% research scientists, 25% research engineers
      • We are lucky in France that the position exists in public research
      • Tasks
          • integrating research code into general-purpose professional software (e.g., scikit-learn, Torch)
          • providing an interface between computational infrastructure (e.g., clouds) and scientists
          • training scientists to use the tools
  • 34. WHAT CNRS CAN DO: Incentives
  • 35. WHAT CNRS CAN DO
      Most data scientists, like other scientists, are trained and incentivized to do research in highly specialized domains. They seek scientific visibility in their international community, which is equally highly specialized, because their career advancement is almost entirely based on peer-reviewed publications. Even when they have the expertise, they have little incentive to venture into the tool builder (data engineer) role, since software authorship has little value in their evaluation and can only serve them implicitly through the visibility they gain in the community of tool users.
      By the same token, they have little incentive to venture into domain sciences and to tackle economic or societal challenges. A domain science or industrial project may require new techniques that can then be published in data science venues, but this is not guaranteed at all. It usually takes a heavy investment of time and effort to understand domain problems, so excursions into domain sciences are highly risky. Even when such collaborations are established, the data scientist has a strong prior to use his/her favorite methodology, which is not necessarily the best solution for a given problem. Finally, the data scientist has little incentive to bring the project to full fruition, and he/she often “runs away” with an abstract data science problem (and solution) extracted from the project.
      Symmetrically, domain scientists and industrial domain experts have no incentive to advance data science and to develop and publish new techniques, as long as their data science problems get solved. When they venture into tool development, they have little incentive to develop general-purpose tools.
  • 36. TAKE HOME MESSAGE NO2: Affirmative action
  • 37. TAKE HOME MESSAGE NO2: Affirmative action
      • Overweight (e.g., double-count) out-of-domain papers in every evaluation
      • Make software tool ownership count
  • 38. TAKE HOME MESSAGE NO3: Co-locality is important. It would be desirable that data science for scientific data in France be grouped into 5-6 sites of similar size. Resources and experience can and should be shared.
  • 39. THANK YOU!