Data Science Governance Series
How to organize a center for multi-core, multi-project health data science
Abbas Shojaee MD
Associate Research Scientist - Healthcare Data Science
Certified Project Manager / Software Architect
Center for Outcomes Research and Evaluation
Yale University
July 2013
A few notes
• This presentation proposes a team structure for a successful multi-project data science practice.
• My other presentations cover related topics: data science project planning, life cycle management, risk management, activity charts, team interaction guidelines, result dissemination and post-implementation.
• This is not meant to be all-inclusive.
• The design is based on my experience managing analytics and scientific computation research and development efforts in industry and in academia.
• I combined concepts from research management methodology, software project management methodology and enterprise analysis to devise this structure.
Abbas Shojaee MD – Data Science Governance in Healthcare - part 1 - June 2013
Multi-core, multi-project data science team structure
Overview
• As in any other project, a team structure is necessary in data science research.
• This team structure can scale up or down and can be adapted to small startups or to large industry settings. Adapting it to different environments is not covered in this presentation.
• The team structure sets up clear accountability and shared responsibility.
• It is a two-layer, flat, flexible structure that maximizes the engagement of available human resources within a matrix project structure.
Some presumptions
• Data science is a team activity because of its interdisciplinary nature.
• The team structure is reversed: the team forms on top of the project lead.
Team structure
• Two-layer structure
  • Data science project layer
    • A data scientist or data science practitioner as the seed of the project, plus 1 or 2 assistants or trainees.
  • Reference layer
    • Provides administrative support, domain expertise, scientific computation expertise and data definition expertise.
Team structure
[Figure: Research Center diagram. Reference groups: Data; Healthcare; Modeling; Writing & publishing; Disseminating & community building; Administration & project management. Data science practitioners: teams numbered 1 to n, with the labels Modeling, Healthcare, Data, Writing, Administration and Dissemination.]
Team structure
• The data scientist (DS) works as the project seed.
• The data scientist defines the clinical question of interest and the dataset that may be used, and proposes the computational approach.
• The data scientist discusses the approach with the different reference groups and recruits people from each group who are interested in helping the proposed project.
• The data scientist proposes up to 3 projects to the Center for Data Science, in formal proposals.
• The data scientist is in charge of defining the project plan and submitting timely progress reports.
• The Center for Data Science assigns a project manager, who monitors progress, ensures the proper level of teamwork and informs the Center for Data Science of direct and indirect costs.
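The workflow above can be pictured as a small data model. The sketch below is only an illustration under stated assumptions: the class names (`Center`, `Project`), the method names and the lower-case group labels are hypothetical and are not part of the presentation. The only rules taken from the slides are the limit of 3 projects per data scientist, recruiting from reference groups, and a project manager assigned by the center.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# "Up to 3 projects" per data scientist, as stated on the slide.
MAX_PROJECTS_PER_DS = 3

# Hypothetical labels for the reference groups named in the deck.
REFERENCE_GROUPS = (
    "healthcare", "modeling", "data",
    "writing", "dissemination", "administration",
)


@dataclass
class Project:
    title: str
    lead: str                       # the data scientist acting as project seed
    manager: Optional[str] = None   # assigned later by the center
    members: Dict[str, List[str]] = field(default_factory=dict)

    def recruit(self, group: str, person: str) -> None:
        """Add a volunteer from one of the reference groups to this project."""
        if group not in REFERENCE_GROUPS:
            raise ValueError(f"unknown reference group: {group}")
        self.members.setdefault(group, []).append(person)


@dataclass
class Center:
    projects: List[Project] = field(default_factory=list)

    def propose(self, lead: str, title: str) -> Project:
        """Register a formal proposal, enforcing the per-scientist limit."""
        current = sum(1 for p in self.projects if p.lead == lead)
        if current >= MAX_PROJECTS_PER_DS:
            raise ValueError(f"{lead} already leads {current} projects")
        project = Project(title=title, lead=lead)
        self.projects.append(project)
        return project

    def assign_manager(self, project: Project, manager: str) -> None:
        """The center, not the data scientist, assigns the project manager."""
        project.manager = manager
```

In this sketch a fourth proposal from the same lead raises `ValueError`, and the manager is set only through the center, which mirrors the split of responsibilities described on the slide.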
Healthcare Data Scientist
• Understands the healthcare system or the biomedical sciences
• Expert in computational techniques
• Hands-on experience with data extraction, transformation and loading technologies
• Self-motivated, able to work autonomously and able to define the combined question in healthcare data science
• Each data scientist initiates 1-3 parallel projects.
• Data scientists have trainees or assistants on their projects:
  • To make research projects sustainable by building human capital
  • To boost the productivity of project seeds so that they make the most of their abilities
Reference Groups
Reference groups
• The Center for Data Science forms reference groups by inviting, informing and organizing experts from other internal or external departments, other universities or industry.
• Each reference group has a defined role, a task list and regular meetings, and supports the rest of the team by providing consultation and scientific lectures.
• Members of a reference group are also members of different DS teams.
Clinical team: Biomedical subject matter team
• Mission:
  • Enables other teams to comprehend and use the meaning of medical data
  • Answers questions about translating healthcare domain knowledge into specific concepts that have an identity in some vocabulary
  • Resolves or bridges vocabulary and ontology conflicts and gaps
  • Maintains the set (and map) of reference ontologies used by the team
  • Takes the steps required to ensure best practice in ontology usage across the team
• Knows about (required team expertise):
  • Medicine
  • Public health
  • Health economics
  • Healthcare delivery
  • Outcomes research
  • Health informatics
Modeling & Algorithm subject matter team
• Mission:
  • Maintains algorithm coherence across the team
  • Looks for potential synergies among the teams based on connections between their algorithms
  • Takes the steps required to ensure algorithm reuse
  • Brainstorms new and groundbreaking algorithm research on major problems
  • Identifies and introduces promising or flourishing algorithms
• Knows about:
  • Different algorithms and their applications
  • Algorithm time and space complexity
  • Available libraries
• May provide general advice for, or contribute to:
  • Choosing or developing a proper algorithm for a specific clinical problem
  • Customizing or enhancing an existing algorithm
• Helps others as computation designers
Data subject matter team
• Mission:
  • Maintains information about available datasets
  • Tracks new open and closed datasets
  • Suggests opportunities for integration
  • Maintains reusable code for data conversion, preparation and reshaping
  • Practices efficient techniques for document retrieval and information extraction
  • Looks for ways to incorporate unusual data sources
• Knows about:
  • Computer science
  • Databases and data formats
  • Database management systems
  • Efficient storage and data conversion
  • Information extraction
  • Healthcare interoperability standards
• Structure:
  • Consists of Center for Data Science statisticians
  • Has a subgroup of data practitioners:
    • A dynamic pool of 2-3 undergraduate or graduate students who work on data extraction, transformation and loading
    • Mission: to facilitate data reshaping and curation for the entire Center for Data Science and the DS teams
Documentation & Dissemination team
• Mission:
  • To keep track of expertise and people
  • To build an online presence and a community around the work, creating the human and social capital that keeps projects sustainable and alive
  • To keep documentation and backups, and to manage updates of the open-source work developed, in a reusable manner
• Knows about:
  • Team building
  • Scientific social networking and community building
  • Online presence
  • Software documentation
Project management and administrative reference group
• Mission of the reference group:
  • Orchestrates all teams toward the overall mission
  • Ensures a coherent and concordant implementation
  • Keeps the project master plan
  • Keeps track of funding sources and opportunities
• Mission of each member:
  • Works in DS teams as the Center for Data Science advocate
  • Monitors progress and ensures the proper level of teamwork
  • Keeps the Center for Data Science informed of project progress and of direct and indirect project costs
  • Proactively tries to identify teamwork or other problems sooner rather than later
  • Works with the Center for Data Science as the DS team advocate
• Knows about:
  • Matrix project management
  • Software project management
  • Resource planning
  • Readiness management
My other related presentations
• Data science project planning
• Life cycle management
• Risk management and reasons for low productivity in data science
• Activity chart
• Team interaction guidelines
• Result dissemination
• Post implementation phase
