
The Role of Automated Function Prediction in the Era of Big Data and Small Budgets


Keynote presentation at the Automated Functional Prediction Special Interest Group, ISMB Conference, July 11, 2014, Boston, MA, USA.

Published in: Education, Technology
  • Slide notes: FISMA is the Federal Information Security Management Act of 2002;
    HIPAA is the Health Insurance Portability and Accountability Act of 1996.
  • The Role of Automated Function Prediction in the Era of Big Data and Small Budgets

    1. The Role of Automated Function Prediction in the Era of Big Data and Small Budgets
       Philip E. Bourne, Ph.D.
       Associate Director for Data Science, National Institutes of Health
    2. A View from the Funding Agencies
    3. “It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair …”
    4. Roughly translated…
       • A time of great (unprecedented?) scientific development but limited funding
       • A time of upheaval in the way we do science
    5. From a funder's perspective…
       • A time to squeeze every cent/penny to maximize the amount of research that can be done
       • A time when top-down approaches meet bottom-up approaches
    6. Top Down vs Bottom Up
       • Top Down
         – Regulations, e.g. US: Common Rule, FISMA, HIPAA
         – Data sharing policies
           • GWAS
           • Genome data
           • Clinical trials
         – Digital enablement
         – Moves towards reproducibility
       • Bottom Up
         – Communities emerge and crowdsource
           • Collaboration
           • Data shared
           • Open source software
           • Common principles
           • Standards
    7. A Time for New Models
       Source: Michael Bell, http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
    8. And This May Just Be the Beginning
       • Evidence:
         – Google car
         – 3D printers
         – Waze
         – Robotics
       From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies, by Erik Brynjolfsson & Andrew McAfee
    9. Consider This an Opportunity
       • Look at the value of data
       • Derive new business models
       • Look for new efficiencies
       • Foster best practices
       • Foster collaboration
       • ….
    10. It is the age when functional annotation is in the greatest demand for science.
        It is the age when the rewards outside academia are greater than the rewards inside.
    11. Associate Director for Data Science: The Biomedical Research Digital Enterprise
        Programmatic Theme | Deliverable | Example Features
        Sustainability* | Commons | Cloud – Data & Compute; Search; Security; Reproducibility Standards; App Store
        Education* | Training Center | Coordinate; Hands-on; Syllabus; MOOCs; Community
        Innovation* | BD2K | Centers; Training Grants; Catalogs; Standards; Analysis
        Process | Modified Review | Data Resource Support; Metrics; Best Practices; Evaluation; Portfolio Analysis
        Communication / Collaboration: ICs; Researchers; Federal Agencies; International Partners; Computer Scientists
        Scientific Data Council; External Advisory Board
        * Hires made
    12. Innovation – Big Data to Knowledge (BD2K)
        • Centers of excellence
        • Software catalog
        • Data catalog
        • Software initiatives
        • Standards
        • Training
        bd2k.nih.gov
    13. Sustainability and Sharing: The Commons
        [Diagram; labels in extraction order:] Data; The Long Tail; Core Facilities/HS Centers; Clinical/Patient; The Why: Data Sharing Plans; The Commons; Government; The How: Data Discovery Index; Sustainable Storage; Quality; Scientific Discovery; Usability; Security/Privacy; Commons == Extramural NCBI == Research Object Sandbox == Collaborative Environment; The End Game: Knowledge; NIH Awardees; Private Sector; Metrics/Standards; Rest of Academia; Software Standards Index; BD2K Centers; Cloud, Research Objects,
    14. What The Commons Is and Is Not
        • Is Not:
          – A database
          – Confined to one physical location
          – A new large infrastructure
          – Owned by any one group
        • Is:
          – A conceptual framework
          – Analogous to the Internet
          – A collaboratory
          – A few shared rules
            • All research objects have unique identifiers
            • All research objects have limited provenance
    15. What Does the Commons Enable?
        • Dropbox-like storage
        • The opportunity to apply quality metrics
        • Bring compute to the data
        • A place to collaborate
        • A place to discover
        http://100plus.com/wp-content/uploads/Data-Commons-3-1024x825.png
    16. One Possible Commons Business Model
        [Adapted from George Komatsoulis]
        HPC, Institution …
    17. What Are the Benefits to Those Doing Functional Annotation?
        • Open environment in which to test new ideas – better for crowdsourcing
        • Opportunity to gain resources to run annotation pipelines
        • Opportunity to collaborate through provision of open APIs
        • Better characterization of, and access to, annotation methods
    18. Commons Pilots
        • Define a set of use cases emphasizing:
          – Openness of the system
          – Support for basic statistical analysis
          – Embedding of existing applications
          – API support into existing resources
        • Evaluate against the use cases
        • Review results & business model with NIH leadership
        • Design a pilot phase with various groups
        • Conduct pilot for 6-12 months
        • Evaluate outcomes and determine whether a wider deployment makes sense
        • Report to NIH leadership summer 2015
    19. Some Acknowledgements
        • Eric Green & Mark Guyer (NHGRI)
        • Jennie Larkin (NHLBI)
        • Leigh Finnegan (NHGRI)
        • Vivien Bonazzi (NHGRI)
        • Michelle Dunn (NCI)
        • Mike Huerta (NLM)
        • David Lipman (NLM)
        • Jim Ostell (NLM)
        • Andrea Norris (CIT)
        • Peter Lyster (NIGMS)
        • The more than 100 folks on the BD2K team
    20. NIH… Turning Discovery Into Health
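
Illustrative sketch (not part of the talk): slide 14 lists two shared rules for the Commons, that every research object has a unique identifier and limited provenance. The Python below is one possible way to picture those two rules, under stated assumptions. The class names (`Provenance`, `ResearchObject`, `Registry`), the field names, the `urn:uuid:` identifier scheme, and the example.org locations are all hypothetical and do not describe any actual Commons design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, Tuple
import uuid


@dataclass(frozen=True)
class Provenance:
    """Minimal ("limited") provenance: who, when, and derived from what."""
    creator: str
    created: str                        # ISO 8601 timestamp, UTC
    derived_from: Tuple[str, ...] = ()  # identifiers of parent objects


@dataclass(frozen=True)
class ResearchObject:
    """A dataset, tool, or annotation set (hypothetical fields)."""
    identifier: str
    kind: str
    location: str
    provenance: Provenance


class Registry:
    """In-memory index that applies the two shared rules on registration."""

    def __init__(self) -> None:
        self._objects: Dict[str, ResearchObject] = {}

    def register(self, kind: str, location: str, creator: str,
                 derived_from: Iterable[str] = ()) -> ResearchObject:
        parents = tuple(derived_from)
        # Provenance may only point at objects the registry already knows.
        for parent in parents:
            if parent not in self._objects:
                raise KeyError(f"unknown parent identifier: {parent}")
        # Rule 1: every object receives its own identifier (assumed UUID-based).
        identifier = f"urn:uuid:{uuid.uuid4()}"
        # Rule 2: every object carries a small provenance record.
        obj = ResearchObject(
            identifier=identifier,
            kind=kind,
            location=location,
            provenance=Provenance(
                creator=creator,
                created=datetime.now(timezone.utc).isoformat(),
                derived_from=parents,
            ),
        )
        self._objects[identifier] = obj
        return obj

    def get(self, identifier: str) -> ResearchObject:
        return self._objects[identifier]


if __name__ == "__main__":
    registry = Registry()
    sequences = registry.register(
        "dataset", "https://example.org/proteins.fasta", creator="lab-a")
    annotations = registry.register(
        "annotation", "https://example.org/go-terms.tsv", creator="lab-b",
        derived_from=[sequences.identifier])
    print(annotations.identifier, annotations.provenance.derived_from)
```

In this sketch a function-annotation result records the identifier of the sequence set it was computed from, which is the kind of link that would let others discover and compare annotation methods in a shared environment.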
