BEACON's Cyberinfrastructure Needs

Our slides from an NSF meeting on computing needs in biology.

Transcript

  • 1. BEACON Center for the Study of Evolution in Action
    An NSF Science and Technology Center headquartered at Michigan State University.
    – Funded in 2010 at $25 million for the first five years; celebrating our second anniversary in August, with funding expected through 2020.
    – 5 partner universities, 131 faculty members, 42 postdocs, 180 graduate and 47 undergraduate students; 445 people total.
    – Diverse research: Microbiology, Robotic Swarms, Genetic Algorithms, Zoology, Computational Evolution, Plant Biology, and many other areas.
  • 2. MISSION: Illuminating and harnessing the power of evolution in action to advance science and technology and benefit society.
    Three cross-cutting themes of BEACON research: Biological Evolution, Digital Evolution, and Evolutionary Applications.
  • 3. Extreme compute: Avida
    – Avida: a digital model of evolution built from self-replicating computer programs (a toy sketch of the idea follows the transcript).
    – ~100k CPU-days per PhD thesis.
    – ~100 GB of data per run to analyze; much less to preserve/archive.
    – Low memory: < 1 GB of RAM per run.
    – GPGPU not useful.
    – Data archiving: community & university standards.
    – Bottom line: fairly traditional compute use (more cores, mo’ better!).
  • 4. Extreme data: next-generation sequencing (NGS)
    – Sequencing non-model organisms & communities: soil metagenomics, non-model animal transcriptomics.
    – $10k of sequencing per week => 1 TB of data; assembly requires roughly two machine-weeks on big-memory nodes (512+ GB of RAM).
    – RAM- and I/O-limited.
    – A “big graph” problem with no locality, so not easily distributable (see the De Bruijn graph sketch after the transcript).
    – Estimated 5-50 Tbp of sequence needed per sample (~1 million genomes/sample), and multiple samples per thesis.
    – Community & university data archiving standards apply.
    – Growth in sequencing capacity is outpacing Moore’s Law; new algorithmic approaches are needed.
  • 5. Our efforts
    1. Training
    – Biology has become data-intensive quite quickly!
    – Most biologists are not trained in the effective use of computation.
    – Grad students are extremely motivated!
    – We run intro courses & many focused workshops: an intro grad course; Software Carpentry (Sloan); Analyzing Next-Gen Sequencing Data (NIH); metagenomics.
    2. A well-integrated layer of “cyberinfrastructure research”
    – Faculty research programs and labs incorporate the development of a robust community of software for modeling, simulation, and data analysis.
    – Algorithm research is tightly integrated with biological research programs; e.g., novel compression approaches provide significant leverage on next-gen sequencing problems (see the Bloom filter sketch after the transcript).
    – Exploration of, and adaptation to, loosely coupled, poor-I/O platforms (i.e. the Amazon cloud) to enable flexible extension of compute capacity.
    – ...underappreciated, underfunded.
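A note on slide 3: Avida itself is a substantial C++ system, but the core loop of digital evolution (programs copied with occasional copy errors, plus selection on task performance) fits in a few lines. The sketch below is purely illustrative, not Avida: the instruction alphabet, mutation rate, and "ab"-motif fitness task are all assumptions made for the example.

    # Toy sketch of Avida-style digital evolution (NOT Avida itself):
    # genomes are strings over a stand-in instruction alphabet, copied
    # with point mutations, with selection on a placeholder task.
    import random

    ALPHABET = "abcdefghijklmnop"   # stand-in for an instruction set
    MUTATION_RATE = 0.01            # per-site copy-error probability

    def replicate(genome):
        """Copy a genome, introducing point mutations at MUTATION_RATE."""
        return "".join(
            random.choice(ALPHABET) if random.random() < MUTATION_RATE else c
            for c in genome
        )

    def fitness(genome):
        """Placeholder task: count 'ab' motifs (real Avida rewards logic tasks)."""
        return genome.count("ab")

    def evolve(pop_size=100, genome_len=50, generations=200):
        population = ["".join(random.choices(ALPHABET, k=genome_len))
                      for _ in range(pop_size)]
        for _ in range(generations):
            # Fitness-proportionate choice of parents, then mutated copies;
            # the +1 keeps weights positive when every fitness is zero.
            weights = [fitness(g) + 1 for g in population]
            parents = random.choices(population, weights=weights, k=pop_size)
            population = [replicate(p) for p in parents]
        return max(population, key=fitness)

    print("best fitness:", fitness(evolve()))

Runs like this are embarrassingly parallel across replicates, which matches the compute profile on the slide: low memory, no GPGPU benefit, and "more cores, mo' better."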
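On slide 4's "big graph" problem: short-read assembly is commonly framed as a De Bruijn graph whose nodes are the distinct k-mers in the data. Memory then scales with sequence diversity rather than with read count, and graph neighbors share no locality with read order, which is what makes the problem RAM-bound and hard to distribute. A minimal sketch (the k value and reads are invented for illustration):

    # Minimal De Bruijn graph construction: every distinct k-mer must be
    # held in memory, which is why assembly is RAM-limited and why the
    # graph has no locality to exploit. Illustrative only.
    from collections import defaultdict

    def de_bruijn_edges(reads, k):
        """Map each (k-1)-mer prefix to the (k-1)-mer suffixes it precedes."""
        graph = defaultdict(set)
        for read in reads:
            for i in range(len(read) - k + 1):
                kmer = read[i:i + k]
                graph[kmer[:-1]].add(kmer[1:])
        return graph

    reads = ["ACGTACGTGACG", "CGTGACGTTACG"]   # invented example reads
    graph = de_bruijn_edges(reads, k=5)
    print(len(graph), "nodes")   # node count tracks distinct k-mers, i.e. RAM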
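On slide 5's compression remark: the slide does not name a specific technique, but one widely used way to shrink k-mer sets is a probabilistic data structure such as a Bloom filter, which stores membership in a few bits per k-mer at the cost of occasional false positives. The sketch below is a hedged illustration of that general idea; the filter size and SHA-256-based hashing are arbitrary choices for the example, not a production design.

    # Hedged sketch of one "compression" idea for sequence data: store
    # k-mers in a Bloom filter instead of an exact hash table. Queries
    # can return false positives, never false negatives.
    import hashlib

    class BloomFilter:
        def __init__(self, n_bits=1 << 20, n_hashes=4):  # sizes are illustrative
            self.n_bits = n_bits
            self.n_hashes = n_hashes
            self.bits = bytearray(n_bits // 8)

        def _positions(self, item):
            """Derive n_hashes bit positions from seeded SHA-256 digests."""
            for seed in range(self.n_hashes):
                h = hashlib.sha256(f"{seed}:{item}".encode()).digest()
                yield int.from_bytes(h[:8], "big") % self.n_bits

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def __contains__(self, item):
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(item))

    bf = BloomFilter()
    for kmer in ("ACGTA", "CGTAC", "GTACG"):
        bf.add(kmer)
    print("ACGTA" in bf, "TTTTT" in bf)   # True, (almost certainly) False

The trade is exactness for space: a few bits per k-mer instead of a full hash-table entry, which is the kind of constant-factor leverage on next-gen sequencing problems the slide alludes to.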