Cyberinfrastructure Day 2010: Applications in Biocomputing
UNM Cyberinfrastructure Day 2010 presentation: Applications in Biocomputing, biomedical and cheminformatics research computing cyberinfrastructure issues.


  1. Jeremy Yang, Software Systems Manager, Division of Biocomputing, Dept. of Biochemistry & Molecular Biology, UNM School of Medicine. Cyberinfrastructure Day -- April 22, 2010
  2. I. What is Biocomputing?
     II. Cyber Revolution (~1980-2010+)
     III. Cyberinfrastructure (To be or not to be?)
     IV. Super Computing, Redefined
  3. Division of Biocomputing (http://biocomp.health.unm.edu/), Department of Biochemistry & Molecular Biology, School of Medicine. Also affiliated with the NIH Roadmap-funded UNM Center for Molecular Discovery.
  4. • Biomolecular screening informatics
     • Cheminformatics
     • Bioinformatics
     • Genomics
     • Virtual screening
     • Molecular modeling
     • SAR (Structure-Activity Relationship)
     • Data mining, machine learning
     • 3D visualization
     • Public data integration
     • Collaborations in chemistry, biology, medicine, comp sci
     • BIOMED 505 course
     • Software development, management, deployment & support
  5. Larry Sklar, et al., UNMCMD (NIH Roadmap). ~$20M NIH awarded to date.
  6. • 32 cpu Linux cluster
     • 32GB RAM server
     • Linux: OpenSUSE, CentOS, RedHat, Fedora, Ubuntu
     • SGI/IRIX
     • Windows, Mac OS X
     • Automated integration with NIH databases
     • 2+ Oracle instances
     • PostgreSQL, MySQL
     • Stereo graphics workstation
     • 25+ scientific software packages
     • Supported in-house applications
     We are cyberinfrastructure users and providers!
  7. Virtual chemistry; property prediction, chemspace navigation, computer aided molecular design, graph theory, databases
  8. • Nucleotide and protein sequence analysis
     • Genomics, proteomics
     • Merging with chemical biology, etc.
  9. • Computational search for likely biological actives
     • Database may be real or virtual compounds
     • 2D and 3D methods
       • 2D similarity search
       • 3D similarity search (shape, pharmacophore)
       • docking (3D, protein binding site)
     Example: 3D shape search; prozac & paxil (c/o OpenEye ROCS)
  10. atoms, bonds, surfaces, fields, interactions, stereo (images: serotonin, hemoglobin)
  11. Computational models for protein-ligand binding. Abl kinase (1iep.pdb), with Gleevec in binding site. Figure legend (cut off in the source): "interaction potentia", "hydrophobic (green", "hbond acceptors (r". Gleevec is a leukemia drug known to bind with Abl kinase.
  12. (Watch movies...)
      PyMol movie: http://video.google.com/videoplay?docid=-5859274887925224981#
      Jmol interactive DNA modeling demo: http://chemapps.stolaf.edu/pe/protexpl/htm/top.htm?id=1d66&&&chpa=true
      Expert users can advance understanding via rich, dynamic, visual interfaces.
  13. E.g., Searching NIH PubChem for non-selectivity
  14. Many biomedical data sources worldwide
      SLIDE 15 (15 MIN?)
  15. Division of Biocomputing in 2008
  16. • Rapid change, challenge and opportunity
      • Learning from history, trends (new not enough)
      • Winners and losers
      • Science, experts have led and followed.
      • ~1980-2010 covers 3σ (99.7%)
      • And evolution...
  17. • Rapid change, challenge and opportunity
      • Learning from history, trends
      • Winners and losers
      • Science, experts have led and followed.
      • ~1980-2010 covers 3σ (99.7%)
      • And evolution...
  18. 1977: Atari 2600
      1978: Space Invaders
      1981: IBM-PC (MS-DOS)
      1983: cellphone
      1983: GNU Project
      1984: Neuromancer, William Gibson, “cyberspace”
      1984: Apple Mac, mouse, windows & icons
  19. 1985: Oracle 5 (client-server)
      1989: Intel 486 Pentium (1M transistors, 50MHz)
      1990: MS Windows 3.0
      1990: WWW (Berners-Lee)
      1991: High Perf Comp & Comm Act (Al Gore)
      1991: Linux (Linus Torvalds)
      1991: AOL
      1991: ETrade
  20. 1993: Jurassic Park (via SGI)
      1993: NCSA Mosaic
      1994: Netscape Navigator
      1994: “Good Times” hoax
      1994: Match.com
      1995: “Concept” virus (Word)
      1995: Internet Explorer
      1995: Apache project
      1995: Yahoo!
  21. 1995: Amazon.com
      1995: My mother gets email
      1997: Google
      1997: eBay
      1999: Melissa virus (Outlook)
      1999: Napster (p2p)
      2000: MS convicted
      2000: 3M USA broadband*
      2000: dot-com bubble pops
      *Fixed non dial-up internet connections >56k (FCC).
  22. 2000: 802.11b wireless
      2001: Apple iPod
      2001: Apple iTunes
      2001: Wikipedia
      2003: Skype
      2005: YouTube
      2005: Rio power grid hacked
      2005: NSA domestic surveillance
      2006: Facebook
  23. 2006: Amazon Cloud
      2007: DOD hacked
      2008: 70M USA broadband*
      2009: Cyberdefense USA priority
      2009: Twitter role in Iran election protests
      2010: UAVs are SOPs
      2011: Cyber terrorism?
      *Fixed non dial-up internet connections >56k (FCC).
  24. The dotted line keeps moving...
      Case study: database cheminformatics in pharma research, 1990→2000.
  25. • In 1990, high speed chemical searching was beyond standard capabilities.
      • Research groups managed local servers in their labs & specialized DB engines (e.g. Daylight Inc.).
      • By 2000, this function had moved to IT corporate informatics infrastructure (via Oracle cartridges, etc.).
      • Transition not smooth, but very beneficial.
  26. Standard functions: substructure, similarity, identity chemical searching (images: cocaine, imidazoles)
  27. (1) office equipment
      (2) lab equipment
      (3) experimental apparatus
      (4) the experiment
      (5) a commodity
      (6) custom configured experimental vehicle for exploration
      (7) all of the above
  28. (1) office equipment
      (2) lab equipment
      (3) experimental apparatus
      (4) the experiment
      (5) a commodity
      (6) custom configured experimental vehicle for exploration
      (7) all of the above
  29. • Scientific software
      • Computational science
      • Commodity software
      • Engineering enables science
      • Science requires agile development, high performance, experimentation, risk taking, play.
      • Cyberinfrastructure users and developers/maintainers
      SLIDE 30 (30 MIN?)
  30. • Scientific research
      • Computational research
      • High performance computing as a research tool
      • High performance infrastructure as an experimental apparatus
      • Scientific software for experts
      • Enabling software for scientists
      • Commoditization (e.g. cloud computing)
      • Plumbing vs. productivity tool
      • Appropriate tiers and domains
  31. IT: “Poorly managed computers and needy ill-trained users put the system at risk.”
      Research: “We need power, flexibility and access and not another lame PC.”
  32. And with other cyberfolks too. And with great results.
  33. • In ~5 yrs, super → un-super
      • Super computing? Define computer.
      • Advances from unexpected places:
        • gaming, movies (graphics -- vs. AI)
        • social networking (crowdsourcing)
        • even business (web standards, UIs, security)
      • Super computing is pushing the current limits
      • But where are the key frontiers?
  34. Advances from unexpected places...
  35. Colossus code breaking computer, UK.
  36. ENIAC computer, Univ of Pennsylvania.
  37. Cray computer
  38. SLIDE 40 (40 MIN?)
  39. High performance (super) computing is pushing the current limits.
  40. This is what a “computer” looks like.
  41. “The network is the computer.” - John Gage (Sun, NetDay founder)
  42. Corollaries:
      • The network is the (semantic) database
      • The network is cyberspace
      • The network is us too
  43. • Super users → super computing
      • Blackbox AI/monolith paradigm limiting
      • Human/computer co-evolution
      Cytoscape biological network visualizer with drug-target interactions
  44. “Super Computers” @ Division of Biocomputing
      • Tudor Oprea
      • Cristian Bologa
      • Stephen Mathias
      • Oleg Ursu
      • Jerome Abear
      • Ramona Curpan
      • Liliana Halip
      • Andrei Leitao
      Happy Earth Day!
      Jeremy Yang, jjyang@salud.unm.edu
      Cyberinfrastructure Day -- April 22, 2010