
A Prlic - BioJava update


Presentation by Prlic at BOSC2012 "BioJava Update"


  1. How to use BioJava to calculate one billion protein structure alignments at the RCSB PDB website. Andreas Prlić
  2. My Two Hats: RCSB PDB, BioJava
  3. Overview (www.pdb.org). Chart: number of released entries by year
  4. Some of the things you can do at the RCSB PDB site:
     • Advanced queries
     • Custom reports
     • Visualization
     • Education section
     • Comparisons across PDB, based on sequence and 3D structure similarities
     Screenshots: Jmol, Custom report, Ligand Explorer
  5. Systematic Structural Alignment (www.pdb.org). Objective: find novel relationships. Example: Green Fluorescent Protein
     • Nidogen-1: similar 11-stranded beta-barrel and internal helices
     • 3 Å RMSD, only 9% sequence identity
     • Nidogen-1: component of basement membrane, no chromophore
     • GFP and NID-1 may share common ancestor
  6. Open Science Grid. Based on the FATCAT (rigid) algorithm: Yuzhen Ye & Adam Godzik. Flexible structure alignment by chaining aligned fragment pairs allowing twists. 2003. Bioinformatics vol. 19 suppl. 2, ii246-ii255. Systematic comparisons of representative chains from 40% sequence identity clusters:
     • 22000 sequence clusters
     • 33000 representative domains
  7. Java clients can run anywhere. Diagram: PDB; custom job management (sends out instructions to clients, writes results to disk); Open Science Grid
  8. Compute cost and availability:
     • Initial calculation of frozen snapshot of PDB: ~170k CPU hours on OSG
     • Incremental weekly updates (~1-2 million alignments): <1000 CPU hours
     • 1 billion alignments available freely at www.rcsb.org
     • Code: www.biojava.org
  9. BioJava
     • Major rewrite - BioJava 3
  10. BioJava 1, BioJava 3: core data model; symbols/alphabets, counts, distributions; Genome/sequencing; Mult. seq. align; Structure alignment; Modfinder; AA Properties; Protein Disorder; Hmmer3 WS; NCBI WS; Parsers: Genbank/Embl/Blast
  11. Acknowledgments
     • RCSB PDB: Spencer Bliven, Peter Rose, Phil Bourne
     • BioJava: all contributors; A. Yates, J. Jacobsen, P. Troshin, M. Chapman, J. Gao, C.H. Koh, S. Foisy, R. Holland, G. Rimsa, M. Heuer, H. Brandstaetter-Mueller, S. Willis
     • Funding: RCSB PDB, Google Summer of Code, Open Science Grid
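
Slide 7 describes a custom job manager that sends instructions to Java clients and collects their results. The following is a minimal sketch of that pattern in plain Java, not the actual RCSB PDB job manager or any BioJava API. The class name, the chain labels, and `alignPlaceholder` are hypothetical; worker threads stand in for grid clients, and an in-memory map stands in for writing results to disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: a shared job queue drained by several "clients". */
class AlignmentJobSketch {

    /** Enumerate every unordered pair (i < j) of chain labels as one job each. */
    static List<String[]> allPairs(List<String> chains) {
        List<String[]> jobs = new ArrayList<>();
        for (int i = 0; i < chains.size(); i++) {
            for (int j = i + 1; j < chains.size(); j++) {
                jobs.add(new String[] {chains.get(i), chains.get(j)});
            }
        }
        return jobs;
    }

    /** Placeholder for a real structure alignment; returns a dummy score. */
    static double alignPlaceholder(String a, String b) {
        return Math.abs(a.length() - b.length());
    }

    /** Let a fixed number of worker threads pull jobs until the queue is empty. */
    static Map<String, Double> runAll(List<String[]> jobs, int clients) {
        Queue<String[]> queue = new ConcurrentLinkedQueue<>(jobs);
        Map<String, Double> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        for (int c = 0; c < clients; c++) {
            pool.execute(() -> {
                String[] job;
                // Each client asks for the next job and reports its result.
                while ((job = queue.poll()) != null) {
                    results.put(job[0] + " vs " + job[1],
                                alignPlaceholder(job[0], job[1]));
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
        return results;
    }

    public static void main(String[] args) {
        // Made-up labels, not real PDB identifiers.
        List<String> chains = List.of("chain-1", "chain-2", "chain-3", "chain-4");
        Map<String, Double> results = runAll(allPairs(chains), 3);
        System.out.println(results.size() + " alignments computed"); // 6 alignments computed
    }
}
```

Because clients pull work from a shared queue rather than receiving a fixed share up front, a slow client does not hold up the others, which suits a setting where clients "can run anywhere".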

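
The figures on slides 6 and 8 invite a back-of-the-envelope check. The sketch below counts the unordered pairs among the quoted 22000 clusters and 33000 domains, and estimates the average CPU time per alignment. It assumes the ~170k CPU hours cover roughly the one billion alignments, which the slides do not state explicitly; the class and method names are hypothetical.

```java
/** Hypothetical helper for rough estimates from the slide figures. */
class AlignmentBudgetSketch {

    /** Number of unordered pairs among n items: n(n-1)/2. */
    static long unorderedPairs(long n) {
        return n * (n - 1) / 2;
    }

    /** Average CPU seconds per alignment for a given CPU-hour budget. */
    static double cpuSecondsPerAlignment(double cpuHours, double alignments) {
        return cpuHours * 3600.0 / alignments;
    }

    public static void main(String[] args) {
        System.out.println(unorderedPairs(22_000)); // 241989000
        System.out.println(unorderedPairs(33_000)); // 544483500
        // Assumption: ~170k CPU hours spent on ~1 billion alignments.
        System.out.printf("%.3f s per alignment%n",
                cpuSecondsPerAlignment(170_000, 1e9));
    }
}
```

Under that assumption the average comes out well below one CPU second per alignment, which is consistent with the slides' point that the workload is feasible on a grid of many independent clients.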