
daTAA server


My talk about domain annotation in trimeric autotransporter adhesins.

Published in: Technology, Education

  1. Server daTAA: http://toolkit.tuebingen.mpg.de/dataa
     Paweł Szczęsny
     MPI for Developmental Biology, Tuebingen, Germany
     Institute of Biochemistry and Biophysics PAS, Warsaw, Poland
  2. Internal complexity of TAAs
     [Figure: amino-acid sequence of a trimeric autotransporter adhesin, broken into rows so that its internal repeats line up.]
  3. Automated vs manual annotation: coverage of annotation

     Domain type            PFAM    manually
     Present in PFAM        28%     35%
     Not present in PFAM    -       18%
     Coiled coils           -       3%
     Total                  28%     56%

     Present in PFAM        26%     31%
     Not present in PFAM    -       36%
     Coiled coils           -       25%
     Total                  26%     92%
  4. Automated vs manual annotation: coverage of annotation

     Domain type            PFAM    daTAA   manually
     Present in PFAM        28%     32%     35%
     Not present in PFAM    -       13%     18%
     Coiled coils           -       5%      3%
     Total                  28%     50%     56%

     Present in PFAM        26%     28%     31%
     Not present in PFAM    -       27%     36%
     Coiled coils           -       11%     25%
     Total                  26%     66%     92%
  5. Prediction of individual repeats in YadA

     |----------Hep_Hag---------|---------Hep_Hag-
     |---Ylhead---|---Ylhead----|-----
     ASAKGIH SIAIG ATAEAAKGA AVAVG AGSIATGVN SVAIG PLSKALG

     ----------| |---------Hep_Hag-------
     Ylhead--|---Ylhead---|---Ylhead---|----Ylhead--
     D SAVTYG AASTAQKD GVAIG ARASTSDT GVAVG FNSKADAKN SVAIG

     ---| |----------Hep_Hag---------|
     -|----Ylhead-----|----Ylhead---|
     HSSHVAANHGY SIAIG DRSKTDREN SVSIG HESL
  6. Key points (slide 11)
       • The approach of a human annotator, implemented in a computer system
       • Improved coverage and accuracy over general annotation servers
       • Unique workflow with knowledge-based rules
       • Visual helpers for interpreting the results
  7. Acknowledgements (slide 12)
       • MPI for Developmental Biology
       • Institute of Biochemistry and Biophysics PAS
       • Andrei Lupas
       • Dirk Linke
       • Toolkit development team
       • Piotr Zielenkiewicz
       • Marcin Grynberg
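
The YadA slide separates short five-residue blocks (SIAIG, AVAVG, SVAIG, ...) from the variable stretches between them. As a rough illustration of what locating such repeat anchors involves, here is a minimal Python sketch. It is not the daTAA method: the regular expression is a hypothetical pattern read off the blocks on the slide, and treating the three printed lines as one contiguous stretch of sequence is also an assumption.

```python
import re

# Sequence fragments exactly as printed on the slide. Whitespace only separated
# the visual blocks. Assumption: the three lines form one contiguous stretch.
SLIDE_LINES = [
    "ASAKGIH SIAIG ATAEAAKGA AVAVG AGSIATGVN SVAIG PLSKALG",
    "D SAVTYG AASTAQKD GVAIG ARASTSDT GVAVG FNSKADAKN SVAIG",
    "HSSHVAANHGY SIAIG DRSKTDREN SVSIG HESL",
]

# Hypothetical pattern derived from the five-residue blocks on the slide.
# It is NOT the rule set used by daTAA.
MOTIF = re.compile(r"[SAG][IV][AS][IV]G")


def join_fragments(lines):
    """Strip all whitespace and concatenate the lines into one sequence."""
    return "".join("".join(line.split()) for line in lines)


def find_motifs(sequence, pattern=MOTIF):
    """Return (0-based start, matched text) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


def spacings(hits):
    """Distances between the starts of consecutive hits."""
    starts = [start for start, _ in hits]
    return [b - a for a, b in zip(starts, starts[1:])]


if __name__ == "__main__":
    seq = join_fragments(SLIDE_LINES)
    hits = find_motifs(seq)
    for start, motif in hits:
        print(f"{start:4d}  {motif}")
    print("spacing between hits:", spacings(hits))
```

On this fragment the pattern finds eight blocks, mostly 13 to 16 residues apart. It skips the divergent SAVTYG block that the slide also sets apart, so a single fixed pattern like this one misses part of what the slide annotates.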
