
daTAA server


My talk about domain annotation in trimeric autotransporter adhesins.

Published in: Technology, Education


  1. Server daTAA: http://toolkit.tuebingen.mpg.de/dataa
     Paweł Szczęsny, MPI for Developmental Biology, Tuebingen, Germany; Institute of Biochemistry and Biophysics PAS, Warsaw, Poland
  2. Internal complexity of TAAs
     [Figure: full-length TAA amino-acid sequence laid out as an alignment of its internal repeats]
  3. Automated vs manual annotation: coverage of annotation

       Domain type            PFAM   manually
       Present in PFAM        28%    35%
       Not present in PFAM    -      18%
       Coiled coils           -      3%
       Total                  28%    56%

       Present in PFAM        26%    31%
       Not present in PFAM    -      36%
       Coiled coils           -      25%
       Total                  26%    92%
  4. Automated vs manual annotation: coverage of annotation

       Domain type            PFAM   daTAA   manually
       Present in PFAM        28%    32%     35%
       Not present in PFAM    -      13%     18%
       Coiled coils           -      5%      3%
       Total                  28%    50%     56%

       Present in PFAM        26%    28%     31%
       Not present in PFAM    -      27%     36%
       Coiled coils           -      11%     25%
       Total                  26%    66%     92%
  5. Prediction of individual repeats in YadA

       |----------Hep_Hag---------|---------Hep_Hag-
       |---Ylhead---|---Ylhead----|-----
       ASAKGIH SIAIG ATAEAAKGA AVAVG AGSIATGVN SVAIG PLSKALG

       ----------| |---------Hep_Hag-------
       Ylhead--|---Ylhead---|---Ylhead---|----Ylhead--
       D SAVTYG AASTAQKD GVAIG ARASTSDT GVAVG FNSKADAKN SVAIG

       ---| |----------Hep_Hag---------|
       -|----Ylhead-----|----Ylhead---|
       HSSHVAANHGY SIAIG DRSKTDREN SVSIG HESL
  6. Key points
       • The approach of a human annotator, implemented in a computer system
       • Improved coverage and accuracy compared with general annotation servers
       • A unique workflow built on knowledge-based rules
       • Visual helpers for interpreting the results
  7. Acknowledgements
       • MPI for Developmental Biology
       • Institute of Biochemistry and Biophysics PAS
       • Andrei Lupas
       • Dirk Linke
       • Toolkit development team
       • Piotr Zielenkiewicz
       • Marcin Grynberg
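
The "Total" rows in the coverage table on the "Automated vs manual annotation" slides are the sums of the three domain-type rows. The sketch below recomputes them from the figures on the slides. The names `block_1` and `block_2` are placeholders, because the transcript does not preserve what the two blocks of the table refer to, and the helper functions are illustrative only, not part of the daTAA server.

```python
# Coverage figures (percent) copied from the slides.
# "block_1" and "block_2" are placeholder names: the transcript does not
# say what the two blocks of the table refer to.
COVERAGE = {
    "block_1": {
        "PFAM": {"present_in_pfam": 28},
        "daTAA": {"present_in_pfam": 32, "not_in_pfam": 13, "coiled_coils": 5},
        "manually": {"present_in_pfam": 35, "not_in_pfam": 18, "coiled_coils": 3},
    },
    "block_2": {
        "PFAM": {"present_in_pfam": 26},
        "daTAA": {"present_in_pfam": 28, "not_in_pfam": 27, "coiled_coils": 11},
        "manually": {"present_in_pfam": 31, "not_in_pfam": 36, "coiled_coils": 25},
    },
}


def total_coverage(by_domain_type):
    """Sum the per-domain-type coverage of one annotation method."""
    return sum(by_domain_type.values())


def totals_by_method(coverage):
    """Return {block: {method: total coverage}} for every block and method."""
    return {
        block: {method: total_coverage(parts) for method, parts in methods.items()}
        for block, methods in coverage.items()
    }


TOTALS = totals_by_method(COVERAGE)

for block, per_method in TOTALS.items():
    # Difference between daTAA and PFAM totals, in percentage points.
    gain = per_method["daTAA"] - per_method["PFAM"]
    print(block, per_method, f"daTAA minus PFAM: {gain} points")
```

The recomputed totals agree with the "Total" rows on the slides, and the printed difference shows how far daTAA closes the gap between PFAM and manual annotation in each block.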

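
The YadA slide marks individual repeats around short conserved pentamers such as SIAIG and GVAVG. As a rough illustration of locating such repeats, the sketch below scans the sequence fragment from that slide with a regular expression. The pattern `[AGS][IV][AS][IV]G` is an assumption derived by eye from the pentamers visible on the slide, and the fragment is assumed to be contiguous once the spaces are removed. This is not the method used by the daTAA server, which the slides describe only as a workflow with knowledge-based rules.

```python
import re

# YadA fragment as transcribed from the slide, with the spaces removed.
# Assumption: the three displayed rows form one contiguous stretch.
YADA_FRAGMENT = (
    "ASAKGIHSIAIGATAEAAKGAAVAVGAGSIATGVNSVAIGPLSKALG"
    "DSAVTYGAASTAQKDGVAIGARASTSDTGVAVGFNSKADAKNSVAIG"
    "HSSHVAANHGYSIAIGDRSKTDRENSVSIGHESL"
)

# Hypothetical motif, read off the pentamers on the slide by eye.
# It is NOT the rule set used by daTAA.
MOTIF = re.compile(r"[AGS][IV][AS][IV]G")


def find_motifs(sequence, pattern=MOTIF):
    """Return (start, end, matched text) for each non-overlapping hit."""
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(sequence)]


def spacers(sequence, hits):
    """Return the stretches of sequence lying between consecutive hits."""
    return [sequence[prev[1]:nxt[0]] for prev, nxt in zip(hits, hits[1:])]


HITS = find_motifs(YADA_FRAGMENT)
SPACERS = spacers(YADA_FRAGMENT, HITS)

for start, end, text in HITS:
    print(f"{text} at {start}-{end}")
print("spacer lengths:", [len(s) for s in SPACERS])
```

On the fragment as transcribed, the pattern picks out the eight pentamers that the slide sets apart from the surrounding sequence. A pattern this loose would produce false hits on other proteins, so it is only a starting point for the kind of rule a curated workflow would refine.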