Introduction to CSBB Lab

Computational Biology Laboratory
Chuan Yi Tang
CS Department, NTHU

Our Aims

Coregulated genes
Diagram labels: Gene 1, Gene 2, Gene 3, Transcription factor.
Input sequences S1 to S4 (motif occurrences are set off by spaces):
atgaccgggatactgattaat a caa g gt tgggtataatggagtacgataa attgaga t caa t gt acggcgggtgctctcccgattggaag a caa c gt ggg gcaatcgggatc a caa c gt agaattggatgtcaaaataatggagtggcac gtcaatcgaaaaaacggtggtgagc g caa a gt aaagggattggaccgctt

SP1 (counts per position)

Position  1  2  3  4  5  6  7  8
a         0  0  0  0  0  0  0  0
t         5  0  0  0  3  0  0  0
c         4  9  9  0  6  9  9  4
g         0  0  0  9  0  0  0  5

Sp1 binding site in IUPAC code: YCCGYCCS
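The step from the count matrix to the IUPAC string can be sketched as follows. This is a minimal illustration, not the lab's code: the function name `iupac_consensus` and the rule "every letter with a nonzero count contributes to the code" are assumptions made for the example.

```python
# Standard IUPAC nucleotide codes, keyed by the set of letters they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}

# Sp1 counts for positions 1..8, as read from the matrix on the slide.
SP1_COUNTS = {
    "A": [0, 0, 0, 0, 0, 0, 0, 0],
    "T": [5, 0, 0, 0, 3, 0, 0, 0],
    "C": [4, 9, 9, 0, 6, 9, 9, 4],
    "G": [0, 0, 0, 9, 0, 0, 0, 5],
}


def iupac_consensus(counts):
    """Return the IUPAC string covering every letter observed at each position.

    Assumption: a letter is included as soon as its count is nonzero.
    """
    length = len(next(iter(counts.values())))
    codes = []
    for pos in range(length):
        present = frozenset(b for b, row in counts.items() if row[pos] > 0)
        codes.append(IUPAC[present])
    return "".join(codes)


print(iupac_consensus(SP1_COUNTS))
```

Under that inclusion rule the matrix yields the same string as the slide, YCCGYCCS. A stricter rule (for example a minimum frequency per letter) would give a less degenerate code.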

ARRTTYYRSA: high motif degeneracy, weak motif
AAGTTYYRCA: low motif degeneracy, strong motif

METHODS
Input sequences S1 to S4 (motif occurrences are set off by spaces):
atgaccgggatactgattaat a caa g gt tgggtataatggagtacgataa attgaga t caa t gt acggcgggtgctctcccgattggaag a caa c gt ggg gcaatcgggatc a caa c gt agaattggatgtcaaaataatggagtggcac gtcaatcgaaaaaacggtggtgagc g caa a gt aaagggattggaccgctt

e.g. $l = 3$, $d = 1$, $k = 4$, $W_{ij}$ = ATA
All possible sets of degenerate positions {p1, p2, p3}: _TA, A_A, AT_
For each possible set $X = \{p_1, \ldots, p_d\}$ of degenerate positions, all $W_{pq}$ with $V(W_{ij}, W_{pq}) \subseteq X$ are collected.

Pattern  K  Collected words (sequence)
_TA      4  ATA (S1), CTA (S2), ATA (S3), CTA (S3), TTA (S4)
A_A      5  ATA (S1), ATA (S2), ATA (S3), ACA (S4), AAA (S4), ACA (S5)
AT_      2  ATC (S2), ATT (S3), ATA (S3), ATA (S3), AAA (S3)
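The collection step can be sketched in a few lines. The sketch makes three assumptions that the slide does not state outright: V(Wij, Wpq) is read as the set of positions where the two words differ, the collection condition is read as "V is contained in X", and K is read as the number of distinct sequences contributing a word. The toy sequences and all function names are hypothetical.

```python
from itertools import combinations


def mismatch_positions(w, v):
    """Positions where two equal-length words differ (assumed meaning of V)."""
    return {i for i, (a, b) in enumerate(zip(w, v)) if a != b}


def group_by_degenerate_positions(ref, sequences, d):
    """For each set X of d positions, collect every l-mer whose mismatches
    against `ref` all fall inside X. Keys are masks such as '_TA'."""
    l = len(ref)
    groups = {}
    for positions in combinations(range(l), d):
        allowed = set(positions)
        mask = "".join("_" if i in allowed else ch for i, ch in enumerate(ref))
        hits = []
        for name, seq in sequences.items():
            for start in range(len(seq) - l + 1):
                word = seq[start:start + l]
                if mismatch_positions(ref, word) <= allowed:
                    hits.append((word, name))
        groups[mask] = hits
    return groups


def support(hits):
    """Number of distinct sequences contributing at least one word."""
    return len({name for _, name in hits})


# Toy input, invented for illustration (not the sequences on the slide).
toy = {"S1": "GATAC", "S2": "CTAGG", "S3": "ATTCC", "S4": "GGACA"}
groups = group_by_degenerate_positions("ATA", toy, d=1)
for mask, hits in groups.items():
    print(mask, support(hits), hits)
```

With a threshold k, only masks whose support reaches k would be kept as candidate motifs, which is how the K values next to each pattern appear to be used.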

Background letter probabilities are $P_A = 0.22$, $P_T = 0.22$, $P_C = 0.28$, and $P_G = 0.28$.

$$L_{pq} = \log \frac{\text{observed probability of } p \text{ at position } q \text{ in } G(W_{ij} \mid X)}{P_p}$$

A negative $(p, q)$-entry means that the letter $p$ at position $q$ is weakly conserved in $G(W_{ij} \mid X)$.

Pseudo occurrence elimination
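A small sketch of the log-odds matrix, assuming the natural logarithm (the value 1.51 used in the scoring example equals ln(1/0.22), which supports that reading). The input words and the function name `log_odds_matrix` are hypothetical, and letters that never occur at a position are simply left out rather than assigned a score.

```python
import math
from collections import Counter

# Background probabilities as given on the slide.
BACKGROUND = {"A": 0.22, "T": 0.22, "C": 0.28, "G": 0.28}


def log_odds_matrix(words, background=BACKGROUND):
    """Return one dict per position mapping letter p to L_pq.

    L_pq = ln(observed frequency of p at position q / background P_p).
    Unobserved letters get no entry (their log-odds would be undefined).
    """
    length = len(words[0])
    matrix = []
    for q in range(length):
        column = Counter(w[q] for w in words)
        total = sum(column.values())
        matrix.append(
            {p: math.log((n / total) / background[p]) for p, n in column.items()}
        )
    return matrix


# Hypothetical group of collected words G(Wij | X) with X = {last position}.
words = ["ATC", "ATA", "ATA", "ATA", "ATA"]
m = log_odds_matrix(words)
print(round(m[0]["A"], 2))  # fully conserved A
print(m[2]["C"] < 0)        # C occurs less often than its background rate
```

A fully conserved A or T scores ln(1/0.22), about 1.51, while a letter seen less often than its background probability gets a negative entry, matching the "weakly conserved" remark.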

Motif scoring methods

$$s_1 = \left( \sum L_{ij} / p_j \right) / l$$

This fact is used to measure the conservation and the significance of each reported motif.
Example sum: (1.51+1.51+1.51+1.51+(0.31+0.31)/2+1.51+(0.31+0.82)/2)
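The example sum can be reproduced with a short sketch. It assumes that p_j is the number of letters allowed at position j (so a degenerate position averages its log-odds values) and that l is the number of positions in the sum, here 7. Neither reading is spelled out on the slide, and the name `s1_score` is hypothetical.

```python
def s1_score(per_position_scores):
    """Average, over positions, of the mean log-odds of the allowed letters.

    per_position_scores: one list per motif position, holding the L values
    of every letter allowed there (one value if the position is fixed).
    """
    l = len(per_position_scores)
    position_means = (sum(scores) / len(scores) for scores in per_position_scores)
    return sum(position_means) / l


# The seven terms of the example sum on the slide.
EXAMPLE = [[1.51], [1.51], [1.51], [1.51], [0.31, 0.31], [1.51], [0.31, 0.82]]
print(round(s1_score(EXAMPLE), 3))
```

Under these assumptions the bracketed sum is 8.425, and dividing by l = 7 gives a score of about 1.204.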

Evaluation of performance on synthetic data

The measure used for comparison is the performance coefficient $|K \cap P| / |K \cup P|$ (Pevzner, P. A. and Sze, S. H. (2000) Combinatorial approaches to finding subtle signals in DNA sequences. Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000), 269-278). $K$ is the set of positions of the known motif occurrences in the input sequences. $P$ is the set of predicted positions. The best performance coefficients among the top ten motifs found by these tools are compared.

Example (sequences S1 to S3):
atgaccgggatactgattaat a caa g gt tgggtataatggagtacgataa attgaga t caa t gt acggcgggtgctctcccgattggaag a caa c gt ggg gcaatcgggatc a caa c gt agaattggatgtcaaccaaagtggagtggcac
$|K \cap P| = 21$, $|K \cup P| = 35$, so $|K \cap P| / |K \cup P| = 21/35 = 0.6$
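As a minimal sketch, the performance coefficient is a set intersection divided by a set union. The position sets below are invented so that their sizes match the counts in the example (21 shared positions, 35 in total); they are not the actual positions from the slide.

```python
def performance_coefficient(known, predicted):
    """|K ∩ P| / |K ∪ P| for two collections of sequence positions."""
    known, predicted = set(known), set(predicted)
    union = known | predicted
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(known & predicted) / len(union)


# Hypothetical positions: 28 known, 28 predicted, overlapping in 21.
K = set(range(0, 28))
P = set(range(7, 35))
print(performance_coefficient(K, P))
```

The coefficient is 1 only when the predicted positions coincide exactly with the known ones, and it drops both for missed positions and for extra predictions.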

Evaluation of performance on synthetic data

MotifSeeker
Specificity: $|K \cap P| / |P|$ (lowered by false positives)
Sensitivity: $|K \cap P| / |K|$ (lowered by false negatives)
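Both measures, as defined on this slide, can be computed from the same two position sets. The sketch below uses hypothetical positions and hypothetical function names. Note that the slide's "specificity" (overlap divided by the number of predicted positions) is the quantity often called precision elsewhere.

```python
def specificity(known, predicted):
    """|K ∩ P| / |P| as defined on the slide; drops with false positives."""
    known, predicted = set(known), set(predicted)
    return len(known & predicted) / len(predicted) if predicted else 0.0


def sensitivity(known, predicted):
    """|K ∩ P| / |K|; drops with false negatives."""
    known, predicted = set(known), set(predicted)
    return len(known & predicted) / len(known) if known else 0.0


# Hypothetical positions: 20 known, 15 predicted, 10 in common.
K = set(range(0, 20))
P = set(range(10, 25))
print(specificity(K, P), sensitivity(K, P))
```

Here 5 of the 15 predicted positions are false positives and 10 of the 20 known positions are missed, so the two measures differ.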

The best performance coefficient among the top ten motifs selected.

Evaluation of performance on tissue-specific regulatory elements

Breeding challenges facing Taiwan country chickens

Diagram labels: flock, breeding, breeding program, selection, genotype, phenotype

Using serum proteins as selection markers

Studying the proteome as a basis for flock selection

Biological pathway analysis of avian egg production. 科學農業 (Scientific Agriculture), October 2004

Serum protein marker
Exp I, Exp II

Stage selection

Exp (I)
Fig. 1. Egg production rate of TRFCC (n=157). (A) Total egg number of all hens, (B) hens in four groups

Fig. 3. Association of relative protein levels with total egg number. (A) Vitellogenin (B) Apo A-I (C) Ovotransferrin (D) X protein

Exp II. Selection strategy

Fig. 1. Egg production rate of batch A (n=77) and batch B (n=78) of TRFCC.

Code-selection
Diagram labels: score and rank for Batch A and Batch B, transformation, regional codes, code

Code-selection
Step 1: Select the 20% of birds with the lowest egg numbers in batch B of TRFCC.

Step 2: Transform the codes of the birds in batch A.

Conclusions

Acknowledgments: collaborators and researchers who took part in the country chicken project

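Research and development of blade servers for advanced scientific computing (Quanta industry-academia project)
Subproject 2: Building a cluster computing environment for theoretical physics and bioinformatics
National Applied Research Laboratories: President 莊哲男
National Center for High-performance Computing: Dr. 張西亞
National Center for Theoretical Sciences: Director 張圖南
Department of Computer Science, National Tsing Hua University: Prof. 唐傳義 (Chuan Yi Tang)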

Performance Comparison between IB and GE on Quanta Blade Server

Research and development of bioinformatics applications (1)

Research and development of bioinformatics applications (2)

Method development (3)

Second-year research plan (2006/11 to 2007/7)

Future directions of the lab

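Medical-history mining in a nuclear medicine image bank and its application to cancer diagnosis
唐傳義 (Chuan Yi Tang), 閻紫宸 (Director of Nuclear Medicine, Chang Gung), 王速貞 (FDA, USA)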

Background

Which information is valuable?

Nasopharyngeal Carcinoma (NPC)

Genome-wide Interpretation: Informatics of Immune Responses - The Concept of Immunometer
Ching-Tai Huang (黃景泰), M.D., Ph.D.
Division of Infectious Diseases, Department of Internal Medicine, Linkou Chang Gung Memorial Hospital

Immune Tolerance & Immune Activation - Balance between Physiology & Pathology
Diagram labels: self antigens, tumors, infectious microorganisms, environmental antigens, transplanted organs; Tolerance, Activation

Transgenic Mouse Model - Adoptive Transfer System
Donors: HA-specific TCR transgenic mice
a) CD4+: 6.5 (I-E^d HA 110-120)
b) CD8+: clone 4 (K^d HA 518-526)
Transferred cells: pooled splenocytes and lymph node cells
Recipients: HA-expressing transgenic mice (C3-HA Low, C3-HA High) and Non-Tg

Immune Tolerance & Immune Activation - in CD4+ T Cells
Diagram labels: Naive, Tolerance, Memory, Anergic/Regulatory, Activated/Memory

Immune Tolerance & Immune Activation - Dynamic genomic approach (with Affymetrix Gene Chips)
Diagram labels: Naive, Tolerance, Memory, Anergic/Regulatory, Activated/Memory; Day 2, Day 3, Day 4; RNA

Our Aims
