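7. Introduction to Chromosomal Microarray Analysis (2/2)
A gain or loss is a copy number variation (CNV)
Measurement for each probe signal: Log2(T/R)
T: measured DNA copy number
R = 2 for humans (diploid organisms)
[Figure: test sample compared with reference sample; regions of gain and loss are labeled]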
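As a small worked example of the Log2(T/R) measurement, the sketch below computes the log ratio for a few copy numbers against a diploid reference. The function names and the 0.3 calling threshold are assumptions made for illustration; they are not taken from the slides.

```python
import math

REFERENCE_COPIES = 2  # R = 2 for a diploid reference, as stated on the slide


def log2_ratio(test_copies, reference_copies=REFERENCE_COPIES):
    """Return Log2(T/R) for a single probe."""
    return math.log2(test_copies / reference_copies)


def classify(log_ratio, threshold=0.3):
    """Label a probe as gain, loss, or normal.

    The threshold is a hypothetical cutoff chosen only for this example.
    """
    if log_ratio > threshold:
        return "gain"
    if log_ratio < -threshold:
        return "loss"
    return "normal"


for copies in (1, 2, 3):
    ratio = log2_ratio(copies)
    print(copies, round(ratio, 3), classify(ratio))
```

A single-copy loss (T = 1) gives Log2(1/2) = -1, while a single-copy gain (T = 3) gives Log2(3/2), roughly 0.58, so gains and losses are not symmetric around zero.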
15. Normalization
LOWESS (LOcally WEighted Scatterplot Smoothing) regression
Ratio-Intensity (M-A) plot of raw data: M = log2(R/G); A = (log2(R) + log2(G)) / 2
(here R and G are the red and green channel intensities)
Same data set normalized by: M_norm = M - c(A), where c(A) is an intensity-dependent function estimated by local regression
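The normalization above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a full LOWESS implementation: it fits a tricube-weighted local line around each point and omits the robustness iterations of the original method. The function names and the `frac` parameter (fraction of points used per local fit) are assumptions of this sketch.

```python
import math


def ma_values(red, green):
    """Return (M, A) with M = log2(R/G) and A = (log2(R) + log2(G)) / 2."""
    m = [math.log2(r / g) for r, g in zip(red, green)]
    a = [(math.log2(r) + math.log2(g)) / 2 for r, g in zip(red, green)]
    return m, a


def local_fit(a, m, frac=0.5):
    """Estimate c(A) at every point with a tricube-weighted local linear fit."""
    n = len(a)
    k = max(2, math.ceil(frac * n))  # number of neighbours in each local fit
    fitted = []
    for a0 in a:
        dists = sorted(abs(ai - a0) for ai in a)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th neighbour
        w = [(1 - min(abs(ai - a0) / h, 1.0) ** 3) ** 3 for ai in a]
        sw = sum(w)
        mean_a = sum(wi * ai for wi, ai in zip(w, a)) / sw
        mean_m = sum(wi * mi for wi, mi in zip(w, m)) / sw
        sxx = sum(wi * (ai - mean_a) ** 2 for wi, ai in zip(w, a))
        sxy = sum(wi * (ai - mean_a) * (mi - mean_m)
                  for wi, ai, mi in zip(w, a, m))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(mean_m + slope * (a0 - mean_a))
    return fitted


def normalize(red, green, frac=0.5):
    """Return M_norm = M - c(A) for paired channel intensities."""
    m, a = ma_values(red, green)
    c = local_fit(a, m, frac)
    return [mi - ci for mi, ci in zip(m, c)]


# Synthetic intensities with a dye bias that grows with intensity.
green = [2 ** (6 + 0.5 * i) for i in range(20)]
red = [g ** 1.2 for g in green]
print(max(abs(v) for v in normalize(red, green)))
```

In practice a library routine would normally be preferred; the hand-written fit is used here only so that the example has no external dependencies.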
17. Algorithms for detecting aberrant signals
• Circular Binary Segmentation (CBS)
• HMMs
• Bayesian HMMs
• Kalman Filters
• Wavelet decompositions
• Quantile regression
• EM and edge filtering
• Lasso…….
CBS showed the best operational characteristics in terms of sensitivity and FDR for breakpoint detection.
Lai, W.R. et al. (2005) Bioinformatics, 21, 3763–3770.
18. The CBS (circular binary segmentation) algorithm (1/2)
Recursive change point algorithm: the change-points are the
genomic locations of copy number transitions
• H0: there is no change-point; H1: there
are change-points located at i and j
1, 2, 3, ..., i-1, i, i+1, ..., j-1, j, j+1, ..., n
1. Form the sequence of intensities
(Log ratio) into a circle by joining
the first and last probes
2. For all possible ways of dividing
up the circle into complementary
arcs, compute the t-test statistic
for a difference in means
between the two arcs
Olshen et al. (2004) Biostatistics, 5(4), 557-572; Bioinformatics (2007), 23(6), 657-663.
19. The CBS (circular binary segmentation) algorithm (2/2)
3. If the maximum of these test statistics exceeds its null
distribution critical value, segment the circle there
4. Repeat recursively for the segmented arcs until no more
significant segments can be found
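Steps 1 to 4 can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: the statistic is a standardized difference in means without a per-split variance estimate, the null distribution is approximated by permuting the probes, and the names `max_arc_statistic`, `cbs`, `alpha`, and `n_perm` are hypothetical. The exhaustive search is quadratic in the number of probes, so it is only suitable for short profiles.

```python
import math
import random


def max_arc_statistic(x):
    """Return (statistic, i, j) maximizing the standardized mean difference
    between the arc x[i:j] and the rest of the circle (steps 1 and 2)."""
    n = len(x)
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    total = prefix[-1]
    best = (0.0, 0, n)
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i
            if k == n:
                continue  # the arc must leave a non-empty complement
            arc_sum = prefix[j] - prefix[i]
            diff = arc_sum / k - (total - arc_sum) / (n - k)
            stat = abs(diff) / math.sqrt(1.0 / k + 1.0 / (n - k))
            if stat > best[0]:
                best = (stat, i, j)
    return best


def cbs(x, alpha=0.01, n_perm=200, rng=None, offset=0):
    """Return sorted change-point indices (start positions of new segments)."""
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    if len(x) < 2:
        return []
    observed, i, j = max_arc_statistic(x)
    shuffled = list(x)
    exceed = 0
    for _ in range(n_perm):  # permutation approximation of the null
        rng.shuffle(shuffled)
        if max_arc_statistic(shuffled)[0] >= observed:
            exceed += 1
    if exceed / n_perm >= alpha:
        return []  # step 3: maximum is not significant, keep segment whole
    cuts = [c for c in (i, j) if 0 < c < len(x)]
    bounds = [0] + cuts + [len(x)]
    found = [offset + c for c in cuts]
    for lo, hi in zip(bounds, bounds[1:]):
        # step 4: repeat recursively on each resulting segment
        found += cbs(x[lo:hi], alpha, n_perm, rng, offset + lo)
    return sorted(found)


# Noise-free profile: normal, then a gained region, then normal again.
profile = [0.0] * 10 + [1.0] * 10 + [0.0] * 10
print(cbs(profile))
```

When the best arc starts at the first probe (i = 0), only one cut falls inside the segment, which corresponds to a single change-point; otherwise both ends of the arc become change-points.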