2. Learning Objectives
• To use nonparametric tests in hypothesis testing
when no assumptions can be made about the
distribution being sampled
• To know which distribution-free tests are
used in particular circumstances
• To use and interpret six types of
nonparametric hypothesis tests
• To know the advantages and disadvantages of
nonparametric tests
3. Parametric vs Nonparametric
Statistics
• Parametric statistics are statistical techniques
based on assumptions about the population
from which the sample data are collected.
– Assumption that the data being analyzed are
randomly selected from a normally
distributed population.
– Requires quantitative measurement that
yields interval- or ratio-level data.
4. Parametric vs Nonparametric
Statistics
• Nonparametric statistics are based on fewer
assumptions about the population and its
parameters.
– Sometimes called "distribution-free"
statistics.
– A variety of nonparametric statistics are
available for use with nominal- or
ordinal-level data.
5. Advantages of Nonparametric Techniques
• Sometimes there is no parametric alternative
to the use of a nonparametric technique.
• Certain nonparametric tests can be used
to analyze nominal data.
• Certain nonparametric tests can be used
to analyze ordinal data.
• The computations for nonparametric statistics are less
complicated than those for parametric methods, particularly for
small samples.
• Probability statements obtained from
most nonparametric tests are
exact probabilities.
6. Disadvantages of Nonparametric
Statistics
• Nonparametric tests can be wasteful of data
if parametric tests are available for use with the
data.
• Nonparametric tests are usually not as
widely used or as well known
as parametric tests.
• For large samples, the calculations for
many nonparametric tests can be
confusing.
8. Runs Test
• Test for randomness - is the order or sequence of
observations in a sample random or not
• Each sample item possesses one of two possible
characteristics
• Run - a succession of observations which possess
the same characteristic
• Example with two runs: F, F, F, F, F, F, F, F, M,
M, M, M, M, M, M
• Example with fifteen runs: F, M, F, M, F, M, F,
M, F, M, F, M, F, M, F
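As a minimal sketch of how runs are counted, the Python snippet below tallies the runs in the two example sequences above. The function name `count_runs` is an illustrative choice and does not come from the slides.

```python
def count_runs(sequence):
    """Count runs: maximal stretches of identical consecutive observations."""
    runs = 0
    previous = object()  # sentinel that never equals a real observation
    for item in sequence:
        if item != previous:  # a change of characteristic starts a new run
            runs += 1
            previous = item
    return runs


# The two example sequences from the slide: 8 F then 7 M, and 15 alternating items.
two_runs = ["F"] * 8 + ["M"] * 7
fifteen_runs = ["F" if i % 2 == 0 else "M" for i in range(15)]

print(count_runs(two_runs))      # 2
print(count_runs(fifteen_runs))  # 15
```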
9. Runs Test: Sample Size
Consideration
• Sample size: n
• Number of sample members possessing
the first characteristic: n1
• Number of sample members possessing
the second characteristic: n2
• n = n1 + n2
• If both n1 and n2 are ≤ 20, the small
sample runs test is appropriate.
10. Runs Test: Small Sample Example
H0: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.
α = .05
n1 = 18
n2 = 8
If 7 ≤ R ≤ 17, do not reject H0
Otherwise, reject H0.
Runs 1 to 12: D CCCCC D CC D CCCC D C D CCC DDD CCC
R = 12
Since 7 ≤ R = 12 ≤ 17, do not reject H0
11. Runs Test: Large Sample
If either n1 or n2 is > 20, the sampling distribution of R is
approximately normal.
$$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$
$$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$
$$Z = \frac{R - \mu_R}{\sigma_R}$$
12. Runs Test: Large Sample Example
H0: The observations in the sample are randomly generated.
Ha: The observations in the sample are not randomly generated.
α = .05
n1 = 40
n2 = 10
If -1.96 ≤ Z ≤ 1.96, do not reject H0
Otherwise, reject H0.
Runs 1 to 13: NNN F NNNNNNN F NN FF NNNNNN F NNNN F NNNNN FFFF NNNNNNNNNNNN
R = 13
13. Runs Test: Large Sample Example
$$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(40)(10)}{40 + 10} + 1 = 17$$
$$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(40)(10)[2(40)(10) - (40) - (10)]}{(40 + 10)^2 (40 + 10 - 1)}} = 2.213$$
$$Z = \frac{R - \mu_R}{\sigma_R} = \frac{13 - 17}{2.213} = -1.81$$
-1.96 ≤ Z = -1.81 ≤ 1.96, do not reject H0
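The arithmetic of this example can be checked with a short Python sketch. It applies the large-sample mean, standard deviation, and Z formulas for the runs test to n1 = 40, n2 = 10, R = 13; the function name `runs_test_z` is an illustrative assumption.

```python
import math


def runs_test_z(n1, n2, runs):
    """Large-sample runs test: return (mu_R, sigma_R, Z)."""
    total = n1 + n2
    mu = 2 * n1 * n2 / total + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (total ** 2 * (total - 1))
    sigma = math.sqrt(variance)
    z = (runs - mu) / sigma
    return mu, sigma, z


# Values taken from the example: n1 = 40, n2 = 10, R = 13.
mu_r, sigma_r, z = runs_test_z(40, 10, 13)
print(f"mu_R = {mu_r:.0f}, sigma_R = {sigma_r:.3f}, Z = {z:.2f}")

# Two-tailed decision at alpha = .05 (critical values -1.96 and 1.96).
reject_h0 = abs(z) > 1.96
print("reject H0" if reject_h0 else "do not reject H0")
```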
15. Mann-Whitney U Test
• Nonparametric counterpart of the t test for
independent samples
• Does not require normally distributed populations
• May be applied to ordinal data
• Assumptions
– Independent Samples
– At Least Ordinal Data
16. Mann-Whitney U Test:
Sample Size Consideration
• Size of sample one: n1
• Size of sample two: n2
• If both n1 and n2 are ≤ 10, the small sample
procedure is appropriate.
• If either n1 or n2 is greater than 10, the large
sample procedure is appropriate.
17. Mann-Whitney U Test:
Small Sample Example
H0: The health service population is identical to the
educational service population on employee compensation
Ha: The health service population is not identical to
the educational service population on employee compensation

Health Service    Educational Service
20.10             26.19
19.80             23.88
22.36             25.50
18.75             21.64
21.90             24.85
22.96             25.30
20.75             24.12
                  23.45
18. Mann-Whitney U Test:
Small Sample Example
α = .05
If the final p-value < .05, reject H0.

Compensation    Rank    Group
18.75           1       H
19.80           2       H
20.10           3       H
20.75           4       H
21.64           5       E
21.90           6       H
22.36           7       H
22.96           8       H
23.45           9       E
23.88           10      E
24.12           11      E
24.85           12      E
25.30           13      E
25.50           14      E
26.19           15      E

W1 = 1 + 2 + 3 + 4 + 6 + 7 + 8 = 31
W2 = 5 + 9 + 10 + 11 + 12 + 13 + 14 + 15 = 89
19. Mann-Whitney U Test:
Small Sample Example
$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - W_1 = (7)(8) + \frac{(7)(8)}{2} - 31 = 53$$
$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - W_2 = (7)(8) + \frac{(8)(9)}{2} - 89 = 3$$
Since U2 < U1, U = 3.
p-value = .0011 < .05, reject H0.
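The rank sums and U values of this example can be reproduced with a small Python sketch. It ranks the combined compensation figures and applies U = n1·n2 + n(n + 1)/2 − W to each group. The helper name `rank_sums` is hypothetical, and the ranking assumes no tied values, which holds for these data. The p-value is read from a table and is not computed here.

```python
# Compensation data from the small-sample example.
health = [20.10, 19.80, 22.36, 18.75, 21.90, 22.96, 20.75]
educational = [26.19, 23.88, 25.50, 21.64, 24.85, 25.30, 24.12, 23.45]


def rank_sums(group1, group2):
    """Rank the pooled values (1 = smallest) and sum ranks per group.

    Assumes no ties; tied values would need averaged ranks.
    """
    pooled = sorted([(v, 1) for v in group1] + [(v, 2) for v in group2])
    w1 = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == 1)
    w2 = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == 2)
    return w1, w2


n1, n2 = len(health), len(educational)
w1, w2 = rank_sums(health, educational)

u1 = n1 * n2 + n1 * (n1 + 1) // 2 - w1
u2 = n1 * n2 + n2 * (n2 + 1) // 2 - w2
u = min(u1, u2)  # the smaller value is the test statistic

print(w1, w2)     # 31 89
print(u1, u2, u)  # 53 3 3
```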
20. Mann-Whitney U Test:
Formulas for Large Sample Case
$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - W_1$$
$$\mu_U = \frac{n_1 \cdot n_2}{2}$$
$$\sigma_U = \sqrt{\frac{n_1 \cdot n_2 (n_1 + n_2 + 1)}{12}}$$
$$Z = \frac{U - \mu_U}{\sigma_U}$$
where: n1 = number in group 1
n2 = number in group 2
W1 = sum of the ranks of values in group 1
21. Incomes of PBS
and Non-PBS Viewers
Ho: The incomes for PBS viewers and non-PBS viewers are
identical
Ha: The incomes for PBS viewers and non-PBS viewers are not
identical
α = .05
n1 = 14
n2 = 13
If Z < −1.96 or Z > 1.96, reject Ho

PBS       Non-PBS
24,500    41,000
39,400    32,500
36,800    33,000
44,300    21,000
57,960    40,500
32,000    32,400
61,000    16,000
34,000    21,500
43,500    39,500
55,000    27,600
39,000    43,500
62,500    51,900
61,400    27,800
53,000
22. Ranks of Income from Combined
Groups of PBS and Non-PBS Viewers

Income    Rank    Group
16,000    1       Non-PBS
21,000    2       Non-PBS
21,500    3       Non-PBS
24,500    4       PBS
27,600    5       Non-PBS
27,800    6       Non-PBS
32,000    7       PBS
32,400    8       Non-PBS
32,500    9       Non-PBS
33,000    10      Non-PBS
34,000    11      PBS
36,800    12      PBS
39,000    13      PBS
39,400    14      PBS
39,500    15      Non-PBS
40,500    16      Non-PBS
41,000    17      Non-PBS
43,000    18      PBS
43,500    19.5    PBS
43,500    19.5    Non-PBS
51,900    21      Non-PBS
53,000    22      PBS
55,000    23      PBS
57,960    24      PBS
61,000    25      PBS
61,400    26      PBS
62,500    27      PBS
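As a rough sketch of how the large-sample procedure would proceed from this rank table, the Python snippet below sums the PBS ranks exactly as listed above and applies the large-sample formulas for U, its mean, its standard deviation, and Z. The variable names are illustrative, and the resulting figures are computed here rather than quoted from the slides.

```python
import math

# Ranks of the 14 PBS incomes, copied from the combined ranking table.
pbs_ranks = [4, 7, 11, 12, 13, 14, 18, 19.5, 22, 23, 24, 25, 26, 27]
n1 = len(pbs_ranks)  # PBS viewers
n2 = 13              # non-PBS viewers

w1 = sum(pbs_ranks)                       # W1: sum of ranks in group 1
u = n1 * n2 + n1 * (n1 + 1) / 2 - w1      # U = n1*n2 + n1(n1+1)/2 - W1
mu_u = n1 * n2 / 2                        # mean of U
sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation of U
z = (u - mu_u) / sigma_u

print(f"W1 = {w1}, U = {u}, Z = {z:.2f}")

# Two-tailed rule at alpha = .05: reject Ho if Z < -1.96 or Z > 1.96.
reject_ho = z < -1.96 or z > 1.96
print("reject Ho" if reject_ho else "do not reject Ho")
```

Under these ranks Z falls below −1.96, so the stated decision rule would reject Ho.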
26. Wilcoxon Matched-Pairs
Signed Rank Test
• A nonparametric alternative to the t test for related
samples
• Before and After studies
• Studies in which measures are taken on the same
person or object under different conditions
• Studies of twins or other relatives
27. Wilcoxon Matched-Pairs
Signed Rank Test
• Differences of the scores of the two matched
samples
• Differences are ranked, ignoring the sign
• Ranks are given the sign of the difference
• Positive ranks are summed
• Negative ranks are summed
• T is the smaller sum of ranks
28. Wilcoxon Matched-Pairs Signed
Rank Test: Sample Size
Consideration
• n is the number of matched pairs
• If n > 15, T is approximately normally
distributed, and a Z test is used.
• If n ≤ 15, a special “small sample” procedure is
followed.
• Assumptions
– The paired data are randomly selected.
– The underlying distributions are symmetrical.
29. Wilcoxon Matched-Pairs Signed
Rank Test: Small Sample Example
H0: Md = 0
Ha: Md ≠ 0
n = 6
α = 0.05
If Tobserved ≤ 1, reject H0.

Family Pair    Pittsburgh    Oakland
1              1,950         1,760
2              1,840         1,870
3              2,015         1,810
4              1,580         1,660
5              1,790         1,340
6              1,925         1,765
30. Wilcoxon Matched-Pairs Signed
Rank Test: Small Sample Example

Family Pair    Pittsburgh    Oakland    d      Rank
1              1,950         1,760      190    +4
2              1,840         1,870      -30    -1
3              2,015         1,810      205    +5
4              1,580         1,660      -80    -2
5              1,790         1,340      450    +6
6              1,925         1,765      160    +3

T = minimum(T+, T-)
T+ = 4 + 5 + 6 + 3 = 18
T- = 1 + 2 = 3
T = 3
T = 3 > Tcrit = 1, do not reject H0.
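The steps of this example (difference, rank by absolute value, re-attach the sign, sum each side, take the smaller sum) can be sketched in Python as follows. Variable names are illustrative. The ranking assumes no zero differences and no tied absolute differences, which is true for these six pairs, and the critical value of 1 is taken from the slide.

```python
# Paired data from the example, one entry per family pair.
pittsburgh = [1950, 1840, 2015, 1580, 1790, 1925]
oakland = [1760, 1870, 1810, 1660, 1340, 1765]

# Step 1: differences of the matched scores.
diffs = [p - o for p, o in zip(pittsburgh, oakland)]

# Step 2: rank differences by absolute value (1 = smallest).
# Assumes no zeros and no ties among |d|.
ordered = sorted(diffs, key=abs)

# Steps 3-5: sum the ranks of positive and negative differences separately.
t_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
t_minus = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)

# Step 6: T is the smaller of the two sums.
t = min(t_plus, t_minus)

T_CRIT = 1  # critical value stated on the slide for n = 6, alpha = 0.05
reject_h0 = t <= T_CRIT

print(diffs)               # [190, -30, 205, -80, 450, 160]
print(t_plus, t_minus, t)  # 18 3 3
print("reject H0" if reject_h0 else "do not reject H0")
```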
31. Wilcoxon Matched-Pairs Signed
Rank Test: Large Sample
Formulas
$$\mu_T = \frac{n(n + 1)}{4}$$
$$\sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$$
$$Z = \frac{T - \mu_T}{\sigma_T}$$
where: n = number of pairs
T = total ranks for either + or - differences, whichever is less
36. Kruskal-Wallis Test
• A nonparametric alternative to one-way analysis
of variance
• May be used to analyze ordinal data
• No assumed population shape
• Assumes that the C groups are independent
• Assumes random selection of individual items
37. Kruskal-Wallis K Statistic
$$K = \frac{12}{n(n + 1)} \left( \sum_{j=1}^{C} \frac{T_j^2}{n_j} \right) - 3(n + 1)$$
where: C = number of groups
n = total number of items
T_j = total of ranks in a group
n_j = number of items in a group
K ≈ χ², with df = C - 1
38. Number of Patients per Day
per Physician in Three Organizational
Categories
Ho: The three populations are identical
Ha: At least one of the three populations is different
α = 0.05
df = C − 1 = 3 − 1 = 2
$$\chi^2_{.05, 2} = 5.991$$
If K > 5.991, reject Ho.

Two Partners    Three or More Partners    HMO
13              24                        26
15              16                        22
20              19                        31
18              22                        27
23              25                        28
                14                        33
                17
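A minimal Python sketch of the K statistic for these data follows. It assumes the flattened table reads as 5 observations for Two Partners, 7 for Three or More Partners, and 6 for HMO, gives tied values the average of their ranks, and applies the K formula without any tie correction. The helper name `average_ranks` is hypothetical, and the computed K is not quoted from the slides.

```python
def average_ranks(values):
    """Map each distinct value to its rank; tied values share the mean rank."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return {v: sum(p) / len(p) for v, p in positions.items()}


# Patients per day per physician; column assignment is an assumption
# based on the table layout.
groups = {
    "Two Partners": [13, 15, 20, 18, 23],
    "Three or More Partners": [24, 16, 19, 22, 25, 14, 17],
    "HMO": [26, 22, 31, 27, 28, 33],
}

all_values = [v for vals in groups.values() for v in vals]
n = len(all_values)
rank_of = average_ranks(all_values)

# T_j: total of ranks in each group.
rank_totals = {name: sum(rank_of[v] for v in vals) for name, vals in groups.items()}

# K = 12 / (n(n+1)) * sum(T_j^2 / n_j) - 3(n+1)
k = 12 / (n * (n + 1)) * sum(
    rank_totals[name] ** 2 / len(vals) for name, vals in groups.items()
) - 3 * (n + 1)

CHI2_CRIT = 5.991  # chi-square critical value for alpha = .05, df = 2 (from the slide)
print(rank_totals)
print(f"K = {k:.2f}", "-> reject Ho" if k > CHI2_CRIT else "-> do not reject Ho")
```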
42. Friedman Test
• A nonparametric alternative to the randomized
block design
• Assumptions
– The blocks are independent.
– There is no interaction between blocks and
treatments.
– Observations within each block can be ranked.
• Hypotheses
– Ho: The treatment populations are equal
– Ha: At least one treatment population
yields larger values than at least one
other treatment population
43. Friedman Test
$$\chi_r^2 = \frac{12}{bC(C + 1)} \sum_{j=1}^{C} R_j^2 - 3b(C + 1)$$
where: C = number of treatment levels (columns)
b = number of blocks (rows)
R_j = total ranks for a particular treatment level
j = particular treatment level
χ_r² ≈ χ², with df = C - 1
44. Friedman Test: Tensile Strength
of Plastic Housings
Ho: The supplier populations are equal
Ha: At least one supplier population yields larger
values than at least one other supplier population
Supplier 1 Supplier 2 Supplier 3 Supplier 4
Monday 62 63 57 61
Tuesday 63 61 59 65
Wednesday 61 62 56 63
Thursday 62 60 57 64
Friday 64 63 58 66
45. Friedman Test: Tensile Strength
of Plastic Housings
α = 0.05
df = C − 1 = 4 − 1 = 3
$$\chi^2_{.05, 3} = 7.81473$$
If $\chi_r^2$ > 7.81473, reject Ho.
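To illustrate the computation, the Python sketch below ranks the four suppliers within each day (block) of the tensile strength table, totals the ranks per supplier, and applies the Friedman formula. The names are illustrative, the within-block ranking assumes no ties (true for these data), and the resulting statistic is computed here rather than taken from the slides.

```python
# Tensile strength by day (block); columns are Supplier 1..4.
strength = {
    "Monday":    [62, 63, 57, 61],
    "Tuesday":   [63, 61, 59, 65],
    "Wednesday": [61, 62, 56, 63],
    "Thursday":  [62, 60, 57, 64],
    "Friday":    [64, 63, 58, 66],
}


def ranks_within_block(row):
    """Rank values within one block (1 = smallest). Assumes no ties."""
    ordered = sorted(row)
    return [ordered.index(v) + 1 for v in row]


ranked_rows = [ranks_within_block(row) for row in strength.values()]

b = len(ranked_rows)     # number of blocks (rows)
c = len(ranked_rows[0])  # number of treatment levels (columns)

# R_j: total of ranks for each supplier (column).
rank_totals = [sum(col) for col in zip(*ranked_rows)]

# chi_r^2 = 12 / (bC(C+1)) * sum(R_j^2) - 3b(C+1)
chi_r = 12 / (b * c * (c + 1)) * sum(r ** 2 for r in rank_totals) - 3 * b * (c + 1)

CHI2_CRIT = 7.81473  # critical value for alpha = 0.05, df = 3 (from the slide)
print(rank_totals)  # [14, 13, 5, 18]
print(f"chi_r^2 = {chi_r:.2f}", "-> reject Ho" if chi_r > CHI2_CRIT else "-> do not reject Ho")
```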
49. Spearman’s Rank Correlation
• Analyze the degree of association of two
variables
• Applicable to ordinal level data (ranks)
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
where: n = number of pairs being correlated
d = the difference in the ranks of each pair
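A minimal Python sketch of this formula is given below. The rank lists are hypothetical and exist only to exercise the function; they are not data from the slides. The shortcut formula assumes the ranks contain no ties.

```python
def spearman_rs(ranks_x, ranks_y):
    """Spearman's r_s = 1 - 6*sum(d^2) / (n(n^2 - 1)); assumes untied ranks."""
    if len(ranks_x) != len(ranks_y):
        raise ValueError("both rank lists must have the same length")
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks for five paired items (illustration only).
ranks_a = [1, 2, 3, 4, 5]
ranks_b = [2, 1, 3, 5, 4]

print(spearman_rs(ranks_a, ranks_a))        # identical rankings
print(spearman_rs(ranks_a, ranks_a[::-1]))  # fully reversed rankings
print(spearman_rs(ranks_a, ranks_b))        # partial agreement
```

Identical rankings give 1 and fully reversed rankings give −1, which is a quick sanity check on the sign and scaling of the formula.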