Ch2006slide
- 1. 1
Hilofumi Yamamoto
December 14th 2006
- 6. 6
-
octopus
(Crystal, 1984: 18)
(Stubbs, 2001: 198)
(intertextuality)
- 8. 8
•
(Crystal, 1984)
•
(Stede 1999)
•
(Voloshinov, 1973)
(Goddard, 1998)
- 10. 10
Schramm
10th-century field of experience
20th-century expert's field of experience
R = CT − OP
20th-century general reader's field of experience
- 11. 11
:
298— (1982)
- 13. 13
1.
2. idf
→
3. →
4.
5.
6. (cw)
7. cw
- 16. 16
Reality
↓
Abstraction
hit
John Mary
let
Elaboration Sally
- 18. 18
$w(t, d) = (1 + \log \mathrm{tf}(t, d)) \cdot \mathrm{idf}(t)$
$cw(t_1, t_2, d) = (1 + \log \mathrm{ctf}(t_1, t_2, d)) \sqrt{\mathrm{idf}(t_1) \, \mathrm{idf}(t_2)}$
$\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}$
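The three formulas on this slide can be sketched in Python as follows. This is a minimal sketch, not the author's implementation: the function names are made up for illustration, the logarithm is assumed to be the natural log (which agrees with the worked idf values on the following slides), and the square root is assumed to cover the product of the two idf values.

```python
import math


def idf(n_docs: int, df: int) -> float:
    """idf(t) = log(N / df(t)); natural log is an assumption."""
    return math.log(n_docs / df)


def term_weight(tf: int, idf_t: float) -> float:
    """w(t, d) = (1 + log tf(t, d)) * idf(t); requires tf >= 1."""
    return (1 + math.log(tf)) * idf_t


def cooccurrence_weight(ctf: int, idf_t1: float, idf_t2: float) -> float:
    """cw(t1, t2, d) = (1 + log ctf) * sqrt(idf(t1) * idf(t2)); requires ctf >= 1."""
    return (1 + math.log(ctf)) * math.sqrt(idf_t1 * idf_t2)
```

With `tf = 1` or `ctf = 1` the log term vanishes, so the weight reduces to the idf itself or to the geometric mean of the two idf values.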
- 19. 19
Inverse Document Frequency
Spärck Jones (1972)
$\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}$
$\mathrm{idf}(\mathit{iru}) = \log \frac{N}{\mathrm{df}(\mathit{iru})}$ (1)
$= \log \frac{10000}{4383}$ (2)
$= \log 2.281542\ldots$ (3)
$= 0.824851\ldots$ (4)
- 20. 20
Inverse Document Frequency
Spärck Jones (1972)
$\mathrm{idf}(t) = \log \frac{N}{\mathrm{df}(t)}$
$\mathrm{idf}(\mathit{uguisu}) = \log \frac{N}{\mathrm{df}(\mathit{uguisu})}$ (5)
$= \log \frac{10000}{239}$ (6)
$= \log 41.841\ldots$ (7)
$= 3.733877\ldots$ (8)
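The slides write "log" without naming a base. A quick check, using only the counts shown for uguisu (N = 10000, df = 239), suggests which base reproduces the printed value. The variable names are illustrative.

```python
import math

N = 10_000        # number of documents, from the slide
df_uguisu = 239   # document frequency of "uguisu", from the slide
slide_value = 3.733877

ratio = N / df_uguisu
candidates = {
    "natural": math.log(ratio),
    "base 2": math.log2(ratio),
    "base 10": math.log10(ratio),
}

# Pick the base whose result lies closest to the value printed on the slide.
best = min(candidates, key=lambda name: abs(candidates[name] - slide_value))
print(best, round(candidates[best], 6))
```

Under these numbers only the natural logarithm lands on the slide's figure, so the other sketches in this transcript assume natural log as well.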
- 21. 21
[Figure: the number of co-occurrence patterns (y-axis, 0 to 800) against co-occurrence weight (cw) (x-axis, 2 to 12) for warbler, cuckoo, plum, and cherry]
- 22. 22
[Figure: the number of co-occurrence patterns (y-axis, 0 to 25000) against co-occurrence weight (cw) (x-axis, 5 to 20) for warbler, cuckoo, plum, and cherry]
- 23. 23
high cw
KEY CT BG-01-5620-02-130 23 229 3.73
rank  cw     ctf  idf(t1)  tf(t1)  idf(t2)  tf(t2)
1     19.18  9    8.52     10      4.23     9
2     18.71  56   3.71     56      3.73     229
3     18.62  10   3.73     229     8.52     10
4     18.17  35   3.73     229     4.26     35
5     17.98  145  3.73     229     2.42     152
6     17.72  6    5.99     10      6.72     6
7     17.32  88   2.68     88      3.73     229
8     17.00  62   2.94     62      3.73     229
9     16.80  10   5.66     10      4.58     10
10    16.59  10   8.52     10      2.96     11
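As a consistency check, the cw column can be recomputed from the ctf and idf columns of the table. The sketch below assumes natural log and a square root over the idf product, and the helper name is hypothetical. Because the idf values in the table are rounded to two decimals, small gaps are expected.

```python
import math

# (rank, listed cw, ctf, idf(t1), idf(t2)) copied from the high-cw table
rows = [
    (1, 19.18, 9, 8.52, 4.23),
    (2, 18.71, 56, 3.71, 3.73),
    (3, 18.62, 10, 3.73, 8.52),
    (4, 18.17, 35, 3.73, 4.26),
    (5, 17.98, 145, 3.73, 2.42),
    (6, 17.72, 6, 5.99, 6.72),
    (7, 17.32, 88, 2.68, 3.73),
    (8, 17.00, 62, 2.94, 3.73),
    (9, 16.80, 10, 5.66, 4.58),
    (10, 16.59, 10, 8.52, 2.96),
]


def cooccurrence_weight(ctf: int, idf_t1: float, idf_t2: float) -> float:
    """(1 + ln ctf) * sqrt(idf(t1) * idf(t2)); the log base is an assumption."""
    return (1 + math.log(ctf)) * math.sqrt(idf_t1 * idf_t2)


# Largest absolute gap between the recomputed and the listed cw.
max_gap = max(
    abs(cooccurrence_weight(ctf, i1, i2) - listed)
    for _, listed, ctf, i1, i2 in rows
)
print(f"largest gap: {max_gap:.3f}")
```

Every row agrees with the listed cw to within rounding, which supports reading the formula with the square root applied to the product of the two idf values.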
- 24. 24
low cw
KEY CT BG-01-5620-02-130 23 229 3.73
rank   cw    ctf  idf(t1)  tf(t1)  idf(t2)  tf(t2)
10962  1.56  1    1.33     50      1.83     35
10963  1.55  1    1.67     11      1.44     43
10964  1.53  1    2.07     8       1.13     75
10965  1.52  1    1.33     50      1.75     21
10966  1.49  1    1.67     11      1.33     50
10967  1.48  1    2.56     9       0.86     33
10968  1.48  1    1.31     44      1.67     11
10969  1.37  1    1.13     75      1.67     11
10970  1.33  1    0.86     33      2.07     11
10971  1.20  1    1.67     11      0.86     33
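Every pair in this table has ctf = 1, so the log term of the cw formula is zero and cw reduces to the geometric mean of the two idf values. The worked line below uses the values listed for rank 10962 and assumes the square root covers the idf product.

```latex
\[
cw(t_1, t_2, d) = (1 + \log 1)\,\sqrt{\mathrm{idf}(t_1)\,\mathrm{idf}(t_2)}
                = \sqrt{\mathrm{idf}(t_1)\,\mathrm{idf}(t_2)},
\qquad
\sqrt{1.33 \times 1.83} \approx 1.56
\]
```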
- 27. 27
OP( ) CT( )
OP OP ∩ CT CT
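If OP and CT are treated as sets of words, the regions of this diagram and the earlier relation R = CT − OP map onto ordinary set operations. The sketch below is an illustration under that assumption only. The word sets are hypothetical placeholders, not data from the slides, and reading the minus sign as set difference is an interpretation.

```python
# Hypothetical word sets; the actual OP and CT vocabularies are not in the slides.
op_words = {"warbler", "plum", "spring"}
ct_words = {"warbler", "plum", "spring", "sings", "blossom"}

shared = op_words & ct_words     # OP ∩ CT: words found in both
residual = ct_words - op_words   # R = CT − OP: words found only in CT
op_only = op_words - ct_words    # words found only in OP

print(sorted(shared), sorted(residual), sorted(op_only))
```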