Experimenting the TextTiling Algorithm

Summary of the work done by master's students
at Université Toulouse Le Mirail
Adam C., Andreani V., Bengsston J., Bouchara N., Choucavy L.,
Delpech E., El Maarouf I., Fontan L., Gotlik W.

Experimenting the TextTiling algorithm
Part I: What is the TextTiling algorithm?
Part II: Experiments with the TextTiling algorithm
Part III: Demo

Part I: What is the TextTiling algorithm?
- « an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts »

- developed by Marti Hearst (1997): « TextTiling: Segmenting Text into Multi-Paragraph Subtopic Passages », Computational Linguistics, March 1997.
http://www.ischool.berkeley.edu/~hearst/tiling-about.html

Why segment a text into multi-paragraph units?
- Computational tasks that use arbitrary windows might benefit from using windows with motivated boundaries
- Ease of reading for long online texts (reading assistant tools)
- IR: retrieving relevant passages instead of whole documents
- Summarization: extracting sentences according to their position in the subtopic structure

What is the hypothesis behind TextTiling?

- « TextTiling assumes that a set of lexical items is in use during the course of a given subtopic discussion, and when that subtopic changes, a significant proportion of the vocabulary changes as well »
- TextTiling doesn't detect subtopics per se, but shifts in topic, by means of changes in vocabulary
- It performs a linear segmentation (no hierarchy)

Detection of topic shift
Raw text → Tokenisation → Segmentation into pseudo-sentences (20 tokens) → Similarity score (block A vs block B)

- a similarity score is computed at every pseudo-sentence gap, between 2 blocks of 6 pseudo-sentences
- the more vocabulary in common, the higher the score
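The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the students' Perl implementation: the function names, the regular-expression tokeniser and the handling of shorter blocks near the start and end of the text are all assumptions, and no stop-word filtering is applied.

```python
import math
import re
from collections import Counter


def pseudo_sentences(text, size=20):
    """Tokenise on word characters and cut the token stream into fixed-size chunks."""
    tokens = re.findall(r"\w+", text.lower())
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def cosine(a, b):
    """Cosine similarity between two token-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values()) * sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def gap_scores(sentences, block=6):
    """Return one similarity score per gap between consecutive pseudo-sentences."""
    scores = []
    for gap in range(1, len(sentences)):
        # Blocks are simply shorter near the text edges (an assumption of this sketch).
        left = Counter(t for s in sentences[max(0, gap - block):gap] for t in s)
        right = Counter(t for s in sentences[gap:gap + block] for t in s)
        scores.append(cosine(left, right))
    return scores


# Toy text: the vocabulary changes completely halfway through.
toy = "cat cat cat cat dog dog dog dog"
print(gap_scores(pseudo_sentences(toy, size=2), block=2))
```

On the toy text, the middle gap separates two blocks with no word in common, so its score is 0, while the neighbouring gaps still share some vocabulary.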

Detection of topic shift
[Figure: similarity score (y-axis) plotted against pseudo-sentence number (x-axis, 1 to 39)]

- a gap means there is a drop in vocabulary similarity
- topic shifts occur at the deepest gaps (after smoothing)
- tile boundaries will be adjusted to the nearest paragraph break
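One way to make "deepest gaps" concrete is a depth score: for each gap, climb to the nearest peak on each side and add up the two drops. The sketch below assumes that definition; the cutoff used to select boundaries (mean minus half a standard deviation of the depth scores) and the sample scores are illustrative assumptions, and the final adjustment to paragraph breaks is left out.

```python
from statistics import mean, pstdev


def depth_scores(scores):
    """Depth of each gap: drop from the nearest left peak plus drop from the nearest right peak."""
    depths = []
    for i, s in enumerate(scores):
        left = s
        for v in reversed(scores[:i]):   # climb leftwards while the curve keeps rising
            if v < left:
                break
            left = v
        right = s
        for v in scores[i + 1:]:         # climb rightwards while the curve keeps rising
            if v < right:
                break
            right = v
        depths.append((left - s) + (right - s))
    return depths


def boundaries(scores):
    """Indices of gaps whose depth exceeds an assumed cutoff (mean - stdev / 2)."""
    depths = depth_scores(scores)
    cutoff = mean(depths) - pstdev(depths) / 2
    return [i for i, d in enumerate(depths) if d > cutoff]


# Made-up similarity scores with one marked valley at index 2.
sample = [0.8, 0.9, 0.3, 0.85, 0.8, 0.7, 0.75]
print(depth_scores(sample))
print(boundaries(sample))
```

With so few gaps the cutoff is very permissive; on a real text the threshold would need tuning, just like the other parameters.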

Evaluation by Hearst (1997)
- Evaluation on 12 magazine articles annotated by 7 judges

- Judges were asked « to mark the paragraph boundary at which the topic changed »

- In case of disagreement among judges, a boundary is kept if at least 3 judges agree on it

- Agreement among judges (kappa measure): kappa = 0.647
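The "at least 3 judges" rule amounts to a simple vote count. A minimal sketch, assuming each judge's annotation is a list of paragraph indices (the function name and the sample data are hypothetical):

```python
from collections import Counter


def consensus_boundaries(judgements, min_votes=3):
    """Keep the boundaries marked by at least `min_votes` judges."""
    # set(marks) ensures a judge cannot vote twice for the same boundary.
    votes = Counter(b for marks in judgements for b in set(marks))
    return sorted(b for b, n in votes.items() if n >= min_votes)


# Seven hypothetical judges marking paragraph breaks.
judges = [[2, 5], [2, 6], [2, 5, 9], [5], [3], [], [9]]
print(consensus_boundaries(judges))
```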

Evaluation by Hearst (1997)
                    Precision   Recall
Baseline (random)   0.43        0.42
TextTiler           0.66        0.61
Judges              0.81        0.71

Works well on long (1,800+ words) expository texts with
little structural demarcation

Part II: Experiments with the TextTiling algorithm
- Work done by master's students, Université Toulouse Le Mirail

- Implementation in Perl
- Experiments:
  - cross-annotation of 3 texts
  - variation of:
    - linguistic parameters
    - computation parameters

Annotation of topic boundaries
- No clear-cut topic shifts, rather ‘regions’ of shift
- Annotators felt a smaller unit (the sentence) would have been more convenient

- Our kappa: 0.56
- Hearst's judges: 0.65

- kappa should be at least 0.67; above 0.8 is best

- A difficult (unnatural?) task for humans

Variation of linguistic parameters
[Figure: bar chart of precision, recall and F-measure for three configurations: basic, trigrams, lemmatization (TreeTagger*). Bar labels: 0.61, 0.58, 0.53, 0.35, 0.34, 0.26, 0.23, 0.25, 0.17]
* http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/
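
Precision, recall and F-measure for a segmenter can be computed by comparing its boundaries with the reference annotation. The sketch below assumes boundaries are paragraph indices and that only exact matches count; how the students matched boundaries is not stated, so this is an illustration rather than their procedure.

```python
def evaluate(predicted, reference):
    """Return (precision, recall, F-measure) for two collections of boundary positions."""
    predicted, reference = set(predicted), set(reference)
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical boundaries: 2 of the 3 predictions match the 4 reference boundaries.
p, r, f = evaluate([3, 7, 12], [3, 12, 20, 25])
print(round(p, 2), round(r, 2), round(f, 2))
```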

Variation of computation parameters
- Computation window:
  - pseudo-sentence length
  - block length

- Smoothing
[Figure: three score curves plotted along the text, illustrating smoothing]

Size of computation window
Rows: pseudo-sentence length. Columns: block length.

       2     4     6     8     10    12    14    16    18    20
5      ++    +++   ++    ++    ++    ++    ++    ++    ++    ++
10     ++    ++    ++    +     +     ++    +     +     +     +
15     ++    +     +     +     +     +     +     -     -     -
20     +     +     +     -     -     -     -     -     -     --
25     +     +     -     -     -     -     -     --    --    --
30     +     -     -     -     -     --    --    --    --    --
35     +     -     -     -     -     --    --    --    --    --
40     --    --    --    --    --    --    --    --    --    --

Correlation window size / smoothing

Window size (number of tokens)   10   20   30   40   50
Smoothing iteration              3    3    1    1    1
Smoothing width                  2    1    2    2    1

- Correlation between window size and smoothing: the smaller the window, the more smoothing is needed
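
Smoothing with a width and a number of iterations can be sketched as a repeated moving average. The interpretation of "width" as the number of neighbours taken on each side is an assumption of this sketch, as is the shrinking of the window at the ends of the curve.

```python
def smooth(scores, width=2, iterations=1):
    """Apply a moving average `iterations` times, using `width` neighbours on each side."""
    out = list(scores)
    for _ in range(iterations):
        prev = out
        out = []
        for i in range(len(prev)):
            # The window is truncated at the ends of the curve.
            window = prev[max(0, i - width):i + width + 1]
            out.append(sum(window) / len(window))
    return out


# A noisy made-up curve: more iterations flatten the small fluctuations.
noisy = [0.6, 0.2, 0.7, 0.3, 0.8, 0.1, 0.7]
print(smooth(noisy, width=1, iterations=1))
print(smooth(noisy, width=1, iterations=3))
```

Small windows produce jagged similarity curves, which is consistent with the table: the narrowest windows are the ones that needed three smoothing passes.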

Optimal parameters set

         Nb parag.   Nb words   Words/parag.   Sentences/block   Tokens/sentence   Smooth. iteration   Smooth. width
Text 1   12          2000       167            6                 5                 3                   2
Text 2   22          2400       109            6                 10                1                   1
Text 3   37          1750       20             8                 10                1                   1

- One optimal parameter set per text
- Does the optimal set vary according to text/paragraph length?

Final thoughts
- Linguistic processing:
  - lemmatization doesn't significantly improve TextTiling
  - what about stemming?

- Computation parameters:
  - parameters are highly interdependent
  - the optimal parameter set varies from text to text

- Proposal: an adaptive TextTiler?
  - window size could be adapted to the intrinsic qualities of the text
  - smoothing could then be adapted to window size

Similarity score – Hearst (1997)

$$\mathrm{Sim}(b_1, b_2) = \frac{\sum_t w_{t,b_1}\, w_{t,b_2}}{\sqrt{\sum_t w_{t,b_1}^2 \; \sum_t w_{t,b_2}^2}}$$

b1: block 1
b2: block 2
t: token
w: weight (frequency) of the token in the block
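
As a worked illustration with made-up counts (not taken from the experiments): suppose block 1 contains "cat" twice and "dog" once, and block 2 contains "cat" once and "fish" once. Only "cat" is shared, so the numerator reduces to a single product:

```latex
\mathrm{Sim}(b_1, b_2)
  = \frac{2 \cdot 1}{\sqrt{(2^2 + 1^2)\,(1^2 + 1^2)}}
  = \frac{2}{\sqrt{10}}
  \approx 0.63
```

Two blocks with no shared token would score 0, and two blocks with identical frequency profiles would score 1.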

Kappa measure
http://www.musc.edu/dc/icrebm/kappa.html

               Annot 1 yes   Annot 1 no   TOTAL
Annot 2 yes    40            35           Y2 = 75
Annot 2 no     5             20           N2 = 25
TOTAL          Y1 = 45       N1 = 55      T = 100

Agreement:

$$P(A) = \frac{40 + 20}{100} = 0.6$$

Expected agreement:

$$P(E) = \frac{Y_1 Y_2 + N_1 N_2}{T^2} = 0.475$$

Kappa:

$$\kappa = \frac{P(A) - P(E)}{1 - P(E)} = 0.24$$
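
The same computation can be checked in Python. This is a small sketch for two annotators and a yes/no decision; the function and argument names are chosen for illustration.

```python
def kappa(yes_yes, yes_no, no_yes, no_no):
    """Kappa for a 2x2 table; the first word of each argument is annotator 2, the second annotator 1."""
    total = yes_yes + yes_no + no_yes + no_no
    p_agree = (yes_yes + no_no) / total
    y1, n1 = yes_yes + no_yes, yes_no + no_no   # annotator 1 totals (columns)
    y2, n2 = yes_yes + yes_no, no_yes + no_no   # annotator 2 totals (rows)
    p_expected = (y1 * y2 + n1 * n2) / total ** 2
    return (p_agree - p_expected) / (1 - p_expected)


# Counts from the table above: 40, 35, 5 and 20.
print(round(kappa(40, 35, 5, 20), 2))  # 0.24
```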
