Using an English noun phrase grammar defined by Hulth (2004a) as a starting point, we built an English noun phrase chunker to extract complex noun phrases from web-based articles. These phrases served as candidates for anchor texts linking articles within the About.com network of content sites. Prior to full-scale deployment, a group of annotators with domain authority in their respective fields evaluated articles that received these machine-generated anchor texts. Because of the limited time before deployment, we could not create a reference set of documents for comparing anchor texts among the annotators, and as a result we could not compute inter-labeler agreement. However, by treating the anchor text generator as another annotator, we could compute the average Cohen's Kappa coefficient (Landis and Koch, 1977) across all pairings of the generator with an annotator. Our approach showed a fair level of agreement on average (as described in Pustejovsky and Stubbs (2013, pp. 131–132)).
Evaluation of Anchor Texts for Automated Link Discovery in Semi-structured Web Documents
1. Evaluation of Anchor Texts for Automated Link Discovery in Semi-structured Web Documents
Na’im Tyson, Jon Roberts, Jeff Allen and Matt Lipson
Sciences, About.com
Novel Incentives for Collecting Data & Annotation from People
Tyson, Roberts, Allen, Lipson (About.com) Evaluation of Annotations of Anchor Texts LREC 2016 1 / 19
5. Purpose: Research Questions
Q: What do you do when you have little time and funding to annotate web pages?
A: Create an algorithm to annotate web pages with anchor texts.
Q: How do you measure quality and consistency between your algorithm and human annotators?
A: Create an evaluation framework to determine consistency between the annotations of the algorithm and the humans!
13. Introduction: What is About.com?
• Composition
  • Intent-driven website of two million articles divided into seven major verticals: food, health¹, home, money, style, tech and travel
  • Over 200 million monthly visits from the U.S., Western Europe and parts of India
• Content Structure
  • Content written by a large number of writers (experts) using a content-management system (CMS), with each expert having their own topic area
  • Nascent content can be linked to other articles with hypertext links ("inline links")
  • Inline links are necessary for user recirculation
  • Inline links have higher clicks per session than article listings (at the bottom of the page), trending articles and navigation units
¹ Top health content migrated to verywell.com.
14. Introduction: What makes Inline Links problematic?
• Experts hardly add links!
Figure 1: Histogram of link density of articles prior to the launch of automated link discovery.
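The link density plotted in Figure 1 can be computed per article. The deck does not define the exact metric, so the sketch below assumes a links-per-100-words definition; `link_density` is a hypothetical helper, not code from the study:

```python
import re

def link_density(html_body):
    """Inline links per 100 words of article body (assumed definition)."""
    links = len(re.findall(r"<a\s", html_body))
    text = re.sub(r"<[^>]+>", " ", html_body)      # strip markup
    words = len(re.findall(r"\b\w+\b", text))
    return 100.0 * links / max(words, 1)

print(round(link_density('<p>See <a href="/x">this guide</a> for more.</p>'), 1))  # → 20.0
```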
18. Introduction: What makes Inline Links problematic? (continued)
• Experts do not receive extra incentives for annotating anchor texts for inline links
• Producing quality inline links takes time
• Experts must know their own content and other neighboring content in order to link to it
• Experts are not compensated for directing traffic outside their site
20. Generating Anchor Texts: Making Anchor Texts from Keyword Extraction Algorithms
• Empirically-driven inline linking produces long sequences
Figure 2: Part-of-speech (POS) histogram of expert-generated anchor texts consisting of six words in full-text articles.
• TextRank [Mihalcea and Tarau, 2004], KEA [Witten et al., 1999] and Hulth [2004] produce sequences that are too short
26. Generating Anchor Texts: Making Anchor Texts using Chunk Parsing
• Starting point: generic grammar of POS sequences originally derived from Hulth (2004)
• Grammars used to identify candidates for anchor text suggestions
• Augmented with entity and datetime tags
• POS sequences expressed as chunk rules implemented in Python's NLTK
• Candidates selected based on a weighted sum of document-level features between source and target documents
• Weights based on existing expert links
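The chunk-rule step above can be sketched with NLTK's `RegexpParser`. The grammar below is a hypothetical single-rule fragment (optional adjectives followed by nouns), not the study's full Hulth-derived rule set with entity and datetime augmentations:

```python
from nltk import RegexpParser

# Hypothetical grammar fragment; the study's actual rule set is not reproduced here.
GRAMMAR = r"NP: {<JJ>*<NN.*>+}"  # optional adjectives, then one or more nouns

def candidate_anchor_texts(tagged_tokens):
    """Extract NP chunks from a POS-tagged sentence as anchor-text candidates."""
    tree = RegexpParser(GRAMMAR).parse(tagged_tokens)
    return [" ".join(word for word, _ in subtree.leaves())
            for subtree in tree.subtrees() if subtree.label() == "NP"]

tagged = [("the", "DT"), ("quick", "JJ"), ("brown", "JJ"), ("fox", "NN"),
          ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
          ("dog", "NN")]
print(candidate_anchor_texts(tagged))  # → ['quick brown fox', 'lazy dog']
```

As the Discussion slides note, `RegexpParser` applies the first matching rule, so a richer grammar would need careful rule ordering.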
33. Evaluating Anchor Texts: Quality Assurance Setup & Workflow
• Annotation of the top 86k documents
• Done across 13 annotators paid hourly
• Annotators modified documents within an annotation environment:
  1. Keep anchor text
  2. Modify anchor text (by expanding/contracting)
  3. Delete anchor text
  4. Modify link target
34. Evaluating Anchor Texts: Quality Assurance Setup & Workflow (continued)
Figure 3: Example annotation environment used for investopedia.com.
35. Evaluating Anchor Texts: Computing Inter-labeler Agreement
• For each annotator...

             B positive   B negative
A positive       a            b
A negative       c            d

Table 1: Contingency table for the anchor text generator (A) and a single annotator (B).

Token:      ^   the   quick   brown   fox   jumps   over   the   lazy   dog   $
Algorithm:  -    -      +       +      +      +       +     +      +     +    -
Annotator:  -    +      +       +      +      -       -     -      -     -    -
Label:      d    c      a       a      a      b       b     b      b     b    d

Example 1: Phrase alignment between the anchor text generator ("quick brown fox jumps over the lazy dog") and an annotator ("the quick brown fox"); each token, including the boundary markers ^ and $, is labeled a–d according to the cells of Table 1.
36. Evaluating Anchor Texts: Computing Inter-labeler Agreement (continued)

             B positive    B negative
A positive   3/11 = 0.27   5/11 = 0.45
A negative   1/11 = 0.09   2/11 = 0.18

Table 2: Contingency table computed from the relative word agreements of Example 1, using the cell definitions of Table 1, for the generator (A) and annotator (B).
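The cell values in Table 2 follow mechanically from the alignment labels. A minimal sketch, assuming each anchor is represented as a set of token positions over the boundary-marked sentence (`contingency` is a hypothetical helper, not code from the study):

```python
def contingency(n_tokens, algo_span, human_span):
    """Relative contingency cells (a, b, c, d) from two anchor-text spans.

    Each span is the set of token positions (over the ^ ... $ sentence)
    that the labeler marked as anchor text.
    """
    a = len(algo_span & human_span)             # both positive
    b = len(algo_span - human_span)             # algorithm only
    c = len(human_span - algo_span)             # annotator only
    d = n_tokens - len(algo_span | human_span)  # both negative
    return tuple(x / n_tokens for x in (a, b, c, d))

# "^ the quick brown fox jumps over the lazy dog $" -> 11 tokens, positions 0..10
algo = set(range(2, 10))   # "quick brown fox jumps over the lazy dog"
human = set(range(1, 5))   # "the quick brown fox"
print([round(x, 2) for x in contingency(11, algo, human)])
# → [0.27, 0.45, 0.09, 0.18], matching Table 2
```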
40. Evaluating Anchor Texts: Computing Inter-labeler Agreement (final)

Cohen's Kappa²
    K = (Pr(a) − Pr(e)) / (1 − Pr(e))    (1)

Average Cohen's Kappa, where K_i is the kappa between the generator and annotator i in the annotator set A
    K̄ = (1/|A|) Σ_{i=1}^{|A|} K_i    (2)

Annotation Precision of a Document
    Precision(d_k) = a / (a + b)    (3)

Mean Average Precision
    MAP = (1/|A|) Σ_{i=1}^{|A|} (1/m) Σ_{k=1}^{m} Precision(d_k)    (4)

² See Pustejovsky and Stubbs [2013, pp. 133–134] for a detailed example of how to compute both Pr(a) and Pr(e).
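Equations (1) and (2) applied to the Table 2 cells can be sketched as follows (a toy illustration, not the study's full computation; `cohen_kappa` and `mean_kappa` are hypothetical helpers):

```python
def cohen_kappa(a, b, c, d):
    """Equation (1): Cohen's kappa from relative contingency cells."""
    pr_a = a + d                                   # observed agreement
    pr_e = (a + b) * (a + c) + (c + d) * (b + d)   # chance agreement
    return (pr_a - pr_e) / (1 - pr_e)

def mean_kappa(kappas):
    """Equation (2): average kappa over the annotator set A."""
    return sum(kappas) / len(kappas)

# Toy cells from Table 2 (generator vs. one annotator):
print(round(cohen_kappa(3/11, 5/11, 1/11, 2/11), 2))  # → 0.03
```

On this single toy alignment the kappa is near zero; the study's average of 0.33 comes from aggregating over many documents and annotator pairings.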
41. Results
• Average K̄: 0.33
  • Fair level of agreement
  • slight < fair < moderate < substantial < perfect
• MAP: 0.40
  • Roughly 40% agreement, on average, between the linker and an annotator for this dataset
42. Discussion: Two-tier Approach to Web Page Annotation
Testing Labeling Consistency
• Create the same set of documents for annotation
• Documents already linked using the automated linking process
• Measure the mean average relative agreement, MAR
• Using the agreements a and d from Table 1 for an anchor text t, compute the average relative agreement between one annotator and another for a document containing n anchor texts t_1, ..., t_n:

    (1/n) Σ_{k=1}^{n} (a_{t_k} + d_{t_k})    (5)

• Compute the mean across the set of all documents, D, to get the mean average relative agreement between any two annotators i and j:

    MAR<i,j> = (1/|D|) Σ_{k=1}^{|D|} (1/n) Σ_{l=1}^{n} (a_{t_l} + d_{t_l})    (6)

• Average all MAR<i,·> scores for each annotator i
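Equations (5) and (6) can be sketched as follows (a minimal illustration with made-up cell values; `avg_relative_agreement` and `mar` are hypothetical helpers):

```python
def avg_relative_agreement(doc_cells):
    """Equation (5): mean of (a + d) over a document's anchor texts.

    `doc_cells` is a list of relative (a, b, c, d) tuples, one per
    anchor text in the document.
    """
    return sum(a + d for a, _, _, d in doc_cells) / len(doc_cells)

def mar(documents):
    """Equation (6): mean average relative agreement over all documents D."""
    return sum(avg_relative_agreement(doc) for doc in documents) / len(documents)

# Made-up cells for two documents with one and two anchor texts respectively:
docs = [[(0.27, 0.45, 0.09, 0.18)],
        [(0.50, 0.20, 0.10, 0.20), (0.40, 0.30, 0.10, 0.20)]]
print(round(mar(docs), 2))  # → 0.55
```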
43. Discussion: Two-tier Approach to Web Page Annotation
Establish Best Practices
• Use a threshold on MAR to remove bad actors within the group [Neuendorf, 2002]
• Hold a general meeting of annotators to expose good and bad annotation practices
44. Discussion: Improvements to Anchor Text Selection
• Offer more parses of sentences given the noun phrase grammar
  • NLTK returns the first rule that matches the grammar
• Use a probabilistic noun phrase grammar in NLTK
  • Compute probabilities based on the anchor texts used in this study's annotations
45. Conclusions: Lessons Learned
• Use a reference corpus
• Introduce annotators to one another to share best practices
  • Establish a social media group to foster communication
• Static rules require updating to take experts' choices into account
  • A probabilistic grammar accommodates parsing flexibility
46. References
R. Mihalcea and P. Tarau. TextRank: Bringing Order into Texts. Conference on Empirical Methods in Natural Language Processing, 2004.
I. H. Witten et al. KEA: Practical automatic keyphrase extraction. Fourth ACM Conference on Digital Libraries, pp. 254–256, 1999.
A. Hulth. Combining Machine Learning and Natural Language Processing for Automatic Keyword Extraction. PhD thesis, Department of Computer and Systems Sciences, Stockholm University, 2004.
J. Pustejovsky and A. Stubbs. Natural Language Annotation for Machine Learning. O'Reilly Media, Inc., 2013.
K. A. Neuendorf. The Content Analysis Guidebook. Thousand Oaks, California: Sage Publications, 2002.