2. www.dama.upc.edu2
● Huge Repository of Unstructured Information
4,744,318 articles
● Good source of information for Knowledge Extraction
Query Answering
Entity Linkage
Query Expansion
etc…
KNOWLEDGE BASE: WIKIPEDIA
3. www.dama.upc.edu3
● Introduce new Expansion Features to improve the query results.
● Knowledge Extraction from the point of view of the query's topic.
Topic: “Graffiti Street Art on Walls”
• Articles: “Graffiti”, “Street Art”, “Walls”
Semantically Related Information
• “Banksy”, “Cha_(artist)”, “Berlin_Wall_graffiti_art”
QUERY EXPANSION - QE
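The expansion step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RELATED` table is hand-written and hypothetical (the slides do not say which related title comes from which article), and `expand_query` is a name introduced here, not part of the original work.

```python
from typing import Dict, List

# Hypothetical relatedness table. In the real setting these titles would be
# derived from Wikipedia; the pairing below is an assumption for illustration.
RELATED: Dict[str, List[str]] = {
    "Graffiti": ["Banksy", "Berlin_Wall_graffiti_art"],
    "Street Art": ["Banksy", "Cha_(artist)"],
    "Walls": ["Berlin_Wall_graffiti_art"],
}


def expand_query(articles: List[str], related: Dict[str, List[str]]) -> List[str]:
    """Return the query articles followed by their related titles.

    A related title is appended only once, in order of first appearance.
    """
    expanded = list(articles)
    seen = set(articles)
    for article in articles:
        for title in related.get(article, []):
            if title not in seen:
                seen.add(title)
                expanded.append(title)
    return expanded


expanded = expand_query(["Graffiti", "Street Art", "Walls"], RELATED)
print(expanded)
```

The original articles stay at the front of the list, so a search engine could still weight them more heavily than the added expansion features.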
4. www.dama.upc.edu4
● Goal:
To analyze the structure of Wikipedia.
To understand how the different categories of data within it relate to each other.
To describe the relationships that contribute to improving results in the QE scenario.
Identify goals for vendors.
GOAL
6. www.dama.upc.edu6
1. Entity Linkage on the Collections of Results.
Each document as a set of articles
2. For each query
1. Select the articles whose titles improve the results the most.
2. Build a subgraph out of these articles, their categories, and their redirects.
Precision at top 1, 5, 10, and 15 close to 1.
ANALYSIS DESCRIPTION
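The subgraph construction in step 2 might look roughly like the sketch below. It assumes the Wikipedia graph is available as a list of `(source, target, kind)` triples with the hypothetical edge kinds `"links_to"`, `"belongs_to"`, and `"redirects_to"`; the slides do not describe the actual data model, and the toy titles such as `"Graffito"` are made up for the example.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str, str]  # (source, target, kind); the kinds are assumptions


def build_query_subgraph(selected: Set[str], edges: Iterable[Edge]) -> Dict[str, Set[str]]:
    """Keep the selected articles, their categories, and their redirects."""
    edge_list = list(edges)
    nodes = set(selected)
    for src, dst, kind in edge_list:
        if kind == "belongs_to" and src in selected:
            nodes.add(dst)  # category of a selected article
        elif kind == "redirects_to" and dst in selected:
            nodes.add(src)  # redirect pointing at a selected article
    # Induced subgraph: keep every edge whose two endpoints were both retained.
    graph: Dict[str, Set[str]] = {node: set() for node in nodes}
    for src, dst, _kind in edge_list:
        if src in nodes and dst in nodes:
            graph[src].add(dst)
    return graph


# Toy data, hypothetical and only for illustration.
toy_edges = [
    ("Graffiti", "Category:Street_art", "belongs_to"),
    ("Banksy", "Category:Street_art", "belongs_to"),
    ("Graffiti", "Banksy", "links_to"),
    ("Graffito", "Graffiti", "redirects_to"),
    ("Paris", "Category:Capitals", "belongs_to"),
]
subgraph = build_query_subgraph({"Graffiti", "Banksy"}, toy_edges)
print(sorted(subgraph))
```

Unselected articles such as `"Paris"` and their categories are dropped, so each query graph stays small enough to inspect by hand.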
7. www.dama.upc.edu7
Many Connected Components
Only one CC with the articles
that match the query
Structure behind
QUERY GRAPH EXAMPLE
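The observation about connected components can be checked with a short traversal. The sketch below assumes the query graph is stored as a symmetric adjacency dictionary (edges treated as undirected); the toy graph and the names `connected_components` and `query_articles` are illustrative assumptions, not data from the study.

```python
from typing import Dict, List, Set


def connected_components(adjacency: Dict[str, Set[str]]) -> List[Set[str]]:
    """Connected components of an undirected graph (symmetric adjacency dict)."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in adjacency:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        seen.add(start)
        while stack:  # iterative depth-first search
            node = stack.pop()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Toy query graph, hypothetical and only for illustration.
adjacency = {
    "Graffiti": {"Category:Street_art"},
    "Street_art": {"Category:Street_art"},
    "Category:Street_art": {"Graffiti", "Street_art"},
    "Mural": {"Category:Painting"},
    "Category:Painting": {"Mural"},
    "Stencil": set(),
}
query_articles = {"Graffiti", "Street_art"}

components = connected_components(adjacency)
matching = [c for c in components if c & query_articles]
print(len(components), len(matching))
```

In this toy graph there are several components, but only one of them contains the articles that match the query, which mirrors the pattern the slide reports.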
8. www.dama.upc.edu8
● Cycles Introduce Relevant Information
● Equivalent to best results of ImageCLEF
Visual Search Engine + Relevance Feedback
FIRST GLANCE
9. www.dama.upc.edu9
● Analyze the cycles to find correlations
between:
Cycle Characteristics
• Length, Category Ratio, Direction,
Chord(less)/Edge Ratio
Contribution
NEXT STEPS
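Some of the cycle characteristics listed above could be measured along the following lines. The slides do not define these measures, so the definitions here are assumptions: length is the number of nodes in the cycle, the category ratio is the share of nodes whose title starts with `Category:`, and a chord is an edge between two cycle nodes that are not consecutive in the cycle. Direction is left out because the slides give no definition to work from. The cycle is assumed to be non-empty.

```python
from typing import Dict, List, Set


def cycle_features(cycle: List[str], adjacency: Dict[str, Set[str]]) -> Dict[str, float]:
    """Length, category ratio, and chord count of a cycle (assumed definitions)."""
    n = len(cycle)
    categories = sum(1 for node in cycle if node.startswith("Category:"))
    # Edges that form the cycle itself, including the closing edge.
    cycle_edges = {frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)}
    chords = set()
    for i in range(n):
        for j in range(i + 1, n):
            a, b = cycle[i], cycle[j]
            linked = b in adjacency.get(a, set()) or a in adjacency.get(b, set())
            if linked and frozenset((a, b)) not in cycle_edges:
                chords.add(frozenset((a, b)))
    return {"length": n, "category_ratio": categories / n, "chords": len(chords)}


# Toy graph, hypothetical and only for illustration.
adjacency = {
    "Graffiti": {"Category:Street_art", "Category:Graffiti", "Banksy"},
    "Banksy": {"Category:Street_art", "Category:Graffiti", "Graffiti"},
    "Category:Street_art": {"Graffiti", "Banksy"},
    "Category:Graffiti": {"Graffiti", "Banksy"},
}
cycle = ["Graffiti", "Category:Street_art", "Banksy", "Category:Graffiti"]

features = cycle_features(cycle, adjacency)
print(features)
```

With features like these computed per cycle, the correlation with each cycle's contribution to the results could then be examined with any standard statistics tool.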