Concept based information retrieval using explicit
Presentation Transcript

  • Concept-Based Information Retrieval using Explicit Semantic Analysis OFER EGOZI, SHAUL MARKOVITCH, and EVGENIY GABRILOVICH Technion-Israel Institute of Technology
  • Content • Information Retrieval • Keyword-Based Retrieval: Bag of Words (BOW) • Irrelevant Data • Concept-Based Retrieval • Explicit Semantic Analysis • MORAG System • Conclusion
  • Information Retrieval Systems • Query → IR • Measures: Recall, Precision
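The slide names recall and precision without defining them. As a minimal sketch, using the standard set-based definitions and hypothetical document ids that are not from the presentation, they can be computed as:

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of document ids returned by the system
    relevant:  iterable of document ids judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: fraction of retrieved documents that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant documents that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document ids, for illustration only.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d2", "d4", "d7"],
)
```

Here two of the four retrieved documents are relevant, and two of the three relevant documents were retrieved.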
  • Keyword-Based Retrieval • Query → IR • Bag of Words (BOW)
  • Irrelevant Data ?? • Vocabulary Problems - Synonymy - World Knowledge
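The synonymy problem can be illustrated with a small bag-of-words sketch. The tokenizer, the cosine scoring, and the two sample documents below are assumptions made for illustration; the slides do not specify how the BOW system scores documents.

```python
import math
from collections import Counter


def bow(text):
    """Lower-cased, whitespace-tokenized bag of words."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


query = bow("salvaging shipwreck treasure")
# Hypothetical documents: one shares words with the query, one only shares meaning.
doc_same_words = bow("treasure found in shipwreck")
doc_synonyms = bow("divers recovered artifacts from sunken vessel")

score_overlap = cosine(query, doc_same_words)
score_synonyms = cosine(query, doc_synonyms)
```

The second document is on topic but shares no term with the query, so its keyword score is zero. This is the gap that concept-based retrieval is meant to close.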
  • Concept Based IR • Transform to a domain of concepts (not to domain of words) • Less dependent on specific terms
  • Explicit Semantic Analysis
  • Wikipedia Based ESA
  • ESA Based Data Retrieval - Example • Query: "salvaging shipwreck treasure" • Query concepts: SHIPWRECK; TREASURE; MARITIME ARCHAEOLOGY; MARINE SALVAGE; HISTORY OF THE BRITISH VIRGIN ISLANDS; WRECKING (SHIPWRECK); KEY WEST, FLORIDA; FLOTSAM AND JETSAM; WRECK DIVING; SPANISH TREASURE FLEET • Document: "ANCIENT ARTIFACTS FOUND. Divers have recovered artifacts lying underwater for more than 2,000 years in the wreck of a Roman ship that sank in the Gulf of Baratti, 12 miles off the island of Elba, newspapers reported Saturday." • Document concepts: SCUBA DIVING; WRECK DIVING; RMS TITANIC; USS HOEL (DD-533); SHIPWRECK; UNDERWATER ARCHAEOLOGY; USS MAINE (ACR-1); MARITIME ARCHAEOLOGY; TOMB RAIDER II; USS MEADE (DD-602)
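The shipwreck example can be mimicked on a toy scale. The sketch below assumes the usual ESA recipe: each word gets a vector of TF-IDF weights over concepts, and a text is represented by the sum of its words' vectors. The four "articles" are hand-written stand-ins for Wikipedia pages, and the function names are hypothetical; none of this is the MORAG code.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-written stand-ins for Wikipedia articles (concept -> text).
CONCEPTS = {
    "SHIPWRECK": "shipwreck wreck ship sank sea divers",
    "TREASURE": "treasure gold coins artifacts hidden",
    "SCUBA DIVING": "divers diving underwater scuba wreck",
    "OLYMPIC GAMES": "olympic gold medal athlete sprint",
}


def build_inverted_index(concepts):
    """Map each word to {concept: TF-IDF weight of the word in that concept}."""
    n = len(concepts)
    tf = {c: Counter(text.split()) for c, text in concepts.items()}
    df = Counter(w for counts in tf.values() for w in counts)
    index = defaultdict(dict)
    for concept, counts in tf.items():
        for word, count in counts.items():
            index[word][concept] = count * math.log(n / df[word])
    return index


def esa_vector(text, index):
    """Represent `text` as the sum of the concept vectors of its words."""
    vector = Counter()
    for word, count in Counter(text.lower().split()).items():
        for concept, weight in index.get(word, {}).items():
            vector[concept] += count * weight
    return vector


index = build_inverted_index(CONCEPTS)
query_concepts = esa_vector("shipwreck treasure", index)
doc_concepts = esa_vector("divers recovered artifacts from the wreck of a ship", index)
top_query = [c for c, _ in query_concepts.most_common(2)]
```

The toy query and document share no words, yet both receive weight on SHIPWRECK and TREASURE, so they can be matched in concept space.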
  • Irrelevant Docs • Query: "Estonia Economy" • Query concepts: ESTONIA; ECONOMY OF ESTONIA; ESTONIA AT THE 2000 SUMMER OLYMPICS; ESTONIA AT THE 2004 SUMMER OLYMPICS; ESTONIA NATIONAL FOOTBALL TEAM; ESTONIA AT THE 2006 WINTER OLYMPICS; BALTIC SEA ??; EUROZONE; TIIT VÄHI; MILITARY OF ESTONIA ?? • Document: "Olympic News In Brief: Cycling win for Estonia. Erika Salumae won Estonia's first Olympic gold when retaining the women's cycling individual sprint title she won four years ago in Seoul as a Soviet athlete." • Document concepts: ESTONIA AT THE 2000 SUMMER OLYMPICS; ESTONIA AT THE 2004 SUMMER OLYMPICS; 2006 COMMONWEALTH GAMES; ESTONIA AT THE 2006 WINTER OLYMPICS; 1992 SUMMER OLYMPICS; ATHLETICS AT THE 2004 SUMMER OLYMPICS; 2000 SUMMER OLYMPICS; 2006 WINTER OLYMPICS; CROSS-COUNTRY SKIING 2006 WINTER OLYMPICS; NEW ZEALAND AT THE 2006 WINTER OLYMPICS
  • Selecting Query Features • Feature selection could remove noisy ESA concepts • However, the IR task provides no training data… • The utility function U(+|-) requires a target measure, and therefore a training set • Pipeline: f = ESA(q) → Filter (U) → f' • Focus on query concepts: the query is short and noisy, while feature selection (FS) at indexing lacks context
  • Pseudo-Relevance Feedback
  • ESA Feature Selection Methods • IG: calculate each feature's Information Gain in separating positive and negative examples, and take the best-performing features • IIG: add concepts from the positive examples to the candidate features, and re-weight all features based on their weights in the examples • RV: find the subset of features that best separates positive and negative examples, employing heuristic search
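The IG method can be sketched as follows. Several things here are assumptions of the sketch, not statements from the slides: pseudo-relevance feedback is taken to mean that top-ranked documents from a first-pass keyword search serve as positive examples and bottom-ranked ones as negatives; each concept is treated as a binary present/absent feature; and, because information gain is symmetric, a concept is kept only if it is more frequent among the positives. All names and example sets are hypothetical.

```python
import math


def entropy(pos, neg):
    """Entropy in bits of a set with `pos` positive and `neg` negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(feature, positives, negatives):
    """IG of splitting the examples on presence or absence of `feature`.

    positives, negatives: lists of sets, each holding one document's concepts.
    """
    pos_with = sum(feature in d for d in positives)
    neg_with = sum(feature in d for d in negatives)
    pos_without = len(positives) - pos_with
    neg_without = len(negatives) - neg_with
    total = len(positives) + len(negatives)
    remainder = (
        (pos_with + neg_with) / total * entropy(pos_with, neg_with)
        + (pos_without + neg_without) / total * entropy(pos_without, neg_without)
    )
    return entropy(len(positives), len(negatives)) - remainder


def select_features(query_concepts, positives, negatives, keep):
    """Keep up to `keep` query concepts with the highest IG.

    Assumption of this sketch: only concepts that occur more often in the
    positive examples than in the negative ones are candidates.
    """
    def favors_positives(c):
        return sum(c in d for d in positives) > sum(c in d for d in negatives)

    candidates = [c for c in query_concepts if favors_positives(c)]
    candidates.sort(
        key=lambda c: information_gain(c, positives, negatives), reverse=True
    )
    return candidates[:keep]


# Hypothetical pseudo-relevance sets for the query "Estonia Economy".
positives = [{"ECONOMY OF ESTONIA", "EUROZONE"}, {"ECONOMY OF ESTONIA", "ESTONIA"}]
negatives = [
    {"ESTONIA AT THE 2004 SUMMER OLYMPICS", "ESTONIA"},
    {"ESTONIA AT THE 2004 SUMMER OLYMPICS"},
]
query_concepts = ["ESTONIA", "ECONOMY OF ESTONIA", "ESTONIA AT THE 2004 SUMMER OLYMPICS"]
selected = select_features(query_concepts, positives, negatives, keep=1)
```

In this toy setup the Olympics concept separates the two sets just as well as ECONOMY OF ESTONIA, but only in the negative direction, which is why the sketch adds the direction check before ranking by IG.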
  • MORAG System
  • MORAG Evaluation
  • Conclusion • MORAG: a new methodology for concept-based information retrieval • Documents and queries are enhanced with Wikipedia concepts • Informative features are selected using pseudo-relevance feedback • The generated features improve the performance of BOW-based systems
  • Thank You
  • Q & A