WikitologyWikipedia as an OntologyZareen Syed, Tim Finin  and Anupam Joshi University of Maryland   Baltimore County      ...
Outline• Introduction and motivation• Wikipedia• Methodology and Experiments• Evaluation• Future Work Directions• Conclusi...
Introduction• Identifying the topics and concepts  associated with a document or collection  of documents is a common task...
Motivation• Problem: describe what an analyst has  been working on to support collaboration• Idea: – track documents she r...
Approach• Use Wikipedia articles and categories as  ontology terms• Categories as Generalized Concepts• Articles as Specia...
What’s a document            about?• Two common approaches: •   Statistical Approach     Select words and phrases using TF...
Wikitology !• Using Wikipedia as an ontology offers the best of both  approaches• Each article is a concept in the ontolog...
Wikipedia Graph                Structures• Wikipedia Category graph is a thesaurus• Wikipedia Page links                  ...
Methods• Goal: given one or more documents, compute  a ranked list of the top N Wikipedia articles  and/or categories that...
Spreading Activation• In associative retrieval the idea is that it is  possible to retrieve relevant documents if  they ar...
Spreading ActivationStart with an initial set of activated nodes
Spreading ActivationAt each pulse/iteration, spread activation to adjacent nodes
Spreading ActivationSome nodes will have higher activation       Constraintsthan others                              • Dis...
Method 1Using Wikipedia Article Text and Categories to PredictConcepts     Input     Query     similar to     doc(s)      ...
Method 1Using Wikipedia Article Text and Categories to PredictConcepts                                                    ...
Method 1Using Wikipedia Article Text and Categories to Predict Concepts  Output Rank Categories                           ...
Method 2Using Spreading Activation on Category Links Graph to getAggregated Concepts                                      ...
• Can we predict concepts that are NOT  present in the category hierarchy?• Use the article concepts!• But How?
Method 3         Using Spreading Activation on Article Links Graph  Input                                                 ...
Preliminary Experiments   • An initial informal evaluation compared results against     our own judgments   • Downloaded a...
Preliminary ExperimentsPrediction for Set of Test DocumentsTest Document Titles in the Set: (Wikipedia Articles)Crop_rotat...
Evaluation•   Select wikipedia articles randomly and predict their    categories and links•   Sort the results based on Av...
EvaluationMedicines                           Observation                                    Articles are linked          ...
Category Prediction Evaluation                                                                           M1 Method 1      ...
Article Links Prediction              Evaluation• Spreading activation with one pulse worked best• Only considering articl...
Prediction Accuracy• Issues:  – To what extent the concept is represented in    Wikipedia For eg. we have a category relat...
Potential Applications•   Recommending categories and links    for new Wikipedia articles•   Introducing new Wikipedia cat...
Future Work• Classifying links in Wikipedia using  Machine learning techniques – To Predict semantic type of article – To ...
Document Expansion with Wikipedia                                                                                *        ...
Conclusion• We tested the idea of using Wikitology  for describing documents and proposed  different methods using the Wik...
References• Crestani, F. 1997. Application of Spreading Activation  Techniques in Information Retrieval. Artificial Intell...
References• Gabrilovich, E., and Markovitch, S. 2007. Computing Semantic  Relatedness using Wikipedia-based Explicit Seman...
Thank youQuestions and Suggestions?
Upcoming SlideShare
Loading in...5
×

Wikipedia for Semantic Search

975

Published on

Published in: Education
0 Comments
1 Like
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total Views
975
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
9
Comments
0
Likes
1
Embeds 0
No embeds

No notes for slide
  • My talk today is about our research work related to using the Wikipedia’s Ontology which we refer to as Wikitology for describing documents
  • So this is the outline of the talk, I will be starting with the introduction and motivation Then I’ll give some details about Wikipedia After that I will be discussing the methods we proposed and the experiments that we conducted Then how we evaluated our methods Then what are our future work directions and finally the conclusion
  • Identifying the topics and concepts associated with a document or collection of documents is a common task for many applications and can help in: Primarily the annotation and categorization of documents themselves If we know what documents the user is looking at or reading, we can use them to model the user’s interests We can use it for business intelligence and even aid in advertising
  • Our motivation was the following problem: A team of analysts is working on a set of common tasks, They work sometimes independently and sometimes in a tightly coordinated group, They can collaborate more effectively if they are aware of what their colleagues are working on
  • Use Wikipedia articles and categories as ontology terms Since Wikipedia is encyclopedic in nature, we consider each category and the article title as a concept in Wikitology Categories can be considered as broad or more generalized concepts whereas the article titles can be considered as specialized concepts Rest just read the slides
  • How to describe what the document is about? Two common approaches
  • A famous algorithm an
  • Distance Constraints: first order,second order or third order nodes Fan out constraint: high degree node should be less activated Path Constraint: certain paths may allow more flow than others Activation Constraint: threshold depending on the overall activation of the network
  • Utilizing Wikipedia Article Text Input 'm' test documents as search query to wikipedia index Retrieve top 'n' matching documents to each test document Extract wikipedia categories for matching documents and score categories based on two scoring schemes Scoring Scheme 1: Category score = Number of occurrences Scoring Scheme 2: Category score= Sum of tf.idf score for matching document for each occurrence of category
  • Utilizing Wikipedia Article Text Input 'm' test documents as search query to wikipedia index Retrieve top 'n' matching documents to each test document Extract wikipedia categories for matching documents and score categories based on two scoring schemes Scoring Scheme 1: Category score = Number of occurrences Scoring Scheme 2: Category score= Sum of tf.idf score for matching document for each occurrence of category
  • Utilizing Wikipedia Article Text Input 'm' test documents as search query to wikipedia index Retrieve top 'n' matching documents to each test document Extract wikipedia categories for matching documents and score categories based on two scoring schemes Scoring Scheme 1: Category score = Number of occurrences Scoring Scheme 2: Category score= Sum of tf.idf score for matching document for each occurrence of category
  • Utilizing Wikipedia Article Text Input 'm' test documents as search query to wikipedia index Retrieve top 'n' matching documents to each test document Extract wikipedia categories for matching documents and score categories based on two scoring schemes Scoring Scheme 1: Category score = Number of occurrences Scoring Scheme 2: Category score= Sum of tf.idf score for matching document for each occurrence of category
  • Utilizing Wikipedia Article Text and Article Links Graph with Spreading Activation Input 'm' test documents as search query to wikipedia index Retrieve top 'n' matching documents to each test document Select them as initial set of activated nodes in the article links graph. Can We predict concepts not in the category hierarchy?
  • An initial informal evaluation compared results against our own judgments Used to select promising combinations of ideas and parameter settings Formal evaluation by selecting wikipedia articles randomly and predicting their categories and links Experiment Set No. 1: Objective : To observe how well do wikipedia categories represent concepts in individual documents Methods: Method 1 & 2 Experiment Set No 2: Objective : Repeat Experiment 1 but with a set of test documents Methods: Method 1 & 2 Experiment Set No 3: Objective : To observe if we can get additional information about documents using article links graph as well using wikipedia articles themselves as test set Methods: Method 1,2 & 3 3Weather Prediction of thunder storms (taken from CNN Weather Prediction)“Weather_Hazards”“Winds”“Severe_weather_and_convection”“Weather_Hazards”“Current_events”“Types_of_cyclone”“Meterology”“Nature”“Weather”“Main_topic_classifications”“Fundamental”“Atmospheric_sciences”
  • Antibiotics -> medicines -> medical treatment -> medicines Select 100 Wikipedia articles randomly for testing; remove from Wikipedia index and graphs For each, use methods to predict categories and linked articles Compare results using precision and recall to known categories and linked articles If our system predicts a category three levels higher in hierarchy than the original category, we consider our prediction correct Compare predicted links with actual links Select 100 Wikipedia articles randomly for testing; remove from Wikipedia index and graphs For each, use methods to predict categories and linked articles Compare results using precision and recall to known categories and linked articles If our system predicts a category three levels higher in hierarchy than the original category, we consider our prediction correct Compare predicted links with actual links
  • Tetracycline_antibiotics
  • These graphs show the precision, average precision, recall and f-measure metrics as the average similar- ity threshold varies from 0.1 to 0.8. Legend label M1 is method 1, M1(1) and M(2) are method 1 with scoring schemes 1 and 2, respectively, and SA1 and S2 represent the use of spreading activation with one and two pulses, respectively. Considering top three level categories Spreading activation with two pulses worked best Only considering articles with similarity > 0.5 was a good threshold
  • Single document in many categories: George W. Bush is included in about 29 categories Links between articles belonging to very different categories John F. Kennedy has a link for “coincidence theory” which belongs to the Mathematical Analysis/ Topology/Fixed Points. Number of articles with in a category: Some categories are under represented where as others have many articles Administrative Categories For eg: Clean up from Sep 2006 Articles with unsourced statements Links to words in an article For eg. If the word United States appears in the document then that word might be linked to the page on “United States”
  • In Table 4 we report retrieval performance using mean average precision (MAP) and precision at ten documents (P@10). When normal relevance feedback is used a 19% relative improvement is seen over the base run. The concepts+rf condition is about as effective as base+rf . Mean average precision suffers a 2.8% relative drop and P@10 gains 1.6%; neither difference is statistically significant.
  • Wikipedia for Semantic Search

    1. 1. WikitologyWikipedia as an OntologyZareen Syed, Tim Finin and Anupam Joshi University of Maryland Baltimore County zarsyed1@umbc.edu, finin@cs.umbc.edu, joshi@cs.umbc.edu
    2. 2. Outline• Introduction and motivation• Wikipedia• Methodology and Experiments• Evaluation• Future Work Directions• Conclusion • intro • wikipedia • experiments • evaluation • next • conclusion •
    3. 3. Introduction• Identifying the topics and concepts associated with a document or collection of documents is a common task for many applications and can help in: – Annotation and categorization of documents in a corpus. – Modelling user interests – Business intelligence – Selecting Advertisements• intro • wikipedia • experiments • evaluation • next • conclusion •
    4. 4. Motivation• Problem: describe what an analyst has been working on to support collaboration• Idea: – track documents she reads – map these to terms in an ontology – aggregate to produce a short list of topics• intro • wikipedia • experiments • evaluation • next • conclusion •
    5. 5. Approach• Use Wikipedia articles and categories as ontology terms• Categories as Generalized Concepts• Articles as Specialized Concepts• How to map the documents she reads to the ontology terms? – Use document to Wiki-article similarity for the mapping• How to aggregate to get a shorter list? – Use spreading activation algorithm for aggregation• intro • wikipedia • experiments • evaluation • next • conclusion •
    6. 6. What’s a document about?• Two common approaches: • Statistical Approach Select words and phrases using TF-IDF that characterize the document (2) Controlled Vocabulary or Ontology Map document to a list of terms from a controlled vocabulary or ontology• First approach is flexible and does not require creating and maintaining an ontology• Second approach can tie documents to a rich knowledge base• intro • wikipedia • experiments • evaluation • next • conclusion •
    7. 7. Wikitology !• Using Wikipedia as an ontology offers the best of both approaches• Each article is a concept in the ontology• Terms linked via Wikipedia’s category system and inter-article links• It’s a consensus ontology created, kept current and maintained by a diverse community• Overall content quality is high• Terms have unique IDs (URLs) and are “self describing” for people• Underlying graphs provide structure: categories, article links• intro • wikipedia • experiments • evaluation • next • conclusion •
    8. 8. Wikipedia Graph Structures• Wikipedia Category graph is a thesaurus• Wikipedia Page links Ou r p lan s a re going ah ead , He ylig hen i s get ting ticke ts, so lets put th at in in ha rd pencil, but keep yo ur erase r h and y a ll the sam e. M y lif e is Ou r p la n s a re going ahead , Heylighen is getting t icke ts, so let s put re ally hectic, and I won t be in a sane pla ce ti ll m id -Apr il . We ll p in th at in in ha rd pen cil, but keep your dow n th e det ails later , but a t this po int erase r h and y a ll the s ame . My life is re ally hect ic, and I wont be i n a sane we we re consi dering Val onl y t o ta lk on "T he M etasystem Tra nsition as pla ce ti ll m id -A pril . We ll p in Ou r p la n s ar e going ah ead , He ylig hen is dow n th e det ai ls lat er, b ut at t his point getting t ic ke ts, so le t s put th e Qu ant um of E volu tion". Th is is th e th eore tical ba se to t he P C P, wh ich I we we re consi deri ng Va l only to th at in in har d pen cil, but keep yo ur ta lk on "T he M etasyst em Tr ansitio n as erase r hand y all the same . My lif e is described t he for m of in my ta lk th e Qu antum of E volu tion" . Th is is re ally hect ic , and I won t be i n a sane to W ES S. I t s b asically be tw een F ran cis and Val ho w would like to talk, or th e th eore ti cal ba se to the P C P, wh ic h I pla ce till mid -A pril. We ll p in described the f orm of in m y talk dow n th e detai ls la t er, b ut a t t his po int bot h, or wha t. to W ES S . It s b asica lly be tw een F ran cis we we re consideri ng Va l onl y t o and V al ho w wou ld li k e to tal k, or ta lk on " The Metasyst em Tra nsitio n as graph is similar to WWW both, or wha t. th e Qu antum of E voluti on". Th is is th e th eore ti cal base to the P C P, wh ic h I described the f orm of in m y ta lk to W ES S . Its basica l y be tw een F ran cis and V al how wou ld li ke to t al k, or both, or wha t. Ou r p lan s a re going ah ead , He ylig hen i s get t ing ticke ts , so lets put th at i n in ha rd pencil, but keep yo ur Ou r p lan s a re goi ng erase r h and y a ll the sam e. M y lif e is ahe ad, Heyl igh en is re al ly hectic, and I won t be in a sane get ting t ick e ts, so pla ce ti ll m id- Apr il . We ll p in lets put dow n th e details later , but a t this po int th at in in ha rd we we re consi dering Val onl y t o Network pen cil, but keep ta lk on "T he M etasystem Tra nsitio n as your era ser han dy th e Qu ant um of E volu tion". Th is is Ou r p lan s ar e going ahead , He ylig hen i s all the sa m e. M y life th e th eore tical ba se to t he P C P, wh ich I gett ing t icke ts, so le t s put Ou r p lan s a re goi ng ah ead , He ylig hen i s descri bed t he for m of in my ta lk th at in in har d pen cil, but keep yo ur get ting t ick e ts, so lets put to W E SS. I ts b asically be tw een F ran cis eras e r h and y all the same . My life is th at in in ha rd pen cil, but keep yo ur and Val ho w would like to talk, or re al ly hect ic , and I wont be i n a sane er ase r h and y a ll t he sam e. M y lif e is bot h, or what. pla c e till mid -A pril. We ll p in re ally hect ic, and I won t be in a sane dow n th e detai ls la t er, b ut at t his po int pla ce till m id -April . We ll p in we we re c onsideri ng Va l only to dow n th e det ails later , but a t this po int ta lk on "T he Metasyst em Tr ans it io n as we we re consi dering Val onl y th e Qu ant um of E volution" . Th is is th e th eore ti cal base to the P C P, wh ich I described the f orm of in m y talk to W ES S. Its basica l y be tw een F ran c is and Val ho w wou ld li ke to tal k, or both, or wha t. Ou r p lan s ar e going ahead , He ylig hen i s get ting t ic ke ts, so le t s put O ur pla ns are go ing a head, H eyli ghen is th at in in har d pen cil, but keep yo ur ge t in g ti cket s, so le t s p ut erase r hand y all the same . My life is that in in h ard pe ncil, bu t keep your re ally hect ic , and I wont be i n a sane era ser han dy all t he sa me. My li fe is pla ce till mid -A pril. We ll p in rea lly he cti c, an d I wo nt be in a sane dow n th e detai ls la t er, b ut a t t his po int place till m id-A pri l. W ell pin we we re consideri ng Va l only to do wn the de ta ils la te r, but at this p oin t ta lk on " The Metasyst em Tr a nsit io n as we w ere con side ring V al o nly to th e Qu antum of E volution" . Th is is talk on "T he Me ta syste m Tr ansition a s th e th eore ti cal base to the P C P, wh ich I the Q uan tu m o f Evol ution" . This is described the f orm of in m y ta lk the theo ret ic al b ase to th e P CP, which I to W ES S . Its basica l y be tw een F ran cis de scribe d th e fo rm o f in my t alk and Val how wou ld li ke to tal k, or to WE SS. It s basi cally bet we en F ranci s bot h, or wha t. an d Va l how w ould like t o talk, or bo th , or wh at. • intro • wikipedia • experiments • evaluation • next • conclusion •
    9. 9. Methods• Goal: given one or more documents, compute a ranked list of the top N Wikipedia articles and/or categories that describe it.• Basic metric: document similarity between Wikipedia article and document(s)• Variations: – role of categories – eliminating uninteresting articles – use of spreading activation – using similarity scores for weighing links – number of spreading activation pulses – individual or set of query documents, etc, etc.
    10. 10. Spreading Activation• In associative retrieval the idea is that it is possible to retrieve relevant documents if they are associated with other documents that have been considered relevant by the user.• The documents can be represented as nodes and their associations as links in a network. • intro • wikipedia • experiments • evaluation • next • conclusion •
    11. 11. Spreading ActivationStart with an initial set of activated nodes
    12. 12. Spreading ActivationAt each pulse/iteration, spread activation to adjacent nodes
    13. 13. Spreading ActivationSome nodes will have higher activation Constraintsthan others • Distance • Fan out • Path constraints • Activation threshold
    14. 14. Method 1Using Wikipedia Article Text and Categories to PredictConcepts Input Query similar to doc(s) 0.2 0.1 0.8 Similar Wikipedia Articles Cosine similarity 0.2 0.3
    15. 15. Method 1Using Wikipedia Article Text and Categories to PredictConcepts Wikipedia Category Graph Input Query similar to doc(s) 0.2 0.1 0.8 Similar Wikipedia Articles Cosine similarity 0.2 0.3
    16. 16. Method 1Using Wikipedia Article Text and Categories to Predict Concepts Output Rank Categories Wikipedia • Links Category • Cosine similarity Graph 0.9 3 Input Query similar to doc(s) 0.2 0.1 0.8 Similar Wikipedia Articles Cosine similarity 0.2 0.3
    17. 17. Method 2Using Spreading Activation on Category Links Graph to getAggregated Concepts Spreading ActivationOutputRanked Concepts based Wikipedia Categoryon Final Activation Score Graph Input Query Similar to doc(s) 0.2 0.1 0.8 Cosine similarity 0.2 0.3 Input Function Ij = ∑ Oi i Aj Output Function Oj = Dj * k
    18. 18. • Can we predict concepts that are NOT present in the category hierarchy?• Use the article concepts!• But How?
    19. 19. Method 3 Using Spreading Activation on Article Links Graph Input Threshold: Ignore Spreading Query Similar To Activation to articles with less doc(s) than 0.4 Cosine similarity scoreEdge Weights: Cosinesimilarity between linkedarticles Wikipedia Article Links Graph Spreading ActivationNode Input Function Ij = ∑ Oiwij i Aj OutputNode Output Function Oj = Ranked Concepts based on Final Activation Score k
    20. 20. Preliminary Experiments • An initial informal evaluation compared results against our own judgments • Downloaded articles from internet and predicted concepts • Using Single Document and Group of Related DocumentsPrediction for Single Test Document Test Document Method 1 Method 2 Method 2 Title Ranking Categories Directly Spreading Spreading Activation Activation Pulses=3 Pulses=2Weather Prediction “Weather_Hazards” “Weather_Hazards” “Meterology” of thunder “Winds” “Current_events” “Nature” storms (CNN) “Severe_weather_and_convection” “Types_of_cyclone” “Weather” More pulses -> More Generalized Concepts • intro • wikipedia • experiments • evaluation • next • conclusion •
    21. 21. Preliminary ExperimentsPrediction for Set of Test DocumentsTest Document Titles in the Set: (Wikipedia Articles)Crop_rotationPermacultureBeneficial_insectsNeemLady_BirdPrinciples_of_Organic_AgricultureRhizobia Concept not in theBiointensive Category HierarchyInter-croppingGreen_manure Method 1 Method 2 (2 pulses) Method 3 (2 pulses) Ranking Categories Directly Spreading Activation on Spreading Activation on Article Category links Graph Links Graph Agriculture Skills Organic_farming Sustainable_technologies Applied_sciences Sustainable_agriculture Crops Land_management Organic_gardening Agronomy Food_industry Agriculture Permaculture Agriculture Companion_planting • intro • wikipedia • experiments • evaluation • next • conclusion •
    22. 22. Evaluation• Select wikipedia articles randomly and predict their categories and links• Sort the results based on Average Similarity Average Similarity 0.8 0.5 0.7 Query similar to 0.2 0.9 doc(s) Cosine similarity 0.5 + 0.9 + 0.7 + 0.2 + 0.8 5 • intro • wikipedia • experiments • evaluation • next • conclusion •
    23. 23. EvaluationMedicines Observation Articles are linked often with super Medical Treatments and sub categories both 1st Antibiotics • If our system predicts a Tetracyclin category three levels higher in hierarchy than the original category we consider our prediction to be correct Oxytetracyclin
    24. 24. Category Prediction Evaluation M1 Method 1 SA1 Spreading Activation pulse(s)= 1 SA2 Spreading Activation pulse(s)=2 • Spreading activation with two pulses worked best • Only considering articles with similarity > 0.5 was a good threshold • intro • wikipedia • experiments • evaluation • next • conclusion •
    25. 25. Article Links Prediction Evaluation• Spreading activation with one pulse worked best• Only considering articles with similarity > 0.5 was a good threshold Similar Documents, N = 5 Spreading Activation pulses=1 • intro • wikipedia • experiments • evaluation • next • conclusion •
    26. 26. Prediction Accuracy• Issues: – To what extent the concept is represented in Wikipedia For eg. we have a category related to the fruit apple but not for mango – Presence of links between semantically related concepts – Presence of links between irrelevant articles (term definitions, country names)• Possible Solutions: – Use Average Similarity Score to measure the extent of concept representation with in Wikipedia – Use existing semantic relatedness measures to handle presence or absence of semantically related links • intro • wikipedia • experiments • evaluation • next • conclusion •
    27. 27. Potential Applications• Recommending categories and links for new Wikipedia articles• Introducing new Wikipedia categories• Automating the process of building a Wiki from a corpus
    28. 28. Future Work• Classifying links in Wikipedia using Machine learning techniques – To Predict semantic type of article – To control flow of spreading activation• Exploit parallel execution on cluster• Refining Wikipedia ontology• Bridging the gap between Wikipedia and formal ontologies• intro • wikipedia • experiments • evaluation • next • conclusion •
    29. 29. Document Expansion with Wikipedia * Derived Ontology Terms • Expansion of each TREC document using Wikitology terms • We are still working on refining the methodology Doc: FT921-4598 (3/9/92) ... Alan Turing, described as a brilliant mathematician and a key figure in the breaking of the Nazis Enigma codes. Prof IJ Good says it is as well that British security was unaware of Turings homosexuality, otherwise he might have been fired and we might have lost the war. In 1950 Turing wrote the seminal paper Computing Machinery And Intelligence, but in 1954 killed himself ... Turing_machine, Turing_test, Church_Turing_thesis, Halting_problem, Computable_number, Bombe, Alan_Turing, Recusion_theory, Formal_methods, Computational_models, Theory_of_computation, Theoretical_computer_science, Artificial_Intelligence* In Collaboration with Paul McNamee, John Hopkins University Applied Physics Laboratory
    30. 30. Conclusion• We tested the idea of using Wikitology for describing documents and proposed different methods using the Wikipedia article text, category links and article links• Suggested improvements• Using average similarity to judge the accuracy of prediction• Easily extendable to other wikis and collaborative KBs, e.g., Intellipedia, Freebase• intro • wikipedia • experiments • evaluation • next • conclusion •
    31. 31. References• Crestani, F. 1997. Application of Spreading Activation Techniques in Information Retrieval. Artificial Intelligence Review, 1997, vol 11; No. 6, 453-482.• Gabrilovich, E., and Markovitch, S. 2006. Overcoming the brittleness bottleneck using Wikipedia: Enhancing text categorization with encyclopedic knowledge. Proceedings of the Twenty-First National Conference on Artificial Intelligence. AAAI’06. Boston, MA.• Schonhofen, P. 2006. Identifying Document Topics Using the Wikipedia Category Network. Proc. 2006 IEEE/WIC/ACM International Conference on Web Intelligence. 456-462, 2006. IEEE Computer Society, Washington, DC, USA.• Strube,M., and Ponzetto, S.P. 2006. Exploiting semantic role labeling, WordNet and Wikipedia for coreference resolution. Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics (2006). Asso-ciation for Computational Linguistics Morristown, NJ, USA.
    32. 32. References• Gabrilovich, E., and Markovitch, S. 2007. Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis, Proc. of the 20th International Joint Con-ference on Artificial Intelligence (IJCAI’07), 6-12.• Krizhanovsky, A. 2006. Synonym search in Wikipedia: Synarcher.• URL:http://arxiv.org/abs/cs/0606097v1• Mihalcea, R. 2007. Using Wikipedia for Automatic Word Sense Disambiguation. Proc NAACL HLT. 196-203.• Strube,M., and Ponzetto, S.P. 2006. WikiRelate! Computing semantic relatedness using Wikipedia. American Association for Artificial Intelligence, 2006, Boston, MA.• Voss, J. 2006. Collaborative thesaurus tagging the Wikipedia way. Collaborative Web Tagging Workshop. Arxiv Computer Science e prints . URL http://arxiv.org/abs/cs/0604036• Milne, D. 2007. Computing Semantic Relatedness using Wikipedia Link Structure. Proceedings of the New Zealand Computer Science Research Student conference (NZCSRSC’07), Hamilton, New Zealand.
    33. 33. Thank youQuestions and Suggestions?
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×