IR3.ppt
Transcript

  • 1. Search Engine Technology (3) Prof. Dragomir R. Radev radev@umich.edu
  • 2. SET Fall 2009 … 5. Evaluation of IR systems Reference collections TREC …
  • 3. Relevance • Difficult to change: fuzzy, inconsistent • Methods: exhaustive, sampling, pooling, search-based
  • 4. Contingency table

                      relevant      not relevant    total
      retrieved       w = tp        y = fp          n2 = w + y
      not retrieved   x = fn        z = tn
      total           n1 = w + x                    N
  • 5. Precision and Recall • Recall: $R = \frac{w}{w + x}$ • Precision: $P = \frac{w}{w + y}$
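As a minimal sketch of the two definitions on this slide, the function below computes precision and recall from the contingency counts w, x, and y of slide 4. The function name and the sample counts are hypothetical and chosen only for illustration.

```python
def precision_recall(w, x, y):
    """Return (precision, recall) from contingency counts.

    w = relevant and retrieved (tp)
    x = relevant but not retrieved (fn)
    y = retrieved but not relevant (fp)
    """
    precision = w / (w + y) if (w + y) else 0.0
    recall = w / (w + x) if (w + x) else 0.0
    return precision, recall


# Hypothetical counts: 30 relevant retrieved, 10 relevant missed,
# 20 non-relevant retrieved.
p, r = precision_recall(w=30, x=10, y=20)
print(f"precision={p:.2f} recall={r:.2f}")
```

Note that z (true negatives) does not enter either measure, which is why both stay informative when almost all documents are non-relevant.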
  • 6. Exercise Go to Google (www.google.com) and search for documents on Tolkien’s “Lord of the Rings”. Try different ways of phrasing the query: e.g., Tolkien, “JRR Tolkien”, +“JRR Tolkien” +“Lord of the Rings”, etc. For each query, compute the precision (P) based on the first 10 documents returned by the search engine. Note! Before starting the exercise, have a clear idea of what a relevant document for your query should look like. Try different information needs. Later, try different queries.
  • 7. Recall and precision after each retrieved document (x = relevant) [From Salton’s book]

       n    Doc. no   Relevant?   Recall   Precision
       1    588       x           0.2      1.00
       2    589       x           0.4      1.00
       3    576                   0.4      0.67
       4    590       x           0.6      0.75
       5    986                   0.6      0.60
       6    592       x           0.8      0.67
       7    984                   0.8      0.57
       8    988                   0.8      0.50
       9    578                   0.8      0.44
      10    985                   0.8      0.40
      11    103                   0.8      0.36
      12    591                   0.8      0.33
      13    772       x           1.0      0.38
      14    990                   1.0      0.36
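The recall and precision columns of slide 7 can be recomputed from the relevance marks alone. The sketch below assumes the collection holds exactly five relevant documents (which is what the recall column implies); the function and variable names are hypothetical.

```python
def pr_at_ranks(relevant_flags, total_relevant):
    """Return (rank, recall, precision) after each retrieved document."""
    rows = []
    hits = 0
    for n, is_relevant in enumerate(relevant_flags, start=1):
        hits += is_relevant
        rows.append((n, hits / total_relevant, hits / n))
    return rows


# Relevance marks from slide 7: ranks 1, 2, 4, 6 and 13 are relevant.
flags = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
rows = pr_at_ranks(flags, total_relevant=5)
for n, recall, precision in rows:
    print(f"{n:2d}  recall={recall:.1f}  precision={precision:.2f}")
```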
  • 8. P/R graph [Figure: precision (y-axis) plotted against recall (x-axis), both axes running from 0 to 1]
  • 9. P/R graph [Figure: precision (y-axis) plotted against recall (x-axis), both axes running from 0 to 1] • Interpolated average precision (e.g., 11pt) • Interpolation: what is precision at recall=0.5?
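One common way to answer the interpolation question is to take, for a recall level r, the highest precision observed at any recall of at least r. The sketch below assumes that convention (the slide does not say which rule it uses) and reuses the relevance marks of slide 7; all names are hypothetical.

```python
def interpolated_precision(points, recall_level):
    """Highest precision seen at any recall >= recall_level (0.0 if none)."""
    eps = 1e-9  # guard against floating-point noise in the comparison
    candidates = [p for r, p in points if r >= recall_level - eps]
    return max(candidates, default=0.0)


def eleven_point_average(points):
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    return sum(interpolated_precision(points, lv) for lv in levels) / len(levels)


# (recall, precision) pairs for the ranking on slide 7.
flags = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
total_relevant = sum(flags)
points, hits = [], 0
for n, rel in enumerate(flags, start=1):
    hits += rel
    points.append((hits / total_relevant, hits / n))

p_at_half = interpolated_precision(points, 0.5)
avg_11pt = eleven_point_average(points)
print(f"P(recall=0.5)={p_at_half:.2f}  11-pt average={avg_11pt:.3f}")
```

Under this convention the interpolated curve is a non-increasing step function, so recall levels that never occur exactly (such as 0.5 here) still receive a value.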
  • 10. Issues • Why not use accuracy A=(w+z)/N? • Average precision • Average P at given “document cutoff values” • Report when P=R • F measure: $F = \frac{(\beta^2 + 1)PR}{\beta^2 P + R}$ • F1 measure: $F_1 = \frac{2}{1/R + 1/P}$ : harmonic mean of P and R
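A short sketch of the F measure as written on slide 10, with beta defaulting to 1 so that the result is the harmonic mean of P and R. The function name and the sample values are hypothetical.

```python
def f_measure(p, r, beta=1.0):
    """F = (beta^2 + 1) * P * R / (beta^2 * P + R); 0.0 when P = R = 0."""
    b2 = beta * beta
    denominator = b2 * p + r
    if denominator == 0:
        return 0.0
    return (b2 + 1) * p * r / denominator


# Hypothetical system with P = 0.75 and R = 0.60.
print(f"F1   = {f_measure(0.75, 0.60):.3f}")
print(f"F2   = {f_measure(0.75, 0.60, beta=2.0):.3f}")   # weights recall more
print(f"F0.5 = {f_measure(0.75, 0.60, beta=0.5):.3f}")   # weights precision more
```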
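  • 11. Kappa • N: number of items (index i) • n: number of categories (index j) • k: number of annotators • $m_{ij}$: number of annotators who assigned item i to category j

      $\kappa = \frac{P(A) - P(E)}{1 - P(E)}$

      $P(A) = \frac{1}{N k (k-1)} \sum_{i=1}^{N} \sum_{j=1}^{n} m_{ij}^2 - \frac{1}{k-1}$

      $P(E) = \sum_{j=1}^{n} \left( \frac{\sum_{i=1}^{N} m_{ij}}{N k} \right)^2$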
  • 12. Kappa example

               J1+    J1-    TOTAL
      J2+      300     10      310
      J2-       20     70       90
      TOTAL    320     80      400
  • 13. Kappa (cont’d) • P(A) = 370/400 = 0.925 • P (-) = (10+20+70+70)/800 = 0.2125 • P (+) = (10+20+300+300)/800 = 0.7875 • P (E) = 0.2125 * 0.2125 + 0.7875 * 0.7875 = 0.665 • K = (0.925-0.665)/(1-0.665) = 0.776 • Kappa higher than 0.67 is tentatively acceptable; higher than 0.8 is good
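The arithmetic on slide 13 can be reproduced from the four cells of the two-judge table on slide 12. The sketch below follows the slide's approach of pooling both judges' marginals to estimate chance agreement; the function and parameter names are hypothetical.

```python
def kappa_two_judges(both_pos, j2_pos_j1_neg, j2_neg_j1_pos, both_neg):
    """Kappa for two judges and two categories, using pooled marginals."""
    n_items = both_pos + j2_pos_j1_neg + j2_neg_j1_pos + both_neg
    disagreements = j2_pos_j1_neg + j2_neg_j1_pos

    # Observed agreement: both judges gave the same label.
    p_agree = (both_pos + both_neg) / n_items

    # Pooled label proportions over all 2 * n_items judgments.
    p_pos = (2 * both_pos + disagreements) / (2 * n_items)
    p_neg = (2 * both_neg + disagreements) / (2 * n_items)

    # Agreement expected by chance.
    p_expected = p_pos ** 2 + p_neg ** 2
    return (p_agree - p_expected) / (1 - p_expected)


# Cell counts from slide 12.
kappa = kappa_two_judges(both_pos=300, j2_pos_j1_neg=10,
                         j2_neg_j1_pos=20, both_neg=70)
print(f"kappa = {kappa:.3f}")
```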
  • 14. Sample TREC query

      <top>
      <num> Number: 305
      <title> Most Dangerous Vehicles
      <desc> Description:
      Which are the most crashworthy, and least crashworthy, passenger vehicles?
      <narr> Narrative:
      A relevant document will contain information on the crashworthiness of a given vehicle or vehicles that can be used to draw a comparison with other vehicles. The document will have to describe/compare vehicles, not drivers. For instance, it should be expected that vehicles preferred by 16-25 year-olds would be involved in more crashes, because that age group is involved in more crashes. I would view number of fatalities per 100 crashes to be more revealing of a vehicle's crashworthiness than the number of crashes per 100,000 miles, for example.
      </top>

      LA031689-0177 FT922-1008 LA090190-0126 LA101190-0218 LA082690-0158 LA112590-0109 FT944-136 LA020590-0119 FT944-5300 LA052190-0048 LA051689-0139 FT944-9371 LA032390-0172 LA042790-0172 LA021790-0136 LA092289-0167 LA111189-0013 LA120189-0179 LA020490-0021 LA122989-0063 LA091389-0119 LA072189-0048 FT944-15615 LA091589-0101 LA021289-0208
  • 15. <DOCNO> LA031689-0177 </DOCNO> <DOCID> 31701 </DOCID> <DATE><P>March 16, 1989, Thursday, Home Edition </P></DATE> <SECTION><P>Business; Part 4; Page 1; Column 5; Financial Desk </P></SECTION> <LENGTH><P>586 words </P></LENGTH> <HEADLINE><P>AGENCY TO LAUNCH STUDY OF FORD BRONCO II AFTER HIGH RATE OF ROLL-OVER ACCIDENTS </P></HEADLINE> <BYLINE><P>By LINDA WILLIAMS, Times Staff Writer </P></BYLINE> <TEXT> <P>The federal government's highway safety watchdog said Wednesday that the Ford Bronco II appears to be involved in more fatal roll-over accidents than other vehicles in its class and that it will seek to determine if the vehicle itself contributes to the accidents. </P> <P>The decision to do an engineering analysis of the Ford Motor Co. utility-sport vehicle grew out of a federal accident study of the Suzuki Samurai, said Tim Hurd, a spokesman for the National Highway Traffic Safety Administration. NHTSA looked at Samurai accidents after Consumer Reports magazine charged that the vehicle had basic design flaws. </P> <P>Several Fatalities </P> <P>However, the accident study showed that the "Ford Bronco II appears to have a higher number of single-vehicle, first event roll-overs, particularly those involving fatalities," Hurd said. The engineering analysis of the Bronco, the second of three levels of investigation conducted by NHTSA, will cover the 1984-1989 Bronco II models, the agency said. </P> <P>According to a Fatal Accident Reporting System study included in the September report on the Samurai, 43 Bronco II single-vehicle roll-overs caused fatalities, or 19 of every 100,000 vehicles. There were eight Samurai fatal roll-overs, or 6 per 100,000; 13 involving the Chevrolet S10 Blazers or GMC Jimmy, or 6 per 100,000, and six fatal Jeep Cherokee roll-overs, for 2.5 per 100,000. After the accident report, NHTSA declined to investigate the Samurai. </P> ... 
</TEXT> <GRAPHIC><P> Photo, The Ford Bronco II "appears to have a higher number of single-vehicle, first event roll-overs," a federal official said. </P></GRAPHIC> <SUBJECT> <P>TRAFFIC ACCIDENTS; FORD MOTOR CORP; NATIONAL HIGHWAY TRAFFIC SAFETY ADMINISTRATION; VEHICLE INSPECTIONS; RECREATIONAL VEHICLES; SUZUKI MOTOR CO; AUTOMOBILE SAFETY </P> </SUBJECT> </DOC>
  • 16. TREC (cont’d) • http://trec.nist.gov/tracks.html • http://trec.nist.gov/presentations/presentations.html
  • 17. Most used reference collections • Generic retrieval: OHSUMED, CRANFIELD, CACM • Text classification: Reuters, 20newsgroups • Question answering: TREC-QA • Web: DOTGOV, wt100g • Blogs: Buzzmetrics datasets • TREC ad hoc collections, 2-6 GB • TREC Web collections, 2-100GB
  • 18. Comparing two systems • Comparing A and B • One query? • Average performance? • Need: A to consistently outperform B [this slide: courtesy James Allan]
  • 19. The sign test • Example 1: – A > B (12 times) – A = B (25 times) – A < B (3 times) – p < 0.035 (significant at the 5% level) • Example 2: – A > B (18 times) – A < B (9 times) – p < 0.122 (not significant at the 5% level) – http://www.fon.hum.uva.nl/Service/Statistics/Sign_Test.html [this slide: courtesy James Allan]
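The p-values on slide 19 can be checked with an exact two-sided sign test. The sketch below assumes the usual convention that ties (A = B) are discarded and that wins follow a Binomial(n, 0.5) distribution under the null hypothesis; the function name is hypothetical.

```python
from math import comb


def sign_test_p(wins_a, wins_b):
    """Exact two-sided sign test p-value; ties must already be excluded."""
    n = wins_a + wins_b
    k = min(wins_a, wins_b)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p1 = sign_test_p(12, 3)   # Example 1: the 25 ties are dropped
p2 = sign_test_p(18, 9)   # Example 2
print(f"example 1: p = {p1:.4f}")
print(f"example 2: p = {p2:.4f}")
```

Under these assumptions the two examples come out at roughly 0.035 and 0.122, in line with the figures quoted on the slide.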
  • 20. Other tests • Student t-test: takes into account the actual performances, not just which system is better – http://www.fon.hum.uva.nl/Service/Statistics/Student_t_Test.html – http://www.socialresearchmethods.net/kb/stat_t.php • Wilcoxon Matched-Pairs Signed-Ranks Test – http://www.fon.hum.uva.nl/Service/Statistics/Signed_Rank_Test.html
  • 21. SET Fall 2009 … 6. Automated indexing/labeling Compression …
  • 22. Indexing methods • Manual: e.g., Library of Congress subject headings, MeSH • Automatic: e.g., TF*IDF based
  • 23. LOC subject headings http://www.loc.gov/catdir/cpso/lcco/lcco.html

      A -- GENERAL WORKS
      B -- PHILOSOPHY. PSYCHOLOGY. RELIGION
      C -- AUXILIARY SCIENCES OF HISTORY
      D -- HISTORY (GENERAL) AND HISTORY OF EUROPE
      E -- HISTORY: AMERICA
      F -- HISTORY: AMERICA
      G -- GEOGRAPHY. ANTHROPOLOGY. RECREATION
      H -- SOCIAL SCIENCES
      J -- POLITICAL SCIENCE
      K -- LAW
      L -- EDUCATION
      M -- MUSIC AND BOOKS ON MUSIC
      N -- FINE ARTS
      P -- LANGUAGE AND LITERATURE
      Q -- SCIENCE
      R -- MEDICINE
      S -- AGRICULTURE
      T -- TECHNOLOGY
      U -- MILITARY SCIENCE
      V -- NAVAL SCIENCE
      Z -- BIBLIOGRAPHY. LIBRARY SCIENCE. INFORMATION RESOURCES (GENERAL)
  • 24. Medicine CLASS R - MEDICINE Subclass R

      R5-920          Medicine (General)
      R5-130.5        General works
      R131-687        History of medicine. Medical expeditions
      R690-697        Medicine as a profession. Physicians
      R702-703        Medicine and the humanities. Medicine and disease in relation to history, literature, etc.
      R711-713.97     Directories
      R722-722.32     Missionary medicine. Medical missionaries
      R723-726        Medical philosophy. Medical ethics
      R726.5-726.8    Medicine and disease in relation to psychology. Terminal care. Dying
      R727-727.5      Medical personnel and the public. Physician and the public
      R728-733        Practice of medicine. Medical practice economics
      R735-854        Medical education. Medical schools. Research
      R855-855.5      Medical technology
      R856-857        Biomedical engineering. Electronics. Instrumentation
      R858-859.7      Computer applications to medicine. Medical informatics
      R864            Medical records
      R895-920        Medical physics. Medical radiology. Nuclear medicine
  • 25. Automatic methods • TF*IDF: pick terms with the highest TF*IDF scores • Centroid-based: pick terms that appear in the centroid with high scores • The maximal marginal relevance principle (MMR) • Related to summarization, snippet generation
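As a rough sketch of the first automatic method, the function below scores each term of one document by term frequency times log(N / document frequency) and keeps the top k as index terms. This is only one of several TF*IDF weighting variants, and the whitespace tokenizer, the function name, and the three-document corpus are hypothetical.

```python
import math
from collections import Counter


def top_tfidf_terms(docs, doc_index, k=3):
    """Return the k terms of docs[doc_index] with the highest TF*IDF score."""
    n_docs = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    tf = Counter(tokenized[doc_index])
    scores = {term: tf[term] * math.log(n_docs / df[term]) for term in tf}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


docs = [
    "the ford bronco rolled over",
    "the suzuki samurai rolled over",
    "the federal agency studied the bronco",
]
print(top_tfidf_terms(docs, doc_index=2))
```

A term such as "the" occurs in every document, so its IDF is log(1) = 0 and it is never selected, however frequent it is within the document.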
  • 26. Compression • Methods – Fixed length codes – Huffman coding – Ziv-Lempel codes
  • 27. Fixed length codes • Binary representations – ASCII – Representational power ($2^k$ symbols where k is the number of bits)
  • 28. Variable length codes • Alphabet:

      A  .-        N  -.        0  -----
      B  -...      O  ---       1  .----
      C  -.-.      P  .--.      2  ..---
      D  -..       Q  --.-      3  ...--
      E  .         R  .-.       4  ....-
      F  ..-.      S  ...       5  .....
      G  --.       T  -         6  -....
      H  ....      U  ..-       7  --...
      I  ..        V  ...-      8  ---..
      J  .---      W  .--       9  ----.
      K  -.-       X  -..-
      L  .-..      Y  -.--
      M  --        Z  --..

    • Demo: – http://www.scphillips.com/morse/
  • 29. Most frequent letters in English • Most frequent letters: – E T A O I N S H R D L U • Demo: – http://www.amstat.org/publications/jse/secure/v7n2/count-c • Also: bigrams: – TH HE IN ER AN RE ND AT ON NT
  • 30. Huffman coding • Developed by David Huffman (1952) • Average of 5 bits per character (37.5% compression) • Based on frequency distributions of symbols • Algorithm: iteratively build a tree of symbols starting with the two least frequent symbols
  • 31. Example symbol frequencies

      Symbol   Frequency
      A         7
      B         4
      C        10
      D         5
      E         2
      F        11
      G        15
      H         3
      I         7
      J         8
  • 32. [Figure: Huffman tree over the symbols a through j, with each branch labeled 0 or 1]
  • 33. Resulting Huffman code

      Symbol   Code
      A        0110
      B        0010
      C        000
      D        0011
      E        01110
      F        010
      G        10
      H        01111
      I        110
      J        111
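The tree-building step of slide 30 can be sketched with a priority queue: repeatedly remove the two least frequent nodes and merge them. The code below is an illustrative implementation, not the one used to produce the slides. Because ties between equal frequencies may be broken differently, the individual code words can differ from slide 33, although the total encoded length is the same for any Huffman tree. All names are hypothetical.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Build a Huffman code {symbol: bit string} from {symbol: frequency}."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # least frequent node
        f2, _, right = heapq.heappop(heap)   # second least frequent node
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


# Frequencies from slide 31.
freqs = {"A": 7, "B": 4, "C": 10, "D": 5, "E": 2,
         "F": 11, "G": 15, "H": 3, "I": 7, "J": 8}
codes = huffman_codes(freqs)
total_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
print(codes)
print(f"{total_bits} bits for {sum(freqs.values())} symbols")
```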
  • 34. Exercise • Consider the bit string: 011011011110001001100011101001110 00110101101011101 • Use the Huffman code from the example to decode it. • Try inserting, deleting, and switching some bits at random locations and try decoding.
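For the exercise, a small decoder helps with experimenting. The sketch below uses the code table from slide 33 and relies on the code being prefix-free: bits are accumulated until they match a code word. It returns the decoded symbols together with any trailing bits that do not form a complete code word. The helper names are hypothetical.

```python
# Code table from slide 33.
CODE = {"A": "0110", "B": "0010", "C": "000", "D": "0011", "E": "01110",
        "F": "010", "G": "10", "H": "01111", "I": "110", "J": "111"}


def encode(text, code=CODE):
    """Concatenate the code words of the symbols in text."""
    return "".join(code[symbol] for symbol in text)


def decode(bits, code=CODE):
    """Return (decoded text, leftover bits); non-binary characters are skipped."""
    inverse = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        if bit not in ("0", "1"):   # ignore spaces or line breaks
            continue
        buffer += bit
        if buffer in inverse:       # prefix-free: first match is the symbol
            decoded.append(inverse[buffer])
            buffer = ""
    return "".join(decoded), buffer


message = "JIGABCDEFH"
bits = encode(message)
print(bits)
print(decode(bits))
```

Flipping, inserting, or deleting a bit in `bits` before calling `decode` shows how a single error can shift the symbol boundaries for the rest of the string.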
  • 35. Extensions • Word-based • Domain/genre dependent models
  • 36. Ziv-Lempel coding • Two types - one is known as LZ77 (used in GZIP) • Code: set of triples <a,b,c> • a: how far back in the decoded text to look for the upcoming text segment • b: how many characters to copy • c: new character to add to complete segment
  • 37. LZ77 example: each triple and the text decoded so far

      <0,0,p>    p
      <0,0,e>    pe
      <0,0,t>    pet
      <2,1,r>    peter
      <0,0,_>    peter_
      <6,1,i>    peter_pi
      <8,2,r>    peter_piper
      <6,3,c>    peter_piper_pic
      <0,0,k>    peter_piper_pick
      <7,1,d>    peter_piper_picked
      <7,1,a>    peter_piper_picked_a
      <9,2,e>    peter_piper_picked_a_pe
      <9,2,_>    peter_piper_picked_a_peck_
      <0,0,o>    peter_piper_picked_a_peck_o
      <0,0,f>    peter_piper_picked_a_peck_of
      <17,5,l>   peter_piper_picked_a_peck_of_pickl
      <12,1,d>   peter_piper_picked_a_peck_of_pickled
      <16,3,p>   peter_piper_picked_a_peck_of_pickled_pep
      <3,2,r>    peter_piper_picked_a_peck_of_pickled_pepper
      <0,0,s>    peter_piper_picked_a_peck_of_pickled_peppers
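Decoding the triples of slide 37 follows directly from the definitions on slide 36: look back a characters, copy b characters, then append c. The sketch below is a minimal decoder only (it does not search for matches or emit triples), and the function name is hypothetical.

```python
def lz77_decode(triples):
    """Decode a list of (back, length, char) triples into text."""
    out = []
    for back, length, char in triples:
        start = len(out) - back
        # Copy one character at a time so a copy may overlap its own output.
        for i in range(length):
            out.append(out[start + i])
        out.append(char)
    return "".join(out)


# Triples from slide 37.
triples = [
    (0, 0, "p"), (0, 0, "e"), (0, 0, "t"), (2, 1, "r"), (0, 0, "_"),
    (6, 1, "i"), (8, 2, "r"), (6, 3, "c"), (0, 0, "k"), (7, 1, "d"),
    (7, 1, "a"), (9, 2, "e"), (9, 2, "_"), (0, 0, "o"), (0, 0, "f"),
    (17, 5, "l"), (12, 1, "d"), (16, 3, "p"), (3, 2, "r"), (0, 0, "s"),
]
text = lz77_decode(triples)
print(text)
```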
  • 38. Links on text compression • Data compression: – http://www.data-compression.info/ • Calgary corpus: – http://links.uwaterloo.ca/calgary.corpus.html • Huffman coding: – http://www.compressconsult.com/huffman/ – http://en.wikipedia.org/wiki/Huffman_coding • LZ – http://en.wikipedia.org/wiki/LZ77
  • 39. 100 alternative search engines • http://rss.slashdot.org/~r/Slashdot/slashdot/~3/83468703/article.pl
  • 40. Readings • 2: MRS9 • 3: MRS13, MRS14 • 4: MRS15, MRS16
