How to Search Annotated Text by Strategy?

Spinque presentation at the 23rd Meeting of Computational Linguistics in the Netherlands (CLIN 2013).

Transcript

  • 1. > design > publish > search! How to Search Annotated Text by Strategy? Roberto Cornacchia, Wouter Alink, Arjen P. De Vries. Spinque B.V. CLIN 2013, 18 January 2013. http://www.spinque.com/
  • 2. Search by Strategy. Design the way you would like to search. A search engine design framework. Custom search engines are built from “Strategies”, which: are designed as graphs; abstract data processing; combine different data sources; incorporate probabilistic reasoning; translate to database queries.
  • 3. Search by Strategy. Don't try to program the ultimate search engine: design a number of domain-specific search strategies. [Diagram: strategy graph with blocks Crime map, All houses, Query terms, Rank on location, Select on attribute, Rank full-text, Difference, Union] Click. Generate Web search engines on a probabilistic DB.
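To make the idea of a strategy graph concrete, here is a minimal Python sketch. It is not Spinque's implementation: the wiring of the blocks, the object ids, the scores, and the assumption that block outputs are independent probabilities are all hypothetical, since the transcript does not preserve how the diagram is connected or how scores are combined.

```python
from typing import Callable, Dict, List, Tuple

# Every block outputs a mapping: object id -> probability of relevance.
Scores = Dict[str, float]


def select(scores: Scores, keep: Callable[[str], bool]) -> Scores:
    """Keep only the objects that satisfy a boolean condition."""
    return {obj: p for obj, p in scores.items() if keep(obj)}


def difference(a: Scores, b: Scores) -> Scores:
    """P(in a and not in b), ASSUMING the two events are independent."""
    return {obj: p * (1.0 - b.get(obj, 0.0)) for obj, p in a.items()}


def union(a: Scores, b: Scores) -> Scores:
    """P(in a or in b), ASSUMING the two events are independent."""
    return {
        obj: 1.0 - (1.0 - a.get(obj, 0.0)) * (1.0 - b.get(obj, 0.0))
        for obj in set(a) | set(b)
    }


def rank(scores: Scores) -> List[Tuple[str, float]]:
    """Order objects by descending probability."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data sources (all ids and scores are made up).
all_houses = {"h1": 0.9, "h2": 0.6, "h3": 0.8, "h5": 0.7}
with_garden = {"h1", "h2", "h3"}      # attribute used by "Select on attribute"
near_crime = {"h2": 0.5}              # houses scored as close to crime spots
text_match = {"h1": 0.5, "h4": 0.4}   # full-text matches for the query terms

# One possible wiring of the blocks; the real diagram may differ.
candidates = select(all_houses, lambda house: house in with_garden)
low_crime = difference(candidates, near_crime)
results = rank(union(low_crime, text_match))
```

Each function corresponds to one block type, so rewiring the strategy only means changing the last three lines, which is the kind of experimentation the slides argue for.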
  • 4. Multiple domains, custom UIs
  • 5. Multiple domains, custom UIs
  • 6. Multiple domains, custom UIs
  • 7. Multiple domains, custom UIs
  • 8. Strategy Editor
  • 9. Not only "documents"
  • 10. What's in the DB?
      Full-text search:
        term  obj  freq
        t0    o3   0.03
        t0    o5   0.21
        t1    o2   0.08
      Annotation search:
        subj     pred / attr  obj / val  p
        Roberto  speaks_to    You        0.95
        You      listen_to    Roberto    0.6
        speech   minutes      15         0.8
      Feature vectors (CBIR, SVM):
        obj  f1    ...  fN
        o0   0.12  ...  0.84
        o1   0.54  ...  0
        o2   0.23  ...  0.31
      Hierarchical search:
        obj  pre  size  level
        o0   100  50    0
        o1   110  20    1
        o2   144  16    2
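As an illustration of how the annotation table on this slide could be queried, the sketch below stores the three example triples with their probabilities and filters them by pattern. The function name `match` and the in-memory list are assumptions for illustration only; the slides say strategies translate to database queries but do not show the queries themselves.

```python
from typing import List, Optional, Tuple

# (subj, pred / attr, obj / val, p) rows, copied from the slide.
Triple = Tuple[str, str, str, float]

TRIPLES: List[Triple] = [
    ("Roberto", "speaks_to", "You", 0.95),
    ("You", "listen_to", "Roberto", 0.6),
    ("speech", "minutes", "15", 0.8),
]


def match(
    triples: List[Triple],
    subj: Optional[str] = None,
    pred: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return triples matching the given pattern, most probable first.

    A field left as None acts as a wildcard (hypothetical convention).
    """
    hits = [
        t for t in triples
        if (subj is None or t[0] == subj)
        and (pred is None or t[1] == pred)
        and (obj is None or t[2] == obj)
    ]
    return sorted(hits, key=lambda t: t[3], reverse=True)


who_speaks = match(TRIPLES, pred="speaks_to")
everything = match(TRIPLES)
```

Calling `match(TRIPLES)` with no constraints simply returns all rows ordered by `p`, which mirrors ranking annotations by confidence.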
  • 11. Choose hot topics from (kid-)news. http://www.opstel.eu [Strategy diagram blocks: Kid news; Rank on date; Expand; Extract terms]
  • 12. Use POS annotations
      Text:
        <abstract date="2013-01-15"> Lilly de pitbull is een held. De hond uit de Amerikaanse staat Massachusetts heeft … </abstract>
        (English: "Lilly the pit bull is a hero. The dog from the US state of Massachusetts has …")
      Annotated text (we are interested in NPs):
        <abstract date="2013-01-15"> <NP>Lilly de pitbull</NP> is <NP>een held</NP>. <NP>De hond uit de Amerikaanse staat Massachusetts</NP> heeft … </abstract>
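Pulling the noun phrases out of annotated text like the example on this slide can be sketched with Python's standard `xml.etree.ElementTree` module. This is only an illustration: the function name `noun_phrases` is hypothetical, and the slides do not say how the annotations are actually loaded into the database.

```python
import xml.etree.ElementTree as ET

# The annotated abstract from the slide (the trailing "…" is elided in the source).
ANNOTATED = (
    '<abstract date="2013-01-15">'
    "<NP>Lilly de pitbull</NP> is <NP>een held</NP>. "
    "<NP>De hond uit de Amerikaanse staat Massachusetts</NP> heeft …"
    "</abstract>"
)


def noun_phrases(xml_text: str) -> list:
    """Return the text of every <NP> element, in document order.

    itertext() is used so that markup nested inside an NP is flattened;
    a nested NP would also be reported as a separate entry.
    """
    root = ET.fromstring(xml_text)
    return ["".join(np.itertext()).strip() for np in root.iter("NP")]


phrases = noun_phrases(ANNOTATED)
publication_date = ET.fromstring(ANNOTATED).get("date")
```

The `date` attribute is read as well because the strategy on the neighbouring slides ranks news items on date before extracting terms.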
  • 13. "Lilly de held" on Alpino
  • 14. Choose hot topics from (kid-)news. http://www.opstel.eu [Strategy diagram blocks: Kid news; Rank on date; Expand; Top terms; Top NPs]
  • 15. Topic suggestion for kids. http://www.opstel.eu
  • 16. Topic suggestion for kids. Data: Wikipedia, magazines for children, ... Left branch: rank data sources on annotations, e.g.: most seen content: hot topics; seen during night-time? Probably not for kids. Right branch: query expansion using recent (hot) content. Can we improve this by adding ...? Text reading level (machine learning); handling spelling mistakes in query expansion; syntactic dependencies.
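The right branch, query expansion using recent content, might look roughly like the sketch below. The slides do not describe the expansion method, so everything here is an assumption: whitespace tokenisation, a simple co-occurrence count, the function name `expand_query`, and the example documents.

```python
from collections import Counter
from datetime import date
from typing import List, Tuple


def expand_query(
    query_terms: List[str],
    docs: List[Tuple[date, str]],
    since: date,
    k: int = 2,
) -> List[str]:
    """Append the k terms that co-occur most often with the query in recent docs.

    Only documents dated on or after `since` that contain at least one
    query term contribute to the counts.
    """
    query = {term.lower() for term in query_terms}
    counts: Counter = Counter()
    for day, text in docs:
        tokens = text.lower().split()
        if day >= since and query & set(tokens):
            counts.update(token for token in tokens if token not in query)
    return list(query_terms) + [term for term, _ in counts.most_common(k)]


# Hypothetical news snippets; only the first two are "recent".
docs = [
    (date(2013, 1, 15), "pitbull hero rescue"),
    (date(2013, 1, 14), "pitbull hero"),
    (date(2012, 6, 1), "pitbull training"),
]

expanded = expand_query(["pitbull"], docs, since=date(2013, 1, 1), k=1)
```

Replacing the whitespace tokens with extracted noun phrases would give the "Top NPs" variant mentioned on slide 14.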
  • 17. Example: syntactic dependencies. AEGIR dependency parser for English (Koster et al.). Parses text, outputs dependency triples: "PGs prevent the mucosal damage .. " gives [PG,SUBJ,prevent] [prevent,OBJ,damage] [damage,ATTR,mucosal] ... CLEF-IP 2011: Combining document representations for prior-art retrieval, Eva D'hondt, Suzan Verberne, Wouter Alink, Roberto Cornacchia.
  • 18. Prior art search. Designed by Eva D'hondt, Nijmegen.
  • 19. Find patents containing similar triples
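One simple way to score "similar triples" is set overlap between the dependency triples of a query text and those of each patent. The sketch below uses Jaccard similarity, which is an assumption chosen for illustration; the slides do not state which measure the prior-art strategy uses. The patent ids and their triples are hypothetical, while the query triples are the ones from slide 17.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, dependent)


def jaccard(a: Set[Triple], b: Set[Triple]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_patents(
    query: Set[Triple], patents: Dict[str, Set[Triple]]
) -> List[Tuple[str, float]]:
    """Score every patent against the query triples, best match first."""
    scored = [(pid, jaccard(query, triples)) for pid, triples in patents.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Triples for "PGs prevent the mucosal damage .." as shown on slide 17.
query = {
    ("PG", "SUBJ", "prevent"),
    ("prevent", "OBJ", "damage"),
    ("damage", "ATTR", "mucosal"),
}

# Hypothetical patents with made-up triple sets.
patents = {
    "patent_a": {("PG", "SUBJ", "prevent"), ("prevent", "OBJ", "damage")},
    "patent_b": {
        ("damage", "ATTR", "mucosal"),
        ("drug", "SUBJ", "reduce"),
        ("reduce", "OBJ", "pain"),
    },
}

ranking = rank_patents(query, patents)
```

Here `patent_a` shares two of three triples with the query and ranks above `patent_b`, which shares one of five.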
  • 20. Recap. Strategies encapsulate domain expert knowledge (how to find). Strategies abstract away search expert knowledge (how to search). YOU can easily experiment with (new) data representations, ranking formulas, annotations, etc. Strategies facilitate knowledge management: store / share / publish / refine. Minimise the effort needed to design/update complex domain-specific search engines. [Diagram: the strategy graph of slide 3, with blocks Crime map, All houses, Query terms, Rank on location, Select on attribute, Rank full-text, Difference, Union]
  • 21. Thank you. www.spinque.com