Semantic search in databases

Published in: Technology

  1. Semantic search in databases
     Tomas Drencak
  2. The problem
     ● Search in a used-car database
     ● About 100k advertisements, all classified:
       – Brand, type
       – Mileage, displacement, year of production, gasoline/diesel, transmission
       – Equipment: ABS, ESR, air conditioning...
  3. The solution
     ● Search
     ● Faceted search
     ● Semantic search
     ● Fulltext search
  4. Search
  5. Faceted search
  6. Fulltext search
  7. Semantic search
     Semantic search seeks to improve search accuracy by understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace.
  8. The problem
     ● Free-form search queries (in Slovak):
       – Auto do 5000 eur (car up to 5000 EUR)
       – Octavia do 100 000 km, max 8000 eur (Octavia up to 100,000 km, max. 8000 EUR)
       – Octavia klimatizacia tempomat (Octavia, air conditioning, cruise control)
       – Mazda 626 1.6 tdi
  9. Context-free grammars
     S → a
     S → aS
     S → bS
     Non-terminal symbol: S
     Terminal symbols: a, b
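  10. CFG example #1
      Rules, labelled for reference:
      S1: S → a
      S2: S → aS
      S3: S → bS
      Parsing "aaba", replacing each input symbol with the label of the rule that consumes it:
      ● aaba
      ● S2 aba
      ● S2 S2 ba
      ● S2 S2 S3 a
      ● S2 S2 S3 S1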
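The walk-through above can be sketched in a few lines of Python. This is only an illustration for this toy grammar: the function name `derive` is made up here, and the approach relies on the fact that every rule emits exactly one terminal, so the rule to apply is determined by the current symbol and by whether it is the last one.

```python
def derive(text):
    """Return the rule labels that derive `text` from S, or None if rejected.

    Labels follow the slide: S1 = S → a, S2 = S → aS, S3 = S → bS.
    """
    if not text:
        return None  # the grammar cannot produce the empty string
    labels = []
    for i, ch in enumerate(text):
        is_last = i == len(text) - 1
        if is_last:
            # Only S → a can end a derivation, so the last symbol must be "a".
            if ch != "a":
                return None
            labels.append("S1")
        elif ch == "a":
            labels.append("S2")
        elif ch == "b":
            labels.append("S3")
        else:
            return None  # symbol outside the alphabet {a, b}
    return labels


print(derive("aaba"))  # ['S2', 'S2', 'S3', 'S1']
```

In other words, the grammar accepts exactly the non-empty strings over a and b that end in a.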
  11. CFG example #2
      Q → Qx | Qx Q | Qx a Q | Qx , Q
      Qx → VYKON | CENA | WORD
      VYKON → od NUM kw | do NUM kw | NUM kw
      CENA → do NUM eur
      NUM → [0-9]+
      WORD → [a-z]+
      (Slovak keywords: VYKON = power, CENA = price, od = from, do = up to, a = and)
      Input: skoda favorit do 5000 eur a 100 kw
      Parse: Q → WORD WORD CENA VYKON
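A minimal sketch of how such a query could be turned into a list of typed items. It is a hand-written left-to-right scanner, not a general CFG parser, and it is not the author's implementation. The name `parse_query`, the tuple layout, and the mapping of "od" to ">=" are assumptions made for illustration (the slides only show "do" as "<=" and a bare number as "=").

```python
import re

# Assumed comparison operators for the Slovak keywords "od" (from) and "do" (up to).
OPERATORS = {"od": ">=", "do": "<="}


def parse_query(text):
    """Scan a free-form query into (kind, ...) tuples following the grammar above."""
    tokens = re.findall(r"[a-z]+|[0-9]+|,", text.lower())
    items, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        nxt2 = tokens[i + 2] if i + 2 < len(tokens) else ""
        if tok == "do" and nxt.isdigit() and nxt2 == "eur":
            # CENA → do NUM eur
            items.append(("CENA", int(nxt), "<="))
            i += 3
        elif tok in OPERATORS and nxt.isdigit() and nxt2 == "kw":
            # VYKON → od NUM kw | do NUM kw
            items.append(("VYKON", int(nxt), OPERATORS[tok]))
            i += 3
        elif tok.isdigit() and nxt == "kw":
            # VYKON → NUM kw
            items.append(("VYKON", int(tok), "="))
            i += 2
        elif tok in ("a", ","):
            # Separators between Qx items carry no meaning of their own.
            i += 1
        elif tok.isalpha():
            items.append(("WORD", tok))
            i += 1
        else:
            raise ValueError(f"token not covered by the grammar: {tok!r}")
    return items


print(parse_query("skoda favorit do 5000 eur a 100 kw"))
# [('WORD', 'skoda'), ('WORD', 'favorit'), ('CENA', 5000, '<='), ('VYKON', 100, '=')]
```

Checking the longer patterns before falling back to WORD resolves the overlap between keywords such as "do" and the WORD rule `[a-z]+`.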
  12. CFG example #2 – SQL query
      Input: skoda favorit do 5000 eur a 100 kw
      Parsed: Q([ WORD("skoda"), WORD("favorit"), CENA(5000, "<="), VYKON(100, "=") ])
      Task: convert the Q object into an SQL query
      select * from advertisement where ad_id in (
        select ad_id from advertisement join car where brand = 'skoda'
        intersect
        select ad_id from advertisement join car where type = 'favorit'
        intersect
        select ad_id from advertisement where price <= 5000
        intersect
        select ad_id from advertisement where power = 100
      )
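The assembly step can be tried out against an in-memory SQLite database. The sketch below is illustrative only: the function `build_query`, the table layout, the sample rows, and the `car_id` join column are all assumptions, since the slides do not show the schema or the join condition.

```python
import sqlite3


def build_query(fragments):
    """Combine per-term subqueries; INTERSECT keeps only ads matching every term."""
    return ("select * from advertisement where ad_id in (\n  "
            + "\n  intersect\n  ".join(fragments)
            + "\n)")


# One fragment per parsed item; the join on car_id is a hypothetical schema detail.
fragments = [
    "select ad_id from advertisement join car using (car_id) where brand = 'skoda'",
    "select ad_id from advertisement join car using (car_id) where type = 'favorit'",
    "select ad_id from advertisement where price <= 5000",
    "select ad_id from advertisement where power = 100",
]

db = sqlite3.connect(":memory:")
db.executescript("""
    create table car (car_id integer primary key, brand text, type text);
    create table advertisement (
        ad_id integer primary key, car_id integer, price integer, power integer);
    insert into car values (1, 'skoda', 'favorit'), (2, 'skoda', 'octavia');
    insert into advertisement values
        (10, 1, 4500, 100), (11, 1, 6000, 100), (12, 2, 4000, 100);
""")

rows = db.execute(build_query(fragments)).fetchall()
print(rows)  # [(10, 1, 4500, 100)]
```

Ad 11 is dropped by the price condition and ad 12 by the type condition, so only ad 10 survives all four intersected subqueries.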
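  13. CFG example #2 – PRICE
      class CENA:  # CENA = price
          def __init__(self, value, operator):
              self.value = value
              self.operator = operator

          def sql_fragment(self):
              # e.g. CENA(5000, '<=') selects ads with price <= 5000
              return ('select ad_id from advertisement where price '
                      + self.operator + ' ' + str(self.value))

      class VYKON:  # VYKON = power
          def __init__(self, value, operator):
              self.value = value
              self.operator = operator

          def sql_fragment(self):
              # e.g. VYKON(100, '=') selects ads with power = 100
              return ('select ad_id from advertisement where power '
                      + self.operator + ' ' + str(self.value))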
  14. CFG example #2 - WORD
      ● How to find the meaning of a word?
      ● Inverted index:
        select table, column from inverted_index where word = 'skoda'

        Word     Target table  Target column
        skoda    car           brand
        favorit  car           type
        ABS      equipment     name
        ESR      equipment     name

      ● The table and column found for the word fill in the fragment template:
        select ad_id from TABLE where COLUMN = VALUE
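A small sketch of the inverted-index lookup using SQLite. The column names `target_table` and `target_column`, the lowercase storage of words, and the function names are assumptions for illustration; the slide only names the table `inverted_index` and its three columns informally.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    create table inverted_index (
        word text primary key, target_table text, target_column text);
    insert into inverted_index values
        ('skoda', 'car', 'brand'), ('favorit', 'car', 'type'),
        ('abs', 'equipment', 'name'), ('esr', 'equipment', 'name');
""")


def find_meaning(word):
    """Return (table, column) for a known word, or None if it is not indexed."""
    return db.execute(
        "select target_table, target_column from inverted_index where word = ?",
        (word.lower(),),
    ).fetchone()


def word_fragment(word):
    """Fill the slide's template 'select ad_id from TABLE where COLUMN = VALUE'."""
    meaning = find_meaning(word)
    if meaning is None:
        return None
    table, column = meaning
    # Table and column come from the trusted index; the value is passed as a
    # bound parameter rather than concatenated into the SQL text.
    return f"select ad_id from {table} where {column} = ?", (word.lower(),)


print(word_fragment("skoda"))
# ('select ad_id from car where brand = ?', ('skoda',))
```

Returning the SQL text together with its parameters is one possible design; the slides concatenate the value directly into the string.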
  15. CFG example #3 - Mistakes
      ● What about mistakes?
        – 'otavia' → 'octavia'
        – 'mercedez' → 'mercedes'
      ● Use 3-grams to find the correct word, then look up the meaning of the corrected word

        ngram  Full word
        mer    mercedes
        erc    mercedes
        rce    mercedes
        ced    mercedes
        ...    ...
        oct    octavia
        cta    octavia
        tav    octavia
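The 3-gram table can be built and queried as follows. This is a sketch under stated assumptions: the word list, the names `trigrams`, `NGRAM_INDEX` and `guess_word`, and the `min_shared` threshold are all illustrative choices, and candidates are ranked simply by the number of shared 3-grams.

```python
from collections import Counter, defaultdict


def trigrams(word):
    """Return the set of 3-character substrings of `word`."""
    return {word[i:i + 3] for i in range(len(word) - 2)}


# Hypothetical vocabulary; in practice this would be the words of the inverted index.
KNOWN_WORDS = ["mercedes", "octavia", "skoda", "favorit"]

# ngram -> set of full words containing it (the table from the slide).
NGRAM_INDEX = defaultdict(set)
for known in KNOWN_WORDS:
    for gram in trigrams(known):
        NGRAM_INDEX[gram].add(known)


def guess_word(misspelled, min_shared=2):
    """Return the known word sharing the most 3-grams with `misspelled`, or None."""
    votes = Counter()
    for gram in trigrams(misspelled.lower()):
        for candidate in NGRAM_INDEX.get(gram, ()):
            votes[candidate] += 1
    if not votes:
        return None
    word, shared = votes.most_common(1)[0]
    # Require a minimum overlap so unrelated words are not "corrected".
    return word if shared >= min_shared else None


print(guess_word("otavia"))    # octavia
print(guess_word("mercedez"))  # mercedes
```

Here 'otavia' shares tav, avi and via with 'octavia', and 'mercedez' shares mer, erc, rce, ced and ede with 'mercedes'.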
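  16. CFG example #2 – WORD
      class WORD:
          def __init__(self, word):
              self.word = word

          def sql_fragment(self):
              # Exact inverted-index lookup first, 3-gram guess as a fallback.
              meaning = find_meaning(self.word)
              if not meaning:
                  meaning = guess_meaning(self.word)
              if meaning:
                  return ('select ad_id from ' + meaning.table
                          + ' where ' + meaning.column
                          + " = '" + self.word + "'")
              return None  # the word could not be interpreted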
  17. Summary
      ● Context-free grammars
      ● Parsing
      ● Inverted index
      ● N-grams
  18. Thank you! Any questions?
