Smart Tagger for X-Lab/BIT Application

Published in: Technology

1. You expect your computer to store your documents like this
2. When actually…
3. … they are stored like this!
4. Problem: Information volumes grow exponentially
5. Problem: For single users…
6. Problem: No time to sort documents manually
7. Problem: As a result, it is easier to download a document again
8. Problem: Hastily chosen file names: New document.doc, zdhgd.txt
9. Problem: Intermediate versions of one document: Report1_final.doc, Report(copy)111.doc, Report_final_vol2.0.doc
10. Problem: For companies…
11. Problem: Each of multiple users organizes the shared folder in their own way
12. Problem: A new employee at a fired employee's workstation
13. Problem: PwC: even the most experienced worker spends 5-15% of work time searching for documents
14. Problem: It costs ~\$5,700/year
15. Solution: A comfortable environment for working with electronic documents and e-mail
16. Solution 1: Automatic tag assignment
17. Solution 2: Search with synonyms and by similar documents
18. Solution 3: Automatic sorting into folders
19. Solution 4: Search for similar documents
20. Solution
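
The similar-document search and automatic folder sorting listed above can be illustrated with a minimal sketch. It assumes plain bag-of-words vectors and cosine similarity, which is a common baseline and not necessarily what Smart Tagger does internally. The function names (`vectorize`, `cosine`, `most_similar`, `pick_folder`) and the sample folders are hypothetical.

```python
import math
from collections import Counter

def vectorize(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def most_similar(query, docs):
    """Rank document names by similarity to the query text, best first."""
    q = vectorize(query)
    return sorted(docs, key=lambda name: cosine(q, vectorize(docs[name])), reverse=True)

def pick_folder(text, folders):
    """Choose the folder whose combined contents are most similar to the text."""
    v = vectorize(text)
    # Each folder is summarized by the summed word counts of its documents.
    centroids = {
        name: sum((vectorize(t) for t in texts), Counter())
        for name, texts in folders.items()
    }
    return max(centroids, key=lambda name: cosine(v, centroids[name]))

# Hypothetical sample data, for illustration only.
folders = {
    "Finance": ["quarterly budget report", "invoice payment budget"],
    "Recipes": ["lunch menu soup salad", "dinner menu pasta"],
}
print(pick_folder("meal menu with soup", folders))  # Recipes
```

Exact word matching like this cannot handle synonyms; that gap is what a semantic model such as LSA is meant to close.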
21. How it works: Documents and words are represented as points in space (example document: mealMenu.pdf)
22. How it works: A theme is a concentration of points. Tags are words near a specific document (example document: mealMenu.pdf)
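
The idea that tags are the words nearest to a document can be sketched as follows. The 2-D coordinates are made up for illustration (a real system would learn them from the text, in many more dimensions), and `suggest_tags` is a hypothetical helper, not part of the product.

```python
import math

# Hypothetical 2-D coordinates; a real system would learn them from the text.
words = {
    "lunch": (0.9, 0.9),
    "soup": (1.0, 1.0),
    "price": (0.7, 0.6),
    "invoice": (-0.8, 0.3),
    "contract": (-1.0, 0.1),
}
documents = {"mealMenu.pdf": (0.95, 0.9)}

def suggest_tags(doc_point, word_points, k=2):
    """Return the k words closest (by Euclidean distance) to the document."""
    ranked = sorted(word_points, key=lambda w: math.dist(doc_point, word_points[w]))
    return ranked[:k]

print(suggest_tags(documents["mealMenu.pdf"], words))  # ['lunch', 'soup']
```

In this picture a theme would correspond to a cluster of such points, for example the food-related words gathered in the upper right.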
23. Features:
    1. LSA: full-text search not only by words from the text, but also by words that are likely to be in the text
    2. Metrics: own adaptation of Leo Breiman, "Random Forests" (2001)
    3. Probability model: own adaptation of Mark Steyvers and Tom Griffiths, "Probabilistic Topic Models" (2007)
24. Features: Main advantages:
    • Hierarchical topic structure
    • Application learns from user activity
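
To show how LSA can match a document by a word it does not contain, here is a small sketch of textbook LSA (a rank-k approximation of the term-document matrix). It is not the team's own adaptation, which the slides do not describe. The tiny corpus, the choice k=2, and the helper names are assumptions; the singular vectors are approximated with plain power iteration to keep the example dependency-free.

```python
import math

def top_left_singular_vectors(a, k, iters=200):
    """Approximate the top-k left singular vectors of a via power iteration on A*A^T."""
    n = len(a)
    gram = [[sum(x * y for x, y in zip(a[i], a[j])) for j in range(n)] for i in range(n)]
    basis = []
    for _ in range(k):
        v = [float(i + 1) for i in range(n)]  # fixed start keeps the run deterministic
        for _ in range(iters):
            v = [sum(g * x for g, x in zip(row, v)) for row in gram]
            for u in basis:  # remove the directions already found
                dot = sum(x * y for x, y in zip(u, v))
                v = [x - dot * y for x, y in zip(v, u)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        basis.append(v)
    return basis

def lsa_weights(a, k):
    """Rank-k smoothed term-document weights, computed as U_k * U_k^T * A."""
    n_terms, n_docs = len(a), len(a[0])
    smoothed = [[0.0] * n_docs for _ in range(n_terms)]
    for u in top_left_singular_vectors(a, k):
        coords = [sum(u[i] * a[i][j] for i in range(n_terms)) for j in range(n_docs)]
        for i in range(n_terms):
            for j in range(n_docs):
                smoothed[i][j] += u[i] * coords[j]
    return smoothed

# Hypothetical corpus: rows are terms, columns are documents.
terms = ["car", "automobile", "engine", "soup", "menu"]
matrix = [
    [1, 0, 0, 0],  # car:        only in doc 0
    [0, 1, 0, 0],  # automobile: only in doc 1
    [1, 1, 0, 0],  # engine:     docs 0 and 1
    [0, 0, 1, 1],  # soup:       docs 2 and 3
    [0, 0, 1, 1],  # menu:       docs 2 and 3
]
weights = lsa_weights(matrix, k=2)
# Doc 0 never mentions "automobile", yet it gets a positive weight for it.
print(round(weights[terms.index("automobile")][0], 2))  # 0.5
```

Because "car" and "automobile" both co-occur with "engine", the reduced space places them together, so a search for "automobile" can also find document 0. Unrelated words such as "soup" keep a weight near zero for that document.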
25. Current results
    • [beta] Smart Tagger: single-user semantic file explorer
    • [beta] Mail Tagger (Outlook 2010 plugin): single-user semantic e-mail
    • 2 corporate pre-sale projects in progress
26. Market positioning
    • Single-user licenses: \$50/year
    • SMB (5-25 users): \$500-\$2000/year
27. Market positioning
    • Full-text searches: X1 Desktop Search, Google Desktop Search
    • Highly sophisticated corporate solutions by Autonomy or IBM
    • + intelligence, - difficulty
28. Team
    • Nikita Pustovoytov: MIPT, 1C, evangelist, consultant
    • Dmitry Elisov: 6th-year MIPT student, team lead
    • Victor Kantor: 6th-year MIPT student, algorithms
    • Irina Elisova: 6th-year MAI student, business analytics
29. Thank you! We appreciate your time! http:// .ru