Smart Tagger for X-Lab/BIT Application


Published in: Technology

  1. You expect your computer to store your documents like this
  2. When actually…
  3. … they are stored like this!
  4. Problem: Information volumes grow exponentially
  5. Problem: For single users…
  6. Problem: No time to sort documents manually
  7. Problem: As a result, it is easier to download a document again
  8. Problem: Hastily chosen file names (New document.doc, zdhgd.txt)
  9. Problem: Intermediate versions of one document (Report1_final.doc, Report(copy)111.doc, Report_final_vol2.0.doc)
  10. Problem: For companies…
  11. Problem: Each of multiple users organizes the shared folder their own way
  12. Problem: A new employee at a fired employee's workstation
  13. Problem: PwC: even the most experienced worker spends 5-15% of work time searching for documents
  14. Problem: It costs ~$5,700/year
  15. Solution: A comfortable environment for working with electronic documents and e-mail
  16. Solution: 1. Automatic tag assignment
  17. Solution: 2. Search with synonyms and by similar documents
  18. Solution: 3. Automatic sorting into folders
  19. Solution: 4. Search for similar documents
  20. Solution
  21. How it works: Documents and words are presented as points in space (example: mealMenu.pdf)
  22. How it works: A theme is a concentration of points. Tags are words near specific documents (example: mealMenu.pdf)
  23. Features: 1. LSA: full-text search not only by words from the text, but also by words that are likely to be in the text. 2. Metrics: own adaptation of Leo Breiman, "Random Forests" (2001). 3. Probability model: own adaptation of Mark Steyvers and Tom Griffiths, "Probabilistic Topic Models" (2007).
  24. Features: Main advantages: hierarchical topic structure; the application learns from user activity
  25. Current results: [beta] Smart Tagger, a single-user semantic file explorer; [beta] Mail Tagger (Outlook 2010 plugin), single-user semantic e-mail; 2 corporate pre-sales in progress
  26. Market positioning: Single-user licenses: $50/year. SMB (5-25 users): $500-$2000/year
  27. Market positioning: Full-text searches: X1 Desktop Search, Google Desktop Search. Highly sophisticated corporate solutions by Autonomy or IBM. + intelligence, - difficulty
  28. Team: Nikita Pustovoytov (MIPT, 1C, evangelist, consultant); Dmitry Elisov (6th-year MIPT student, team lead); Victor Kantor (6th-year MIPT student, algorithms); Irina Elisova (6th-year MAI student, business analytics)
  29. Thank you! We appreciate your time! http:// .ru
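
The "How it works" and "Features" slides describe documents and words as points in a shared space, tags as the words nearest a document, and LSA-based search that also matches words likely to be in a text. The sketch below illustrates that general idea with plain LSA (a truncated SVD of a word-by-document count matrix) using NumPy. It is not the Smart Tagger implementation: the toy corpus, the function names (`build_lsa`, `suggest_tags`, `similar_documents`, `search`) and the choice of cosine similarity are all assumptions made for illustration, and the Random Forests and topic-model adaptations mentioned in the deck are not reproduced.

```python
import numpy as np

# Toy corpus: file names and contents are invented for illustration only.
DOCS = {
    "mealMenu.pdf": "soup salad pasta dessert menu lunch",
    "dinnerMenu.pdf": "steak pasta wine dessert menu dinner",
    "report.doc": "revenue budget quarter forecast report",
    "budget.xls": "budget revenue cost forecast invoice",
}


def build_lsa(docs, k=2):
    """Place words and documents as points in one k-dimensional space."""
    names = list(docs)
    vocab = sorted({w for text in docs.values() for w in text.split()})
    row = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(names)))
    for j, name in enumerate(names):
        for w in docs[name].split():
            counts[row[w], j] += 1.0
    # Truncated SVD: keep only the k strongest latent "themes".
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    word_points = u[:, :k] * s[:k]
    doc_points = vt[:k, :].T * s[:k]
    return names, vocab, word_points, doc_points


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def suggest_tags(doc_name, model, top=3):
    """Tags are the words whose points lie closest to the document's point."""
    names, vocab, word_points, doc_points = model
    d = doc_points[names.index(doc_name)]
    scored = [(cosine(word_points[i], d), w) for i, w in enumerate(vocab)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [w for _, w in scored[:top]]


def similar_documents(doc_name, model):
    """Rank the other documents by closeness to the given one."""
    names, _, _, doc_points = model
    d = doc_points[names.index(doc_name)]
    scored = [(n, cosine(doc_points[i], d))
              for i, n in enumerate(names) if n != doc_name]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def search(query, model):
    """Rank documents against a query, even if they lack the query words."""
    names, vocab, word_points, doc_points = model
    known = [word_points[vocab.index(w)] for w in query.split() if w in vocab]
    if not known:
        return []
    q = np.mean(known, axis=0)
    scored = [(n, cosine(doc_points[i], q)) for i, n in enumerate(names)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    model = build_lsa(DOCS, k=2)
    print(suggest_tags("mealMenu.pdf", model))
    print(similar_documents("mealMenu.pdf", model))
    print(search("wine", model))
```

In this toy corpus, a query for "wine" ranks both menu files above the finance files, although mealMenu.pdf never contains that word. This is the "words which are likely to be in text" behaviour the slides attribute to LSA. With only two latent dimensions, all menu-related words sit in the same direction as the menu documents, so the suggested tags for mealMenu.pdf can include words that appear only in dinnerMenu.pdf.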