Indexing Present1

Published in: Technology

  1. Term Project CS 359: Document Indexing and Retrieval
  2. IR System
  3. IR System • Spider (Nat)
  4. IR System • Spider (Nat) • Tokenization (Klang)
  5. IR System • Spider (Nat) • Tokenization (Klang) • GUI (Ploy)
  6. IR System • Spider (Nat) • Tokenization (Klang) • GUI (Ploy) • Searching/Scoring (Job)
  7. Spider
  8. Spider • CyberNeko HTML • Groovy
  9. Spider • CyberNeko HTML • Groovy
  10. Spider (cont.)
  11. Spider (cont.) • Link Gathering (Link Collection) • HTML Unescape • Download Page
  12. Link Gathering
  13. Link Gathering
  14. Link Gathering
  15. Tokenization
  16. Tokenization • http://sansarn.com/lexto/
  17. GUI
  18. GUI
  19. Why Groovy? • Superset of Java • Shorter than Java ...
  20. Groovy Example • println 0..6 ===> [0,1,2,3,4,5,6] • [0,1,2,3,4,5,6,7,8,9].findAll{it%2==0} ===> [0, 2, 4, 6, 8] • println "http://www.google.com".toURL().text
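
The slides name the spider's steps (Link Gathering, HTML Unescape, Download Page) without showing code. The following is a minimal Groovy sketch of the first two steps, not the project's actual implementation. It assumes a regular expression in place of the CyberNeko HTML parser so that it runs without extra dependencies, and the helper names gatherLinks and unescapeHtml, the sample markup, and the example.com base URL are all hypothetical.

```groovy
// Hypothetical helper: undo the few HTML entities that commonly appear in href values.
// '&amp;' is handled last so that '&amp;lt;' becomes '&lt;' and not '<'.
String unescapeHtml(String s) {
    s.replace('&lt;', '<').replace('&gt;', '>').replace('&quot;', '"').replace('&amp;', '&')
}

// Hypothetical helper: return the distinct absolute link targets found in an HTML string.
List<String> gatherLinks(String html, String baseUrl) {
    URI base = new URI(baseUrl)
    // Capture the quoted value of every href attribute (regex stand-in for a real parser).
    List<String> raw = html.findAll(/(?i)href\s*=\s*["']([^"']*)["']/) { full, target -> target }
    List<String> cleaned = raw.collect { String t ->
        String u = unescapeHtml(t).trim()
        int hash = u.indexOf('#')
        hash >= 0 ? u.substring(0, hash) : u       // drop in-page fragments
    }
    List<String> nonEmpty = cleaned.findAll { it } // skip pure-fragment links such as "#top"
    // Resolve relative targets against the page URL and remove duplicates.
    nonEmpty.collect { base.resolve(it).toString() }.unique()
}

// Sample markup (assumed for illustration only).
def html = '''<a href="page2.html">next</a> <a href='/about?a=1&amp;b=2'>about</a>
<a href="page2.html#intro">again</a> <a href="#top">top</a>'''
def links = gatherLinks(html, 'http://example.com/docs/index.html')
links.each { println it }
```

For the Download Page step, the html argument could come from the toURL().text call shown on slide 20. A real spider would also need to handle unquoted attributes and targets containing spaces, which this sketch does not; that is one reason to prefer a proper parser such as CyberNeko.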