Term Project

CS 359
Document Indexing and Retrieval
IR System
• Spider (Nat)
• Tokenization (Klang)
• GUI (Ploy)
• Searching/Scoring (Job)
Spider
• CyberNeko HTML Parser
• Groovy
Spider (cont.)

• Link Gathering (Link Collection)
• HTML Unescape
• Download Page
Link Gathering
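
The spider steps listed earlier (download a page, collect its links, unescape HTML entities) might be sketched in Groovy roughly as follows. This is an illustration under assumptions, not the project's code: it swaps in a plain regular expression where the project used the CyberNeko parser, and the names `downloadPage`, `unescapeHtml` and `gatherLinks`, the sample page, and the example.com URLs are all hypothetical.

```groovy
// Sketch only: regex-based link gathering on an in-memory page.
// All method names here are hypothetical, chosen for illustration.

// Download Page: fetch the raw HTML with the same GDK call used in the
// deck's Groovy example. Not invoked below, so the sketch runs offline.
String downloadPage(String url) {
    url.toURL().text
}

// HTML Unescape: decode a few common entities found in href values.
// '&amp;' is decoded last so that '&amp;lt;' becomes '&lt;' rather than '<'.
String unescapeHtml(String s) {
    def entities = ['&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'", '&amp;': '&']
    String result = s
    for (e in entities) {
        result = result.replace(e.key, e.value)
    }
    result
}

// Link Gathering: pull href values out of <a> tags, resolve them against
// the page URL, keep only http/https links, and drop duplicates.
List<String> gatherLinks(String html, String baseUrl) {
    def base = new URI(baseUrl)
    def links = new LinkedHashSet<String>()   // keeps first-seen order
    def hrefs = html.findAll(/(?i)<a\s[^>]*?href\s*=\s*["']([^"']+)["']/) { full, href -> href }
    for (href in hrefs) {
        try {
            def resolved = base.resolve(unescapeHtml(href.trim()))
            if (resolved.scheme in ['http', 'https']) {
                links << resolved.toString()
            }
        } catch (IllegalArgumentException ignored) {
            // Malformed href (for example, one containing spaces): skip it.
        }
    }
    links.toList()
}

// Hypothetical sample page used instead of a live download.
def html = '''
<a href="/about">About</a>
<A HREF='page2.html?a=1&amp;b=2'>Next</A>
<a class="x" href="http://other.org/x">Other</a>
<a href="/about">About again</a>
<a href="mailto:nat@example.com">Mail</a>
'''

println gatherLinks(html, 'http://example.com/docs/index.html')
```

A regular expression is enough for a small demonstration, but it will miss unquoted attributes and malformed markup, which is presumably why a tolerant HTML parser such as CyberNeko was chosen for the real spider.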
Tokenization



• http://sansarn.com/lexto/
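
Text written without spaces between words cannot be tokenized by splitting on whitespace, so a dictionary is typically needed. The sketch below illustrates one common approach, greedy longest matching against a word list. It is an assumption about the general technique, not the LexTo API, and the `tokenize` method and the toy English dictionary are hypothetical.

```groovy
// Sketch only: greedy longest-matching segmentation.
// At each position, take the longest dictionary word that starts there;
// if no word matches, emit a single character and move on.
List<String> tokenize(String text, Set<String> dictionary) {
    int maxLen = dictionary*.length().max() ?: 1
    def tokens = []
    int pos = 0
    while (pos < text.length()) {
        int end = Math.min(text.length(), pos + maxLen)
        // Shrink the candidate from the longest possible length downwards.
        while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
            end--
        }
        tokens << text.substring(pos, end)
        pos = end
    }
    tokens
}

// Hypothetical toy dictionary; a real tokenizer would load a full lexicon.
def dictionary = ['the', 'cat', 'cats', 'sat', 'at'] as Set
println tokenize('thecatsat', dictionary)   // prints [the, cats, at]
```

The greedy choice yields "cats" + "at" rather than "cat" + "sat", which shows the kind of ambiguity a production tokenizer has to handle.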
GUI
Why Groovy?

• Superset of Java
• Shorter than Java
  .....
Groovy Example

• println 0..6
    ===> [0,1,2,3,4,5,6]
• [0,1,2,3,4,5,6,7,8,9].findAll{it%2==0}
    ===> [0, 2, 4, 6, 8]
• println "http://www.google.com".toURL().text