Financial Comic Information Retrieval System

Published in: Technology, Business

  1. Financial Comic Information Retrieval System (2010/05/28)
  2. Outline: Architecture of IR system; Indexing process; Query process
  3. Indexing process (diagram labels): Financial Comics; Text Acquisition; Text Transformation; Index Creation; Index; MySQL Database. Source: 鉅融全球資本市場演進知識庫 (鉅融 Global Capital Markets Evolution Knowledge Base), http://www.global5capital.com
  4. Indexing process, Text Acquisition: Store the descriptions of the Financial Comics in the database; Database schema
  5. Indexing process, Text Transformation: Convert text encoding to UTF-8; Stopping: filter punctuation and numbers from each document, filter single English letters
  6. Indexing process, Index Creation: Unigram; Bigram; Word Segmentation (Yahoo! 斷章取義 API, a Chinese word segmentation service); Compute the tf.idf weight for each index term: tf (term frequency), idf (inverse document frequency)
  7. (Figure labels) idf value; tf value
  8. Query process (diagram labels): User Interaction; Ranking; Index; MySQL Database
  9. Query process: User Interaction: construct the display of the top 10 documents for a query, highlight keywords; Ranking: measure by tf.idf weight
  10. Demo
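
The text transformation steps on slide 5 could be sketched roughly as follows. This is only an illustration, not the presenters' code: the function name `transform`, the assumed source encoding (`big5`), and the choice to treat Unicode symbol characters as punctuation are all assumptions that the slides do not state.

```python
import re
import unicodedata


def transform(raw: bytes, encoding: str = "big5") -> str:
    """Decode raw bytes, then drop punctuation, numbers, and lone English letters.

    The default source encoding is an assumption; the slides only say that
    text is converted to UTF-8.
    """
    text = raw.decode(encoding, errors="replace")

    # Replace punctuation (P*), symbols (S*), and numbers (N*) with spaces.
    cleaned = []
    for ch in text:
        category = unicodedata.category(ch)
        cleaned.append(" " if category[0] in "PSN" else ch)
    text = "".join(cleaned)

    # Drop an ASCII letter that has no ASCII letter on either side.
    text = re.sub(r"(?<![A-Za-z])[A-Za-z](?![A-Za-z])", " ", text)

    # Collapse the runs of whitespace left behind by the filters.
    return " ".join(text.split())


if __name__ == "__main__":
    sample = "股票 a 2010, ETF!".encode("big5")
    # The resulting str can be written out with .encode("utf-8").
    print(transform(sample))
```

Replacing filtered characters with spaces, rather than deleting them, keeps the text on either side of a punctuation mark from being fused into one token before indexing.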
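
Slides 6 and 7 name unigram and bigram index terms and a tf.idf weight but give no formula. A minimal sketch, assuming raw term counts for tf and the common `log(N / df)` form for idf, might look like this. The names `ngrams` and `build_index` are hypothetical, and the Yahoo! 斷章取義 API segmentation step is not reproduced here.

```python
import math
from collections import Counter


def ngrams(text: str, n: int) -> list[str]:
    """Character n-grams taken inside each whitespace-separated chunk."""
    grams = []
    for chunk in text.split():
        grams.extend(chunk[i:i + n] for i in range(len(chunk) - n + 1))
    return grams


def build_index(docs: dict[int, str]) -> dict[str, dict[int, float]]:
    """Map each unigram/bigram term to {doc_id: tf.idf weight}.

    Assumed weighting: tf is the raw count of the term in the document,
    idf is log(N / df), where N is the number of documents and df is the
    number of documents containing the term.
    """
    term_counts = {
        doc_id: Counter(ngrams(text, 1) + ngrams(text, 2))
        for doc_id, text in docs.items()
    }

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for counts in term_counts.values():
        df.update(counts.keys())

    n_docs = len(docs)
    index: dict[str, dict[int, float]] = {}
    for doc_id, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n_docs / df[term])
            index.setdefault(term, {})[doc_id] = tf * idf
    return index


if __name__ == "__main__":
    docs = {1: "股票 基金", 2: "股票 債券"}
    for term, postings in build_index(docs).items():
        print(term, postings)
```

With this idf, a term that appears in every document gets weight zero, so it contributes nothing to ranking. Storing one row per (term, document, weight) is one way such an index could be kept in the MySQL database the slides mention.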
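
The query process on slide 9 (rank by tf.idf weight, show the top 10 documents, highlight keywords) could be approximated as below. Summing the weights of the matching query terms is an assumed scoring rule, and `rank`, `highlight`, and the `<b>` markup are illustrative choices rather than details taken from the slides.

```python
import re
from collections import defaultdict


def rank(index: dict[str, dict[int, float]], query_terms: list[str], k: int = 10):
    """Return up to k (doc_id, score) pairs, highest score first.

    A document's score is the sum of the tf.idf weights of the query
    terms it contains (an assumed scoring rule).
    """
    scores = defaultdict(float)
    for term in query_terms:
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    # Sort by descending score; break ties by ascending document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


def highlight(text: str, query_terms: list[str],
              open_tag: str = "<b>", close_tag: str = "</b>") -> str:
    """Wrap every occurrence of a query term in the given tags."""
    terms = sorted({t for t in query_terms if t}, key=len, reverse=True)
    if not terms:
        return text
    # Longer terms come first so a bigram wins over its own unigrams.
    pattern = re.compile("|".join(re.escape(t) for t in terms))
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


if __name__ == "__main__":
    index = {"基金": {1: 0.7, 3: 0.2}, "債券": {2: 0.5, 3: 0.4}}
    for doc_id, score in rank(index, ["基金", "債券"]):
        print(doc_id, round(score, 3))
    print(highlight("股票基金", ["基金"]))
```

The `index` argument has the same term-to-postings shape as a tf.idf index built at indexing time, so the two stages can share one stored structure.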
