Web Search Query Segmentation Based on Word Adjacency Frequencies in Query Logs and Snippets
Transcript

  • 1. Web {jmiyake, kotsukam, msassano}@yahoo-corp.jp 1 Web 1 Web 3 SVM 2 Bergsma [1] 3.1 SVM Tan [2] 1 Web 5-gram Wikipedia Wang [3] Microsoft Web N-gram Web CountDown
  • 2. 1: 0.915 0.03 0.02 0.013 0.011 ... ... 1: 1 2 1 0.1% iphone4 iphone 4 3-gram 3-gram n-gram q Q q $x_i$ N
    3.2 $q = \{x_0, x_1, x_2, \ldots, x_N\}$, $q \in Q$
    3.2.1 3-gram Q 1 2 q
    $$\max_{q \in Q} \frac{1}{N-1} \sum_{i=1}^{N} \log P(x_i \mid x_{i-2}, x_{i-1})$$
    2 Web
    3.2.2 ( + + Web )
  • 3. 2: 2010 10 1 31 615 Wikipedia 82.4 % 2010 10 1 31 20 4 SVM
    3:
                 Qry-Acc   Seg-Acc
                 0.645     0.937
                 0.617     0.923
                 0.731     0.951
                 0.732     0.953
      SVM +      0.739     0.952
      +          0.773     0.962
      +          0.781     0.962
    Web 4.1 SVM SVM Sassano [4] Neubig [5] 1 ( + ) Bergsma Query Accuracy (Qry-Acc) Segment Accuracy (Seg-Acc) n-gram n-gram SVM Qry-Acc Seg-Acc 0.9 2 1
    4.1.1 10 2 SVM n-gram 100% 5% $x_i, x_{i+1}$ 600 w 2 $x_{i-w+1}, \ldots, x_i, x_{i+1}, \ldots, x_{i+w}$ n-gram 3 n-gram + Qry-Acc Seg-Acc n-gram + n-gram
    1 Yahoo! Japan Web API n-gram http://developer.yahoo.co.jp/webapi/jlp/ma/v1/parse.html
  • 4. 4: SVM 5: SVM Qry-Acc Seg-Acc 2010 10 1 31 + 0.659 0.943 10 + 0.667 0.945 2010 10 1 31 20 5 SVM liblinear $x_i, x_{i+1}$ R L $x_i, x_{i+1}$ I Web ipadic-2.7.0-20070801 Wikipedia ( Wikipedia: 2 Wikipedia: 10 ) SVM
    4.2 SVM SVM 1 Web 10 ( + + ) 4 Qry-Acc Seg-Acc SVM liblinear [6] 5 n-gram 3
    4.3 5 +
    [1] S. Bergsma and Q.I. Wang. Learning noun phrase query segmentation. In Proc. of EMNLP-CoNLL, 2007.
    [2] B. Tan and F. Peng. Unsupervised query segmentation using generative language models and wikipedia. In Proceeding of the 17th international conference on World Wide Web, pp. 347–356. ACM, 2008.
    [3] K. Wang, C. Thrasher, E. Viegas, X. Li, and B.P. Hsu. An overview of Microsoft web N-gram corpus and applications. In Proceedings of the NAACL HLT 2010 Demonstration Session, pp. 45–48. Association for Computational Linguistics, 2010.
    [4] M. Sassano. An empirical study of active learning with support vector machines for Japanese word segmentation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 505–512. Association for Computational Linguistics, 2002.
    [5] Graham Neubig, , . . 16 (NLP2010), , 3 2010.
    [6] R.E. Fan, K.W. Chang, C.J. Hsieh, X.R. Wang, and C.J. Lin. LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research, Vol. 9, pp. 1871–1874, 2008.
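The 3-gram criterion on slide 2 (section 3.2.1) picks, among candidate segmentations Q of a query, the one with the highest length-normalized sum of log P(x_i | x_{i-2}, x_{i-1}). A minimal Python sketch of that selection follows. Everything beyond the formula is an assumption made for illustration: the toy training data, the `<s>` padding, the add-alpha smoothing, normalizing by the number of scored tokens rather than N-1, and all function names are hypothetical and not taken from the slides.

```python
import math
from collections import Counter

PAD = "<s>"  # hypothetical start-of-query padding symbol

def train_counts(sequences):
    """Count 3-grams and their 2-gram contexts over tokenized sequences."""
    tri, ctx = Counter(), Counter()
    for tokens in sequences:
        padded = [PAD, PAD] + list(tokens)
        for i in range(2, len(padded)):
            context = (padded[i - 2], padded[i - 1])
            tri[context + (padded[i],)] += 1
            ctx[context] += 1
    return tri, ctx

def log_prob(tri, ctx, vocab_size, w2, w1, w, alpha=1.0):
    """Add-alpha smoothed log P(w | w2, w1). The smoothing is an assumption."""
    numerator = tri[(w2, w1, w)] + alpha
    denominator = ctx[(w2, w1)] + alpha * vocab_size
    return math.log(numerator / denominator)

def score(tokens, tri, ctx, vocab_size):
    """Mean 3-gram log-probability of one candidate segmentation."""
    padded = [PAD, PAD] + list(tokens)
    total = sum(
        log_prob(tri, ctx, vocab_size, padded[i - 2], padded[i - 1], padded[i])
        for i in range(2, len(padded))
    )
    return total / len(tokens)

def best_segmentation(candidates, tri, ctx, vocab_size):
    """Return the candidate q in Q with the highest normalized score."""
    return max(candidates, key=lambda c: score(c, tri, ctx, vocab_size))

# Toy data standing in for segmented query-log or snippet text.
corpus = [["iphone", "4", "case"], ["iphone", "4"], ["iphone", "5"]]
tri, ctx = train_counts(corpus)
vocab_size = len({t for seq in corpus for t in seq})

candidates = [["iphone4"], ["iphone", "4"]]
best = best_segmentation(candidates, tri, ctx, vocab_size)
print(best)
```

With these toy counts the two-token reading of "iphone4" scores higher because both of its 3-grams were observed, while the unsplit token was never seen.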

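The results tables report Query Accuracy (Qry-Acc) and Segment Accuracy (Seg-Acc), which the transcript attributes to Bergsma [1]; the definitions themselves did not survive extraction. The sketch below assumes the usual reading of those metrics: Seg-Acc is the fraction of correct break/no-break decisions between adjacent tokens, and Qry-Acc is the fraction of queries whose decisions are all correct. The function names and the example queries are hypothetical.

```python
def boundaries(segments):
    """Map a segmentation (list of token lists) to one 0/1 decision
    per gap between adjacent tokens: 1 = segment break, 0 = no break."""
    decisions = []
    for k, segment in enumerate(segments):
        decisions.extend([0] * (len(segment) - 1))
        if k < len(segments) - 1:
            decisions.append(1)
    return decisions

def accuracies(gold, predicted):
    """Return (Qry-Acc, Seg-Acc) over parallel lists of segmentations."""
    queries_correct = 0
    decisions_correct = 0
    decisions_total = 0
    for g, p in zip(gold, predicted):
        gb, pb = boundaries(g), boundaries(p)
        if len(gb) != len(pb):
            raise ValueError("gold and predicted must share the same tokens")
        matches = sum(a == b for a, b in zip(gb, pb))
        decisions_correct += matches
        decisions_total += len(gb)
        queries_correct += int(matches == len(gb))
    qry_acc = queries_correct / len(gold)
    seg_acc = decisions_correct / decisions_total if decisions_total else 1.0
    return qry_acc, seg_acc

gold = [
    [["new", "york"], ["hotel"]],
    [["iphone"], ["4"], ["case"]],
]
predicted = [
    [["new", "york"], ["hotel"]],
    [["iphone", "4"], ["case"]],
]
qry_acc, seg_acc = accuracies(gold, predicted)
print(qry_acc, seg_acc)
```

Under this reading Seg-Acc is always at least as high as Qry-Acc, which matches the pattern in the reported numbers (for example 0.781 versus 0.962).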