20 mins of Liblinear
Published in: Technology, Education

Transcript

  • 1. LIBLINEAR IN 20 MINS. Chandler Huang, previa [at] gmail.com
  • 2. Liblinear
      • SVM: looks for a hyperplane that separates the sample data
      • SVR: looks for a hyperplane that predicts the data distribution
      • Example:
          PASS  Grade  w1   w2   w3     w4
          T     95     4.7  118  1M     172
          T     70     3    121  1.2M   181
          F     55     3.6  102  0.8M   173
          F     48     2.7  108  0.85M  183
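As a minimal sketch of what "a hyperplane" means here: a linear model scores an instance as w·x + b. Classification uses the sign of that score, while regression uses the value itself. The weights and bias below are hand-picked, hypothetical values for illustration only; they are not learned from the table, and only the Grade and w1 columns are used.

```python
def score(w, x, b=0.0):
    """Signed score of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, x, b=0.0):
    """SVM-style decision: report which side of the hyperplane x falls on."""
    return "T" if score(w, x, b) >= 0 else "F"


# Hypothetical weights and bias, chosen by hand (not trained).
W = [1.0, 0.0]
B = -60.0

# (Grade, w1) columns of the four example rows.
ROWS = [(95, 4.7), (70, 3.0), (55, 3.6), (48, 2.7)]

if __name__ == "__main__":
    for row in ROWS:
        print(row, classify(W, row, B), score(W, row, B))
```

A trained model would pick the weights and bias from the data; the decision rule itself stays the same.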
  • 3. Liblinear Both solve with different
  • 4. Python wrapper of Liblinear
      • liblinear.py
          • liblinear = CDLL(path.join(dirname, '../liblinear.so.1'))
          • Classes: feature_node, problem, parameter, model
      • liblinearutil.py
          • import liblinear
          • load_model(), save_model(), evaluations(), train(), predict()
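A rough sketch of how train() and predict() from liblinearutil.py are typically called. The import path depends on how the wrapper was installed, so both layouts tried below are assumptions, and the toy data is made up. Instances are passed as sparse dictionaries mapping feature index to value.

```python
def train_and_predict(y, x, options="-s 1 -c 1 -q"):
    """Train a model on (y, x), then predict the same instances back."""
    try:
        # Assumption: module layout of a pip-installed wrapper.
        from liblinear.liblinearutil import train, predict
    except ImportError:
        # Assumption: running next to liblinearutil.py from the source tree.
        from liblinearutil import train, predict
    model = train(y, x, options)
    # predict() returns predicted labels, an accuracy tuple, decision values.
    p_labels, p_acc, p_vals = predict(y, x, model)
    return p_labels, p_acc


# Toy data (made up): labels and sparse instances {feature_index: value}.
Y = [1, 1, -1, -1]
X = [{1: 1.0, 2: 0.9}, {1: 0.8, 2: 1.0}, {1: -1.0, 2: -0.7}, {1: -0.9, 2: -1.0}]

if __name__ == "__main__":
    try:
        labels, acc = train_and_predict(Y, X)
        print(labels, acc)
    except ImportError:
        print("liblinear Python wrapper is not installed")
```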
  • 5. SOP Text classification Text segmentation Feature selection Train model Verify testing data
  • 6. SOP Text classification Text segmentation N-Gram, HMM Segmentor for Python (Opensource) 囉嗦(Loso) http://opensource.plurk.com/Loso_Chinese_Segmentation_System/ 結巴(jieba) https://github.com/fxsjy/jieba Smallseg https://code.google.com/p/smallseg/
  • 7. SOP Text classification Feature selection Garbage in garbage out EX: Wiki title index http://dumps.wikimedia.org/zhwiktionary/ Libsvm vs Liblinear Libsvm:O(n2) or O(n3) Liblinear: O(n) in practice libsvm becomes painfully slow at 10k samples. http://tinyurl.com/ke4btjv
  • 8. SOP Text classification Train Format Solver type(default 1)0 -- L2-regularized logistic regression (primal)1 -- L2-regularized L2-loss support vector classification(dual)2 -- L2-regularized L2-loss support vector classification(primal)3 -- L2-regularized L1-loss support vector classification(dual)4 -- support vector classification by Crammer and Singer5 -- L1-regularized L2-loss support vector classification
  • 9. SOP Text classification Train -c cost set the parameter C (default 1) -p epsilon set the epsilon in loss function of epsilon-SVR (default 0.1) -e epsilon set tolerance of termination criterion -B bias if bias >= 0, instance x becomes [x; bias]; if < 0, no bias term added (default -1) -wi weight weights adjust the parameter C of different classes (see README for details) -v n n-fold cross validation mode -q quiet mode (no outputs)
  • 10. SOP Text classification Verify testing data Using predict()
  • 11. LIVE DEMO
  • 12. Reference
      • LIBLINEAR
          • A Library for Large Linear Classification http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf
      • L1, L2-Regularization
          • L1 vs. L2 Regularization and feature selection http://cs.nyu.edu/~rostami/presentations/L1_vs_L2.pdf
          • L1-norm Regularization http://cseweb.ucsd.edu/~saul/teaching/cse291s07/L1norm.pdf
          • Sparsity and Some Basics of L1 Regularization http://freemind.pluskid.org/machine-learning/sparsity-and-some-basics-of-l1-regularization/
  • 13. Reference
      • Segmentor
          • A simple test of four Python Chinese word segmentation systems http://hi.baidu.com/fooying/item/6ae7a0e26087e8d7eb34c9e8
          • MMSEG http://technology.chtsai.org/mmseg/
          • OSChina, Chinese word segmentation libraries http://tinyurl.com/k564x9k
  • 14. THANKS
