20 mins of Liblinear

Presentation Transcript

  • LIBLINEAR IN 20 MINS
    Chandler Huang
    previa [at] gmail.com
  • Liblinear SVM: Looking for a hyper-plane to separate sampledata SVR: Looking for a hyper-plane to predict datadistribution Example:PASS Grade w1 w2 w3 w4T 95 4.7 118 1M 172T 70 3 121 1.2M 181F 55 3.6 102 0.8M 173F 48 2.7 108 0.85M 183
  • Liblinear Both solve with different
  • Python wrapper of Liblinear
    liblinear.py
      liblinear = CDLL(path.join(dirname, '../liblinear.so.1'))
      Classes: feature_node, problem, parameter, model
    liblinearutil.py
      import liblinear
      load_model(), save_model(), evaluations(), train(), predict()
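A minimal sketch of how the liblinearutil functions listed above fit together. The import paths are assumptions about how the wrapper is installed (a pip package exposing `liblinear.liblinearutil`, or the flat `liblinearutil` module from a source checkout), and the toy data is made up for illustration. If the library is not available, the helper simply returns None.

```python
try:
    # ASSUMPTION: pip-installed package layout.
    from liblinear.liblinearutil import train, predict
except Exception:
    try:
        # ASSUMPTION: flat module from the python/ directory of a source checkout.
        from liblinearutil import train, predict
    except Exception:
        train = predict = None  # LIBLINEAR is not installed

# Toy data (hypothetical): labels plus sparse {index: value} feature dicts.
y = [1, 1, -1, -1]
x = [
    {1: 1.0, 2: 0.5},
    {1: 0.9, 2: 0.7},
    {1: -1.0, 2: -0.4},
    {1: -0.8, 2: -0.6},
]


def train_and_predict(labels, features, options="-s 1 -c 1 -q"):
    """Train a model and predict on the same data; None if LIBLINEAR is missing."""
    if train is None:
        return None
    model = train(labels, features, options)
    # predict() returns (predicted labels, (ACC, MSE, SCC), decision values).
    p_labels, p_acc, p_vals = predict(labels, features, model, "-q")
    return p_labels


predicted = train_and_predict(y, x)
```

save_model() and load_model() can then persist the trained model between runs.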
  • SOP Text classification Text segmentation Feature selection Train model Verify testing data
  • SOP Text classification Text segmentation N-Gram, HMM Segmentor for Python (Opensource) 囉嗦(Loso) http://opensource.plurk.com/Loso_Chinese_Segmentation_System/ 結巴(jieba) https://github.com/fxsjy/jieba Smallseg https://code.google.com/p/smallseg/
  • SOP Text classification Feature selection Garbage in garbage out EX: Wiki title index http://dumps.wikimedia.org/zhwiktionary/ Libsvm vs Liblinear Libsvm:O(n2) or O(n3) Liblinear: O(n) in practice libsvm becomes painfully slow at 10k samples. http://tinyurl.com/ke4btjv
  • SOP Text classification Train Format Solver type(default 1)0 -- L2-regularized logistic regression (primal)1 -- L2-regularized L2-loss support vector classification(dual)2 -- L2-regularized L2-loss support vector classification(primal)3 -- L2-regularized L1-loss support vector classification(dual)4 -- support vector classification by Crammer and Singer5 -- L1-regularized L2-loss support vector classification
  • SOP Text classification Train -c cost set the parameter C (default 1) -p epsilon set the epsilon in loss function of epsilon-SVR (default 0.1) -e epsilon set tolerance of termination criterion -B bias if bias >= 0, instance x becomes [x; bias]; if < 0, no bias term added (default -1) -wi weight weights adjust the parameter C of different classes (see README for details) -v n n-fold cross validation mode -q quiet mode (no outputs)
  • SOP Text classification Verify testing data Using predict()
  • LIVE DEMO
  • Reference LIBLINEAR A Library for Large Linear Classication http://www.csie.ntu.edu.tw/~cjlin/papers/liblinear.pdf L1, L2-Regularization L1 vs. L2 Regularization and feature selection http://cs.nyu.edu/~rostami/presentations/L1_vs_L2.pdf L1-norm Regularization http://cseweb.ucsd.edu/~saul/teaching/cse291s07/L1norm.pdf Sparsity and Some Basics of L1 Regularization http://freemind.pluskid.org/machine-learning/sparsity-and-some-basics-of-l1-regularization/
  • Reference Segmentor 四款python中文分词系统简单测试 http://hi.baidu.com/fooying/item/6ae7a0e26087e8d7eb34c9e8 MMSEG http://technology.chtsai.org/mmseg/ 開源中國,中文分詞庫 http://tinyurl.com/k564x9k
  • THANKS