

- 1. Machine Learning Interviews – Day 3 Arpit Agarwal
- 2. SMO Algorithm • Optimization problem: • Solve over two α’s at a time instead of all of them at once
- 3. SMO Algorithm
- 4. SMO Algorithm • Input: C, kernel, kernel parameters, epsilon • Initialize b and all α’s to 0 • Repeat until KKT satisfied (to within epsilon): – Find an example e1 that violates KKT (prefer unbound examples here, choose randomly among those) – Choose a second example e2. Prefer one that maximizes the step size (in practice, it is faster to just maximize |E1 – E2|). If that fails to result in a change, randomly choose an unbound example. If that fails, randomly choose any example. If that fails, re-choose e1. – Update α1 and α2 in one step – Compute the new threshold b
- 5. Updating Two α’s: One SMO Step • Given examples e1 and e2, set where: • Clip this value in the natural way: if y1 = y2 then: • otherwise: • Set where s = y1y2
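A minimal sketch of one such two-multiplier step, assuming the standard update from Platt's SMO formulation. The specific formulas in the code (the step along `eta`, the clipping bounds `low` and `high`, and the error convention `E_i = f(x_i) - y_i`) are that assumption and are not taken from the slide text; the function name `smo_step` and its parameters are hypothetical.

```python
def smo_step(alpha1, alpha2, y1, y2, E1, E2, k11, k12, k22, C):
    """One joint update of (alpha1, alpha2), assuming Platt's standard SMO step.

    E1, E2 are prediction errors f(x_i) - y_i; k11, k12, k22 are kernel values
    K(x1, x1), K(x1, x2), K(x2, x2); C is the box-constraint upper bound.
    """
    # Bounds on alpha2 that keep both multipliers in [0, C] while
    # preserving the equality constraint y1*alpha1 + y2*alpha2 = const.
    if y1 == y2:
        low = max(0.0, alpha1 + alpha2 - C)
        high = min(C, alpha1 + alpha2)
    else:
        low = max(0.0, alpha2 - alpha1)
        high = min(C, C + alpha2 - alpha1)

    eta = k11 + k22 - 2.0 * k12  # curvature of the objective along the constraint line
    if eta <= 0 or low >= high:
        # No usable step in this sketch; the caller would pick another pair.
        return alpha1, alpha2

    # Unconstrained optimum for alpha2, then clip it to [low, high].
    alpha2_new = alpha2 + y2 * (E1 - E2) / eta
    alpha2_new = min(high, max(low, alpha2_new))

    # Move alpha1 by the opposite amount (scaled by s) to keep the constraint.
    s = y1 * y2
    alpha1_new = alpha1 + s * (alpha2 - alpha2_new)
    return alpha1_new, alpha2_new


# Two orthogonal points with opposite labels, both alphas starting at 0.
print(smo_step(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 0.0, 1.0, C=1.0))  # (1.0, 1.0)
```

The threshold b is not recomputed here; in the loop on the previous slide that would be the next step after this update.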
- 6. - What is overfitting? How can it be avoided? - “Cross-validation, regularization, bagging” - What is regularization? Why do we need it? - What is the bias-variance tradeoff?
- 7. Overfitting – Curve Fitting
- 8. Overfitting
- 9. Ensemble Methods • JP wants to do the CMO assignment, but he does not know any of the answers. • What will JP do?
- 10. Ensemble Methods • Original training data: D • Step 1: Create multiple data sets D1, D2, ..., Dt-1, Dt • Step 2: Build multiple classifiers C1, C2, ..., Ct-1, Ct • Step 3: Combine the classifiers into C*
- 11. Ensemble Methods • Types: – Bagging (helps reduce the variance of the classifier) – Boosting (AdaBoost) (helps improve the accuracy of the classifier)
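The three steps of slide 10, specialized to bagging, can be sketched as follows. This is an illustrative sketch only: the base classifier (a one-dimensional threshold "stump"), the helper names, and the toy data are all assumptions chosen to keep the example short, not something the slides specify.

```python
import random
from collections import Counter


def bootstrap(data, rng):
    """Step 1: draw a data set of the same size, sampling with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(data):
    """Step 2: fit a threshold t on a 1-D feature; the stump predicts 1 if x > t."""
    best_t, best_err = None, None
    for t in sorted({x for x, _ in data}):
        err = sum(int(x > t) != y for x, y in data)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging(data, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap(data, rng)) for _ in range(n_models)]


def bagged_predict(stumps, x):
    """Step 3: combine the classifiers by majority vote."""
    votes = Counter(int(x > t) for t in stumps)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (feature, label) pairs, label 1 for larger x.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
stumps = bagging(data)
print(bagged_predict(stumps, 0.05), bagged_predict(stumps, 0.95))  # 0 1
```

Each stump sees a slightly different sample, so the individual thresholds vary; voting averages that variation out, which is the variance reduction the slide attributes to bagging.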
- 12. Decision Trees • JP’s very practical problem: “Whether to go to Prakruthi for tea or not?” • Ask Rishabh if he wants to come? – No: Don’t go for tea – Yes: Does Rishabh have money for both of us? Yes: Go for tea; No: Don’t go for tea
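A decision tree of this kind is just a nesting of conditionals, one per internal node. The sketch below assumes the reading of the diagram in which the "No" branch of the first question leads straight to "Don't go for tea"; the function and parameter names are hypothetical.

```python
def go_for_tea(rishabh_wants_to_come: bool, rishabh_has_money_for_both: bool) -> bool:
    """Walk the tree from the root; each `if` is one internal node."""
    if not rishabh_wants_to_come:
        return False  # leaf: don't go for tea
    if rishabh_has_money_for_both:
        return True   # leaf: go for tea
    return False      # leaf: don't go for tea


print(go_for_tea(True, True))   # True
print(go_for_tea(True, False))  # False
```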
- 13. Random Forests • Ensemble method specifically designed for decision tree classifiers • A random forest grows many classification trees (hence the name) • Ensemble of unpruned decision trees • Each base classifier (tree) classifies a “new” input vector • The forest chooses the classification with the most votes (over all the trees in the forest)
- 14. Random Forests • Introduce two sources of randomness: “Bagging” and “Random input vectors” – Each tree is grown using a bootstrap sample of training data – At each node, best split is chosen from random sample of mtry variables instead of all variables
- 15. Random Forests
- 16. Random Forest Algorithm • Given M input variables, a number m << M is specified such that at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node • m is held constant while the forest is grown • Each tree is grown to the largest extent possible • There is no pruning • Bagging with decision trees is the special case of random forests where m = M
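The algorithm described on slides 13 to 16 can be sketched in plain Python as below. This is a simplified illustration under stated assumptions, not a reference implementation: splits are scored with Gini impurity (the slides do not name a criterion), a node whose m sampled variables offer no usable split becomes a majority-vote leaf, labels are assumed not to be tuples, and all function names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (assumed split criterion)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Best (feature, threshold) among the given features, or None if none splits."""
    best, best_score = None, None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best_score is None or score < best_score:
                best, best_score = (f, t), score
    return best


def grow_tree(rows, labels, m, rng):
    """Grow an unpruned tree; every node looks at m randomly selected variables."""
    if len(set(labels)) == 1:
        return labels[0]  # pure node becomes a leaf
    features = rng.sample(range(len(rows[0])), m)  # m out of M variables
    split = best_split(rows, labels, features)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # fallback leaf: majority label
    f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left], m, rng),
            grow_tree([rows[i] for i in right], [labels[i] for i in right], m, rng))


def tree_predict(tree, row):
    """Descend from the root until a leaf (a non-tuple label) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def random_forest(rows, labels, n_trees, m, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], m, rng))
    return forest


def forest_predict(forest, row):
    """The forest's answer is the class with the most votes over all trees."""
    return Counter(tree_predict(t, row) for t in forest).most_common(1)[0][0]
```

Calling `random_forest(rows, labels, n_trees, m)` with `m` equal to the number of columns in `rows` removes the per-node variable sampling, which corresponds to the slide's remark that bagging with decision trees is the m = M special case.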
- 17. Random Forest Algorithm • Good accuracy without over-fitting • Fast algorithm (can be faster than growing/pruning a single tree); easily parallelized • Handles high-dimensional data without much problem • Only one tuning parameter, mtry, and results are usually not sensitive to it
- 18. PCA
