8. Feature engineering
• What is important?
• cross-validation, correlated feature subsets, etc.
• Second-order features
• distribution of requests, banner lifetime, etc.
• Automatic feature engineering.
• “Learning to learn by gradient descent by gradient descent”
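A minimal sketch of second-order features: pairwise products of raw features. The helper name and the feature names are illustrative assumptions, not from the talk; real pipelines would also add ratios and time-window aggregates (request distributions, banner lifetime, etc.).

```python
from itertools import combinations

def second_order_features(row, names):
    """Augment a feature row with pairwise products (second-order features).

    Toy sketch: `row` is a list of raw feature values, `names` their labels.
    """
    feats = dict(zip(names, row))
    for a, b in combinations(names, 2):
        # New cross feature, e.g. "clicks*shows" = clicks * shows.
        feats[f"{a}*{b}"] = feats[a] * feats[b]
    return feats

# Hypothetical ad-serving features.
feats = second_order_features([2.0, 3.0, 5.0], ["clicks", "shows", "age"])
# feats["clicks*shows"] == 6.0
```

Whether such crosses are worth keeping is exactly what cross-validation on correlated subsets is meant to answer.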
9. S2
- Q3. Look behind the obvious
- Expect some internal structure
- [hypothesis + PCA]
- Q4. Use a set of models
- combine them to find the best
- (boosting, voting, random forests, etc.)
- classify: which model is valid for a specific case
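The "hypothesis + PCA" step can be sketched with a toy check for hidden structure via NumPy's SVD; the data and the variance threshold are illustrative assumptions, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with a hidden 1-D structure: two features that are almost
# perfectly correlated, plus a little noise.
t = rng.normal(size=(200, 1))
X = np.hstack([t, 2.0 * t]) + 0.05 * rng.normal(size=(200, 2))

# PCA via SVD of the centered data matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)

# If the hypothesis holds, variance concentrates in the first component.
structure_found = explained[0] > 0.95
```

If `structure_found` is true, the hypothesised internal structure is supported and the first principal component can replace the correlated pair.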
10. S2
- It is not pure machine learning anymore.
- It is research
- into the real decision-making process
- (a fundamental model)
- using ML methods
14. Feedback Loop
• We change the world
• We receive information from changed world
• AdSearch example (by Léon Bottou):
• if a word from the search request appears in the ad description, then the probability of a click is higher
• but: “engagement ring” ~~ “cheap diamonds”
• What to do: Increase variance, violate search-state limits
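The feedback loop and the "increase variance" remedy can be illustrated with a toy ε-greedy ad-serving simulation. All names and numbers here are made up for illustration; this is not Bottou's actual setup.

```python
import random

random.seed(0)
TRUE_CTR = {"A": 0.10, "B": 0.30}   # hidden truth: ad B is actually better

def run(epsilon, rounds=5000):
    """Serve ads, retrain on our own logs: the feedback loop."""
    # Start from a wrong belief learned on old, biased logs:
    # pseudo-counts say A has CTR 0.2 and B only 0.05.
    shows = {"A": 100, "B": 100}
    clicks = {"A": 20, "B": 5}
    for _ in range(rounds):
        scores = {ad: clicks[ad] / shows[ad] for ad in shows}
        if random.random() < epsilon:       # exploration: extra variance
            ad = random.choice(list(scores))
        else:                               # exploitation: serve the "best" ad
            ad = max(scores, key=scores.get)
        # We change the world, then receive information from the changed world.
        shows[ad] += 1
        clicks[ad] += random.random() < TRUE_CTR[ad]
    return {ad: clicks[ad] / shows[ad] for ad in shows}

greedy = run(epsilon=0.0)    # loop never shows B; the wrong belief persists
explore = run(epsilon=0.1)   # added variance lets B reveal its true CTR
```

Without exploration the model only ever sees data generated by its own decisions, so ad B's estimate never moves off its (wrong) prior; a little injected variance breaks the loop.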
15. Hidden Technical Debt in Machine Learning Systems (Google)
• Dependencies
• Model boundaries (CACE: Changing Anything Changes Everything)
• Data
• Feedback loops.
• ML-Antipatterns
• Glue Code, Pipeline jungles, Abstraction debt.
• Measure, prioritize, refactor:
• As software engineering but harder.
16. Conclusion
• Avoid the ‘black box with magic’ approach
• Machine Learning ~~ building approximators
• Try to explore fundamental models
• Track feedback loops
• Keep space for new ideas