Ilya Trofimov, Yandex
trofim@yandex-team.ru
Yandex School of Data Analysis conference
Machine Learning and Very Large Data Sets 2013
[Figure: search engine results page with the user's query, the ads, and the organic results labelled]
• Advertisers select keywords describing their product or service.
• An ad is eligible to appear on the search engine results page if the ad's keyword is a subset of the user's query.
• Example: keyword = “digital camera”
• Possible queries:
  • “buy digital camera”
  • “cheap digital camera”
  • “digital camera samsung”
  • “digital camera magazine”
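The eligibility rule above can be sketched as a word-set subset test. This is a minimal illustration, not the production matching logic: whitespace tokenization, lowercasing, and ignoring word order are assumptions, and `is_eligible` is a hypothetical name.

```python
def is_eligible(keyword: str, query: str) -> bool:
    """Return True if every word of the ad's keyword occurs in the query."""
    # Assumption: words are split on whitespace and compared case-insensitively.
    keyword_words = set(keyword.lower().split())
    query_words = set(query.lower().split())
    return keyword_words <= query_words


queries = [
    "buy digital camera",
    "cheap digital camera",
    "digital camera samsung",
    "digital camera magazine",
    "digital photo frame",  # extra query added here as a negative case
]
for q in queries:
    print(q, is_eligible("digital camera", q))
```

The four queries from the slide pass the test; the added fifth query fails because it lacks the word "camera".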
• The advertiser is charged each time a user clicks the ad.
• Advertisers report their bids.
• Advertisers are selected via the Generalized Second-Price Auction.
• Revenue of Yandex ≈ $\sum_i P(\mathrm{click}_i) \cdot \mathrm{bid}_i$
• The goal is to find $P(\mathrm{click} \mid x)$, where $x$ is a vector of all available input features.
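As a small worked example of the revenue approximation (the sum over shown ads of the click probability times the bid), the sketch below uses made-up probabilities and bids. `expected_revenue` is a hypothetical helper, and the second-price payment rule of the auction is deliberately not modelled.

```python
from typing import List, Tuple


def expected_revenue(ads: List[Tuple[float, float]]) -> float:
    """Sum P(click_i) * bid_i over the shown ads; each ad is (p_click, bid)."""
    return sum(p_click * bid for p_click, bid in ads)


# Made-up values for illustration only.
shown_ads = [(0.10, 2.0), (0.05, 3.0), (0.02, 1.0)]
print(expected_revenue(shown_ads))
```

The example shows why an accurate estimate of the click probability matters: it enters the revenue estimate multiplicatively for every ad.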
• The most important input features are the historical click-through rates (CTR)
• Examples of input features:
  • CTR(ad) = clicks(ad) / views(ad)
  • CTR(web site) = clicks(web site) / views(web site)
  • …
• Text relevance of the query and the ad's text
• User behavior features
• There are 54 real-valued features in total
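The historical CTR features above are plain ratios of counts. A minimal sketch follows; returning 0.0 for an object with no views is an assumption made here, since the slides do not say how unseen ads or sites are handled.

```python
def ctr(clicks: int, views: int) -> float:
    """Historical click-through rate: clicks / views."""
    if views == 0:
        # Assumption: an object that was never shown gets CTR 0.0.
        return 0.0
    return clicks / views


print(ctr(5, 100))
print(ctr(0, 0))
```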
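• Query: “cheap digital camera”, where “digital camera” is the keyword and “cheap” is the residual of the query
• We selected $3.4 \cdot 10^6$ binary text-based features:

$x_{1,k} = 1$ if $\mathrm{word}_k \in$ keyword, otherwise $x_{1,k} = 0$
$x_{2,k} = 1$ if $\mathrm{word}_k \in$ residual of query, otherwise $x_{2,k} = 0$
$x_{km} = 1$ if $(\mathrm{word}_k \in \text{query})$ & $(\mathrm{word}_m \in \text{residual of query})$, otherwise $x_{km} = 0$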
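The three families of binary features (word in the keyword, word in the residual of the query, and the pair "word in the query and word in the residual") can be sketched as a function that returns the names of the active features. The naming scheme (`x1:`, `x2:`, `x:`), the whitespace tokenization, and the function name are assumptions for illustration.

```python
from typing import Set


def text_features(keyword: str, query: str) -> Set[str]:
    """Return the names of the binary text features that equal 1."""
    keyword_words = keyword.lower().split()
    query_words = query.lower().split()
    # Residual of the query: query words that are not part of the keyword.
    residual = [w for w in query_words if w not in keyword_words]

    active = set()
    for w in keyword_words:
        active.add(f"x1:{w}")          # word_k in keyword
    for w in residual:
        active.add(f"x2:{w}")          # word_k in residual of query
    for wk in query_words:
        for wm in residual:
            active.add(f"x:{wk}&{wm}")  # word_k in query and word_m in residual
    return active


print(sorted(text_features("digital camera", "cheap digital camera")))
```

Only active features are stored, which is the usual sparse representation when the full feature space has millions of dimensions.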
• The state-of-the-art solution for the click prediction problem is to use a composition of boosted decision trees:

$$P(\mathrm{click} \mid x) = \frac{1}{1 + \exp\left(-\sum_{i=1}^{n} \alpha_i f(x, a_i)\right)}$$

where $f(x, a_i)$ is a decision tree with weight $\alpha_i$

• Works well for <1000 real-valued features on big datasets (> 1 million examples)
• The problem: we want to use millions of binary features
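To make the composition concrete, the sketch below passes a weighted sum of tree outputs through the logistic function. Depth-one trees (stumps), the weights, and the thresholds are all made up for illustration; `make_stump` and `p_click` are hypothetical names, not part of MatrixNet or any other library.

```python
import math
from typing import Callable, List, Sequence, Tuple

Tree = Callable[[Sequence[float]], float]


def make_stump(feature: int, threshold: float, left: float, right: float) -> Tree:
    """A depth-one decision tree: one split on one real-valued feature."""
    def stump(x: Sequence[float]) -> float:
        return left if x[feature] <= threshold else right
    return stump


def p_click(x: Sequence[float], ensemble: List[Tuple[float, Tree]]) -> float:
    """Logistic function applied to the weighted sum of tree outputs."""
    score = sum(weight * tree(x) for weight, tree in ensemble)
    return 1.0 / (1.0 + math.exp(-score))


# Made-up ensemble over two features, e.g. CTR(ad) and CTR(web site).
ensemble = [
    (1.0, make_stump(0, 0.05, -1.0, 1.0)),
    (0.5, make_stump(1, 0.02, -0.5, 0.5)),
]
print(p_click([0.10, 0.01], ensemble))
```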
• The mixed model is a composition of decision trees and logistic regression, which are fitted sequentially:

$$P(\mathrm{click} \mid x) = \frac{1}{1 + \exp\left(-\sum_{i=1}^{n} \alpha_i f(x, a_i) - \sum_{j=1}^{m} \beta_j z_j\right)}$$

1. Fit $\alpha_i$, $f(x, a_i)$ by means of boosting;
2. Fit $\beta_j$ as a logistic regression with L1-regularization, i.e. with a penalty on $\sum_{j=1}^{m} |\beta_j|$
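The second fitting step can be illustrated with a small pure-Python sketch: the tree score from the first step enters as a fixed per-example offset, and only the coefficients of the binary features are fitted under an L1 penalty. The optimizer used here is plain proximal gradient descent (soft-thresholding), swapped in for brevity; it is not one of the solvers named in the slides, and `fit_l1_with_offset`, `lam`, `step`, and the toy data are assumptions.

```python
import math
from typing import List, Sequence


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


def soft_threshold(v: float, t: float) -> float:
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_with_offset(
    offsets: Sequence[float],        # fixed tree scores from step 1
    rows: Sequence[Sequence[int]],   # indices of active binary features
    labels: Sequence[int],           # 1 = click, 0 = no click
    n_features: int,
    lam: float,                      # L1 penalty strength (assumed name)
    step: float = 0.5,
    epochs: int = 200,
) -> List[float]:
    """Minimize mean log-loss + lam * sum(|beta_j|) with the offsets held fixed."""
    beta = [0.0] * n_features
    n = len(labels)
    for _ in range(epochs):
        grad = [0.0] * n_features
        for offset, active, y in zip(offsets, rows, labels):
            p = sigmoid(offset + sum(beta[j] for j in active))
            for j in active:
                grad[j] += (p - y) / n
        # Gradient step on the log-loss, then shrinkage for the L1 term.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return beta


# Toy data (made up): feature 0 appears only in clicked examples,
# feature 1 appears once with a click and once without.
offsets = [0.0] * 8
rows = [[0], [0], [0], [1], [1], [], [], []]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
beta = fit_l1_with_offset(offsets, rows, labels, n_features=2, lam=0.05)
print(beta)
```

On this toy data the uninformative feature keeps a coefficient of exactly zero, which is the sparsity effect that motivates the L1 penalty when there are millions of binary features.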
• For fitting the composition of decision trees we used MatrixNet
• MatrixNet is a proprietary machine learning algorithm, a modification of the Gradient Boosting Machine (GBM) with stochastic boosting (Friedman, 2002), (Gulin, 2010) (in Russian)
• The training set was randomly sampled from a one-week log of user search sessions
• Training set: $3 \cdot 10^6$ examples
• 54 real-valued features
1. Cyclic coordinate descent
   Implemented in BBR (Genkin et al., 2007)
   http://www.bayesianregression.org/
2. Online learning via truncated gradient
   Implemented in Vowpal Wabbit (Langford et al., 2009)
   https://github.com/JohnLangford/vowpal_wabbit
3. Reducing L1-regularization to L2-regularization (η-trick) (Jenatton et al., 2009)
   Vowpal Wabbit can be used for solving L2-regularized logistic regression
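The reduction in item 3 rests on a standard variational identity for the absolute value. The sketch below states it; the symbols $\eta_j$ and $\lambda$ are notation introduced here, and the exact alternation scheme used in the experiments is not given in the slides.

```latex
|\beta_j| \;=\; \inf_{\eta_j > 0} \; \frac{1}{2}\left( \frac{\beta_j^2}{\eta_j} + \eta_j \right)
\quad\Longrightarrow\quad
\lambda \sum_{j=1}^{m} |\beta_j|
\;=\; \inf_{\eta > 0} \; \frac{\lambda}{2} \sum_{j=1}^{m} \left( \frac{\beta_j^2}{\eta_j} + \eta_j \right)
```

By the inequality of arithmetic and geometric means, the bracket is at least $2|\beta_j|$, with equality at $\eta_j = |\beta_j|$ when $\beta_j \neq 0$. For fixed $\eta$ the penalty is a weighted L2 penalty in $\beta$, so one can alternate between an L2-regularized fit of $\beta$ and the update $\eta_j \leftarrow |\beta_j|$.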
• The datasets were randomly sampled from a one-week log of user search sessions
• Training set: $67 \cdot 10^6$ examples
• Test set: $5 \cdot 10^6$ examples
• $3.4 \cdot 10^6$ unique binary features
• Only features that were non-zero in more than 10 training examples were kept
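The frequency filter in the last bullet can be sketched with a counter over the training examples. Representing each example as the set of its active feature names is an assumption, and `frequent_features` is a hypothetical helper.

```python
from collections import Counter
from typing import Iterable, Set


def frequent_features(examples: Iterable[Set[str]], min_count: int = 10) -> Set[str]:
    """Keep features that are active in more than min_count examples."""
    counts: Counter = Counter()
    for active in examples:
        counts.update(active)
    return {name for name, count in counts.items() if count > min_count}


# Made-up data: "a" and "b" occur 11 times, "c" occurs exactly 10 times.
examples = [{"a", "b"}] * 11 + [{"c"}] * 10
print(sorted(frequent_features(examples)))
```

The comparison is strict, matching the "> 10" in the slide, so a feature seen exactly 10 times is dropped.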
[Figure: ΔauPRC, % (axis ticks from 0.00 to 3.50) versus the number of non-zero coefficients. Series: BBR L1; VW batch LBFGS L2; VW, L1, 1 epoch; VW, L1, 8 epochs; eta-trick]
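The metric on the plot, auPRC, is the area under the precision-recall curve. One common estimator is average precision, sketched below; whether the slides use this estimator, and which baseline the Δ is measured against, is not stated, so this is an illustration only. Ties between scores are not treated specially.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average of precision values at the rank of each positive example."""
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive example is required")
    # Rank examples from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_positives


print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```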
• We selected the model with 2966 non-zero coefficients
• Fitted with BBR; the regularization setting is not recoverable from the extracted text (fragments: 1, 100)
• Words from the residual of a query which increase the probability of a click (translated to English):

Word        β
gold        +0.52
necessary   +0.32
market      +0.23
used        +0.20
effective   +0.19
• Words from the residual of a query which decrease the probability of a click (translated to English):

Word        β
vacancy     -0.40
review      -0.34
site        -0.33
size        -0.15
which       -0.14
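One way to read these coefficients: in a logistic model, switching on a binary feature with coefficient β while everything else in the score stays fixed multiplies the odds of a click by exp(β). The sketch below applies this to the largest positive and negative values from the two tables; `odds_multiplier` is a hypothetical helper, and it ignores any pair features that the same word may also activate.

```python
import math


def odds_multiplier(beta: float) -> float:
    """Factor by which the odds P/(1-P) change when the feature is active."""
    return math.exp(beta)


for word, beta in [("gold", 0.52), ("vacancy", -0.40)]:
    print(word, round(odds_multiplier(beta), 2))
```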
• J. Friedman. Greedy function approximation: A gradient boosting machine. Technical report, Dept. of Statistics, Stanford University, 1999.
• A. Gulin. Matrixnet. Technical report, http://www.ashmanov.com/arc/searchconf2010/08gulin-searchconf2010.ppt, 2010. (in Russian).
• A. Genkin, D. D. Lewis, and D. Madigan. Large-Scale Bayesian Logistic Regression for Text Categorization. Technometrics, 49(3):291–304, Aug. 2007.
• J. Langford, L. Li, and T. Zhang. Sparse Online Learning via Truncated Gradient. Journal of Machine Learning Research, 10:777–801, 2009.
• R. Jenatton, G. Obozinski, and F. Bach. Structured Sparse Principal Component Analysis, 2009.