
Leveraging R in Big Data of Mobile Ads (R在行動廣告大數據的應用)


1. Introduction of Mobile Ads
2. Prediction Model in Mobile Ads
3. Logistic Regression & Matrix Factorization
4. Factorization Machine in R



1. R在行動廣告大數據的應用 / Leveraging R in Big Data of Mobile Ads. Craig Chao (趙國仁), Data Scientist, Vpon Mobile Technology.
2. AdN → RTB. Source: Chandrakanth (2012), Theorem India
3. RTB Bidding Flow. Source: Adsmogo Mobile Ads eXchange v1.2
4. Advertiser Utility: The Value Funnel
  • CPM campaign: Revenue = N/1000 ⋅ CPM
  • CPC campaign: Revenue = N ⋅ CTR ⋅ CPC
  • CPA campaign: Revenue = N ⋅ CTR ⋅ CVR ⋅ CPA
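To make the funnel concrete, a minimal R sketch that evaluates the three revenue formulas; N and every price and rate below are hypothetical example values, not figures from the talk.

    # Value funnel with hypothetical numbers
    N   <- 1e6    # impressions served
    CPM <- 2.0    # price per thousand impressions
    CPC <- 0.5    # price per click
    CPA <- 20.0   # price per action (conversion)
    CTR <- 0.01   # click-through rate
    CVR <- 0.02   # conversion rate per click

    revenue_cpm <- N / 1000 * CPM        # 2000
    revenue_cpc <- N * CTR * CPC         # 5000
    revenue_cpa <- N * CTR * CVR * CPA   # 4000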
5. Internet as a mass media
6. Demand-side platform and its bidding engine in RTB. Landscape? Targeting attributes? Cold start?
7. In-database Processing (MPP)
8. Exploratory Architecture
9. Exploratory Architecture [diagram]: Spark/Hadoop clusters; RRE in TD; multi-core RRE; RRE in Spark; aggregate and export to Tableau.
10. Pricing Engine Framework [diagram]: data injection via Kafka and Avro feeds a speed layer (real-time processors on Spark Streaming) and a batch layer (HDFS, Apache Spark, Jenkins); a serving layer on Couchbase; Akka/Scala actors running in Docker containers.
11. Problem Definition: Find the "best match" between a given user in a given context and a suitable advertisement. -- Dr. Andrei Broder and Dr. Vanja Josifovski, Stanford University. How to identify? Limited information. Budget? Creative? ... Bid price?
12. Sample features and feature families. Source: Chapelle et al. (2014)
13. Bid Landscape Forecasting: Bid Star Tree Expansion. Source: Ying Cui, Ruofei Zhang, Wei Li, Jianchang Mao (2011), Bid Landscape Forecasting in Online Ad Exchange Marketplace, Yahoo! Labs. [star-tree diagram expanding over targeting attributes B and C]
  • Remove paths with few impressions so the tree does not become too sparse
  • Makes it easy to target any combination of target attributes
14. Bidding Price Calculation: Bidding Price = F(base price, CVR)
  • Within the same campaign, the conversion value is the same, so bid = base price * φ
  • φ = CVR / avg CVR, i.e. φ = p(c|u, i) / E_j[ p(c|u, j) ]
  • E_j[ p(c|u, j) ] = Σ_j p(c|u, j) p(j) = p(c|u), where i, j index inventory (on Web/App)
  • Hence φ = p(c|u, i) / p(c|u)
  • All inventories in the same segment are treated the same, so φ = p(c|s, i) / p(c|s); i could also be an inventory cluster
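A minimal R sketch of this calculation, assuming the two conversion probabilities come from some upstream model; the function name and all numbers are illustrative, not production values.

    bid_price <- function(base_price, p_c_s_i, p_c_s) {
      # phi = CVR on this inventory / average CVR over the segment
      phi <- p_c_s_i / p_c_s
      base_price * phi
    }

    bid_price(base_price = 1.5, p_c_s_i = 0.03, p_c_s = 0.02)  # 2.25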
15. Bidding Price Model Building: φ = p(c|s, i) / p(c|s)
  • Cold start: train the segment feature and the inventory feature separately
  • Do not train the combined feature, to prevent overfitting on scarce training data
  • AUC: the area under the ROC curve (true-positive vs. false-positive rate)
  • Lift: target response divided by average response
  DSP cross-campaigns: bid = BasePrice(s, a) * p(c|s, i, a) / p(c|s, a)
16. World, Model & Theory. Credit: John F. Sowa
17. Solutions for CTR Prediction
  • Logistic Regression
  • Matrix Factorization (Netflix)
  • Deep Learning
18. Logistic Regression
19. LM & LR (linear vs. logistic regression; likelihood). Source: http://www.saedsayad.com/logistic_regression.htm. The benefit of normalization is that values become comparable and convergence stays bounded.
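As a hedged illustration of the approach (not code from the talk), base R fits a click-probability model with glm(); the simulated features and coefficients below are invented for the example.

    set.seed(42)
    n   <- 1000
    age <- rnorm(n, 35, 10)                                   # user feature
    app <- factor(sample(c("game", "news"), n, replace = TRUE))  # context feature
    z   <- -3 + 0.05 * age + 0.8 * (app == "game")            # true log-odds
    click <- rbinom(n, 1, 1 / (1 + exp(-z)))                  # simulated labels

    fit <- glm(click ~ age + app, family = binomial())
    head(predict(fit, type = "response"))                     # predicted CTRs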
20. User-based Recommendation
21. Matrix = Associations

           Rose  Navy  Olive
    Alice    0    +4     0
    Bob      0     0    +2
    Carol   -1     0    -2
    Dave    +3     0     0

  • Things are associated, like people to colors
  • Associations have strengths, like preferences and dislikes
  • We can quantify associations: Alice loves navy = +4, Carol dislikes olive = -2
  • We don't know all associations: many implicit zeroes
  Source: Sean Owen (2012), Cloudera
22. From One Matrix, Two
  • Like numbers, matrices can be factored: an m×n matrix = (m×k) times (k×n), i.e. P(m×n) = X(m×k) · Y'(k×n)
  • Interpretation: associations can decompose into other associations
  • Alice likes navy = Alice loves blues, and blues includes navy
  Source: Sean Owen (2012), Cloudera
23. Latent Factors
24. In Terms of a Few Features
  • Can explain associations by appealing to underlying features in common (e.g. "blue-ness")
  • Relatively few of them (one "blue-ness", but many shades)
  [diagram: Alice → Blue → Navy]
  Source: Sean Owen (2012), Cloudera
25. Losing Information is Helpful
  • When k (the number of features) is small, information is lost
  • The factorization is approximate (Alice appears to like blue-ish periwinkle too)
  [diagram: Alice → Blue → Navy, Periwinkle]
  Source: Sean Owen (2012), Cloudera
26. How to Compute? P(m×n) = X(m×k) · Y'(k×n)
27. Singular Value Decomposition: A(m×n) = S(m×k) · Σ(k×k) · T'(k×n)
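A small R sketch of this decomposition, applied to the Alice/Bob/Carol/Dave matrix from slide 21 using the built-in svd() and keeping k = 2 factors; the rank-2 reconstruction is approximate, which is exactly the helpful information loss of slide 25.

    A <- matrix(c( 0, 4,  0,
                   0, 0,  2,
                  -1, 0, -2,
                   3, 0,  0),
                nrow = 4, byrow = TRUE,
                dimnames = list(c("Alice", "Bob", "Carol", "Dave"),
                                c("Rose", "Navy", "Olive")))
    s <- svd(A)                 # A = U %*% diag(d) %*% t(V)
    k <- 2
    A_k <- s$u[, 1:k] %*% diag(s$d[1:k]) %*% t(s$v[, 1:k])
    round(A_k, 2)               # lossy rank-2 approximation of A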
28. Context-aware Matrix Factorization
29. Sample FM Matrix
30. Optimization Perspective
31. Gradient Descent
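To fix notation before the FM slides, a toy gradient-descent loop in R that minimizes f(w) = (w - 3)^2; the step size and starting point are arbitrary choices for the example.

    grad <- function(w) 2 * (w - 3)   # f'(w) for f(w) = (w - 3)^2
    w   <- 0                          # starting point
    eta <- 0.1                        # learning rate
    for (step in 1:50) {
      w <- w - eta * grad(w)          # move against the gradient
    }
    w                                 # ~3, the minimizer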
32. FM with SGD. Source: Rendle, S. (2012)
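For reference, the degree-2 factorization machine model from Rendle (2012), together with the O(kn) rewrite of the pairwise term that the wTx() function on the next slide computes as 0.5 * (tmp1^2 - tmp2):

    \hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
      + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j

    \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
      = \frac{1}{2} \sum_{f=1}^{k} \left[ \Big( \sum_{i=1}^{n} v_{i,f} \, x_i \Big)^{2}
      - \sum_{i=1}^{n} v_{i,f}^{2} \, x_i^{2} \right]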
33. Factorization Machine - R (un-optimized version)

    #
    # Factorization machine
    #
    logis <- function(x) {
      1 / (1 + exp(-x))
    }

    wTx <- function(x, w, V) {         # decision value for one instance x
      p <- dim(V)[1]                   # rows: number of features
      k <- dim(V)[2]                   # columns: number of latent factors
      tmp <- 0
      for (i in 1:k) {
        tmp1 <- 0
        tmp2 <- 0
        for (j in 1:p) {
          tmp1 <- tmp1 + V[j, i] * x[j]
          tmp2 <- tmp2 + (V[j, i] * x[j])^2
        }
        tmp <- tmp + (tmp1^2 - tmp2)
      }
      tmp <- 0.5 * tmp                 # pairwise-interaction term
      x[length(x) + 1] <- 1            # append bias term
      as.numeric(x %*% t(w)) + tmp     # x is all features + bias
    }
34. Factorization Machine - R

    FMlogistic <- function(A, y, At, yt, k, lambda, eta, numiter) {
      #
      # A: input matrix
      # y: label
      # At: test counterpart of A
      # yt: test counterpart of y
      # k: number of latent factors
      # lambda: regularization parameter
      # eta: learning rate
      # numiter: number of iterations
      #
      A.size  <- dim(A)      # [numinst, numfeat]
      numinst <- A.size[1]
      numfeat <- A.size[2]
      nt      <- dim(At)[1]  # number of test instances

      sigma <- 0.1           # standard deviation for random initialization
      # Model parameters theta = (w0, w, V)
      w0 <- matrix(0, 1, numfeat + 1)  # previous feature weights, +1 for bias
      w  <- matrix(0, 1, numfeat + 1)  # updated feature weights, +1 for bias
      # V0 <- matrix(rnorm(numfeat * k, mean = 0, sd = sigma), numfeat, k)  # random numfeat-by-k init
      V0 <- matrix(0.1, numfeat, k)    # previous latent-factor matrix
      V  <- matrix(0, numfeat, k)      # updated latent-factor matrix
35. Factorization Machine - R (continued)

      for (iter in 1:numiter) {
        for (i in 1:numinst) {
          for (j in 1:numfeat) {
            # SGD update of linear weight j (labels y in {-1, +1})
            w[j] <- w0[j] - eta * ((logis(wTx(A[i, ], w0, V0) * y[i]) - 1) * y[i] * A[i, j] + 2 * lambda * w0[j])
            for (numlatent in 1:k) {
              ind <- setdiff(1:numfeat, j)
              hx  <- A[i, j] * sum(V0[ind, numlatent] * A[i, ind])  # d(yhat)/d(V[j, numlatent])
              V[j, numlatent] <- V0[j, numlatent] - eta * ((logis(wTx(A[i, ], w0, V0) * y[i]) - 1) * y[i] * hx + 2 * lambda * V0[j, numlatent])
            }
          }
          # bias update
          w[length(w)] <- w0[length(w0)] - eta * ((logis(wTx(A[i, ], w0, V0) * y[i]) - 1) * y[i] + 2 * lambda * w0[length(w0)])
          V0 <- V
          w0 <- w
        }
        # evaluate on the held-out set after each pass
        yhat <- matrix(0, nt, 1)
        for (i in 1:nt) {
          yhat[i] <- wTx(At[i, ], w, V)
        }
        prob <- 1 / (1 + exp(-yhat))
        yhat[yhat >= 0] <- 1
        yhat[yhat <  0] <- -1
        acc <- sum(yt == yhat) / nt
        cat(sprintf('\n#iter = %d, test accuracy = %f\n', iter, acc))
      }
      return(list(prob, yhat))
    }
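A hypothetical train-and-test call for the FMlogistic() function above, on tiny simulated data with labels in {-1, +1}; all sizes and hyperparameters here are illustrative only.

    set.seed(1)
    A  <- matrix(rbinom(200 * 5, 1, 0.3), 200, 5)  # 200 train instances, 5 features
    y  <- ifelse(A[, 1] + A[, 2] > 0, 1, -1)       # separable toy labels
    At <- matrix(rbinom(50 * 5, 1, 0.3), 50, 5)    # 50 test instances
    yt <- ifelse(At[, 1] + At[, 2] > 0, 1, -1)

    res  <- FMlogistic(A, y, At, yt, k = 2, lambda = 0.01, eta = 0.05, numiter = 5)
    prob <- res[[1]]   # predicted probabilities on At
    yhat <- res[[2]]   # predicted labels in {-1, +1}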
36. Factorization Machine - R, V2 & V3
  • V2: optimized and fast
  • V3: train and test
  • V4: RRE
37. Data Economy: from the traditional to the digital economy. [chart: reach (quantity) vs. richness (quality); the traditional economy trades richness for reach, while the Internet economy pushes both high; the basis of targeting shifts from attributes to behavior]
38. Data-driven Performance [diagram with three axes]
  • Reach: volume of users reached (reach of UU)
  • Richness: data richness, the power source of behavioral forecasting
  • Range: user context, the audience affiliation of the whole context
  Problem-solving thinking: the performance (CTR, CVR, CPI) of the ad network and DSP rests on data, algorithms, and tools.
39. World, Model & Theory. Credit: John F. Sowa
40. Thank you! (謝謝!) Thanks to: Data Team @ Vpon, ManKuan @ NCCU, Yiren @ PUUC. craig.chao@vpon.com, chaocraig@gmail.com
