The presentation discusses the significance of testing and how to execute a successful testing program.

Transcript

  • 1. > Testing for Success < Elements of a Successful Testing Program
  • 2. > Agenda
    § Why Test?
    § Problem Diagnosis
    § Deciding what to Test
    § Test Execution and Measurement
    § Test Reporting
    June 2012 © Datalicious Pty Ltd
  • 3. > Why Test?
  • 4. EVERYONE'S GOT AN OPINION
    1. Why does your business/organisation exist?
    2. How can your business/organisation improve?
  • 5. > Why Test?
    1. Systematic innovation
    2. Avoid costly mistakes
    3. Know why things go right, know why things go wrong
    4. Better employee engagement
    § Requires planning and governance!
  • 6. > Problem Diagnosis
  • 7. > What is the business problem?
    Acquisition, Up-Sell, Retention, Advocacy
    Analytics and metrics frameworks
  • 8. > Case Study
  • 9. > Further Diagnosis
    PROBLEM: Sales through online
    Not enough site traffic
    High home page bounce rate
    Low conversion on product page
    Checkout fallout
  • 10. > Further Diagnosis II
    Source: www.feng-gui.com
  • 11. > Sometimes the small things count
  • 12. > Further diagnosis III
    Wrong message? Wrong channel? Wrong person? Wrong time?
  • 13. > Testing as risk mitigation
    Test Channel     Roll-out Channel:
                     Press                             TV                      Radio                   Outdoor
    eDM/DM           Offer, Creative, Call-to-Action   Offer, Call-to-Action   Offer, Call-to-Action   Call-to-Action
    Paid Search      Offer                             Offer                   Offer                   Offer
    Display Media    Creative, Offer, Call-to-Action   Offer, Call-to-Action   -                       Creative
  • 14. > Testing as standard practice
    [Chart: sales over time for a Test Market and a Control Market (no ATL), showing the % uplift in sales]
  • 15. > Deciding what to Test
  • 16. > Test Options
    Message Components: Product, Proposition, Offer, Creative, Call-to-Action
    Delivery Components: Targeting & Segmentation, Communication Channels, Format, Timing
  • 17. Don't reinvent the wheel
  • 18. > What are the solution(s)?
  • 19. > Consumer Empathy
    What are your visitors trying to achieve by visiting your site?
  • 20. > Consumer Empathy
    1. Make it visible
      – People can't convert if they can't find your 'Buy Now' button
    2. Make it relevant
      – Need to resolve consumer reservations/questions
    3. Make it easy
      – Easy navigation, easy form completion, easy to read, quick page load
  • 21. > Start with the basics…
    1. The headline
      – Have a headline!
      – Headline should be concrete
      – Headline should be the first thing visitors look at
    2. Call to action
      – Don't have too many calls to action
      – Have an actionable call to action
      – Have a big, prominent, visible call to action
    3. Social proof
      – Logos, number of users, testimonials, case studies, media coverage, etc.
  • 22. > Start with the basics…
  • 23. > Case Study
  • 24. > Further Examples
    TEST A / EXISTING
  • 25. > Further Examples
    EXISTING / TEST
  • 26. > Direct Mail Example
    § Two simple objectives
      – Improve response rates
      – Increase amount donated
    § Understanding donor segments
      – Relationship to disease
      – Value
  • 27. > Targeted Comms
    § Relationship to disease
      – Have the disease
      – Parent of someone with the disease
      – Relative / friend of someone with the disease
      – No relationship to the disease
  • 28. > Targeted Comms
    § Value
      – Variable donation boxes based on last donation, increased in increments of 20%
  • 29. > Case Study Results
  • 30. > Deciding What to Test
    Test Selection Checklist
    § Is the measurement infrastructure in place already? [ ✔ ]
    § Can I readily execute the solution? [ ✔ ]
    § Do I have enough sample to draw valid conclusions? [ ✔ ]
    § Will this prove the value of testing in the business? [ ✔ ]
  • 31. > Do you have the reporting?
    For each of Segment X, Y and Z...
    Test Channel (columns): ATL, DM, eDM, Online
    Response Channel (rows):
      Online ✔ ✔
      Mailroom ✔
      Call Centre
      Bricks & Mortar
      Channels in Aggregate ✔
  • 32. > Offline conversions from online
    Tying offline conversions back to online campaign and research behavior using standard cookie technology, by triggering virtual online order confirmation pages for offline sales using email receipts.
    [Diagram: Online Ad Campaign → Website.com Research → Phone Orders, Retail Orders or Online Orders → Virtual Order Confirmation triggered by email (@); a cookie links each step. Online Orders also reach the standard Online Order Confirmation.]
  • 33. > Search call to action for offline
  • 34. > OTP Response
    – Different numbers for different media channels
    – Different numbers for different product categories
    – Different numbers for different conversion steps
    – Call origin becoming useful to shape call script
    – Feasible to pause numbers to improve integrity
    … also phone number reveal.
  • 35. > 'Rule of Thumb'
    § Can be used for indirect sales (resellers) as well as an 'early read' for long campaign cycles
    § Typical approach:
      1. Establish a ratio for website visits or calls to reseller enquiries/sales
      2. Establish a pre-campaign baseline for calls and website visits
      3. Measure the uplift in calls/visits during and following the promotion
      4. Extrapolate to sales using the typical ratio
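The four 'Rule of Thumb' steps can be sketched in a few lines of Python. This is only an illustration: the function name and every number in the usage lines (visit counts, the visits-to-sales ratio) are hypothetical and do not come from the presentation.

```python
def extrapolate_sales(baseline_visits: float, campaign_visits: float,
                      sales_per_visit: float) -> float:
    """Estimate incremental sales from an uplift in visits (or calls).

    baseline_visits  -- pre-campaign baseline (step 2)
    campaign_visits  -- visits measured during/after the promotion (step 3)
    sales_per_visit  -- historical ratio of sales to visits (step 1)
    """
    uplift = campaign_visits - baseline_visits   # step 3: measured uplift
    return uplift * sales_per_visit              # step 4: extrapolate to sales


# Hypothetical figures: 10,000 baseline visits, 12,500 during the campaign,
# and an assumed ratio of 2 reseller sales per 100 visits.
estimated_sales = extrapolate_sales(10_000, 12_500, 0.02)
print(f"Estimated incremental sales: {estimated_sales:.0f}")
```

The estimate is only as good as the ratio in step 1, so it works best as an early read and not as a replacement for measured sales.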
  • 36. > Whose help do you need?
    Technology/IT, UX Agency, Analytics!, Your boss, Your boss' boss, Creative Agency, Customer Contact Management
  • 37. > Proving the Value
    GO BIG
  • 38. > The Importance of a Control
    Here there is no control/benchmark:
    - A separate offer has been run in each month
    - Offer A appears to have out-performed the current offer
    - Offer B appears to have performed worse
    = Offer A appears to win
    [Chart: response rate by month (May, June, July) for New offer A, Standard offer and New offer B]
  • 39. > The Importance of a Control
    Introduction of control/benchmark:
    - The current offer has been run in each month as a benchmark
    - Offer A did not perform as well as the current offer
    - Offer B performed better than the current offer
    = Offer B is the real winner
    [Chart: response rate by month (May, June, July) for New offer A, New offer B and the Standard offer]
  • 40. > Deciding What to Test
    Test Selection Checklist
    § Is the measurement infrastructure in place already? [ ✔ ]
    § Can I readily execute the solution? [ ✔ ]
    § Do I have enough sample to draw valid conclusions? [ ✔ ]
    § Will this prove the value of testing in the business? [ ✔ ]
  • 41. > How much sample do I need?
    [Diagram: inputs to the sample size n: BAU/Baseline Conversion Rate; # of Segments, # of Treatments; Expected Δ in Conversion; Time in Market (Digital Only)]
  • 42. > Statistical Significance
    Q. How much am I willing to accept that the difference in the results between my test group and control group may have been due to chance?
    A. Not much. I want to be confident that if I repeated the test 100 times, then I would observe this difference 95 times.
    This is '95% confidence'
  • 43. > Type I and Type II Error
    Type I: Accept result to be true when it's actually false (false positives)
    Type II: Accept result to be false when it's actually true (false negatives)
  • 44. > Estimating Sample Size (%s)
    $n = (1.645 + 1.282)^2 \cdot \frac{p_1(1 - p_1) + p_2(1 - p_2)}{\Delta^2}$
    Where:
    $n$ = estimated sample size for each group
    $p_1$ = expected conversion rate for your test treatment
    $p_2$ = expected conversion rate for your control treatment
    $\Delta$ = expected minimum percentage point difference between test and control results
    The value of 1.645 reflects that we accept a Type I error probability of .05
    The value of 1.282 reflects that we accept a Type II error probability of .10
  • 45. > Estimating Sample Size (%s)
    Typical Champion (control) vs. Challenger (test) A|B test, typical champion response rate of 2.5%.
    • Only going to replace Champion with Challenger if Challenger response rate is 3.0% (a difference of 0.5 percentage points is meaningful)
    $n = (1.645 + 1.282)^2 \cdot \frac{0.025 \cdot 0.975 + 0.030 \cdot 0.970}{0.005^2}$
    Sample size = 18,326 for each of the Champion and Challenger groups
    If the meaningful difference is 1.0 percentage point ($\Delta = 0.010$, with the same numerator), the sample size is only 4,581 for each group
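As a quick check of the arithmetic, the sample size formula for conversion rates can be written as a short Python function. This is a minimal sketch; the function and constant names are illustrative assumptions, while the z-values and inputs are the ones quoted on the slides.

```python
Z_ALPHA = 1.645  # accepted Type I error probability of .05 (from the slide)
Z_BETA = 1.282   # accepted Type II error probability of .10 (from the slide)


def sample_size_proportions(p_test: float, p_control: float, delta: float) -> float:
    """Per-group sample size for detecting a difference `delta` between two rates."""
    variance = p_test * (1 - p_test) + p_control * (1 - p_control)
    return (Z_ALPHA + Z_BETA) ** 2 * variance / delta ** 2


# Champion at 2.5%, Challenger at 3.0%, meaningful difference of 0.5 points.
n = sample_size_proportions(p_test=0.030, p_control=0.025, delta=0.005)
print(round(n))
```

Because the required sample scales with 1/Δ², doubling the meaningful difference cuts the sample to a quarter when the rates in the numerator are held fixed.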
  • 46. > Estimating Sample Size ($s)
    $n = \frac{(1.645 + 1.282)^2 \cdot (s_1^2 + s_2^2)}{\Delta^2}$
    Where:
    $n$ = number of observations for each group
    $s_1$ = expected standard deviation of value for your test treatment
    $s_2$ = expected standard deviation of value for your control treatment
    $\Delta$ = expected minimum difference in value between test and control results
    The value of 1.645 reflects an accepted Type I error probability of .05
    The value of 1.282 reflects an accepted Type II error probability of .10
  • 47. > Standard Deviation
    Standard deviation is a measure of the variability of your results: whether some of your results are quite different to your mean (average) result or whether they are quite similar.
    $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
    Where:
    $n$ = number of observations
    $x_i$ = the result for the ith observation
    $\bar{x}$ = mean (average) for your data
  • 48. > Estimating Sample Size ($s)
    Typical Champion (control) vs. Challenger (test) A|B test, typical champion mean response value of $20, typical response rate of 5%
    • Only going to replace Champion with Challenger if Challenger mean response value is $30 ($10 is a meaningful difference)
    • Standard deviation of Champion results is $5 (based on past results). We'll assume the same for the Challenger.
    $n = \frac{(1.645 + 1.282)^2 \cdot (5^2 + 5^2)}{10^2}$
    Number of observations = 4.3 (~5) for each of the Champion and Challenger groups.
    Then divide through by the expected response rate to get a minimum sample size of 86 for each of the Champion and Challenger groups (4.3/0.05)
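The dollar-value version of the calculation follows the same pattern. The sketch below reproduces the slide's worked example; the function names are illustrative assumptions, and the response-rate adjustment is the "divide through" step described above.

```python
import math

Z_ALPHA = 1.645  # accepted Type I error probability of .05 (from the slide)
Z_BETA = 1.282   # accepted Type II error probability of .10 (from the slide)


def observations_for_means(s_test: float, s_control: float, delta: float) -> float:
    """Responders needed per group to detect a difference `delta` in mean value."""
    return (Z_ALPHA + Z_BETA) ** 2 * (s_test ** 2 + s_control ** 2) / delta ** 2


def mail_quantity(observations: float, response_rate: float) -> int:
    """Pieces to send per group so that the expected responders reach `observations`."""
    return math.ceil(observations / response_rate)


obs = observations_for_means(s_test=5, s_control=5, delta=10)
print(f"{obs:.1f} responders, {mail_quantity(obs, 0.05)} mailed per group")
```

With so few responders the normal approximation is rough, so treat a result this small as a floor and not a target.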
  • 49. > Further Complexity I
    If we wanted to test the performance of Challenger vs. Champion for different segments of consumers:
    Response Rate   Champion   Challenger
    Segment A       %          %
    Segment B       %          %
    Segment C       %          %
    Using the same assumptions as in the earlier example, we need 18,326 per cell, 18,326*6=109,956 in total.
  • 50. > Further Complexity II
    If we wanted to test the performance of Challenger vs. Champion for different segments of consumers AND had 3 different types of Challenger creative:
    Response Rate   Champion/Control   Challenger #1   Challenger #2   Challenger #3
    Segment A       %                  %               %               %
    Segment B       %                  %               %               %
    Segment C       %                  %               %               %
    Using the same assumptions as in the earlier example, we need 18,326 per cell, 18,326*12=219,912 in total.
  • 51. > Further Complexity III
    If we wanted to test the performance of Challenger creative that was specifically customised for different segments of consumers, then we're actually only running 6 tests
    Response Rate   Champion/Control   Challenger #1   Challenger #2   Challenger #3
    Segment A       %                  %
    Segment B       %                                  %
    Segment C       %                                                  %
    Using the same assumptions as in the earlier example, we need 18,326 per cell, 18,326*6=109,956 in total.
  • 52. > Multivariate Testing (MVT)
    Multivariate Testing (commonly called MVT) is a term used for testing different variations of typical elements of a landing page, direct mail letter, etc. The aim is to determine which combination delivers the best result.
    § Element #1
      – 2 variations (1 existing, 1 new)
    § Element #2
      – 2 variations (1 existing, 1 new)
    § Element #3
      – 2 variations (1 existing, 1 new)
    [Page layout: Element #1: Prominent headline; Supporting content; Element #2: Call to action; Element #3: Social proof / trust; Terms and conditions]
  • 53. > MVT – Full Factorial
    A full factorial design tests every unique combination of page elements and can therefore be very sample hungry.
    Treatment   Headline   Call to Action   Social Proof
    1           H1         CTA1             SP1
    2           H1         CTA1             SP2
    3           H1         CTA2             SP1
    4           H1         CTA2             SP2
    5           H2         CTA1             SP1
    6           H2         CTA1             SP2
    7           H2         CTA2             SP1
    8           H2         CTA2             SP2
    To calculate the number of treatments, multiply the number of variations for each factor together: 2 x 2 x 2 = 8
  • 54. > MVT – Fractional Factorial
    The alternative is called a fractional factorial design, which tests a smaller subset of the element combinations. The design should be 'balanced': every variation is tested the same number of times and each combination of variations occurs the same number of times.
    Treatment   Headline   Call to Action   Social Proof
    1
    2           H1         CTA1             SP2
    3           H1         CTA2             SP1
    4
    5           H2         CTA1             SP1
    6
    7
    8           H2         CTA2             SP2
    Reduced sample requirements: 4 x 18,326 = 73,304
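The two designs can be enumerated and the 'balanced' property checked mechanically. The sketch below is an illustration only: the `is_balanced` helper is a hypothetical name, and it tests balance for single variations and for pairs of elements, which is one reasonable reading of the definition on the slide.

```python
from collections import Counter
from itertools import combinations, product

levels = [["H1", "H2"], ["CTA1", "CTA2"], ["SP1", "SP2"]]

# Full factorial: every combination, in the same order as the slide's table.
full = list(product(*levels))

# Fractional factorial: treatments 2, 3, 5 and 8 from the slide.
fraction = [full[i - 1] for i in (2, 3, 5, 8)]


def is_balanced(design, levels):
    """True if every variation, and every pair of variations, occurs equally often."""
    for i, factor_levels in enumerate(levels):
        counts = Counter(row[i] for row in design)
        if set(counts) != set(factor_levels) or len(set(counts.values())) != 1:
            return False
    for i, j in combinations(range(len(levels)), 2):
        counts = Counter((row[i], row[j]) for row in design)
        expected_pairs = len(levels[i]) * len(levels[j])
        if len(counts) != expected_pairs or len(set(counts.values())) != 1:
            return False
    return True


print(len(full), len(fraction), is_balanced(fraction, levels))
```

A subset such as treatments 1 to 4 would fail this check because Headline 2 is never shown.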
  • 55. > Layout Before Content
    § Phase #1: A|B test
      – Test the same landing page content in completely different layouts
    § Phase #2: MV test
      – Then test different content element combinations within the winning layout
    § Phase #3: MV test (if req'd)
      – Test with reduced set of elements
    [Page layout: Element #1: Prominent headline; Supporting content; Element #2: Call to action; Element #3: Social proof / trust; Terms and conditions]
  • 56. > Case Study
    § Yes, the measurement infrastructure is in place
    § I can readily execute the test design
    § I have enough sample to draw valid conclusions
    § Yes, this design will prove the value of testing in my business
  • 57. > Execution & Measurement
  • 58. Before you leap…
  • 59. > Sample Selection
    § Each sample needs to be alike in terms of its predisposition to conversion
    Conversion: low rate credit card application form completion
    TEST                CONTROL
    18-34               35-64
    Mostly Male         Mostly Female
    Mostly Low Income   Mostly High Income
  • 60. > Timing is Important
    [Chart: sales over time, marking a 'Burst' non-BAU ATL campaign and the ideal test window]
  • 61. > A|A Testing
    § Set up a test that splits your visitors 50/50 between the same treatment
      – Check that sample sizes are actually 50/50
      – There should be no difference in your conversion rates
      – Are volumes of conversions matching other reporting?
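The first two A|A checks can be automated. The sketch below is one possible approach and not the presenter's method: it uses a normal approximation for the 50/50 split and a 95% confidence interval for the difference in conversion rates, and all visitor and conversion counts in the usage line are hypothetical.

```python
import math


def aa_check(n_a: int, conv_a: int, n_b: int, conv_b: int, z: float = 1.96):
    """Return (split_ok, rates_ok) for an A|A test.

    split_ok -- the observed split is consistent with 50/50
    rates_ok -- the difference in conversion rates is within sampling noise
    """
    total = n_a + n_b
    # Under a true 50/50 split, n_a has mean total/2 and variance total/4.
    split_z = (n_a - total / 2) / math.sqrt(total * 0.25)
    p_a, p_b = conv_a / n_a, conv_b / n_b
    half_width = z * math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return abs(split_z) <= z, abs(p_a - p_b) <= half_width


# Hypothetical A|A result: 5,000 vs 5,040 visitors, 250 vs 262 conversions.
print(aa_check(5000, 250, 5040, 262))
```

The third check, matching conversion volumes against other reporting, still needs a manual comparison with your analytics or sales systems.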
  • 62. > Measuring your performance
    § Proportions (conversion rates)
    § Means (average $s)
    § Variability of Means (standard deviation)
    Would my winning treatment still be the winner across all my customers/visitors/consumers?
    § Use confidence intervals
  • 63. > Confidence Intervals
    [Charts: confidence intervals by treatment (A, B, C) for Conversion Rate and for Revenue per Response]
  • 64. > Confidence Intervals
  • 65. > Confidence Interval (%s)
    $\hat{p} \pm 1.96 \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
    Where:
    $\hat{p}$ = response rate
    $n$ = sample size for treatment
    The value of 1.96 reflects a 95% confidence level
  • 66. > Confidence Interval Estimation
    Typical Champion (control) vs. Challenger (test) A|B Test
    Treatment       Champion   Challenger
    Mailed          60850      52812
    Responses       1055       455
    Response Rate   1.7%       0.9%
    Champion: $1.7\% \pm 1.96 \cdot \sqrt{\frac{.017(1 - .017)}{60850}} = 1.7\% \pm 0.10\%$
    Challenger: $0.9\% \pm 1.96 \cdot \sqrt{\frac{.009(1 - .009)}{52812}} = 0.9\% \pm 0.08\%$
    1.60% ≤ Champion ≤ 1.80%
    0.82% ≤ Challenger ≤ 0.98%
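The interval arithmetic above is easy to verify in code. This is a minimal sketch with an illustrative function name; it uses the rounded response rates from the slide (1.7% and 0.9%) so that the output lines up with the worked example.

```python
import math


def proportion_ci(p: float, n: int, z: float = 1.96):
    """Confidence interval for a response rate p observed on n pieces mailed."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


for name, rate, mailed in [("Champion", 0.017, 60850), ("Challenger", 0.009, 52812)]:
    low, high = proportion_ci(rate, mailed)
    print(f"{low:.2%} <= {name} <= {high:.2%}")
```

The two intervals do not overlap, which is a first indication that the Champion's higher response rate is not a chance result.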
  • 67. > Confidence Interval Estimation
    $p_1 - p_2 \pm 1.96 \cdot \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
    Where:
    $p_1$ = response rate for challenger
    $p_2$ = response rate for champion
    $n_1$ = sample size for challenger
    $n_2$ = sample size for champion
    The value of 1.96 reflects a 95% confidence level
  • 68. > Confidence Interval Estimation
    Typical Champion (control) vs. Challenger (test) A|B Test
    Treatment       Champion   Challenger
    Mailed          60850      52812
    Responses       1055       455
    Response Rate   1.7%       0.9%
    $0.9\% - 1.7\% \pm 1.96 \cdot \sqrt{\frac{.009(1 - .009)}{52812} + \frac{.017(1 - .017)}{60850}} = -0.8\% \pm 0.13\%$
    -0.93% ≤ Difference Between Challenger and Champion ≤ -0.67%
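The same check works for the difference between two rates. The sketch below is illustrative (the function name is an assumption) and again uses the slide's rounded rates so the result matches the worked example.

```python
import math


def difference_ci(p1: float, n1: int, p2: float, n2: int, z: float = 1.96):
    """Confidence interval for p1 - p2, the difference between two response rates."""
    half_width = z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - half_width, diff + half_width


# Challenger (0.9% of 52,812) minus Champion (1.7% of 60,850).
low, high = difference_ci(0.009, 52812, 0.017, 60850)
print(f"{low:.2%} <= difference <= {high:.2%}")
```

Because the whole interval sits below zero, the Challenger's response rate is lower than the Champion's at the 95% confidence level.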
  • 69. > Control Group Sample Size
    $p_1 - p_2 \pm 1.96 \cdot \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
    Rearranged:
    $n_c = \frac{p_c(1 - p_c)}{\left(\frac{m}{1.96}\right)^2 - \frac{p_t(1 - p_t)}{n_t}}$
    Where:
    $n_c$ = sample size for control group
    $n_t$ = sample size for test group
    $p_c$ = forecast response rate for control group
    $p_t$ = forecast response rate for test group
    $m$ = desired level of precision (% that is a meaningful difference)
    The value of 1.96 reflects a 95% confidence level
  • 70. > Control Group Sample Size
    We have 50,000 customers that we could include in our test design. What would our control sample need to be if we tested 40,000 customers, our 'natural' cross-sell rate was 1.0% and an incremental response rate of 1.0 percentage point would be deemed to be meaningful?
    $n_c = \frac{.01(1 - .01)}{\left(\frac{.01}{1.96}\right)^2 - \frac{.02(1 - .02)}{40{,}000}}$
    $n_c \approx 387$
    This result suggests we could actually test more of our available customer base than we might have initially expected (~49,600 of the 50,000).
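The rearranged formula can be checked with a few lines of Python. This is a sketch under the slide's assumptions (1.0% control rate, 2.0% test rate, 40,000 tested, 1.0 point precision); the function name and the guard against a non-positive denominator are additions for illustration.

```python
def control_group_size(p_control: float, p_test: float, n_test: int,
                       margin: float, z: float = 1.96) -> float:
    """Control group size needed to hold the margin of error on the difference to `margin`."""
    denominator = (margin / z) ** 2 - p_test * (1 - p_test) / n_test
    if denominator <= 0:
        # The test group alone already uses up the whole error budget.
        raise ValueError("test group too small for the requested precision")
    return p_control * (1 - p_control) / denominator


n_c = control_group_size(p_control=0.01, p_test=0.02, n_test=40_000, margin=0.01)
print(f"Control group needed: about {n_c:.0f}")
```

In practice the result would be rounded up, and a larger control group costs little when the available base is 50,000.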
  • 71. > Confidence intervals ($s)
    $\bar{x} \pm 1.96 \cdot \frac{s}{\sqrt{n}}$
    Where:
    $\bar{x}$ = mean revenue among treatment responders
    $s$ = standard deviation of revenue among the treatment's responders
    $n$ = number of responders to the treatment
    The value of 1.96 reflects a 95% level of confidence.
  • 72. > Standard Deviation (reminder)
    Standard deviation is a measure of the variability of your results: whether some of your results are quite different to your mean (average) result or whether they are quite similar.
    $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
    Where:
    $n$ = number of observations
    $x_i$ = the result for the ith observation
    $\bar{x}$ = mean (average) for your data
  • 73. > Confidence intervals ($s)
    $\bar{x}_1 - \bar{x}_2 \pm 1.96 \cdot \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
    Where:
    $\bar{x}_1$ = mean value among responders to a treatment
    $\bar{x}_2$ = mean value among responders to a different treatment
    $s_1$ = std. dev. of value among one treatment's responders
    $s_2$ = std. dev. of value among the other treatment's responders
    $n_1$ = number of responders to the treatment
    $n_2$ = number of responders to the other treatment
    The value of 1.96 reflects a 95% level of confidence.
    This assumes $n_1$ and $n_2$ are sufficiently large to estimate the std. dev. in the population with the std. dev. of the sample.
  • 74. > Confidence intervals ($s)
    Typical Champion (control) vs. Challenger (test) A|B Test
    Treatment       Champion   Challenger
    Mailed          60850      52812
    Responses       1055       455
    Response Rate   1.7%       0.9%
    Total Value     $36,925    $38,675
    Mean Value      $35        $85
    Std Dev         $30        $50
    $85 - 35 \pm 1.96 \cdot \sqrt{\frac{50^2}{455} + \frac{30^2}{1055}} = 50 \pm 4.9$
    At a minimum, we should expect an incremental $45.1 per response if we rolled out the Challenger creative as BAU (although our total amount of incremental revenue would be less).
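The dollar-value interval can be reproduced the same way. The sketch below uses the figures from the table above; the function name is an illustrative assumption.

```python
import math


def mean_difference_ci(mean1: float, sd1: float, n1: int,
                       mean2: float, sd2: float, n2: int, z: float = 1.96):
    """Confidence interval for the difference in mean value per responder."""
    half_width = z * math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - half_width, diff + half_width


# Challenger: mean $85, std dev $50, 455 responders.
# Champion:   mean $35, std dev $30, 1,055 responders.
low, high = mean_difference_ci(85, 50, 455, 35, 30, 1055)
print(f"${low:.1f} <= difference in mean value <= ${high:.1f}")
```

The lower bound is the "at a minimum" figure quoted on the slide. It is a per-responder value, so it needs to be weighed against the Challenger's lower response rate.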
  • 75. > Case Study
  • 76. > Main Effects
  • 77. > Main Effects
    Typical Landing Page Test
    Treatment   Headline   Call to Action   Social Proof   Visitors Tested   Conversions   Conversion Rate
    1           H1         CTA1             SP1            1237              456           37%
    2           H1         CTA1             SP2            1456              345           24%
    3           H1         CTA2             SP1            1245              234           19%
    4           H1         CTA2             SP2            2123              432           20%
    5           H2         CTA1             SP1            1342              234           17%
    6           H2         CTA1             SP2            1102              123           11%
    7           H2         CTA2             SP1            1365              700           51%
    8           H2         CTA2             SP2            1243              643           52%
    Treatments #7 and #8 were the clear winners, and it looks as if the Headline and Call-to-Action were much bigger drivers of positive performance than the Social Proof. Let's check this!
  • 78. > Main Effects
    Typical Landing Page Test
    Treatment   Headline   Call to Action   Social Proof   Visitors Tested   Conversion Rate
    1           H1         CTA1             SP1            1237              37%
    2           H1         CTA1             SP2            1456              24%
    3           H1         CTA2             SP1            1245              19%
    4           H1         CTA2             SP2            2123              20%
    Avg H1=24%
    5           H2         CTA1             SP1            1342              17%
    6           H2         CTA1             SP2            1102              11%
    7           H2         CTA2             SP1            1365              51%
    8           H2         CTA2             SP2            1243              52%
    Avg H2=33%
    The Main Effect of the Headline is simply the (weighted) average conversion rate for Headline 2 less the (weighted) average conversion rate for Headline 1 (33%-24%=9%)
  • 79. > Main Effects
    Typical Landing Page Test
    Element          Main Effect
    Headline         9.4%
    Call to Action   11.1%
    Social Proof     5.3%
    In actual fact, it was variations in Call to Action that had the most positive impact on our results, improving conversions by 11.1 percentage points.
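The main effects can be recomputed from the visitor and conversion counts in the landing page test table. The sketch below is an illustration with hypothetical helper names; it pools visitors and conversions for each variation, which is the weighted average described above.

```python
# (headline, call to action, social proof, visitors tested, conversions)
ROWS = [
    ("H1", "CTA1", "SP1", 1237, 456),
    ("H1", "CTA1", "SP2", 1456, 345),
    ("H1", "CTA2", "SP1", 1245, 234),
    ("H1", "CTA2", "SP2", 2123, 432),
    ("H2", "CTA1", "SP1", 1342, 234),
    ("H2", "CTA1", "SP2", 1102, 123),
    ("H2", "CTA2", "SP1", 1365, 700),
    ("H2", "CTA2", "SP2", 1243, 643),
]


def pooled_rate(rows, column: int, level: str) -> float:
    """Weighted average conversion rate over all treatments using `level`."""
    visitors = sum(r[3] for r in rows if r[column] == level)
    conversions = sum(r[4] for r in rows if r[column] == level)
    return conversions / visitors


def main_effect(rows, column: int, level_a: str, level_b: str) -> float:
    """Pooled rate for level_b less the pooled rate for level_a."""
    return pooled_rate(rows, column, level_b) - pooled_rate(rows, column, level_a)


print(f"Headline:       {main_effect(ROWS, 0, 'H1', 'H2'):+.1%}")
print(f"Call to Action: {main_effect(ROWS, 1, 'CTA1', 'CTA2'):+.1%}")
print(f"Social Proof:   {main_effect(ROWS, 2, 'SP1', 'SP2'):+.1%}")
```

Computed in this direction the Social Proof effect comes out negative (SP2 converts less than SP1); the 5.3% in the table is its magnitude.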
  • 80. > Interaction Effects
    Typical Landing Page Test
    Treatment   Headline   Call to Action   Social Proof   Visitors Tested   Conversion Rate
    1           H1         CTA1             SP1            1237              37%
    2           H1         CTA1             SP2            1456              24%
    7           H2         CTA2             SP1            1365              51%
    8           H2         CTA2             SP2            1243              52%
    3           H1         CTA2             SP1            1245              19%
    4           H1         CTA2             SP2            2123              20%
    5           H2         CTA1             SP1            1342              17%
    6           H2         CTA1             SP2            1102              11%
    An interaction effect is present where the performance of one element is dependent on which variation of another element is present. In this example, we are looking at whether the results for each of the Headlines are dependent on which Call-to-Action is shown.
  • 81. > Interaction Effects
    Typical Landing Page Test
    Treatment   Headline   Call to Action   Social Proof   Visitors Tested   Conversion Rate
    1           H1         CTA1             SP1            1237              37%
    2           H1         CTA1             SP2            1456              24%
    Wtd Avg H1CTA1=30%
    3           H1         CTA2             SP1            1245              19%
    4           H1         CTA2             SP2            2123              20%
    Wtd Avg H1CTA2=20%
    5           H2         CTA1             SP1            1342              17%
    6           H2         CTA1             SP2            1102              11%
    Wtd Avg H2CTA1=14%
    7           H2         CTA2             SP1            1365              51%
    8           H2         CTA2             SP2            1243              52%
    Wtd Avg H2CTA2=51%
    The first step is to create weighted average response rates for each combination of the two factors (ignoring Social Proof).
  • 82. > Interaction Effects
    Typical Landing Page Test
    Headline   CTA1   CTA2   Diff
    H1         30%    20%    -10%
    H2         14%    51%    37%
    Diff       -16%   31%
    [Chart: conversion rate (0% to 60%) for H1 and H2, with one line each for CTA1 and CTA2]
    The next step is to calculate the difference in performance of one factor across different variants of the other factor. If the difference of this difference is non-zero (or not very close to zero), then you have an interaction effect.
    For example, there is an interaction effect between the Headline and Call to Action as the difference in the difference in performance is non-zero (31%-(-16%)=47%). This is a very large interaction when compared to the Main Effects!
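The difference-in-differences calculation can be sketched as follows. The visitor and conversion counts are taken from the landing page test table earlier in the deck; the helper names are illustrative assumptions. Because the code works from unrounded rates, its result (about 46.9 points) differs slightly from the 47% obtained from the rounded table.

```python
# (headline, call to action, social proof, visitors tested, conversions)
ROWS = [
    ("H1", "CTA1", "SP1", 1237, 456),
    ("H1", "CTA1", "SP2", 1456, 345),
    ("H1", "CTA2", "SP1", 1245, 234),
    ("H1", "CTA2", "SP2", 2123, 432),
    ("H2", "CTA1", "SP1", 1342, 234),
    ("H2", "CTA1", "SP2", 1102, 123),
    ("H2", "CTA2", "SP1", 1365, 700),
    ("H2", "CTA2", "SP2", 1243, 643),
]


def cell_rate(rows, headline: str, cta: str) -> float:
    """Weighted average conversion rate for one Headline x Call to Action cell."""
    cell = [r for r in rows if r[0] == headline and r[1] == cta]
    return sum(r[4] for r in cell) / sum(r[3] for r in cell)


def headline_cta_interaction(rows) -> float:
    """Headline effect under CTA2 less the headline effect under CTA1."""
    effect_under_cta1 = cell_rate(rows, "H2", "CTA1") - cell_rate(rows, "H1", "CTA1")
    effect_under_cta2 = cell_rate(rows, "H2", "CTA2") - cell_rate(rows, "H1", "CTA2")
    return effect_under_cta2 - effect_under_cta1


print(f"Interaction: {headline_cta_interaction(ROWS):.1%}")
```

A value near zero would mean the headline effect is the same whichever call to action is shown; here the headline effect flips sign between CTA1 and CTA2.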
  • 83. > Interaction Effects
    Typical Landing Page Test
    Headline   SP1   SP2   Diff
    H1         28%   22%   -6%
    H2         34%   33%   -1%
    Diff       6%    11%
    [Chart: conversion rate (0% to 40%) for H1 and H2, with one line each for SP1 and SP2]
    Call to Action   SP1   SP2   Diff
    CTA1             27%   18%   -9%
    CTA2             36%   32%   -4%
    Diff             9%    14%
    [Chart: conversion rate (0% to 40%) for CTA1 and CTA2, with one line each for SP1 and SP2]
  • 84. > Reporting
  • 85. Document Everything!
  • 86. > 1. Describe the test
    § Describe the outcome(s) you're trying to influence
    § Describe your target audience
    § Describe the different treatments including copies of creative
  • 87. > 2. Justify the test design
    § Detail why you've chosen the particular outcome you're trying to influence
    § Detail why you've chosen the consumers you are trying to influence
    § Detail why your intervention should work
      – Past test results/Usability test/Case studies
      – Marketer's intuition/logic
  • 88. > 3. Results & Conclusions
    § Detail all the performance results: did you make money?
    § Discuss your hypotheses
    § Future tests
    § 'Meta' reporting of your test program
  • 89. > Not just statistical significance
    Do a sense-check when interpreting results:
    § What was the competition doing when this test was running?
    § Just because this worked in one location, does it mean it will work in another?
    § The offer was successful in Summer. Would it still work in Winter?
    § Were there any other abnormal factors in the marketplace which might have affected the response?
  • 90. > The Scientific Method
    [Diagram: cycle of Knowledge, Establish Facts, Develop Test(s), Data]
  • 91. > Case Study
  • 92. > List of (Some) Resources
    § http://visualwebsiteoptimizer.com/case-studies.php
    § http://www.whichtestwon.com/
    § http://www.feng-gui.com
    § http://www.smashingmagazine.com/2010/06/24/the-ultimate-guide-to-a-b-testing
  • 93. Contact us: msavio@datalicious.com
    Learn more: blog.datalicious.com
    Follow us: twitter.com/datalicious
  • 94. Data > Insights > Action