SciPy and Real-time Big Data for Site Optimization
[Diagram: Storm topology. Components: Spout, Spout, Pyleus Message Processor Bolt, Pyleus Event Worker Bolt, Pyleus SciPy Optimizer Bolt, Pyleus Update Messenger Bolt, SciPy Bayesian Bandit, Application State. Labels: Visitors to Bankrate.com; Impressions and Clicks; Which variation to show; Improve User Experience.]
For more info, contact: Winnie.Cheng@bankrate.com
Bankrate.com Data Science and Engineering Team
Objective for Site Optimization:
Enable fast and cost-efficient ways of testing new designs to improve user experience.
Example: Pick better story headlines
Algorithmically decide which of two headlines to show each user to maximize click-through rate (CTR).
Computation Framework with Kafka-Storm
A Bayesian multi-armed bandit algorithm running on a Storm topology decides how often to show each variation by analyzing impressions and clicks.
Simulation Results
With more data, the algorithm becomes more confident of the estimated CTR for each variation.
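The poster does not spell out how the bandit turns impressions and clicks into display weights. The following is a minimal sketch, assuming a uniform Beta(1, 1) prior on each variation's CTR and a weight equal to the posterior probability that the variation has the highest CTR. The function names `variation_weights` and `ctr_interval`, the draw count, and the 95% level are illustrative choices, not the team's code.

```python
import numpy as np
from scipy.stats import beta


def variation_weights(impressions, clicks, n_draws=20000, seed=0):
    """Estimate, per variation, the probability that it has the highest CTR.

    Assumption: each CTR has a Beta(1, 1) prior, so the posterior is
    Beta(1 + clicks, 1 + impressions - clicks).
    """
    impressions = np.asarray(impressions, dtype=float)
    clicks = np.asarray(clicks, dtype=float)
    rng = np.random.default_rng(seed)
    a = 1.0 + clicks
    b = 1.0 + impressions - clicks
    # One row per Monte Carlo draw, one column per variation.
    draws = beta.rvs(a, b, size=(n_draws, len(a)), random_state=rng)
    best = np.argmax(draws, axis=1)
    return np.bincount(best, minlength=len(a)) / n_draws


def ctr_interval(impressions, clicks, level=0.95):
    """Equal-tailed credible interval for one variation's CTR."""
    a = 1.0 + clicks
    b = 1.0 + impressions - clicks
    return beta.interval(level, a, b)


# Hypothetical counts for two headlines.
weights = variation_weights(impressions=[1000, 1000], clicks=[50, 80])
low, high = ctr_interval(impressions=1000, clicks=80)
```

Under these assumptions, the credible interval returned by `ctr_interval` narrows as impressions accumulate, which is one way to read "more confident of the estimated CTR".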
Iteration   W1        W2        Note
100         56.37%    42.63%
1000        9.82%     90.18%
2000        64.06%    39.94%    behavior reversal
3000        94.84%    5.16%
4000        97.12%    2.88%
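A simulation with a behavior reversal can be sketched as below. The poster does not say how the algorithm adapts after the reversal, so this sketch makes its own assumptions: Thompson sampling over Beta posteriors, plus an exponential discount on past evidence so that old clicks fade. The function name `simulate_reversal`, the CTR values, and the discount factor are hypothetical, and the sketch does not attempt to reproduce the percentages reported above.

```python
import numpy as np


def simulate_reversal(ctr_before, ctr_after, switch_at, n_iter,
                      discount=0.99, seed=0):
    """Simulate a two-variation bandit whose true CTRs change at `switch_at`.

    Returns an array with the index (0 or 1) of the variation shown at
    each iteration.
    """
    rng = np.random.default_rng(seed)
    a = np.ones(2)  # Beta posterior parameter: 1 + discounted clicks
    b = np.ones(2)  # Beta posterior parameter: 1 + discounted non-clicks
    shown = np.empty(n_iter, dtype=int)
    for t in range(n_iter):
        ctrs = ctr_before if t < switch_at else ctr_after
        # Thompson sampling: draw one CTR per variation, show the larger draw.
        arm = int(np.argmax(rng.beta(a, b)))
        click = float(rng.random() < ctrs[arm])
        # Assumption: shrink old evidence toward the Beta(1, 1) prior.
        a = 1.0 + discount * (a - 1.0)
        b = 1.0 + discount * (b - 1.0)
        a[arm] += click
        b[arm] += 1.0 - click
        shown[t] = arm
    return shown


# Hypothetical CTRs: variation 2 is better at first, then the two swap.
shown = simulate_reversal(ctr_before=(0.1, 0.5), ctr_after=(0.5, 0.1),
                          switch_at=2000, n_iter=4000)
share_v2_before = np.mean(shown[1500:2000] == 1)
share_v1_after = np.mean(shown[3500:] == 0)
```

Without the discount, evidence gathered before the reversal would keep outweighing new clicks for a long time. A discount closer to 1 gives steadier weights but slower recovery.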
