# WWW 2012

• Most of the time a system is viewed and evaluated in a static context: a snapshot at time t. What matters is how the system evolves. Items are changing and people are changing, and this changes the outcome.
• Understand the system dynamics, select the important aspects of the system, and gain control over certain objectives.
• Understand and describe the system in order to predict it over time. Example: if we know the system at time t and we know the change, we know what it will be at t+1. Resources can be organised accordingly. Controlling the system brings stability.
• Steps: describe, select, build feedback, add controller, test.
• Performance loss is calculated with respect to training four times a day. Red line: training twice a day (midnight and midday). Blue line: training once a day (only at midnight).
• Use all the data and train whenever you can. Using only part of the data would give more frequent and regular updates.
• The number of training samples is related to the overall performance, so controlling it gives control over performance.
• Balance the mass to keep its distance from the wall constant (pull and release).
• The acceleration a of a body is parallel and directly proportional to the net force F and inversely proportional to the mass m. The model used is a first-order differential equation.
• The analogy to recommender systems: y is the performance of the system, k relates to the new training samples, and r relates to the change in the new training samples.
• Explain the input and put it through a recommender system for 30 days. Three scenarios: the number of new data samples is increasing, decreasing or stable (the increasing one is shown). Learn the parameters, then use R-squared, the coefficient of determination, to measure the accuracy of the prediction.
• General concept. Dynamics.
• Three components.
• Rise time, settling time and overshoot (the maximum percentage by which the system might go over the reference).
• Based on analysis. Picked to get the fastest rise time. Rise time: PI and PID are fastest. Overshoot: PI and PID are not really good. PD is the best.
• Analysis and results are consistent. Settling time: best for PI and PID. Overshoot: highest for PID and PI; for P it is zero. Overall, PD has the best properties.
• Reducing performance would reduce computation. The red and blue lines show performance loss and computational gain. Once the required performance has been reached, the computational cost becomes stable.
• How it works now: subscription and purchase data are used. Data that can be collected: PVR recording logs, subscription packages, purchase data, live TV viewing, Facebook shares and content information. Identify the usefulness of each signal on a per-user basis. Enhance recommendations by requesting additional information. Reduce data and computation.
• Certain aspects of the system need to be fixed. Stability would bring predictability and more optimal resource allocation.
• The first two steps are similar. Example here: optimise performance over time. This can be used for the cold-start problem, to give good-quality recommendations from the beginning.
• Sequence of design. Dynamics: feedback loop (analytical). Put into production: evaluate on real data. Plug in different dynamics and different controllers.
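The spring-damper analogy in the notes above (a first-order model relating training samples F to performance y through constants k and r, scored with R-squared) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the forward-Euler discretisation, the synthetic log-normal random-walk input, the "true" values k = 2 and r = 5, and the function names are all hypothetical. Ordinary least squares is used as the estimator, which coincides with maximum likelihood only if the noise is assumed Gaussian.

```python
import numpy as np

def simulate(F, k, r, y0=0.0, dt=1.0):
    """Forward-Euler simulation of F(t) = k*y(t) + r*y'(t) (assumed discretisation)."""
    y = np.empty(len(F) + 1)
    y[0] = y0
    for t, f in enumerate(F):
        y[t + 1] = y[t] + dt * (f - k * y[t]) / r
    return y

def fit_k_r(F, y, dt=1.0):
    """Least-squares estimate of (k, r) from the input F and the observed output y."""
    dy = np.diff(y) / dt                  # finite-difference estimate of y'(t)
    X = np.column_stack([y[:-1], dy])     # model: F[t] = k*y[t] + r*dy[t]
    (k, r), *_ = np.linalg.lstsq(X, F, rcond=None)
    return float(k), float(r)

def r_squared(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(0)
days = 30
# Hypothetical input: an increasing log-normal random walk of new training samples.
F = 100.0 * np.exp(np.cumsum(rng.normal(0.02, 0.05, size=days)))
# Synthetic "performance" generated with assumed true parameters plus observation noise.
y_obs = simulate(F, k=2.0, r=5.0) + rng.normal(0.0, 0.05, size=days + 1)

k_hat, r_hat = fit_k_r(F, y_obs)
# One-step-ahead prediction of y[t+1] from y[t] and F[t] using the fitted parameters.
y_pred = y_obs[:-1] + (F - k_hat * y_obs[:-1]) / r_hat
r2 = r_squared(y_obs[1:], y_pred)
print(f"k_hat={k_hat:.3f}, r_hat={r_hat:.3f}, R^2={r2:.4f}")
```

With real data, `F` and `y_obs` would be replaced by the logged number of new training samples and the measured performance per day.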
### WWW 2012

1. Tamas Jambor, Jun Wang, Neal Lathia (@jamborta, @seawan, @neal_lathia), University College London
2. Recommender systems exist in time. Many aspects of the system can only be understood over time.
3. Understanding and controlling dynamic aspects of the system to gain control over certain objectives.
4. Predictability over time; allocating resources over time; a stable system over time.
5. Describe and estimate the system dynamics; build and analyze a feedback loop controller; test the dynamics with real data.
6. More data -> better performance; frequent training -> up-to-date recommendations.
7. Performance loss over time.
8. Use all the data -> best performance; use part of the data -> frequent and regular updates.
9. Relationship between the number of training samples (computation) and performance.
10. How a certain input now would change the future state of the system.
11. Mass-spring-damper system (diagram): y = displacement, F = force applied, m = mass of the block, r = damping constant, k = spring constant.
12. Using Newton's second law, $F(t) = m\,\ddot{y}(t)$. Spring force: $f_1 = k\,y(t)$. Damping force: $f_2 = r\,\dot{y}(t)$. This gives the equation $F(t) = k\,y(t) + r\,\dot{y}(t)$.
13. $F(t) = k\,y(t) + r\,\dot{y}(t)$, where y = performance, F = number of training samples, k = constant related to the new training samples, r = constant related to the change in the new training samples.
14. Generate an input signal; log-normal random walk to add noise; maximum likelihood estimation.
15. Build a control loop based on system dynamics.
16. Controlled recommender system (block diagram). Components: controller (control function), actuator, dynamics, monitor. Signals: reference performance, error (+/-), control input or manipulated variable (e.g., number of new training samples), performance variable.
17. Feedback loop (block diagram). Signals: reference value $y_r(t)$; performance error $e(t) = y_r(t) - y(t)$, formed at the summing junction; control signal $u(t)$ from the controller; disturbance $n(t)$; measured recommendation accuracy $y(t)$ from the recommender dynamics, fed back with a negative sign.
18. Rise time; settling time; overshoot.
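19. Controller comparison (figure panels: P controller, PD controller, PI controller, PID controller):

    | Type | Rise time (sec) | Settling time (sec) | Overshoot (%) |
    |------|-----------------|---------------------|---------------|
    | P    | 5               | 10                  | 0             |
    | PD   | 1               | 5                   | 8.21          |
    | PI   | 0               | 10                  | 36.8          |
    | PID  | 0               | 10                  | 15.7          |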
20. How the controller can perform with unseen data.
21. Simulate the system for 30 days (once a day) with 3 reference values (RMSE 0.95, 0.9, 0.86). Measure: settling time; error with respect to the reference (RMSE-SS); stability with respect to the reference (SD-SS); overshoot.
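22. Results for reference value 0.86 (figure panel: PD controller):

    | Type       | Settling time | RMSE-SS (error) | SD-SS (stability) | Overshoot (%) |
    |------------|---------------|-----------------|-------------------|---------------|
    | P (0.86)   | 23            | 0.0104          | 0.0034            | 0             |
    | PD (0.86)  | 20.8          | 0.0059          | 0.0037            | 0.18          |
    | PI (0.86)  | 18.6          | 0.0055          | 0.0047            | 0.49          |
    | PID (0.86) | 18.6          | 0.0045          | 0.0045            | 0.72          |

23. Results (figure panels: P controller, PD controller, PI controller, PID controller):

    | Type | Settling time | RMSE-SS (error) | SD-SS (stability) | Overshoot (%) |
    |------|---------------|-----------------|-------------------|---------------|
    | P    | 20            | 0.0068          | 0.0052            | 0.50          |
    | PD   | 18.8          | 0.0054          | 0.0055            | 0.98          |
    | PI   | 20.6          | 0.0137          | 0.0075            | 2.50          |
    | PID  | 17.8          | 0.0091          | 0.0059            | 1.86          |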
24. Computation (plot): performance loss and computational gain.
25. Fixed computational cost would also fix the update frequency (given the resources available); the system can provide regular updates independent of the rate of new samples; resources are used to their limits (e.g. EC2).
26. System architecture (diagram). Components: recommendation server, recommender, input controller, training data processor, recommendation assessor, usage data processor. Data flows: recommendation, usage data, feedback, request data.
27. Certain aspects of the system are to be controlled; stability and predictability are important; use resources to their limits.
28. Optimise rather than stabilise over time; obtain system dynamics; describe the objective to be achieved (finite or infinite horizon).
29. Describe system dynamics; design a feedback loop; evaluate the control system on real data.
30. Thank you.
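The feedback loop and the step-response measures from the slides (settling time and overshoot for P, PI, PD and PID controllers) can be illustrated with a small simulation. The sketch below is a generic textbook discrete PID loop around the first-order dynamics $F(t) = k\,y(t) + r\,\dot{y}(t)$, with the control signal playing the role of F. It is not the controller from the talk: the plant constants, the gains, the time step, the 2% settling band and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def simulate_pid(kp, ki, kd, ref=1.0, k=1.0, r=2.0, dt=0.1, steps=600):
    """Closed-loop step response of the plant u = k*y + r*y' under a discrete PID law."""
    y = np.zeros(steps + 1)
    integral = 0.0
    prev_error = ref - y[0]               # avoids a derivative kick on the first step
    for t in range(steps):
        error = ref - y[t]                # e(t) = y_r(t) - y(t)
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative   # control signal u(t)
        y[t + 1] = y[t] + dt * (u - k * y[t]) / r          # forward-Euler plant update
        prev_error = error
    return y

def step_metrics(y, ref, dt, tol=0.02):
    """Return (overshoot in %, settling time); settling time is None if y never settles."""
    overshoot = max(0.0, (float(np.max(y)) - ref) / ref * 100.0)
    outside = np.abs(y - ref) > tol * abs(ref)
    if outside[-1]:
        return overshoot, None
    idx = np.nonzero(outside)[0]
    settling = 0.0 if idx.size == 0 else float((idx[-1] + 1) * dt)
    return overshoot, settling

# Hypothetical gains: a pure P controller and a PI controller on the same plant.
y_p = simulate_pid(kp=4.0, ki=0.0, kd=0.0)
y_pi = simulate_pid(kp=4.0, ki=2.0, kd=0.0)
over_p, settle_p = step_metrics(y_p, ref=1.0, dt=0.1)
over_pi, settle_pi = step_metrics(y_pi, ref=1.0, dt=0.1)
print("P :", over_p, settle_p)
print("PI:", over_pi, settle_pi)
```

In this toy setting the P controller leaves a steady-state error, so it never enters the 2% band around the reference, while the integral term of the PI controller removes that error. Different dynamics or gains can be plugged into `simulate_pid` to compare controllers in the same way.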