3. Motivation
• Trulia Estimates launched in 2011
• Public records snowball has evolved since then, but the valuation
algorithm has not
• Valuations already have a lot of visibility (valuation heatmaps, etc.)
and we are planning to give them even more visibility in the near
future (valuation history)
• Brilliant Basics – Improve estimates before surfacing them
everywhere
4. Us vs. Competition
[Bar chart: Median Error % (axis 0 to 15) for Trulia Estimates vs. Zestimate; bar values not recoverable]
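The chart's metric can be sketched in a few lines. This assumes "median error %" means the median of |estimate - sale price| / sale price, which the slides do not spell out. The function name and sample numbers are hypothetical.

```python
from statistics import median

def median_error_pct(estimates, sale_prices):
    """Median absolute percentage error between estimates and sale prices."""
    errors = [abs(e - p) / p * 100.0 for e, p in zip(estimates, sale_prices)]
    return median(errors)

# Hypothetical numbers, for illustration only.
estimates = [510_000, 290_000, 805_000]
sale_prices = [500_000, 300_000, 800_000]
print(round(median_error_pct(estimates, sale_prices), 2))
```

Using the median rather than the mean keeps a few badly mispriced homes from dominating the score.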
5. Our Work
• Location-specific and temporal features
  • Crime Safety
  • School Proximity
  • Stats and Trends
• New Geoscopes
  • Solve the problem of geographic boundaries
• Model Learning Improvements
  • Explicit modeling of location hierarchies
  • Better learned parameters
  • Better feature representation and normalization
9. New Geoscopes
After the initial pass:
• Coverage improved by 1.67% (~1.15 million properties nationwide)
• 330 more counties valued
• For San Mateo, median error went from 8.97% to 8.85%
10. Model Learning Improvements
• Each geography is different, so a static set of model parameters is
not always ideal
• Use cross-validation to learn the parameters of each location model
from data
• Median error % improves from 8.97 to 8.69 (~3% relative improvement)
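Per-location parameter learning can be sketched as a small grid search with k-fold cross-validation. The slides do not describe the actual model or parameters. The one-feature ridge-style model, the grid, and the data below are all hypothetical stand-ins.

```python
from statistics import median

def fit_beta(rows, lam):
    """Ridge-style slope for price ~ beta * sqft (no intercept)."""
    sxy = sum(x * y for x, y in rows)
    sxx = sum(x * x for x, _ in rows)
    return sxy / (sxx + lam)

def cv_median_error(rows, lam, k=3):
    """Median absolute percentage error over k held-out folds."""
    errors = []
    for fold in range(k):
        train = [r for i, r in enumerate(rows) if i % k != fold]
        held_out = [r for i, r in enumerate(rows) if i % k == fold]
        beta = fit_beta(train, lam)
        errors += [abs(beta * x - y) / y * 100.0 for x, y in held_out]
    return median(errors)

def best_param_per_location(data, grid, k=3):
    """For each location, pick the grid value with the lowest CV error."""
    return {
        loc: min(grid, key=lambda lam: cv_median_error(rows, lam, k))
        for loc, rows in data.items()
    }

# Hypothetical data: (sqft in thousands, price in $ thousands) per location.
data = {
    "san_mateo": [(1.0, 950.0), (1.4, 1300.0), (1.8, 1750.0),
                  (2.2, 2050.0), (0.9, 880.0), (1.6, 1500.0)],
    "fresno": [(1.2, 260.0), (1.5, 310.0), (2.0, 430.0),
               (1.1, 240.0), (1.7, 350.0), (2.4, 500.0)],
}
grid = [0.0, 0.1, 1.0]
print(best_param_per_location(data, grid))
```

Each location gets its own parameter value instead of one static setting shared by all geographies. The cost is one cross-validation run per location per grid point.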
Hierarchical Modeling
• Explicitly model location hierarchies to get smoother estimates using
higher-level information
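One common way to use higher-level information is shrinkage, also called partial pooling. A sparse neighborhood's estimate is pulled toward its parent geography, and the pull is stronger when there are fewer local sales. The sketch below is not necessarily the approach used here. The function name, the pooling constant `k`, and the numbers are hypothetical.

```python
def shrunk_estimate(local_values, parent_mean, k=10.0):
    """Blend the local mean with the parent-level mean.

    The local weight n / (n + k) grows with the number of local
    observations n; with no local data the parent mean is returned.
    """
    n = len(local_values)
    if n == 0:
        return parent_mean
    local_mean = sum(local_values) / n
    weight = n / (n + k)
    return weight * local_mean + (1.0 - weight) * parent_mean

# Hypothetical price-per-sqft values: two neighborhood sales, county mean 600.
print(round(shrunk_estimate([700.0, 900.0], 600.0, k=10.0), 2))
```

With only two local sales the result stays close to the county mean. As local sales accumulate, the neighborhood's own data dominates, which is what smooths estimates in thin markets.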
11. What’s Next?
• Spend more time optimizing new features – optimization is
everything!
• Add price trends data to the hedonic model and simplify our learning
process
• Make per-model parameter optimization scalable
• Incorporate hierarchical models into the existing mix