4. We provide a personalized styling service through a combination of
algorithmic recommendations and stylist curation.
http://algorithms-tour.stitchfix.com/
What do we do at Stitch Fix?
5. We need good inventory to serve good recommendations.
Recommendation algorithms work in both directions.
(Buyers here are the people who buy clothes from vendors to fill our warehouses.)
Why does inventory matter?
[Diagram: Stylists and Buyers]
6. We need good personalized inventory to serve good
recommendations for each client.
Why does inventory matter?
7. We need good personalized inventory to serve good
recommendations for each client.
Tracer
A time-series database providing precise personalized inventory
states at any given point in time
Why does inventory matter?
8. Imagine we have a time series of SKU counts:
(count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
The design of Tracer
9. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● This is asking too much! Let’s use a predefined interval to
generate this series, say every 10 minutes.
The design of Tracer
10. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● This is asking too much! Let’s use a predefined interval to
generate counts, say every 10 minutes.
● Problems:
○ A ton of things can happen within 10 minutes during peak
hours
○ We’d like to know exactly what stylists saw when they
started working; 10-minute snapshots just aren’t accurate
enough
The design of Tracer
11. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● OK, let’s generate the counts every second!
The design of Tracer
12. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● OK, let’s generate the counts every second!
● Problems:
○ Not realistic to aggregate that often in the engineering DB,
where every item is a row
○ Even if engineering maintains a count table, should we
snapshot it every second?
○ A waste of space for counts that don’t move overnight
The design of Tracer
13. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● Let’s do away with the fixed interval and only generate a count
event when the count changes!
The design of Tracer
14. (count₁, t₁), (count₂, t₂), (count₃, t₃)...
Q: How could we know the count at any t within the range?
● Let’s do away with the fixed interval and only generate a count
event when the count changes!
○ Problems:
■ Again, the engineering DB works at the item level
■ Say t₁ is far from t₂; to know the count at tₓ
(t₁ < tₓ < t₂), we may need to walk through tons of other
events. Indexing could solve this, but an index per SKU
is too much
The design of Tracer
15. (s₁₁ -> s₁₂, t₁), (s₁₂ -> s₁₃, t₂), (s₁₃ -> s₁₄, t₃)...
Q: How could we know the count at any t within the range?
● Let’s tweak this idea a bit and generate events of item state
transitions
● This gives us the flexibility to process item states however we want. In
the case of computing SKU counts, we can transform these
events into SKU count changes (sketched after this slide):
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
The design of Tracer
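To make that transformation concrete, here is a minimal Python sketch of turning item state-transition events into SKU count deltas. The event shape, the state names, and the choice to count a single "in_warehouse" state are illustrative assumptions, not Tracer’s actual schema.

from collections import namedtuple

Transition = namedtuple("Transition", ["sku", "from_state", "to_state", "ts"])

COUNTED_STATE = "in_warehouse"  # the one state whose population we count per SKU

def to_delta(event):
    # Map a state transition to a (sku, delta, ts) count change, or None
    # if the transition doesn't move the item in or out of the counted state.
    entered = event.to_state == COUNTED_STATE
    left = event.from_state == COUNTED_STATE
    if entered and not left:
        return (event.sku, +1, event.ts)
    if left and not entered:
        return (event.sku, -1, event.ts)
    return None

events = [
    Transition("sku-42", "in_transit", "in_warehouse", 100),
    Transition("sku-42", "in_warehouse", "shipped_to_client", 160),
]
deltas = [d for d in map(to_delta, events) if d is not None]
# deltas == [("sku-42", 1, 100), ("sku-42", -1, 160)]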
16. (delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
Q: How could we know the count at any t within the range?
● One missing piece: we still need an initial state to apply the deltas to
● This can be addressed by creating a state snapshot at the very
beginning (see the sketch after this slide)
The design of Tracer
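A minimal sketch of this reconstruction, assuming a single SKU, a snapshot taken at the very beginning (ts = 0), and a time-ordered delta list; all data here is invented.

def count_at(t, initial_count, deltas):
    # deltas: list of (delta, ts) sorted by ts; initial_count comes from
    # the snapshot taken at the very beginning (ts = 0).
    count = initial_count
    for delta, ts in deltas:
        if ts > t:
            break
        count += delta
    return count

deltas = [(+3, 10), (-1, 25), (+2, 40)]
assert count_at(5, 100, deltas) == 100   # before any delta applies
assert count_at(30, 100, deltas) == 102  # 100 + 3 - 1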
17. Now the whole design can be summarized as two pure functions:
● Inventory state function: I(t)
● Difference function: D(t₁, t₂) = I(t₂) - I(t₁) = -D(t₂, t₁)
● Inventory state reasoning: I(t₂) = I(t₁) + D(t₁, t₂) = I(t₃) - D(t₂, t₃)
The design of Tracer
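As a quick sanity check of these identities, here is a toy Python example with invented numbers; since D is defined directly from I, the identities hold by construction.

I = {1: 100, 2: 105, 3: 98}  # invented inventory states at t1, t2, t3

def D(ta, tb):
    # Difference function, defined from I exactly as on the slide.
    return I[tb] - I[ta]

assert D(1, 2) == -D(2, 1)     # D(t1, t2) = -D(t2, t1)
assert I[2] == I[1] + D(1, 2)  # reason forward from t1
assert I[2] == I[3] - D(2, 3)  # reason backward from t3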
18. ● As we consume the item event stream, we continuously build
delta blocks:
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
The implementation of Tracer
19. ● As we consume the item event stream, we continuously build
delta blocks:
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
● We create an SKU count snapshot every hour, so that we don’t
need to go back to the very start and apply deltas all the way
from there
The implementation of Tracer
20. ● As we consume the item event stream, we continuously build
delta blocks:
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
● We create an SKU count snapshot every hour, so that we don’t
need to go back to the very start and apply deltas all the way
from there
● To speed up the search for a given snapshot, we index the
snapshots; with hourly snapshots there are only 24 per day to
index (see the sketch after this slide)
The implementation of Tracer
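A sketch of how hourly snapshots bound the replay work for a single SKU; the data layout and values are invented. bisect locates the latest snapshot at or before t, and only the deltas after it are replayed.

import bisect

snapshot_times = [0, 3600, 7200]               # hourly snapshot timestamps
snapshots = {0: 100, 3600: 100, 7200: 99}      # count at each snapshot time
deltas = [(+1, 3700), (-2, 5000), (+1, 7300)]  # (delta, ts), sorted by ts

def count_at(t):
    # Latest snapshot at or before t (assumes t >= snapshot_times[0])...
    i = bisect.bisect_right(snapshot_times, t) - 1
    t0 = snapshot_times[i]
    count = snapshots[t0]
    # ...then replay only the deltas in the interval (t0, t].
    for delta, ts in deltas:
        if t0 < ts <= t:
            count += delta
    return count

assert count_at(5400) == 99   # 100 (snapshot at 3600) + 1 - 2
assert count_at(7300) == 100  # 99 (snapshot at 7200) + 1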
21. ● As we consume the item event stream, we continuously build
delta blocks:
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
● We create an SKU count snapshot every hour, so that we don’t
need to go back to the very start and apply deltas all the way
from there
● To speed up the search for a given snapshot, we index the
snapshots; with hourly snapshots there are only 24 per day to
index
● This is all built on Spark; deltas and snapshots are stored as
Spark DataFrames
The implementation of Tracer
22. ● As we consume the item event stream, we continuously build
delta blocks:
(delta₁, t₁), (delta₂, t₂), (delta₃, t₃)...
● We create an SKU count snapshot every hour, so that we don’t
need to go back to the very start and apply deltas all the way
from there
● To speed up the search for a given snapshot, we index the
snapshots; with hourly snapshots there are only 24 per day to
index
● This is all built on Spark; deltas and snapshots are stored as
Spark DataFrames
● We provide both Scala and Python APIs to query the inventory
state (a hypothetical sketch follows this slide)
The implementation of Tracer
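The deck doesn’t show the actual API, so the following Python sketch of querying the inventory state is purely hypothetical: the class name, method name, and data layout are all invented, and the resolution logic simply mirrors the snapshot-plus-delta design above.

class Tracer:
    # Toy stand-in that resolves I(t) as nearest snapshot + replayed
    # deltas; multi-SKU state is a plain dict here, not a Spark DataFrame.
    def __init__(self, snapshots, deltas):
        self.snapshots = snapshots  # {ts: {sku: count}}, e.g. hourly
        self.deltas = deltas        # [(sku, delta, ts)], sorted by ts

    def inventory_state(self, at):
        # Latest snapshot at or before `at`, then replay deltas in (t0, at].
        t0 = max(ts for ts in self.snapshots if ts <= at)
        state = dict(self.snapshots[t0])
        for sku, delta, ts in self.deltas:
            if t0 < ts <= at:
                state[sku] = state.get(sku, 0) + delta
        return state

tracer = Tracer({0: {"sku-42": 7}}, [("sku-42", -1, 120)])
print(tracer.inventory_state(at=300))  # {'sku-42': 6}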