Instacart 2nd place solution, presented at a Kaggle meetup in Tokyo.


- 1. 2nd Place Solution Instacart Market Basket Analysis
- 2. Agenda • My Background • Problem Overview • Main Approach • Feature Engineering • Feature Importance • Important Findings • F1 maximization
- 3. My Background • Bachelor of Economics • Programmer in the financial industry • Consultant in the financial industry • 2nd place at KDDCUP2015 • Data Scientist at Yahoo! JAPAN
- 4. Problem Overview • In this competition, we have to predict which previously ordered items will be reordered. • So it is a little different from a general recommendation problem.
- 5. Problem Overview • How active is each user? (*the "prior" set is regarded as train)
- 6. Problem Overview • How popular is each item? (*counts clipped at 500)
- 7. Problem Overview • The evaluation metric is the mean F1 score, which combines precision and recall.
- 8. Problem Overview • Links between the files
- 9. Main Approach • We are given orders.csv
- 10. Main Approach • We are given orders.csv
- 11. Main Approach • We are given order_products.csv
- 12. Main Approach • Reorder Prediction user_id product_id label
- 13. Main Approach • None Prediction user_id label
- 14. Main Approach
- 15. Main Approach
- 16. Feature Engineering • I made 4 types of features 1. User • What this user is like 2. Item • What this item is like 3. User x Item • How the user feels about the item 4. Datetime • What this day and hour are like *For the None model, I can't use the above features except user and datetime, so I convert the rest to per-user stats (min, mean, max, sum, std, ...).
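The user x item to per-user conversion described above can be sketched as follows. This is a minimal, hypothetical example (the `total_buy` values and user IDs are made up); the real pipeline presumably covers every user x item feature:

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical per-(user, item) feature values: (user_id, total_buy)
ui = [(1, 3), (1, 1), (2, 5), (2, 2), (2, 4)]

# Group the item-level values by user
by_user = defaultdict(list)
for user_id, total_buy in ui:
    by_user[user_id].append(total_buy)

# Collapse each user's list into the stats the slides mention,
# so the None model gets one row per user
user_stats = {
    u: {"min": min(v), "mean": mean(v), "max": max(v),
        "sum": sum(v), "std": pstdev(v)}
    for u, v in by_user.items()
}
```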
- 17. Feature Importance for reorder
- 18. Feature Importance for None
- 19. Important Findings for reorder - 1 • user_id: 54035
- 20. Important Findings for reorder - 2 • days_last_order-max is the difference between days_since_last_order_this_item and useritem_order_days_max • days_since_last_order_this_item is a user x item feature: how many days have passed since the user last ordered this item • useritem_order_days_max is also a user x item feature: the maximum span (in days) between the user's orders of this item • For more detail, see the next page
- 21. Important Findings for reorder - 2 • See index 0: the user bought this item 14 days ago, and their max span is 30 days • So I think this feature captures whether the user has grown bored of the item
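As a concrete reading of the index-0 row, the arithmetic is simply (feature names taken from the slides, values from the example):

```python
# days_last_order-max for the slide's index-0 example
days_since_last_order_this_item = 14  # the user bought this item 14 days ago
useritem_order_days_max = 30          # their longest gap between orders of it

days_last_order_max = (days_since_last_order_this_item
                       - useritem_order_days_max)
print(days_last_order_max)  # -16: still well inside the longest historical gap
```

A large negative value means the current gap is much shorter than the user's longest past gap, so they are probably not "bored" of the item yet; values near or above zero suggest the opposite.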
- 22. Important Findings for reorder - 3 • We already know fruits are reordered more frequently than vegetables (3 Million Instacart Orders, Open Sourced) • I wanted to quantify how much more often • So I made an item_10to1_ratio feature, defined as the reorder ratio after an item is ordered vs. not ordered • See the next page for details
- 23. Important Findings for reorder - 3 • Let's say userA bought itemA at order_number 1 and 4, and userB bought itemA at order_number 1 and 3 • Then item_10to1_ratio is 0.5
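The slide does not spell the definition out, but one reconstruction consistent with its example is: among every place a user bought the item and then skipped it in the following order (a "1, 0" pattern in the per-order purchase sequence), measure how often the order after that contains the item again. The function below is a hypothetical sketch of that reading, not the author's confirmed implementation:

```python
def item_10to1_ratio(purchase_sequences):
    """purchase_sequences: one 0/1 list per user, indexed by order_number.
    Returns the fraction of 'bought, then skipped' cases where the item
    is bought again in the very next order."""
    hits, total = 0, 0
    for seq in purchase_sequences:
        for i in range(len(seq) - 2):
            if seq[i] == 1 and seq[i + 1] == 0:  # bought, then skipped
                total += 1
                hits += seq[i + 2]               # bought again right after?
    return hits / total if total else 0.0

# Slide example: userA bought at orders 1 and 4 -> [1, 0, 0, 1]
#                userB bought at orders 1 and 3 -> [1, 0, 1]
ratio = item_10to1_ratio([[1, 0, 0, 1], [1, 0, 1]])
print(ratio)  # 0.5, matching the slide
```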
- 24. Important Findings for None - 1 • Useritem_sum_pos_cart(User A, Item B) is the average position in User A’s cart that Item B falls into • Useritem_sum_pos_cart-mean(User A) is the mean of the above feature across all items • So this feature essentially captures the average position of an item in a user’s cart, and we can see that users who don’t buy many items all at once are more likely to be None
- 25. Important Findings for None - 2 • total_buy is the number of times a user has ordered an item • If userA bought itemA 3 times in the past, this would be 3 • total_buy-max is the per-user max of this feature • We can see that it predicts whether or not a user will make any reorder at all
- 26. Important Findings for None - 3 • t-1_is_None(User A) is a binary feature that says whether or not the user’s previous order was None. • If the previous order is None, then the next order will also be None with 30% probability.
- 27. F1 maximization • In this competition, the evaluation metric was an F1 score, which is a way of capturing both precision and recall in a single metric. • Thus, we needed to convert reorder probabilities into binary 1/0 (Yes/No) numbers. • However, in order to perform this conversion, we need to know a threshold. At first, I used grid search to find a universal threshold of 0.2. But I saw comments on the Kaggle discussion boards that said different orders should have different thresholds. • To understand why, let’s look at an example.
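The universal-threshold baseline mentioned above might be found with a grid search like this sketch. The validation labels and probabilities here are toy values; the slides' threshold of 0.2 came from the real data:

```python
def f1(y_true, y_pred):
    """Plain F1 from binary label lists; 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    if tp == 0:
        return 0.0
    precision = tp / sum(y_pred)
    recall = tp / sum(y_true)
    return 2 * precision * recall / (precision + recall)

# Toy validation set: true reorder labels and model probabilities
y_true = [1, 0, 1, 1, 0, 0]
probs = [0.8, 0.4, 0.35, 0.6, 0.1, 0.25]

# Try one global cutoff at a time and keep the best
candidates = [i / 100 for i in range(5, 95, 5)]
best_t = max(candidates, key=lambda t: f1(y_true, [p >= t for p in probs]))
```

The weakness the slide points out is exactly this: `best_t` is one number for all orders, while the optimal cutoff actually differs from order to order.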
- 28. F1 maximization
- 29. F1 maximization • In the first example, the best threshold is between 0.9 and 0.3 • In the second example, the best threshold is lower than 0.2 • As shown, each order should have its own threshold • But with the calculation above, we would have to enumerate every pattern of probabilities first • So I needed to come up with another calculation • See the next page
- 30. F1 maximization • Let's say our model predicts Item A will be reordered with probability 0.9, and Item B with probability 0.3. I then simulate 9,999 sets of target labels (whether A and B are ordered or not) using these probabilities. • For example, the simulated labels might look like this. • I then calculate the expected F1 score over the simulated labels, starting from the highest-probability item and adding items one at a time (e.g., [A], then [A, B], then [A, B, C], etc.) until the F1 score peaks and then decreases. • We don't need to evaluate every subset like A, B, AB… • Because if it is worth selecting itemB, it is always worth selecting the higher-probability itemA as well
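The simulation-based expected F1 described above can be sketched as follows. The function name and the Monte Carlo details are illustrative, and the 0.9/0.3 probabilities are the slide's toy example:

```python
import random

def expected_f1(probs, k, n_sim=9999, seed=0):
    """Estimate the expected F1 of predicting the top-k items by simulating
    ground-truth labels from the reorder probabilities.
    `probs` must be sorted in descending order."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        labels = [rng.random() < p for p in probs]  # one simulated ground truth
        tp = sum(labels[:k])       # prediction = top-k items by probability
        if tp == 0:
            continue               # F1 contributes 0 with no true positives
        precision = tp / k
        recall = tp / sum(labels)
        total += 2 * precision * recall / (precision + recall)
    return total / n_sim

# Grow the prediction set greedily and stop when expected F1 drops
probs = [0.9, 0.3]
f1_a = expected_f1(probs, 1)   # predict [A] only, roughly 0.81
f1_ab = expected_f1(probs, 2)  # predict [A, B],   roughly 0.71
```

Since `f1_a > f1_ab`, the greedy search stops at [A], matching the slide's `F1score_mean` numbers on the next page.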
- 31. F1 maximization • F1score_mean( , [A]) -> 0.809747641431 • F1score_mean( , [A,B]) -> 0.709004233757
- 32. F1 maximization - Predicting None • One way to think about None is as the probability (1 - P(Item A)) * (1 - P(Item B)) * … • But another method is to try to predict None as a special case. • By using our None model and treating None as just another item, we can boost the F1 score from 0.400 to 0.407.
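The first method above is a one-liner under an independence assumption (reusing the earlier 0.9/0.3 toy probabilities); the slides' 0.400 to 0.407 gain comes from the separate None model, not from this formula:

```python
# P(None) under independence: probability that no item is reordered
def p_none(item_probs):
    p = 1.0
    for q in item_probs:
        p *= 1.0 - q
    return p

print(p_none([0.9, 0.3]))  # (1 - 0.9) * (1 - 0.3) ≈ 0.07
```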
- 33. Appendix
- 34. Appendix
- 35. Appendix
- 36. 1 month to go…
- 37. 7 days to go…
- 38. 2 days to go…
- 39. （´-`）.｡oO(
- 40. 1 hour to go…
- 41. 30 minutes to go…
- 42. Did we do it?!
- 43. Did we do it?! (We didn't)
- 44. 20 minutes to go…
- 45. EOP
