Building Mini-Categories in Product Networks
Dmitry Zinoviev, Mathematics & Computer Science
Zhen Zhu, Marketing
Kate Li, Information Systems & Operations Management
Suffolk University, Boston
2. March 2015 CompleNet'15 NY—Zinoviev/Zhu/Li 2
Outline
● Project objectives
● Data set
● Network construction
● Core products
● Tiles
● Future work
Objectives
● Long-Term Objective:
– Identify and predict consumer projects, based on the
purchase data collected by a Fortune 500 Specialty
Retailer (the “Retailer”)
– A consumer project is a collection of consumption and
co-creative actions that use multiple products and
services provided by stores to meet consumers'
idiosyncratic life purposes.
● Short-Term Objectives:
– Use a product network approach to identify product
groups (“tiles”) that could serve as “material lists”
and be used as building blocks for consumer projects
Data Set
● Purchase data collected by the Retailer over two years in
2012–2014
● Products: 111,000 material items, 351 non-material items;
15 groups, 235 classes, 1,778 subcategories
● Purchases: ~12 million sales, 545 thousand returns
– Include household id, register id, date/time, location,
price, quantity, etc.
Product Network (2)
● Two products are considered co-purchased if they were
purchased:
– by the members of the same household
– within 4 weeks (to cover at least several weekends)
– at least 7 times (to build confidence)
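The three conditions above can be sketched in code. This is a minimal illustration, not the authors' pipeline: the record layout `(household_id, product_id, purchase_date)`, the function name `co_purchase_edges`, and the choice to count every pair of purchase events that falls inside the window (summed over all households) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(weeks=4)  # "within 4 weeks"
MIN_COUNT = 7                # "at least 7 times"

def co_purchase_edges(purchases, window=WINDOW, min_count=MIN_COUNT):
    """purchases: iterable of (household_id, product_id, purchase_date).

    Returns {(product_a, product_b): count} for product pairs whose
    co-purchase count reaches min_count. How co-purchases are counted
    is an assumption of this sketch, not taken from the slides.
    """
    by_household = defaultdict(list)
    for household, product, day in purchases:
        by_household[household].append((day, product))

    counts = defaultdict(int)
    for events in by_household.values():
        events.sort(key=lambda e: e[0])  # order each household's purchases by date
        for i, (day_i, prod_i) in enumerate(events):
            for day_j, prod_j in events[i + 1:]:
                if day_j - day_i > window:
                    break  # sorted by date, so later events are even further away
                if prod_i != prod_j:
                    counts[tuple(sorted((prod_i, prod_j)))] += 1

    return {pair: n for pair, n in counts.items() if n >= min_count}
```

The returned pairs are the undirected edges of the product network; the counts could serve as edge weights if a weighted network is wanted.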
Network Metrics
Network            Metric                       Value
Complete network   Number of nodes             18,788
                   Number of edges            154,968
                   Number of isolated pairs       427
                   Number of communities          643
                   Modularity                    0.49
GCC                Number of nodes             17,294
                   Number of edges            153,854
                   Number of communities           80
                   Modularity                    0.48
Potentially good structure!
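For readers unfamiliar with the modularity figures above, the following sketch shows how Newman's modularity is computed for a given partition of an undirected, unweighted network. The slides do not say which community detection method produced the partition, so the function takes the partition as an input; the name `modularity` and the input layout are assumptions of this example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman's modularity Q = sum over communities c of
    L_c / m - (d_c / (2 m)) ** 2, where m is the number of edges,
    L_c the number of edges inside c, and d_c the total degree of c.

    edges: iterable of undirected (u, v) pairs without duplicates.
    community: dict mapping every node to a community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # L_c
    degree_sum = defaultdict(int)  # d_c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

Values near 0 mean the partition is no better than a random assignment, while values around 0.5 indicate clearly separated groups.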
One of the Components
Scale-free network!
Structural formations
Core Products & Staples
● Long-tail distribution of purchase volumes
(“staple products” at the tail)
● Long-tail distribution of number of links in the network
(“core products” at the tail)
● Staples hurt modularity, complicate clustering
● Eliminate them!
Threshold: degree ≥ 64.
Staples = core products!
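The elimination step can be sketched as a degree filter. The threshold of 64 comes from the slide; the function name `split_core` and the edge-list input are assumptions of this example.

```python
from collections import defaultdict

CORE_DEGREE = 64  # threshold from the slide: degree >= 64

def split_core(edges, threshold=CORE_DEGREE):
    """edges: iterable of undirected (u, v) pairs.

    Returns (core_nodes, edges_without_core): the nodes whose degree
    reaches the threshold, and the edges that touch no core node.
    """
    edges = list(edges)
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    core = {node for node, d in degree.items() if d >= threshold}
    remaining = [(u, v) for u, v in edges
                 if u not in core and v not in core]
    return core, remaining
```

Degrees are measured once on the original network, so removing one core product does not change whether another product qualifies as core.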
Core Product Statistics
                 Original network   The Core   Network w/o Core
Nodes                      18,788        875             17,913
Edges                     154,968     56,859             25,316
Density                  8.8×10⁻⁴      0.149           9.9×10⁻⁴
Nodes in GCC               17,293        875              9,947
Isolated pairs                427          0                763
● The core products form a connected network of their own that
deserves a separate study
● Top 10 core products:
plastic bucket, wood stud, soda, seal tape, plastic tape, diet
soda, insulating foam sealant, painting tape, drinking
water, flat brush
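As a worked example of the density row, the sketch below applies the standard density formula for a simple undirected graph, 2E / (N(N − 1)), to the node and edge counts of the original network and of the core. That the authors used this formula is an assumption, although it reproduces both of those table values; the helper name `density` is illustrative.

```python
def density(nodes, edges):
    """Density of a simple undirected graph: 2E / (N (N - 1))."""
    if nodes < 2:
        return 0.0
    return 2 * edges / (nodes * (nodes - 1))

original = density(18_788, 154_968)  # node and edge counts from the table
core = density(875, 56_859)
print(f"{original:.1e}", f"{core:.3f}")  # prints: 8.8e-04 0.149
```

The core is roughly 170 times denser than the network as a whole, which is consistent with the observation that core products form a connected network of their own.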
Structural Formations
● Tiles:
– Fully-Connected Cliques (mini-projects)
– Spoke-and-Hub Stars (mini-categories)
– Chains and Pendants
● Reflect the consumers' view of the product hierarchy
● Do not match the Retailer's product hierarchy
Cliques
● Each product is co-purchased with all other products in
the clique
● Cliques represent topical complementary groups
● 22,148 cliques (some cliques overlap)
● Clique sizes have a power-law distribution (average size=4)
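Cliques of this kind can be enumerated with the Bron-Kerbosch algorithm. The slides do not name the method the authors used, so this is one standard option rather than their implementation; the function name `maximal_cliques` and the minimum size of 3 are assumptions of this sketch.

```python
from collections import defaultdict

def maximal_cliques(edges, min_size=3):
    """Enumerate maximal cliques with Bron-Kerbosch (with pivoting).

    edges: iterable of undirected (u, v) pairs.
    Returns a list of frozensets, one per maximal clique of at least
    min_size nodes. Cliques may overlap, as noted on the slide.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates to add, x: already explored
        if not p and not x:
            if len(r) >= min_size:
                found.append(frozenset(r))
            return
        # Pivot on the node with the most neighbors among the candidates
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found
```

Enumerating maximal cliques takes exponential time in the worst case, but sparse networks such as this one (density below 10⁻³) are usually tractable.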
Stars
● The hub is the lead product and does not belong to the star
● The lead product is frequently purchased with the leaf
products, but the leaves are not purchased together
● Stars represent a topical group of substitutes:
“mini-categories”
● 2,321 stars (some stars overlap)
● Star sizes have a power-law distribution (average size=3.8)
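One possible reading of the star definition can be sketched as follows: for each candidate hub, keep the neighbors that are not linked to any other neighbor of that hub. The slides do not give the exact extraction rule, so this rule, the function name `find_stars`, and the minimum of 3 leaves are all assumptions of this example.

```python
from collections import defaultdict

def find_stars(edges, min_leaves=3):
    """Return {hub: leaves} for every node that has at least min_leaves
    neighbors which are not linked to any other neighbor of that node.

    edges: iterable of undirected (u, v) pairs.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    stars = {}
    for hub, neighbors in adj.items():
        # A leaf shares no edge with the hub's other neighbors
        leaves = {n for n in neighbors if not (adj[n] & neighbors)}
        if len(leaves) >= min_leaves:
            stars[hub] = leaves
    return stars
```

Under this rule a leaf may still be linked to products outside the hub's neighborhood, which is one way stars can overlap.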
Chains & Pendants
● Linear structures connected to the main component at one
(pendants) or both (chains) ends
● Represent products that are not all purchased together
but are often purchased pairwise, possibly because of
consumers' uncertainty (substitutes by ignorance) or as
part of a learning/exploratory process.
● 768 chains/pendants, each not longer than 4 edges
● Average size=3.2
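A sketch of how such linear structures could be located: walk along nodes of degree 2, treating nodes of degree 3 or more as attachment points on the main component. A walk that starts at a degree-1 node and reaches an attachment point is a pendant; a walk between two different attachment points is a chain. The degree-based rule and the function name `chains_and_pendants` are assumptions of this example, not the authors' stated procedure.

```python
from collections import defaultdict

def chains_and_pendants(edges):
    """Return (pendants, chains) as lists of node paths.

    A pendant path runs from its free end to the attachment node.
    A chain path runs from one attachment node to another.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    def walk(start, first):
        # Follow degree-2 nodes from start via first; return the full path
        path = [start, first]
        while len(adj[path[-1]]) == 2:
            a, b = adj[path[-1]]
            path.append(b if a == path[-2] else a)
        return path

    pendants, chains, seen = [], [], set()
    for node, nbrs in adj.items():
        if len(nbrs) == 1:        # free end of a possible pendant
            path = walk(node, next(iter(nbrs)))
            if len(adj[path[-1]]) >= 3:
                pendants.append(path)
        elif len(nbrs) >= 3:      # possible end of a chain
            for first in nbrs:
                if len(adj[first]) != 2 or first in seen:
                    continue
                path = walk(node, first)
                seen.update(path[1:-1])  # do not rediscover from the other end
                if len(adj[path[-1]]) >= 3 and path[-1] != node:
                    chains.append(path)
    return pendants, chains
```

The length of a structure in edges is `len(path) - 1`, so the observation that no chain or pendant exceeds 4 edges can be checked directly on the returned paths.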
Screw Chain
Note that the neighbors
differ by one size step.
Conclusion
● We built a product network from the purchase data
provided by a Fortune 500 Specialty Retailer
● We identified core products and related them to the
staples—frequently purchased items
● We extracted structural network tiles (stars, cliques, and
chains/pendants) and related them to the retailing classes
of substitutes and complements
● We believe that the tiles reflect the consumer view on the
retail product hierarchy and could be used as building
blocks for automated identification of customer projects
Next Steps
● Study the role of core products and their attribution to the
Retailer's product hierarchy
● Discover tile hierarchies (such as stars of cliques and
cliques of stars)
● Use tiles as building blocks for consumers' projects
Acknowledgment
The authors would like to thank the Wharton Customer Analytics
Initiative (WCAI) for providing the data set that made this
research possible.