This document discusses reimagining recommendations by measuring business similarity based on shared customers. It presents customer transaction data for Tesco, Asda, and other businesses and uses cosine similarity and matrix factorization to analyze how similar the businesses are based on which customers frequent them. It then discusses using the customer transaction data to create a graph and analyzing the graph to determine expected degrees of separation between businesses as a new recommendation metric. Finally, it outlines some of the complexity and scalability challenges with this approach and proposes solutions like factorizing graphs, eliminating insignificant paths, and linear sampling to address those challenges.
2. 2 | Barclays PCB - Advanced Data Analytics
Recommendation == Similarity
– How is Tesco similar to Asda?
– How is Tesco in Bristol similar to Asda in Bath?
– How is Tesco in Bristol similar to Bristol Angling Centre?
Assume that the similarity of two businesses can be measured by whether they share customers
3.
The data: Customer transactions
Timestamp | Customer    | Business               | Amount (£)
…         | Bob Smith   | Tesco, Bristol         | …
…         | Mary Jones  | Tesco, Bristol         | …
…         | Bob Smith   | Asda, Bath             | …
…         | John Taylor | Bristol Angling Centre | …
4.
Cosine Similarity
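$\text{Asda} = \mathbf{A} = (1, 1, 1, 0, 1, 1, 0, 0)$
$\text{Tesco} = \mathbf{B} = (1, 0, 1, 0, 1, 1, 0, 1)$
$\mathbf{A} \cdot \mathbf{B} = \lVert \mathbf{A} \rVert \, \lVert \mathbf{B} \rVert \cos\theta$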
Customer      | Tesco | Asda
Bob Smith     | Yes   | Yes
Mary Jones    | No    | Yes
John Taylor   | Yes   | Yes
Jane Williams | No    | No
Gary Brown    | Yes   | Yes
Liz Davis     | Yes   | Yes
David Evans   | No    | No
Helen Wilson  | Yes   | No
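As a worked check of the formula on this slide, the sketch below computes cos θ for the two vectors A (Asda) and B (Tesco), with one entry per customer in table order. The function name `cosine_similarity` is illustrative and does not come from the slides.

```python
import math

# One entry per customer, in the order of the table:
# Bob, Mary, John, Jane, Gary, Liz, David, Helen (1 = Yes, 0 = No).
asda = [1, 1, 1, 0, 1, 1, 0, 0]   # vector A
tesco = [1, 0, 1, 0, 1, 1, 0, 1]  # vector B


def cosine_similarity(a, b):
    """Return cos(theta) = (a . b) / (|a| |b|) for two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Four shared customers, five customers each: 4 / (sqrt(5) * sqrt(5)) = 0.8
print(round(cosine_similarity(asda, tesco), 3))
```

Four customers shop at both businesses and each business has five customers, so the similarity is 4 / 5 = 0.8.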
5.
Matrix factorisation
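Customer      | Tesco | Asda
Bob Smith     | Yes   | Yes
Mary Jones    | No    | Yes
John Taylor   | Yes   | Yes
Jane Williams | No    | No
Gary Brown    | Yes   | Yes
Liz Davis     | Yes   | Yes
David Evans   | No    | No
Helen Wilson  | Yes   | No

⇒

$$
\begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 1 \\ 0 & 0 \\ 1 & 1 \\ 1 & 1 \\ 0 & 0 \\ 1 & 0 \end{pmatrix}
\approx
\underbrace{\begin{pmatrix} \dots \\ \dots \\ \dots \\ \dots \\ \dots \\ \dots \\ \dots \\ \dots \end{pmatrix}}_{\text{Preferences of Customers}}
\times
\underbrace{\begin{pmatrix} \dots & \dots \end{pmatrix}}_{\text{Attributes of Businesses}}
$$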
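The slide does not say which factorisation method is used or how many latent factors are kept. As a minimal sketch, assuming a single latent factor and plain power iteration (both choices are assumptions made here for illustration), the customer-by-business matrix can be approximated by a column of customer preferences times a row of business attributes:

```python
import math

# Rows are customers in table order; columns are (Tesco, Asda); 1 = Yes.
matrix = [
    (1, 1), (0, 1), (1, 1), (0, 0),
    (1, 1), (1, 1), (0, 0), (1, 0),
]


def rank_one_factorisation(m, iterations=100):
    """Approximate m by an outer product: customers (column) x businesses (row).

    Uses power iteration to find the leading singular pair. This is an
    illustrative choice, not necessarily the method behind the slide.
    """
    n_cols = len(m[0])
    v = [1.0 / (j + 1) for j in range(n_cols)]  # arbitrary non-zero start
    sigma = 0.0
    for _ in range(iterations):
        # u <- m v, normalised to unit length
        u = [sum(x * y for x, y in zip(row, v)) for row in m]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        # v <- m^T u; its length is the current singular value estimate
        v = [sum(row[j] * u[i] for i, row in enumerate(m)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    customers = [sigma * x for x in u]  # "preferences of customers"
    return customers, v                 # v: "attributes of businesses"


customers, businesses = rank_one_factorisation(matrix)
approximation = [[c * b for b in businesses] for c in customers]
```

With only one factor the two businesses receive equal attribute values, because the single factor captures what Tesco and Asda have in common; more factors would be needed to separate them.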
6.
Using the whole graph
[Figure: graph with nodes Tesco, Asda, BP and Boots]
Timestamp | CustomerID | MerchantName | Amount (£)
…         | 1          | Tesco        | …
…         | 1          | Asda         | …
…         | 2          | Boots        | …
…         | 2          | BP           | …
…         | 3          | Tesco        | …
…         | 3          | Boots        | …
…         | 3          | BP           | …
…         | 4          | Asda         | …
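The slides do not spell out how the graph is built from the transactions. One plausible construction, assumed here for illustration, links two businesses whenever a customer has transacted at both and weights the edge by the number of such shared customers. The function name `shared_customer_graph` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# (CustomerID, MerchantName) pairs from the example table. Timestamps and
# amounts are elided on the slide, so they are left out here.
transactions = [
    (1, "Tesco"), (1, "Asda"),
    (2, "Boots"), (2, "BP"),
    (3, "Tesco"), (3, "Boots"), (3, "BP"),
    (4, "Asda"),
]


def shared_customer_graph(rows):
    """Map each unordered business pair to its number of shared customers."""
    merchants_by_customer = defaultdict(set)
    for customer_id, merchant in rows:
        merchants_by_customer[customer_id].add(merchant)

    weights = defaultdict(int)
    for merchants in merchants_by_customer.values():
        # Sorting gives each pair one canonical key, e.g. ("Asda", "Tesco").
        for a, b in combinations(sorted(merchants), 2):
            weights[(a, b)] += 1
    return dict(weights)


graph = shared_customer_graph(transactions)
for pair, shared in sorted(graph.items()):
    print(pair, shared)
```

Under this assumption customer 4, who only shops at Asda, adds no edge, and Boots and BP are the most strongly linked pair because customers 2 and 3 use both.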
7.
Customer transactions as a graph
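[Figure: the transactions drawn as a graph over Tesco, Asda, BP and Boots, apparently alongside a matrix with the same four businesses as row and column labels; edges and cell values not recoverable]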
8.
Our metric: Expected Degrees of Separation
[Figure: Tesco, Asda, BP and Boots, each listed twice (apparently row and column labels); contents not recoverable]
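The slides name this metric but do not define it. One possible reading, offered purely as an assumption, is the expected number of steps a random walk takes to get from one business to another, where each step follows an edge with probability proportional to its weight. The sketch below computes that quantity by fixed-point iteration. The edge weights are shared-customer counts derived from the slide 6 example table, which is itself an assumed construction, and all function names are hypothetical.

```python
# Assumed edge weights: number of customers shared by each business pair.
weights = {
    ("Asda", "Tesco"): 1,
    ("BP", "Boots"): 2,
    ("BP", "Tesco"): 1,
    ("Boots", "Tesco"): 1,
}


def transition_probabilities(edge_weights):
    """Turn symmetric edge weights into random-walk step probabilities."""
    neighbours = {}
    for (a, b), w in edge_weights.items():
        neighbours.setdefault(a, {})[b] = w
        neighbours.setdefault(b, {})[a] = w
    probs = {}
    for node, nbrs in neighbours.items():
        total = sum(nbrs.values())
        probs[node] = {nbr: w / total for nbr, w in nbrs.items()}
    return probs


def expected_steps_to(target, probs, tol=1e-12, max_iter=100_000):
    """Expected random-walk steps from every node to `target`.

    Iterates h(node) = 1 + sum_p p(node -> nbr) * h(nbr), with h(target) = 0,
    until the values stop changing.
    """
    steps = {node: 0.0 for node in probs}
    for _ in range(max_iter):
        updated = {
            node: 0.0 if node == target
            else 1.0 + sum(p * steps[nbr] for nbr, p in probs[node].items())
            for node in probs
        }
        change = max(abs(updated[n] - steps[n]) for n in probs)
        steps = updated
        if change < tol:
            break
    return steps


probs = transition_probabilities(weights)
to_tesco = expected_steps_to("Tesco", probs)
```

Under this reading the measure is asymmetric: Asda is one expected step from Tesco because Tesco is its only neighbour, while a walk starting at Tesco takes longer on average to reach Asda because it can wander between Boots and BP first.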
9.
Complexity
13.
Elimination of insignificant paths
14.
Linear Sampling
[Figure: an approximation (≈); the two sides are not recoverable]
15.
Data Science is reimagining data
Editor's Notes
Introduction to Barclays Advanced Data Analytics (ADA)
ADA is a data science team which innovates, designs and builds applications that deliver, direct to customers, relevant analytical content that will help them make smart decisions to improve their lives.
We aim to make applications that will revolutionise the way Barclays relates to our customers. The long-term vision is to give each of our customers the same level of engagement and support in planning their finances and their lives as they would have if they were billionaires.