Applied Data Science for E-Commerce
Hello!
I am Arul Bharathi
◇ Data Scientist Intern @ Realtor.com
◇ Master’s in Big Data, Simon Fraser University
◇ PG Diploma in Data Analytics
◇ 5 years in Banking & Financial Services (BI & Data Science)
@linkedin.com/in/arulbharathi
Interests
◇ Deep Learning - Computer Vision
◇ ML Engineering & Deployment
◇ Photography & Blogging
Vision
◇ Non-profit VNC DL Community
◇ Crowdfund and co-learn with DL Enthusiasts
◇ Solving Complex Problems
Why are we all here?
● Data Science Use Cases in E-Commerce
● Recommendation Systems - Walkthrough
● Consumer Behavioral Modelling - RF
● Deep Learning
○ Image Recommendation
○ Image Similarity
1. Data Science Use Cases in E-Commerce
Use Case                        | Types of Models
Consumer Behavior Analysis      | Tree-based Models - Interpretation
User Experience                 | Recurrent Neural Networks, Language Models
Product Category Classification | Deep Learning - Image Classification, Object Detection
Product Recommendation          | Deep Learning - Image Regression, NLP, Recommendation Systems
2. Recommendation Systems - Walkthrough
Types of Recommendation Systems
Recommender Systems
● Content Based Filtering
● Collaborative Filtering
○ User Based
○ Item Based
● Hybrid Methods
Content Based Filtering
● Main Idea - Recommend items to Customer X similar to previous
items rated highly by X
Pros
● No need of data on other users
● Able to recommend to users with unique tastes
● Able to recommend new and unpopular items
● Able to provide explanations
Cons
● How to build a user profile for new users?
● Unable to exploit quality judgements of other users
● Overspecialization
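The main idea above can be sketched in a few lines. This is a minimal illustration, not a production recommender: the item names, the three feature columns, the ratings, and the `min_rating` threshold are all hypothetical. It builds Customer X's profile from the items X rated highly and ranks the unrated items by cosine similarity to that profile.

```python
import numpy as np

# Hypothetical item profiles: one row per item, one column per feature.
# Columns (assumed for illustration): clothing, outdoor, formal.
item_names = ["red dress", "blue dress", "backpack", "hiking boots"]
item_profiles = np.array([
    [1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
])

# Ratings given by Customer X; NaN marks items X has not rated (made-up values).
ratings_x = np.array([5.0, np.nan, 1.0, np.nan])


def recommend_content_based(profiles, ratings, min_rating=4.0):
    """Rank unrated items by cosine similarity to the user's profile."""
    rated = ~np.isnan(ratings)
    liked = rated & (ratings >= min_rating)
    # User profile = rating-weighted average of the highly rated item profiles.
    user_profile = np.average(profiles[liked], axis=0, weights=ratings[liked])
    norms = np.linalg.norm(profiles, axis=1) * np.linalg.norm(user_profile)
    scores = profiles @ user_profile / norms
    candidates = np.where(~rated)[0]
    return sorted(((int(i), float(scores[i])) for i in candidates),
                  key=lambda pair: -pair[1])


ranked = recommend_content_based(item_profiles, ratings_x)
for index, score in ranked:
    print(item_names[index], round(score, 3))
```

No other user's data is needed, which matches the first pro listed above. The overspecialization con is also visible: the top result is nearly identical to what X already liked.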
Collaborative Filtering
● Consider user X
● Find set N of other users whose ratings are similar to X’s ratings
● Estimate X’s ratings based on the ratings of users in N
User Based Collaborative Filtering
● Find Similar Users - Users similar to you are found using correlation metrics
● Predict Ratings - Predict the ratings you would give to products from the ratings of those similar users
● Recommend - Recommend the movies with the highest predicted ratings
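A small sketch of user based collaborative filtering, under stated assumptions: the rating matrix is made up, Pearson correlation stands in for the "correlation metrics", and only positively correlated users are kept as neighbours. The function names are hypothetical.

```python
import numpy as np

# Hypothetical user x item rating matrix; NaN = not rated (made-up values).
ratings = np.array([
    [5.0, 4.0, 1.0, np.nan],   # target user (row 0) has not rated item 3
    [4.0, 5.0, 2.0, 5.0],      # similar taste
    [1.0, 2.0, 5.0, 1.0],      # opposite taste
])


def pearson(a, b):
    """Pearson correlation over the items both users have rated."""
    both = ~np.isnan(a) & ~np.isnan(b)
    if both.sum() < 2:
        return 0.0
    a_c = a[both] - a[both].mean()
    b_c = b[both] - b[both].mean()
    denom = np.linalg.norm(a_c) * np.linalg.norm(b_c)
    return float(a_c @ b_c / denom) if denom > 0 else 0.0


def predict_user_based(ratings, user, item):
    """Similarity-weighted average of the ratings given by similar users."""
    num = den = 0.0
    for other in range(ratings.shape[0]):
        if other == user or np.isnan(ratings[other, item]):
            continue
        sim = pearson(ratings[user], ratings[other])
        if sim > 0:  # keep only positively correlated users as neighbours
            num += sim * ratings[other, item]
            den += sim
    return num / den if den > 0 else float("nan")


predicted = predict_user_based(ratings, user=0, item=3)
print(round(predicted, 2))
```

Items would then be recommended in descending order of predicted rating.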
Item Based Collaborative Filtering
[Figure: similar items -> higher predicted rating -> recommendation]
● Find Similar Items - Find items similar to the item that is being considered for recommendation to the user
● Predict Ratings - Predict the user's rating for the item using a weighted average of the user's ratings on the similar items
● Recommend - Recommend the item based on its predicted rating
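The item based variant can be sketched the same way. Assumptions: the ratings are made up, 0 marks "not rated", cosine similarity between item columns is one possible similarity choice, and the helper names are hypothetical. The prediction is the weighted average of the user's own ratings on the similar items.

```python
import numpy as np

# Hypothetical user x item ratings; 0 = not rated (made-up values).
ratings = np.array([
    [5.0, 4.0, 0.0],   # target user (row 0) has not rated item 2
    [4.0, 5.0, 5.0],
    [1.0, 2.0, 1.0],
    [5.0, 5.0, 4.0],
])


def item_cosine(ratings, i, j):
    """Cosine similarity between two item columns, over users who rated both."""
    both = (ratings[:, i] > 0) & (ratings[:, j] > 0)
    a, b = ratings[both, i], ratings[both, j]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def predict_item_based(ratings, user, item):
    """Weighted average of the user's ratings, weighted by item similarity."""
    rated = [j for j in range(ratings.shape[1])
             if j != item and ratings[user, j] > 0]
    sims = np.array([item_cosine(ratings, item, j) for j in rated])
    if sims.sum() == 0:
        return float("nan")
    return float(sims @ ratings[user, rated] / sims.sum())


predicted = predict_item_based(ratings, user=0, item=2)
print(round(predicted, 2))
```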
Pros
● Works for any kind of item, no feature selection needed
Cons
● Cold Start - Need enough users in the system to start
● Sparsity - Hard to find users that have rated the same item
● Popularity Bias - Cannot recommend items to users with unique tastes
3. Consumer Behavior Modelling
PIPELINE
DATA AGGREGATION -> FEATURE EXTRACTION -> TREE MODEL BUILDING -> MODEL INTERPRETATION
AGGREGATION & FEATURE EXTRACTION
Consumer ID | Session Start Time | Session End Time | Date     | Price of Purchase
0001        | 8.00 AM            | 8.30 AM          | 01-01-18 | 0
0002        | 9.00 AM            | 10.00 AM         | 01-02-18 | 25
0001        | 5.00 PM            | 7.00 PM          | 23-03-18 | 55
0002        | 9.00 PM            | 11.30 PM         | 05-05-18 | 50
Consumer Profile Rollup
Consumer ID | Total Duration | No of Sessions | Session Gap | Total Price
0001        | 150 mins       | 2              | 81 days     | 55
0002        | 210 mins       | 2              | 93 days     | 75
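The rollup above can be reproduced with a short standard-library sketch. Two assumptions are made: dates are read as day-month-year (which is what the 81 and 93 day gaps imply), and the session gap is taken as the number of days between a consumer's first and last session. The function and key names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Session log copied from the table above: (consumer, start, end, date, price).
sessions = [
    ("0001", "8.00 AM", "8.30 AM", "01-01-18", 0),
    ("0002", "9.00 AM", "10.00 AM", "01-02-18", 25),
    ("0001", "5.00 PM", "7.00 PM", "23-03-18", 55),
    ("0002", "9.00 PM", "11.30 PM", "05-05-18", 50),
]


def parse(date, clock):
    # Assumes day-month-year dates and 12-hour clock times like "8.30 AM".
    return datetime.strptime(f"{date} {clock}", "%d-%m-%y %I.%M %p")


def rollup(sessions):
    """Aggregate the session log into one feature row per consumer."""
    grouped = defaultdict(list)
    for consumer, start, end, date, price in sessions:
        grouped[consumer].append((parse(date, start), parse(date, end), price))
    profiles = {}
    for consumer, rows in grouped.items():
        rows.sort()  # order sessions by start time
        profiles[consumer] = {
            "total_duration_mins": sum(
                int((end - start).total_seconds() // 60) for start, end, _ in rows
            ),
            "no_of_sessions": len(rows),
            # Assumed definition: days between the first and last session.
            "session_gap_days": (rows[-1][0].date() - rows[0][0].date()).days,
            "total_price": sum(price for _, _, price in rows),
        }
    return profiles


profiles = rollup(sessions)
for consumer, profile in sorted(profiles.items()):
    print(consumer, profile)
```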
Why Random Forest for Behavior Analysis?
● Almost No Input Preparation
● Implicit Feature Selection (Random)
● Very Quick Training
● Tough to Beat!
● Versatility
● Interpretability (Well, based on your efforts!)
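A rough sketch of these points using scikit-learn's `RandomForestRegressor`. The data is synthetic and generated only for illustration; it is not the demo dataset. The feature names mirror the rollup columns and are assumptions. The raw, unscaled features go straight into the model, and the fitted forest reports which features it relied on.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic consumer profiles (made up for illustration, no scaling applied).
feature_names = ["total_duration_mins", "no_of_sessions", "session_gap_days"]
X = np.column_stack([
    rng.uniform(0, 300, n),    # total_duration_mins
    rng.integers(1, 10, n),    # no_of_sessions
    rng.uniform(0, 100, n),    # session_gap_days
])

# Synthetic target: revenue driven mostly by total duration, plus noise.
y = 0.5 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances: a first step towards interpreting the forest.
importances = dict(zip(feature_names, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda pair: -pair[1]):
    print(f"{name}: {value:.3f}")
```

Feature importance is only a starting point. Partial dependence and per-tree interpretation need extra work, which is the "based on your efforts" caveat above.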
DEMO
● Dataset - Google Analytics Consumer Revenue Prediction
● Model Building
● Feature Analysis
● Partial Dependence
● Tree Interpretation
Powered By Jeremy Howard
4. Image Deep Learning in E-Commerce
Most Valuable 10 mins about Deep Learning...
Image Deep Learning Use Cases
Use Case                        | Process                        | Technique
Product Recommendation          | Image Similarity               | Cosine Similarity, NN
                                | Image Popularity Prediction    | Image Regression
Image Quality Scoring           | Image Aestheticity Calculation |
Product Image Enhancement       | Image Enhancement              | Reducing Pixel Loss
Product Category Classification | Image Classification           | Transfer Learning/Fine Tuning
Image Deep Learning - Process
● Load, Rescale and Normalize
● Select the CNN Architecture (ResNet, ResNeXt)
● Retrieve the image embeddings from the final layer
● Perform the task (regression, classification) by adding Fully Connected Layers on top of the embeddings
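The first step can be sketched with NumPy alone. Assumptions: "rescale" is read as scaling pixel values to [0, 1], normalization uses the ImageNet channel statistics commonly paired with pretrained ResNet-style models, and a plain white array stands in for a loaded and resized image. Loading the file and running the CNN are left out.

```python
import numpy as np

# ImageNet channel means and standard deviations (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_uint8):
    """Rescale an H x W x 3 uint8 image to [0, 1], then normalize per channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD
    # Reorder to channels-first (3 x H x W), the layout PyTorch models expect.
    return np.transpose(normalized, (2, 0, 1))


# Stand-in for a loaded 224 x 224 RGB image (all white).
image = np.full((224, 224, 3), 255, dtype=np.uint8)
tensor = preprocess(image)
print(tensor.shape, tensor.dtype)
```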
Image Popularity Prediction
● Click Through Rate Prediction - Amazon, Pinterest
● Number of Likes Prediction - Flickr, Facebook etc.
● CTR, Likes - Labels
● Training Deep CNNs (ResNeXt-101) for regression
Image Aestheticity Prediction
● Explicit Ranking - Prediction
● Labels - Crowdsourced Ratings for images
● Building a Deep Learning Model to predict the ratings
Image Classification
● Product Category Classification - Amazon, eBay
● Product Type (Female Clothing, Backpacks etc.) - Labels
● Training Deep CNNs (ResNeXt-101) for classification
Common Image Similarity Measures
◇ Nearest Neighbors
◇ Cosine Similarity
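Both measures can be combined in a short sketch: score every catalog image against a query by cosine similarity, then return the nearest neighbours. The four-dimensional embeddings and product names below are made up for illustration; real CNN embeddings have far more dimensions.

```python
import numpy as np

# Hypothetical image embeddings (made-up values, tiny dimension for readability).
catalog = {
    "sneaker_a": np.array([0.9, 0.1, 0.0, 0.2]),
    "sneaker_b": np.array([0.8, 0.2, 0.1, 0.3]),
    "handbag": np.array([0.1, 0.9, 0.4, 0.0]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(query, embeddings, k=2):
    """Return the k catalog entries closest to the query embedding."""
    scores = [(name, cosine_similarity(query, vec))
              for name, vec in embeddings.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


query = np.array([1.0, 0.1, 0.0, 0.2])  # embedding of the product being viewed
top = most_similar(query, catalog, k=2)
for name, score in top:
    print(name, round(score, 4))
```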
Product Recommendation using Image Similarity
[Figure: Similarity Scores using ResNet50 - 0.7984, 0.7751, 0.7110]
[Figure: Similarity Scores using ResNet50 - 0.8387, 0.790]
Data Augmentation
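● Augment every image with transformed copies, such as rotations to different angles
● Improves prediction quality because the model sees more variation during training
● Helps the model generalize instead of memorizing the original images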
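A minimal augmentation sketch, restricted to right-angle rotations and mirror flips so that it needs only NumPy. Rotating by arbitrary angles would require an image library. The `augment` helper and the tiny stand-in image are hypothetical.

```python
import numpy as np


def augment(image):
    """Return the image at 0, 90, 180 and 270 degrees, each with a mirrored copy."""
    variants = []
    for quarter_turns in range(4):
        rotated = np.rot90(image, k=quarter_turns, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # horizontal mirror
    return variants


# Tiny stand-in "image": 2 rows x 3 columns x 3 channels.
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)
augmented = augment(image)
print(len(augmented), [variant.shape for variant in augmented[:4]])
```

Each original image yields eight training samples here, and every copy keeps the label of the original.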
Thanks!
Any questions?
Interested in DL Study group?
You can find me at:
◇ linkedin.com/in/arulbharathi
◇ aarul@sfu.ca
