SlideShare a Scribd company logo
1 of 35
Download to read offline
Recommendation Play
     @Flipkart
Recommendation Systems aim to predict the
  user’s intent and help connect them to
products they need or may be interested in,
          in an automated fashion
Item-Item Recommendations
Frequently Bought Together
Why Recommendation
        Important?
Product discovery is an important step in
users interaction with an ecommerce website
and recommendation systems play a very
useful part in the discovery

This becomes ever so important with
increasing size of the catalog, making it
harder for every user to express their intent
through well formed queries
Recommendation Types
● Help Find: other similar products
  smartphones when on samsung galaxy sIII, other budget phones when at
  nokia 101



● Help Decide
  People who viewed this ended up buying



● Complete the purchase experience
  memory card for mobiles, cases for camera



● Excite or wow
  New Chetan Bhagat Novel, Printer ink after 6 months
Recommendation System

          =

         Data
          +
      Algorithms
Data for Recommendations
● Users interactions on the website

● Anything which reflects his/her interest on
  the items (browse, order, compare, review,
  etc)

● Any data that is applicable to all users of the
  website
Algorithms for Recommendations

● Collaborative Filtering
  ○ Item to Item
  ○ User to User

● Content Based Filtering
  ○ Meta data - music genres, brands, authors
  ○ Titles - books on java, accessories for nikon D90
Users   Items
                       User to User
         1
                ●   For user u, find the set
 A                  of users S(u) that have
         2          similar preference as u
 B              ●   Recommend popular
         3          items among users in S
                    (u) to user u.
 C
         4
 D
         5
Item to Item
Users   Items
                ●   For item i, find the set of
         1          items S(i) that are
                    touched together
 A
         2      ●   For every user u, find
                    set of items N(u) they
 B                  have browsed/bought
         3          before
 C              ●   For every item x in N(u)
         4          add S(x) to the
                    recommendation set
 D
         5      ●   Sort and show the
                    recommendation set
Users   Latent Nodes       Items

                            1
 A      budget cameras


                            2
 B          canon

                            3
 C
         classical music
                            4
 D
                            5
Deeper Dive
Item Item Collaborative Filtering
Collaborative Filtering

● Filtering for patterns using collaboration
  among multiple entities: agents, viewpoints,
  data sources, etc

● Automates the process of "word of mouth"
Data Modeling
● Bought History.
  ○ Items bought together in the same transaction.
  ○ Items bought by the User.


● Browse History
  ○ Items browsed together successively with in 5 mins.


● Compare History

● 30 secs music sample etc..
Algorithm
B1 -> P1, P2 , P3, P4
B2 -> P2, P3, P4
B3 -> P1, P2, P4 Item/
                   Item
                          P1   P2   P3   P4


                   P1     -    2    1    2
Recommendation     P2          -    2    3
P1 -> P2, P4, P3   P3               -    2
P2 -> P4, P1,P3
                   P4                    -
P3 -> P2, P4, P1
P4 -> P2, P1, P2
Recommendation Score
Recommendation Score                   between the Products.

                                       P1, P2 is
        P1
                                       Intersection count = 150

                                       P1, P3

                        100,000        Intersection count = 200
       1200
       Visits.                         After removing the
                                       popularity skew,

                                       P1, P2 is
                 1000 Visits
  P2                                   Score = 150 / 1000 = 0.15
                                  P3
                                       P1, P3 is

                                       Score = 200 / 100000 = 0.002
Formula:

Recommendation Score = ( i * i ) / ( n1 * n2).

Where,
 i = count of occurrences of items i1, i2
     together.
 n1 = # of occurrences of the item i1 alone.
 n2 = # of occurrences of the item i2 alone.
Computations Scale
● # of Product Page visits = 6 Million/day.

● # of sessions = 30 Million/Week.

● Size of data crunched for full index = 2TB
Scaling with Map Reduce
●   Calculate 'i' by forming pairs and counting
●   Calculate 'n1' by making P1 as the key
●   Calculate 'n2' by making P2 as the key
●   Took 2 hours on 6 months of data
●   Scales Horizontally

● Previous/Conventional approach
    ○ Hash the pairs to form huge number of small files,
      sort and collate
    ○ Would have taken 20x the time (2 days)
    ○ Not horizontally scalable
Item-Item Recommendations
Personalization
https://www.flipkart.com/recommendations
Personalization explained
● Item-Item Recommendations

● Understand the User's behavior.
    ○   Bought Items
    ○   WishList Items
    ○   Cart Additions
    ○   Rated Products.

 User => Products that are defining the behavior/intentions.

 Products => Recommendations

 Forming Personalized Recommendations,

 User => Recommendations.
Personalization
https://www.flipkart.com/recommendations
Personalized Recommendation
System Architecture
                                                                Recommendation Index: Data Store
                                Map Reduce: Hadoop Farm
      Data Store

                                                                       Redis                 Redis
                                                                      (Master)               (Slave)




   Log Processing
   Engine
                                       Backend
                                    Computation Layer




                                                                         Caching with Lazy




                                                                                                       Memcache Farm
                                    Frontend Service




                                                                              Loads
                          Website       Email             SEM
User Generated Content,
Feedback
                               User Channels: Customers
Are we on right track?
● Metrics
  ○ Click through rates (CTR) :
         ● Relevance/Quality
         ● Visibility
  ○ Click to Orders (Conversion):
         ● Quality
         ● Confidence
  ○ Bounce Rate
  ○ Number of items in an order
  ○ Contribution to overall sale

● A/B Testing
Click Through Ratio(CTR)
Recommendation contribution: # of
Units Purchased
Learnings

● Push changes with AB Testing

● Think about Scalability

● Every user events tells something
Long path ahead
● Incorporate contextual information in CF.
   ○ Differentiating Transient effect Vs Long term pattern
   ○ Dealing with cold starts

● Personalization everywhere
  ○ Understand the Customer, boostrap their profile
  ○ User-Attribute relations that are true across verticals

● Platformization
  ○ Plug in new data sources, pipe in input signals effortlessly
  ○ Plug in multiple algorithms
  ○ Streamlined tracking and feedback collection
Thank You
                 Visit Us
http://flipkart.com/recommendations
               Contact Us
             @flipkart_tech
     Data is the Power!!
Map Reduce Paradigm(Step 1)
B1 -> P1, P2, P3, P4,         Mapper: Key( Pair of items) =>
B2 -> P2, P3, P4, P5,                   Value(weight)
Generating pairs:             Reducer: Accumulates the
                              weights for each Pair.

 Key    Valu   Key     Valu        Reducer O/P   Reducer O/P
         e              e
                                     P1 P2 1       P3 P4 2
P1 P2    1     P2 P3    1
                              =>     P1 P3 1       P3 P5 1
P1 P3    1     P2 P4    1
                                     P1 P4 1       P4 P5 1
P1 P4    1     P2 P5    1
                                     P2 P3 2
P2 P3    1     P3 P4    1
                                     P2 P4 2
P2 P4    1     P3 P5    1
                                     P2 P5 1
P3 P4    1     P4 P5    1
Map Reduce Paradigm(Step 2)
Calculating the value 'n1':
Input:
   P1 P4 1        Mapper: Key ( i1) => Value( Pairs with weights)
   P2 P3 1        Reducer: Accumulates the i1's to form the
   P4 P5 1                 Pairs with weights, n1
   P2 P4 1

     Key      Value                     Reducer Output

     P1      P1 P4 1 1                     P1 P4 1 1

     P2      P2 P3 1 1                     P2 P3 1 2
                            =>
     P4      P4 P5 1 1                     P4 P5 1 1

     P2      P2 P4 1 1                     P2 P4 1 2
Map Reduce Paradigm(Step 3)
Calculating the value 'n2':
Input:
   P1 P4 1        Mapper: Key ( i2) => Value( Pairs with weights)
   P2 P3 1        Reducer: Accumulates the i2's to form the
   P4 P5 1                  Pairs with weights, n2
   P2 P4 1

     Key      Value                  Reducer Output

     P4     P1 P4 1 1 1                P1 P4 1 1 2

     P3     P2 P3 1 1 1                P2 P3 1 2 1
                          =>
     P5     P4 P5 1 1 1                P4 P5 1 1 1

     P4     P2 P4 1 1 1                P2 P4 1 2 2

More Related Content

What's hot

Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Tuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureTuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureDatabricks
 
Big data introduction
Big data introductionBig data introduction
Big data introductionChirag Ahuja
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science LandscapePhilip Bourne
 
Vector databases and neural search
Vector databases and neural searchVector databases and neural search
Vector databases and neural searchDmitry Kan
 
eBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareeBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareNextGen Healthcare
 
Applying Data Science and Analytics in Marketing
Applying Data Science and Analytics in MarketingApplying Data Science and Analytics in Marketing
Applying Data Science and Analytics in MarketingData Con LA
 
Data Science Use cases in Banking
Data Science Use cases in BankingData Science Use cases in Banking
Data Science Use cases in BankingArul Bharathi
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data miningHoang Nguyen
 
Building a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsBuilding a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsIronside
 
Big Data & Data Science
Big Data & Data ScienceBig Data & Data Science
Big Data & Data ScienceBrijeshGoyani
 
Art and Science of Dashboard Design
Art and Science of Dashboard DesignArt and Science of Dashboard Design
Art and Science of Dashboard DesignSavvyData
 
Business intelligence architectures.pdf
Business intelligence architectures.pdfBusiness intelligence architectures.pdf
Business intelligence architectures.pdfAnand572211
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in HealthcareDavid Talby
 
Big data a possible game changer for e-governance
Big data   a possible game changer for e-governanceBig data   a possible game changer for e-governance
Big data a possible game changer for e-governanceSomenath Nag
 

What's hot (20)

Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Tuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and ArchitectureTuning ML Models: Scaling, Workflows, and Architecture
Tuning ML Models: Scaling, Workflows, and Architecture
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science Landscape
 
Business Analytics Thesis Topics
Business Analytics Thesis TopicsBusiness Analytics Thesis Topics
Business Analytics Thesis Topics
 
Data science big data and analytics
Data science big data and analyticsData science big data and analytics
Data science big data and analytics
 
Vector databases and neural search
Vector databases and neural searchVector databases and neural search
Vector databases and neural search
 
eBook - Data Analytics in Healthcare
eBook - Data Analytics in HealthcareeBook - Data Analytics in Healthcare
eBook - Data Analytics in Healthcare
 
Amazon case study
Amazon case studyAmazon case study
Amazon case study
 
Applying Data Science and Analytics in Marketing
Applying Data Science and Analytics in MarketingApplying Data Science and Analytics in Marketing
Applying Data Science and Analytics in Marketing
 
Data Science Use cases in Banking
Data Science Use cases in BankingData Science Use cases in Banking
Data Science Use cases in Banking
 
Business analytics and data mining
Business analytics and data miningBusiness analytics and data mining
Business analytics and data mining
 
Building a Winning Roadmap for Analytics
Building a Winning Roadmap for AnalyticsBuilding a Winning Roadmap for Analytics
Building a Winning Roadmap for Analytics
 
Amazon
AmazonAmazon
Amazon
 
Big Data & Data Science
Big Data & Data ScienceBig Data & Data Science
Big Data & Data Science
 
AI in Marketing.pptx
AI in Marketing.pptxAI in Marketing.pptx
AI in Marketing.pptx
 
Art and Science of Dashboard Design
Art and Science of Dashboard DesignArt and Science of Dashboard Design
Art and Science of Dashboard Design
 
Business intelligence architectures.pdf
Business intelligence architectures.pdfBusiness intelligence architectures.pdf
Business intelligence architectures.pdf
 
Natural Language Understanding in Healthcare
Natural Language Understanding in HealthcareNatural Language Understanding in Healthcare
Natural Language Understanding in Healthcare
 
Big data a possible game changer for e-governance
Big data   a possible game changer for e-governanceBig data   a possible game changer for e-governance
Big data a possible game changer for e-governance
 

Viewers also liked

Strategic recommendations for flipkart
Strategic recommendations for flipkartStrategic recommendations for flipkart
Strategic recommendations for flipkartPavankumar Wadhonkar
 
Complete marketing analysis of Flipkart.
Complete marketing analysis of Flipkart. Complete marketing analysis of Flipkart.
Complete marketing analysis of Flipkart. Piyush Kapoor
 
Satya prakash_project report on Flipkart
Satya prakash_project report on FlipkartSatya prakash_project report on Flipkart
Satya prakash_project report on FlipkartSatya Prakash Tripathi
 
Study of “Flipkart.com”: India’s Leading E-business Portal
Study of “Flipkart.com”: India’s Leading E-business PortalStudy of “Flipkart.com”: India’s Leading E-business Portal
Study of “Flipkart.com”: India’s Leading E-business PortalSagar Agrawal
 
Flipkart Logistic & Supply chain management
Flipkart Logistic & Supply chain managementFlipkart Logistic & Supply chain management
Flipkart Logistic & Supply chain managementSagar Sawant
 
Building tiered data stores using aesop to bridge sql and no sql systems
Building tiered data stores using aesop to bridge sql and no sql systemsBuilding tiered data stores using aesop to bridge sql and no sql systems
Building tiered data stores using aesop to bridge sql and no sql systemsRegunath B
 
Flipkart company analysis and strategic & tactical recommendations
Flipkart  company analysis and strategic & tactical recommendationsFlipkart  company analysis and strategic & tactical recommendations
Flipkart company analysis and strategic & tactical recommendationsSumit K Jha
 
E commerce data migration in moving systems across data centres
E commerce data migration in moving systems across data centres E commerce data migration in moving systems across data centres
E commerce data migration in moving systems across data centres Regunath B
 
Saunders client feedback and recommendation ppt report 3.23
Saunders client feedback and recommendation ppt report 3.23Saunders client feedback and recommendation ppt report 3.23
Saunders client feedback and recommendation ppt report 3.23Natascha Saunders
 
Mech vii-operation research [06 me74]-notes
Mech vii-operation research [06 me74]-notesMech vii-operation research [06 me74]-notes
Mech vii-operation research [06 me74]-notesMallikarjunaswamy Swamy
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender SystemsLior Rokach
 
Summary of a Recommender Systems Survey paper
Summary of a Recommender Systems Survey paperSummary of a Recommender Systems Survey paper
Summary of a Recommender Systems Survey paperChangsung Moon
 
genetic algorithm based music recommender system
genetic algorithm based music recommender systemgenetic algorithm based music recommender system
genetic algorithm based music recommender systemneha pevekar
 
Graph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBayGraph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBayDataStax Academy
 

Viewers also liked (20)

Strategic recommendations for flipkart
Strategic recommendations for flipkartStrategic recommendations for flipkart
Strategic recommendations for flipkart
 
flipkart marketing strategies
flipkart marketing strategiesflipkart marketing strategies
flipkart marketing strategies
 
Flipkart
FlipkartFlipkart
Flipkart
 
Complete marketing analysis of Flipkart.
Complete marketing analysis of Flipkart. Complete marketing analysis of Flipkart.
Complete marketing analysis of Flipkart.
 
Flipkart
FlipkartFlipkart
Flipkart
 
Satya prakash_project report on Flipkart
Satya prakash_project report on FlipkartSatya prakash_project report on Flipkart
Satya prakash_project report on Flipkart
 
Study of “Flipkart.com”: India’s Leading E-business Portal
Study of “Flipkart.com”: India’s Leading E-business PortalStudy of “Flipkart.com”: India’s Leading E-business Portal
Study of “Flipkart.com”: India’s Leading E-business Portal
 
Flipkart Logistic & Supply chain management
Flipkart Logistic & Supply chain managementFlipkart Logistic & Supply chain management
Flipkart Logistic & Supply chain management
 
Building tiered data stores using aesop to bridge sql and no sql systems
Building tiered data stores using aesop to bridge sql and no sql systemsBuilding tiered data stores using aesop to bridge sql and no sql systems
Building tiered data stores using aesop to bridge sql and no sql systems
 
Search@flipkart
Search@flipkartSearch@flipkart
Search@flipkart
 
Flipkart company analysis and strategic & tactical recommendations
Flipkart  company analysis and strategic & tactical recommendationsFlipkart  company analysis and strategic & tactical recommendations
Flipkart company analysis and strategic & tactical recommendations
 
Flipkart
FlipkartFlipkart
Flipkart
 
E commerce data migration in moving systems across data centres
E commerce data migration in moving systems across data centres E commerce data migration in moving systems across data centres
E commerce data migration in moving systems across data centres
 
Saunders client feedback and recommendation ppt report 3.23
Saunders client feedback and recommendation ppt report 3.23Saunders client feedback and recommendation ppt report 3.23
Saunders client feedback and recommendation ppt report 3.23
 
Mech vii-operation research [06 me74]-notes
Mech vii-operation research [06 me74]-notesMech vii-operation research [06 me74]-notes
Mech vii-operation research [06 me74]-notes
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Summary of a Recommender Systems Survey paper
Summary of a Recommender Systems Survey paperSummary of a Recommender Systems Survey paper
Summary of a Recommender Systems Survey paper
 
Research Methodology
Research MethodologyResearch Methodology
Research Methodology
 
genetic algorithm based music recommender system
genetic algorithm based music recommender systemgenetic algorithm based music recommender system
genetic algorithm based music recommender system
 
Graph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBayGraph Based Recommendation Systems at eBay
Graph Based Recommendation Systems at eBay
 

Similar to Recommendations play @flipkart

Recommendations play @flipkart (3)
Recommendations play @flipkart (3)Recommendations play @flipkart (3)
Recommendations play @flipkart (3)hava101
 
Intelligent Search
Intelligent SearchIntelligent Search
Intelligent SearchTed Dunning
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetCrossing Minds
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with sparkModern Data Stack France
 
Functional Programming - Worth the Effort
Functional Programming - Worth the EffortFunctional Programming - Worth the Effort
Functional Programming - Worth the EffortBoldRadius Solutions
 
Teasing talk for Flow-based programming made easy with PyF 2.0
Teasing talk for Flow-based programming made easy with PyF 2.0Teasing talk for Flow-based programming made easy with PyF 2.0
Teasing talk for Flow-based programming made easy with PyF 2.0Jonathan Schemoul
 
Graph processing at scale using spark & graph frames
Graph processing at scale using spark & graph framesGraph processing at scale using spark & graph frames
Graph processing at scale using spark & graph framesRon Barabash
 
Recommender Systems from A to Z – Model Evaluation
Recommender Systems from A to Z – Model EvaluationRecommender Systems from A to Z – Model Evaluation
Recommender Systems from A to Z – Model EvaluationCrossing Minds
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccsrisatish ambati
 
[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache SparkNaukri.com
 
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...Philipp Schroeder
 
The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...Holden Karau
 
Apache Pinot Meetup Sept02, 2020
Apache Pinot Meetup Sept02, 2020Apache Pinot Meetup Sept02, 2020
Apache Pinot Meetup Sept02, 2020Mayank Shrivastava
 
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...OpenSource Connections
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fittingWush Wu
 
Generative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineGenerative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineDeep Learning Italia
 
Securerank ping-opendns
Securerank ping-opendnsSecurerank ping-opendns
Securerank ping-opendnsPing Yan
 
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...Big Data Spain
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Chris Fregly
 

Similar to Recommendations play @flipkart (20)

Recommendations play @flipkart (3)
Recommendations play @flipkart (3)Recommendations play @flipkart (3)
Recommendations play @flipkart (3)
 
1000 track2 Bharadwaj
1000 track2 Bharadwaj1000 track2 Bharadwaj
1000 track2 Bharadwaj
 
Intelligent Search
Intelligent SearchIntelligent Search
Intelligent Search
 
Recommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right DatasetRecommender Systems from A to Z – The Right Dataset
Recommender Systems from A to Z – The Right Dataset
 
Hadoop France meetup Feb2016 : recommendations with spark
Hadoop France meetup  Feb2016 : recommendations with sparkHadoop France meetup  Feb2016 : recommendations with spark
Hadoop France meetup Feb2016 : recommendations with spark
 
Functional Programming - Worth the Effort
Functional Programming - Worth the EffortFunctional Programming - Worth the Effort
Functional Programming - Worth the Effort
 
Teasing talk for Flow-based programming made easy with PyF 2.0
Teasing talk for Flow-based programming made easy with PyF 2.0Teasing talk for Flow-based programming made easy with PyF 2.0
Teasing talk for Flow-based programming made easy with PyF 2.0
 
Graph processing at scale using spark & graph frames
Graph processing at scale using spark & graph framesGraph processing at scale using spark & graph frames
Graph processing at scale using spark & graph frames
 
Recommender Systems from A to Z – Model Evaluation
Recommender Systems from A to Z – Model EvaluationRecommender Systems from A to Z – Model Evaluation
Recommender Systems from A to Z – Model Evaluation
 
Cacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svccCacheconcurrencyconsistency cassandra svcc
Cacheconcurrencyconsistency cassandra svcc
 
[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark[@NaukriEngineering] Apache Spark
[@NaukriEngineering] Apache Spark
 
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...
Sketching, Wireframing, Prototyping - How to Be Agile and Avoid Half-Baked Us...
 
The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...The magic of (data parallel) distributed systems and where it all breaks - Re...
The magic of (data parallel) distributed systems and where it all breaks - Re...
 
Apache Pinot Meetup Sept02, 2020
Apache Pinot Meetup Sept02, 2020Apache Pinot Meetup Sept02, 2020
Apache Pinot Meetup Sept02, 2020
 
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
Haystack 2019 - Solving for Satisfaction: Introduction to Click Models - Eliz...
 
Online advertising and large scale model fitting
Online advertising and large scale model fittingOnline advertising and large scale model fitting
Online advertising and large scale model fitting
 
Generative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_AlaeddineGenerative adversarial network_Ayadi_Alaeddine
Generative adversarial network_Ayadi_Alaeddine
 
Securerank ping-opendns
Securerank ping-opendnsSecurerank ping-opendns
Securerank ping-opendns
 
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...
Value extraction from BBVA credit card transactions. IVAN DE PRADO at Big Dat...
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
 

Recently uploaded

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 

Recently uploaded (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 

Recommendations play @flipkart

  • 2. Recommendation Systems aim to predict the user’s intent and help connect them to products they need or may be interested in, in an automated fashion
  • 5. Why Recommendation Important? Product discovery is an important step in users interaction with an ecommerce website and recommendation systems play a very useful part in the discovery This becomes ever so important with increasing size of the catalog, making it harder for every user to express their intent through well formed queries
  • 6. Recommendation Types ● Help Find: other similar products smartphones when on samsung galaxy sIII, other budget phones when at nokia 101 ● Help Decide People who viewed this ended up buying ● Complete the purchase experience memory card for mobiles, cases for camera ● Excite or wow New Chetan Bhagat Novel, Printer ink after 6 months
  • 7. Recommendation System = Data + Algorithms
  • 8. Data for Recommendations ● Users interactions on the website ● Anything which reflects his/her interest on the items (browse, order, compare, review, etc) ● Any data that is applicable to all users of the website
  • 9. Algorithms for Recommendations ● Collaborative Filtering ○ Item to Item ○ User to User ● Content Based Filtering ○ Meta data - music genres, brands, authors ○ Titles - books on java, accessories for nikon D90
  • 10. Users Items User to User 1 ● For user u, find the set A of users S(u) that have 2 similar preference as u B ● Recommend popular 3 items among users in S (u) to user u. C 4 D 5
  • 11. Item to Item Users Items ● For item i, find the set of 1 items S(i) that are touched together A 2 ● For every user u, find set of items N(u) they B have browsed/bought 3 before C ● For every item x in N(u) 4 add S(x) to the recommendation set D 5 ● Sort and show the recommendation set
  • 12. Users Latent Nodes Items 1 A budget cameras 2 B canon 3 C classical music 4 D 5
  • 13. Deeper Dive Item Item Collaborative Filtering
  • 14. Collaborative Filtering ● Filtering for patterns using collaboration among multiple entities: agents, viewpoints, data sources, etc ● Automates the process of "word of mouth"
  • 15. Data Modeling ● Bought History. ○ Items bought together in the same transaction. ○ Items bought by the User. ● Browse History ○ Items browsed together successively with in 5 mins. ● Compare History ● 30 secs music sample etc..
  • 16. Algorithm B1 -> P1, P2 , P3, P4 B2 -> P2, P3, P4 B3 -> P1, P2, P4 Item/ Item P1 P2 P3 P4 P1 - 2 1 2 Recommendation P2 - 2 3 P1 -> P2, P4, P3 P3 - 2 P2 -> P4, P1,P3 P4 - P3 -> P2, P4, P1 P4 -> P2, P1, P2
  • 17. Recommendation Score Recommendation Score between the Products. P1, P2 is P1 Intersection count = 150 P1, P3 100,000 Intersection count = 200 1200 Visits. After removing the popularity skew, P1, P2 is 1000 Visits P2 Score = 150 / 1000 = 0.15 P3 P1, P3 is Score = 200 / 100000 = 0.002
  • 18. Formula: Recommendation Score = ( i * i ) / ( n1 * n2). Where, i = count of occurrences of items i1, i2 together. n1 = # of occurrences of the item i1 alone. n2 = # of occurrences of the item i2 alone.
  • 19. Computations Scale ● # of Product Page visits = 6 Million/day. ● # of sessions = 30 Million/Week. ● Size of data crunched for full index = 2TB
  • 20. Scaling with Map Reduce ● Calculate 'i' by forming pairs and counting ● Calculate 'n1' by making P1 as the key ● Calculate 'n2' by making P2 as the key ● Took 2 hours on 6 months of data ● Scales Horizontally ● Previous/Conventional approach ○ Hash the pairs to form huge number of small files, sort and collate ○ Would have taken 20x the time (2 days) ○ Not horizontally scalable
  • 23. Personalization explained ● Item-Item Recommendations ● Understand the User's behavior. ○ Bought Items ○ WishList Items ○ Cart Additions ○ Rated Products. User => Products that are defining the behavior/intentions. Products => Recommendations Forming Personalized Recommendations, User => Recommendations.
  • 26. System Architecture Recommendation Index: Data Store Map Reduce: Hadoop Farm Data Store Redis Redis (Master) (Slave) Log Processing Engine Backend Computation Layer Caching with Lazy Memcache Farm Frontend Service Loads Website Email SEM User Generated Content, Feedback User Channels: Customers
  • 27. Are we on right track? ● Metrics ○ Click through rates (CTR) : ● Relevance/Quality ● Visibility ○ Click to Orders (Conversion): ● Quality ● Confidence ○ Bounce Rate ○ Number of items in an order ○ Contribution to overall sale ● A/B Testing
  • 29. Recommendation contribution: # of Units Purchased
  • 30. Learnings ● Push changes with AB Testing ● Think about Scalability ● Every user events tells something
  • 31. Long path ahead ● Incorporate contextual information in CF. ○ Differentiating Transient effect Vs Long term pattern ○ Dealing with cold starts ● Personalization everywhere ○ Understand the Customer, boostrap their profile ○ User-Attribute relations that are true across verticals ● Platformization ○ Plug in new data sources, pipe in input signals effortlessly ○ Plug in multiple algorithms ○ Streamlined tracking and feedback collection
  • 32. Thank You Visit Us http://flipkart.com/recommendations Contact Us @flipkart_tech Data is the Power!!
  • 33. Map Reduce Paradigm(Step 1) B1 -> P1, P2, P3, P4, Mapper: Key( Pair of items) => B2 -> P2, P3, P4, P5, Value(weight) Generating pairs: Reducer: Accumulates the weights for each Pair. Key Valu Key Valu Reducer O/P Reducer O/P e e P1 P2 1 P3 P4 2 P1 P2 1 P2 P3 1 => P1 P3 1 P3 P5 1 P1 P3 1 P2 P4 1 P1 P4 1 P4 P5 1 P1 P4 1 P2 P5 1 P2 P3 2 P2 P3 1 P3 P4 1 P2 P4 2 P2 P4 1 P3 P5 1 P2 P5 1 P3 P4 1 P4 P5 1
  • 34. Map Reduce Paradigm(Step 2) Calculating the value 'n1': Input: P1 P4 1 Mapper: Key ( i1) => Value( Pairs with weights) P2 P3 1 Reducer: Accumulates the i1's to form the P4 P5 1 Pairs with weights, n1 P2 P4 1 Key Value Reducer Output P1 P1 P4 1 1 P1 P4 1 1 P2 P2 P3 1 1 P2 P3 1 2 => P4 P4 P5 1 1 P4 P5 1 1 P2 P2 P4 1 1 P2 P4 1 2
  • 35. Map Reduce Paradigm(Step 3) Calculating the value 'n2': Input: P1 P4 1 Mapper: Key ( i2) => Value( Pairs with weights) P2 P3 1 Reducer: Accumulates the i2's to form the P4 P5 1 Pairs with weights, n2 P2 P4 1 Key Value Reducer Output P4 P1 P4 1 1 1 P1 P4 1 1 2 P3 P2 P3 1 1 1 P2 P3 1 2 1 => P5 P4 P5 1 1 1 P4 P5 1 1 1 P4 P2 P4 1 1 1 P2 P4 1 2 2