HBaseCon 2013: Real-Time Model Scoring in Recommender Systems
 

HBaseCon 2013: Real-Time Model Scoring in Recommender Systems

on

  • 2,205 views

Presented by: Jonathan Natkins (WibiData) and Juliet Hougland (WibiData)

Presented by: Jonathan Natkins (WibiData) and Juliet Hougland (WibiData)

Statistics

Views

Total Views
2,205
Views on SlideShare
2,147
Embed Views
58

Actions

Likes
7
Downloads
92
Comments
0

2 Embeds 58

http://www.kiji.org 47
http://www-new.kiji.org 11

Accessibility

Categories

Upload Details

Uploaded via as Adobe PDF

Usage Rights

© All Rights Reserved

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Processing…
Post Comment
Edit your comment

    HBaseCon 2013: Real-Time Model Scoring in Recommender Systems HBaseCon 2013: Real-Time Model Scoring in Recommender Systems Presentation Transcript

    • Real-­‐Time  Model  Scoring  in   Recommender  Systems   (c)  2013  WibiData,  Inc.      Juliet  Hougland  and  Jonathan  Natkins  
    • Why  Real-­‐Time?  
    • Who  Are  We?   •  Jon  "Na@y"  Natkins  (@na@yice)   •  Field  Engineer  at  WibiData   •  Before  that,  Cloudera  SoJware  Engineer   •  Before  that,  VerMca  SoJware/Field  Engineer   •  Juliet  Hougland  (@JulietHougland)   •  PlaPorm  Engineer  at  WibiData   •  MS  in  Applied  Math  and  BA  in  Math-­‐Physics  
    • What  is  Kiji?   The  Kiji  Project  is  a   modular,  open-­‐source   framework  that  enables   developers  and  analysts   to  collect,  analyze  and   use  data  in  real-­‐Mme   applicaMons.   •  kiji.org   •  github.com/kijiproject  
    • Genera<ng  Recommenda<ons  
    • Genera<ng  Recommenda<ons  
    • Modeling  with  KijiMR   Producers   •  Operates  on  a  single  row  in  a  table.   •  Generate  derived  data:   o  Apply  a  classifier   o  Assign  a  user  to  a  cluster  or  segment   o  Recommend  new  items   Gatherers   •  Mapper  with  KijiTable  input.   •  Used  when  training  models.  
    • Genera<ng  Recommenda<ons  
    • Genera<ng  Recommenda<ons  
    • Genera<ng  Recommenda<ons  
    • Batch  Isn't  Good  For  Everything  
    • Batch  Isn't  Good  For  Everything  
    • Batch  Isn't  Good  For  Everything  
    • Fresheners  Compute  Lazily   Freshness   Policy   Read  a  column   Get  from  HBase   Fresh?  Yes,  return  to  client   KijiScoring  API   HBase  
    • Fresheners  Compute  Lazily   Freshness   Policy   Read  a  column   Get  from  HBase   Fresh?   Yes,  return  to  client   KijiScoring  API   HBase   Producer   Freshen   Cache  for  next  Mme  
    • How  can  we  make  "freshenable"   models?   Population interests change slowly Individual interests change quickly
    • How  can  we  make  "freshenable"   models?   Population interests change slowly Individual interests change quickly Models  don't  need  to   retrained  frequently   ApplicaMon  of  a  model   should  be  fast  
    • How  can  we  make  "freshenable"   models?   Individual interests change quickly ApplicaMon  of  a  model   should  be  fast  •  Train  a  model  over  your   enMre  data  set   •  Save  fi@ed  model   parameters  to  a  file,  or   another  table   •  Access  the  model   parameters  through  a   KeyValueStore  when   scoring  new  data  with  a   producer.  
    • More  Modeling  with  KijiMR   KeyValueStores   •  Allows  access  to  external  data  in  Producers  and   Gatherers.   •  Supports  various  file  formats  as  well  as  tables.   •  Makes  joining  dataset  together  very  easy.   •  The  mechanism  for  accessing  fi@ed  model   parameters  when  freshening.  
    • •  A real-time product recommendation system •  Content-based model using product descriptions and TF-IDF KijiShopping   UsersKijiShopping Web Application KijiSchema Avro, HBase KijiMR MapReduce KijiScoring
    • KijiShopping  Data  Collec<on   •  User Logins •  Product Information o  Names, descriptions, SKU information •  User Ratings o  Explicit ratings from users How do we go from data to recommendations?
    • Finding  Useful  Features   •  TF-IDF
    • TF-­‐IDF   •  Term Frequency o  How often does this term appear in this document? •  Document Frequency o  How many documents does this term appear in? •  TF-IDF o  How important is this term to this document? •  In KijiShopping, each is a separate job
    • •  Written as a Producer o  Executed on the Product table as a Map-only job o  WordCount on a per-record basis Compu<ng  Term  Frequency   HBase Read Product Description Count Words in Product Description Write Word Counts Back
    • •  Written as a Gatherer o  Executed on the Product table as a MapReduce job o  Groups by words Compu<ng  Document  Frequency   HBase Read Term Frequencies Map Emit (Word, 1) Write Document Frequencies HDFS Reduce Group By Word
    • •  Written as a Producer o  Executed on the Product table as a Map-only job o  Pulls in Document Frequencies as a KVStore Compu<ng  TF-­‐IDF   HBase Read Term Frequencies Divide TF by DF Write TF-IDFs Back HDFS Read Document Frequencies via KVStore
    • •  Batch training process •  Associations stored in a model table Associa<ng  Words  with  Products   gourmet knife "gourmet" Products "knife" Products tfidfgourmet tfidfknife
    • Determine  a  User's  Preferred  Words   •  Stored in a user table Natty gourmet knife wgourmet wknife
    • •  Producers incorporate models using KeyValueStores Combining  User  Ra<ngs  and  Models   Natty gourmet knife "gourmet" Products "knife" Products wgourmet wknife tfidfgourmet tfidfknife
    • Genera<ng  a  Recommenda<on   •  Pick the best products for your user
    • KijiShopping   The  model  was  built  with  KijiMR-­‐  an   extension  of  Hadoop  MapReduce.      
    • KijiShopping   The  model  was  built  with  KijiMR-­‐  an   extension  of  Hadoop  MapReduce.      
    • KijiExpress  Modeling  Lifecycle  
    • Want  to  know  more?   •  The Kiji Project o  kiji.org o  github.com/kijiproject •  KijiShopping o  github.com/wibidata/kiji-shopping Questions about this presentation? o  juliet@wibidata.com o  natty@wibidata.com
    • Want  to  know  more?   •  Come see us at the WibiData booth •  Join us at KijiCon tomorrow