HBaseCon 2012 | Getting Real about Interactive Big Data Management with Lily & HBase - ngdata

HBase brings interactivity to Hadoop and allows users to collect, manage, and process data in real time. Lily wraps HBase and Solr in a comprehensive Big Data platform, with HBase-native secondary indexing complementing ad-hoc structured search. By using spare write cycles during read operations, Lily transforms HBase into a scalable data management engine that provides interactive analytics, profile harvesting, and real-time recommendations. This talk highlights the architecture of Lily, shows how it completes HBase, and explains some of its implementation use cases.

Published in: Technology, Business

  1. Making Sense of Data
     Lily goes shopping: real-time recommendations with HBase
     HBaseCon, May 2012
     Steven Noels, VP Product, @stevenn
     WWW.NGDATA.COM
  2. Lily Core 2’ recap
     •  HBase-backed data repository, with batteries included
     •  Data model:
        •  high-level data model on top of HBase’s byte[]’s
        •  schema
        •  versioning (schema and data)
        •  links, variants
     •  Java & REST APIs
     •  Indexing:
        •  through configuration, not implementation
        •  incremental and batch index maintenance
     •  RowLog: distributed, durable queue for secondary actions
     •  Open Source: www.lilyproject.org (Apache License)
     [Diagram labels: client app, Lily, RowLog, HBase, Solr et al.]
  3. Why HBase?
     •  BigTable model
     •  sparseness
     •  atomic row updates aka consistency
     •  auto-partitioning
     •  Apache license
     •  A great community led by a Saint :-)
  4. Portfolio Overview
     [Layered diagram; labels in slide order]
     •  Real-time AI
     •  Recommendations
     •  Industry algorithms and rules
     (commercial availability)
     •  Trend Analytics
     •  Pattern Detection
     •  Profile Development
     •  Context and Activity Tracking
     (open source)
     •  Social Stream Ingestion
     •  Schema and Data Management
     •  Total Data Aggregation
     •  Real-time Index and Retrieval
     •  Security and Enterprise Connectors
  5. Lily (=HBase) In Use
     Some of the larger Lily deployments
     •  media
        •  aggregation, database publishing and online archives
     •  finance
        •  real-time identity fraud detection
     •  retail banking
        •  contextualized (time+loc+person) mobile coupons
     •  retail
        •  e-commerce platform: product catalog, consumer data store, real-time indexing
  6. Collaborative Filtering?
     Recommend items similar to a user’s highly-preferred items
  7. Collaborative Filtering is … Matrixes
     Sean likes “Scarface” a lot            (123,654,5.0)
     Robin likes “Scarface” somewhat        (789,654,3.0)
     Grant likes “The Notebook” not at all  (345,876,1.0)
     …                                      …
     (Magic)
     Grant may like “Scarface” quite a bit  (345,654,4.5)
     …                                      …
  8. Contextualized recommendations
     [Diagram labels: Personalized offers, shops & merchants, Profile, Activity, Item, product families, offers/coupons, credit card statements]
  9. Fitting Recommendations into the Lily Architecture
     [Architecture diagram; recoverable labels: LILY CRUD API, Lily/HBase Secondary Indexes, read/write demultiplexer, co-occurrence lookup matrix, rowlog, activity store, profile store, data store, LILY recommender engine, scoring, data/activity/profile indexes]
     Algorithm support: propensity, custom, ..., k-means, ALS
     Contact: Steven Noels, stevenn@ngdata.com, www.ngdata.com, telephone: +32 9 33 88 220, Gent (Belgium)
     Makers of Lily Core Repository
  10. Preferencing aka Feeding the Matrix
     •  Transaction-based preferencing
        •  Pluggable preference strategies, using Lily-based data (HBase & Solr) for decision making
        •  e.g. credit card statement = transactions between users and product families
        •  Preference weighting
        •  Ingest: REST API, bulk support
        •  Real-time updating of the recommendation model
     •  Profile Store
        •  Profile activities can be preferenced
        •  Support for Profile behavior analysis
  11. Making recommendations
     •  Recommender
        •  Pluggable recommender strategies, using Lily-based data (HBase & Solr) for decision making
        •  Multi-model support: user-item & item-user recommendations
        •  Estimation of both preferenced and non-preferenced items
        •  Geolocation-based recommendations
        •  Re-scoring
        •  REST API
     •  (Planned)
        •  Support for Classifications (scenario: recommend me all (possible) coffee drinkers)
        •  Matrix / recommendation indexing
  12. Other upcoming Lily Features
     •  Secondary indexes (= Lily Core!)
        •  indexes are defined through configuration
        •  single or multi-field indexes
        •  range queries and prefix queries
        •  asc or desc sorted results
        •  can read huge, sorted lists
        •  synchronously updated: index updates are applied by rowlog secondary actions
        •  online building of new indexes (no table locks)
        •  MapReduce integration
     •  SolrCloud integration
        •  Index shards and configuration managed through ZooKeeper
  13. Making Sense of Data
     Questions? Thank you!
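
The (user, item, preference) triples on slide 7 and the co-occurrence matrix named on slide 9 can be illustrated with a minimal Python sketch of item-based collaborative filtering. This is not Lily's implementation. The function names `build_cooccurrence` and `recommend`, the scoring rule, and the rows marked hypothetical are assumptions made for illustration. The scores only rank candidate items; they are not predicted ratings such as the 4.5 shown on the slide.

```python
from collections import defaultdict
from itertools import combinations

# (user_id, item_id, preference) triples. The first three come from slide 7
# (item 654 = "Scarface", item 876 = "The Notebook"); the rest are
# hypothetical rows added so that items actually co-occur.
PREFERENCES = [
    (123, 654, 5.0),  # Sean likes "Scarface" a lot
    (789, 654, 3.0),  # Robin likes "Scarface" somewhat
    (345, 876, 1.0),  # Grant likes "The Notebook" not at all
    (123, 876, 2.0),  # hypothetical
    (789, 876, 4.0),  # hypothetical
    (789, 999, 4.5),  # hypothetical third item
]


def build_cooccurrence(preferences):
    """Count, for each item pair, how many users have a preference for both."""
    items_by_user = defaultdict(set)
    for user, item, _pref in preferences:
        items_by_user[user].add(item)
    cooc = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(user, preferences, cooc, top_n=3):
    """Rank items the user has no preference for yet.

    Score = sum over the user's items of (co-occurrence count * preference).
    """
    own = {item: pref for u, item, pref in preferences if u == user}
    scores = defaultdict(float)
    for item, pref in own.items():
        for other, count in cooc[item].items():
            if other not in own:
                scores[other] += count * pref
    # Highest score first; ties broken by item id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


COOC = build_cooccurrence(PREFERENCES)
GRANT_RECS = recommend(345, PREFERENCES, COOC)
```

With this sample data, `GRANT_RECS` ranks item 654 ("Scarface") first for user 345 (Grant), because two other users have preferences for both 654 and 876. A production system would typically normalize scores and update the matrix incrementally as new preferences arrive.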