At RichRelevance, we service 10 of the top 20 Internet retailer chains and deliver more than $5.5 billions in attributable sales. Every 21 milliseconds a shopper clicks on a recommendation that we have delivered, and we serve over 850 million product recommendations daily. Our Hadoop infrastructure has a capacity to handle upwards of 1.5+ PB. Behavioral Targeting, specifically user segmentation and building personas, is critical for us in generating triggers when a user is added to a segment or switches from a segment. In this presentation, we intend to demonstrate not only how the events are captured, but also how they are stored in HBase in real-time. It is critical to design the system so it can handle thousands of writes per second and, at the same time, be able to query any combination of behavioral attributes in HBase through real-time APIs. This session will walk attendees through the entire design & architecture starting from data Ingestion, schema design, and access patterns, as well as some major problems like sharing & hot spotting. Furthermore, performance metrics will be presented, including the number of read/write per second and details around cluster configuration.
Clipping is a handy way to collect important slides you want to go back to later.