Using Couchbase and XDCR for Real-Time Advertising Applications: Couchbase Connect 2014


Mirror Image provides a fully managed, globally distributed and load-balanced real-time platform specifically designed for online and mobile advertising applications. Our customers’ advertising applications must execute with very fast response times, typically under 200 ms, regardless of where the request originates geographically.

Some of these applications require very large data sets, with hundreds of millions of rows and hundreds or thousands of values per row. Because consistency across a geographically distributed database was so challenging to achieve, our applications until recently managed customers’ data sets in flat files, which were limited in size and restricted the use cases we were able to implement. Learn how Couchbase Server with XDCR is enabling Mirror Image to work with much larger data sets and to implement and execute a wider range of advertising technology use cases for our customers.

Published in: Data & Analytics

  1. 1. Using Couchbase and XDCR for Real-Time Advertising Applications Joe Lichtenberg VP Advertising and Analytics, Mirror Image Couchbase Connect October 21, 2014
  2. 2. Mirror Image 1.0 (c. 1997): Content Delivery Network Media Delivery Content Logic
  3. 3. Mirror Image 2.0: Edge Computing + Media Delivery Edge Computing Media Delivery Content Logic Content Logic
  4. 4. Dynamic Delivery: Edge Computing + Media Delivery Edge Computing Media Delivery Content Logic Content Logic
  5. 5. Synchronized Operations Worldwide
  6. 6. Obligatory Marketing Slide
     Mirror Image’s Dynamic Delivery Network powers hundreds of billions of real-time requests for mission critical applications throughout the Advertising and Advertising Technology ecosystem
     • Extensive functional capabilities
       – Edge Computing
       – Geo-Distributed Database*
       – Live and On-Demand Video
       – Object Delivery
       – SSL Delivery
       – Token-Based Access Control
       – Reporting and Analytics
     • Personalized expert customization and support
       – Knowledgeable, dedicated, in-house support team
       – Fully staffed, 24 x 7 Network Operations Center (NOC)
       – Expert professional services resources
     • High capacity, worldwide, real time infrastructure
       – High capacity, centralized server model
       – Exceptional performance, scalability, and availability with SLA guarantees
       – Elastic scaling within and across geographies
       – Worldwide coverage
       – 15+ years of experience
  7. 7. Why Focus on Advertising and Ad-Tech?
     • Lots of prospects
     • Requires fast response times to each visitor’s browser / device
  8. 8. Ad Tech Ecosystem Source: Luma Partners
  9. 9. Ad Tech Ecosystem – Another View Source:
  10. 10. Request/Response/Feedback Cycle
      • Request Logic
        – Customized behavior based on explicit and derived request attributes such as IP address, user-agent, query parameters, cookie values, geo-location, …
      • Response Logic
        – Personalized JavaScript, XML, HTML; Transparent pixel GIFs; Cookie modifications; Transformations, token substitutions…
      • Edge Data Sets
        – Shared data sets such as IAB Spiders & Bots, WURFL (Device DB), IP-GeoLocation
        – Customer’s Key-Value Data
      • Customized Data Collection
        – Delivery of log files to customers as part of feedback cycle
  11. 11. Edge Computing Flow
      • IAB Spiders & Bots Data
      • Mobile Device Data Sets
      • IP – Geolocation Data Sets
      • Custom Key-Value Data In Memory*
  12. 12. What Problems Does the Couchbase Server (CS) Implementation Solve?
      • Customers and prospects require us to work with their large and growing data sets in real time at the edge of the internet
      • Our “flat file” implementation required moving the entire contents of the files into memory on each server at startup, and…
      • … customers were bumping up against data size limitations
  13. 13. Edge Computing Flow
      • IAB Spiders & Bots Data
      • Mobile Device Data Sets
      • IP – Geolocation Data Sets
      • Custom Key-Value Data In Memory
      • Couchbase Key-Value Distributed Database
  14. 14. Geo-Distributed Database: Use Cases
      • Real-time GUID database
      • Real-time cookie matching
      • Contextual targeting
      • Ad fraud / brand safety
      • Cross-device user matching / audience de-duplication
      • In-session execution of batch-developed analytics
      • Any web or mobile database-backed application with geographically dispersed users requiring fast end-to-end response times
  15. 15. Geo-Distributed Database: Requirements
      • High performance lookup and write capabilities at the edge
      • Ability to manage large custom data sets for each customer application
      • Low latency replication (aka XDCR)
      • Ability to replicate to different regions per application / different data per region
      • Key-value lookups only – no need for complex queries or SQL
      • Reliability
  16. 16. Implementation Details
      • Master hub sites have a 3-node Couchbase cluster for reliability
      • Edge sites start with a single instance for real-time RW access, and scale out via clustering based on demand
      • Edges can fail over to nearby edges for availability
      • Limited number of buckets
        – Multi-tenant
        – Geographic regions
        – Customer-specific bucket(s) are optional
  17. 17. Implementation Details
      • Phase 1 (in production): Read-only edges; master hubs; data import / ETL servers
      • Phase 2: Extends read-only behavior to virtual locations
      • Phase 3: Adds write support on edges, bi-directional XDCR back to each master hub
  18. 18. GDD “Marchitecture” - Read Only
  19. 19. GDD “Marchitecture” - Read/Write
  20. 20. Master Hub Configuration Customizable ETL Servers Couchbase Edge Instance Storage FTP Upload ETL Master Couchbase 3-Node Cluster ECF Server Farm
  21. 21. Distributed Star Network Architecture
  22. 22. GDD Key Benefits (or “Marketing Slide #2”)
      • Globally distributed, consistent data -> High performance globally
      • Schema-less data model -> Flexibility
      • Scalable -> Handles traffic and data growth and spikes
      • Fully managed service -> Including setup, maintenance, replication, synchronization, backup, security, and 24x7 monitoring
      • Subscription-based billing model -> No CAPEX, pay as you go
      • Uses Couchbase Server -> Proven; active developer community
      • Data privacy -> No conflicts of interest with our customers
  23. 23. Questions?? Joe Lichtenberg VP Advertising and Analytics, Mirror Image
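
Slides 15 and 16 describe key-value lookups at the edge, with edges failing over to nearby edges for availability. The sketch below illustrates that read path under loud assumptions: it uses plain in-memory dictionaries rather than Couchbase, and `EdgeSite`, `lookup_with_failover`, the site names, and the sample key are all hypothetical names invented for illustration. It is not Mirror Image's implementation, only one way the described behavior could look.

```python
from typing import Dict, List, Optional


class EdgeSite:
    """Hypothetical stand-in for one edge site's key-value store."""

    def __init__(self, name: str, data: Dict[str, str], up: bool = True) -> None:
        self.name = name
        self.data = data
        self.up = up

    def get(self, key: str) -> Optional[str]:
        # A real edge would query its local Couchbase instance here (assumption).
        if not self.up:
            raise ConnectionError(f"edge {self.name} is unavailable")
        return self.data.get(key)


def lookup_with_failover(key: str, edges: List[EdgeSite]) -> Optional[str]:
    """Try edges in order (local first, then nearby) until one answers.

    A miss (None) from a healthy edge is a valid answer and is returned as-is;
    only an unavailable edge triggers failover to the next one.
    """
    for edge in edges:
        try:
            return edge.get(key)
        except ConnectionError:
            continue  # this edge is down; try the next nearest one
    raise ConnectionError("no edge site available")


# Read-only replicas in the style of Phase 1: each edge holds a copy of hub data.
hub_data = {"guid:123": "segment-a"}
local = EdgeSite("local-edge", dict(hub_data), up=False)  # simulate an outage
nearby = EdgeSite("nearby-edge", dict(hub_data))

result = lookup_with_failover("guid:123", [local, nearby])
print(result)  # prints "segment-a", served by the nearby edge
```

Ordering the list by proximity keeps the common case on the local edge, and treating a miss as a valid answer avoids cascading extra lookups across regions for keys that simply do not exist.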