QuestPoint_Couchbase_SF_2013
 

  • QuestPoint has created one of the largest panels of consumer browsing behavior from around the globe, covering over 140 countries. QuestPoint purchases panelist and clickstream information from multiple sources. Each source provides the consumer some innate benefit for using its particular piece of software. In turn, consumers have opted in to having their clickstream anonymously tracked. Best-of-breed: behavior prediction, demographic correlation, retargeting, creatives, etc.
  • Possibly move the problem slides up to earlier in the presentation and describe things that we need to solve. Put in a slide to show big data ecosystem.
  • equals ~ 127,000 writes per second. Ad Server using an in-house designed sharding

QuestPoint_Couchbase_SF_2013 Presentation Transcript

  • QUESTPOINT: COUCHBASE, NOSQL AND BIG DATA, SEPT 2013
    Confidential – not to be distributed without written consent
    www.QuestPoint.com
  • table of contents
      • about me
      • about us
      • some basic information
      • legacy system architecture and problems
      • redesigned system architecture and improvements
      • comparison and closing
  • about me
      • currently the Chief Information Officer of QuestPoint
      • worked for Microsoft on the Xbox Live Services backend platform
      • 13 years of data architecting and coding experience
      • graduated with a BS in Technology Management with a minor in Math and emphasis in Computer Science
      • what does all of that actually mean?
  • about me
      • nerd
  • about us
      • at QuestPoint, we love data.
      • we transform proprietary data into actionable insights for both internal and external customers. we use data to better understand and predict online consumer behavior, adding visibility into who does what, when, and why.
      • these data-driven insights provide powerful competitive advantages to our business lines:
          • DATA SALES
          • COMPETITIVE INTEL
          • AUDIENCE INSIGHTS
          • LEAD GEN
          • AFFILIATE MARKETING
          • TRAFFIC ARBITRAGE
  • QuestPoint Marketing
      • performance marketing services.
      • high quality traffic and leads for online marketers.
      • QuestPoint Marketing uses data to crack the code on who does what online, when, and why.
      • Traffic Arbitrage (CPC | CPM)
      • Affiliate Marketing (CPA)
      • Lead Generation (CPL)
  • QuestPoint Decision
      • global leader in providing useful data-driven marketing insights to brands and marketers.
      • competitive and audience intel.
      • understand customers, potential customers, and competition better.
      • our large panel gives us superior breadth of countries as well as depth of statistical significance within each country.
      • Competitive Intelligence
      • Audience Insights
      • Multi-Platform (Mobile, Tablet, PC)
      • Ad Effectiveness & Media Planning
  • QuestPoint Decision
      [logo placeholder: Your Logo]
  • the basics
      • the data
          • extremely high volume of incoming data
          • structured and unstructured data sets
          • data used for driving application and user experience along with reporting
          • data needs to be available for both real-time and post-processed aggregated analysis
          • data needs to be accessible by multiple disparate systems (customers’ third party software portals, internal software, BI analysts, etc.)
  • the basics
      • some basic numbers
          • average ~ 62TB of network traffic a day
          • average ~ 11 billion database writes a day
          • average ~ 6.1 million concurrent user connections
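The daily averages above are easier to reason about as per-second rates. The short sketch below does that conversion; it assumes 1 TB = 10**12 bytes and a perfectly even load across the day, so real peaks would be higher than these figures.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(daily_total: float) -> float:
    """Convert a daily total into an average per-second rate."""
    return daily_total / SECONDS_PER_DAY

# Figures taken from the slide; the TB-to-bytes factor is an assumption.
writes_per_second = per_second(11e9)    # ~11 billion database writes a day
bytes_per_second = per_second(62e12)    # ~62 TB of network traffic a day
gigabits_per_second = bytes_per_second * 8 / 1e9

print(f"{writes_per_second:,.0f} writes/s on average")
print(f"{gigabits_per_second:.2f} Gbit/s on average")
```

The write rate comes out at roughly 127,000 per second, which matches the figure quoted in the speaker notes.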
  • legacy system
      • the in-house Ad Server farm
          • 8 Dell R810 running Microsoft SQL Server 2008 R2
          • 64GB of RAM (per server)
          • 1 x 1.2TB Fusion IO card (per server)
          • ~ 75,000 IOPS
      • the Business Intelligence (BI) server
          • Dell R910 running Microsoft SQL Server 2008 R2
          • 1TB of RAM
          • 7 x 1.2TB Fusion IO cards
          • ~ 420,000 IOPS
  • legacy system
      • though the Ad Server farm was horizontally scalable, it was cost prohibitive
      • scaling was a time-intensive and complex process
      • despite all the BI horsepower, some aggregation queries were taking up to 12 hours to run
      • it is extremely hard to make real-time decisions when your data is 12 hours behind
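The speaker notes mention an in-house sharding design for the Ad Server, but the deck does not describe it. As one possible illustration of why growing a sharded farm can be slow and complex, the sketch below assumes the simplest scheme, hash(key) mod N, which may not be what QuestPoint actually used. The function names and keys are hypothetical.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with a stable hash (assumed scheme, not QuestPoint's)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

def fraction_moved(keys, old_count: int, new_count: int) -> float:
    """Share of keys whose shard changes when the farm is resized."""
    moved = sum(
        1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count)
    )
    return moved / len(keys)

keys = [f"user:{i}" for i in range(100_000)]  # made-up keys for illustration
moved = fraction_moved(keys, old_count=8, new_count=9)
print(f"{moved:.1%} of keys move when going from 8 to 9 shards")
```

Under this assumed scheme, roughly eight in nine keys land on a different server after adding a ninth, so nearly the whole data set has to be migrated for one extra machine.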
  • redesigned system
      • think horizontally scalable…cheaply
          • commodity hardware
          • open source
          • cheap redundancy
      • chosen technologies
          • Couchbase
          • Hadoop (HDFS, MapReduce)
          • Hive, Pig, Oozie
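The deck names Hadoop MapReduce but does not show a job. As a rough illustration of the programming model only, the plain-Python sketch below (no Hadoop involved) counts page views per domain with map, shuffle, and reduce steps. The record fields and sample data are made up.

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    """Emit one (domain, 1) pair per clickstream record."""
    yield record["domain"], 1

def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)

records = [  # hypothetical sample data
    {"user": "a", "domain": "example.com"},
    {"user": "b", "domain": "example.org"},
    {"user": "a", "domain": "example.com"},
]

pairs = chain.from_iterable(map_phase(r) for r in records)
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

In a real cluster the map and reduce functions run in parallel on many nodes, which is what lets this style of aggregation scale out by adding machines.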
  • redesigned system
      • the in-house Ad Server farm
          • 14 Dell R610 running Couchbase 1.8
          • 48GB of RAM
          • 1 x 256GB OCZ SSD
          • ~ 55,000 IOPS
      • the BI Server Farm
          • 40 node Hadoop cluster
          • 620TB of storage (total)
          • ~ 22,000 IOPS
  • redesigned system
      • some benefits
          • the Ad Server farm is capable of performing 1.5 million operations per second
          • cheap and easy to scale out
          • if you hit a bottleneck, just add a node
          • built-in redundancy
          • 12 hour query dropped to 5 minutes
          • with abstraction layers like Hive, the learning curve for BI analysts was almost nothing
          • multiple BI analysts are able to run simultaneous queries, while hourly and daily scheduled jobs run, all without resource contention
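Two of these benefits can be put into rough numbers using figures quoted elsewhere in the deck. The sketch below is back-of-the-envelope arithmetic only: it compares a capacity stated in operations per second against an average write rate, which are not strictly the same thing, and it assumes load is spread evenly over the day.

```python
# Query time: 12 hours on the legacy BI server vs 5 minutes after the redesign.
legacy_query_minutes = 12 * 60
redesigned_query_minutes = 5
speedup = legacy_query_minutes / redesigned_query_minutes

# Ad Server capacity vs the average write load (~11 billion writes a day).
average_writes_per_second = 11e9 / 86_400
capacity_ops_per_second = 1.5e6
headroom = capacity_ops_per_second / average_writes_per_second

print(f"query speedup: {speedup:.0f}x")
print(f"capacity is about {headroom:.1f}x the average write rate")
```

The headroom figure matters because averages hide peaks; a farm sized only for the average rate would fall behind during busy periods.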
  • redesigned system
  • comparison and closing
      [bar chart: Legacy Design Cost vs Redesign Cost; categories: Ad, BI; series: Legacy, Redesign; cost axis from $0 to $250,000]
  • comparison and closing
      • with Couchbase and Hadoop
          • dramatic reduction in cost
          • increased performance
          • increased scalability
          • increased redundancy
          • decrease in headaches