QuestPoint_Couchbase_SF_2013

Published in: Technology

  • QuestPoint has created one of the largest panels of consumer browsing behavior in the world, covering over 140 countries. QuestPoint purchases panelist and clickstream information from multiple sources. Each source provides consumers some inherent benefit for using its particular piece of software. In turn, consumers have opted in to having their clickstream anonymously tracked. Best-of-breed: behavior prediction, demographic correlation, retargeting, creatives, etc.
  • Possibly move the problem slides up to earlier in the presentation and describe things that we need to solve. Put in a slide to show big data ecosystem.
  • Equals ~127,000 writes per second. The Ad Server uses an in-house designed sharding scheme.

    1. QUESTPOINT COUCHBASE, NOSQL AND BIG DATA (SEPT 2013)
       Confidential – not to be distributed without written consent
       www.QuestPoint.com
    2. table of contents
       • about me
       • about us
       • some basic information
       • legacy system architecture and problems
       • redesigned system architecture and improvements
       • comparison and closing
    3. about me
       • currently the Chief Information Officer of QuestPoint
       • worked for Microsoft on the Xbox Live Services backend platform
       • 13 years of data architecting and coding experience
       • graduated with a BS in Technology Management with a minor in Math and an emphasis in Computer Science
       • what does all of that actually mean?
    4. about me
       • nerd
    5. about us
       • at QuestPoint, we love data.
       • we transform proprietary data into actionable insights for both internal and external customers. We use data to better understand and predict online consumer behavior, adding visibility into who does what, when, and why.
       • these data-driven insights provide powerful competitive advantages to our business lines:
         • DATA SALES
         • COMPETITIVE INTEL
         • AUDIENCE INSIGHTS
         • LEAD GEN
         • AFFILIATE MARKETING
         • TRAFFIC ARBITRAGE
    6. QuestPoint Marketing
       • performance marketing services.
       • high-quality traffic and leads for online marketers.
       • QuestPoint Marketing uses data to crack the code on who does what online, when, and why.
       • Traffic Arbitrage (CPC | CPM), Affiliate Marketing (CPA), Lead Generation (CPL)
    7. QuestPoint Decision
       • global leader in providing useful data-driven marketing insights to brands and marketers.
       • competitive and audience intel.
       • understand customers, potential customers, and competition better.
       • our large panel gives us superior breadth of countries as well as depth of statistical significance within each country.
       • Competitive Intelligence, Audience Insights, Multi-Platform (Mobile, Tablet, PC), Ad Effectiveness & Media Planning
    8. QuestPoint Decision
       Your Logo
    9. the basics
       • the data
         • extremely high volume of incoming data
         • structured and unstructured data sets
         • data used for driving application and user experience along with reporting
         • data needs to be available for both real-time and post-processed aggregated analysis
         • data needs to be accessible by multiple disparate systems (customers’ third-party software portals, internal software, BI analysts, etc.)
    10. the basics
        • some basic numbers
          • average ~62TB of network traffic a day
          • average ~11 billion database writes a day
          • average ~6.1 million concurrent user connections
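The daily averages above are easier to reason about as per-second rates. A minimal sketch of the conversion, assuming decimal units (1 TB = 10^12 bytes) and load spread evenly over 24 hours; real traffic is bursty, so peaks will be higher. The variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Daily averages quoted on the slide.
writes_per_day = 11e9   # ~11 billion database writes
bytes_per_day = 62e12   # ~62 TB of network traffic (assumes 1 TB = 10**12 bytes)

writes_per_second = writes_per_day / SECONDS_PER_DAY
bytes_per_second = bytes_per_day / SECONDS_PER_DAY
gigabits_per_second = bytes_per_second * 8 / 1e9

print(f"{writes_per_second:,.0f} writes/s")
print(f"{bytes_per_second / 1e6:,.1f} MB/s")
print(f"{gigabits_per_second:.2f} Gbit/s")
```

The first figure matches the ~127,000 writes per second mentioned in the slide notes.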
    11. legacy system
        • the in-house Ad Server farm
          • 8 Dell R810 servers running Microsoft SQL Server 2008 R2
          • 64GB of RAM (per server)
          • 1 x 1.2TB Fusion-io card (per server)
          • ~75,000 IOPS
        • the Business Intelligence (BI) server
          • Dell R910 running Microsoft SQL Server 2008 R2
          • 1TB of RAM
          • 7 x 1.2TB Fusion-io cards
          • ~420,000 IOPS
    12. legacy system
        • though the Ad Server farm was horizontally scalable, it was cost-prohibitive
        • scaling was a time-intensive and complex process
        • despite all the BI horsepower, some aggregation queries were taking up to 12 hours to run
        • it is extremely hard to make real-time decisions when your data is 12 hours behind
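The deck does not describe how the in-house sharding worked, so the following is only an illustration of one common reason hand-rolled sharding is painful to scale. It assumes keys are placed with a simple hash-modulo rule (an assumption, not QuestPoint's documented scheme) and counts how many keys change servers when a ninth server joins an eight-server farm. The key format and function names are hypothetical.

```python
import zlib


def shard_for(key: str, num_shards: int) -> int:
    """Place a key on a shard with hash-modulo (assumed scheme, for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def fraction_moved(keys, old_shards: int, new_shards: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards)
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(100_000)]  # hypothetical key format
moved = fraction_moved(keys, old_shards=8, new_shards=9)
print(f"{moved:.1%} of keys move when growing from 8 to 9 shards")
```

Under this assumption, roughly eight in nine keys would need to be migrated for a single added server, which is one way scaling can become time-intensive.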
    13. redesigned system
        • think horizontally scalable…cheaply
          • commodity hardware
          • open source
          • cheap redundancy
        • chosen technologies
          • Couchbase
          • Hadoop (HDFS, MapReduce)
          • Hive, Pig, Oozie
    14. redesigned system
        • the in-house Ad Server farm
          • 14 Dell R610 servers running Couchbase 1.8
          • 48GB of RAM
          • 1 x 256GB OCZ SSD
          • ~55,000 IOPS
        • the BI server farm
          • 40-node Hadoop cluster
          • 620TB of storage (total)
          • ~22,000 IOPS
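To compare the two Ad Server farms as a whole, the per-server figures can be multiplied out. This is a rough sketch that assumes the redesigned farm's RAM and SSD figures are per server, as the legacy slide states explicitly for its own figures. IOPS are left out because the slides do not say whether they are per server or per farm. The `Farm` class is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Farm:
    name: str
    servers: int
    ram_gb_per_server: int
    flash_gb_per_server: int

    @property
    def total_ram_gb(self) -> int:
        return self.servers * self.ram_gb_per_server

    @property
    def total_flash_gb(self) -> int:
        return self.servers * self.flash_gb_per_server


# Legacy: 8 servers, 64GB RAM and one 1.2TB flash card each.
legacy = Farm("legacy (SQL Server)", servers=8,
              ram_gb_per_server=64, flash_gb_per_server=1200)
# Redesign: 14 servers, 48GB RAM and one 256GB SSD each (per-server is assumed).
redesign = Farm("redesign (Couchbase)", servers=14,
                ram_gb_per_server=48, flash_gb_per_server=256)

for farm in (legacy, redesign):
    print(f"{farm.name}: {farm.total_ram_gb} GB RAM, "
          f"{farm.total_flash_gb} GB flash/SSD")
```

Under these assumptions the redesigned farm carries more aggregate RAM (672 GB vs 512 GB) on far less flash (about 3.6 TB vs 9.6 TB); the deck does not state the sizing rationale.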
    15. redesigned system
        • some benefits
          • the Ad Server farm is capable of performing 1.5 million operations per second
          • cheap and easy to scale out
          • if you hit a bottleneck, just add a node
          • built-in redundancy
          • the 12-hour query dropped to 5 minutes
          • with abstraction layers like Hive, the learning curve for BI analysts was almost nothing
          • multiple BI analysts are able to run simultaneous queries, while hourly and daily scheduled jobs run, all without resource contention
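The benefit figures can be put in context with a little arithmetic. The sketch below is a back-of-the-envelope check, not a benchmark: it compares the quoted 1.5 million operations per second against the average write rate implied by ~11 billion writes a day, and computes the query speedup. It assumes load is spread evenly across the 14 nodes; note that "operations" likely includes reads as well as writes, so the headroom number is only indicative.

```python
SECONDS_PER_DAY = 86_400

avg_writes_per_second = 11e9 / SECONDS_PER_DAY  # from ~11 billion writes/day
farm_ops_per_second = 1.5e6                     # quoted farm capacity
nodes = 14                                      # Couchbase nodes in the Ad Server farm

headroom = farm_ops_per_second / avg_writes_per_second
ops_per_node = farm_ops_per_second / nodes      # assumes perfectly even load

legacy_query_minutes = 12 * 60
redesign_query_minutes = 5
speedup = legacy_query_minutes / redesign_query_minutes

print(f"headroom over average write load: {headroom:.1f}x")
print(f"ops per node if load is even: {ops_per_node:,.0f}/s")
print(f"aggregation query speedup: {speedup:.0f}x")
```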
    16. redesigned system
    17. comparison and closing
        [Bar chart: Legacy Design Cost vs Redesign Cost. Categories: Ad, BI. Series: Legacy, Redesign. Y-axis: $0.00 to $250,000.00.]
    18. comparison and closing
        • with Couchbase and Hadoop
          • dramatic reduction in cost
          • increased performance
          • increased scalability
          • increased redundancy
          • decrease in headaches
