Qo comparison


Published in: Technology

eXtremely large database (XLDB) analytics

XLDB Analytics in on-line advertising
OVERVIEW
A comparison of several analytical methodologies was conducted by running a standard on-line advertising campaign and audience analysis, consisting of 67 queries, against a 2 billion+ impression log file with 350 million unique “user-ids”.
Methodologies included in the study:
  • Software approach
      • Mathematical/algorithmic database: iQO
      • Relational database, row oriented: MySQL, Oracle
      • Relational database, column oriented: Vertica
  • Hardware approach (brute force computing)
      • Map reduce: Hadoop
      • Distributed relational database: Aster Data (performance and benchmark data used in this study were extrapolated from whitepapers published on the Aster Data website)

XLDB Analytics in on-line advertising
iQO delivers a quantum leap in performance and direct energy savings
Normalized values indexed against iQO
Carbon Units per Query (CUpQ) Calculation
Number of nodes + number of CPUs + disk (in TB), multiplied by the number of users and the sum of elapsed query times (in seconds) of the 67 queries. For 25 users, Hadoop would yield 35,925 carbon units to iQO’s 25.
Query response time observations
iQO provided the fastest and most uniform performance, with most queries returned in 5 seconds or less. Hadoop and Aster Data median response times were several hours, but were still 3 to 4 times faster than the relational approaches.
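
The Carbon Units per Query calculation described above can be sketched in a few lines. This is only one reading of the slide's wording: it assumes the hardware terms are summed first and then multiplied by both the number of users and the total elapsed seconds. The function names and the example inputs are hypothetical and are not taken from the study; only the 35,925 and 25 figures come from the slide.

```python
def carbon_units(nodes, cpus, disk_tb, users, elapsed_seconds):
    """Raw carbon units under the assumed reading of the slide:
    (nodes + CPUs + disk in TB) * users * total elapsed seconds."""
    hardware = nodes + cpus + disk_tb
    return hardware * users * sum(elapsed_seconds)


def index_against(raw, baseline_raw):
    """Express a raw value as a multiple of a baseline (baseline = 1.0)."""
    return raw / baseline_raw


# Hypothetical inputs, NOT figures from the study: one node, two CPUs,
# 1 TB of disk, 25 users, and 67 queries taking 5 seconds each.
example = carbon_units(nodes=1, cpus=2, disk_tb=1, users=25,
                       elapsed_seconds=[5] * 67)

# Ratio implied by the two figures quoted on the slide for 25 users.
hadoop_vs_iqo = index_against(35_925, 25)
```

With the slide's own figures, the ratio works out to 1,437 Hadoop carbon units for every iQO carbon unit.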

XLDB Analytics in on-line advertising
SCALABILITY
iQO answered 100% of the queries
  • iQO and “brute force computing” are readily scalable to daily terabyte volumes
  • iQO can readily scale to petabyte-volume applications, whereas Hadoop and Aster Data would reach unwieldy hardware proportions and extremely slow response times, making them impractical to implement
  • MySQL and Oracle are not scalable to daily terabyte volumes
  • Vertica scaled better than the row-oriented RDBMSs but introduced limitations in breadth of query capabilities

Estimate of resources needed for alternate solutions
Note: Oracle and MySQL were not able to scale with the data or return query results in sufficient time to be measured

iQO Solutions
Scalable & portable
Uniquely suited for high volume and high cardinality
Most queries returned in seconds
Allows for Distinct Counts and Exclusive Counts not available in other solutions
Lowest carbon footprint per query
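
To make the Distinct Counts and Exclusive Counts mentioned above concrete, here is a minimal in-memory sketch. The slides do not define either term, so this assumes a distinct count is the number of unique user-ids per campaign and an exclusive count is the number of user-ids seen in that campaign and in no other. The function name and the sample records are hypothetical, and this is not a description of how iQO or any other product computes these values.

```python
from collections import defaultdict


def audience_counts(impressions):
    """Return (distinct, exclusive) user counts per campaign.

    impressions: iterable of (user_id, campaign) pairs.
    """
    users_by_campaign = defaultdict(set)
    for user_id, campaign in impressions:
        users_by_campaign[campaign].add(user_id)

    # Distinct count: unique users per campaign.
    distinct = {c: len(users) for c, users in users_by_campaign.items()}

    # Exclusive count (assumed meaning): users seen in this campaign only.
    exclusive = {}
    for campaign, users in users_by_campaign.items():
        others = set().union(
            *(u for c, u in users_by_campaign.items() if c != campaign)
        )
        exclusive[campaign] = len(users - others)
    return distinct, exclusive


# Hypothetical impression log: u2 saw both campaigns, u1 appears twice.
log = [("u1", "A"), ("u2", "A"), ("u2", "B"), ("u3", "B"), ("u1", "A")]
distinct, exclusive = audience_counts(log)
```

Holding every user-id in memory like this only works for small logs; at the 350 million user-id scale described in the overview, the same counts need a different storage strategy, which is the difficulty the slides attribute to the other approaches.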

HADOOP
  • Requires extremely large server array to run effectively
  • Custom coding from skilled programmers required for each query
  • Inefficient data storage and management
  • Distinct counts difficult to achieve
  • Requires a team to manage effectively
  • No connectivity with BI and data mining tools
      • Business Objects
      • Tableau

Vertica
Extensive and costly customization required to improve query performance due to overhead of “training queries”
Not suited for very high volume and high cardinality
Unconstrained queries typical for on-line advertising analytics are extremely slow
Distinct Counts difficult to achieve and Exclusive Counts not possible
Not efficient for ad-hoc queries or exploration

Aster Data
  • Requires customized high-powered servers
  • Distinct Counts difficult to achieve and Exclusive Counts not possible
  • Not efficient for ad-hoc queries or exploration

MySQL/Oracle
  • Suitable for storing data, but access and indexing times make this untenable for any real analysis
  • Space and server requirements too difficult to estimate
  • Query times would be impossibly long