Query O

Published in: Technology

  1. 2. <ul><li>OVERVIEW </li></ul><ul><li>Several analytical methodologies were compared by running a standard on-line advertising campaign and audience analysis, consisting of 67 queries, against an impression log file with more than 2 billion impressions and 350 million unique “user-ids”. </li></ul><ul><li>Methodologies included in the study: </li></ul><ul><li>Software approach </li></ul><ul><ul><li>Mathematical/algorithmic database: iQO </li></ul></ul><ul><ul><li>Relational database, row oriented: MySQL, Oracle </li></ul></ul><ul><ul><li>Relational database, column oriented: Vertica </li></ul></ul><ul><li>Hardware approach (brute force computing) </li></ul><ul><ul><li>MapReduce: Hadoop </li></ul></ul><ul><ul><li>Distributed relational database: Aster Data </li></ul></ul><ul><ul><ul><li>Performance and benchmark figures for Aster Data were extrapolated from whitepapers published on the Aster Data website </li></ul></ul></ul>XLDB Analytics in on-line advertising
  2. 3. iQO delivers a quantum leap in performance and direct energy savings
     Carbon Units per Query (CUpQ) calculation: (number of nodes + number of CPUs + disk in TB), multiplied by the number of users and by the sum of the elapsed query times, in seconds, of the 67 queries. For 25 users, Hadoop would yield 35,925 carbon units to iQO’s 25.
     Query response time observations: iQO provided the fastest and most uniform performance, with most queries returned in 5 seconds or less. Hadoop and Aster Data median response times were several hours, but were still 3 to 4 times faster than the relational approaches.
     Normalized values indexed against iQO.
  3. 4. <ul><li>SCALABILITY </li></ul><ul><li>iQO and “brute force computing” are readily scalable to daily terabyte volumes </li></ul><ul><li>iQO can readily scale to petabyte-volume applications, whereas Hadoop and Aster Data would require unwieldy amounts of hardware and would still have extremely slow response times, making them impractical to implement </li></ul><ul><li>MySQL and Oracle are not scalable to daily terabyte volumes </li></ul><ul><li>Vertica scaled better than the row-oriented RDBMSs but limited the breadth of queries that could be run </li></ul>iQO answered 100% of the queries
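  4. 5. Estimate of resources needed for alternate solutions
     <table>
     <tr><th></th><th>iQO</th><th>Hadoop</th><th>Vertica</th><th>Aster Data</th></tr>
     <tr><td>CPUs</td><td>4</td><td>50</td><td>12</td><td>50</td></tr>
     <tr><td>Nodes</td><td>1</td><td>25+</td><td>3</td><td>25+</td></tr>
     <tr><td>Memory (GB)</td><td>32</td><td>800</td><td>96</td><td>1600</td></tr>
     <tr><td>Disk Space (TB)</td><td>2</td><td>24</td><td>12</td><td>24</td></tr>
     <tr><td>Personnel</td><td>0.5</td><td>Team?</td><td>Minimum of 2</td><td>Minimum of 2</td></tr>
     <tr><td>Query Time</td><td>Seconds to minutes</td><td>Minutes to days</td><td>Minutes to days</td><td>Minutes to days</td></tr>
     </table>
     Note: Oracle and MySQL were not able to scale with the data or return query results in sufficient time to be measured.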
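The Carbon Units per Query calculation described on slide 3 can be sketched in a few lines. This is only one reading of the slide's wording, namely (nodes + CPUs + disk in TB) × users × total elapsed seconds. The `Platform` class and `carbon_units` function are hypothetical names, the "25+" node count is taken as 25, and the elapsed-time totals are placeholders because the deck does not publish them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Hardware footprint of one solution (hypothetical helper type)."""
    name: str
    nodes: int
    cpus: int
    disk_tb: float


def carbon_units(platform: Platform, users: int, total_elapsed_s: float) -> float:
    """One reading of the slide's formula:
    (nodes + CPUs + disk TB) * users * sum of elapsed query seconds."""
    footprint = platform.nodes + platform.cpus + platform.disk_tb
    return footprint * users * total_elapsed_s


# Resource figures from the deck's resource-estimate slide ("25+" nodes taken as 25).
IQO = Platform("iQO", nodes=1, cpus=4, disk_tb=2)
HADOOP = Platform("Hadoop", nodes=25, cpus=50, disk_tb=24)

# Placeholder elapsed totals for the 67 queries; NOT figures from the study.
iqo_units = carbon_units(IQO, users=25, total_elapsed_s=100.0)
hadoop_units = carbon_units(HADOOP, users=25, total_elapsed_s=10_000.0)

# Index against iQO, as the slide's normalized values appear to do.
print(f"Hadoop / iQO carbon-unit ratio: {hadoop_units / iqo_units:.1f}")
```

Because the user count multiplies both sides equally, the indexed ratio depends only on the hardware footprint and the total elapsed time of each platform.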
  5. 6. <ul><li>iQO Solutions </li></ul><ul><li>Scalable & portable </li></ul><ul><li>Uniquely suited for high volume and high cardinality </li></ul><ul><li>Most queries returned in seconds </li></ul><ul><li>Allows for Distinct Counts and Exclusive Counts not available in other solutions </li></ul><ul><li>Lowest carbon footprint per query </li></ul>
  6. 7. <ul><li>HADOOP </li></ul><ul><li>Requires an extremely large server array to run effectively </li></ul><ul><li>Custom coding by skilled programmers required for each query </li></ul><ul><li>Inefficient data storage and management </li></ul><ul><li>Distinct Counts difficult to achieve </li></ul><ul><li>Requires a team to manage effectively </li></ul><ul><li>No connectivity with BI and data mining tools such as: </li></ul><ul><ul><li>Business Objects </li></ul></ul><ul><ul><li>Tableau </li></ul></ul>
  7. 8. <ul><li>Vertica </li></ul><ul><li>Extensive and costly customization required to improve query performance due to overhead of “training queries” </li></ul><ul><li>Not suited for very high volume and high cardinality </li></ul><ul><li>Unconstrained queries typical for on-line advertising analytics are extremely slow </li></ul><ul><li>Distinct Counts difficult to achieve and Exclusive Counts not possible. </li></ul><ul><li>Not efficient for ad-hoc queries or exploration </li></ul>
  8. 9. <ul><li>Aster Data </li></ul><ul><li>Requires customized high-powered servers </li></ul><ul><li>Distinct Counts difficult to achieve and Exclusive Counts not possible. </li></ul><ul><li>Not efficient for ad-hoc queries or exploration </li></ul><ul><li>MySQL/Oracle </li></ul><ul><li>Suitable for storing data, but access and indexing times make them untenable for any real analysis </li></ul><ul><li>Space and server requirements too difficult to estimate </li></ul><ul><li>Query times would be impossibly long </li></ul>
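The slides refer to Distinct Counts and Exclusive Counts without defining them. As a small illustration only, the sketch below assumes a distinct count is the number of unique user-ids seen for a campaign, and an exclusive count is the number of those users who were seen for no other campaign. The function name, the `(user_id, campaign_id)` log shape, and the sample data are all hypothetical, and this in-memory approach says nothing about how iQO or the other systems compute these figures.

```python
from collections import defaultdict


def distinct_and_exclusive(impressions):
    """Return {campaign_id: (distinct_users, exclusive_users)}.

    impressions: iterable of (user_id, campaign_id) pairs (assumed log shape).
    """
    users_by_campaign = defaultdict(set)
    campaigns_by_user = defaultdict(set)
    for user_id, campaign_id in impressions:
        users_by_campaign[campaign_id].add(user_id)
        campaigns_by_user[user_id].add(campaign_id)

    result = {}
    for campaign_id, users in users_by_campaign.items():
        # A user is exclusive to a campaign if it is the only one they appear in.
        exclusive = sum(1 for u in users if len(campaigns_by_user[u]) == 1)
        result[campaign_id] = (len(users), exclusive)
    return result


# Tiny made-up log: u1 sees campaign A twice, u2 sees A and B, u3 sees only B.
log = [("u1", "A"), ("u1", "A"), ("u2", "A"), ("u2", "B"), ("u3", "B")]
counts = distinct_and_exclusive(log)
print(counts)
```

Holding a set of user-ids per campaign is fine for a toy log, but at the 350 million user-ids cited in the overview the memory cost of this naive approach grows quickly, which is consistent with the deck's remark that such counts are difficult to achieve at scale.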
