AWS Customer Presentation - JovianDATA


Satya and Anupam discuss how JovianDATA uses AWS and share some lessons learned. AWS Startup Tour - SV - 2010

Published in: Technology

  1. Analytics at the Speed of Thought
     2460 North First Street, Suite 170, San Jose, CA 95131, 408-433-9383
  2. JovianDATA Mission
     Technology platform to optimize your conversion funnel at the lowest cost
  3. Why move to the cloud?
     Considering AWS actively, but not sure about:
       • CapEx benefits
       • Current stack’s cloud readiness
       • Application provisioning challenges
  4. Introducing JovianDATA
     Data sources: Ad Server Data, Search Engine Data + Sales/Conversion Data + Site/Web Analytics Data + Customer/3rd Party Data + Other Data Sources
     Actionable Insights:
       • Billions of Impressions, Clicks & Conversions (100’s of TB)
       • No sampling
       • Multi-dimensional analytics
       • In-Flight
       • Extremely low TCO
       • Fast Time-to-Value SaaS
  6. Transforming Data to Actionable Insights
     Fully Materialized Data Cube
       • Incremental updates
       • Multi-dimensional indexes
       • Multi-dimensional partitions
     [Figure: Campaign Heat Map; labels: Time, Geography, Publishers; Engagement: High, Medium, Low]
  9. Agenda
       • JovianDATA Company Overview
       • JovianInsights – The Power of Analytics
       • JovianDATA Cube Storage
       • Innovations in Advanced Analytics using commodity clusters
       • Analytics Lifecycle Management
       • Innovations in Cloud Infrastructure Management
  10. Avoiding Expensive Data Processing
       • Usage-based automatic view materialization
       • Avoid network I/O: multi-dimensional partitioning
       • Reduce disk I/O by materializing expensive groups
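The usage-based materialization idea on this slide can be illustrated with a minimal Python sketch. It is not JovianDATA's implementation: the function names (`choose_views`, `materialize`), the query-log format, the sample rows, and the `min_hits` threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def choose_views(query_log, min_hits):
    """Return the dimension sets that were queried at least min_hits times."""
    usage = Counter(frozenset(dims) for dims in query_log)
    return {dims for dims, hits in usage.items() if hits >= min_hits}


def materialize(rows, dims, measure):
    """Pre-aggregate `measure`, grouped by the given dimensions."""
    order = sorted(dims)  # fixed column order for the group key
    view = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in order)
        view[key] += row[measure]
    return dict(view)


# Hypothetical fact rows and query log (not taken from the slides).
rows = [
    {"geo": "US", "publisher": "A", "clicks": 3},
    {"geo": "US", "publisher": "B", "clicks": 5},
    {"geo": "UK", "publisher": "A", "clicks": 2},
]
query_log = [("geo",), ("geo",), ("geo", "publisher"), ("geo",)]

# Only group-bys that are requested often enough get materialized.
hot = choose_views(query_log, min_hits=3)
views = {dims: materialize(rows, dims, "clicks") for dims in hot}
```

Later queries on a materialized dimension set can read the small pre-aggregated view instead of rescanning the fact rows, which is the disk I/O saving the slide refers to.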
  11. Why move to the cloud?
  12. Agenda
       • Reducing CapEx
       • Dynamic Provisioning
       • Application Isolation
  13. Managing CapEx with Role Based Clusters
      Single cluster for data cleansing, load and query
      15 TB, 100 nodes
      Monthly Cost = $28,800
  14. Managing CapEx with Role Based Clusters
      Input: Ad Server Data, Search Engine Data
      2 hours daily for load on 10 nodes
      Query on 5 nodes
      Monthly Cost = $2,052
      [Figure labels: UI, Data Cleansing, Load Model, Hibernate Model, Query]
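The arithmetic behind these two slides can be sketched as node-hours times an hourly rate. The slides do not state a rate or a month length; the sketch below assumes a 30-day month and back-solves the rate from the single-cluster figure. The names `monthly_cost` and `HOURS_PER_MONTH` are illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month, nodes billed per hour


def monthly_cost(nodes, hours_per_node, rate):
    """Cost of running `nodes` machines for `hours_per_node` hours each."""
    return nodes * hours_per_node * rate


# Rate implied by the single-cluster slide: $28,800 for 100 always-on nodes.
implied_rate = 28_800 / (100 * HOURS_PER_MONTH)

# Role-based clusters: 10 load nodes for 2 hours a day, 5 always-on query nodes.
load_cost = monthly_cost(10, 2 * 30, implied_rate)
query_cost = monthly_cost(5, HOURS_PER_MONTH, implied_rate)

# Ratio of the two monthly totals quoted on the slides.
savings_ratio = 28_800 / 2_052
```

Under these assumptions the implied rate is about $0.40 per node-hour, and load plus query account for roughly $1,680 of the $2,052 total. The slide does not itemize the remainder, so the sketch does not try to reproduce it. The quoted totals differ by a factor of about 14.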
  15. Agenda
       • Reducing CapEx
       • Dynamic Provisioning
       • Application Isolation
  16. Selective Replication for On-Demand Performance
       • A power analyst needs to run a complex, heavy number-crunching query that typically takes 8 to 10 hours
       • Solution: FlexRestore™
           • Adds two new temporary nodes (Temp1, Temp2)
           • Creates new replicas for hot partitions and redistributes them across nodes
      With Replication Factor = 1: Site Section Analytics = 10 minutes
      With Replication Factor = 10: Site Section Analytics = 30 seconds
      [Figure: partitions P1, P3, P12, P22 and P34 placed on Node1 to Node4 (Nodeset1), with additional replicas on Temp1 and Temp2]
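The scale-up step described on this slide can be sketched in Python. This is not FlexRestore itself, whose internals the slides do not describe: the round-robin assignment, the function names, and the sample placement are assumptions for illustration, and the placement does not reproduce the slide's diagram.

```python
from itertools import cycle


def add_replicas(placement, hot_partitions, temp_nodes):
    """Return a new placement with one extra replica of each hot partition,
    assigned round-robin to the temporary nodes. The input is not modified."""
    new_placement = {node: list(parts) for node, parts in placement.items()}
    for node in temp_nodes:
        new_placement.setdefault(node, [])
    targets = cycle(temp_nodes)
    for part in hot_partitions:
        new_placement[next(targets)].append(part)
    return new_placement


def replica_count(placement, part):
    """Number of nodes' worth of copies of `part` in the placement."""
    return sum(parts.count(part) for parts in placement.values())


# Hypothetical starting layout on the permanent nodes.
placement = {
    "Node1": ["P1", "P34"],
    "Node2": ["P12", "P1"],
    "Node3": ["P22", "P3"],
    "Node4": ["P3", "P12"],
}
scaled = add_replicas(placement, ["P12", "P22", "P34", "P1"], ["Temp1", "Temp2"])
```

With more replicas of the hot partitions, a heavy query has more nodes to read from in parallel, which is the effect the slide attributes to raising the replication factor.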
  21. Reduce Replication to Maintain Cost
       • When the analysis is done and the extra performance is no longer needed, the SLA Controller brings down the two temporary nodes (and the extra replicas)
       • Benefits
           • High-performance computing power when you need it
           • But only when you need it, to hold down operating costs
      [Figure: the same partition layout on Node1 to Node4 (Nodeset1), with Temp1 and Temp2 and their replicas being removed]
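The scale-down step can be sketched the same way. The slides do not say how the SLA Controller decides or what checks it runs; the function `scale_down` and its safety check below are assumptions for illustration.

```python
def scale_down(placement, temp_nodes):
    """Drop the temporary nodes and their replicas.

    Raises ValueError if a partition would be left without any replica,
    so that removing nodes never makes data unavailable.
    """
    temp = set(temp_nodes)
    remaining = {n: list(p) for n, p in placement.items() if n not in temp}
    before = {p for parts in placement.values() for p in parts}
    after = {p for parts in remaining.values() for p in parts}
    lost = before - after
    if lost:
        raise ValueError(f"partitions would become unavailable: {sorted(lost)}")
    return remaining


# Hypothetical layout after a scale-up (not the slide's exact diagram).
scaled = {
    "Node1": ["P1", "P34"],
    "Node2": ["P12", "P1"],
    "Node3": ["P22", "P3"],
    "Node4": ["P3", "P12"],
    "Temp1": ["P12", "P34"],
    "Temp2": ["P22", "P1"],
}
restored = scale_down(scaled, ["Temp1", "Temp2"])
```

Because the temporary nodes hold only extra replicas, removing them returns the cluster to its baseline layout and cost.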
  25. Agenda
       • Reducing CapEx
       • Dynamic Provisioning
       • Application Isolation
  26. Provision Tera Scale Applications in Minutes
      Without Application Isolation
       • Data for all advertisers is kept ‘live’ on 50 nodes
       • Campaign Manager needs to run heavy-duty reports for a Big Advertiser
       • 50 live nodes per month = $14,400
      [Figure label: Funnel Analysis for Client]
  27. Provision Tera Scale Applications in Minutes
       • Application is provisioned in parallel from S3/EBS into EC2
       • Campaign Manager requests Application Provisioning for a Specific Advertiser
       • 50 nodes for fortnightly analysis = $320
      [Figure labels: Funnel Analysis for Client, Hibernated Model]
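The two monthly figures on these slides can be related through node-hours. The sketch below assumes a 30-day month, hourly billing, and that "fortnightly" means two runs per month; none of these is stated on the slides, and the variable names are illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month

# Hourly rate implied by keeping 50 nodes live all month for $14,400.
implied_rate = 14_400 / (50 * HOURS_PER_MONTH)

# Node-hours that $320 buys at that rate, spread over 50 nodes.
on_demand_node_hours = 320 / implied_rate
hours_per_node = on_demand_node_hours / 50

# Assumption: two analysis runs per month.
hours_per_run = hours_per_node / 2

# Ratio of the two monthly costs quoted on the slides.
cost_ratio = 14_400 / 320
```

Under these assumptions the $320 figure corresponds to about 16 hours per node per month, or about 8 hours per run, and the two quoted monthly costs differ by a factor of 45.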
  28. Summary
       • Reducing CapEx with role-based temporary clusters on EC2: 10x cost savings with EC2 usage
       • Dynamic provisioning with selective replication on EC2: 10x performance on EC2 replication
       • Application isolation with application hibernation on S3/EBS: 100x cost savings with EC2-S3
  29. Thank You