Jovian Data Amazon Final Version

JovianDATA presentation at the AWS startup event

  1. Analytics at the Speed of Thought
     Satya Ramachandran, Vice President of Engineering
     Anupam Singh, Chief Technology Officer
     April 14, 2010
     2460 North First Street, Suite 170, San Jose, CA 95131, 408-433-9383, www.joviandata.com
  2. JovianDATA Mission
     Technology platform to optimize your conversion funnel at the lowest cost
  3. Why move to the cloud?
     Considering AWS actively but not sure about:
     - CapEx benefits
     - Current stack's cloud readiness
     - Application provisioning challenges
  4. Introducing JovianDATA
     Data sources combined:
     - Ad Server Data, Search Engine Data
     - Sales/Conversion Data
     - Site/Web Analytics Data
     - In-Flight Customer/3rd Party Data
     - Other Data Sources
     Result: Actionable Insights
     - Billions of Impressions, Clicks & Conversions (100s of TB)
     - No sampling
     - Multi-dimensional analytics
     - Extremely low TCO
     - Fast Time-to-Value SaaS
  5. Transforming Data to Actionable Insights
     Fully Materialized Data Cube
     - Incremental updates
     - Multi-dimensional indexes
     - Multi-dimensional partitions
     [Figure: Campaign Heat Map over Time, Geography and Publishers; Engagement shown as High, Medium or Low]
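To make the "fully materialized data cube" and "incremental updates" bullets concrete, here is a minimal sketch of the general technique: precompute an aggregate for every subset of dimensions, then fold each new fact into the stored aggregates. The rows, dimension names and function names are hypothetical, and this is not a description of JovianDATA's actual storage engine.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (time, geography, publisher, engagement score).
FACTS = [
    ("2010-04-01", "US", "pub_a", 3),
    ("2010-04-01", "US", "pub_b", 1),
    ("2010-04-01", "UK", "pub_a", 2),
    ("2010-04-02", "US", "pub_a", 4),
]
DIMENSIONS = ("time", "geography", "publisher")


def materialize_cube(facts, dimensions):
    """Precompute SUM(measure) for every subset of dimensions (2^n group-bys)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for idx in combinations(range(len(dimensions)), size):
            group_by = tuple(dimensions[i] for i in idx)
            totals = defaultdict(int)
            for row in facts:
                key = tuple(row[i] for i in idx)
                totals[key] += row[-1]  # last column is the measure
            cube[group_by] = dict(totals)
    return cube


def apply_increment(cube, dimensions, new_row):
    """Incremental update: add one new fact to every stored aggregate."""
    for group_by, totals in cube.items():
        key = tuple(new_row[dimensions.index(d)] for d in group_by)
        totals[key] = totals.get(key, 0) + new_row[-1]


cube = materialize_cube(FACTS, DIMENSIONS)
print(cube[("geography",)])  # per-geography totals, answered without a scan
```

With three dimensions the cube holds eight group-bys. The count doubles with each added dimension, which is one reason a real system would pair materialization with indexes and partitioning.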
  6. Agenda
     - JovianDATA Company Overview
     - JovianInsights – The Power of Analytics
     - JovianDATA Cube Storage
     - Innovations in Advanced Analytics using commodity clusters
     - Analytics Lifecycle Management
     - Innovations in Cloud Infrastructure Management
  7. Avoiding Expensive Data Processing
     - Usage-based automatic view materialization
     - Avoid network I/O: multi-dimensional partitioning
     - Reduce disk I/O by materializing expensive groups
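One way to read "usage-based automatic view materialization" is that the system watches which group-bys are queried and materializes the ones that would otherwise cost the most. The sketch below ranks group-bys by frequency times scan cost. The query log, the cost table and the `choose_views` helper are all hypothetical, and the slides do not say how JovianDATA actually makes this choice.

```python
from collections import Counter

# Hypothetical query log: the dimensions each query grouped by.
QUERY_LOG = [
    ("geography",),
    ("geography", "publisher"),
    ("geography",),
    ("time", "geography", "publisher"),
    ("geography", "publisher"),
    ("geography",),
]

# Hypothetical cost (for example, seconds of scan time) to answer each
# group-by from raw data instead of from a materialized view.
SCAN_COST = {
    ("geography",): 40,
    ("geography", "publisher"): 90,
    ("time", "geography", "publisher"): 300,
}


def choose_views(query_log, scan_cost, budget):
    """Pick up to `budget` group-bys with the highest frequency * cost."""
    frequency = Counter(query_log)
    benefit = {g: n * scan_cost[g] for g, n in frequency.items()}
    ranked = sorted(benefit, key=lambda g: benefit[g], reverse=True)
    return ranked[:budget]


views = choose_views(QUERY_LOG, SCAN_COST, budget=2)
print(views)
```

A frequently run but cheap group-by can rank below a rare but expensive one, which matches the slide's emphasis on materializing expensive groups.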
  8. Why move to the cloud?
  9. Agenda
     - Reducing CapEx
     - Dynamic Provisioning
     - Application Isolation
  10. Managing CapEx with Role Based Clusters
      Single cluster for data cleansing, load and query
      - 15 TB
      - 100 nodes
      - Monthly cost = $28,800
  11. Managing CapEx with Role Based Clusters
      - 2 hours daily for load on 10 nodes
      - 8 hours daily for query on 5 nodes
      - Monthly cost = $2,052
      [Diagram: Ad Server Data and Search Engine Data flow through Data Cleansing into the Load Model; the Hibernate Model serves Query and the UI]
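The arithmetic behind the single-cluster and role-based slides above can be sketched as follows. The 30-day month and 24x7 operation of the single cluster are assumptions, since the slides give only the totals. Under those assumptions the implied rate is $0.40 per node-hour, and the role-based schedule needs 1,800 node-hours. The slide's $2,052 figure is higher than the compute-only estimate this produces; the deck does not itemize the difference, so the sketch does not try to reproduce it.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: the slides do not state the billing month


def node_hours(nodes, hours_per_day, days=DAYS_PER_MONTH):
    """Total node-hours for a cluster that runs a fixed number of hours daily."""
    return nodes * hours_per_day * days


# Figures taken from the slides.
single_cluster_cost = 28_800  # 100 nodes, assumed always on
role_based_cost = 2_052       # separate part-time load and query clusters

# Hourly rate implied by the single-cluster figure.
always_on_hours = node_hours(100, HOURS_PER_DAY)
implied_rate = single_cluster_cost / always_on_hours

# Node-hours for the role-based schedule.
load_hours = node_hours(10, 2)
query_hours = node_hours(5, 8)
role_based_hours = load_hours + query_hours

compute_only_estimate = role_based_hours * implied_rate
savings_ratio = single_cluster_cost / role_based_cost

print(f"implied rate: ${implied_rate:.2f} per node-hour")
print(f"role-based node-hours: {role_based_hours}")
print(f"compute-only estimate: ${compute_only_estimate:,.2f}")
print(f"savings from slide figures: {savings_ratio:.1f}x")
```

Using the slide totals directly, the role-based setup costs roughly one fourteenth of the single cluster.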
  12. Agenda
      - Reducing CapEx
      - Dynamic Provisioning
      - Application Isolation
  13. Selective Replication for On-Demand Performance
      - A power analyst needs to run a complex, heavy number-crunching query that typically takes 8 - 10 hours
      - Solution: FlexRestore™
        - Adds two new temporary nodes (Temp1, Temp2)
        - Creates new replicas for hot partitions and redistributes them across nodes
      - With Replication Factor = 1: Site Section Analytics = 10 minutes
      - With Replication Factor = 10: Site Section Analytics = 30 seconds
      [Diagram: partitions P1, P3, P12, P22 and P34 placed on Node1 to Node4 (Nodeset1), with extra replicas on Temp1 and Temp2]
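The scale-up step described above (add temporary nodes, then replicate hot partitions onto them) can be sketched as a pure function over a placement map. The starting placement, the choice of hot partitions and the `add_replicas` helper are hypothetical; the exact layout in the slide's diagram is not recoverable, and FlexRestore's real placement policy is not described in the deck.

```python
def add_replicas(placement, hot_partitions, temp_nodes, target_factor):
    """Return a new placement where hot partitions have more replicas.

    placement maps node name -> set of partition names. New replicas go to
    the temporary nodes first, then to the least-loaded existing nodes.
    """
    new_placement = {node: set(parts) for node, parts in placement.items()}
    for node in temp_nodes:
        new_placement.setdefault(node, set())

    for partition in hot_partitions:
        holders = [n for n, parts in new_placement.items() if partition in parts]
        needed = max(target_factor - len(holders), 0)
        candidates = sorted(
            (n for n in new_placement if n not in holders),
            key=lambda n: (n not in temp_nodes, len(new_placement[n]), n),
        )
        for node in candidates[:needed]:
            new_placement[node].add(partition)
    return new_placement


def replication_factor(placement, partition):
    """Count how many nodes hold a copy of the partition."""
    return sum(1 for parts in placement.values() if partition in parts)


# Hypothetical starting layout with one replica per partition.
BASE = {
    "Node1": {"P1", "P12"},
    "Node2": {"P3"},
    "Node3": {"P22"},
    "Node4": {"P34"},
}

scaled_up = add_replicas(BASE, ["P12", "P34"], ["Temp1", "Temp2"], target_factor=3)
for node in sorted(scaled_up):
    print(node, sorted(scaled_up[node]))
```

Returning a new map instead of mutating the input keeps the original placement available, which makes the later scale-down step a matter of discarding the temporary nodes.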
  14. Reduce replication to maintain cost
      - When the analysis is done and the extra performance is not needed, the SLA Controller brings down the two temporary nodes (and the extra replicas)
      - Benefits
        - High performance computing power when you need it
        - But only when you need it, to hold down operating costs
      [Diagram: the same Node1 to Node4 placement, with Temp1, Temp2 and their replicas being removed]
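The scale-down step can be sketched the same way: drop the temporary nodes, but only after checking that every partition still has a copy on a permanent node. The placement below and the `release_temp_nodes` helper are hypothetical; the slides do not describe what checks the SLA Controller performs.

```python
def release_temp_nodes(placement, temp_nodes):
    """Drop temporary nodes, refusing if that would lose a partition.

    placement maps node name -> set of partition names.
    """
    remaining = {
        node: set(parts)
        for node, parts in placement.items()
        if node not in temp_nodes
    }
    all_partitions = set().union(*placement.values())
    still_served = set().union(*remaining.values())
    lost = all_partitions - still_served
    if lost:
        raise ValueError(f"partitions only on temporary nodes: {sorted(lost)}")
    return remaining


# Hypothetical layout after a scale-up with two temporary nodes.
SCALED_UP = {
    "Node1": {"P1", "P12"},
    "Node2": {"P3"},
    "Node3": {"P22"},
    "Node4": {"P34"},
    "Temp1": {"P12", "P34"},
    "Temp2": {"P12", "P34"},
}

scaled_down = release_temp_nodes(SCALED_UP, {"Temp1", "Temp2"})
print(sorted(scaled_down))
```

The safety check matters because temporary nodes are billed only while they run; releasing them is what holds down cost, but it must never remove the last copy of a partition.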
  15. Agenda
      - Reducing CapEx
      - Dynamic Provisioning
      - Application Isolation
  16. Provision Tera Scale Applications in Minutes
      Without Application Isolation
      - Data for all advertisers is kept 'live' on 50 nodes
      - Campaign Manager needs to run heavy duty reports for a Big Advertiser
      - 50 live nodes per month = $14,400
      [Diagram: Funnel Analysis for Client]
  17. Provision Tera Scale Applications in Minutes
      - Application is provisioned in parallel from S3/EBS into EC2
      - Campaign Manager requests Application Provisioning for a Specific Advertiser
      - 50 nodes for fortnightly analysis = $320
      [Diagram: Funnel Analysis for Client, restored from the Hibernated Model]
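The two provisioning figures above can be related with a short calculation. Assuming a 30-day month and the same per-node rate in both cases (neither is stated on the slides), the always-live figure implies $0.40 per node-hour, and the $320 figure then corresponds to about 16 hours of use per month on 50 nodes, for example two 8-hour sessions. That usage pattern is an inference, not something the deck states.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month

# Figures taken from the slides.
always_live_cost = 14_400  # 50 nodes kept live all month
on_demand_cost = 320       # 50 nodes for fortnightly analysis
nodes = 50

# Rate implied by the always-live figure, then usage implied by $320.
implied_rate = always_live_cost / (nodes * HOURS_PER_MONTH)
implied_node_hours = on_demand_cost / implied_rate
implied_hours_per_month = implied_node_hours / nodes
savings_ratio = always_live_cost / on_demand_cost

print(f"implied rate: ${implied_rate:.2f} per node-hour")
print(f"implied usage: {implied_hours_per_month:.0f} hours per month on {nodes} nodes")
print(f"savings from slide figures: {savings_ratio:.0f}x")
```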
  18. Summary
      - Reducing CapEx with role-based temporary clusters on EC2: 10x cost savings with EC2 usage
      - Dynamic provisioning with selective replication on EC2: 10x performance on EC2 replication
      - Application isolation with application hibernation on S3/EBS: 100x cost savings with EC2-S3
  19. Thank You
