Jovian Data Amazon Final Version
JovianDATA presentation at the AWS startup event

Transcript

  • 1. Analytics at the Speed of Thought
    Satya Ramachandran
    Vice President of Engineering
    Anupam Singh
    Chief Technology Officer
    April 14, 2010
    2460 North First Street, Suite 170, San Jose, CA 95131  408-433-9383  www.joviandata.com
  • 2. JovianDATA Mission
    Technology platform to optimize your conversion funnel at the lowest cost
  • 3. Why move to the cloud?
    Considering AWS actively, but not sure about:
    • CapEx benefits
    • Current stack’s cloud readiness
    • Application provisioning challenges
  • Introducing JovianDATA
    Data sources: Ad Server Data, Search Engine Data + Sales/Conversion Data + Site/Web Analytics Data + Customer/3rd Party Data + Other Data Sources
    Actionable Insights
    • Billions of Impressions, Clicks & Conversions (100’s of TB)
    • No sampling
    • Multi-dimensional analytics
    • In-Flight
    • Extremely low TCO
    • Fast Time-to-Value SaaS
  • 6. Transforming Data to Actionable Insights
    Campaign Heat Map
    • Dimensions: Time, Geography, Publishers
    • Engagement: High, Medium, Low
    Fully Materialized Data Cube
    • Incremental updates
    • Multi-dimensional indexes
    • Multi-dimensional partitions
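The slide names a fully materialized data cube with incremental updates but does not show how it is built. The following is only a minimal sketch of the general idea, not JovianDATA's implementation: every combination of dimensions is pre-aggregated, and a new batch is folded into the existing totals instead of triggering a rebuild. The dimension names follow the heat map axes; the `impressions` measure, the function names, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Dimension names follow the heat map axes on the slide.
DIMENSIONS = ("time", "geography", "publisher")


def cube_keys(row):
    """Yield one key per subset of dimensions, i.e. one per group-by in the cube."""
    for size in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, size):
            yield tuple((d, row[d]) for d in dims)


def apply_batch(cube, rows, measure="impressions"):
    """Fold a batch of rows into the cube in place (incremental update)."""
    for row in rows:
        for key in cube_keys(row):
            cube[key] += row[measure]
    return cube


# Hypothetical sample data: two load batches applied one after the other.
cube = defaultdict(int)
apply_batch(cube, [
    {"time": "2010-04-13", "geography": "US", "publisher": "A", "impressions": 100},
    {"time": "2010-04-13", "geography": "UK", "publisher": "A", "impressions": 40},
])
apply_batch(cube, [
    {"time": "2010-04-14", "geography": "US", "publisher": "B", "impressions": 25},
])

grand_total = cube[()]                        # aggregate over all dimensions
us_total = cube[(("geography", "US"),)]       # one-dimensional slice
```

Because every group-by is stored, a query becomes a dictionary lookup. The trade-off is storage: each row touches 2^n keys for n dimensions.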
  • 9. Agenda
    JovianDATA Company Overview
    JovianInsights – The Power of Analytics
    JovianDATA Cube Storage
    Innovations in Advanced Analytics using commodity clusters
    Analytics Lifecycle Management
    Innovations in Cloud Infrastructure Management
  • 10. Avoiding Expensive Data Processing
    • Usage-based automatic view materialization
    • Avoid network I/O: multi-dimensional partitioning
    • Reduce disk I/O by materializing expensive groups
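One way to read the link between multi-dimensional partitioning and network I/O is sketched below. It assumes that rows are routed to partitions by a hash of several dimension values, so that all rows of a group sit in one partition and a group-by on those dimensions needs no data exchange between nodes. The chosen dimensions, the partition count, and every function name are hypothetical; the slides do not describe the actual scheme.

```python
import zlib
from collections import defaultdict

PARTITION_DIMS = ("geography", "publisher")  # hypothetical partitioning dimensions
NUM_PARTITIONS = 4                            # hypothetical partition count


def partition_of(row):
    """Map a row to a partition using a stable hash of its dimension values."""
    key = "|".join(str(row[d]) for d in PARTITION_DIMS)
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def load(rows):
    """Route each row to its partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_of(row)].append(row)
    return partitions


def local_group_totals(partition_rows, measure="clicks"):
    """Aggregate inside one partition; no rows from other partitions are needed."""
    totals = defaultdict(int)
    for row in partition_rows:
        totals[tuple(row[d] for d in PARTITION_DIMS)] += row[measure]
    return dict(totals)


# Hypothetical sample rows.
rows = [
    {"geography": "US", "publisher": "A", "clicks": 3},
    {"geography": "US", "publisher": "A", "clicks": 4},
    {"geography": "UK", "publisher": "B", "clicks": 5},
]
partitions = load(rows)
per_partition = {pid: local_group_totals(rs) for pid, rs in partitions.items()}
```

Each group appears in exactly one partition, so the final answer is the plain union of the per-partition results.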
  • 11. Why move to the cloud?
  • 12. Agenda
    Reducing Capex
    Dynamic Provisioning
    Application Isolation
  • 13. Managing CapEx with Role Based Clusters
    Single cluster for data cleansing, load and query
    • 15 TB
    • 100 nodes
    • Monthly cost = $28,800
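The slide gives only the total. As a rough check, the sketch below backs out the per-node-hour price that total implies, assuming a 30-day month and a cluster that is never shut down. Both are assumptions, not statements from the deck.

```python
HOURS_PER_MONTH = 24 * 30   # assumption: 30-day month, cluster always on

nodes = 100                 # from the slide
monthly_cost = 28_800       # from the slide, in dollars

node_hours = nodes * HOURS_PER_MONTH
implied_rate = monthly_cost / node_hours   # dollars per node-hour

print(f"{node_hours} node-hours per month, ${implied_rate:.2f} per node-hour")
```

Under these assumptions the implied price is $0.40 per node-hour.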
  • 14. Managing CapEx with Role Based Clusters
    • 2 hours daily for load on 10 nodes
    • 8 hours daily for query on 5 nodes
    • Monthly cost = $2,052
    Diagram labels: UI; Ad Server Data, Search Engine Data; Data Cleansing; Load Model; Query; Hibernate Model
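To put the two role-based schedules next to the single always-on cluster from the previous slide, the sketch below compares node-hours and the quoted monthly costs. A 30-day month is assumed. The $2,052 figure is taken from the slide as given; it is not derived here, and the deck does not itemize what it includes.

```python
DAYS_PER_MONTH = 30  # assumption

# Role-based clusters (this slide).
load_node_hours = 2 * 10 * DAYS_PER_MONTH    # 2 hours daily on 10 nodes
query_node_hours = 8 * 5 * DAYS_PER_MONTH    # 8 hours daily on 5 nodes
role_based_node_hours = load_node_hours + query_node_hours

# Single 100-node cluster running around the clock (previous slide).
always_on_node_hours = 100 * 24 * DAYS_PER_MONTH

# Monthly costs as quoted on the two slides.
single_cluster_cost = 28_800
role_based_cost = 2_052
savings_ratio = single_cluster_cost / role_based_cost

print(f"node-hours: {always_on_node_hours} vs {role_based_node_hours}")
print(f"quoted cost ratio: {savings_ratio:.1f}x")
```

The quoted costs differ by roughly 14x, which is in line with the 10x savings claimed in the summary slide.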
  • 15. Agenda
    Reducing Capex
    Dynamic Provisioning
    Application Isolation
  • 16. Selective Replication for On-Demand Performance
    • A power analyst needs to run complex, heavy number-crunching queries that typically take 8-10 hours
    • Solution: FlexRestore™
      • Adds two new temporary nodes (Temp1, Temp2)
      • Creates new replicas for hot partitions and redistributes them across nodes
    • With Replication Factor = 1: Site Section Analytics = 10 minutes
    • With Replication Factor = 10: Site Section Analytics = 30 seconds
    Diagram labels: Node1, Node2, Node3, Node4, Nodeset1, Temp1, Temp2; partitions P1, P3, P12, P22, P34
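The slides do not explain how replicas are placed when temporary nodes join. The sketch below is one simple possibility, offered as an illustration only: each extra replica of a hot partition goes to the least-loaded node that does not already hold that partition. The starting placement, the choice of hot partitions, and the `scale_up` function are hypothetical; only the node and partition names come from the slide.

```python
def scale_up(placement, hot_partitions, temp_nodes, extra_replicas):
    """Return a new placement with extra replicas of the hot partitions.

    placement maps node name -> set of partition names. The input is not modified.
    """
    new = {node: set(parts) for node, parts in placement.items()}
    for node in temp_nodes:
        new.setdefault(node, set())
    for part in hot_partitions:
        for _ in range(extra_replicas):
            candidates = [n for n in new if part not in new[n]]
            if not candidates:
                break  # every node already holds this partition
            # Least-loaded node first; the name breaks ties deterministically.
            target = min(candidates, key=lambda n: (len(new[n]), n))
            new[target].add(part)
    return new


# Hypothetical starting placement; the slide's exact layout is not recoverable.
placement = {
    "Node1": {"P1", "P3"},
    "Node2": {"P12", "P22"},
    "Node3": {"P34", "P1"},
    "Node4": {"P3", "P12"},
}
scaled = scale_up(placement, hot_partitions=["P22", "P34"],
                  temp_nodes=["Temp1", "Temp2"], extra_replicas=2)
p22_replicas = sum("P22" in parts for parts in scaled.values())
```

With more replicas of a hot partition, a heavy query can be split across more nodes, which is the effect the slide reports as 10 minutes dropping to 30 seconds.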
  • 21. Reduce replication to maintain cost
    • When the analysis is done and the extra performance is not needed, the SLA Controller brings down the two temporary nodes (and the extra replicas)
    • Benefits
      • High performance computing power when you need it
      • But only when you need it, to hold down operating costs
    Diagram labels: Node1, Node2, Node3, Node4, Nodeset1, Temp1, Temp2; partitions P1, P3, P12, P22, P34
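The scale-down step can be sketched in the same spirit. This is an assumption-laden illustration, not the SLA Controller's logic: temporary nodes are dropped together with the replicas they hold, and the function refuses to proceed if that would remove the last copy of any partition. The `scale_down` function and the sample placement are hypothetical.

```python
def scale_down(placement, temp_nodes):
    """Return a placement without the temporary nodes and their replicas.

    Raises ValueError if a partition would be left with no replica.
    """
    temp = set(temp_nodes)
    remaining = {n: set(p) for n, p in placement.items() if n not in temp}
    before = set().union(*placement.values())
    after = set().union(*remaining.values())
    lost = before - after
    if lost:
        raise ValueError(f"partitions would lose their last replica: {sorted(lost)}")
    return remaining


# Hypothetical placement while the temporary nodes are active.
scaled = {
    "Node1": {"P1", "P3"},
    "Node2": {"P12", "P22"},
    "Node3": {"P34", "P1"},
    "Node4": {"P3", "P12"},
    "Temp1": {"P22", "P34"},
    "Temp2": {"P22", "P34"},
}
steady = scale_down(scaled, ["Temp1", "Temp2"])
```

The safety check matters because temporary nodes should only ever carry extra copies, never the sole copy of a partition.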
  • 25. Agenda
    Reducing Capex
    Dynamic Provisioning
    Application Isolation
  • 26. Provision Tera Scale Applications in Minutes
    Without Application Isolation
    • Data for all advertisers is kept ‘live’ on 50 nodes
    • Campaign Manager needs to run heavy-duty reports for a big advertiser
    • 50 live nodes per month = $14,400
    Diagram label: Funnel Analysis for Client
  • 27. Provision Tera Scale Applications in Minutes
    • Campaign Manager requests application provisioning for a specific advertiser
    • Application is provisioned in parallel from S3/EBS into EC2
    • 50 nodes for fortnightly analysis = $320
    Diagram labels: Funnel Analysis for Client; Hibernated Model
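The two quoted costs can be related with simple arithmetic. The sketch below assumes a 30-day month and that both figures use the same per-node-hour price. Under those assumptions it backs out how many hours per month the 50-node application would be running. The implied hours are an inference, not a number stated in the deck.

```python
HOURS_PER_MONTH = 24 * 30      # assumption: 30-day month

nodes = 50
live_cost = 14_400             # 50 nodes kept live all month (previous slide)
on_demand_cost = 320           # 50 nodes provisioned for fortnightly analysis

implied_rate = live_cost / (nodes * HOURS_PER_MONTH)      # dollars per node-hour
implied_hours = on_demand_cost / (implied_rate * nodes)   # cluster-hours per month
savings_ratio = live_cost / on_demand_cost

print(f"${implied_rate:.2f} per node-hour, "
      f"{implied_hours:.0f} cluster-hours per month, {savings_ratio:.0f}x cheaper")
```

At the implied $0.40 per node-hour, $320 corresponds to about 16 hours of 50-node use per month, for example two 8-hour runs.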
  • 28. Summary
    • Reducing CapEx with role-based temporary clusters on EC2: 10x cost savings with EC2 usage
    • Dynamic provisioning with selective replication on EC2: 10x performance on EC2 replication
    • Application isolation with application hibernation on S3/EBS: 100x cost savings with EC2-S3
  • 29. Thank You