Razorfish - Amazon EMR use case

How Razorfish enabled a large enterprise company to use Amazon EMR, which increased its return on advertising spend by 500%

  • Return on advertising spend (ROAS)

    1. 1. Edge in the cloud<br />Salim Hemdani <br />VP, Experiences and Platforms<br />@shemdani<br />
    2. 2. 500,000,000,000<br />
    3. 3. 1,000<br />
    4. 4. 100<br />
    5. 5. 25<br />
    6. 6. 13<br />
    7. 7. What are these numbers?<br />
    8. 8. Numbers<br />500,000,000,000 records<br />1,000 clients<br />100 markets<br />25 data sources<br />13 terabytes per day<br />
    9. 9. Agenda<br />
    10. 10. Time for a change<br />
    11. 11. Transition Service Agreement<br />Move from Atlas<br /><ul><li>Traditional hosting environment</li><li>Heavy on CAPEX</li><li>Managed by Atlas/MSFT networking teams</li><li>To be completed by October 2010; no interruption in SLA</li></ul>Move away from PVM<br />
    15. 15. Ad Serving Event Log<br />Request<br />hash(key) mod R<br />FS01<br />FS03<br />FS02<br />98101<br />98104<br />98115<br />98201<br />98203<br />98004<br />98007<br />98065<br />
    16. 16. MapReduce (divide and conquer)<br />HDFS<br /><ul><li>Distributed data storage</li><li>Distributed processing</li><li>Language agnostic</li></ul>Any Language<br />Job tracker<br />Task tracker<br />
    19. 19. AWS <br />
    20. 20. Aggregate Ad Serving data<br />Log Files<br />File Export<br />APIs<br />Internet<br />Client Provided Data<br />Data Sources<br />Presentation Layer<br />Talend Data Flow Manager<br />Direct Analytics Processing via EMR<br />Web Application Layer<br />ODBC<br />Edge Provisioning DB<br />OLAP<br />Cache<br />Cloud Storage S3<br />HBase/SDB<br />Elastic MapReduce<br />
    21. 21. Name Brand Retailer Case Study<br />Business challenge<br /><ul><li>Changing competitive landscape</li><li>Decreasing web marketing effectiveness</li><li>Monetization of their web assets</li></ul>
    22. 22. Bring it all together<br />Product interest<br />Affinity<br />Generation<br />+<br />+<br />In market Gamer<br />Sport Enthusiast<br />Purchaser Home Theater<br />(1 of 36 “Personalization” segments)<br />
    24. 24. Drive a personalized message<br />User recently purchased a home theater system and is now looking for sports games<br />Target Ad<br />( 1.7 million per day ) <br />
    25. 25. We import Atlas transaction-level data<br />24 servers<br />S3 file storage<br />Compress and upload 200+ GB of data per day<br />(180 days = ½ trillion ICA records)<br />
    26. 26. We use EMR to process and segment<br />EMR<br />S3<br />100-machine cluster created on demand<br />(3.5 billion records, 71 million unique cookies a day)<br />
    27. 27. Process and Cost<br />This all happens in about 8 hours every day and is fully automated (previously 2+ days)<br />And increased ROAS by 500% (to $74)<br />
    28. 28. Why AWS<br />Efficient<br />Elastic infrastructure from AWS allows capacity to be provisioned as needed based on load, reducing cost and the risk of processing delays.<br />Ease of integration<br />Amazon Elastic MapReduce with Cascading allows data processing in the cloud without any changes to the underlying algorithms.<br />Flexible<br />Hadoop with Cascading is flexible enough to allow “agile” implementation and unit testing of sophisticated algorithms.<br />Adaptable<br />Cascading simplifies the integration of Hadoop with external ad systems.<br />Scalable<br />AWS infrastructure helps reliably store and process huge (petabyte-scale) data sets.<br />
    29. 29. Learning<br />
    30. 30. Thank you.<br />
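
Slides 15 and 16 mention `hash(key) mod R` and MapReduce without showing how the two fit together. The following is a minimal Python sketch of that idea, not Razorfish's implementation. It assumes a hypothetical event record with a `zip` field (the sample values are the ZIP codes shown on slide 15), takes R = 3 to match the three FS nodes on that slide, and uses `zlib.crc32` as a stand-in hash because Python's built-in `hash()` for strings varies between processes. The names `partition`, `map_event`, and `run_job` are illustrative only.

```python
import zlib
from collections import defaultdict

R = 3  # assumed number of partitions (slide 15 shows FS01, FS02, FS03)


def partition(key: str, r: int = R) -> int:
    """Return hash(key) mod r using a hash that is stable across processes."""
    return zlib.crc32(key.encode("utf-8")) % r


def map_event(event: dict):
    """Map step: emit a (zip_code, 1) pair for one ad-serving event."""
    yield event["zip"], 1


def run_job(events, r: int = R):
    """Run map, shuffle, and reduce in memory; return one dict per partition."""
    # Shuffle: route every mapped pair to a partition by hash(key) mod r,
    # so all values for a given key land in the same partition.
    partitions = [defaultdict(list) for _ in range(r)]
    for event in events:
        for key, value in map_event(event):
            partitions[partition(key, r)][key].append(value)
    # Reduce: each partition sums the values for its own keys independently.
    return [
        {key: sum(values) for key, values in part.items()}
        for part in partitions
    ]


if __name__ == "__main__":
    # Hypothetical events; the ZIP codes are taken from slide 15.
    sample = [{"zip": z} for z in ["98101", "98104", "98101", "98004", "98065"]]
    for index, counts in enumerate(run_job(sample)):
        print(f"partition {index}: {counts}")
```

Because every key maps to exactly one partition, each reducer can run on a separate machine without coordinating with the others, which is what lets a cluster be sized to the day's load.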
