BDT101 Big Data with Amazon Elastic MapReduce - AWS re:Invent 2012
Big data technologies let you work with any velocity, volume, or variety of data in a highly productive environment. This session seeks to answer questions such as "what is big data," "how can I use unstructured data," and "how can I integrate data collections from different sources" using Hadoop with Amazon Elastic MapReduce. Join general manager of EMR, Peter Sirota, on a journey through real-world use cases of data-driven discovery.
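
As a rough illustration of the MapReduce model that Hadoop (and therefore Amazon Elastic MapReduce) is built on, the sketch below runs the map, shuffle, and reduce phases in plain Python on a single machine. The log format, the sample lines, and all function names are assumptions made for illustration; this is not Hadoop or EMR API code.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical sample of web-server log lines: "<client> <method> <path> <status>".
LOG_LINES = [
    "10.0.0.1 GET /home 200",
    "10.0.0.2 GET /cart 200",
    "10.0.0.1 GET /home 200",
    "10.0.0.3 GET /home 404",
]


def map_phase(line):
    """Emit one (path, 1) pair per log line, as a mapper would."""
    _client, _method, path, _status = line.split()
    yield path, 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key, as a reducer would."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence and return hits per path."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


hits_per_path = run_job(LOG_LINES)
print(hits_per_path)
```

On a real cluster the map and reduce functions run in parallel across many nodes and the framework performs the shuffle; only the shape of the computation carries over from this sketch.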


Transcript of "BDT101 Big Data with Amazon Elastic MapReduce - AWS re:Invent 2012"

  1. BIG-DATA: When your data sets become so large that you have to start innovating how to collect, store, analyze, and share them
  2. 3Vs: Volume, Velocity, Variety
  3. BIG-DATA: The collection and analysis of large amounts of data creates competitive advantage
  4. BIGGER IS BETTER
  5. Online Population | Mobile Phone | Machine Data
  6. 1 Trillion Objects!
  7. COLLECT | STORE | ANALYZE | SHARE
  8. COLLECT | STORE | ANALYZE | SHARE
  9. Stream data to Amazon using Apache Flume; Amazon S3; Amazon Elastic MapReduce
  10. COLLECT | STORE | ANALYZE | SHARE
  11. Chart of storage options by Structure (High to Low) and Size (Large to Small): S3, EMR, HDFS, HBase, DynamoDB, RDS, logs on app servers
  12. ANALYZE: ORGANIZE | CLEAN | ENRICH | CONDENSE
  13. DynamoDB Table: Daily-Orders (NoSQL Table); On-Premise DB Table: Customer-Demographics (SQL Table); RDS Table: Targeting-Information
  14. DynamoDB Table: Daily-Orders (NoSQL Table); On-Premise DB Table: Customer-Demographics (SQL Table); S3://clickstream-data/ (Apache Logs); 3rd Party Data: Social Networking Information (accessed via web API); RDS Table: Targeting-Information
  15. S3 file: s3://weekly-trend-data/ (CSV Report); S3 file: s3://monthly-trend-data/ (CSV Report)
  16. AMAZON ELASTIC MAPREDUCE: Reduces complexity/cost of Hadoop management; integrates seamlessly with AWS services; leverages unmatched operational experience
  17. Hadoop on Elastic MapReduce lowers the cost of developing and operating a distributed system.
  18. Amazon EMR and Amazon S3
  19. Data consumed in multiple ways: Recommendation Engine, Ad-hoc Analysis, Personalization (diagram: S3, Prod Cluster (EMR), EMR)
  20. Diagram: S3, Prod Cluster (EMR), Query Cluster (EMR), additional EMR clusters
  21. DynamoDB, S3
  22. EMR, DynamoDB, S3
  23. DynamoDB
  24. ANALYZE SHARE: VISUALIZE | EXPLORE | DECIDE
  25. Big Data Use Cases
  26. Digital Advertising | Web Analytics | Log Processing | Data Warehousing
  27. Use cases by industry: Social Media/Advertising (Targeted Advertising, Image and Video Processing); Oil & Gas (Seismic Analysis); Retail (Recommendations, Transactions Analysis); Life Sciences (Genome Analysis); Financial Services (Monte Carlo Simulations, Risk Analysis); Security (Anti-virus, Fraud Detection, Image Recognition); Network/Gaming (User Demographics, Usage analysis, In-game metrics)
  28. Who is VivaKi? ©2011. All rights reserved. VivaKi. Proprietary and Confidential.
  29. Big Data Challenge for VivaKi: Enablement, Activation, Attribution
  30. The Product Solution – Fluent from Razorfish: A digital marketing technology platform that provides marketers and agencies with a single, integrated software application to target, distribute, and manage multi-channel digital campaigns and experiences. Components: Marketing Central (Marketing Planning and Management, Team Collaboration and Workflow); Experience Publishing (CMS / DMS, Multi-Channel and Multi-Device Distribution, Social Monitoring); Targeting (Multi-Channel Aware Segmentation and Targeting); Insights (Analytics and Reporting, including Attribution); Data Warehouse (Data Sources - 1st and 3rd Party, Data Normalization + Transformation, Data Management); Amazon Cloud Infrastructure
  31. VivaKi Technology Solution
  32. Example: Atlas Cookie Level Data (diagram). Labels: Click Stream Feed; Historical Click Stream Data; User Browsing Session; Ad Server Logs; Data Mining; Segmentation & Categorization; Apply Customization Algorithm; Customer Loyalty Data; Ad Serving System; Cross Selling System
  33. Example: Atlas Cookie Level Data, Operational Specifics (Traditional Data Center Solution vs. Amazon Cloud Solution)
      Configuration: Traditional: 30 processing servers (HP ProLiant DL-360), 3 SQL servers (HP ProLiant DL-580), 10 TB SAN storage. Cloud: EMR cluster of up to 1000 EC2 instances, 200 GB additional S3 storage per month.
      Processing: Traditional: 2 to 30 hours. Cloud: reliably 9 hours.
      Data Retention: Traditional: 90 days. Cloud: 18 months.
      System Cost: Traditional: $5000/month. Cloud: $10000/month.
      Personnel Cost: Traditional: $15000/month. Cloud: $5500/month.
      Business Impact: no upfront investment in hardware; no hardware procurement delay; no additional operations staff was hired; "We completed development and testing of our first client project in six weeks. Our process is completely automated."; our first client campaign experienced a 500% increase in their return on ad spend.
  34. Better?
  35. Search Ads Restyled
  36. Etsy on Oprah | Search Ads Restyled | Hurricane Strikes | Justin Bieber Sneezes | New Cat Meme
  37. 5% | 95%
  38. Thank you! aws.amazon.com/big-data
  39. We are sincerely eager to hear your FEEDBACK on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.
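
The monthly cost figures on slide 33 can be totaled to compare the two setups. The short sketch below does that arithmetic using only the system and personnel costs listed on the slide; the variable and function names are illustrative assumptions, and the comparison ignores anything the slide does not list.

```python
# Monthly figures as listed on slide 33 (USD per month).
COSTS = {
    "Traditional Data Center Solution": {"system": 5000, "personnel": 15000},
    "Amazon Cloud Solution": {"system": 10000, "personnel": 5500},
}


def monthly_total(costs):
    """Add system and personnel cost for one solution."""
    return costs["system"] + costs["personnel"]


totals = {name: monthly_total(costs) for name, costs in COSTS.items()}
traditional = totals["Traditional Data Center Solution"]
cloud = totals["Amazon Cloud Solution"]
savings = traditional - cloud
savings_pct = 100 * savings / traditional

for name, total in totals.items():
    print(f"{name}: ${total}/month")
print(f"Difference: ${savings}/month ({savings_pct:.1f}% lower in the cloud)")
```

By these figures the system cost is higher in the cloud, but the lower personnel cost brings the combined monthly total down from $20000 to $15500.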
