BDT101 Big Data with Amazon Elastic MapReduce - AWS re:Invent 2012
Big data technologies let you work with any velocity, volume, or variety of data in a highly productive environment. This session seeks to answer questions such as "what is big data," "how can I use unstructured data," and "how can I integrate data collections from different sources" using Hadoop with Amazon Elastic MapReduce. Join general manager of EMR, Peter Sirota, on a journey through real-world use cases of data-driven discovery.

Usage Rights

© All Rights Reserved

Presentation Transcript

  • BIG DATA: When your data sets become so large that you have to start innovating how you collect, store, analyze, and share them
  • The 3 Vs: Volume, Velocity, Variety
  • BIG DATA: The collection and analysis of large amounts of data creates competitive advantage
  • BIGGER IS BETTER
  • Online Population, Mobile Phone, Machine Data
  • 1 Trillion Objects!
  • COLLECT | STORE | ANALYZE | SHARE
  • COLLECT | STORE | ANALYZE | SHARE
  • Stream data to Amazon using Apache Flume; Amazon S3; Amazon Elastic MapReduce
  • COLLECT | STORE | ANALYZE | SHARE
  • [Chart: data stores positioned by Structure (Low to High) and Size (Small to Large): S3, EMR, HDFS, HBase, DynamoDB, RDS, logs on app servers]
  • ANALYZE: ORGANIZE | CLEAN | ENRICH | CONDENSE
  • DynamoDB table: Daily-Orders (NoSQL table); on-premise DB table: Customer-Demographics (SQL table); RDS table: Targeting-Information
  • DynamoDB table: Daily-Orders (NoSQL table); on-premise DB table: Customer-Demographics (SQL table); S3://clickstream-data/: Apache logs; 3rd-party data: social networking information accessed via web API; RDS table: Targeting-Information
  • S3 file: s3://weekly-trend-data/ (CSV report); S3 file: s3://monthly-trend-data/ (CSV report)
  • AMAZON ELASTIC MAPREDUCE: Reduces complexity/cost of Hadoop management. Integrates seamlessly with AWS services. Leverages unmatched operational experience.
  • Hadoop on Elastic MapReduce lowers the cost of developing and operating a distributed system.
  • Amazon EMR and Amazon S3
  • Data consumed in multiple ways [Diagram: S3, Prod Cluster (EMR), and EMR clusters for Recommendation Engine, Ad-hoc Analysis, and Personalization]
  • [Diagram: S3, Prod Cluster (EMR), Query Cluster (EMR), and further EMR clusters]
  • DynamoDB S3
  • EMR, DynamoDB, S3
  • DynamoDB
  • ANALYZE | SHARE: VISUALIZE | EXPLORE | DECIDE
  • Big Data Use Cases
  • Digital Advertising, Web Analytics, Log Processing, Data Warehousing
  • Industries: Media/Advertising, Oil & Gas, Retail, Life Sciences, Financial Services, Security, Social Network/Gaming. Use cases: Targeted Advertising, Image and Video Processing, Seismic Analysis, Recommendations, Transactions Analysis, Genome Analysis, Monte Carlo Simulations, Risk Analysis, Anti-virus, Fraud Detection, Image Recognition, User Demographics, Usage Analysis, In-game Metrics
  • Who is VivaKi? ©2011. All rights reserved. VivaKi. Proprietary and Confidential.
  • Big Data Challenge for VivaKi: Enablement, Activation, Attribution
  • The Product Solution: Fluent from Razorfish. A digital marketing technology platform that provides marketers and agencies with a single, integrated software application to target, distribute, and manage multi-channel digital campaigns and experiences.
      • Marketing Central (Marketing Planning and Management, Team Collaboration and Workflow)
      • Experience Publishing (CMS / DMS, Multi-Channel and Multi-Device Distribution, Social Monitoring)
      • Targeting (Multi-Channel Aware Segmentation and Targeting)
      • Insights (Analytics and Reporting, including Attribution)
      • Data Warehouse (Data Sources - 1st and 3rd Party, Data Normalization + Transformation, Data Management)
      • Amazon Cloud Infrastructure
  • VivaKi Technology Solution
  • Example: Atlas Cookie Level Data [Diagram labels: Click Stream Data, Historical Click Stream, Feed, User Browsing Session, Ad Server Logs, Data Mining, Segmentation & Categorization Algorithm, Apply Customization, Customer Loyalty Data, Ad Serving System, Cross Selling System]
  • Example: Atlas Cookie Level Data, Operational Specifics (Traditional Data Center Solution vs. Amazon Cloud Solution)
      • Configuration: 30 processing servers (HP ProLiant DL-360), 3 SQL servers (HP ProLiant DL-580), and 10TB SAN storage vs. an EMR cluster of up to 1000 EC2 instances and 200GB additional S3 storage per month
      • Processing: 2 to 30 hours vs. reliably 9 hours
      • Data Retention: 90 days vs. 18 months
      • System Cost: $5000/month vs. $10000/month
      • Personnel Cost: $15000/month vs. $5500/month
      • Business Impact: no upfront investment in hardware; no hardware procurement delay; no additional operations staff was hired; "We completed development and testing of our first client project in six weeks. Our process is completely automated."; our first client campaign experienced a 500% increase in their return on ad spend
  • Better?
  • Search Ads Restyled
  • Etsy on Oprah; Search Ads Restyled; Hurricane Strikes; Justin Bieber Sneezes; New Cat Meme
  • 5% | 95%
  • Thank you! aws.amazon.com/big-data
  • We are sincerely eager to hear your FEEDBACK on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.
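
Worked example (not part of the original slides): the "Atlas Cookie Level Data" slide lists System Cost and Personnel Cost for each solution but gives no totals. The sketch below adds them up, on the assumption that those two lines are the only recurring monthly costs to compare. The names `COSTS`, `monthly_total`, and `monthly_saving` are hypothetical and chosen only for this illustration.

```python
# Monthly figures (USD) copied from the "Atlas Cookie Level Data" slide.
# Assumption: System Cost and Personnel Cost are the only costs compared.
COSTS = {
    "traditional": {"system": 5000, "personnel": 15000},
    "cloud": {"system": 10000, "personnel": 5500},
}


def monthly_total(option: str) -> int:
    """Return system plus personnel cost per month for one option."""
    return sum(COSTS[option].values())


def monthly_saving() -> int:
    """Return how much less the cloud option costs per month."""
    return monthly_total("traditional") - monthly_total("cloud")


if __name__ == "__main__":
    for option in COSTS:
        print(f"{option}: ${monthly_total(option)}/month")
    saving = monthly_saving()
    share = 100 * saving / monthly_total("traditional")
    print(f"saving: ${saving}/month ({share:.1f}% of the traditional total)")
```

Under that assumption the higher system cost of the cloud solution is more than offset by the lower personnel cost, while processing time and data retention also differ as listed on the slide.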