BDT101 Big Data with Amazon Elastic MapReduce - AWS re: Invent 2012
 

Big data technologies let you work with any velocity, volume, or variety of data in a highly productive environment. This session seeks to answer questions such as "what is big data," "how can I use unstructured data," and "how can I integrate data collections from different sources" using Hadoop with Amazon Elastic MapReduce. Join general manager of EMR, Peter Sirota, on a journey through real-world use cases of data-driven discovery.


    Presentation Transcript

    • BIG-DATA: When your data sets become so large that you have to start innovating how to collect, store, analyze and share them
    • 3Vs: Volume, Velocity, Variety
    • BIG-DATA: The collection and analysis of large amounts of data creates competitive advantage
    • BIGGER IS BETTER
    • Online Population | Mobile Phone | Machine Data
    • 1 Trillion Objects!
    • COLLECT | STORE | ANALYZE | SHARE
    • COLLECT | STORE | ANALYZE | SHARE
    • Stream data to Amazon using Apache Flume | Amazon S3 | Amazon Elastic MapReduce
    • COLLECT | STORE | ANALYZE | SHARE
    • Chart of data stores by structure (low to high) and size (small to large): S3, EMR, HDFS, HBase, DynamoDB, RDS, logs on app servers
    • ANALYZE: ORGANIZE | CLEAN | ENRICH | CONDENSE
    • DynamoDB Table: Daily-Orders (NoSQL table) | On-Premise DB Table: Customer-Demographics (SQL table) | RDS Table: Targeting-Information
    • DynamoDB Table: Daily-Orders (NoSQL table) | On-Premise DB Table: Customer-Demographics (SQL table) | S3://clickstream-data/ (Apache logs) | 3rd Party Data: Social Networking Information (accessed via web API) | RDS Table: Targeting-Information
    • S3 file: s3://weekly-trend-data/ (CSV report) | S3 file: s3://monthly-trend-data/ (CSV report)
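The organize, clean, enrich, and condense steps on the preceding slides can be illustrated with a small local sketch. This is not the pipeline from the talk: the field names (`customer_id`, `order_date`, `amount`, `region`) and the sample rows are hypothetical stand-ins for the Daily-Orders and Customer-Demographics tables, and the output only mimics the shape of a weekly trend CSV report.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows standing in for the Daily-Orders table.
daily_orders = [
    {"customer_id": "c1", "order_date": "2012-11-05", "amount": "20.00"},
    {"customer_id": "c2", "order_date": "2012-11-06", "amount": "35.50"},
    {"customer_id": "c1", "order_date": "2012-11-13", "amount": "10.00"},
    {"customer_id": "c3", "order_date": "2012-11-14", "amount": ""},  # dirty row
]

# Hypothetical lookup standing in for the Customer-Demographics table.
customer_demographics = {"c1": {"region": "west"}, "c2": {"region": "east"}}


def weekly_trend(orders, demographics):
    """Clean, enrich, and condense orders into revenue per (ISO week, region)."""
    totals = defaultdict(float)
    for row in orders:
        if not row["amount"]:
            continue  # clean: skip rows with a missing amount
        # enrich: attach the customer's region, defaulting to "unknown"
        region = demographics.get(row["customer_id"], {}).get("region", "unknown")
        year, week, _ = date.fromisoformat(row["order_date"]).isocalendar()
        # condense: aggregate revenue per week and region
        totals[(f"{year}-W{week:02d}", region)] += float(row["amount"])
    return dict(totals)


def to_csv(totals):
    """Render the condensed totals as CSV text, sorted by week and region."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["week", "region", "revenue"])
    for (week, region), revenue in sorted(totals.items()):
        writer.writerow([week, region, f"{revenue:.2f}"])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(weekly_trend(daily_orders, customer_demographics)))
```

On a real cluster the same steps would run as distributed jobs over the full tables; the in-memory dictionaries here only keep the example self-contained.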
    • AMAZON ELASTIC MAPREDUCE: Reduces complexity/cost of Hadoop management | Integrates seamlessly with AWS services | Leverages unmatched operational experience
    • Hadoop on Elastic MapReduce lowers the cost of developing and operating a distributed system.
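For readers new to the programming model that Hadoop distributes, the following is a minimal single-process sketch of map, shuffle, and reduce applied to web server logs. It is an illustration only: the function names are hypothetical, the sample lines merely resemble Apache access logs, and nothing here calls Hadoop or Elastic MapReduce.

```python
from collections import defaultdict


def map_log_line(line):
    """Map step: emit (path, 1) for each request found in a log line."""
    parts = line.split('"')
    if len(parts) < 2:
        return  # malformed line: emit nothing
    request = parts[1].split()
    if len(request) >= 2:
        yield request[1], 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the counts for one key."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in-process over an iterable of lines."""
    pairs = (pair for line in lines for pair in map_log_line(line))
    return dict(reduce_counts(k, vs) for k, vs in shuffle(pairs).items())


# Hypothetical sample input.
sample_lines = [
    '10.0.0.1 - - [05/Nov/2012:10:00:00 +0000] "GET /home HTTP/1.1" 200 512',
    '10.0.0.2 - - [05/Nov/2012:10:00:01 +0000] "GET /cart HTTP/1.1" 200 128',
    '10.0.0.1 - - [05/Nov/2012:10:00:02 +0000] "GET /home HTTP/1.1" 200 512',
    "garbage",
]

if __name__ == "__main__":
    print(run_job(sample_lines))
```

In a cluster, the map and reduce functions run on many machines and the framework performs the shuffle between them; the single-process version only shows the data flow.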
    • Amazon EMR and Amazon S3
    • Data consumed in multiple ways: Recommendation Engine | Ad-hoc Analysis | Personalization (diagram components: S3, Prod Cluster (EMR), EMR)
    • Diagram components: S3, Prod Cluster (EMR), Query Cluster (EMR), additional EMR clusters
    • DynamoDB S3
    • EMR | DynamoDB | S3
    • DynamoDB
    • ANALYZE | SHARE: VISUALIZE | EXPLORE | DECIDE
    • Big Data Use Cases
    • Digital Advertising | Web Analytics | Log Processing | Data Warehousing
    • Use cases by industry:
      • Media/Advertising: Targeted Advertising, Image and Video Processing
      • Oil & Gas: Seismic Analysis
      • Retail: Recommendations, Transactions Analysis
      • Life Sciences: Genome Analysis
      • Financial Services: Monte Carlo Simulations, Risk Analysis
      • Security: Anti-virus, Fraud Detection, Image Recognition
      • Social Network/Gaming: User Demographics, Usage analysis, In-game metrics
    • Who is VivaKi? ©2011. All rights reserved. VivaKi. Proprietary and Confidential.
    • Big Data Challenge for VivaKi: Enablement | Activation | Attribution
    • The Product Solution – Fluent from Razorfish: A digital marketing technology platform that provides marketers and agencies with a single, integrated software application to target, distribute, and manage multi-channel digital campaigns and experiences.
      • Marketing Central (Marketing Planning and Management, Team Collaboration and Workflow)
      • Experience Publishing (CMS / DMS, Multi-Channel and Multi-Device Distribution, Social Monitoring)
      • Targeting (Multi-Channel Aware Segmentation and Targeting)
      • Insights (Analytics and Reporting, including Attribution)
      • Data Warehouse (Data Sources - 1st and 3rd Party, Data Normalization + Transformation, Data Management)
      • Amazon Cloud Infrastructure
    • VivaKi Technology Solution
    • Example: Atlas Cookie Level Data (diagram labels: Click Stream Data, Historical Click Stream, User Browsing Session, Ad Server Logs, Feed, Data Mining, Segmentation & Categorization, Apply Customization Algorithm, Customer Loyalty Data, Ad Serving System, Cross Selling System)
    • Example: Atlas Cookie Level Data Operational Specifics Traditional Data Center Solution Amazon Cloud Solution 30 Processing Servers (HP Proliant DL-360) 3 SQL Servers (HP Proliant DL-580) EMR Cluster of up to 1000 EC2 Instances Configuration 10TB SAN Storage 200GB additional S3 storage per month Processing 2 to 30 hours reliably 9 hours Data Retention 90 days 18 months System Cost $5000/month $10000/month Personnel Cost $15000/month $5500/month Business Impact  no upfront investment in hardware  no hardware procurement delay  no additional operations staff was hired  We completed development and testing of our first client project in six weeks. Our process is completely automated.  our first client campaign experienced a 500% increase in their return on ad spend ©2011. All rights reserved. VivaKi. Proprietary and Confidential.
    • Better?
    • Search Ads Restyled
    • Etsy on Oprah | Search Ads Restyled | Hurricane Strikes | Justin Bieber Sneezes | New Cat Meme
    • 5% | 95%
    • Thank you! aws.amazon.com/big-data
    • We are sincerely eager to hear your FEEDBACK on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.