[Gaming on AWS] Big Data Analysis in the Cloud

Big Data Analysis in the Cloud - AWS Korea (정윤진, Solutions Architect)

Transcript

  • 1. Cloud
  • 2. Thank you
  • 3. In the next 30 minutes: (1) What is big data; (2) Big data on AWS; (3) How customers are using AWS
  • 4. Where is this data coming from?
  • 5. Human generated / Machine generated. Human generated: Tweet; Surf the internet; Buy and sell products; Upload images and videos; Play games; Check in at restaurants; Search for cafes; Find deals; Watch content online; Look for directions; Use social media
  • 6. Human generated / Machine generated. Machine generated: Networks and security devices; Mobile phones; Cell phone towers; Smart grids; Smart meters; Telematics from cars; Sensors on machines; Videos from traffic and security cameras
  • 7. What is it used for?
  • 8. Data for competitive advantage
  • 9. Data for competitive advantage: Customer segmentation; Financial modeling; System analysis; Line-of-sight; Replacing human decisions; Business intelligence..
  • 10. Data for competitive advantage: Customer segmentation; Financial modeling; System analysis; Line-of-sight; Replacing human decisions; Business intelligence.. Innovating new business and revenue models
  • 11. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 12. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation. Callout: lower cost, increased throughput
  • 13. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation. Callouts: lower cost, increased throughput; constraint
  • 14. Very high barrier to turning data into information…
  • 15. Very high barrier to turning data into information: Infrastructure capacity; Technical skills; Questions to ask; Cheap experimentation
  • 16. Amazon Web Services Cloud
  • 17. Elastic and highly scalable + No upfront capital expense + Only pay for what you use + Available on-demand = Remove constraints
  • 18. Remove constraints = More experimentation; More experimentation = More innovation; More innovation = Competitive edge
  • 19. Amazon Web Services: Removes constraints; Focus on your data; Leave undifferentiated heavy lifting to us
  • 20. HOW
  • 21. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 22. 26
  • 23. How to move your data into AWS. Diagram labels: AWS Cloud; Corporate data center; Virtual Private Cloud; VPN; Internet; Direct Connect; Storage Gateway; AWS Import/Export; S3; EMR; Redshift
  • 24. Diagram labels: AWS Import/Export; Corporate data center; Amazon Elastic MapReduce; Amazon Simple Storage Service (S3); BI users; Clickstream data from 500+ websites and VoD platform
  • 25. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 26. More than 25 million streaming members; 50 billion events per day; 30 million plays every day; 2 billion hours of video in 3 months; 4 million ratings per day; 3 million searches; Device location, time, day, week, etc.; Social data
  • 27. 10 TB of streaming data per day
  • 28. What is S3? Highly scalable data storage; Access via APIs; Fast (850K requests per sec); Highly available & durable (99.999999999% durability); Economical ($0.095 per GB)*; Web store
  • 29. Velocity of data: Amazon DynamoDB
  • 30. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 31. “Who buys video games?”
  • 32. Per day: 3.5 billion records; 13 TB of click stream logs; 71 million unique cookies
  • 33. Results: 500% return on ad spend; 17,000% reduction in procurement time
  • 34. What is EMR? Map-Reduce engine; Integrated with tools; Hadoop-as-a-service; Massively parallel; Cost effective AWS wrapper; Integrated with AWS services
  • 35. Hive and Redshift (source: http://nerds.airbnb.com/redshift-performance-cost)
      Table size     | Query type         | Hive                  | Redshift
      3 billion rows | Simple range query | 1680 seconds (28 min) | 360 seconds (6 min)
      1 million rows | 2 complex joins    | 182 seconds           | 8 seconds
      $13.60/hour on Redshift versus $57/hour on Hive
  • 36. Every day is crucial and costly
  • 37. Challenge: To run a virtual screen with a higher accuracy algorithm & 21 million compounds
  • 38. Using Cycle Computing and Amazon Web Services
      Metric                | Count
      Compute hours of work | 109,927 hours
      Compute days of work  | 4,580 days
      Compute years of work | 12.55 years
      Ligand count          | ~21 million ligands
  • 39. 3 hours for $4,828.85/hr
  • 40. Instead of $20+ million in infrastructure
  • 41. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 42. Open web index. 3.4 billion records. Available to all. 1000 Genomes project
  • 43. Generation; Collect; Store; Collaboration & sharing; Analysis and Computation
  • 44. Sample architecture. Diagram labels: Game instances; DB instances; Proxy farms; Amazon EMR; Amazon Glacier; Amazon Redshift; Amazon DynamoDB; Game traffic; Analysis; Users
  • 45. Thank you! aws.amazon.com/big-data; younjin@amazon.com. May 21st, COEX Intercontinental, Seoul: one-day free training, walk-through of services. http://aws.amazon.com/apac/awsday/seoul/
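
The figures on slides 35, 38 and 39 can be cross-checked with a little arithmetic. The Python sketch below is only an illustration: it uses the numbers quoted on those slides and measures nothing independently. The function names are hypothetical. It also assumes a 365-day year, that the $4,828.85/hr rate applied for the full 3 hours, and that the hourly Hive and Redshift prices can be pro-rated per second to estimate a per-query cost.

```python
# Cross-check of figures quoted on slides 35, 38 and 39.
# All inputs come from the slides; the helper names are hypothetical.

HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: the slide's "years" uses a 365-day year
SECONDS_PER_HOUR = 3600


def compute_time(hours):
    """Convert compute hours into (days, years)."""
    days = hours / HOURS_PER_DAY
    years = days / DAYS_PER_YEAR
    return days, years


def pro_rated_cost(rate_per_hour, seconds):
    """Cost of running for `seconds` at an hourly rate.

    Assumption: the hourly price is billed proportionally per second.
    """
    return rate_per_hour * seconds / SECONDS_PER_HOUR


# Slide 38: 109,927 compute hours expressed as days and years.
days, years = compute_time(109_927)

# Slide 39: 3 hours of wall-clock time at $4,828.85 per hour.
cluster_cost = 3 * 4828.85

# Slide 35: query times in seconds, and the hourly prices on the slide.
hive_rate, redshift_rate = 57.00, 13.60
range_query = {"hive": 1680, "redshift": 360}    # 3 billion rows
join_query = {"hive": 182, "redshift": 8}        # 1 million rows

range_speedup = range_query["hive"] / range_query["redshift"]
join_speedup = join_query["hive"] / join_query["redshift"]
hive_range_cost = pro_rated_cost(hive_rate, range_query["hive"])
redshift_range_cost = pro_rated_cost(redshift_rate, range_query["redshift"])

print(f"Compute time: {days:,.0f} days, {years:.2f} years")
print(f"Cluster cost for 3 hours: ${cluster_cost:,.2f}")
print(f"Range query speedup: {range_speedup:.1f}x")
print(f"Join query speedup: {join_speedup:.1f}x")
print(f"Range query cost: Hive ${hive_range_cost:.2f}, "
      f"Redshift ${redshift_range_cost:.2f}")
```

Under these assumptions the day and year counts on slide 38 follow directly from the hour count, and the total spend on slide 39 is the hourly rate times three.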