Case Study: Using Amazon Elastic MapReduce in Social Apps

TokyoWebmining #7
buhii

  1. Amazon Elastic MapReduce • Takahiro Kamatani, gumi, Inc. • 2010/09/26 (Sunday, September 26, 2010)
  2. • Amazon Elastic MapReduce
  3. • Twitter: @buhii • gumi @ http://www.kansei.tsukuba.ac.jp/~uchiyamalab/beacon • beacon • gumi @ynil
  5. gumi • mixi, GREE • Python, Django • Amazon Web Services (EC2 + RDS) • DB
  6. • PV, UU • DAU: Daily Active Users • ÷ DAU • ARPU: Average Revenue Per User
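The metrics named on this slide can be computed from one day's records. The following is a minimal Python sketch, not the author's code. It assumes a hypothetical record shape of `(user_id, revenue)` per logged event, and it assumes daily ARPU is taken as total revenue divided by DAU, which is one common convention (the slide's own formula is incomplete). For a single day, UU equals DAU.

```python
def daily_metrics(records):
    """Compute PV, DAU and ARPU for one day.

    records: iterable of (user_id, revenue) tuples, one per logged event.
    This record shape is a hypothetical choice for illustration.
    """
    records = list(records)
    pv = len(records)                                 # page views: one per record
    dau = len({user_id for user_id, _ in records})    # distinct users that day
    revenue = sum(amount for _, amount in records)
    arpu = revenue / dau if dau else 0.0              # assumption: revenue / DAU
    return {"pv": pv, "dau": dau, "arpu": arpu}


sample = [("u1", 0), ("u1", 100), ("u2", 0), ("u3", 200), ("u3", 0)]
metrics = daily_metrics(sample)
print(metrics)
```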
  7. Amazon Web Services (AWS)
  13. Amazon Elastic MapReduce
  14. MapReduce • Mapper: emits (key, value) pairs • Sort / Shuffle: groups the Mapper output by key • Reducer: receives the (key, value) pairs for each key • Only the Mapper and Reducer need to be written
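The Mapper, Sort / Shuffle, Reducer flow on this slide can be imitated in a single process to see how the pieces fit together. This is a toy sketch in Python, not how Hadoop is implemented. The helper `run_mapreduce` and the word-count functions are hypothetical names chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter


def run_mapreduce(records, mapper, reducer):
    """Single-process imitation of map -> sort/shuffle -> reduce."""
    # Map: each input record yields zero or more (key, value) pairs.
    mapped = [pair for record in records for pair in mapper(record)]
    # Sort / Shuffle: bring pairs with the same key next to each other.
    mapped.sort(key=itemgetter(0))
    # Reduce: call the reducer once per key with all of that key's values.
    results = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        results.extend(reducer(key, [value for _, value in group]))
    return results


def word_mapper(line):
    for word in line.split():
        yield word, 1


def count_reducer(key, values):
    yield key, sum(values)


counts = run_mapreduce(["a b a", "b c"], word_mapper, count_reducer)
print(counts)
```

Only `word_mapper` and `count_reducer` carry task-specific logic, which mirrors the division of labor the slide describes.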
  15. Amazon Elastic MapReduce • Hadoop • Hadoop Streaming: the Mapper and Reducer can be written in Ruby, Perl, Python, PHP, R, Bash, or C++ • Jobs run on EC2
  16. Example Task
  17. Mapper • Input: Apache log • Output: ID as key, plus a value, passed to the Reducer
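A Hadoop Streaming Mapper for this kind of task reads log lines and writes tab-separated key/value lines. The sketch below is an assumption-heavy illustration, not the code from the talk. It assumes the value is the access date (suggested by the Reducer output on slide 19), that the log uses the standard Apache timestamp format, and that the user ID appears in a query parameter named `uid`, which is a hypothetical name.

```python
import io
import re

MONTHS = {name: number for number, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}
DATE_RE = re.compile(r"\[(\d{2})/([A-Za-z]{3})/(\d{4}):")  # e.g. [20/Aug/2010:
UID_RE = re.compile(r"[?&]uid=(\d+)")  # hypothetical parameter name


def map_line(line):
    """Return 'id<TAB>YYYY-MM-DD' for one log line, or None if it does not parse."""
    date = DATE_RE.search(line)
    uid = UID_RE.search(line)
    if not date or not uid:
        return None
    day, month, year = date.groups()
    if month not in MONTHS:
        return None
    return "%s\t%s-%02d-%s" % (uid.group(1), year, MONTHS[month], day)


def run_mapper(stdin, stdout):
    """Streaming contract: read lines from stdin, write key<TAB>value lines."""
    for line in stdin:
        out = map_line(line)
        if out is not None:
            stdout.write(out + "\n")


log = io.StringIO(
    '192.0.2.1 - - [20/Aug/2010:10:00:00 +0900] "GET /top?uid=31758623 HTTP/1.1" 200 512\n'
    '192.0.2.2 - - [20/Aug/2010:10:00:01 +0900] "GET /favicon.ico HTTP/1.1" 404 0\n'
)
mapped = io.StringIO()
run_mapper(log, mapped)
print(mapped.getvalue(), end="")
```

When used as an actual streaming script, the same function would be called as `run_mapper(sys.stdin, sys.stdout)`.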
  18. Reducer • After sort/shuffle, records reach the Reducer grouped by ID • Output: one record per ID
  19. Reducer output:
      31758623 2010-08-20
      42346572 2010-09-05,2010-09-06
      31977736 2010-08-11,2010-08-12,2010-08-13,2010-08-14
      14007991 2010-08-16
      35995849 2010-08-12,2010-08-13,2010-08-14
      34246688 2010-08-21,2010-08-22,2010-08-23,2010-08-27
      ...
      • PC
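Output in the shape shown on slide 19 (an ID followed by a comma-separated list of dates) could be produced by a streaming Reducer along these lines. This is a sketch under assumptions, not the talk's code. It assumes the input lines are `id<TAB>date`, already sorted by ID as Hadoop Streaming delivers them, and it uses a tab between ID and dates in the output.

```python
import io
from itertools import groupby


def reduce_stream(stdin, stdout):
    """Collapse sorted 'id<TAB>date' lines into 'id<TAB>date1,date2,...'."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in stdin if "\t" in line)
    # Lines with the same ID are adjacent after sort/shuffle, so groupby suffices.
    for uid, group in groupby(pairs, key=lambda pair: pair[0]):
        # Deduplicate and sort: order within one key is not guaranteed.
        dates = sorted({date for _, date in group})
        stdout.write("%s\t%s\n" % (uid, ",".join(dates)))


sorted_input = io.StringIO(
    "31758623\t2010-08-20\n"
    "42346572\t2010-09-05\n"
    "42346572\t2010-09-06\n"
    "42346572\t2010-09-05\n"
)
out = io.StringIO()
reduce_stream(sorted_input, out)
reduced = out.getvalue()
print(reduced, end="")
```

Because the dates are in ISO format, sorting them as strings also sorts them chronologically.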
  20. Amazon Elastic MapReduce • AWS • Upload the Mapper and Reducer to S3 → s3cmd, S3Fox Organizer, Cyberduck • Job OK
  23. Streaming
  24. • {Input, Output} Location, Mapper, Reducer: S3 • gzip input, Hadoop Extra Args: -jobconf stream.recordreader.compression=gzip • Input Location, Extra Args: -input s3n://(bucket )/( )/access_log.*
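To make the relationship between these settings concrete, the sketch below assembles a generic Hadoop Streaming argument list in Python. It only uses the options named on the slide plus the standard `-output`, `-mapper`, and `-reducer` streaming options. The function name and every bucket, path, and script name are hypothetical placeholders, and how the EMR console combines its form fields with Extra Args is not shown here.

```python
def streaming_args(input_glob, output_dir, mapper, reducer, gzip_input=True):
    """Build a Hadoop Streaming argument list (illustrative helper only)."""
    args = [
        "-input", input_glob,    # wildcard input, as on the slide
        "-output", output_dir,
        "-mapper", mapper,
        "-reducer", reducer,
    ]
    if gzip_input:
        # Setting quoted on the slide for gzip-compressed input.
        args += ["-jobconf", "stream.recordreader.compression=gzip"]
    return args


args = streaming_args(
    "s3n://example-bucket/logs/access_log.*",   # hypothetical bucket and path
    "s3n://example-bucket/output/",             # hypothetical
    "s3n://example-bucket/code/mapper.py",      # hypothetical
    "s3n://example-bucket/code/reducer.py",     # hypothetical
)
print(" ".join(args))
```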
  28. Debug
  30. • Hadoop • MapReduce
  31. Hadoop • S3 gzip • Hadoop EC2 • ( 20 ...)
  32. @ynil MapReduce http://nlpyutori.g.hatena.ne.jp/yaruki_nil/20100911/1284089305
  36. MapReduce • A programming model introduced by Google, named after the Map and Reduce functions • Implemented in C++, Java, Python, and other languages • Source: Wikipedia “MapReduce” http://ja.wikipedia.org/wiki/MapReduce
  37. cron • PV, UU: NFS, CSV • DB → DB • PV, UU
