Using Hadoop for Marketing

These are the slides used at the "Cookpad / PFI Joint Study Session" held on June 4. The togetter summary is here: http://togetter.com/li/26756

Transcript

  • 1. Hadoop
  • 2. • sasata299 ( ) • Hadoop • NoSQL • • http://blog.livedoor.jp/sasata299/
  • 3. Hadoop • Google MapReduce OSS • • • •
  • 4. • Hadoop • Hadoop • • & !?
  • 5. • Hadoop • Hadoop • • & !?
  • 6. 896 30 3 1
  • 7. 896 30 3 1
  • 8. 896 30 3 1 ” ”
  • 9. GROUP BY !! MySQL ( 3.5 )
  • 10. GROUP BY !! MySQL ( 3.5 ) 7000 ≒292 ……orz
  • 11. MySQL
  • 12. … ※
  • 13. !?
  • 14. Hadoop
  • 15.
  • 16. 7000 30
  • 17. • Hadoop • Hadoop • • & !?
  • 18. Hadoop • Hadoop Streaming • Ruby • Cloudera CDH1 (0.18.3) • EC2 Hadoop • 10-50 • Hadoop S3
  • 19. Hadoop • Hadoop Streaming • Ruby • Cloudera CDH1 (0.18.3) • EC2 Hadoop • 10-50 • Hadoop S3
  • 20. ○○ ×× ○○ ××
  • 21. Hadoop
  • 22. Hadoop AWS
  • 23. Hadoop
  • 24. Hadoop Hadoop (EC2)
  • 25. hadoop-ec2 push [cluster] mapper.rb hadoop-ec2 push [cluster] reducer.rb Hadoop (EC2)
  • 26. hadoop-ec2 push [cluster] mapper.rb hadoop-ec2 push [cluster] reducer.rb Hadoop (EC2)
  • 27. Hadoop hadoop-ec2 exec [cluster] [command] Hadoop S3 (EC2)
  • 28. Hadoop hadoop-ec2 exec [cluster] [command] Hadoop S3 (EC2)
  • 29. Hadoop hadoop-ec2 exec [cluster] [command] Hadoop S3 (EC2)
  • 30. Hadoop hadoop-ec2 exec [cluster] [command] Hadoop S3 (EC2)
  • 31. S3
  • 32. S3
  • 33. Hadoop Hadoop (EC2)
  • 34. Hadoop
  • 35. !!
  • 36. Hadoop
  • 37. 1) 2) Hadoop 3)
  • 38. • Hadoop • Hadoop • • & !?
  • 39. target_ids # [21310,12902,15321,..]
        ARGF.each do |log|
          log.chomp!
          id, foo, bar, ... = log.split(/,/)
          next if target_ids.include?(id)
        end
        target_ids 5 …
  • 40. :-)
  • 41. # 1000
        hash = Hash.new {|h,k| h[k] = []}
        target_ids.each do |_id|
          hash[_id.to_s[0,3]] << _id
        end
        ARGF.each do |log|
          log.chomp!
          id, foo, bar, ... = log.split(/,/)
          #
          next if hash[id[0,3]].include?(id)
        end
  • 42. Mapper Reducer - Mapper - Reducer
  • 43. • Hadoop • Hadoop • • & !?
  • 44. EC2 -> AZ -> JobTracker -> 50030 -> hadoop job -list
  • 45. 10h 8h JobTracker
  • 46. !?
  • 47. Amazon Elastic MapReduce
  • 48. Elastic MapReduce
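
Slides 39 and 41 appear to contrast two ways of checking each log line's ID against target_ids: a linear Array#include? scan over the whole array, and a lookup that first buckets the IDs by their first three characters. The Ruby sketch below illustrates that bucketing idea under assumptions that are not in the slides: the helper names (build_buckets, bucket_hit?, filter_lines, build_set) are hypothetical, the ID is assumed to be the first comma-separated field, and IDs are stored as strings so that they compare equal to fields split from a log line. The Set variant is an alternative technique, not something the slides show.

```ruby
require "set"

# Bucket the IDs by their first three characters (the idea on slide 41),
# so each lookup scans one small bucket instead of the whole array.
# Assumption: IDs are stored as strings, because fields split from a
# log line are strings and would never equal an Integer.
def build_buckets(ids)
  buckets = Hash.new { |h, k| h[k] = [] }
  ids.each { |id| buckets[id.to_s[0, 3]] << id.to_s }
  buckets
end

# True if the (string) id is one of the target IDs.
# fetch avoids creating an empty bucket for every unseen prefix.
def bucket_hit?(buckets, id)
  buckets.fetch(id[0, 3], []).include?(id)
end

# Alternative (not from the slides): a Set gives hash-based membership
# without any per-bucket scan.
def build_set(ids)
  ids.map(&:to_s).to_set
end

# Drop log lines whose first field is a target ID, mirroring the
# `next if ...` in the slides. Assumption: the ID is the first field.
def filter_lines(lines, buckets)
  lines.map(&:chomp).reject do |line|
    id = line.split(",").first.to_s # "" for an empty line
    bucket_hit?(buckets, id)
  end
end
```

For example, `filter_lines(["21310,a,b\n", "99999,c,d\n"], build_buckets([21310]))` keeps only the second line. Bucketing shrinks each scan to the IDs that share a prefix, while a Set removes the scan entirely; which one is preferable in a Hadoop Streaming mapper would depend on the real ID distribution, which the slides do not give.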