Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

HadoopとMongoDBを活用したソーシャルアプリのログ解析

24,737 views

Published on

Published in: Technology

HadoopとMongoDBを活用したソーシャルアプリのログ解析

  1. 1. •          •       
  2. 2. •    •         
  3. 3.              
  4. 4.
  5. 5.      
  6. 6.      
  7. 7. !    
  8. 8.            
  9. 9.                
  10. 10.
  11. 11. � � �
  12. 12. •  ‣  ‣  ‣  •  ‣  ‣ 
  13. 13. •  ‣  ‣  ‣  ‣  ‣ 
  14. 14. •  ‣  ‣  ‣  ‣  ‣ 
  15. 15. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  16. 16. •  def mapper(key, value):! for word in value.split(): yield word,1! def reducer(key, values):! yield key,sum(values)! if __name__ == "__main__":! import dumbo! dumbo.run(mapper, reducer) dumbo start wordcount.py ! -hadoop /path/to/hadoop ! -input wc_input.txt -output wc_output
  17. 17. •  ‣  ‣  ‣  ‣  python wordcount.py map < wc_input.txt | sort | ! python wordcount.py red > wc_output.txt
  18. 18. •  ‣  ‣  ‣  ‣  ‣ 
  19. 19. • 
  20. 20. •  -----Change------! ActionLogger a{ChangeP} (Point,1371,1383) ! ActionLogger a{ChangeP} (Point,2373,2423)! ActionLogger a{ChangeMedal} (lucky_star,9,10) ! ActionLogger a{ChangeMedal} (lucky_sea_bream,0,1)! ActionLogger a{ChangeG} ! ActionLogger a{ChangeSubG} (SubGold,13,16) ! ActionLogger a{ChangeWakuwakuP} (buy,0,30)! ActionLogger a{ChangeWakuwakuP} (by gacha,30,0) ! ------Get------! ActionLogger a{GetMaterial} (syouhinnomoto,0,-1) ! ActionLogger a{GetMaterial} usesyouhinnomoto ! ActionLogger a{GetMaterial} (omotyanomotoPRO,1,6)! ActionLogger a{GetMaterial} (sui-tunomoto,5,4)! ActionLogger a{GetInterior} (bakery_counter,0,1)! ActionLogger a{GetAvatarPart} (190167,0,1) ! ActionLogger a{GetAvatarPart} (old_girl_09,0,1) ! -----Trade-----! ActionLogger a{Trade} buy 3 itigoke-kis from gree.jp:xxxxx !
  21. 21. •    ‣  ‣  2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger – userId a{Make} make item onsenmanjyuu! 2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger – userId a{LifeCycle} Login userId 2010-07-26 00:00:02,446 a{Make} {onsenmanjyuu,1}! userId 2010-07-26 00:00:02,478 a{LifeCycle} {Login,1}! userId 2010-07-26 00:00:02,478 a{GetMaterial} {omotyanomotoPRO,5}!
  22. 22. •    ‣  ‣  ‣  ‣ 
  23. 23. •  •  •  •  •  •  •  •  •  • 
  24. 24. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  25. 25. •  { "_id" : "2010-08-31+group+a{PutOn}", ! "date" : "2010-08-31", ! "lastUpdate" : "2010-09-21",! "actionType" : "a{PutOn}",! "actionDetail" : { "a{PutOn}" : 52050 } ! }! {...! "actionType" : "a{Make}",! "actionDetail" : { ! ”syurijyou” : 11,! ”aisukuri-mu” : 378,! ”kinnokarakuridokei” : 103,! ”puramoderu” : 22,! ”guremurinno_n” : 164,! ”kyodaipenginno_n” : 76,! ”patinko” : 67,! “wakizasi” : 250,! “dendendaiko” : 13651,! ... (over 100 items)! }! }!
  26. 26. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  27. 27. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  28. 28. •  MySQL: select * from things where x=3 and y="foo"! MongoDB: db.things.find( { x : 3, y : "foo" } );! MySQL: select z from things where x=3! MongoDB: db.things.find( { x : 3 }, { z : 1 } ); db.collection.find({ "field" : { $gt: value } } ); ! // : field > value ! db.collection.find({ "field" : { $lt: value } } ); ! // : field < value ! db.collection.find({"field”: {$gt: value1, $lt: value2}});! // value1 <= field <= value2
  29. 29. MySQL: select * from things where x in (b,a,c)! MongoDB: db.collection.find( { "field" : { $in : array } } ); ! db.things.find({j:{$in: [2,4,6]}});! db.customers.find( { name : /acme.*corp/i } ); ! db.myCollection.find().sort( { ts : -1 } ); // ts ! > m = function() { emit(this.user_id, 1); } ! > r = function(k,vals) { return 1; } ! > res = db.events.mapreduce(m, r, { query : {type:'sale'} }); ! > db[res.result].find().limit(2) ! { "_id" : 8321073716060 , "value" : 1 } ! { "_id" : 7921232311289 , "value" : 1 } !
  30. 30. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  31. 31. •  ‣  •  ‣  ‣  ‣  ‣ 
  32. 32. •    •  •  •  •  •  •  •  •  • 
  33. 33. •  {! "_id" : "2010-06-28+xxxx+Charge",! "lastUpdate" : "2010-09-20",! "userId" : ”xxxx",! "date" : "2010-06-28",! "actionType" : "Charge",! "totalCharge" : 1210,! "boughtItem" : { " EX 5 " : 1,! " 5 " : 1,! " 5 " : 1,! " " : 1,! " " : 2 }! }!
  34. 34. •  ‣  ‣  •  ‣  ‣  ‣ 
  35. 35. • 
  36. 36. •   
  37. 37. •   
  38. 38. •  {! "_id" : "2010-07-11+xxxxx+Registration",! "lastUpdate" : "2010-09-25",! "actionType" : "Registration",! "userId" : ”xxxxx",! "date" : "2010-07-11",! "firstCharge" : "2010-07-12",! "lastCharge" : "2010-09-02",! "lastLogin" : "2010-09-02",! "firstChargeTerm" : 1,! "playTerm" : 50,! "totalMonthCharge" : 1000,! "totalMonthChargeDetail" : {! "1th" : 74.3! "2th" : 17.1,! "3th" : 8.6,! i.e. "4th" : 0,! },! "totalCumlativeCharge" : 10000,! "totalCumlativeChargeDetail" : {! "1th" : 2,! "2th" : 0.5,! "3th" : 0.2,! "4th" : 0,! "5th" : 0.1,! "6th" : 27.5,! "7th" : 1.2,! "8th" : 49! "9th" : 19.5,! 2.7% }! }!
  39. 39. •  topMonthCharge = function(n){! return db.user_registration.find({},{! totalMonthCharge:true,! totalMonthChargeDetail:true,! userId:true! }).sort({totalMonthCharge:-1}).limit(n);! }! > topMonthCharge(20) ! { ! "_id" : "2010-07-10+9999+Registration",! Top20 "userId" : ”9999”,! "totalMonthCharge" : 10000,! "totalMonthChargeDetail" : { "5th" : 13.7, "4th" : 27.6, "3th" : 21, "2th" : 16.2, "1th" : 21.5 }! }! …!
  40. 40. findUser = function(x){ ! return db.user_charge.find({userId:x},{! userId:true,! totalCharge:true,! boughtItem:true}).sort({date:-1})! }! > findUserCharge("9999")! {! "_id" : "2010-09-08+9223458+Charge",! "totalCharge" : 2000,! "userId" : ”9999",! "boughtItem" : {! Top " 110 " : 2! }! }! {! "_id" : "2010-09-07+9223458+Charge",! "totalCharge" : 5000,! "userId" : ”9999",! "boughtItem" : {! " 350 " : 1,! " 110 " : 2! }! }! …!
  41. 41. •  •  •  •  •  • 
  42. 42. db.user_error! db.user_access! ( )! db.user_trace! (from )! (from )! db.user_attr! ( )! db.user_status! db.user_charge! (from Cassandra)! (from MySQL)!
  43. 43.
  44. 44. •  ‣  ‣  ‣  ‣ 
  45. 45. •  ‣  ‣  ‣  ‣ 

×