HadoopとMongoDBを活用したソーシャルアプリのログ解析

24,204 views

Published on

Published in: Technology

HadoopとMongoDBを活用したソーシャルアプリのログ解析

  1. 1. •          •       
  2. 2. •    •         
  3. 3.              
  4. 4.
  5. 5.      
  6. 6.      
  7. 7. !    
  8. 8.            
  9. 9.                
  10. 10.
  11. 11. � � �
  12. 12. •  ‣  ‣  ‣  •  ‣  ‣ 
  13. 13. •  ‣  ‣  ‣  ‣  ‣ 
  14. 14. •  ‣  ‣  ‣  ‣  ‣ 
  15. 15. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  16. 16. •  def mapper(key, value):! for word in value.split(): yield word,1! def reducer(key, values):! yield key,sum(values)! if __name__ == "__main__":! import dumbo! dumbo.run(mapper, reducer) dumbo start wordcount.py ! -hadoop /path/to/hadoop ! -input wc_input.txt -output wc_output
  17. 17. •  ‣  ‣  ‣  ‣  python wordcount.py map < wc_input.txt | sort | ! python wordcount.py red > wc_output.txt
  18. 18. •  ‣  ‣  ‣  ‣  ‣ 
  19. 19. • 
  20. 20. •  -----Change------! ActionLogger a{ChangeP} (Point,1371,1383) ! ActionLogger a{ChangeP} (Point,2373,2423)! ActionLogger a{ChangeMedal} (lucky_star,9,10) ! ActionLogger a{ChangeMedal} (lucky_sea_bream,0,1)! ActionLogger a{ChangeG} ! ActionLogger a{ChangeSubG} (SubGold,13,16) ! ActionLogger a{ChangeWakuwakuP} (buy,0,30)! ActionLogger a{ChangeWakuwakuP} (by gacha,30,0) ! ------Get------! ActionLogger a{GetMaterial} (syouhinnomoto,0,-1) ! ActionLogger a{GetMaterial} usesyouhinnomoto ! ActionLogger a{GetMaterial} (omotyanomotoPRO,1,6)! ActionLogger a{GetMaterial} (sui-tunomoto,5,4)! ActionLogger a{GetInterior} (bakery_counter,0,1)! ActionLogger a{GetAvatarPart} (190167,0,1) ! ActionLogger a{GetAvatarPart} (old_girl_09,0,1) ! -----Trade-----! ActionLogger a{Trade} buy 3 itigoke-kis from gree.jp:xxxxx !
  21. 21. •    ‣  ‣  2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger – userId a{Make} make item onsenmanjyuu! 2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger – userId a{LifeCycle} Login userId 2010-07-26 00:00:02,446 a{Make} {onsenmanjyuu,1}! userId 2010-07-26 00:00:02,478 a{LifeCycle} {Login,1}! userId 2010-07-26 00:00:02,478 a{GetMaterial} {omotyanomotoPRO,5}!
  22. 22. •    ‣  ‣  ‣  ‣ 
  23. 23. •  •  •  •  •  •  •  •  •  • 
  24. 24. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  25. 25. •  { "_id" : "2010-08-31+group+a{PutOn}", ! "date" : "2010-08-31", ! "lastUpdate" : "2010-09-21",! "actionType" : "a{PutOn}",! "actionDetail" : { "a{PutOn}" : 52050 } ! }! {...! "actionType" : "a{Make}",! "actionDetail" : { ! ”syurijyou” : 11,! ”aisukuri-mu” : 378,! ”kinnokarakuridokei” : 103,! ”puramoderu” : 22,! ”guremurinno_n” : 164,! ”kyodaipenginno_n” : 76,! ”patinko” : 67,! “wakizasi” : 250,! “dendendaiko” : 13651,! ... (over 100 items)! }! }!
  26. 26. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  27. 27. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  28. 28. •  MySQL: select * from things where x=3 and y="foo"! MongoDB: db.things.find( { x : 3, y : "foo" } );! MySQL: select z from things where x=3! MongoDB: db.things.find( { x : 3 }, { z : 1 } ); db.collection.find({ "field" : { $gt: value } } ); ! // : field > value ! db.collection.find({ "field" : { $lt: value } } ); ! // : field < value ! db.collection.find({"field”: {$gt: value1, $lt: value2}});! // value1 <= field <= value2
  29. 29. MySQL: select * from things where x in (b,a,c)! MongoDB: db.collection.find( { "field" : { $in : array } } ); ! db.things.find({j:{$in: [2,4,6]}});! db.customers.find( { name : /acme.*corp/i } ); ! db.myCollection.find().sort( { ts : -1 } ); // ts ! > m = function() { emit(this.user_id, 1); } ! > r = function(k,vals) { return 1; } ! > res = db.events.mapreduce(m, r, { query : {type:'sale'} }); ! > db[res.result].find().limit(2) ! { "_id" : 8321073716060 , "value" : 1 } ! { "_id" : 7921232311289 , "value" : 1 } !
  30. 30. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  31. 31. •  ‣  •  ‣  ‣  ‣  ‣ 
  32. 32. •    •  •  •  •  •  •  •  •  • 
  33. 33. •  {! "_id" : "2010-06-28+xxxx+Charge",! "lastUpdate" : "2010-09-20",! "userId" : ”xxxx",! "date" : "2010-06-28",! "actionType" : "Charge",! "totalCharge" : 1210,! "boughtItem" : { " EX 5 " : 1,! " 5 " : 1,! " 5 " : 1,! " " : 1,! " " : 2 }! }!
  34. 34. •  ‣  ‣  •  ‣  ‣  ‣ 
  35. 35. • 
  36. 36. •   
  37. 37. •   
  38. 38. •  {! "_id" : "2010-07-11+xxxxx+Registration",! "lastUpdate" : "2010-09-25",! "actionType" : "Registration",! "userId" : ”xxxxx",! "date" : "2010-07-11",! "firstCharge" : "2010-07-12",! "lastCharge" : "2010-09-02",! "lastLogin" : "2010-09-02",! "firstChargeTerm" : 1,! "playTerm" : 50,! "totalMonthCharge" : 1000,! "totalMonthChargeDetail" : {! "1th" : 74.3! "2th" : 17.1,! "3th" : 8.6,! i.e. "4th" : 0,! },! "totalCumlativeCharge" : 10000,! "totalCumlativeChargeDetail" : {! "1th" : 2,! "2th" : 0.5,! "3th" : 0.2,! "4th" : 0,! "5th" : 0.1,! "6th" : 27.5,! "7th" : 1.2,! "8th" : 49! "9th" : 19.5,! 2.7% }! }!
  39. 39. •  topMonthCharge = function(n){! return db.user_registration.find({},{! totalMonthCharge:true,! totalMonthChargeDetail:true,! userId:true! }).sort({totalMonthCharge:-1}).limit(n);! }! > topMonthCharge(20) ! { ! "_id" : "2010-07-10+9999+Registration",! Top20 "userId" : ”9999”,! "totalMonthCharge" : 10000,! "totalMonthChargeDetail" : { "5th" : 13.7, "4th" : 27.6, "3th" : 21, "2th" : 16.2, "1th" : 21.5 }! }! …!
  40. 40. findUser = function(x){ ! return db.user_charge.find({userId:x},{! userId:true,! totalCharge:true,! boughtItem:true}).sort({date:-1})! }! > findUserCharge("9999")! {! "_id" : "2010-09-08+9223458+Charge",! "totalCharge" : 2000,! "userId" : ”9999",! "boughtItem" : {! Top " 110 " : 2! }! }! {! "_id" : "2010-09-07+9223458+Charge",! "totalCharge" : 5000,! "userId" : ”9999",! "boughtItem" : {! " 350 " : 1,! " 110 " : 2! }! }! …!
  41. 41. •  •  •  •  •  • 
  42. 42. db.user_error! db.user_access! ( )! db.user_trace! (from )! (from )! db.user_attr! ( )! db.user_status! db.user_charge! (from Cassandra)! (from MySQL)!
  43. 43.
  44. 44. •  ‣  ‣  ‣  ‣ 
  45. 45. •  ‣  ‣  ‣  ‣ 

×