Back end analytics_platform_2013_v1.0
2. Agenda
Targets
Approaches
Analytic platforms
Map/Reduce
GNT Game analytic system (current & testing)
Appendix
Hadoop/Hadoop components
HBase
Collectors (FlumeNG/Chukwa/Scribe)
3. Targets
Analytic systems
KPI
Monitoring (access log, error log)
Real-time analytics / batch processing analytical tasks for Client
Depends on GNT infrastructure, scalability
4. Approaches
Refer to the log platform architectures of Facebook, Twitter, ...
Community reviews of each component
Adapt to our needs
6. Analytic platform(1/10)
Tracker
Action log, error log (nginx)
Web log (Play framework)
Game user activities log (event-driven logs)
Database log (Cassandra, Redis, commit log)
Page tagging / log file analytics
Collector
ETL
Analyzer
Reporter
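As an illustration of the ETL step for the nginx access log listed above, here is a minimal Python sketch. It assumes nginx's default "combined" log format, which the slides do not confirm; the function name `parse_access_line` and the sample line (including its IP address and URL) are made up for this example.

```python
import re
from datetime import datetime

# Assumption: nginx default "combined" format; the real GNT format is not shown in the deck.
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)

def parse_access_line(line):
    """Return a dict of typed fields, or None if the line does not match."""
    m = COMBINED.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Convert the numeric and time fields so the analyzer can aggregate on them.
    rec["status"] = int(rec["status"])
    rec["body_bytes_sent"] = int(rec["body_bytes_sent"])
    rec["time_local"] = datetime.strptime(rec["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    return rec

# Hypothetical sample line, not taken from a real GNT log.
sample = ('192.168.30.1 - - [10/Oct/2013:13:55:36 +0700] '
          '"GET /game/start HTTP/1.1" 200 512 "-" "Mozilla/5.0"')
record = parse_access_line(sample)
```

Lines that do not match return `None`, so the ETL stage can count or route malformed entries instead of failing on them.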
7. Analytic platform(2/10)
Facebook
Web -> Scribe -> Ptail -> Puma -> HBase
http://www.slideshare.net/slrash/2011-0630hadoopsummit-v5-8469751
=> Collection layer (Flume/Scribe) → Filter layer (Flume) → Batching layer (Coprocessor)
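The three layers above (collection → filter → batching) can be pictured as chained stages. The sketch below is a toy Python model of that flow using generators. It is not how Flume, Scribe or HBase coprocessors are implemented; the function names, the event strings and the filter rule are all hypothetical.

```python
from itertools import islice

def collect(lines):
    """Collection layer: yield raw events with trailing newlines removed."""
    for line in lines:
        yield line.rstrip("\n")

def filter_events(events, predicate):
    """Filter layer: pass on only the events accepted by `predicate`."""
    for event in events:
        if predicate(event):
            yield event

def batch(events, size):
    """Batching layer: group events into lists of at most `size` items."""
    it = iter(events)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Hypothetical events: "<status> <method> <path>".
raw = ["200 GET /a\n", "500 GET /b\n", "500 GET /c\n", "500 GET /d\n"]
is_error = lambda e: e.startswith("500")
batches = list(batch(filter_events(collect(raw), is_error), size=2))
```

Because each stage is a generator, events flow through one at a time and only the batching stage holds more than one event in memory.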
21. GNT Game Analytic System(2/2)
current
Limitation :
– JavaScript implementation : limited to 1 JS execution per server at a time
– Scalability : does not scale except when sharding is used
Improving : integrate Mongo + Hadoop
http://www.slideshare.net/iammutex/the-elephant-in-the-room-mongo-db-hadoop
23. GNT Game Analytic System(2/4)
testing FlumeNG
flume.conf : 192.168.30.183
t-game-web183.sources = tail-nginx tail-play
t-game-web183.sinks = avro-sink-nginx183 avro-sink-play183
t-game-web183.channels = mem-channel-nginx183 mem-channel-play183
t-game-web183.sources.tail-nginx.type = exec
t-game-web183.sources.tail-nginx.command = tail -F /var/log/nginx/access.log
t-game-web183.sources.tail-nginx.channels = mem-channel-nginx183
t-game-web183.channels.mem-channel-nginx183.type = memory
t-game-web183.sinks.avro-sink-nginx183.type = avro
t-game-web183.sinks.avro-sink-nginx183.hostname = 192.168.30.185
t-game-web183.sinks.avro-sink-nginx183.port = 10183
t-game-web183.sinks.avro-sink-nginx183.channel = mem-channel-nginx183
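In a Flume agent file like the one above, each source is bound through a `channels` property and each sink through a `channel` property, and both must name a channel declared on the agent's `channels` line. A mistyped or line-wrapped channel name breaks that wiring. The following Python sketch checks it; the helpers `parse_properties` and `check_wiring` are hypothetical and not part of Flume, and the sample reproduces only the nginx portion of the slide's configuration.

```python
def parse_properties(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def check_wiring(props, agent):
    """Return a list of problems: sources or sinks bound to undeclared channels."""
    declared = set(props.get(f"{agent}.channels", "").split())
    problems = []
    for source in props.get(f"{agent}.sources", "").split():
        bound = props.get(f"{agent}.sources.{source}.channels", "").split()
        if not bound:
            problems.append(f"source {source}: no channels configured")
        for ch in bound:
            if ch not in declared:
                problems.append(f"source {source}: unknown channel {ch}")
    for sink in props.get(f"{agent}.sinks", "").split():
        ch = props.get(f"{agent}.sinks.{sink}.channel")
        if ch is None:
            problems.append(f"sink {sink}: no channel configured")
        elif ch not in declared:
            problems.append(f"sink {sink}: unknown channel {ch}")
    return problems

conf = """
t-game-web183.sources = tail-nginx
t-game-web183.sinks = avro-sink-nginx183
t-game-web183.channels = mem-channel-nginx183
t-game-web183.sources.tail-nginx.type = exec
t-game-web183.sources.tail-nginx.command = tail -F /var/log/nginx/access.log
t-game-web183.sources.tail-nginx.channels = mem-channel-nginx183
t-game-web183.channels.mem-channel-nginx183.type = memory
t-game-web183.sinks.avro-sink-nginx183.type = avro
t-game-web183.sinks.avro-sink-nginx183.hostname = 192.168.30.185
t-game-web183.sinks.avro-sink-nginx183.port = 10183
t-game-web183.sinks.avro-sink-nginx183.channel = mem-channel-nginx183
"""
problems = check_wiring(parse_properties(conf), "t-game-web183")
```

An empty `problems` list means every source and sink points at a declared channel. The check says nothing about whether the collector at 192.168.30.185 is actually listening on the configured port.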
24. GNT Game Analytic System(3/4)
testing FlumeNG
flume.conf : 192.168.30.185
t-game-cass185.sinks.hdfs-sink-nginx183.type = hdfs
t-game-cass185.sinks.hdfs-sink-nginx183.hdfs.path = hdfs://namenode/flume/webdata/nginx183
26. Mixed Solutions
Case 1 : old system : Mongo + Hadoop
Case 2 : FlumeNG + Hadoop + HBase
Case 3 : Batch processing : Hadoop HDFS (without FlumeNG)
74. Future issues
Manage analytic jobs
Message queue : Kafka, ZeroMQ
Monitoring : memory, Flume agent, Hadoop cluster, ...
Scalability