
iPinyou Hadoop ETL Tasks: Using and Optimizing Flume

#LAMPer# Challenges and Optimization of Next-Generation Internet Behavioral Targeting Advertising Technology: iPinyou Special Session

Published in: Technology

  1. www.LAMPER.cn; QQ 83304912; http://weibo.com/lampercn. Copyright@2012 iPinyou All Rights Reserved.
  2. Hadoop ETL: Flume
  4. [Diagram: four Web Server nodes and a Log component]
  7. Scribe
  8. Chukwa
  9. Flume
  10. Flume: Nodes; Node: Source, Sink; Sources, Sinks
  11. Flume
  12. Flume: 1. Source and Sink APIs; Sources, Sinks, Decorators
  13. Flume: 2. Connector. Sources: Console, Exec, Syslog, Scribe, IRC, Twitter. Sinks: Console, Local files, HDFS, S3. Decorators (on Sinks): wire batching, compression, sampling, throughput throttling
  14. Flume: 3.
  15. Flume: 1. agentBESink (best effort)
  16. Flume: 2. agentDFOSink (disk failover)
  17. Flume: 3. agentE2ESink (end to end)
  18. Flume: 1. Web Page; Flume Shell
  19. Flume: 2. Configuring FlumeNode: tail("file") | filter [ console, roll(1000) { dfs("hdfs://namenode/user/flume") } ] ;
  20. Flume
  21. Why Flume? Scribe: Scribe Server, Store. Chukwa: Agent, Collector. Flume: Agent, Collector, Store.
  22. Why Flume? Scribe: Agent (Thrift client), Collector (Thrift server). Chukwa: Agents, Collectors. Flume: Agents, Collector, Decorators.
  23. Why Flume? Scribe; Chukwa; Flume: Web, Flume Shell.
  24. Why Flume?
  25. Flume: 1. Master 2. Collector 3. HDFS Small Files 4. 5. CPU 6.
  26. Flume: Master
  27. Flume: Collector. 1. Agents, Collectors
  28. Flume: Collector. 2. Failover: Collector, Collector
  29. Flume: HDFS small files. HDFS; NameNode; Map
  30. Flume: HDFS small files. CollectorSink rollmillis; flume-site.xml: flume.collector.roll.millis
  31. Flume: 1. Batch. batch(n,maxlatency) Decorator; Event batch; customdfs; Agent: batch; Collector: unbatch; Event
  32. Flume: 2. Compression. gzip Decorator; Event; batch, gzip; customdfs; Agent: gzip; Collector: gunzip; Event; gzip: 80%
  33. Flume: 3. checksum. Collector Checksum; Checksum: 33%
  34. Flume: CPU. TailDirSource CPU; 200ms; Log; CPU: 2/3
  35. Flume: TailDirSource. Direct Buffer; MaxDirectMemorySize; Direct Memory: 1/2
  36. Flume
  37. Hadoop, Redis, HBase, Hive, Pig, Oozie, Ganglia, Flume, Lucene, LIBSVM, Mahout, Zookeeper ... Optimus, Folo8. http://www.ipinyou.com.cn; http://weibo.com/pinyouhudong; http://weibo.com/pinyouhudonghr; hr@ipinyou.com
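
Slide 31 names a `batch(n,maxlatency)` decorator, with batching on the Agent and unbatching on the Collector. The surviving slide text does not say how the decorator works internally, so the following is only a minimal Python sketch of the general idea, assuming a batch is flushed when either `n` events are buffered or the oldest buffered event has waited `maxlatency` milliseconds. The `Batcher` class, its `tick` method, and the injectable `clock` are hypothetical names for illustration, not Flume APIs.

```python
import time
from typing import Callable, List, Optional


class Batcher:
    """Hypothetical batcher: flush on n events or after max_latency_ms."""

    def __init__(self, n: int, max_latency_ms: int,
                 emit: Callable[[List[bytes]], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.n = n
        self.max_latency = max_latency_ms / 1000.0  # seconds
        self.emit = emit          # called with each completed batch
        self.clock = clock        # injectable so the logic can be tested
        self.buffer: List[bytes] = []
        self.first_at: Optional[float] = None

    def append(self, event: bytes) -> None:
        """Buffer one event; flush as soon as the batch is full."""
        if not self.buffer:
            self.first_at = self.clock()  # arrival time of the oldest event
        self.buffer.append(event)
        if len(self.buffer) >= self.n:
            self.flush()

    def tick(self) -> None:
        """Call periodically; flushes a partial batch that waited too long."""
        if self.buffer and self.clock() - self.first_at >= self.max_latency:
            self.flush()

    def flush(self) -> None:
        """Emit the buffered events as one batch and start a new buffer."""
        if self.buffer:
            self.emit(self.buffer)
            self.buffer = []
            self.first_at = None
```

A larger `n` amortizes per-message overhead across more events, while the latency bound keeps a quiet source from holding events indefinitely.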

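
Slide 32 pairs a gzip decorator on the Agent with gunzip on the Collector, applied to batched events. As a rough illustration of why compressing a whole batch of similar log lines pays off, the sketch below round-trips a batch through Python's standard `gzip` module. The `pack` and `unpack` helpers and the sample log lines are hypothetical; events are assumed to be non-empty in number and free of newline bytes. It does not attempt to reproduce the 80% figure on the slide.

```python
import gzip
from typing import List


def pack(events: List[bytes]) -> bytes:
    """Agent side: join newline-free events and gzip the whole batch."""
    return gzip.compress(b"\n".join(events))


def unpack(payload: bytes) -> List[bytes]:
    """Collector side: gunzip and split back into individual events."""
    return gzip.decompress(payload).split(b"\n")


# Hypothetical, highly repetitive access-log lines.
events = [b"GET /ad?id=%d HTTP/1.1 200" % i for i in range(1000)]
raw_size = len(b"\n".join(events))
packed = pack(events)

# Print uncompressed and compressed sizes in bytes.
print(raw_size, len(packed))
```

Compressing after batching gives the compressor many near-identical lines to work with, which is where most of the saving comes from.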
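
Slides 29 and 30 link the HDFS small-files problem to the collector roll interval (`rollmillis`, or `flume.collector.roll.millis` in flume-site.xml). The arithmetic can be sketched as follows, assuming every roll on every collector produces one HDFS file; the function name and the sample values are illustrative assumptions, not figures from the talk.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def files_per_day(roll_millis: int, collectors: int = 1) -> int:
    """Upper bound on files per day if each roll writes one file per collector."""
    rolls = -(-MS_PER_DAY // roll_millis)  # ceiling division
    return rolls * collectors


# Illustrative values only: a 30-second roll versus a 1-hour roll.
print(files_per_day(30_000))        # 2880
print(files_per_day(3_600_000, 2))  # 48
```

Each file costs NameNode memory and typically becomes at least one Map input split, so lengthening the roll interval trades data freshness for fewer, larger files.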