
iPinyou Hadoop ETL Tasks: Flume Usage and Optimization


#LAMP人# Challenges and Optimization of Next-Generation Internet Behavioral Targeting Advertising Technology - iPinyou Special Session

Published in: Technology

  1. 12 - - www.LAMPER.cn QQ 83304912 http://weibo.com/lampercn Copyright@2012 iPinyou All Rights Reserved.
  2. Hadoop ETL: Flume
  4. [Diagram: four Web Server nodes and a Log Server]
  7. Scribe
  8. Chukwa
  9. Flume
  10. Flume • Flume - Nodes - Nodes Source Sink - Nodes - - Sources Sinks - Nodes
  11. Flume
  12. Flume • Flume 1. Flume - Source Sink APIs - - Sources Sinks Decorators
  13. Flume • Flume 2. Connector - Sources: Console, Exec, Syslog, Scribe, IRC, Twitter - Sinks: Console, Local files, HDFS, S3 - Decorators Sinks: Wire batching, compression, sampling, throughput throttling
  14. Flume • Flume 3.
  15. Flume • Flume 1. agentBESink (best effort)
  16. Flume • Flume 2. agentDFOSink (disk failover)
  17. Flume • Flume 3. agentE2ESink (end to end)
  18. Flume • Flume 1. - Web Page - Flume Shell
  19. Flume • Flume 2. Configuring FlumeNode: tail("file") | filter [ console, roll(1000) { dfs("hdfs://namenode/user/flume") } ] ;
  20. Flume
  21. Why Flume? Scribe Server Scribe Server Scribe Server Scribe Store Scribe server Agent Collector Chukwa Agent Collector Collector Store Flume 3
  22. Why Flume? Agent Thrift Client Scribe Collector Thrift server Chukwa Agents Collectors Agents Collector Decorators Flume
  23. Why Flume? Scribe Chukwa Flume Web Flume Shell
  24. Why Flume?
  25. Flume 1. Master 2. Collector 3. HDFS Small Files 4. 5. CPU 6.
  26. Flume • Master
  27. Flume • Collector 1. Agents Collectors
  28. Flume • Collector 2. failover Collector Collector
  29. Flume • HDFS small files HDFS - NameNode - Map
  30. Flume • HDFS small files • CollectorSink rollmillis • flume-site.xml flume.collector.roll.millis
  31. Flume • 1. Batch - batch(n, maxlatency) Decorator Event batch customdfs - Agent batch Collector unbatch Event Event
  32. Flume • 2. Compression - gzip Decorator Event - batch batch gzip customdfs - Agent gzip Collector gunzip Event gzip 80%
  33. Flume • 3. checksum - Collector Checksum Checksum - Checksum 33%
  34. Flume • CPU TailDirSource CPU - 200ms - CPU - - - Log CPU 2/3
  35. Flume • TailDirSource - TailDirSource Direct Buffer MaxDirectMemorySize Direct Memory 1/2
  36. Flume
  37. Hadoop, Redis, Hbase, Hive, Pig, Oozie, Ganglia, Flume, Lucene, LIBSVM, Mahout, Zookeeper… Optimus, Folo8. http://www.ipinyou.com.cn http://weibo.com/pinyouhudong http://weibo.com/pinyouhudonghr hr@ipinyou.com
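
Slides 31 and 32 mention a batch(n, maxlatency) decorator and a gzip decorator: the agent batches and compresses events, and the collector gunzips and unbatches them. The Python sketch below illustrates that idea only; it is not Flume code. The names `Batcher`, `agent_encode`, and `collector_decode` are hypothetical, events are assumed to be single-line strings, and the latency check runs only when a new event arrives (a real decorator would presumably also flush on a timer).

```python
import gzip
import time


class Batcher:
    """Hypothetical stand-in for a batch(n, maxlatency) decorator.

    Emits a batch when it holds n events, or when max_latency_ms has
    passed since the first buffered event (checked only on append).
    """

    def __init__(self, n, max_latency_ms, clock=time.monotonic):
        self.n = n
        self.max_latency = max_latency_ms / 1000.0  # seconds
        self.clock = clock
        self.buffer = []
        self.first_at = None

    def append(self, event):
        """Buffer one event; return a completed batch (list) or None."""
        if not self.buffer:
            self.first_at = self.clock()
        self.buffer.append(event)
        too_old = self.clock() - self.first_at >= self.max_latency
        if len(self.buffer) >= self.n or too_old:
            return self.flush()
        return None

    def flush(self):
        """Return whatever is buffered and reset the buffer."""
        batch, self.buffer, self.first_at = self.buffer, [], None
        return batch


def agent_encode(batch):
    """Agent side: join the batch into one payload, then gzip it."""
    return gzip.compress("\n".join(batch).encode("utf-8"))


def collector_decode(blob):
    """Collector side: gunzip, then split back into single events."""
    text = gzip.decompress(blob).decode("utf-8")
    return text.split("\n") if text else []
```

Compressing a whole batch rather than each event gives gzip more repeated content to work with, which is presumably why the slides combine the two decorators.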
