Apache Flume NG

Talk given by Kai Voigt, Cloudera Inc, at the Hadoop User Group UK meetup on 10 Oct 2012 in London

Apache Flume NG

  1. APACHE FLUME NG. Kai Voigt, Cloudera Inc. London, Hadoop User Group, 10 Oct 2012. Slide footer: Thursday, 11 October 2012
  2. “FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”
  3. “FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”
  4. httpd → /var/log/htaccess → Flume → HDFS
  5. (no text on slide)
  6. mysource, mysink, mychannel
       myagent.sources = mysource
       myagent.sinks = mysink
       myagent.channels = mychannel
  7. mysource, mysink, mychannel
       myagent.sources.mysource.type = exec
       myagent.sources.mysource.command = tail -F /var/log/htaccess
       myagent.sources.mysource.channels = mychannel
  8. mysource, mysink, mychannel
       myagent.sinks.mysink.type = hdfs
       myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
       myagent.sinks.mysink.hdfs.fileType = DataStream
       myagent.sinks.mysink.channel = mychannel
  9. mysource, mysink, mychannel
       myagent.channels.mychannel.type = memory
       myagent.channels.mychannel.capacity = 1000
       myagent.channels.mychannel.transactionCapacity = 100
  10. $ flume-ng agent --conf-file simple.conf --name myagent
        $ hadoop fs -ls htaccess
        -rw-r--r-- 1 cloudera cloudera 1001 2012-09-30 05:58 htaccess/FlumeData.1348999108529
        -rw-r--r-- 1 cloudera cloudera  993 2012-09-30 05:58 htaccess/FlumeData.1348999108530
        -rw-r--r-- 1 cloudera cloudera  997 2012-09-30 05:59 htaccess/FlumeData.1348999108531
        -rw-r--r-- 1 cloudera cloudera 1009 2012-09-30 05:59 htaccess/FlumeData.1348999108532
        ...
  11. “FLUME IS A DISTRIBUTED, RELIABLE, AND AVAILABLE SERVICE FOR EFFICIENTLY COLLECTING, AGGREGATING, AND MOVING LARGE AMOUNTS OF LOG DATA”
  12. MULTI HOP
  13. myagent1.sinks = mysink
        myagent1.sinks.mysink.type = avro
        myagent1.sinks.mysink.hostname = 10.10.10.20
        myagent1.sinks.mysink.port = 4141

        myagent2.sources = mysource
        myagent2.sources.mysource.type = avro
        myagent2.sources.mysource.bind = 10.10.10.10
        myagent2.sources.mysource.port = 4141
  14. CONSOLIDATION
  15. MULTIPLEXING
  16. Sources: Avro, Exec, NetCat, Sequence Generator, Syslog, Scribe
        Sinks: Avro, Logger, IRC, File, HBase
        Channels: Memory, JDBC, File
  17. DEMO DEMO DEMO DEMO DEMO
  18. Thank you! kai@cloudera.com http://www.cloudera.com/
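
Slides 6 to 9 each show one part of a single agent definition. As a minimal sketch, assuming all four fragments live in the `simple.conf` file that slide 10 passes to `flume-ng`, the assembled file might look like this. The comments are added for orientation, and the channel property is written with the `transactionCapacity` spelling used by Flume's memory channel.

```properties
# simple.conf: one agent named "myagent" with one source, one channel, one sink
myagent.sources = mysource
myagent.channels = mychannel
myagent.sinks = mysink

# Source (slide 7): follow the web server log with tail
myagent.sources.mysource.type = exec
myagent.sources.mysource.command = tail -F /var/log/htaccess
myagent.sources.mysource.channels = mychannel

# Channel (slide 9): in-memory buffer between source and sink
myagent.channels.mychannel.type = memory
myagent.channels.mychannel.capacity = 1000
myagent.channels.mychannel.transactionCapacity = 100

# Sink (slide 8): write events to HDFS as plain text
myagent.sinks.mysink.type = hdfs
myagent.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent.sinks.mysink.hdfs.fileType = DataStream
myagent.sinks.mysink.channel = mychannel
```

Note the asymmetry in the wiring: a source lists its `channels` (plural, it may feed several), while a sink names exactly one `channel`.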

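Slide 13 shows only the two Avro endpoints of a multi-hop flow. A fuller sketch follows, assuming the first agent reuses the exec source and the second agent reuses the HDFS sink from the earlier slides. The memory channels and the single address 10.10.10.10 on both sides are assumptions made for illustration. The Avro sink takes the downstream agent's address in its `hostname` property, and that address needs to be one the downstream Avro source is listening on.

```properties
# Hop 1 (assumed to run on the web server): tail the log, forward over Avro
myagent1.sources = mysource
myagent1.channels = mychannel
myagent1.sinks = mysink

myagent1.sources.mysource.type = exec
myagent1.sources.mysource.command = tail -F /var/log/htaccess
myagent1.sources.mysource.channels = mychannel

myagent1.channels.mychannel.type = memory

myagent1.sinks.mysink.type = avro
# Address of the downstream agent (assumed to match the bind address below)
myagent1.sinks.mysink.hostname = 10.10.10.10
myagent1.sinks.mysink.port = 4141
myagent1.sinks.mysink.channel = mychannel

# Hop 2 (assumed to run on 10.10.10.10): receive over Avro, write to HDFS
myagent2.sources = mysource
myagent2.channels = mychannel
myagent2.sinks = mysink

myagent2.sources.mysource.type = avro
myagent2.sources.mysource.bind = 10.10.10.10
myagent2.sources.mysource.port = 4141
myagent2.sources.mysource.channels = mychannel

myagent2.channels.mychannel.type = memory

myagent2.sinks.mysink.type = hdfs
myagent2.sinks.mysink.hdfs.path = /user/cloudera/htaccess
myagent2.sinks.mysink.hdfs.fileType = DataStream
myagent2.sinks.mysink.channel = mychannel
```

Because every property is prefixed with the agent name, both definitions can share one file; each host would start its own agent with the matching `--name` value, in the same way as the command on slide 10.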
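
Slides 14 and 15 name two further topologies without showing configuration. Consolidation is the multi-hop pattern with several first-tier agents whose Avro sinks all point at one collector's Avro source. Multiplexing can be sketched with a channel selector on that collector's source, which routes each event to a channel based on an event header. Everything named here is hypothetical: the agent name `collector`, the header `logtype` and its values, the channel and sink names, and the HDFS path.

```properties
# Hypothetical collector agent: one Avro source fanning out to two channels
collector.sources = avrosource
collector.channels = accesschannel errorchannel
collector.sinks = accesssink errorsink

collector.sources.avrosource.type = avro
collector.sources.avrosource.bind = 0.0.0.0
collector.sources.avrosource.port = 4141
collector.sources.avrosource.channels = accesschannel errorchannel

# Route events by the value of the (assumed) "logtype" header
collector.sources.avrosource.selector.type = multiplexing
collector.sources.avrosource.selector.header = logtype
collector.sources.avrosource.selector.mapping.access = accesschannel
collector.sources.avrosource.selector.mapping.error = errorchannel
# Events with a missing or unmapped header value go here
collector.sources.avrosource.selector.default = accesschannel

collector.channels.accesschannel.type = memory
collector.channels.errorchannel.type = memory

# Access events go to HDFS, error events to the agent's own log
collector.sinks.accesssink.type = hdfs
collector.sinks.accesssink.hdfs.path = /user/cloudera/access
collector.sinks.accesssink.hdfs.fileType = DataStream
collector.sinks.accesssink.channel = accesschannel

collector.sinks.errorsink.type = logger
collector.sinks.errorsink.channel = errorchannel
```

The sketch assumes upstream agents already set the `logtype` header on each event; how that header gets attached is not shown.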