Flume in 10 minutes

Slides for the video walkthrough at https://www.youtube.com/watch?v=112opbzgBiw

Published in: Technology

  1. Flume NG Basics
     Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
  2. Oracle’s Big Data Approach: 4 Steps to Greater Value
     • Acquire and organize all data
     • Enable greater access to wide data
     • Analyze and refine important data
     • Decide and publish insights
  3. How do I get data to my Hadoop cluster?
     Using Flume NG to collect distributed data
  4. My log data is not near my Hadoop cluster
     [Diagram: customer logs on the application servers, a question mark, and the Oracle Big Data Appliance]
  5. Moving Data with Flume NG
     [Diagram: three rows, each showing Logs → Flume NG Agent on the application servers → Avro → Flume NG Agent on the Oracle Big Data Appliance → HDFS Write]
  6. Building a Basic Flume Agent: One configuration file
     • Flume is flexible
       – Durable transactions
       – In-flight data modification
       – Compresses data
     • Flume is simpler than it used to be
       – No ZooKeeper requirement
       – No master-slave architecture
     • 3 basic pieces
       – Source, Channel, Sink
  7. Flume Configuration
     flume-ng agent -f this_file -n hdfs-agent
     Configuration file (the left edge of each line is cut off on the slide):
     …ollectehannel
     …llect.type = netcat
     …llect.bind = 127.0.0.1
     …llect.port = 11111
     …type = hdfs
     …hdfs.path = hdfs://localhost:8020/user/oracle/sabre_example
     …rollInterval = 30
     …hdfs.writeFormat=Text
     …hdfs.fileType=DataStream
     …annel.type = memory
     …annel.capacity=10000
     …llect.channels=memoryChannel
     …channel=memoryChannel
  8. Sending Data to the Agent
     • Connect netcat to the host
     • Pipe input to it
     • Records are transmitted on newline
     • head example.xml | nc localhost 11111
  9. Alternatives to Flume and Their Trade-Offs
     • Scribe
       – Thrift-based
       – Lightweight, but no support
       – Not designed around Hadoop
     • Kafka
       – Designed to resemble a publish-subscribe system
       – Explicitly distributed
       – Apache Incubator Project
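
The configuration text on slide 7 is clipped on the left, so the full property keys cannot be read from the transcript. The script below is a minimal sketch of what a complete file for that agent might look like, not the author's exact file. The agent name (hdfs-agent), the channel name (memoryChannel), and all values come from the slide. The source name netcat-collect and the sink name hdfs-write are assumptions chosen to fit the visible fragments, and placing rollInterval under the hdfs. prefix is assumed from how the HDFS sink names its other properties.

```shell
#!/bin/sh
# Write a complete agent configuration to a temporary file.
# ASSUMPTION: "netcat-collect" (source) and "hdfs-write" (sink) are
# hypothetical names; the slide only shows clipped endings of the real ones.
conf_file="$(mktemp)"

cat > "$conf_file" <<'EOF'
# Name the three basic pieces of the agent: source, sink, channel.
hdfs-agent.sources = netcat-collect
hdfs-agent.sinks = hdfs-write
hdfs-agent.channels = memoryChannel

# Source: accept newline-terminated records on a local TCP port.
hdfs-agent.sources.netcat-collect.type = netcat
hdfs-agent.sources.netcat-collect.bind = 127.0.0.1
hdfs-agent.sources.netcat-collect.port = 11111

# Sink: write events to HDFS as plain text, rolling files every 30 seconds.
hdfs-agent.sinks.hdfs-write.type = hdfs
hdfs-agent.sinks.hdfs-write.hdfs.path = hdfs://localhost:8020/user/oracle/sabre_example
hdfs-agent.sinks.hdfs-write.hdfs.rollInterval = 30
hdfs-agent.sinks.hdfs-write.hdfs.writeFormat = Text
hdfs-agent.sinks.hdfs-write.hdfs.fileType = DataStream

# Channel: buffer up to 10000 events in memory between source and sink.
hdfs-agent.channels.memoryChannel.type = memory
hdfs-agent.channels.memoryChannel.capacity = 10000

# Wiring: a source lists "channels" (plural), a sink has one "channel".
hdfs-agent.sources.netcat-collect.channels = memoryChannel
hdfs-agent.sinks.hdfs-write.channel = memoryChannel
EOF

echo "Wrote $conf_file"
echo "Start the agent with: flume-ng agent -f $conf_file -n hdfs-agent"
```

With the agent running, the command from slide 8 (head example.xml | nc localhost 11111) sends the first ten lines of example.xml to the source. Because the netcat source splits its input on newlines, each of those lines becomes a separate Flume event.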