Flume in 10 minutes

Slides for the video walkthrough at https://www.youtube.com/watch?v=112opbzgBiw

Transcript

  • 1. Flume NG Basics. Copyright © 2012, Oracle and/or its affiliates. All rights reserved.
  • 2. Oracle’s Big Data Approach: 4 Steps to Greater Value
      • Acquire and organize all data
      • Enable greater access to wide data
      • Analyze and refine important data
      • Decide and publish insights
  • 3. How do I get data to my Hadoop Cluster? Using Flume NG to collect distributed data
  • 4. My log data is not near my Hadoop cluster
      [Diagram: Customer Logs on the Application Servers, a "?" between them and the Oracle Big Data Appliance]
  • 5. Moving Data with Flume NG
      [Diagram: Application Servers on one side, Oracle Big Data Appliance on the other. Three parallel paths, each: Logs → Flume NG Agent → Avro → Flume NG Agent → HDFS Write]
  • 6. Building a Basic Flume Agent
      One configuration file
      • Flume is flexible
        – Durable Transactions
        – In-Flight Data Modification
        – Compresses Data
      • Flume is simpler than it used to be
        – No Zookeeper requirement
        – No Master-Slave architecture
      • 3 basic pieces
        – Source, Channel, Sink
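The three pieces named on slide 6 can be pictured with a toy in-memory model. This is only an illustrative sketch, not Flume's API: the names `Channel`, `source`, and `drain` are hypothetical, and the buffer here is not durable. It shows the transaction idea only: a sink takes a batch from the channel and hands it back if the write fails.

```python
from collections import deque


class Channel:
    """Hypothetical bounded buffer sitting between a source and a sink."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()

    def put(self, event):
        # A full channel pushes back on the source instead of dropping data.
        if len(self.events) >= self.capacity:
            raise OverflowError("channel full")
        self.events.append(event)

    def take_batch(self, size):
        # Remove up to `size` events from the front of the buffer.
        count = min(size, len(self.events))
        return [self.events.popleft() for _ in range(count)]

    def rollback(self, batch):
        # Put a failed batch back at the front, preserving its order.
        self.events.extendleft(reversed(batch))


def source(lines, channel):
    """Source: turn each input line into one event on the channel."""
    for line in lines:
        channel.put(line.rstrip("\n"))


def drain(channel, write, batch_size=2):
    """Sink: write batches until the channel is empty or a write fails."""
    delivered = 0
    while channel.events:
        batch = channel.take_batch(batch_size)
        try:
            write(batch)
        except OSError:
            channel.rollback(batch)  # nothing is lost; retry later
            break
        delivered += len(batch)
    return delivered
```

In this sketch a failed write leaves every undelivered event in the channel, which is the property the "transactions" bullet refers to; real Flume channels add durability and concurrency on top of that idea.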
  • 7. Flume Configuration
      Start the agent: flume-ng agent -f this_file -n hdfs-agent
      Configuration properties (the left edge of each line was clipped in the transcript; the fragments are shown as extracted):
        ollect
        ehannel
        llect.type = netcat
        llect.bind = 127.0.0.1
        llect.port = 11111
        type = hdfs
        hdfs.path = hdfs://localhost:8020/user/oracle/sabre_example
        rollInterval = 30
        hdfs.writeFormat=Text
        hdfs.fileType=DataStream
        annel.type = memory
        annel.capacity=10000
        llect.channels=memoryChannel
        channel=memoryChannel
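Because the slide 7 transcript lost the start of each property line, the exact file cannot be recovered from it. As a hedged sketch of what a complete file of this shape looks like, the Python below renders Flume NG properties in the standard `<agent>.<kind>.<name>.<key> = <value>` layout. The values (netcat source on 127.0.0.1:11111, memory channel with capacity 10000, HDFS sink settings) come from the slide; the component names `netcat-collect` and `hdfs-write` and the helper `render_flume_config` are assumptions made for illustration.

```python
def render_flume_config(agent, sources, channels, sinks):
    """Render Flume NG properties text for a single agent."""
    kinds = (("sources", sources), ("channels", channels), ("sinks", sinks))
    lines = []
    # First declare which components the agent owns.
    for kind, components in kinds:
        lines.append(f"{agent}.{kind} = {' '.join(components)}")
    # Then emit one line per component property.
    for kind, components in kinds:
        for name, props in components.items():
            for key, value in props.items():
                lines.append(f"{agent}.{kind}.{name}.{key} = {value}")
    return "\n".join(lines) + "\n"


CONFIG = render_flume_config(
    "hdfs-agent",
    # "netcat-collect" and "hdfs-write" are assumed names, not from the slide.
    sources={
        "netcat-collect": {
            "type": "netcat",
            "bind": "127.0.0.1",
            "port": 11111,
            "channels": "memoryChannel",
        }
    },
    channels={"memoryChannel": {"type": "memory", "capacity": 10000}},
    sinks={
        "hdfs-write": {
            "type": "hdfs",
            "hdfs.path": "hdfs://localhost:8020/user/oracle/sabre_example",
            "hdfs.rollInterval": 30,
            "hdfs.writeFormat": "Text",
            "hdfs.fileType": "DataStream",
            "channel": "memoryChannel",
        }
    },
)

print(CONFIG, end="")
```

Note that a source lists its `channels` (plural) while a sink names a single `channel`. Saving the printed text as this_file would let it be passed to the flume-ng command shown on the slide, assuming the agent name given to -n matches the `hdfs-agent` prefix.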
  • 8. Sending Data to the Agent
      • Connect netcat to the host
      • Pipe input to it
      • Records are transmitted on newline
      • head example.xml | nc localhost 11111
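The `head example.xml | nc localhost 11111` pipeline on slide 8 can also be approximated from a program. The sketch below is one possible Python equivalent under stated assumptions: the helper names `head` and `send_records` are hypothetical, and the host and port defaults simply mirror the slide. Each line is sent as one newline-terminated record over a plain TCP connection.

```python
import itertools
import socket


def head(path, count=10):
    """Return the first `count` lines of a text file, like the `head` command."""
    with open(path, encoding="utf-8") as handle:
        return list(itertools.islice(handle, count))


def send_records(lines, host="localhost", port=11111):
    """Send each line as one newline-terminated record; return how many were sent."""
    sent = 0
    with socket.create_connection((host, port), timeout=5) as conn:
        for line in lines:
            # Normalize the line ending so every record ends in exactly one "\n".
            record = line.rstrip("\r\n")
            conn.sendall(record.encode("utf-8") + b"\n")
            sent += 1
    return sent
```

With an agent listening on that port, calling send_records(head("example.xml")) would send the first ten lines of the file, one event per line.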
  • 9. Alternatives to Flume and Their Trade-Offs
      • Scribe
        – Thrift-based
        – Lightweight, but no support
        – Not designed around Hadoop
      • Kafka
        – Designed to resemble a publish-subscribe system
        – Explicitly distributed
        – Apache Incubator Project