An Introduction to Apache Flume

An introduction to Apache Flume: what is it used for, how does
it work, and how does it fit into the Hadoop tool set?

Published in: Technology, Business

Transcript of "An Introduction to Apache Flume"

  1. Apache Flume
     ● What is it?
     ● How does it work?
     ● Architecture
     ● Reliability
     www.semtech-solutions.co.nz info@semtech-solutions.co.nz
  2. Flume – What is it?
     ● A data collection service for Hadoop
     ● For distributed systems
     ● Open source
     ● Scalable
     ● Reliable
     ● Manageable
     ● Fault tolerant
  3. Flume – How does it work?
     ● Flume uses agents, each of which has
       – A source
         ● Listens for events
         ● Writes events to the channel
       – A channel
         ● Queues event data as transactions
       – A sink
         ● Writes event data to a target, e.g. HDFS
         ● Removes the event from the queue
  4. Flume – Architecture
     ● A single agent showing its parts
     ● Generally one agent for a given data type
  5. Flume – Architecture
     ● Agents can be chained into flows
     ● Avro can be used for data serialization
  6. Flume – Architecture
     In complicated flows it may be necessary to think about
     ● Event data reliability
     ● Should we have
       – Complete end-to-end reliability
       – Send and forget
       – Or something in between?
  7. Flume – Architecture
     ● Complex flows may have many links
  8. Contact Us
     ● Feel free to contact us at
       – www.semtech-solutions.co.nz
       – info@semtech-solutions.co.nz
     ● We offer IT project consultancy
     ● We are happy to hear about your problems
     ● You pay only for the hours you need to solve your problems
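
The source, channel, and sink roles from slide 3, and the chaining of agents from slide 5, can be illustrated with a small sketch. This is a toy model in plain Python, not Flume's API: the names Agent, Channel, deliver, sink_once, and drain are hypothetical, and a list stands in for HDFS. It assumes the simplest reading of the slides, in which the sink removes an event from the channel only after the write to the target succeeds.

```python
from collections import deque


class Channel:
    """In-memory queue standing in for a Flume channel (hypothetical)."""

    def __init__(self):
        self._queue = deque()

    def put(self, event):
        self._queue.append(event)

    def peek(self):
        # Look at the oldest event without removing it.
        return self._queue[0] if self._queue else None

    def commit(self):
        # Remove the oldest event once it has been delivered.
        self._queue.popleft()

    def __len__(self):
        return len(self._queue)


class Agent:
    """One agent: a source feeding a channel that is drained by a sink."""

    def __init__(self, name, deliver):
        self.name = name
        self.channel = Channel()
        # deliver: callable that writes one event to the target and
        # raises OSError on failure (an assumption of this sketch).
        self.deliver = deliver

    def source(self, event):
        """Source role: accept an event and write it to the channel."""
        self.channel.put(event)

    def sink_once(self):
        """Sink role: deliver the oldest event; remove it only on success."""
        event = self.channel.peek()
        if event is None:
            return False
        try:
            self.deliver(event)
        except OSError:
            return False  # the event stays queued and can be retried
        self.channel.commit()
        return True

    def drain(self):
        """Run the sink until the channel is empty or a delivery fails."""
        while self.sink_once():
            pass


# Chain two agents into a flow: the first agent's sink hands events to
# the second agent's source, which finally writes to the stand-in store.
store = []
collector = Agent("collector", store.append)
web = Agent("web", collector.source)

for line in ["GET /a", "GET /b"]:
    web.source(line)
web.drain()
collector.drain()
```

Because an event leaves the channel only after deliver returns, a failed write keeps it queued, which is closer to the end-to-end end of the reliability range on slide 6. Committing before delivery would instead model send and forget.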