● What is it ?
● How does it work ?
● What can we collect ?
Chukwa – What is it ?
● For log collection and analysis
● Designed for big data
● Designed for Hadoop
● Uses HDFS and MapReduce
● Provides a toolkit to analyse logs
Chukwa – How does it work ?
● Chukwa agents on source nodes
● Transfer data to collectors which save data to HDFS
● Data sinks contain raw unsorted data
● Data sinks clean data
● Demux adds structure to create Chukwa records
● Chukwa records go to database
● Records are then ready to be analysed
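
The flow above can be sketched as a toy, in-memory model. This is not Chukwa code: the Chunk class, the collect and demux functions, the field names and the sample log lines are all hypothetical, chosen only to illustrate how raw unsorted sink data might become structured records.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chunk:
    """Hypothetical stand-in for a chunk of raw log data sent by an agent."""
    source: str     # host the agent runs on
    data_type: str  # label a parser could be chosen by
    seq: int        # order in which the agent read the line
    payload: str    # the raw log line


def collect(chunks: List[Chunk]) -> List[Chunk]:
    """Collector step: append chunks to the sink in arrival order (unsorted)."""
    sink: List[Chunk] = []
    for chunk in chunks:
        sink.append(chunk)
    return sink


def demux(sink: List[Chunk]) -> List[Dict[str, str]]:
    """Demux step: sort the raw chunks and give each one a record structure."""
    records = []
    for chunk in sorted(sink, key=lambda c: (c.source, c.seq)):
        # Assumed line layout for this sketch: "<timestamp> <message>"
        timestamp, _, message = chunk.payload.partition(" ")
        records.append({
            "source": chunk.source,
            "type": chunk.data_type,
            "timestamp": timestamp,
            "message": message,
        })
    return records


# Chunks arrive at the collector out of order (made-up sample data).
arrived = [
    Chunk("node2", "syslog", 0, "2013-01-01T10:00:05 disk check ok"),
    Chunk("node1", "syslog", 1, "2013-01-01T10:00:03 job finished"),
    Chunk("node1", "syslog", 0, "2013-01-01T10:00:01 job started"),
]
records = demux(collect(arrived))
for record in records:
    print(record["source"], record["timestamp"], record["message"])
```

In a real deployment the sink would be files in HDFS and the demux step would run as a MapReduce job; the sketch only mirrors the order of the stages.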
Chukwa – What can we collect ?
● System logs
– Defined format
– Undefined format
● Low latency
– Access to log data
Chukwa – Architecture ?
● Chukwa agents
– Reside on the Hadoop machines
– Collect raw data
– Use adaptors for data sources
– Use HTTP to transmit data
– Operate on data chunks
– Can fail over between collectors
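
The chunking and failover behaviour of an agent might look roughly like the sketch below. It is a minimal illustration, not the Chukwa agent API: read_chunks, send_with_failover, the fake_post transport and the collector URLs are all hypothetical, and the HTTP call is replaced by a stub so the example runs without a network.

```python
from typing import Callable, Dict, Iterable, List

Chunk = List[str]


def read_chunks(lines: Iterable[str], chunk_size: int) -> List[Chunk]:
    """Adaptor-like step: group raw lines from a source into chunks."""
    chunk: Chunk = []
    chunks: List[Chunk] = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == chunk_size:
            chunks.append(chunk)
            chunk = []
    if chunk:  # keep the final, partially filled chunk
        chunks.append(chunk)
    return chunks


def send_with_failover(chunk: Chunk,
                       collectors: List[str],
                       post: Callable[[str, Chunk], None]) -> str:
    """Try each collector in turn; return the URL that accepted the chunk."""
    for url in collectors:
        try:
            post(url, chunk)
            return url
        except ConnectionError:
            continue  # this collector is unreachable, try the next one
    raise ConnectionError("no collector accepted the chunk")


# Hypothetical collector addresses; the first one is simulated as down.
COLLECTORS = ["http://collector-a.example/", "http://collector-b.example/"]
received: Dict[str, List[Chunk]] = {}


def fake_post(url: str, chunk: Chunk) -> None:
    """Stub transport standing in for an HTTP POST to a collector."""
    if url == COLLECTORS[0]:
        raise ConnectionError("collector-a is down")
    received.setdefault(url, []).append(chunk)


chunks = read_chunks(["line 1", "line 2", "line 3"], chunk_size=2)
used = [send_with_failover(c, COLLECTORS, fake_post) for c in chunks]
print(used)
```

Passing the transport in as a function keeps the failover logic separate from the network layer, so a real HTTP client could be substituted for fake_post without changing send_with_failover.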