Apache Kafka Demo
Information and demonstration of Apache Kafka in action. Originally presented at m6d Lunch and Learn series.

  • 1. Apache Kafka
    Obligatory bold claim: Kafka rocks! After this presentation you will love it.
    Edward Capriolo @
  • 2. Facts and history
    Written in Java/Scala.
    Developed by LinkedIn.
    Recently accepted as an Apache project.
  • 3. Overall design
    Distributed message queuing system.
    Designed for throughput, not complicated acknowledgment machinery like JMS. Again: Kafka != JMS.
    Simple producer/consumer API.
    Always persists messages to disk: linear writes and reads. Don't worry, it is still fast (sendfile support).
    Messages are retained for a specified time period, then pruned.
    Clients store their own position in the log files, so it is easy to replay messages and easy to deliver messages to multiple consumer groups.
  • 4. Key Concepts
    Message: a datum to send.
    Topic: a named place to send messages.
    Broker: a Kafka server.
    Partition: a logical division of a topic.
    Producer: an API to send messages.
    Consumer: an API to receive messages.
    Consumer Group: a group of clients that together receive only one copy of each message from a topic (more later).
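The vocabulary above can be sketched as a toy in-memory model. This is illustration only, not the Kafka API: the class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Kafka's core nouns: a Topic is a named set of
// Partitions; a Partition is an append-only log of messages;
// the offset (here, the list index) identifies each message.
class ToyTopic {
    final String name;
    final List<List<String>> partitions = new ArrayList<>();

    ToyTopic(String name, int numPartitions) {
        this.name = name;
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    // Producer side: append a message to a partition, return its offset.
    int send(int partition, String message) {
        List<String> log = partitions.get(partition);
        log.add(message);
        return log.size() - 1;
    }

    // Consumer side: read by (partition, offset). Rereading is cheap,
    // which is what makes replay and multiple consumer groups easy.
    String read(int partition, int offset) {
        return partitions.get(partition).get(offset);
    }
}
```

The point of the sketch: because the broker only appends and readers address messages by offset, consuming never removes anything, so any number of groups can read the same topic independently.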
  • 5. Design
  • 6. Demo 1: Producer/Consumer Shell
  • 7. Start up the Kafka/broker components:
    cd /home/edward/kafka/kafka/
    bin/ config/ &
    bin/ config/server.properties
    bin/ --server kafka://localhost:9092 --topic topic1
    bin/ --partitions 1 --props config/ --topic topic1
  • 8. Demo 2: Produce with no consumer then reconnect
  • 9. Stop all consumers, produce messages
    Stop consumer1.
    Produce messages.
    Start consumer1.
    Producing and consuming are not coupled. Remember: all messages are always persisted.
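Because the broker keeps every message on disk for the retention period and each consumer group only tracks how far it has read, a consumer that stops and restarts simply resumes from its last offset. A minimal sketch of that client-side offset bookkeeping (hypothetical names, not the Kafka API):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: the broker's log keeps everything; a consumer group only
// remembers its position. Producing while a consumer is down loses
// nothing - the consumer catches up on restart.
class DurableLog {
    final List<String> log = new ArrayList<>();
    final Map<String, Integer> groupOffsets = new HashMap<>();

    void produce(String msg) {
        log.add(msg);
    }

    // Return everything this group has not yet seen, advance its offset.
    List<String> poll(String group) {
        int from = groupOffsets.getOrDefault(group, 0);
        List<String> out = new ArrayList<>(log.subList(from, log.size()));
        groupOffsets.put(group, log.size());
        return out;
    }
}
```

Note that a second group polling the same log starts from offset 0 and receives the full history, which is the replay behavior the demo shows.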
  • 10. Demo 3: Multiple consumer groups
  • 11. Start two consumers, each with its own consumer group:
    bin/ --partitions 1 --props config/ --topic topic1
    bin/ --partitions 1 --props config/ --topic topic1
    # consumer group id
    groupid=test-consumer-group2
    Each message sent to the topic gets delivered once to each group.
  • 12. Demo 4: Multiple clients in a consumer group
    bin/ --partitions 1 --props config/ --topic topic1 (in a new shell)
    Each consumer group gets one copy of each message.
    Messages are balanced across the consumers in the group, i.e. if a consumer group has 3 consumers, each gets roughly 1/3rd of the messages (with random partitioning).
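The balancing claim above works because the partitions of a topic are divided among the consumers of a group, so the group as a whole covers every partition exactly once. A sketch of such an assignment (a simple round-robin; Kafka's actual rebalancing logic is more involved, and these names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of in-group load balancing: partition ids are dealt out
// round-robin to the consumers of one group. With 6 partitions and
// 3 consumers, each consumer owns 2 partitions and therefore sees
// roughly 1/3 of the messages under random partitioning.
class GroupAssignment {
    static List<List<Integer>> assign(int partitions, int consumers) {
        List<List<Integer>> out = new ArrayList<>();
        for (int c = 0; c < consumers; c++) {
            out.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            out.get(p % consumers).add(p);
        }
        return out;
    }
}
```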
  • 13. Processing Kafka in a distributed fashion
    Each topic has N partitions.
    We need 1..N consumers to process these messages.
    We could write a class that implements Kafka's consumer interface and run it on N nodes.
    Enter IronCount (something I wrote).
  • 14. Write a message handler:
    package com.jointhegrid.ironcount;

    import kafka.message.Message;

    public interface MessageHandler {
      public void setWorkload(Workload w);
      public void handleMessage(Message m);
      public void stop();
      public void setWorkerThread(WorkerThread wt);
    }
  • 15. Real-time count of city and state:
    public void handleMessage(Message m) {
      String[] parts = getMessage(m).split("\t");
      long time_in_ms = Long.parseLong(parts[1]);
      String city = parts[9];
      String state = parts[10];
      GregorianCalendar gc = new GregorianCalendar();
      gc.setTimeInMillis(time_in_ms);
      Mutator<String> mut = HFactory.createMutator(keyspace, StringSerializer.get());
      HCounterColumn<String> hc = HFactory.createCounterColumn(state + "," + city, 1L);
      mut.addCounter(bucketByMinute.format(new Date()), "", hc);
      mut.execute();
    }
  • 16. Aggregate To Cassandra
  • 17. Wrap up
    Distributed design.
    Designed for throughput, not in-memory per-message semantics.
    Survives failovers of consumers and brokers.
    A Kafka partition is similar to a map/reduce partition.
    Partial aggregations are possible because the key of a message routes it to the same consumer.
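The last point is what makes the partial aggregation in slide 15 safe: every message with a given key lands in the same partition, hence on the same consumer, so per-key counts can be kept locally without coordination. A sketch of that routing rule (hash of key modulo partition count, a common default; the class name here is hypothetical):

```java
// Sketch of key-based routing: same key -> same hash -> same
// partition -> same consumer in a group. All updates for one
// (state, city) key are processed on one node.
class KeyRouter {
    static int partitionFor(String key, int numPartitions) {
        // Math.abs guard: hashCode() may be negative. Since the
        // remainder's magnitude is already < numPartitions, abs is safe.
        return Math.abs(key.hashCode() % numPartitions);
    }
}
```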