
From Big to Fast Data. How #kafka and #kafka-connect can redefine your ETL and #stream-processing

Presentation on "Big Data and Kafka, Kafka-Connect and the modern days of stream processing" For @Argos - @Accenture Development Technology Conference - London Science Museum (IMAX)



  1. 2010
  2. 2014 - Error handling becomes a first-class citizen
  3. Schema registry flow: Your App → Producer → Serializer → check if the format is acceptable → retrieve the schema ID → Schema ID + Data goes to the Kafka topic (incompatible data raises an error instead):
     producerProps.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
     producerProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
     (A fuller producer sketch follows the slide list.)
  4. Pipeline: shipments topic + sales topic → Spark Streaming → low inventory topic. Generate data - let’s see some code
  5. Define the data contract / schema in Avro format (sketched after the list)
  6. Generate data: 1.9 M msg/sec using 1 thread (sketched after the list)
  7. https://schema-registry-ui.landoop.com - Schemas registered for us :-)
  8. Defining the typed data format (sketched after the list)
  9. Initiate the streaming from 2 topics (sketched after the list)
  10. The business logic (sketched after the list)
  11. Pipeline: shipments topic + sales topic → Spark Streaming → low inventory topic → elastic-search, re-ordering (sketched after the list)
  12. Simple is beautiful
  13. landoop.com/blog · github.com/landoop
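
Code sketches for the slides above, all in Scala. They are minimal illustrations under stated assumptions, not the code shown at the talk.

Slide 3 - the producer side. A sketch of a producer wired to the schema registry; the endpoints, the "sales" topic name and the tiny Sale schema are assumptions:

    import java.util.Properties
    import org.apache.avro.Schema
    import org.apache.avro.generic.GenericRecordBuilder
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object AvroProducerSketch extends App {
      val producerProps = new Properties()
      producerProps.put("bootstrap.servers", "localhost:9092")          // assumption
      producerProps.put("schema.registry.url", "http://localhost:8081") // assumption
      // The serializers from the slide: they look up / register the schema in the
      // registry and prepend its ID to every serialized message.
      producerProps.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
      producerProps.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")

      // Hypothetical record schema, only here to make the sketch runnable.
      val saleSchema = new Schema.Parser().parse(
        """{"type": "record", "name": "Sale", "namespace": "demo", "fields": [
          |  {"name": "sku", "type": "string"}, {"name": "quantity", "type": "int"}]}""".stripMargin)

      val producer = new KafkaProducer[AnyRef, AnyRef](producerProps)
      val sale = new GenericRecordBuilder(saleSchema).set("sku", "A-42").set("quantity", 2).build()
      // A record that breaks the registered schema's compatibility rules would
      // fail here with a serialization error - the "incompatible data error" box.
      producer.send(new ProducerRecord[AnyRef, AnyRef]("sales", sale))
      producer.close()
    }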
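
Slide 5 - the data contract. The deck doesn't show the actual schemas, so the field names and types below are invented for illustration:

    import org.apache.avro.Schema

    // Hypothetical Avro contracts for the two input topics.
    object Contracts {
      val sale: Schema = new Schema.Parser().parse(
        """{
          |  "type": "record", "name": "Sale", "namespace": "demo",
          |  "fields": [
          |    {"name": "sku",      "type": "string"},
          |    {"name": "quantity", "type": "int"},
          |    {"name": "ts",       "type": "long"}
          |  ]
          |}""".stripMargin)

      val shipment: Schema = new Schema.Parser().parse(
        """{
          |  "type": "record", "name": "Shipment", "namespace": "demo",
          |  "fields": [
          |    {"name": "sku",      "type": "string"},
          |    {"name": "quantity", "type": "int"},
          |    {"name": "ts",       "type": "long"}
          |  ]
          |}""".stripMargin)
    }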
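
Slide 6 - the generator. 1.9 M msg/sec from one thread is the deck's own figure; a loop like the one below only gets near such numbers with batching-friendly producer settings, and every property value here is an assumption. It reuses the Contracts object from the previous sketch:

    import java.util.Properties
    import org.apache.avro.generic.GenericRecordBuilder
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    object GeneratorSketch extends App {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092")
      props.put("schema.registry.url", "http://localhost:8081")
      props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
      props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
      // Throughput-oriented settings (assumed): fire-and-forget acks plus large,
      // lingering batches let a single thread push millions of small messages.
      props.put("acks", "0")
      props.put("batch.size", "65536")
      props.put("linger.ms", "5")

      val producer = new KafkaProducer[AnyRef, AnyRef](props)
      var i = 0L
      while (i < 10000000L) {                 // 10M messages from a single thread
        val sale = new GenericRecordBuilder(Contracts.sale)
          .set("sku", s"SKU-${i % 1000}")
          .set("quantity", 1)
          .set("ts", System.currentTimeMillis())
          .build()
        producer.send(new ProducerRecord[AnyRef, AnyRef]("sales", sale))
        i += 1
      }
      producer.close()
    }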
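
Slide 8 - the typed data format. A plausible reading is mapping the generic Avro records onto Scala case classes; the deck doesn't show this code, so the mapping is an assumption:

    import org.apache.avro.generic.GenericRecord

    // Typed views of the Avro records (field names match the assumed contracts).
    case class Sale(sku: String, quantity: Int, ts: Long)
    case class Shipment(sku: String, quantity: Int, ts: Long)

    object Typed {
      def toSale(r: GenericRecord): Sale =
        Sale(r.get("sku").toString,
             r.get("quantity").asInstanceOf[Int],
             r.get("ts").asInstanceOf[Long])

      def toShipment(r: GenericRecord): Shipment =
        Shipment(r.get("sku").toString,
                 r.get("quantity").asInstanceOf[Int],
                 r.get("ts").asInstanceOf[Long])
    }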
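
Slide 9 - initiating the stream from the 2 topics. Sketched against the Spark Streaming + Kafka 0.10 direct-stream integration; the Spark version used in the talk isn't stated, and the endpoints, group id and batch interval are assumptions:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object StreamingSketch {
      val conf = new SparkConf().setAppName("low-inventory").setMaster("local[*]")
      val ssc  = new StreamingContext(conf, Seconds(5))
      ssc.checkpoint("/tmp/low-inventory-checkpoint") // required by the stateful logic below

      val kafkaParams = Map[String, Object](
        "bootstrap.servers"   -> "localhost:9092",
        "schema.registry.url" -> "http://localhost:8081",
        "key.deserializer"    -> classOf[StringDeserializer],
        // The Avro deserializer resolves the schema ID embedded in each message.
        "value.deserializer"  -> "io.confluent.kafka.serializers.KafkaAvroDeserializer",
        "group.id"            -> "inventory-demo",
        "auto.offset.reset"   -> "earliest"
      )

      // One direct stream subscribed to both input topics from slide 4.
      val stream = KafkaUtils.createDirectStream[String, AnyRef](
        ssc, PreferConsistent,
        Subscribe[String, AnyRef](Set("sales", "shipments"), kafkaParams))
    }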
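
Slide 10 - the business logic. The deck doesn't spell it out; a plausible reading, sketched here, is that shipments add stock, sales subtract it, and per-SKU running totals falling under a threshold feed the low inventory topic. The threshold and the state handling are assumptions; this continues from StreamingSketch above:

    import org.apache.avro.generic.GenericRecord

    object BusinessLogicSketch {
      // Turn every record from either topic into a signed stock delta per SKU.
      val deltas = StreamingSketch.stream.map { rec =>
        val r   = rec.value().asInstanceOf[GenericRecord]
        val sku = r.get("sku").toString
        val qty = r.get("quantity").asInstanceOf[Int]
        if (rec.topic() == "sales") (sku, -qty) else (sku, qty) // sales deplete stock
      }

      // Per-SKU running stock level, kept as Spark Streaming state.
      val stockLevels = deltas.updateStateByKey[Int] { (batch: Seq[Int], state: Option[Int]) =>
        Some(state.getOrElse(0) + batch.sum)
      }

      // SKUs that have dropped below the (assumed) re-order threshold of 10.
      val lowInventory = stockLevels.filter { case (_, stock) => stock < 10 }
    }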
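
Slide 11 - closing the loop. A sketch of indexing the low-inventory stream into Elasticsearch with the elasticsearch-spark connector; the index name is an assumption, and the re-ordering step is only hinted at in a comment:

    import org.apache.spark.streaming.dstream.DStream
    import org.elasticsearch.spark._ // adds saveToEs to RDDs; needs "es.nodes" in the SparkConf

    object SinkSketch {
      def index(lowInventory: DStream[(String, Int)]): Unit =
        lowInventory.foreachRDD { rdd =>
          rdd.map { case (sku, stock) => Map("sku" -> sku, "stock" -> stock) }
             .saveToEs("inventory/low") // hypothetical index/type
        }

      def main(args: Array[String]): Unit = {
        index(BusinessLogicSketch.lowInventory)
        // Re-order events could be produced back to the low inventory topic from
        // the same stream (a plain Kafka producer inside foreachPartition).
        StreamingSketch.ssc.start()
        StreamingSketch.ssc.awaitTermination()
      }
    }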
