
Kafka connect-london-meetup-2016


Kafka is the basis for modern data integration infrastructure, and Kafka Connect makes it much better.

Published in: Software

  1. Stream All Things: Real-time Data Integration at Scale with Apache Kafka, by Gwen Shapira
  2. [Architecture diagram. Labeled components: clients, web app, Kafka clients, Flume agents (HDFSEventSink, Solr sink), local cache, Hadoop Cluster I and Hadoop Cluster II (storage, processing), HDFS, HBase / memory, Hive/Impala, MapReduce, Spark, Spark Streaming, Solr search. Labeled activities: automated and manual analytical adjustments and pattern detection; fetching and updating profiles; adjusting NRT stats; batch time adjustments; automated and manual review of NRT changes and counters.]
  3. Data Integration: getting data to all the right places
  4. Introducing Kafka Connect: large-scale streaming data import/export for Kafka
  5. Delivery Guarantees: Offsets are automatically committed and restored. On restart, a task checks its offsets and rewinds. At-least-once delivery: flush data, then commit. Exactly-once delivery for connectors that support it (e.g. HDFS).
  6. Converters: Abstract serialization: 1 connector, many serialization formats. Converters translate between the Kafka Connect Data API (used by connectors) and serialized bytes (stored in Kafka). JSON and Avro are currently well supported.
  7. Connectors Today: Confluent Open Source: HDFS, JDBC. Connector Hub: connectors.confluent.io. Examples: MySQL, MongoDB, Twitter, Solr, S3, MQTT, Bloomberg, Apache Ignite, Attunity, Couchbase, Vertica, Cassandra, HBase, Kudu, Mixpanel, Syslog, and more.
  8. Connectors from the Hackathon: Jenkins connector (Aravind Yarram, Equifax); Twitter semantic analysis and visualization (Ashish Singh, Cloudera); brain monitoring device connector (Silicon Valley Data Science); DynamoDB, Cassandra, Slack, Splunk, and many more.
  9. Coming soon: Improved connector control via REST API, standardized configs, and metrics; single record transformations; data pipelines in an app (embedded mode and Kafka Streams integration); many more connectors.
  10. THANK YOU! Gwen Shapira | gwen@confluent.io | @gwenshap. Visit us in the Confluent Booth (#217). Kafka: The Definitive Guide: book giveaway and signing. Making Sense of Stream Processing: book giveaway. Kafka Training with Confluent University: Kafka Developer and Operations courses. Visit www.confluent.io/training
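
The "flush data, then commit" rule on the Delivery Guarantees slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: `ToySinkTask`, `run`, and the in-memory `log` are hypothetical names invented for illustration and are not the Kafka Connect API. The sketch shows why committing offsets only after a successful flush gives at-least-once delivery: a crash between flush and commit makes the restarted task rewind and re-deliver a batch, so records may be duplicated but are never lost.

```python
from dataclasses import dataclass, field


@dataclass
class ToySinkTask:
    """Hypothetical stand-in for a sink task; NOT the Kafka Connect API."""
    committed_offset: int = 0                     # where to resume after a restart
    sink: list = field(default_factory=list)      # the external system's contents
    buffer: list = field(default_factory=list)    # records received but not yet flushed

    def put(self, record):
        self.buffer.append(record)

    def flush(self):
        # Write everything buffered to the external system.
        self.sink.extend(self.buffer)
        self.buffer.clear()

    def commit(self, next_offset):
        # Record progress only after the data has been flushed.
        self.committed_offset = next_offset


def run(task, log, batch_size, crash_before_commit_at=None):
    """Process `log` from the last committed offset.

    Returns True when the whole log was processed, or False when a
    simulated crash happens after a flush but before the commit.
    """
    position = task.committed_offset  # on restart: check offsets and rewind
    task.buffer.clear()               # anything unflushed did not survive the crash
    while position < len(log):
        batch = log[position:position + batch_size]
        for record in batch:
            task.put(record)
        position += len(batch)
        task.flush()                  # 1. flush data
        if crash_before_commit_at == position:
            return False              # simulated crash: the commit never happens
        task.commit(position)         # 2. then commit
    return True


log = ["a", "b", "c", "d", "e"]
task = ToySinkTask()

first_run_finished = run(task, log, batch_size=2, crash_before_commit_at=4)
second_run_finished = run(task, log, batch_size=2)  # restart after the crash

print(task.sink)              # batch ["c", "d"] appears twice, nothing is missing
print(task.committed_offset)
```

In this toy run the batch that was flushed but not committed is written a second time after the restart. A connector that offers exactly-once delivery has to prevent that duplicate, for example by making its writes idempotent or by storing offsets together with the data; how a particular connector does this is not covered by the slides.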
