
What's new in Confluent 3.2 and Apache Kafka 0.10.2

With the introduction of the Connect and Streams APIs in 2016, Apache Kafka is becoming the de facto solution for anyone looking to build a streaming platform. The community continues to add capabilities to make it the complete solution for streaming data.

Join us as we review the latest additions in Apache Kafka 0.10.2. In addition, we’ll cover what’s new in Confluent Enterprise 3.2 that makes it possible to run Kafka at scale.

Published in: Software

  1. 1. 1 What’s new in Confluent 3.2? Clarke Patterson Sr. Director, Product Marketing
  2. 2. 2 Attend the whole series!
     https://www.confluent.io/online-talk/online-talk-series-five-steps-to-production-with-apache-kafka/
     • What’s New in Apache Kafka 0.10.2 and Confluent 3.2. Date: Thursday, March 9, 2017. Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET. Speaker: Clarke Patterson, Senior Director, Product Marketing
     • Monitoring and Alerting Apache Kafka with Confluent Control Center. Date: Thursday, March 16, 2017. Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET. Speaker: Nick Dearden, Director, Engineering and Product
     • Data Pipelines Made Simple with Apache Kafka. Date: Thursday, March 23, 2017. Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET. Speaker: Ewen Cheslack-Postava, Engineer, Confluent
     • Using Apache Kafka to Analyze Session Windows. Date: Thursday, March 30, 2017. Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET. Speaker: Michael Noll, Product Manager, Confluent
     • Simplify Governance for Streaming Data in Apache Kafka. Date: Thursday, April 6, 2017. Time: 9:30 am - 10:00 am PT | 12:30 pm - 1:00 pm ET. Speaker: Gwen Shapira, Product Manager, Confluent
  3. 3. 3 Key themes for 3.2
     • Less Effort (Monitoring and Alerting in Confluent Control Center): Confluent makes operating Kafka a snap. Confluent Control Center brings visibility into the health of a cluster so it’s easy to surface only those trouble spots that count.
     • More Apps (.NET client): Confluent offers the most robust set of clients and connectors, making it easy to onboard more apps in a streaming platform.
     • Bridge to Cloud (S3 Connector): Build real-time streaming pipelines directly to Amazon with the new S3 connector.
  4. 4. 4 Apache Kafka™ Connect API – Streaming Data Capture
     [Diagram: sources and sinks (JDBC, Mongo, MySQL, Elastic, Cassandra, HDFS) attached through connectors and the Kafka Connect API to a Kafka pipeline]
     • Fault tolerant
     • Manage hundreds of data sources and sinks
     • Preserves data schema
     • Part of Apache Kafka project
     • Integrated within Confluent Platform’s Control Center
  5. 5. 5 Single Message Transforms for Kafka Connect
     Modify events before storing in Kafka:
     • Mask sensitive information
     • Add identifiers
     • Tag events
     • Store lineage
     • Remove unnecessary columns
     Modify events going out of Kafka:
     • Route high priority events to faster data stores
     • Direct events to different Elasticsearch indexes
     • Cast data types to match destination
     • Remove unnecessary columns
  6. 6. 6 Single Message Transforms Use Cases
     • Data masking: Mask sensitive information while sending it to Kafka.
       E.g.: You capture data from a relational database to Kafka, but the data includes PCI / PII information and your Kafka cluster is not certified yet. SMT allows
     • Event routing: Modify an event’s destination based on the contents of the event (applies to events that need to be written to different database tables).
       E.g.: You write events from Kafka to Elasticsearch, but each event needs to go to a different index, based on information in the event itself.
     • Event enhancement: Add fields to events while replicating.
       E.g.: You capture events from multiple data sources to Kafka and want to include information about the source of the data in the event.
     • Partitioning: Set the key for the event based on event information before it gets written to Kafka.
       E.g.: When reading records from a database table, partition the records in Kafka based on customer ID.
     • Timestamp conversion: Standardize time-based data when integrating different systems.
       E.g.: There are many different ways to represent time. Often, Kafka events are read from logs, which use a format like "[2017-01-31 05:21:00,298]", but the key-value store the events are written to prefers dates as "milliseconds since 1970".
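The single message transform use cases above can be sketched in plain Python. This is only an illustration of what each kind of transform does to one event; it is not the Kafka Connect transformation API, and the function names, field names, index naming scheme, and the assumption that log timestamps are in UTC are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def mask_fields(event, fields):
    """Data masking: return a copy of the event with the named fields blanked."""
    return {k: ("" if k in fields else v) for k, v in event.items()}


def add_source(event, source):
    """Event enhancement: return a copy of the event tagged with its origin."""
    return {**event, "source": source}


def route_to_index(event, field, prefix="events"):
    """Event routing: derive a destination index name from a field in the event."""
    return f"{prefix}-{event.get(field, 'unknown')}"


def key_from_field(event, field):
    """Partitioning: use one field (e.g. a customer ID) as the record key."""
    return event[field], event


def log_time_to_epoch_ms(text):
    """Timestamp conversion: "[2017-01-31 05:21:00,298]" -> ms since 1970.

    Assumes the log timestamp is in UTC (an assumption, not stated in the slides).
    """
    parsed = datetime.strptime(text.strip("[]"), "%Y-%m-%d %H:%M:%S,%f")
    parsed = parsed.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


event = {"customer_id": 42, "card_number": "4111111111111111", "priority": "high"}
masked = mask_fields(event, {"card_number"})
tagged = add_source(masked, "orders_db")
key, value = key_from_field(tagged, "customer_id")
index = route_to_index(tagged, "priority")
```

Each function returns a new dictionary rather than mutating its input, which mirrors the idea that a transform maps one event to one (modified) event.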
  7. 7. 7 Architecture of Kafka Streams API, a Part of Apache Kafka
     [Diagram: Kafka Streams API, Producer, Kafka Cluster with three Topics, two Consumers]
     Key benefits
     • No additional cluster
     • Easy to run as a service
     • Supports large aggregations and joins
     • Security and permissions fully integrated from Kafka
     Example Use Cases
     • Microservices
     • Continuous queries
     • Continuous transformations
     • Event-triggered processes
  8. 8. 8 Windowing. How do we find patterns in the noise? [Diagram: events for Alice, Bob, and Dave on an event-time axis]
  9. 9. 9 Tumbling windows answer a different type of question [Diagram: events for Alice, Bob, and Dave on an event-time axis, divided into 5-minute windows] E.g.: “How many downloads did we have per user in the last 5 minutes?”
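A minimal sketch of the tumbling-window question above, in plain Python rather than the Kafka Streams API. It assumes events arrive as hypothetical (user, timestamp in milliseconds) pairs and assigns each event to exactly one fixed, non-overlapping 5-minute window.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling windows


def window_start(ts_ms, size_ms=WINDOW_MS):
    """Return the start of the fixed window that contains ts_ms."""
    return ts_ms - (ts_ms % size_ms)


def count_per_user(events, size_ms=WINDOW_MS):
    """Count events per (user, window start) pair."""
    counts = Counter()
    for user, ts_ms in events:
        counts[(user, window_start(ts_ms, size_ms))] += 1
    return dict(counts)


downloads = [("alice", 0), ("alice", 299_999), ("alice", 300_000), ("bob", 10)]
per_window = count_per_user(downloads)
```

Because window boundaries are fixed multiples of the window size, an event at 299,999 ms and one at 300,000 ms land in different windows even though they are one millisecond apart.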
  10. 10. 10 Session windows allow us to group events based on periods of inactivity [Diagram: events for Alice, Bob, and Dave on an event-time axis]
  11. 11. 11 Session windows allow us to group events based on periods of inactivity [Diagram: events for Alice, Bob, and Dave on an event-time axis, with an inactivity period marked] E.g.: “How many shows does Alice watch on average per session?”
  12. 12. 12 Session windows allow us to group events based on periods of inactivity [Diagram: events for Alice, Bob, and Dave on an event-time axis] E.g.: “How many shows does Alice watch on average per session?”
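The idea of grouping by inactivity can be sketched as follows. This is plain Python for illustration, not the Kafka Streams API; the timestamps, the gap value, and the choice to treat a gap exactly equal to the inactivity period as the same session are assumptions.

```python
def sessions(timestamps, gap_ms):
    """Split one user's event timestamps into sessions separated by inactivity."""
    result = []
    for ts in sorted(timestamps):
        if result and ts - result[-1][-1] <= gap_ms:
            result[-1].append(ts)  # still within the inactivity gap
        else:
            result.append([ts])    # gap exceeded: start a new session
    return result


def average_per_session(timestamps, gap_ms):
    """Average number of events (e.g. shows watched) per session."""
    grouped = sessions(timestamps, gap_ms)
    return sum(len(s) for s in grouped) / len(grouped) if grouped else 0.0


alice_shows = [0, 10, 100, 105, 500]
alice_sessions = sessions(alice_shows, gap_ms=20)
alice_average = average_per_session(alice_shows, gap_ms=20)
```

Unlike tumbling windows, the session boundaries here depend on the data itself: a session lasts as long as events keep arriving within the gap.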
  13. 13. 13 What about late arriving data? [Diagram: events for Alice, Bob, and Dave on an event-time axis]
  14. 14. 14 Sessions potentially merge as new events arrive Session Window
  15. 15. 15 What about late arriving data? [Diagram: events for Alice, Bob, and Dave on an event-time axis]
  16. 16. 16 Session windows handle late arriving data [Diagram: events for Alice, Bob, and Dave on an event-time axis]
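How a late event can cause sessions to merge can be sketched like this. It is a simplified illustration in plain Python, not how Kafka Streams stores or merges sessions internally; sessions are represented as hypothetical (start, end) pairs and are assumed to be already separated by more than the inactivity gap.

```python
def add_event(existing, ts, gap_ms):
    """Add an event at time ts, merging every session within gap_ms of it."""
    start, end = ts, ts
    kept = []
    for s, e in existing:
        if s - gap_ms <= ts <= e + gap_ms:
            # The event falls inside or next to this session: absorb it.
            start, end = min(start, s), max(end, e)
        else:
            kept.append((s, e))
    kept.append((start, end))
    return sorted(kept)


before = [(0, 10), (30, 40)]
after_late_event = add_event(before, ts=20, gap_ms=10)
after_new_event = add_event(before, ts=100, gap_ms=10)
```

An event arriving late at time 20 bridges the two earlier sessions into one, while an event at time 100 simply opens a new session.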
  17. 17. 17 Kafka Clients
      • Apache Kafka Native Clients
      • Confluent Native Clients
      • Community Supported Clients
      • Proxy: http/REST, stdin/stdout
  18. 18. 18 Confluent 3.2 – C# Client
      • Supported, fully featured native C# client
      • High performance
      • Full support of Kafka protocol and features
      • Integrates with Confluent’s Schema Registry
      • Works with any version of Apache Kafka
      • High reliability: honors Kafka ack settings and retries
  19. 19. 19 Confluent 3.2 – JMS Client
      • Supported Kafka client, implementing the JMS interface
      • Secure clients with authentication, authorization and encryption
      • Integrates with Confluent’s Schema Registry
      • High reliability: supports Kafka and JMS acknowledgments
      • Support for all JMS Message Types, Headers and Properties
  20. 20. 20 Confluent 3.2 – Client Security
      • End-to-end encryption for REST Proxy
      • Active Directory integration for C# client
  21. 21. 21 Kafka Connect API Library of Connectors
      Categories: Databases, Datastore/File Store, Analytics, Applications / Other
      * Denotes connectors developed at Confluent and distributed by Confluent. Extensive validation and testing has been performed.
  22. 22. 22 CP 3.2 – New Certified & Supported Connectors
      S3 Connector
      • Write Avro and JSON files
      • Date and time based partitions
      • Exactly-once delivery
  23. 23. 23 Confluent 3.2 – Cluster Health & Administration
      Cluster health dashboard
      • Monitor the health of your Kafka clusters and get alerts if any problems occur
      • Measure system load, performance, and operations
      • View aggregate statistics or drill down by broker or topic
      Cluster administration
      • Monitor topic configurations
  24. 24. 24 What’s new in Confluent 3.2?
      Feature: Benefit (comparison columns: Apache Kafka, Confluent Open Source, Confluent Enterprise)
      • Single message transformations: Modify single events before storing in Kafka or as they leave Kafka
      • Session windows: Group events in a stream based on session windows
      • C# client: Supported, fully featured native C# client
      • Client security: Active Directory integration for C# and end-to-end encryption for REST Proxy
      • S3 connector: Easily write Avro and JSON files to Amazon S3
      • JMS client: Supported Kafka client implementing the JMS interface
      • Cluster health monitoring: Monitor the health of Kafka clusters and get alerts when problems occur
      • Cluster administration: Simplify the process of administering a Kafka cluster
  25. 25. 25 Confluent Completes Kafka
      Feature: Benefit (comparison columns: Apache Kafka, Confluent Open Source, Confluent Enterprise)
      • Apache Kafka: High throughput, low latency, high availability, secure distributed streaming platform
      • Kafka Connect API: Advanced API for connecting external sources/destinations into Kafka
      • Kafka Streams API: Simple library that enables streaming application development within the Kafka framework
      • Additional Clients: Supports non-Java clients; C, C++, Python, .NET and several others
      • REST Proxy: Provides universal access to Kafka from any network connected device via HTTP
      • Schema Registry: Central registry for the format of Kafka data; guarantees all data is always consumable
      • Pre-Built Connectors: HDFS, JDBC, Elasticsearch, Amazon S3 and other connectors fully certified and supported by Confluent
      • Confluent Control Center: Enables easy connector management, monitoring and alerting for a Kafka cluster
      • Auto Data Balancer: Rebalancing data across cluster to remove bottlenecks
      • Replicator: Multi-datacenter replication simplifies and automates MDC Kafka clusters
      • Support: Enterprise class support to keep your Kafka environment running at top performance (Apache Kafka: Community; Confluent Open Source: Community; Confluent Enterprise: 24x7x365)
  27. 27. 27 Why Confluent? More than just enterprise software
      Complete support across the entire adoption lifecycle
      • Confluent Platform: The only enterprise open source streaming platform based entirely on Apache Kafka
      • Professional Services: Best-practice consultation for future Kafka deployments, and optimization of performance and scalability for existing ones
      • Enterprise Support: 24x7 support for the entire Apache Kafka project, not just a portion of it
      • Kafka Training: Comprehensive hands-on courses for developers and operators from the Apache Kafka experts
  28. 28. 28 Get Started with Apache Kafka Today! https://www.confluent.io/downloads/
      THE place to start with Apache Kafka!
      • Thoroughly tested and quality assured
      • More extensible developer experience
      • Easy upgrade path to Confluent Enterprise
  29. 29. 29 Discount code: kafcom17
      Use the Apache Kafka community discount code to get $50 off. www.kafka-summit.org
      • Kafka Summit New York: May 8
      • Kafka Summit San Francisco: August 28