Creating Connector to bridge the worlds of
Kafka and gRPC at Wework
-- Anoop Dixith
Agenda
❏ Role of Kafka at Wework
❏ - Why Kafka? Why gRPC?
❏ Life without Connect
❏ Why Connectors?
❏ gRPC sink Connector at WeWork
❏ - Configuration, implementation, monitoring
Architecture (publisher side)
Architecture (continued)
❏ Importance of the architecture
❏ Kafka-gRPC combination
❏ Java/Scala-Go
❏ Why gRPC?
❏ How did the combination work so well?
Architecture (continued)
gRPC
❏ Machine-readable API contract, so platform neutral
❏ Protobuf, so polyglot and a faster, smaller, simpler, safer payload
❏ HTTP/2 - multiplexing, binary, safer, future
❏ Streaming vs request/response
❏ Community
Kafka
❏ Throughput
❏ Distributed, HA
❏ Scalability
❏ Complementary components - KStream, KTable, KSQL, Connectors
❏ Community
Life Without Connect
In the very beginning, there was no Connect
Kafka sources were connected to a variety of sinks at WeWork: gRPC,
Elasticsearch, etc. And hey, it worked well.
Yes, but what about
❏ Scalability?
❏ Security?
❏ Configurability?
❏ Error handling?
❏ Extendability?
Do not reinvent the wheel!
Architecture without Connect
Then there was Connect...
Very simply, Kafka Connect is a framework to stream data into and out of Kafka
Properties:
❏ Broad copying, scalability, streaming and batch apps, parallelism.
❏ Does one thing, and does it very well - copying data
❏ Extensible through Connectors
Models:
❏ Connector
❏ Worker
❏ Data
Connect components
❏ Connectors – abstraction that handles data streaming by managing “tasks”
❏ Tasks – the implementation of how data is copied to or from Kafka
❏ Workers – the running processes that execute connectors and tasks
❏ Converters – translate data between Connect and the source/sink
❏ Transforms – alter messages produced by or sent to a connector
❏ Dead Letter Queue – error handling
Types of Connectors
❏ Source Connector - Kinesis, Zendesk, Jira, Twitter, email
❏ Sink Connector - S3, MongoDB, HDFS, YouNameIt
❏ Connect Hub - https://www.confluent.io/hub/
❏ Transforms and Converters are also available
❏ Availability in Confluent Cloud - fully managed
❏ Licenses, levels of verification
Connect flow
Writing your own Connectors. Yes, we can!
❏ Why?
❏ Oh, we don’t have that Connector!
❏ We have a Connector, but we need to customize it to our needs
❏ Complete control on how we want to move the data
❏ Give it back to the community
❏ How?
❏ Kafka Connect API to the rescue!
❏ Implement/extend your Connector, Task, Config interfaces/abstract classes
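As a rough illustration of the split between a connector and its tasks, here is a minimal Python stand-in. The real Kafka Connect API is Java; the class names below are hypothetical, and only the lifecycle shape (a connector producing one config per task, a task that is started, given batches of records, and stopped) is meant to mirror it.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchSinkTask:
    """Hypothetical stand-in for a sink task: receives batches and copies them out."""
    config: Dict[str, str] = field(default_factory=dict)
    delivered: List[dict] = field(default_factory=list)
    started: bool = False

    def start(self, config: Dict[str, str]) -> None:
        # Called once with the config the connector produced for this task.
        self.config = dict(config)
        self.started = True

    def put(self, records: List[dict]) -> None:
        # Called repeatedly with batches of records read from Kafka.
        if not self.started:
            raise RuntimeError("task not started")
        self.delivered.extend(records)

    def stop(self) -> None:
        self.started = False


class SketchSinkConnector:
    """Hypothetical stand-in for a connector: fans its config out to tasks."""

    def __init__(self, config: Dict[str, str]) -> None:
        self.config = dict(config)

    def task_configs(self, max_tasks: int) -> List[Dict[str, str]]:
        # In this sketch every task gets the same settings.
        return [dict(self.config) for _ in range(max_tasks)]
```

The connector itself never touches records here; it only decides how work is configured, while the tasks do the copying.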
Custom Connectors
gRPC Sink Connector
❏ What is it intended to do?
❏ Why not directly sink to underlying databases?
Implementation
❏ SinkRecord
❏ SinkTask
GrpcService Interface
GrpcClient Interface
Crux of bulkSend() AKA how is gRPC Connector different from MySql Connector?
gRPC glossary
❏ stub: generated when protoc is run if a service declaration is in the proto file
❏ service class
❏ rpc name and args
❏ channel: provides a connection to a gRPC server on a specified host and port
❏ along with grpc server url and port
bulkSend()
bulkSend() forms the crux of the gRPC Sink Connector’s data copying
❏ Handles channel readiness (connectivity state)
❏ Manages security and all logic related to error-handling
❏ Controls the rate of data copying
❏ Potentially retry logic
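The responsibilities listed above can be sketched roughly as follows. This is not the connector's actual bulkSend() code, which is not shown in the slides; it is a Python approximation in which `send_one`, `channel_ready`, and all parameter names are assumptions made for illustration.

```python
import time
from typing import Callable, List, Sequence


class ChannelNotReady(Exception):
    """Raised when the (hypothetical) channel never reports ready."""


def bulk_send(
    records: Sequence[dict],
    send_one: Callable[[dict], None],
    channel_ready: Callable[[], bool],
    max_retries: int = 3,
    backoff_seconds: float = 0.1,
    max_per_second: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Send records one by one; return those that still failed after retries."""
    # 1. Channel readiness: wait with exponential backoff until ready.
    for attempt in range(max_retries + 1):
        if channel_ready():
            break
        if attempt == max_retries:
            raise ChannelNotReady("channel not ready after retries")
        sleep(backoff_seconds * (2 ** attempt))

    failed: List[dict] = []
    min_interval = 1.0 / max_per_second if max_per_second > 0 else 0.0
    for record in records:
        # 2. Error handling and retry logic for each call.
        for attempt in range(max_retries + 1):
            try:
                send_one(record)
                break
            except Exception:
                if attempt == max_retries:
                    failed.append(record)  # candidate for a dead letter queue
                else:
                    sleep(backoff_seconds * (2 ** attempt))
        # 3. Rate control: pause between calls when a limit is configured.
        if min_interval:
            sleep(min_interval)
    return failed
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller would normally leave it at the default.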
Limitations of bulkSend()
Uses reflection to get stub classes and methods
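To give a feel for what resolving stubs and rpc methods by name involves, here is a small Python analogue of the idea. The connector itself is Java and would use Java reflection; `GreeterStub`, `SayHello`, and `resolve_rpc` are hypothetical names, not part of the connector or of gRPC.

```python
from typing import Any, Callable


class GreeterStub:
    """Hypothetical stand-in for a generated stub class."""

    def __init__(self, channel: str) -> None:
        self.channel = channel

    def SayHello(self, request: dict) -> dict:
        return {"message": "hello " + request["name"], "via": self.channel}


def resolve_rpc(
    namespace: dict, stub_class_name: str, rpc_name: str, channel: str
) -> Callable[[Any], Any]:
    """Look up a stub class and one of its rpc methods from configured names."""
    stub_class = namespace.get(stub_class_name)
    if stub_class is None:
        raise LookupError(f"unknown stub class: {stub_class_name}")
    stub = stub_class(channel)
    rpc = getattr(stub, rpc_name, None)
    if not callable(rpc):
        raise LookupError(f"{stub_class_name} has no rpc named {rpc_name}")
    return rpc
```

Because the names are resolved at runtime from configuration strings, a misspelled class or rpc name only surfaces when the lookup runs, which is one reason this approach counts as a limitation.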
What you can’t do with bulkSend() (yet)?
Configuration
Configs passed to GrpcSinkConnector object
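The slide's actual config listing is not reproduced in this transcript, so the following is only a guess at its general shape. `connector.class`, `tasks.max`, `topics`, and the `errors.*` keys are standard Kafka Connect settings; the class name and every `grpc.*` key are hypothetical and chosen for illustration.

```python
import json

connector_config = {
    "name": "grpc-sink-example",
    "config": {
        # Standard Kafka Connect settings.
        "connector.class": "com.example.GrpcSinkConnector",  # hypothetical class name
        "tasks.max": "2",
        "topics": "example-events",
        # Framework-level error handling with a dead letter queue.
        "errors.tolerance": "all",
        "errors.deadletterqueue.topic.name": "example-events-dlq",
        # Hypothetical connector-specific keys, for illustration only.
        "grpc.server.host": "localhost",
        "grpc.server.port": "50051",
        "grpc.service.class": "com.example.GreeterGrpc",
        "grpc.rpc.name": "sayHello",
    },
}

REQUIRED_KEYS = ("connector.class", "tasks.max", "topics")


def missing_keys(config: dict) -> list:
    """Return the standard keys this sketch expects but does not find."""
    return [key for key in REQUIRED_KEYS if key not in config]


# JSON body in the name/config shape accepted when creating a connector over REST.
payload = json.dumps(connector_config, indent=2)
```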
Testing, Deployment, and Monitoring
❏ Testing Connector by mocking is hard and tricky as it involves two systems -
Kafka and the external source/sink
❏ Independent unit-testing is recommended for all Task and Connector classes
❏ End-to-end testing using gRPC servers created on the fly in Docker
containers as part of the CircleCI test plan
❏ Extremely difficult to test gRPC channel connectivity states and other error
scenarios
Testing, Deployment, and Monitoring (contd…)
❏ Packaging - for easy installation into Kafka Connect installations
❏ By creating an Archive
❏ create a tarball or ZIP archive
❏ contains a single directory with unique name (name and version likely)
❏ all JAR files and other resource files needed by the connector are in the top-level directory
❏ doesn’t include Kafka Connect API or runtime libraries
❏ By creating an Uber JAR
❏ create an uber JAR that contains all JAR files and other resource files
❏ Installation -
❏ User simply unpacks the archive or places the uber JAR in a directory listed in Kafka
Connect’s plugin path
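The archive layout described above (one uniquely named top-level directory holding the connector's JARs) could be produced along these lines. This is a sketch using Python's standard `tarfile` module; the directory and file names are made up, and a real build would more likely use a Maven or Gradle packaging step.

```python
import io
import tarfile
from typing import Dict


def build_plugin_archive(plugin_dir: str, files: Dict[str, bytes]) -> bytes:
    """Build a .tar.gz whose entries all live under one top-level directory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in sorted(files.items()):
            info = tarfile.TarInfo(name=f"{plugin_dir}/{name}")
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def top_level_dirs(archive_bytes: bytes) -> set:
    """Return the set of top-level directory names found in the archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return {member.name.split("/", 1)[0] for member in archive.getmembers()}
```

Unpacking such an archive into a directory listed in the worker's `plugin.path` setting is then all the installation requires.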
Monitoring Connectors
❏ Monitored via Connect’s extensive REST interface
❏ current status of a connector and its tasks
❏ IDs of the workers to which tasks are assigned
❏ pause/resume APIs
❏ active connectors, connector tasks, restart a connector, restart a task, update config, delete
connector
❏ Logging
❏ Connect comes with default Java-based logging utility Apache Log4j to collect runtime data
and record component events
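The REST operations listed above map onto a handful of endpoints. The helper below only builds (method, URL) pairs and performs no network calls; the function name is hypothetical, and the base URL and connector name in the usage note are assumptions.

```python
from typing import Dict, Tuple
from urllib.parse import quote


def connect_rest_calls(
    base_url: str, connector: str, task_id: int = 0
) -> Dict[str, Tuple[str, str]]:
    """Map each monitoring action to its (HTTP method, URL) pair."""
    root = base_url.rstrip("/")
    name = quote(connector, safe="")
    return {
        "list_connectors": ("GET", f"{root}/connectors"),
        "status": ("GET", f"{root}/connectors/{name}/status"),
        "list_tasks": ("GET", f"{root}/connectors/{name}/tasks"),
        "pause": ("PUT", f"{root}/connectors/{name}/pause"),
        "resume": ("PUT", f"{root}/connectors/{name}/resume"),
        "restart_connector": ("POST", f"{root}/connectors/{name}/restart"),
        "restart_task": ("POST", f"{root}/connectors/{name}/tasks/{task_id}/restart"),
        "update_config": ("PUT", f"{root}/connectors/{name}/config"),
        "delete": ("DELETE", f"{root}/connectors/{name}"),
    }
```

For example, `connect_rest_calls("http://localhost:8083", "grpc-sink")["status"]` gives the request to issue for a connector's current state and that of its tasks.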
Connect metrics and metrics using Prometheus
❏ Reports a variety of metrics through Java Management Extensions (JMX)
❏ task and worker metrics - status, running-ratio,
offset-commit-success-percentage, offset-commit-avg-time-ms, task-count,
connector-count, rebalancing metrics
❏ A variety of client metrics like connection-count, connection-close-rate,
network-io-rate, outgoing-byte-rate, request-rate etc
❏ gRPC Sink Connector metrics:
❏ sink-record-read-rate
❏ sink-record-active-count
❏ sink-record-read-total
❏ sink-record-send-rate
Connect metrics and metrics using Prometheus (contd…)
❏ The monitoring tool Prometheus ingests metrics, makes them graphable, and
helps build alerts on top of metrics
❏ pulls metrics from HTTP endpoints added to the Prometheus configuration file
❏ provides JMX Exporter, a collector that can configurably scrape and expose
mBeans of a JMX target
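One practical detail when exposing JMX metrics to Prometheus: Prometheus metric names may contain only letters, digits, underscores, and colons, so hyphenated names such as sink-record-send-rate have to be rewritten. The sketch below shows one way to do that; the `kafka_connect` prefix is an assumption, and the names an exporter actually emits depend on its configured rules.

```python
import re

_INVALID = re.compile(r"[^a-zA-Z0-9_:]")


def prometheus_name(jmx_name: str, prefix: str = "kafka_connect") -> str:
    """Turn a hyphenated JMX-style metric name into a valid Prometheus name."""
    cleaned = _INVALID.sub("_", jmx_name)
    name = f"{prefix}_{cleaned}" if prefix else cleaned
    # Prometheus names must not start with a digit.
    if re.match(r"[0-9]", name):
        name = "_" + name
    return name
```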
Challenges and Lessons Learned
❏ Configuring more than one rpc in a service
❏ Configuring rpcs with multiple arguments
❏ Testing the two components of the system
❏ Connectors have the capability to be extremely flexible, and can also hide
intricate logic when used off the shelf
Thank you!
Let's connect over questions

Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Dixith, Wework), Kafka Summit 2020

FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Microsoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdfMicrosoft - Power Platform_G.Aspiotis.pdf
Microsoft - Power Platform_G.Aspiotis.pdf
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 

Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Dixith, Wework) Kafka Summit 2020

  • 1. Creating Connector to bridge the worlds of Kafka and gRPC at Wework -- Anoop Dixith
  • 2. Agenda ❏ Role of Kafka at Wework ❏ - Why Kafka? Why gRPC? ❏ Life without Connect ❏ Why Connectors? ❏ gRPC sink Connector at WeWork ❏ - Configuration, implementation, monitoring
  • 3.
  • 5. Architecture (continued) ❏ Importance of the architecture ❏ Kafka-gRPC combination ❏ Java/Scala-Go ❏ Why gRPC? ❏ How the combination worked perfectly?
  • 6. Architecture (continued) gRPC: machine-readable API contract, so platform neutral; Protobuf, so polyglot and faster, smaller and simpler, safer payload; HTTP/2 - multiplexing, binary, safer, future; streaming vs request/response; community. Kafka: throughput; distributed, HA; scalability; complementary components - KStream, KTable, KSQL, Connectors; community.
  • 7. Life Without Connect In the very beginning, there was no Connect. Kafka sources were connected to a variety of sinks at Wework: gRPC, Elasticsearch, etc. And hey, it worked well. Yes, but what about ❏ Scalability? ❏ Security? ❏ Configurability? ❏ Error handling? ❏ Extensibility? Do not reinvent the wheel!
  • 9. Then there was Connect... Very simply, Kafka Connect is a framework to stream data into and out of Kafka. Properties: ❏ Broad copying, scalability, streaming and batch apps, parallelism. ❏ Does one thing, and one thing only, very well - copying data ❏ Extensible through Connectors Models: ❏ Connector ❏ Worker ❏ Data
  • 10. Connect components ❏ Connectors – abstraction that handles data streaming by managing “tasks” ❏ Tasks – the implementation of how data is copied to or from Kafka ❏ Workers – the running processes that execute connectors and tasks ❏ Converters – translate data between Connect and the source/sink ❏ Transforms – alter messages produced by or sent to a connector ❏ Dead Letter Queue – error handling
  • 11. Types of Connectors ❏ Source Connector - Kinesis, Zendesk, Jira, Twitter, email ❏ Sink Connector - S3, MongoDB, HDFS, YouNameIt ❏ Connect Hub - https://www.confluent.io/hub/ ❏ Different Transforms and Converters are also available ❏ Availability in Confluent Cloud - fully managed ❏ Licenses, levels of verification
  • 13. Writing your own Connectors. Yes, we can! ❏ Why? ❏ Oh, we don’t have that Connector! ❏ We have a Connector, but we need to customize it to our needs ❏ Complete control on how we want to move the data ❏ Give it back to the community ❏ How? ❏ Kafka Connect API to the rescue! ❏ Implement/extend your Connector, Task, Config interfaces/abstract classes
  • 15. gRPC Sink Connector ❏ What is it intended to do? ❏ Why not directly sink to underlying databases?
  • 18. GrpcClient Interface Crux of bulkSend() AKA how is gRPC Connector different from MySql Connector? gRPC glossary ❏ stub: generated when protoc is run if a service declaration is in the proto file ❏ service class ❏ rpc name and args ❏ channel: provides a connection to a gRPC server on a specified host and port ❏ along with grpc server url and port
  • 19. bulkSend() bulkSend() forms the crux of the gRPC Sink Connector’s data copying ❏ Handles channel readiness (connectivity state) ❏ Manages security and all logic related to error-handling ❏ Controls the rate of data copying ❏ Potentially retry logic
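The responsibilities listed for bulkSend() (rate control and retry logic in particular) can be illustrated with a minimal Java sketch. This is not the WeWork implementation: `BulkSendSketch`, `RecordSender`, and the parameters `batchSize`, `maxRetries` and `baseBackoffMs` are hypothetical names chosen for illustration, and a real connector would call a gRPC stub where the sketch calls the `RecordSender` callback.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of batching plus retry; not the connector's actual code.
final class BulkSendSketch {

    // Stand-in for the gRPC call; a real connector would invoke a stub here.
    interface RecordSender {
        void send(List<String> batch);
    }

    // Exponential backoff: baseMs * 2^attempt, capped at maxMs.
    static long backoffMillis(int attempt, long baseMs, long maxMs) {
        long delay = baseMs << Math.min(attempt, 20);
        return Math.min(delay, maxMs);
    }

    // Splits records into fixed-size batches to bound how much is sent per call.
    static List<List<String>> toBatches(List<String> records, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            int end = Math.min(i + batchSize, records.size());
            batches.add(new ArrayList<>(records.subList(i, end)));
        }
        return batches;
    }

    // Sleeps, converting an interrupt into an unchecked exception.
    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while backing off", e);
        }
    }

    // Sends every batch, retrying a failed batch up to maxRetries times.
    // Returns the number of send attempts made across all batches.
    static int bulkSend(List<String> records, int batchSize, int maxRetries,
                        long baseBackoffMs, RecordSender sender) {
        int attempts = 0;
        for (List<String> batch : toBatches(records, batchSize)) {
            for (int attempt = 0; ; attempt++) {
                attempts++;
                try {
                    sender.send(batch);
                    break; // batch delivered, move on to the next one
                } catch (RuntimeException e) {
                    if (attempt >= maxRetries) {
                        throw e; // retries exhausted; let the caller decide
                    }
                    sleepQuietly(backoffMillis(attempt, baseBackoffMs, 1_000));
                }
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fails on the first call only, to exercise one retry.
        RecordSender flaky = batch -> {
            if (calls[0]++ == 0) {
                throw new IllegalStateException("channel not ready");
            }
        };
        int attempts = bulkSend(List.of("a", "b", "c"), 2, 3, 1, flaky);
        System.out.println("attempts=" + attempts);
    }
}
```

Keeping the retry loop per batch means a transient failure only re-sends the batch that failed, not the records already delivered.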
  • 20. Limitations of bulkSend() Uses reflection to get stub classes and methods What you can’t do with bulkSend() (yet)?
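As a rough illustration of what "uses reflection to get stub classes and methods" can look like, the Java sketch below resolves a class and a method by name at runtime and invokes it. It is an assumption-laden sketch, not the connector's code: `ReflectiveCallSketch` and `FakeGreeterStub` are made-up names, and `FakeGreeterStub` merely stands in for a protoc-generated stub. The sketch only handles a method with exactly one parameter.

```java
import java.lang.reflect.Method;

// Hypothetical sketch: resolve a "stub" class and rpc method by name at runtime.
final class ReflectiveCallSketch {

    // Stand-in for a protoc-generated stub; the class and method names are made up.
    public static final class FakeGreeterStub {
        public String sayHello(String request) {
            return "hello " + request;
        }
    }

    // Loads the class by its configured name, finds the one-argument method
    // whose parameter type matches the request, and invokes it.
    static Object invoke(String stubClassName, String rpcName, Object request) {
        try {
            Class<?> stubClass = Class.forName(stubClassName);
            Object stub = stubClass.getDeclaredConstructor().newInstance();
            Method rpc = stubClass.getMethod(rpcName, request.getClass());
            return rpc.invoke(stub, request);
        } catch (ReflectiveOperationException e) {
            // A bad class or method name is treated as a configuration error.
            throw new IllegalArgumentException(
                    "cannot call " + stubClassName + "#" + rpcName, e);
        }
    }

    public static void main(String[] args) {
        String className = FakeGreeterStub.class.getName();
        System.out.println(invoke(className, "sayHello", "kafka"));
    }
}
```

Because the class and method names arrive as plain strings, mistakes in them surface only at runtime, which is one general cost of a reflective design.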
  • 21. Configuration Configs passed to GrpcSinkConnector object
  • 22. Testing, Deployment, and Monitoring ❏ Testing a Connector by mocking is tricky because it involves two systems - Kafka and the external source/sink ❏ Independent unit-testing is recommended for all Task and Connector classes ❏ End-to-end testing using gRPC servers created on the fly in Docker containers as part of the CircleCI test plan ❏ Extremely difficult to test gRPC channel connectivity states and other error scenarios
  • 23. Testing, Deployment, and Monitoring (contd…) ❏ Packaging - for easy installation into Kafka Connect installations ❏ By creating an Archive ❏ create a tarball or ZIP archive ❏ contains a single directory with a unique name (likely name and version) ❏ all JAR files and other resource files needed by the connector are in the top-level directory ❏ doesn’t include Kafka Connect API or runtime libraries ❏ By creating an Uber JAR ❏ create an uber JAR that contains all JAR files and other resource files ❏ Installation - ❏ User simply unpacks the archive or places the uber JAR in a directory listed in Kafka Connect’s plugin path
  • 24. Monitoring Connectors ❏ Monitored via Connect’s extensive REST interface ❏ current status of a connector and its tasks ❏ IDs of the workers to which tasks are assigned ❏ pause/resume APIs ❏ list active connectors and connector tasks, restart a connector, restart a task, update config, delete a connector ❏ Logging ❏ Connect comes with the default Java-based logging utility Apache Log4j to collect runtime data and record component events
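The REST operations named on this slide map onto a handful of Connect endpoints. The Java sketch below only builds the requests (it does not send them) so the paths and HTTP methods are easy to see. The base URL `http://localhost:8083` and the connector name `grpc-sink` are assumptions for illustration, and `ConnectRestSketch` is a hypothetical helper, not part of Kafka Connect.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper that builds requests against the Kafka Connect REST API.
final class ConnectRestSketch {

    // GET /connectors/{name}/status: state of the connector and its tasks,
    // including the worker each one is assigned to.
    static HttpRequest status(String baseUrl, String connector) {
        return HttpRequest.newBuilder(
                URI.create(baseUrl + "/connectors/" + connector + "/status"))
                .GET()
                .build();
    }

    // PUT /connectors/{name}/pause: pauses the connector and its tasks.
    static HttpRequest pause(String baseUrl, String connector) {
        return HttpRequest.newBuilder(
                URI.create(baseUrl + "/connectors/" + connector + "/pause"))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // PUT /connectors/{name}/resume: resumes a paused connector.
    static HttpRequest resume(String baseUrl, String connector) {
        return HttpRequest.newBuilder(
                URI.create(baseUrl + "/connectors/" + connector + "/resume"))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // POST /connectors/{name}/tasks/{id}/restart: restarts a single task.
    static HttpRequest restartTask(String baseUrl, String connector, int taskId) {
        return HttpRequest.newBuilder(
                URI.create(baseUrl + "/connectors/" + connector
                        + "/tasks/" + taskId + "/restart"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) {
        String base = "http://localhost:8083"; // assumed worker address
        String name = "grpc-sink";             // assumed connector name
        for (HttpRequest r : new HttpRequest[] {
                status(base, name), pause(base, name),
                resume(base, name), restartTask(base, name, 0)}) {
            System.out.println(r.method() + " " + r.uri());
        }
    }
}
```

To actually issue one of these, pass it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running Connect worker.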
  • 25. Connect metrics and metrics using Prometheus ❏ Reports a variety of metrics through Java Management Extensions (JMX) ❏ task and worker metrics - status, running-ratio, offset-commit-success-percentage, offset-commit-avg-time-ms, task-count, connector-count, rebalancing metrics ❏ A variety of client metrics like connection-count, connection-close-rate, network-io-rate, outgoing-byte-rate, request-rate etc ❏ gRPC Sink Connector metrics: ❏ sink-record-read-rate ❏ sink-record-active-count ❏ sink-record-read-total ❏ sink-record-send-rate
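Sink-side metrics such as sink-record-send-rate are exposed as attributes on JMX MBeans, so they can be read with the standard `javax.management` API. The Java sketch below assumes the MBeans are registered under the `kafka.connect` domain with `type=sink-task-metrics`; `SinkMetricsSketch` is a hypothetical helper. It queries the MBean server of the JVM it runs in, so outside a Connect worker it simply finds nothing.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper that reads sink task metrics over JMX.
final class SinkMetricsSketch {

    // Pattern matching every sink task MBean, whatever the connector and task id.
    static ObjectName sinkTaskPattern() {
        try {
            return new ObjectName("kafka.connect:type=sink-task-metrics,*");
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the given MBean name would be selected by the pattern above.
    static boolean matches(String mbeanName) {
        try {
            return sinkTaskPattern().apply(new ObjectName(mbeanName));
        } catch (MalformedObjectNameException e) {
            return false;
        }
    }

    // Reads sink-record-send-rate from every matching MBean on the server.
    static Map<String, Object> readSendRates(MBeanServer server) {
        Map<String, Object> rates = new TreeMap<>();
        for (ObjectName name : server.queryNames(sinkTaskPattern(), null)) {
            try {
                rates.put(name.getCanonicalName(),
                        server.getAttribute(name, "sink-record-send-rate"));
            } catch (JMException e) {
                // Attribute missing or MBean unregistered meanwhile; skip it.
            }
        }
        return rates;
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        System.out.println(readSendRates(server));
    }
}
```

For a worker running in another process, the same queries would go through a remote JMX connection, or through an exporter that republishes the MBeans over HTTP.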
  • 26. Connect metrics and metrics using Prometheus (contd…) ❏ The monitoring tool Prometheus ingests metrics, makes them graphable, and helps build alerts on top of metrics ❏ pulls metrics from HTTP endpoints added to the Prometheus configuration file ❏ provides JMX Exporter, a collector that can configurably scrape and expose MBeans of a JMX target
  • 27. Challenges and Lessons Learned ❏ Configuring more than one rpc in a service ❏ Configuring rpcs with multiple arguments ❏ Testing the two components of the system ❏ Connectors have the capability to be extremely flexible, and can also hide intricate logic when used off the shelf
  • 28. Thank you! Let's Connect over questions