Schema Registry 101
Bill Bejeck
@bbejeck
Nice to meet you!
• Member of the DevX team
• Prior to DevX ~3 years as engineer on Kafka Streams team
• Apache Kafka® Committer and PMC member
• Author of “Kafka Streams in Action” - 2nd edition underway!
Why Schema Registry?
A quick Apache Kafka® review
Kafka – an append-only, distributed log
Kafka brokers work with bytes only
Clients produce and consume
Event objects form a contract
An implicit contract
Expectation of object structure
Potentially make unexpected changes
Use Schema Registry FTW!
Working with Schema Registry
1. Write/download a schema
2. Test and upload the schema
3. Generate the objects
4. Configure clients (Producer, Consumer, Kafka Streams)
5. Write your application!
• Confluent Cloud UI
• Schema Registry REST API / Confluent CLI
• Producer and Consumer clients
• Schema Registry console producer and consumer
• Tools
• Gradle and Maven plugins
Build a Schema
• Avro
• Protocol Buffers
• JSON Schema
Build a Schema (Avro)
{
"type":"record",
"namespace": "io.confluent.developer.avro",
"name":"Purchase",
"fields": [
{"name": "item", "type":"string"},
{"name": "amount", "type": "double", "default": 0.0},
{"name": "customer_id", "type": "string"}
]
}
Build a Schema (Protobuf)
syntax = "proto3";
package io.confluent.developer.proto;
option java_outer_classname = "PurchaseProto";
message Purchase {
string item = 1;
double amount = 2;
string customer_id = 3;
}
Build a Schema (complex types)
Avro
{"name": "list", "type": {"type": "array", "items": "string", "default": []}}
{"name": "numbers", "type": {"type": "map", "values": "long", "default": {}}}
Protobuf
repeated string strings = 1;
map<string, string> projects = 2;
map<string, Message> myOtherMap = 3;
Working with schemas
Register
Producer clients can "auto-register"
producerConfigs.put("auto.register.schemas", true);
Not for production!!!
View/Retrieve
Schema lifecycle
Integrating Schema Registry
Clients
• basic.auth.credentials.source=USER_INFO
• schema.registry.url=https://<CLUSTER>.us-east-2.aws.confluent.cloud
• basic.auth.user.info=API_KEY:API_SECRET
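In a Java client, these settings go into the same configuration map as the rest of the client properties. A minimal sketch, with the endpoint and credentials as placeholders:

```java
import java.util.Properties;

public class SchemaRegistryConfigs {

    // Base Schema Registry connection settings shared by producer,
    // consumer, and Kafka Streams configurations. All values here
    // are placeholders for your own cluster endpoint and API key.
    public static Properties baseConfigs() {
        Properties props = new Properties();
        props.put("schema.registry.url",
                "https://<CLUSTER>.us-east-2.aws.confluent.cloud");
        props.put("basic.auth.credentials.source", "USER_INFO");
        props.put("basic.auth.user.info", "API_KEY:API_SECRET");
        return props;
    }
}
```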
Clients - Producer
producerConfigs.put("value.serializer", ? );
• KafkaAvroSerializer.class
• KafkaProtobufSerializer.class
• KafkaJsonSchemaSerializer.class
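Filled in for Avro, a producer configuration might look like the sketch below. The serializer class names are given as strings so the snippet stands alone without the Kafka and Confluent jars on the classpath; the bootstrap server and registry URL are placeholders:

```java
import java.util.Properties;

public class AvroProducerConfigs {

    // Producer configuration for Avro values. The value serializer looks up
    // (or registers) the schema in Schema Registry and writes the schema ID
    // plus the Avro-encoded bytes into each record value.
    public static Properties producerConfigs() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "https://...");   // placeholder
        return props;
    }
}
```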
Clients – Consumer
consumerConfigs.put("value.deserializer", ? );
• KafkaAvroDeserializer.class
• KafkaProtobufDeserializer.class
• KafkaJsonSchemaDeserializer.class
• specific.avro.reader = true|false
• specific.protobuf.value.type = proto class name
• json.value.type = class name
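Putting the deserializer and the type setting together for Avro, a consumer configuration sketch (again with class names as strings and placeholders for the broker and registry):

```java
import java.util.Properties;

public class AvroConsumerConfigs {

    // Consumer configuration for Avro values. With specific.avro.reader=true
    // the deserializer returns the generated SpecificRecord subclass; with
    // false (the default) it returns GenericRecord instead.
    public static Properties consumerConfigs() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        props.put("group.id", "purchases-group");          // placeholder
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("specific.avro.reader", "true");
        props.put("schema.registry.url", "https://...");   // placeholder
        return props;
    }
}
```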
Clients – Consumer Returned Types
Avro
• SpecificRecord – myObj.getFoo()
• GenericRecord - generic.get("foo")
Protobuf
• Specific
• Dynamic
Clients – Kafka Streams
• SpecificAvroSerde
• GenericAvroSerde
• KafkaProtobufSerde
• KafkaJsonSchemaSerde
Serde<Generated> serde = new SpecificAvroSerde<>();
Map<String, String> configs = Map.of("schema.registry.url", "https..", ...);
serde.configure(configs, false); // false = configuring a value serde
Clients – Command Line Produce
confluent kafka topic produce purchases \
  --value-format protobuf \
  --schema src/main/proto/purchase.proto \
  --sr-endpoint https://.... \
  --sr-api-key xxxyyyy \
  --sr-api-secret abc123 \
  --cluster lkc-45687
> {"item":"pizza", "amount":17.99, "customer_id":"lombardi"}
Clients – Command Line Produce
./bin/kafka-protobuf-console-producer \
  --topic purchases \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://... \
  --property value.schema="$(<src/main/proto/purchase.proto)"
> {"item":"pizza", "amount":17.99, "customer_id":"lombardi"}
Clients – Command Line Consume
confluent kafka topic consume purchases \
  --from-beginning \
  --value-format protobuf \
  --schema src/main/proto/purchase.proto \
  --sr-endpoint https://.... \
  --sr-api-key xxxyyyy \
  --sr-api-secret abc123 \
  --cluster lkc-45687
Clients – Command Line Consume
./bin/kafka-protobuf-console-consumer \
  --from-beginning \
  --topic purchases \
  --bootstrap-server localhost:9092 \
  --property schema.registry.url=http://...
Clients – Testing
For testing you can use a mock schema registry
• schema.registry.url = "mock://scope-name"
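The mock:// URL is handled by the serializers and serdes themselves, backed by an in-memory registry, so unit tests never open a network connection; components configured with the same scope name share the same mock instance. A small sketch:

```java
import java.util.Map;

public class MockRegistryConfigs {

    // Configuration pointing serializers/serdes at an in-memory registry.
    // Every client configured with the same scope name sees the same
    // registered schemas, which is enough for serializer round-trip tests.
    public static Map<String, String> testConfigs(String scopeName) {
        return Map.of("schema.registry.url", "mock://" + scopeName);
    }
}
```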
What is a Subject?
• Defines a namespace for a schema
• Compatibility checks are per subject
• Different approaches – subject name strategies
Subjects - TopicNameStrategy
Subjects - RecordNameStrategy
Subjects - TopicRecordNameStrategy
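For a concrete picture of what each strategy produces, assume a topic named purchases carrying the Purchase record from earlier (full name io.confluent.developer.avro.Purchase). The helpers below are hypothetical, written only to spell out the naming rules for a value schema:

```java
public class SubjectNames {

    // TopicNameStrategy (the default): subject is "<topic>-value" for value
    // schemas ("<topic>-key" for keys), so one schema type per topic.
    public static String topicNameStrategy(String topic) {
        return topic + "-value";
    }

    // RecordNameStrategy: subject is the record's fully qualified name,
    // independent of topic, so one topic can carry multiple record types.
    public static String recordNameStrategy(String recordFullName) {
        return recordFullName;
    }

    // TopicRecordNameStrategy: subject combines both, scoping the record
    // type to the topic it appears on.
    public static String topicRecordNameStrategy(String topic, String recordFullName) {
        return topic + "-" + recordFullName;
    }
}
```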
Schema Compatibility
• Schema Registry provides a mechanism for safe changes
• Evolving a schema
• Compatibility checks are per subject
• When a schema evolves, the subject remains, but the schema gets a new ID and version
• Backward - default
• Forward
• Full
Latest version will work with previous version
• But not necessarily with versions beyond that
• Backward transitive
• Forward transitive
• Full transitive
Latest version will work with all previous versions
• Backward – delete fields; add fields with default values; update consumer clients first
• Forward – add fields; delete fields with default values; update producer clients first
• Full – fields deleted or added must have default values; order of client updates doesn't matter
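As an example of a backward-compatible change, the Purchase schema from earlier could add a hypothetical discount field, provided it carries a default so the new schema can still read records written without it:

```json
{
  "type": "record",
  "namespace": "io.confluent.developer.avro",
  "name": "Purchase",
  "fields": [
    {"name": "item", "type": "string"},
    {"name": "amount", "type": "double", "default": 0.0},
    {"name": "customer_id", "type": "string"},
    {"name": "discount", "type": "double", "default": 0.0}
  ]
}
```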
Testing a Schema for compatibility
Summary
• Kafka brokers work with bytes only
• Event objects form an implicit contract between producers and consumers
• Schema Registry makes that contract explicit
• Schema Registry keeps domain objects in sync across applications
Resources
• Documentation - https://docs.confluent.io/platform/current/schema-registry/index.html#sr-overview
• Multiple Event Types Presentation - https://www.confluent.io/en-gb/events/kafka-summit-europe-2021/managing-multiple-event-types-in-a-single-topic-with-schema-registry/
• Multiple Events tutorial - https://developer.confluent.io/tutorials/multiple-event-type-topics/confluent.html
• Multiple Event Type code - https://github.com/bbejeck/multiple-events-kafka-summit-europe-2021
Resources
• Martin Kleppmann - https://www.confluent.io/blog/put-several-event-types-kafka-topic/
• Robert Yokota - https://www.confluent.io/blog/multiple-event-types-in-the-same-kafka-topic/
• Avro - https://avro.apache.org/docs/current/
• Protocol Buffers - https://developers.google.com/protocol-buffers
• JSON Schema - https://json-schema.org/
Build tools
• Schema Registry Maven
• https://docs.confluent.io/platform/current/schema-registry/develop/maven-plugin.html#sr-maven-plugin
• Schema Registry Gradle
• https://github.com/ImFlog/schema-registry-plugin
• Protobuf Gradle
• https://github.com/google/protobuf-gradle-plugin
• Avro Gradle
• https://github.com/davidmc24/gradle-avro-plugin
• JSON Schema
• https://github.com/jsonschema2dataclass/js2d-gradle
• https://github.com/joelittlejohn/jsonschema2pojo
Thank you!
@bbejeck
bill@confluent.io
cnfl.io/meetups cnfl.io/slack
cnfl.io/blog
Your Apache Kafka®
journey begins here
developer.confluent.io
Schema Registry 101 with Bill Bejeck | Kafka Summit London 2022