Kafka Schema Registry
Managing Record Schema in Kafka
Presented By:
Manisha Dubaria
Agenda
• What is Schema Registry
• Flow of Schema Registry
• Data Serialization
• Configuration Options
• Format of data written in topic
• REST Calls to Schema Registry
• Use of Schema Registry
What is Schema Registry
• A shared repository of schemas that allows applications to
interact flexibly with each other.
• It manages the evolution of message record schemas over
time.
• Schema Registry is used by both writers and readers:
– Senders/Producers use the registered schema to serialize the
payloads they send
– Readers/Consumers use their own schema to project the received
payload, which was written with the writer's schema
Flow of Schema Registry
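[Diagram: Producer, Kafka Cluster, Consumer, and Schema Registry. The Producer registers the schema with the Schema Registry; the Consumer asks the Schema Registry for the schema.]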
Data Serialization
• Data consumers must be able to understand the data that
producers write
• Kafka handles the schema evolution problem using the Avro
serializer and deserializer
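As a rough illustration of what "projecting" a payload onto a reader's schema means, the sketch below mimics Avro-style schema resolution on records that are already decoded into Python dicts. It is not an Avro implementation: the `project` helper, the `User` record, and its fields are hypothetical names chosen for the example, and real resolution is performed by the Avro deserializer.

```python
# Illustration only: mimics reader/writer schema resolution on plain dicts.
# The schemas, field names, and project() helper are hypothetical.

WRITER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": "string"},
    ],
}

# The reader's schema drops "email" and adds "country" with a default value.
READER_SCHEMA = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "country", "type": "string", "default": "unknown"},
    ],
}


def project(record, writer_schema, reader_schema):
    """Project a record written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    result = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field known to both sides: take the written value.
            result[name] = record[name]
        elif "default" in field:
            # Field only the reader knows: fall back to the reader's default.
            result[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    # Fields only the writer knows ("email") are simply ignored.
    return result


written = {"id": 42, "email": "a@example.com"}
projected = project(written, WRITER_SCHEMA, READER_SCHEMA)
print(projected)  # {'id': 42, 'country': 'unknown'}
```

The reader never sees `email` and receives the default for `country`, which is why producers and consumers do not have to upgrade at the same moment.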
Configuration Options
• The most important configuration given to the Producer and
Consumer is “schema.registry.url”
• The Producer can set “auto.register.schemas” to true to
automatically register the schema with the registry
• There are two ways to provide the schema to the Producer:
I. Include the path of the schema file in the project's pom and
rebuild the project every time the schema changes
II. Pass the schema file explicitly and build the project only once,
independent of changes to the schema file
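A minimal sketch of how the two settings named above might sit alongside the rest of a producer configuration. The broker address `localhost:9092`, the choice of a string key serializer, and the `producer_properties` helper are assumptions made for illustration; only “schema.registry.url” and “auto.register.schemas” come from the slides.

```python
def producer_properties(registry_url, auto_register=True):
    """Render an assumed producer configuration as .properties text."""
    props = {
        # Assumed broker address; replace with your own cluster.
        "bootstrap.servers": "localhost:9092",
        # Assumed serializers: plain string keys, Avro values.
        "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
        "value.serializer": "io.confluent.kafka.serializers.KafkaAvroSerializer",
        # The two options discussed on this slide.
        "schema.registry.url": registry_url,
        "auto.register.schemas": str(auto_register).lower(),
    }
    return "\n".join(f"{key}={value}" for key, value in props.items())


properties_text = producer_properties("http://localhost:8081")
print(properties_text)
```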
Format of Data
• The message key, the message value, or both can be
serialized as Avro
• Each schema is registered under a subject, which defines the
scope in which schemas can evolve
• Schema Registry performs compatibility checks only within a
single subject
• The schema file is an .avsc file that contains the namespace,
type, name, and fields
• Data in the topic is stored as [Magic Byte][Schema ID][Data]
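The [Magic Byte][Schema ID][Data] layout can be sketched with Python's `struct` module. This assumes the usual Confluent framing of a single magic byte with value 0 followed by a 4-byte big-endian schema ID; the `frame` and `unframe` helpers and the schema ID 7 are hypothetical and only illustrate the byte layout, not a full serializer.

```python
import struct

MAGIC_BYTE = 0  # assumed framing version marker
HEADER_SIZE = 5  # 1 magic byte + 4-byte schema ID


def frame(schema_id, avro_payload):
    """Prefix Avro-encoded bytes with the magic byte and schema ID."""
    # ">bI" = big-endian, one signed byte, one 4-byte unsigned int.
    return struct.pack(">bI", MAGIC_BYTE, schema_id) + avro_payload


def unframe(message):
    """Split a framed message into (schema_id, avro_payload)."""
    if len(message) < HEADER_SIZE:
        raise ValueError("message too short to contain a header")
    magic, schema_id = struct.unpack(">bI", message[:HEADER_SIZE])
    if magic != MAGIC_BYTE:
        raise ValueError(f"unexpected magic byte: {magic}")
    return schema_id, message[HEADER_SIZE:]


# Avro binary encoding of the string "hi": length 2 as a zigzag varint
# (0x04) followed by the UTF-8 bytes.
payload = b"\x04hi"
message = frame(7, payload)
print(message)           # b'\x00\x00\x00\x00\x07\x04hi'
print(unframe(message))  # (7, b'\x04hi')
```

A consumer reads the schema ID from the header, asks the Schema Registry for that schema, and only then decodes the remaining bytes.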
REST Calls to Schema Registry
• Registering a new version of a schema under a subject
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": \"string\"}"}' \
  http://localhost:8081/subjects/Kafka-value/versions
• List all Subjects
curl -X GET http://localhost:8081/subjects
• Fetching a schema by its globally unique ID (here, ID 1)
curl -X GET http://localhost:8081/schemas/ids/1
• Fetch Version 1 of the Schema Registered Under Subject
curl -X GET http://localhost:8081/subjects/Kafka-value/versions/1
• Deleting Version 1 of the Schema Registered Under Subject
curl -X DELETE http://localhost:8081/subjects/Kafka-value/versions/1
• Checking the compatibility of a schema with the latest version under a subject
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": \"string\"}"}' \
  http://localhost:8081/compatibility/subjects/Kafka-value/versions/latest
• Checking the top-level config
curl -X GET http://localhost:8081/config
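The registration and compatibility calls embed the schema as a JSON string inside a JSON document, which is easy to get wrong by hand. The sketch below builds the registration request with Python's standard library so the escaping is done by `json.dumps`. The `build_register_request` helper is a hypothetical name, and the request is only constructed here, not sent.

```python
import json
import urllib.request

REGISTRY_URL = "http://localhost:8081"
CONTENT_TYPE = "application/vnd.schemaregistry.v1+json"


def build_register_request(subject, schema):
    """Build (but do not send) a POST that registers schema under subject."""
    # The schema itself is serialized to a JSON string, then embedded as the
    # value of "schema" in the outer JSON body (hence the double dumps).
    body = json.dumps({"schema": json.dumps(schema)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{REGISTRY_URL}/subjects/{subject}/versions",
        data=body,
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )


request = build_register_request("Kafka-value", {"type": "string"})
print(request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running registry:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

The printed body shows the inner quotes escaped with backslashes, which is the form the curl `--data` argument has to take as well.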
Use of Schema Registry
• Provides reusable schemas
• Defines relationships between schemas
• Avoids attaching the schema to every piece of data
• Lets producers and consumers evolve at different rates