SlideShare a Scribd company logo
1 of 11
Download to read offline
Hackolade Tutorial
Visual design of Avro schema
Copyright © 2016-2023 Hackolade 1
About Apache Avro
• Open-source project
• Provides
• Data serialisation framework
• Data exchange services
• Often used with
• Kafka pub-sub pipelines
• Data lake storage
• Row-oriented object container
• as opposed to Parquet which is column-oriented
• Language independent, compact and efficient
Copyright © 2016-2023 Hackolade 2
Components of an Avro Schema Model
• An Avro schema can be viewed as a language-
agnostic contract for systems to interoperate.
• 4 attributes:
• Type: specifies the data type of the JSON record,
whether its complex type or primitive value. At the top
level of an Avro schema, it is mandatory to have a
“record” type.
• Name: the name of the Avro schema being defined
• Namespace: a high-level logical indicator of the Avro
schema
• Fields: the individual data elements of the
JSON object. Fields can be of primitive as
well as complex type, which can be
further made of simple and complex
data types.
Copyright © 2016-2023 Hackolade 3
About Apache Avro
• Uses JSON to define schema and data types
• Warning: Avro schema is quite different from JSON Schema, even if schema is
defined using JSON format
• Container file has header containing the schema, plus 1 or more
storage blocks
Copyright © 2016-2023 Hackolade 4
Avro schema evolution
• Avro allows for powerful schema evolution
• Can achieve backward- and forward-compatibility if done well
• Super interesting for “integration” services where many producers/consumers
may evolve applications using different versions of the schema
• Schemas and their evolving versions can be published in a schema
registry
• Goal is to maximise compatibility when decoupling the lifecycle of
publishers and consumers
• Specific guidelines and best practices
Copyright © 2016-2023 Hackolade 5
Benefits of Avro schema design in Hackolade Studio
• Visual design of Avro schema
• Anticipate future schema evolution
• Facilitate message validation
• Optimize message payload
• Enable compatibility and interoperability
• Integrate with Confluent Schema Registry and others
• Promote schema reuse (despite limited official specification and
documentation) via
• Avro namespace references
• Confluent Schema Registry’s
• Schema references
• Union schemas
Copyright © 2016-2023 Hackolade 6
Hackolade Studio support for Avro schema
• Graphical Avro schema design tool
• Can also import existing Avro schemas
• Integrates with the schema registries
(Confluent Schema Registry,
Azure EventHubs Schema Registry,
Pulsar Schema Registry,
other CSR API-compatible registries)
• For forward- and reverse-engineering of the Avro schema with these registries
• Hackolade also integrates with Object Storage providers (S3, ADLS, …)
for data lakes
• Hackolade maintains (recursive) references between elements of the
Avro schema
Copyright © 2016-2023 Hackolade 7
Hackolade Studio support for Avro schema
• Outputs of Avro Schema modeling
in Hackolade Studio:
• Entity-Relationship Diagrams of
multi-record/event environments
• Hierarchical view of nested objects
• Syntactically correct schema
generation for
• users without technical knowledge
• developers wanting to improve
quality and productivity
• Documentation of schema in
different formats
• Conversion to/from other targets
(OpenAPI, RDBMS, NoSQL) via
Polyglot Data Modeling
Copyright © 2016-2023 Hackolade 8
Hackolade Studio and Confluent Schema Registry
• Central repository with RESTful interface
• Developers publish schemas for
• Versioning
• Safe schema evolution
• Enhanced integrity
• Data discoverability
• Self-hosted or SaaS
• Hackolade Studio native integration
• can forward/reverse engineer Avro schemas with the registry
• including support for the different Subject Name Strategies
• supports namespace references, schema references, and union schemas
Copyright © 2016-2023 Hackolade 9
Reading material
• See Hackolade Studio online documentation
• The Hackolade Blog
• These excellent new books:
• MongoDB Data Modeling & Schema Design
• Many of the principles in the book are related to query-driven
modeling based on access patterns
• Neo4j Data Modeling
• Hackolade’s on social media: LinkedIn page, Twitter page
• Download Hackolade Studio for free
Copyright © 2016-2023 Hackolade 10
Questions?
Answers!
Copyright © 2016-2023 Hackolade 11

More Related Content

Similar to Visual Design of Apache Avro Schemas with Hackolade Studio

DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended CutWes McKinney
 
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...HostedbyConfluent
 
What are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka
What are Hadoop Components? Hadoop Ecosystem and Architecture | EdurekaWhat are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka
What are Hadoop Components? Hadoop Ecosystem and Architecture | EdurekaEdureka!
 
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)My own sweet home!
 
Hackolade Tutorial - part 12 - Create a REST API model
Hackolade Tutorial - part  12 - Create a REST API modelHackolade Tutorial - part  12 - Create a REST API model
Hackolade Tutorial - part 12 - Create a REST API modelPascalDesmarets1
 
End-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and AtlasEnd-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and AtlasDataWorks Summit
 
Tutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaborationTutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaborationPascalDesmarets1
 
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and Spark
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and SparkODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and Spark
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and SparkCarolyn Duby
 
Building Software Backend (Web API)
Building Software Backend (Web API)Building Software Backend (Web API)
Building Software Backend (Web API)Alexander Goida
 
Schema registry
Schema registrySchema registry
Schema registryWhiteklay
 
Oracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSOracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSDoug Gault
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...Redis Labs
 
COMMitMDE'18: Eclipse Hawk: model repository querying as a service
COMMitMDE'18: Eclipse Hawk: model repository querying as a serviceCOMMitMDE'18: Eclipse Hawk: model repository querying as a service
COMMitMDE'18: Eclipse Hawk: model repository querying as a serviceAntonio García-Domínguez
 
IOT15_Unit6.pptx
IOT15_Unit6.pptxIOT15_Unit6.pptx
IOT15_Unit6.pptxsuptel
 
APIdays 2016 - The State of Web API Languages
APIdays 2016  - The State of Web API LanguagesAPIdays 2016  - The State of Web API Languages
APIdays 2016 - The State of Web API LanguagesRestlet
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Wen-Tien Chang
 
MySQL Connector/Node.js and the X DevAPI
MySQL Connector/Node.js and the X DevAPIMySQL Connector/Node.js and the X DevAPI
MySQL Connector/Node.js and the X DevAPIRui Quelhas
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVROairisData
 
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud Strategy
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud StrategyNYC Identity Summit Tech Day: ForgeRock DevOps/Cloud Strategy
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud StrategyForgeRock
 
HTML5 Programming
HTML5 ProgrammingHTML5 Programming
HTML5 Programminghotrannam
 

Similar to Visual Design of Apache Avro Schemas with Hackolade Studio (20)

DataFrames: The Extended Cut
DataFrames: The Extended CutDataFrames: The Extended Cut
DataFrames: The Extended Cut
 
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
Kafka for Microservices – You absolutely need Avro Schemas! | Gerardo Gutierr...
 
What are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka
What are Hadoop Components? Hadoop Ecosystem and Architecture | EdurekaWhat are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka
What are Hadoop Components? Hadoop Ecosystem and Architecture | Edureka
 
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)開放原始碼 Ch1.2   intro - oss - apahce foundry (ver 2.0)
開放原始碼 Ch1.2 intro - oss - apahce foundry (ver 2.0)
 
Hackolade Tutorial - part 12 - Create a REST API model
Hackolade Tutorial - part  12 - Create a REST API modelHackolade Tutorial - part  12 - Create a REST API model
Hackolade Tutorial - part 12 - Create a REST API model
 
End-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and AtlasEnd-to-end Data Governance with Apache Avro and Atlas
End-to-end Data Governance with Apache Avro and Atlas
 
Tutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaborationTutorial Workgroup - Model versioning and collaboration
Tutorial Workgroup - Model versioning and collaboration
 
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and Spark
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and SparkODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and Spark
ODSC East 2017 - Reproducible Research at Scale with Apache Zeppelin and Spark
 
Building Software Backend (Web API)
Building Software Backend (Web API)Building Software Backend (Web API)
Building Software Backend (Web API)
 
Schema registry
Schema registrySchema registry
Schema registry
 
Oracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDSOracle Office Hours - Exposing REST services with APEX and ORDS
Oracle Office Hours - Exposing REST services with APEX and ORDS
 
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
The Happy Marriage of Redis and Protobuf by Scott Haines of Twilio - Redis Da...
 
COMMitMDE'18: Eclipse Hawk: model repository querying as a service
COMMitMDE'18: Eclipse Hawk: model repository querying as a serviceCOMMitMDE'18: Eclipse Hawk: model repository querying as a service
COMMitMDE'18: Eclipse Hawk: model repository querying as a service
 
IOT15_Unit6.pptx
IOT15_Unit6.pptxIOT15_Unit6.pptx
IOT15_Unit6.pptx
 
APIdays 2016 - The State of Web API Languages
APIdays 2016  - The State of Web API LanguagesAPIdays 2016  - The State of Web API Languages
APIdays 2016 - The State of Web API Languages
 
Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3Service-Oriented Design and Implement with Rails3
Service-Oriented Design and Implement with Rails3
 
MySQL Connector/Node.js and the X DevAPI
MySQL Connector/Node.js and the X DevAPIMySQL Connector/Node.js and the X DevAPI
MySQL Connector/Node.js and the X DevAPI
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVRO
 
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud Strategy
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud StrategyNYC Identity Summit Tech Day: ForgeRock DevOps/Cloud Strategy
NYC Identity Summit Tech Day: ForgeRock DevOps/Cloud Strategy
 
HTML5 Programming
HTML5 ProgrammingHTML5 Programming
HTML5 Programming
 

More from PascalDesmarets1

Tutorial Workgroup - Working with Forks
Tutorial Workgroup - Working with ForksTutorial Workgroup - Working with Forks
Tutorial Workgroup - Working with ForksPascalDesmarets1
 
Tutorial Advanced How-To - Oracle 23c Duality views
Tutorial Advanced How-To - Oracle 23c Duality viewsTutorial Advanced How-To - Oracle 23c Duality views
Tutorial Advanced How-To - Oracle 23c Duality viewsPascalDesmarets1
 
Tutorial Expert How-To - Docker-based automation
Tutorial Expert How-To - Docker-based automationTutorial Expert How-To - Docker-based automation
Tutorial Expert How-To - Docker-based automationPascalDesmarets1
 
Tutorial Getting Started part 4 - Domain-Driven Data Modeling
Tutorial Getting Started part 4 - Domain-Driven Data ModelingTutorial Getting Started part 4 - Domain-Driven Data Modeling
Tutorial Getting Started part 4 - Domain-Driven Data ModelingPascalDesmarets1
 
Tutorial Getting Started part 3 - Metadata-as-Code
Tutorial Getting Started part 3 - Metadata-as-CodeTutorial Getting Started part 3 - Metadata-as-Code
Tutorial Getting Started part 3 - Metadata-as-CodePascalDesmarets1
 
Tutorial Getting Started part 2 - Polyglot Data Modeling
Tutorial Getting Started part 2 - Polyglot Data ModelingTutorial Getting Started part 2 - Polyglot Data Modeling
Tutorial Getting Started part 2 - Polyglot Data ModelingPascalDesmarets1
 
Tutorial Getting Started part 1 - Overview
Tutorial Getting Started part 1 - OverviewTutorial Getting Started part 1 - Overview
Tutorial Getting Started part 1 - OverviewPascalDesmarets1
 
Tutorial Expert How-To - Verify Data Model
Tutorial Expert How-To - Verify Data ModelTutorial Expert How-To - Verify Data Model
Tutorial Expert How-To - Verify Data ModelPascalDesmarets1
 
Tutorial Expert How-To - Naming Conventions
Tutorial Expert How-To - Naming ConventionsTutorial Expert How-To - Naming Conventions
Tutorial Expert How-To - Naming ConventionsPascalDesmarets1
 
Tutorial Expert How-To - Export-Import with Excel template
Tutorial Expert How-To - Export-Import with Excel templateTutorial Expert How-To - Export-Import with Excel template
Tutorial Expert How-To - Export-Import with Excel templatePascalDesmarets1
 
Tutorial Expert How-To - Compare and Merge
Tutorial Expert How-To - Compare and MergeTutorial Expert How-To - Compare and Merge
Tutorial Expert How-To - Compare and MergePascalDesmarets1
 
Tutorial Expert How-To - Custom properties
Tutorial Expert How-To - Custom propertiesTutorial Expert How-To - Custom properties
Tutorial Expert How-To - Custom propertiesPascalDesmarets1
 
Tutorial Expert How-To - Command Line Interface (CLI)
Tutorial Expert How-To - Command Line Interface (CLI)Tutorial Expert How-To - Command Line Interface (CLI)
Tutorial Expert How-To - Command Line Interface (CLI)PascalDesmarets1
 
Tutorial Expert How-To - Add reusable Definitions
Tutorial Expert How-To - Add reusable DefinitionsTutorial Expert How-To - Add reusable Definitions
Tutorial Expert How-To - Add reusable DefinitionsPascalDesmarets1
 
Hackolade Tutorial - part 13 - Leverage a Polyglot data model
Hackolade Tutorial - part 13 - Leverage a Polyglot data modelHackolade Tutorial - part 13 - Leverage a Polyglot data model
Hackolade Tutorial - part 13 - Leverage a Polyglot data modelPascalDesmarets1
 
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdf
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdfHackolade Tutorial - part 8 - Import or reverse-engineer.pdf
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdfPascalDesmarets1
 
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdf
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdfHackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdf
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdfPascalDesmarets1
 
Hackolade Tutorial - part 4 - Create your first data model
Hackolade Tutorial - part 4 - Create your first data modelHackolade Tutorial - part 4 - Create your first data model
Hackolade Tutorial - part 4 - Create your first data modelPascalDesmarets1
 
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...PascalDesmarets1
 
Hackolade Tutorial - part 2 - Overview of JSON and JSON schema
Hackolade Tutorial - part 2 - Overview of JSON and JSON schemaHackolade Tutorial - part 2 - Overview of JSON and JSON schema
Hackolade Tutorial - part 2 - Overview of JSON and JSON schemaPascalDesmarets1
 

More from PascalDesmarets1 (20)

Tutorial Workgroup - Working with Forks
Tutorial Workgroup - Working with ForksTutorial Workgroup - Working with Forks
Tutorial Workgroup - Working with Forks
 
Tutorial Advanced How-To - Oracle 23c Duality views
Tutorial Advanced How-To - Oracle 23c Duality viewsTutorial Advanced How-To - Oracle 23c Duality views
Tutorial Advanced How-To - Oracle 23c Duality views
 
Tutorial Expert How-To - Docker-based automation
Tutorial Expert How-To - Docker-based automationTutorial Expert How-To - Docker-based automation
Tutorial Expert How-To - Docker-based automation
 
Tutorial Getting Started part 4 - Domain-Driven Data Modeling
Tutorial Getting Started part 4 - Domain-Driven Data ModelingTutorial Getting Started part 4 - Domain-Driven Data Modeling
Tutorial Getting Started part 4 - Domain-Driven Data Modeling
 
Tutorial Getting Started part 3 - Metadata-as-Code
Tutorial Getting Started part 3 - Metadata-as-CodeTutorial Getting Started part 3 - Metadata-as-Code
Tutorial Getting Started part 3 - Metadata-as-Code
 
Tutorial Getting Started part 2 - Polyglot Data Modeling
Tutorial Getting Started part 2 - Polyglot Data ModelingTutorial Getting Started part 2 - Polyglot Data Modeling
Tutorial Getting Started part 2 - Polyglot Data Modeling
 
Tutorial Getting Started part 1 - Overview
Tutorial Getting Started part 1 - OverviewTutorial Getting Started part 1 - Overview
Tutorial Getting Started part 1 - Overview
 
Tutorial Expert How-To - Verify Data Model
Tutorial Expert How-To - Verify Data ModelTutorial Expert How-To - Verify Data Model
Tutorial Expert How-To - Verify Data Model
 
Tutorial Expert How-To - Naming Conventions
Tutorial Expert How-To - Naming ConventionsTutorial Expert How-To - Naming Conventions
Tutorial Expert How-To - Naming Conventions
 
Tutorial Expert How-To - Export-Import with Excel template
Tutorial Expert How-To - Export-Import with Excel templateTutorial Expert How-To - Export-Import with Excel template
Tutorial Expert How-To - Export-Import with Excel template
 
Tutorial Expert How-To - Compare and Merge
Tutorial Expert How-To - Compare and MergeTutorial Expert How-To - Compare and Merge
Tutorial Expert How-To - Compare and Merge
 
Tutorial Expert How-To - Custom properties
Tutorial Expert How-To - Custom propertiesTutorial Expert How-To - Custom properties
Tutorial Expert How-To - Custom properties
 
Tutorial Expert How-To - Command Line Interface (CLI)
Tutorial Expert How-To - Command Line Interface (CLI)Tutorial Expert How-To - Command Line Interface (CLI)
Tutorial Expert How-To - Command Line Interface (CLI)
 
Tutorial Expert How-To - Add reusable Definitions
Tutorial Expert How-To - Add reusable DefinitionsTutorial Expert How-To - Add reusable Definitions
Tutorial Expert How-To - Add reusable Definitions
 
Hackolade Tutorial - part 13 - Leverage a Polyglot data model
Hackolade Tutorial - part 13 - Leverage a Polyglot data modelHackolade Tutorial - part 13 - Leverage a Polyglot data model
Hackolade Tutorial - part 13 - Leverage a Polyglot data model
 
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdf
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdfHackolade Tutorial - part 8 - Import or reverse-engineer.pdf
Hackolade Tutorial - part 8 - Import or reverse-engineer.pdf
 
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdf
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdfHackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdf
Hackolade Tutorial - part 6 - Add choice, conditional, pattern fields.pdf
 
Hackolade Tutorial - part 4 - Create your first data model
Hackolade Tutorial - part 4 - Create your first data modelHackolade Tutorial - part 4 - Create your first data model
Hackolade Tutorial - part 4 - Create your first data model
 
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
Hackolade Tutorial - part 3 - Query-driven data modeling based on access patt...
 
Hackolade Tutorial - part 2 - Overview of JSON and JSON schema
Hackolade Tutorial - part 2 - Overview of JSON and JSON schemaHackolade Tutorial - part 2 - Overview of JSON and JSON schema
Hackolade Tutorial - part 2 - Overview of JSON and JSON schema
 

Recently uploaded

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 

Recently uploaded (20)

Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 

Visual Design of Apache Avro Schemas with Hackolade Studio

  • 1. Hackolade Tutorial Visual design of Avro schema Copyright © 2016-2023 Hackolade 1
  • 2. About Apache Avro • Open-source project • Provides • Data serialisation framework • Data exchange services • Often used with • Kafka pub-sub pipelines • Data lake storage • Row-oriented object container • as opposed to Parquet which is column-oriented • Language independent, compact and efficient Copyright © 2016-2023 Hackolade 2
  • 3. Components of an Avro Schema Model • An Avro schema can be viewed as a language- agnostic contract for systems to interoperate. • 4 attributes: • Type: specifies the data type of the JSON record, whether its complex type or primitive value. At the top level of an Avro schema, it is mandatory to have a “record” type. • Name: the name of the Avro schema being defined • Namespace: a high-level logical indicator of the Avro schema • Fields: the individual data elements of the JSON object. Fields can be of primitive as well as complex type, which can be further made of simple and complex data types. Copyright © 2016-2023 Hackolade 3
  • 4. About Apache Avro • Uses JSON to define schema and data types • Warning: Avro schema is quite different from JSON Schema, even if schema is defined using JSON format • Container file has header containing the schema, plus 1 or more storage blocks Copyright © 2016-2023 Hackolade 4
  • 5. Avro schema evolution • Avro allows for powerful schema evolution • Can achieve backward- and forward-compatibility if done well • Super interesting for “integration” services where many producers/consumers may evolve applications using different versions of the schema • Schemas and their evolving versions can be published in a schema registry • Goal is to maximise compatibility when decoupling the lifecycle of publishers and consumers • Specific guidelines and best practices Copyright © 2016-2023 Hackolade 5
  • 6. Benefits of Avro schema design in Hackolade Studio • Visual design of Avro schema • Anticipate future schema evolution • Facilitate message validation • Optimize message payload • Enable compatibility and interoperability • Integrate with Confluent Schema Registry and others • Promote schema reuse (despite limited official specification and documentation) via • Avro namespace references • Confluent Schema Registry’s • Schema references • Union schemas Copyright © 2016-2023 Hackolade 6
  • 7. Hackolade Studio support for Avro schema • Graphical Avro schema design tool • Can also import existing Avro schemas • Integrates with the schema registries (Confluent Schema Registry, Azure EventHubs Schema Registry, Pulsar Schema Registry, other CSR API-compatible registries) • For forward- and reverse-engineering of the Avro schema with these registries • Hackolade also integrates with Object Storage providers (S3, ADLS, …) for data lakes • Hackolade maintains (recursive) references between elements of the Avro schema Copyright © 2016-2023 Hackolade 7
  • 8. Hackolade Studio support for Avro schema • Outputs of Avro Schema modeling in Hackolade Studio: • Entity-Relationship Diagrams of multi-record/event environments • Hierarchical view of nested objects • Syntactically correct schema generation for • users without technical knowledge • developers wanting to improve quality and productivity • Documentation of schema in different formats • Conversion to/from other targets (OpenAPI, RDBMS, NoSQL) via Polyglot Data Modeling Copyright © 2016-2023 Hackolade 8
  • 9. Hackolade Studio and Confluent Schema Registry • Central repository with RESTful interface • Developers publish schemas for • Versioning • Safe schema evolution • Enhanced integrity • Data discoverability • Self-hosted or SaaS • Hackolade Studio native integration • can forward/reverse engineer Avro schemas with the registry • including support for the different Subject Name Strategies • supports namespace references, schema references, and union schemas Copyright © 2016-2023 Hackolade 9
  • 10. Reading material • See Hackolade Studio online documentation • The Hackolade Blog • These excellent new books: • MongoDB Data Modeling & Schema Design • Many of the principles in the book are related to query-driven modeling based on access patterns • Neo4j Data Modeling • Hackolade’s on social media: LinkedIn page, Twitter page • Download Hackolade Studio for free Copyright © 2016-2023 Hackolade 10