Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Avro Tutorial - Records with Schema for Kafka and Hadoop

39,029 views

Published on

Covers how to use Avro to save records to disk. This can be used later to use Avro with Kafka Schema Registry. This provides background on Avro which gets used with Hadoop and Kafka.

Published in: Technology
  • Be the first to comment

Avro Tutorial - Records with Schema for Kafka and Hadoop

  1. 1. ™ Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting Avro Avro Apache Avro Data Serialization
  2. 2. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Apache Avro ❖ Data serialization system ❖ Data structures ❖ Binary data format ❖ Container file format to store persistent data ❖ RPC capabilities ❖ Does not require code generation to use
  3. 3. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro Schemas ❖ Supports schemas for defining data structure ❖ Serializing and deserializing data, uses schema ❖ File schema ❖ Avro files store data with its schema ❖ RPC Schema ❖ RPC protocol exchanges schemas as part of the handshake ❖ Schemas written in JSON
  4. 4. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro compared to… ❖ Similar to Thrift, Protocol Buffers, JSON, etc. ❖ Does not require code generation ❖ Avro needs less encoding as part of the data since it stores names and types in the schema ❖ It supports evolution of schemas.
  5. 5. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro Schema Avro schema stored in src/main/avro by default.
  6. 6. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Code Generation
  7. 7. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Employee Code Generation
  8. 8. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Using Generated Avro class
  9. 9. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Writing employees to an Avro File
  10. 10. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Reading employees From a File
  11. 11. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Using GenericRecord
  12. 12. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Writing Generic Records
  13. 13. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Reading using Generic Records
  14. 14. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro Schema Validation
  15. 15. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro supported types ❖ Records ❖ Arrays ❖ Enums ❖ Unions ❖ Maps ❖ Strings, Int, Boolean, Decimal, Timestamp, Date
  16. 16. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Fuller example Avro Schema
  17. 17. Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting ™ Avro ❖ Fast data serialization ❖ Supports data structures ❖ Supports Records, Maps, Array, and basic types ❖ You can use it direct or use Code Generation ❖ Read more ❖ Kafka Training ❖ Kafka Consulting

×