Protocol Buffer
www.tothenew.com
Serialization - Basic Concepts
➢Serialization is the encoding of objects, and of the objects reachable from them, into a stream of bytes.
➢The concept is by no means unique to Java, but this presentation focuses on Java's serialization.
➢It is the basis for Java's built-in object persistence.
➢It handles versioning through serialVersionUID.
➢Implementing the marker interface Serializable makes a class serializable.
➢Transient and static fields are not serialized.
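The points above can be illustrated with a minimal sketch. The User class and its fields are hypothetical names chosen for this example; only the java.io classes are real.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializableDemo {

    // Hypothetical example class; implementing the marker interface is all that is needed.
    static final class User implements Serializable {
        private static final long serialVersionUID = 1L;

        String name;
        transient String password; // transient: skipped by default serialization

        User(String name, String password) {
            this.name = name;
            this.password = password;
        }
    }

    // Encodes the object (and everything reachable from it) into bytes.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an object from bytes produced by serialize().
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        User copy = (User) deserialize(serialize(new User("alice", "secret")));
        System.out.println(copy.name);     // the regular field survives the round trip
        System.out.println(copy.password); // the transient field is not restored
    }
}
```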
Serialization - Advantages
➢Provides ways to hook into the serialization process
○ by implementing readObject() and writeObject()
○ by implementing readExternal() and writeExternal()
➢When you want to serialize just part of the class
○ provide implementations of writeReplace() and readResolve(): writeReplace() substitutes the object that is
written, and readResolve() substitutes the object that is returned after reading.
➢Object validation
○ implement validateObject() of the ObjectInputValidation interface and register the object with
ObjectInputStream.registerValidation() inside readObject(); the callback then runs once the whole object graph has been deserialized
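A minimal sketch of the readObject()/writeObject() hooks combined with object validation follows. Range is a hypothetical class; its constructor deliberately skips the range check so that the demo can produce an invalid stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class SerializationHooksDemo {

    // Hypothetical class: a derived value is kept out of the stream and rebuilt on read.
    static final class Range implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;

        private int low;
        private int high;
        private transient int width; // derived, recomputed in readObject()

        Range(int low, int high) {
            this.low = low;
            this.high = high;
            this.width = high - low;
        }

        int width() {
            return width;
        }

        // Hook called by ObjectOutputStream instead of the default field writing.
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject(); // writes low and high
        }

        // Hook called by ObjectInputStream instead of the default field reading.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.registerValidation(this, 0); // validateObject() runs after the graph is restored
            in.defaultReadObject();
            width = high - low;
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (low > high) {
                throw new InvalidObjectException("low must not exceed high");
            }
        }
    }

    static Range roundTrip(Range range) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(range);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Range) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip(new Range(2, 7)).width());
        try {
            roundTrip(new Range(9, 1)); // invalid state is rejected while reading
        } catch (InvalidObjectException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```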
Serialization - Problem
➢Slow processing
○ Serialization discovers which fields to write and read through reflection and type introspection, which is usually slow.
○ Serialization writes extra data, such as class descriptors, to the stream.
○ You can offset the cost of serialization to some extent by having application objects implement java.io.Externalizable, but
there is still significant overhead in marshalling the class descriptor. To avoid that cost as well, call readExternal() and
writeExternal() on the objects directly: for example, call obj.writeExternal(stream) rather than stream.writeObject(obj).
➢No proper handling of fields
○ The default readObject() and writeObject() handling skips transient and static fields, so these need custom code if they must be persisted.
○ When the default handling is inefficient, use the Externalizable interface instead of Serializable.
○ You then have to write readExternal() and writeExternal() yourself, which is a lot more work for simple serialization.
➢Not secure
○ Because the format is fully documented, the contents of a serialized stream can be read without the class being available.
➢No proper version handling; even serialVersionUID does not help much. Without an explicit serialVersionUID, the default value is computed
from the class, so changing the class makes previously serialized data unreadable; with an explicit one, changing its value for a new
class version breaks compatibility with data written by the old version.
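The Externalizable point under "Slow processing" can be sketched as follows. Point is a hypothetical class; the comparison only shows that calling writeExternal() directly leaves the class descriptor out of the stream, and it is not a benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class ExternalizableDemo {

    // Hypothetical class that controls its own wire format.
    static final class Point implements Externalizable {
        int x;
        int y;

        public Point() { } // Externalizable classes need a public no-arg constructor

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
            out.writeInt(y);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            x = in.readInt();
            y = in.readInt();
        }
    }

    // Standard path: the stream also records a class descriptor for Point.
    static byte[] viaWriteObject(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        return bytes.toByteArray();
    }

    // Direct path: only the two ints (plus stream framing) are written.
    static byte[] viaDirectCall(Point p) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            p.writeExternal(out);
        }
        return bytes.toByteArray();
    }

    // The reader must already know which class to instantiate.
    static Point readDirect(byte[] data) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            Point p = new Point();
            p.readExternal(in);
            return p;
        }
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(3, 4);
        System.out.println("writeObject bytes:   " + viaWriteObject(p).length);
        System.out.println("writeExternal bytes: " + viaDirectCall(p).length);
        Point copy = readDirect(viaDirectCall(p));
        System.out.println(copy.x + "," + copy.y);
    }
}
```

The trade-off is that the direct stream no longer says which class it contains, so both sides have to agree on that out of band.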
Protocol Buffer - Basic Concepts
➢Library for serializing messages
➢Protocol Buffer is a serialization format with an interface description language, developed by Google
➢Write a .proto file describing the structure of the data (message format) and run it through the protocol compiler
to generate classes in Java
➢Each class has accessors for the fields defined
➢Each class has methods for parsing and serializing the data in a compact and very fast format
➢Protocol buffers are strongly typed
➢Handles versioning automatically
➢Generates classes for C++, Java and Python
○ More languages are supported through external repositories (C#, Erlang, etc.)
➢Each generated class represents a single message
➢protoc generates code that depends on private APIs in libprotobuf.jar, which often change between
versions, so use the same version in Maven as the compiler installed on the system.
.proto File
➢Defines a message format/class
➢Simple syntax for defining a message
➢Fields in a message class must be identified by a numeric index
➢Each field has a name, a type and a descriptor (required, optional or repeated)
➢Messages can import, nest or embed other messages as field types; there is no inheritance between messages
Sample .proto File
package java;
option java_package = "com.shashi.protoc.generated";
option java_outer_classname = "AddressBookProtos";

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
    WORK = 2;
  }

  message PhoneNumber {
    required string number = 1;
    optional PhoneType type = 2 [default = HOME];
  }

  repeated PhoneNumber number = 4;
}

message AddressBook {
  repeated Person person = 1;
}
import Command
➢Simply imports another .proto file
➢Allows different message classes to be separated into different files
➢The imported file should be in the same directory
○ it can be in another directory, in which case an additional argument (-I / --proto_path) must be passed to the protoc compiler
package Command
➢In the message file, package declares a namespace for the generated code
➢In C++, package abc.def would mean
namespace abc {
  namespace def {
    . . .
  }
}
➢In Java, package is used as the Java package unless option java_package is given.
message Command
➢Encloses a message class
➢The keyword “message” is followed by the name of the message, which becomes its Java class name
➢Message classes are encapsulated
enum Command
➢The keyword enum is followed by the name of the enumeration
➢Zero-based enumeration
➢Produces an actual Java enum
➢Simply defines an enumeration; it does not create a field in the message for that enumeration
Fields
➢Fields are members of the message class
➢Convention is [descriptor] type name = index
➢The index is 1-based
➢Indexes 1-15 take one byte to encode (index plus wire type) and indexes 16-2047 take two, so reserve 1-15 for the most frequently occurring fields
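The cost of a field index can be checked with a small sketch. On the wire, each field starts with a key that holds the index shifted left by three bits plus a 3-bit wire type, encoded as a varint (7 payload bits per byte). The class below is a hypothetical helper written for this illustration, not part of the protobuf library.

```java
final class FieldTagDemo {

    // Number of bytes a non-negative value occupies as a varint (7 payload bits per byte).
    static int varintSize(int value) {
        int size = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            size++;
        }
        return size;
    }

    // Size of the key that precedes every field: (index << 3) | wireType.
    // The wire type only affects the low three bits, so it does not change the size.
    static int tagSize(int fieldIndex) {
        return varintSize(fieldIndex << 3);
    }

    public static void main(String[] args) {
        for (int index : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("index " + index + " -> " + tagSize(index) + " byte(s)");
        }
    }
}
```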
Descriptor
➢Describes the field
➢required means that the field must be set before the message is written
➢optional means that the field need not be set before writing
➢repeated means that the field is a collection (dynamic array) of another type
○ For historical reasons, repeated fields of scalar numeric types aren't encoded as efficiently as they could be. New code should use the special
option [packed=true] to get a more efficient encoding. The option applies only to scalar numeric types, not to message types such as Person.
For example, with an illustrative message:
message Measurements {
  repeated int32 sample = 1 [packed = true];
}
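The saving from [packed=true] can be sketched by encoding the same repeated int32 values both ways. This is a hand-written illustration of the wire format, assuming non-negative values; the class and method names are hypothetical and nothing here comes from the protobuf library.

```java
import java.io.ByteArrayOutputStream;

final class PackedEncodingDemo {

    // Writes a non-negative int as a varint: 7 bits per byte, high bit set on all but the last.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Unpacked: every element carries its own key (wire type 0 = varint).
    static byte[] unpacked(int fieldIndex, int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            writeVarint(out, (fieldIndex << 3) | 0);
            writeVarint(out, v);
        }
        return out.toByteArray();
    }

    // Packed: one key (wire type 2 = length-delimited), one length, then all the values.
    static byte[] packed(int fieldIndex, int[] values) {
        ByteArrayOutputStream payload = new ByteArrayOutputStream();
        for (int v : values) {
            writeVarint(payload, v);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldIndex << 3) | 2);
        writeVarint(out, payload.size());
        out.write(payload.toByteArray(), 0, payload.size());
        return out.toByteArray();
    }

    public static void main(String[] args) {
        int[] values = {3, 270, 86942};
        System.out.println("unpacked: " + unpacked(4, values).length + " bytes");
        System.out.println("packed:   " + packed(4, values).length + " bytes");
    }
}
```

The difference grows with the number of elements, because the unpacked form repeats the key once per element.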
Types
➢The expected type of the field
➢There is a range of integer types, as well as floating-point, bool, string and bytes types
➢Can be the name of an enumeration
➢Can be the name of another message class
Class Generation
➢Use the Protoc Compiler
➢protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/addressbook.proto
➢Use your classes via aggregation
○ DO NOT inherit from your message class
Advantage / Disadvantages
➢Advantages:
○ If you add new fields to the structure, old programs that don't know about
those fields will simply ignore them.
○ If you remove an optional field, old programs will just assume the default value for the deleted field.
➢Disadvantages:
○ Required fields cannot be removed once added, so the schema has to be planned in advance.
■ It is suggested to add only optional fields and to make only fields such as id required.
○ Just a way to encode data, not an RPC framework
■ it is designed to be used with any RPC implementation
○ Not suited to unstructured text
○ Not great if your first priority is human readability (the binary format is hard to inspect when debugging)
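The compatibility behaviour listed under Advantages can be sketched with a tiny hand-written reader. It stands in for an "old" program that knows only index 1 (id, varint) and index 2 (name, length-delimited); any other index is skipped according to its wire type, and absent fields keep their defaults. This is an illustrative decoder for small messages, not the protobuf library's parser, and all names in it are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class UnknownFieldDemo {

    int id = 0;       // default used when index 1 is absent
    String name = ""; // default used when index 2 is absent
    private int pos;  // read position in the input

    static UnknownFieldDemo parse(byte[] data) {
        UnknownFieldDemo msg = new UnknownFieldDemo();
        while (msg.pos < data.length) {
            int key = msg.readVarint(data);
            int index = key >>> 3;
            int wireType = key & 7;
            if (index == 1 && wireType == 0) {
                msg.id = msg.readVarint(data);
            } else if (index == 2 && wireType == 2) {
                int length = msg.readVarint(data);
                msg.name = new String(data, msg.pos, length, StandardCharsets.UTF_8);
                msg.pos += length;
            } else {
                msg.skip(data, wireType); // unknown to this reader: ignore it
            }
        }
        return msg;
    }

    // Reads a varint that fits in 32 bits (enough for this sketch).
    private int readVarint(byte[] data) {
        int result = 0;
        int shift = 0;
        while (true) {
            byte b = data[pos++];
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Advances past a field value using only its wire type.
    private void skip(byte[] data, int wireType) {
        switch (wireType) {
            case 0: // varint
                readVarint(data);
                break;
            case 1: // fixed 64-bit
                pos += 8;
                break;
            case 2: // length-delimited
                int length = readVarint(data);
                pos += length;
                break;
            case 5: // fixed 32-bit
                pos += 4;
                break;
            default:
                throw new IllegalArgumentException("unsupported wire type " + wireType);
        }
    }

    public static void main(String[] args) {
        // Written by a "new" program: id = 150, name = "bob", plus a new varint field at index 3.
        byte[] fromNewWriter = {0x08, (byte) 0x96, 0x01, 0x12, 0x03, 0x62, 0x6F, 0x62, 0x18, 0x01};
        UnknownFieldDemo msg = parse(fromNewWriter);
        System.out.println(msg.id + " " + msg.name);
    }
}
```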
Alternatives
➢Apache Avro:
○ Essentially ProtoBuf with an RPC facility; it is a data serialization and RPC framework used in
Apache Hadoop
○ Dynamic typing - no code generation required, only a schema in JSON format
■ Can optionally use Avro IDL.
○ No static data types - facilitates generic data-processing systems
➢Apache Thrift:
○ A code generation engine with an IDL and a binary communication protocol, developed by Facebook
○ Facilitates calls between different language platforms
○ Instead of writing a load of boilerplate code to serialize and transport your objects and invoke
remote methods, you can get right down to business.
REFERENCES
Serialization
1. http://thecodersbreakfast.net/index.php?post/2011/05/12/Serialization-and-magic-methods
2. http://www.ibm.com/developerworks/library/j-5things1/
Protocol Buffers
1. https://developers.google.com/protocol-buffers/docs/javatutorial?hl=en
Thank you!
