Courier
Joe Betz @ Coursera
Courier is a code generator
{
"name": "Fortune",
"namespace": "org.example",
"type": "record",
"fields": [
{
"name": "message",
"type": "string"
}
]
}
{ "message": "Today is your lucky day!" }
case class Fortune(message: String)
Pegasus Schema --generate--> Scala <--serialize / deserialize--> JSON Data
● Extension of Apache Avro’s schema language built at LinkedIn.
● Designed for natural looking JSON.
● Rich type system maps well between JSON and type-safe
languages like Scala.
● Schema language is machine readable and easy to extend.
● Tooling and language support.
Pegasus Schema Language
Avro Schemas: core schema language (records, maps, arrays, unions, enums, primitives)
Pegasus Schemas: Avro Schemas + optional record fields + typerefs
Courier
Why Schemas?
Increase transparency into the structure of our data.
With a common understanding of structure, type-safety can span multiple languages and platforms.
Why Pegasus Schemas?
Pegasus schemas have a rich type system.
Pegasus schemas are machine readable.
Pegasus schemas are easy to extend.
Pegasus schemas work with multiple data formats.
Pegasus Schema Types
| Pegasus Type | Scala Type | Example JSON |
|---|---|---|
| int, long, float, double, boolean, string | Int, Long, Float, Double, Boolean, String | 1, 10000000, 3.14, 2.718281, true, "Coursera" |
| record | case class R(f1: T1, f2: T2, ...) | { "f1": 1, "f2": "Coursera" } |
| array | A extends IndexedSeq[T] | [1, 2, 3] |
| map | M extends Map[String, T] | { "key1": 1, "key2": 2 } |
| union | sealed abstract class U; case class M1(T1) extends U; case class M2(T2) extends U | { "org.example.M1": <T1 Value> } |
| enum | object E extends Enumeration | "SYMBOL" |

* unions and typerefs will be covered in more detail later.
.pdsc
(pegasus data schema)
Records
org/example/Note.pdsc
{
"name": "Note",
"namespace": "org.example",
"doc": "A simple note.",
"type": "record",
"fields": [
{ "name": "title", "type": "string" },
{ "name": "body", "type": "Body", "optional": true }
]
}
Scala
case class Note(title: String, body: Option[Body])
Examples
Note("reminder", Some(Body(…))) => { "title": "reminder", "body": { … } }
Note("reminder", None) => { "title": "reminder" }
Records
Schema
{ "name": "WithOptionals", "type": "record", …
"fields": [
{ "name": "o1", "type": "string", "optional": true },
{ "name": "o2", "type": "string", "optional": true, "default": "b" },
{ "name": "o3", "type": "string", "optional": true, "defaultNone": true }
]
}
Generated Scala
case class WithOptionals(
o1: Option[String],
o2: Option[String] = Some("b"),
o3: Option[String] = None)
Examples
WithOptionals(o1 = None) => { "o2": "b" }
WithOptionals(o1 = Some("a")) => { "o1": "a", "o2": "b" }
WithOptionals(o1 = None, o2 = None) => {}
WithOptionals(o1 = None, o3 = Some("c")) => { "o3": "c" }
Optional Fields and Defaults
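The optional-field behaviour on this slide can be sketched with a hand-written stand-in for the generated class. This is only an illustration: `WithOptionalsJson.toJson` is a hypothetical helper, not part of Courier, and it skips string escaping. It shows that a field is emitted only when it holds a `Some`, which is why the schema defaults surface as constructor defaults.

```scala
// Hand-written stand-in for the generated class; real Courier output is richer.
case class WithOptionals(
    o1: Option[String],
    o2: Option[String] = Some("b"),
    o3: Option[String] = None)

object WithOptionalsJson {
  // Hypothetical helper (an assumption, not Courier API): emits only the
  // fields that are present. No string escaping, so simple values only.
  def toJson(w: WithOptionals): String = {
    val fields = Seq("o1" -> w.o1, "o2" -> w.o2, "o3" -> w.o3).collect {
      case (name, Some(value)) => "\"" + name + "\": \"" + value + "\""
    }
    if (fields.isEmpty) "{}" else fields.mkString("{ ", ", ", " }")
  }
}
```

For example, `WithOptionalsJson.toJson(WithOptionals(o1 = None))` yields `{ "o2": "b" }`, matching the first example on the slide.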
Collection Types
org/example/NotePad.pdsc
{
"name": "NotePad", "type": "record", …
"fields": [
{
"name": "notes",
"type": { "type": "array", "items": "Note" }
}
]
}
Generated Scala
case class NotePad(notes: NoteArray)
class NoteArray extends IndexedSeq[Note]
Example
NotePad(notes = NoteArray(Note(…), …)) =>
{ "notes": [ { "title": "…" }, … ] }
Arrays
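A rough sketch of what a per-type array class could look like follows. It is an assumption-laden simplification: `Note` is reduced to a single `title` field, the constructor parameter `noteItems` and the `fromSeq` helper are hypothetical names, and the schema information the generated class carries is left out. The point is that extending `IndexedSeq` only requires `apply` and `length`, after which the usual collection methods are available.

```scala
// Simplified Note (the slide's Note also has an optional body).
case class Note(title: String)

// Sketch of a per-type array wrapper; the real generated class also
// carries schema information, which is omitted here.
class NoteArray(private val noteItems: Vector[Note]) extends IndexedSeq[Note] {
  override def apply(index: Int): Note = noteItems(index)
  override def length: Int = noteItems.length
}

object NoteArray {
  def apply(notes: Note*): NoteArray = new NoteArray(notes.toVector)

  // Hypothetical helper: converting from an ordinary Scala collection.
  def fromSeq(notes: Seq[Note]): NoteArray = new NoteArray(notes.toVector)
}
```

With this in place, `NoteArray(Note("a"), Note("b")).map(_.title)` works like it would on any other Scala sequence.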
org/example/TermWeights.pdsc
{
"name": "TermWeights", "type": "record", "namespace": "org.example",
"fields": [
{
"name": "byTerm",
"type": { "type": "map", "keys": "string", "values": "float" }
}
]
}
Generated Scala
case class TermWeights(byTerm: FloatMap)
class FloatMap extends Map[String, Float]
Example
TermWeights(byTerm = FloatMap("ninja" -> 0.25f, "pirate" -> 0.75f)) =>
{ "byTerm": { "ninja": 0.25, "pirate": 0.75 } }
Maps
Unions
Unions are the closest thing Pegasus has to a polymorphic type.
Unlike true subtype polymorphism, unions have a “sealed” set of member types.
org/example/Question.pdsc
{
"name": "Question", "type": "record", …
"fields": [
{ "name": "answerFormat", "type": [ "MultipleChoice", "TextEntry" ] }
]
}
Scala
case class Question(answerFormat: Question.AnswerFormat)
object Question {
sealed abstract class AnswerFormat()
case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
case class TextEntryMember(value: TextEntry) extends AnswerFormat
}
Example
Question(answerFormat = MultipleChoiceMember(MultipleChoice(…))) =>
{ "answerFormat": { "org.example.MultipleChoice": { … } } }
Unions
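The “sealed” property of the union on this slide can be illustrated with a small, self-contained sketch. The fields of `MultipleChoice` and `TextEntry` (`choices`, `prompt`) and the `tag` helper are hypothetical and exist only for illustration. Because `AnswerFormat` is sealed, the compiler can check the pattern match for exhaustiveness, and the union tag used in JSON is simply the member's fully qualified Pegasus type name.

```scala
// Member types with made-up fields (assumptions for illustration only).
case class MultipleChoice(choices: List[String])
case class TextEntry(prompt: String)

// The union: a sealed base class plus one "box" per member type.
sealed abstract class AnswerFormat
case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
case class TextEntryMember(value: TextEntry) extends AnswerFormat

object AnswerFormat {
  // Hypothetical helper returning the JSON union tag for a member.
  // Removing one case makes the compiler warn that the match is not
  // exhaustive, because all subclasses of a sealed class are known.
  def tag(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(_) => "org.example.MultipleChoice"
    case TextEntryMember(_)      => "org.example.TextEntry"
  }
}
```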
Enums
org/example/Fruits.pdsc
{
"type" : "enum",
"name" : "Fruits",
"namespace" : "org.example",
"symbols" : ["APPLE", "BANANA", "ORANGE"]
}
Generated Scala
object Fruits extends Enumeration {
val APPLE = Value
val BANANA = Value
val ORANGE = Value
}
Example
Fruits.APPLE => "APPLE"
Enums
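As a minimal sketch of the enum-to-string mapping, the following uses a plain Scala `Enumeration`. The `FruitsJson` object and its methods are hypothetical helpers, not something Courier generates. It shows both directions: a symbol becomes a JSON string, and a string is looked up among the declared symbols.

```scala
object Fruits extends Enumeration {
  val APPLE, BANANA, ORANGE = Value
}

object FruitsJson {
  // Hypothetical helper: an enum symbol serializes to a JSON string.
  def toJson(fruit: Fruits.Value): String = "\"" + fruit.toString + "\""

  // Hypothetical helper: returns None for a string that is not a
  // declared symbol instead of throwing.
  def fromSymbol(symbol: String): Option[Fruits.Value] =
    Fruits.values.find(_.toString == symbol)
}
```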
Typerefs
org/example/Timestamp.pdsc
{
"type" : "typeref",
"name" : "Timestamp",
"namespace" : "org.example",
"ref" : "long"
}
Generated Scala
Use Long directly.
Example
1434916256
Basic Typeref
Typerefs are Pegasus’s multi-tool.
org/example/AnswerFormats.pdsc
{
"name": "AnswerFormats", "namespace": "org.example", "type": "typeref",
"ref": [ "MultipleChoice", "TextEntry" ]
}
Scala
sealed abstract class AnswerFormats()
object AnswerFormats {
case class MultipleChoiceMember(v: MultipleChoice) extends AnswerFormats
case class TextEntryMember(v: TextEntry) extends AnswerFormats
}
Example
MultipleChoiceMember(MultipleChoice(…)) =>
{ "org.example.MultipleChoice": { … } }
Naming a Union with a Typeref
org/example/DateTime.pdsc
{
"name": "DateTime", "namespace": "org.example", "type": "typeref",
"ref": "string",
"scala": {
"class": "org.joda.time.DateTime",
"coercerClass": "org.coursera.models.common.DateTimeCoercer"
}
}
Scala
Use org.joda.time.DateTime directly.
Example
Record(createdAt = new org.joda.time.DateTime(…)) =>
{ "createdAt": "2015-06-21T18:24:18Z" }
Custom Bindings with a Typeref
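The idea behind a coercer can be sketched as a pair of conversions between the schema-level type (a string here) and the native class. The `Coercer` trait below is a hypothetical interface written for this example, not the Pegasus or Courier API, and `java.time.Instant` stands in for `org.joda.time.DateTime` so the snippet needs no external dependency.

```scala
import java.time.Instant

// Hypothetical interface, loosely modelled on the idea of a coercer.
// It is an assumption for illustration, not the Pegasus or Courier API.
trait Coercer[T] {
  def toData(value: T): String   // native type -> schema-level type
  def fromData(data: String): T  // schema-level type -> native type
}

// java.time.Instant stands in for org.joda.time.DateTime here.
object InstantCoercer extends Coercer[Instant] {
  override def toData(value: Instant): String = value.toString
  override def fromData(data: String): Instant = Instant.parse(data)
}
```

A round trip such as `InstantCoercer.toData(InstantCoercer.fromData("2015-06-21T18:24:18Z"))` gives back the same ISO-8601 string, which is the property a binding like the one on this slide relies on.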
Pegasus System
Schema system: Schema based validation + custom validators.
Data system: JSON Object and Array equivalent types, support for binding to native types.
Code generators: Java, Scala (via Courier), Swift (in progress), Android Java (planned)
Codecs:
● via Pegasus:
o JSON - Jackson streaming
o PSON - non-standard JSON equivalent binary protocol
o Avro binary - compact binary protocol
● via Courier:
o StringKeyCodec - compatible with our legacy StringKeyFormats
o InlineStringCodec - a new “URL Friendly” JSON compatible format
Hardened and performance optimized at LinkedIn. In large scale production use for over 3 years.
Development with Courier
Courier SBT Plugin
Code generation integrated into SBT build.
Caches .pdsc file state, only runs generator when they change.
Courier SBT Plugin
http://coursera.github.io/courier/

Introduction to Courier


Editor's Notes

  • #2 Hi, my name is “Joe”. I work on the infrastructure team and I’m here to talk to you about a new project called “Courier”.
  • #3 Courier is a code generator for Scala. This slide shows what it essentially does. If you only remember one slide from this presentation, remember this one. Courier takes Pegasus schema files as inputs. It generates Scala classes. These classes serialize/deserialize to JSON, kind of like we do today with our Scala case classes and Play! JSON’s Formats. Two things I should note about this slide. First, for Play JSON’s Formats, we have a utility called AutoSchema that is able to generate a schema from a Scala class, the opposite direction from Courier’s generation. However, AutoSchema is deeply flawed. It could only do this correctly for some of our case classes, and it would require significant changes both to AutoSchema and to our JsonFormats to get it to work properly. Still, we could have chosen to take the approach where we generate schemas from Scala classes instead of the other way around, and I’ll talk a bit more later about why we prefer generating code from schemas. Second, while JSON is shown here, Courier supports multiple message formats, not just JSON, and we’ll be looking at those in more detail shortly. Okay, at this point, some of you might be wondering, ...
  • #4 What in the world are these Pegasus schemas? I’ve never heard of this Pegasus thing!
  • #5 Pegasus is a schema language and data system used by the LinkedIn Rest.li open source project. Unless you’ve used Rest.li, odds are you’ve never heard of Pegasus. The main thing to know is that Pegasus’s schema language is just an extension of Avro’s schema language. In fact, Pegasus schemas are almost identical to Avro schemas with a few specific improvements, the most important one being direct support for optional fields. And if you’ve heard of Avro, it’s most likely you know of it as a binary protocol, usually associated with Hadoop. We won’t be looking at the binary protocol much as we’re more focused on JSON, but we will be looking at the schema language Avro uses. We’ll talk more about the specific advantages of the Pegasus schema language shortly.
  • #6 Before we dive into Pegasus in more detail, let’s look at why we care about schemas.
  • #7 If you’re a Web or Mobile developer, you should not need to study our Scala code to try to figure out the JSON structure of our REST resources. You should not need to guess the structure from sample responses. You should not need to ask the developer who wrote the resource to explain the structure of the data it returns. Odds are, if you’re working with our current REST resources, you’ve got no choice but to do one of these things. With schemas, you can just open a file, and it tells you the exact structure of the data. And there is no possibility that some custom OFormat has been defined somewhere that manipulates the JSON structure in an unexpected way. Also, from schemas, we should be able to generate documentation like this ...
  • #8 This is actually a screenshot of our API Explorer. Unfortunately, the current API Explorer uses the broken AutoSchema utility, which often generates these documentation pages incorrectly. This is one it happened to get right. I think. It’s hard to be sure, AutoSchema is not particularly trustworthy. But with a schema language we will be able to generate documentation like this correctly. Documents like this are great when they’re correct. If you want to use a new REST API, you just pull up the API Explorer documentation and you immediately know what fields exist, if they are required or optional, and what their type is. Huge time saver.
  • #9 Currently, our backend Scala developers benefit from strong type safety and we’ve seen substantial productivity gains, but that type safety only spans service boundaries if both the client and server are written in Scala, even though mobile developers are using statically typed languages. But if we instead generate bindings from schemas, we can generate bindings for Java and Swift as well, and we can expand that type safety and all the productivity gains it offers.
  • #10 Okay, so schemas are great, but why Pegasus Schemas? Why not something else like JSON Schema?
  • #11 We want a rich type system, richer than what JSON Schema offers. For numerics, JSON Schema offers an “integer” type and a “number” type, which is sort of weird: JSON only has a “number” type, so I’m not sure how the inventors of JSON Schema landed on “integer” and “number” as their numeric types. Pegasus has “int”, “long”, “float” and “double”, which matches up better with how most statically typed languages (not just Scala!) define numeric types. JSON Schema is also missing direct support for type variance (no tagged union type like Thrift or Avro, no subtype polymorphism), so there is no natural and type-safe way to model polymorphic types. Pegasus, on the other hand, has a well thought out type system that is based on ADTs (Algebraic Data Types) and maps well to the level of type safety we expect in our Scala code. We’ll talk about this in a bit more detail when we review the Pegasus type system.
  • #12 Because the schemas themselves are written in JSON, we can read the schemas in just about any programming language because all you need is a JSON parser. This makes it easy to build tools like code generators, schema-based data validators, and documentation tools like our API Explorer.
  • #13 Being JSON, we can add additional fields to Pegasus schemas for things we need, and schema readers that are unaware of the fields simply ignore them. We’ve already done this a couple of times with Courier; I’ll point those out as we go. If we had used a grammar-based schema language like Thrift or protobuf have, we would be unable to do this.
  • #14 With Play JSON, if we need to support an additional data format, we must add more implicit converters. With Pegasus, we define our schemas once, and then we can use the generated classes with any codecs that exist. If in the future we decide we need a new codec, we can add it and use it immediately with all Courier generated classes.
  • #15 This is a high-level overview of the Pegasus types. I’ll look at each one individually, so you don’t need to worry about understanding this all now. The left column shows the major types. The primitive types map directly to Scala types. All four numeric types map to JSON’s “number” type.
  • #16 Pegasus schemas have the .pdsc file extension.
  • #17 Let’s look at some examples, starting with Record types.
  • #18 In each example, the first section is the pegasus schema. The second section shows basically what the generated class looks like. The actual generated classes are a bit more sophisticated but they will be consistent with the declaration shown here. The third section shows a few examples of Scala code and the equivalent JSON. This particular example shows a simple record with two fields. The second field is optional.
  • #19 Here’s another record. This shows how optional and default fields work. Note how the defaults in the schema correspond to the constructor defaults in the generated Scala class. The “defaultNone” schema property, highlighted in green, is a custom property added by Courier, it is not part of the pegasus schema language. Being able to add properties like this is an example of why we prefer having a schema language that is written in JSON.
  • #20 Okay, moving on to arrays and maps.
  • #21 Here’s an array. Here, the type definition highlighted in green is an array. The rest is a record type that contains the array. In Pegasus, arrays are usually defined “inline” like this. It is possible to define an array as a top level type, and I’ll show that later. But for now we’ll look at this inline array. Courier generates a class for the array, highlighted in green, which extends IndexedSeq and has all the convenience methods one expects of a Scala seq type: map, filter, scan, fold, and so on. Note that we do not use generic collection types for all arrays; instead we generate a custom array class for each contained type. We do this so that we can attach some schema related information to the array type, which is needed for validation and some other Courier internals. This doesn’t end up being a problem in practice. It’s trivial to convert from Seq, List, etc. to the generated classes.
  • #22 Here’s a map. Currently all maps are string keyed, but we plan to add a feature to Courier called “Typed Map Keys”, which will allow key types to be defined as well, similar to how we define the type of the values here. I’ll show how we will encode the map keys later on. Just like with arrays, we generate a map class for each contained type.
  • #23 Pegasus unions are tagged unions, aka disjoint types. For those of you familiar with ADTs (Algebraic Data Types), unions are your sum type.
  • #24 … There is no notion of sub-types. But type hierarchies can be mapped to unions quite easily.
  • #25 … this is a bit of a subtle point. The notion here is that, with a union, all the member types are declared directly in the union declaration, so any code written to process the union is guaranteed to know what all the possible types are. This allows, for example, for a compiler to check if the cases of a pattern matching expression against a union are exhaustive or not. Those of you coding in Scala regularly should feel right at home with this concept.
  • #26 Okay, here’s an example union. Unions are one of the most complex types in Courier, so we’ll spend a bit of extra time on this slide. Here the union type declaration is highlighted in green. Again, it’s inside a record. Note the structure. Unions are declared using a JSON array. Each item in the array is a union member type. If you look at the generated Scala, you’ll see a sealed AnswerFormat class inside the record class’s companion object, and under that a case class for each of the union members. In Scala, we have to “box” the member types. For example, “MultipleChoiceMember” boxes “MultipleChoice”. We do this because Scala does not yet directly support unions. Support for unions for Scala has been mentioned by Odersky in a recent talk (I believe he referred to them as “disjoint types”), so if they are added to Scala, we won’t need to box the member types like we do here. How unions are represented in JSON is worth looking at more closely. In the example, highlighted in red is a Pegasus type name. This is the union tag. It indicates that this JSON data contains a “MultipleChoice” answer format. If instead the answerFormat had been a TextEntry, this would be “org.example.TextEntry”. This is a union, so it’s either/or. Since this union has two member types, it can either be MultipleChoice or TextEntry. This is the one feature of Pegasus where the JSON looks “un-natural”. But it is also flexible and clear. Unions can contain all possible member types, including primitives, and the tag used for each type makes it very clear what the member data is. We’ve spoken with a few JavaScript experts here at Coursera and written some sample code to make sure that we’ll be able to bind to this union format from JavaScript in a reasonable way.
  • #28 I like to think of enums as degenerate unions. In pegasus they are a completely separate type, which is probably a good thing. They map to JSON strings. They’re pretty simple.
  • #29 We’ve covered all the base types, but there is one language feature left to go over: typerefs. Typerefs do not exist in Avro; they are added by Pegasus.
  • #30 Here’s a basic typeref. It just defines an alias to the long type called “Timestamp”. This is just the most lightweight possible alias to an existing type. It really doesn’t do anything other than introduce a new name for the referenced type. In Scala and JSON, a “Timestamp” is handled exactly like a long is. We will rarely if ever use basic typerefs just to provide lightweight aliases.
  • #31 While basic typerefs are not all that useful, Pegasus uses typerefs to activate other, more powerful features that are missing from Avro.
  • #32 Remember how we defined all unions, maps and arrays as inline types inside some other type? This is because Avro does not support defining unions, maps or arrays as top level types. Typerefs allow us to get around this. We can define a new typeref and set the “ref” type to a union. Courier will then generate a top level class for the union. We can do the same thing for maps or arrays if we want.
  • #33 This is the most powerful feature that typerefs enable. With a typeref we can bind a Pegasus type to any Scala class we want. When we do this, Courier will not generate a class, but instead will use the existing class that we specify. In order for this to work, we must define a separate “coercer” class that converts the Pegasus type (a string here) into the Scala class. We expect custom type bindings to be used heavily for core types like CourseId and UUID. For arity-1 case classes we plan to add a feature to Courier to automatically handle the coercion, so a coercerClass will not need to be defined for simple “AnyVal” type case classes.
  • #34 In addition to the schema language, Pegasus provides a well engineered Java implementation that we can take advantage of. One feature that we get “out-of-the-gate” is schema based data validation as well as the ability to write custom validators. We also get Java and Scala code generators. Any Android developers might be interested to know that the Java generator is available via a Gradle plugin. LinkedIn is working on code generators for Swift and planning to write a specialized code generator specifically for Android. I checked with LinkedIn a few weeks ago and at the time they were not willing to give me a firm date for when these will be available. I was told that the Swift generator was basically “code complete”, but they were still working with the mobile team to improve the generated code to address some specific concerns. I’m happy to let them flesh this stuff out and get something polished when it’s ready! I will continue to check in with the Rest.li team to make sure this is progressing. As I mentioned earlier, Pegasus supports multiple message formats. It calls these codecs. PSON is a binary format that is a bit more compact than JSON and is much cheaper, computationally, to serialize/deserialize. We don’t have any immediate plans to use PSON, but have considered using it instead of JSON at the storage layer, and could turn it on between high traffic services to reduce CPU if it becomes a bottleneck. Compatibility with Avro may have a number of advantages as it is a particularly compact binary protocol. Again, we don’t have any immediate plans to use Avro, but could potentially use it at the storage layer. Courier adds two custom codecs for string keys that I’ll discuss in more detail later. At LinkedIn, Pegasus is the primary system for inter-service communication. It’s battle tested and performance tested. A huge amount of engineering effort has gone into productionalizing this code and performance optimizing it.
  • #35 Okay, so how does Courier integrate with SBT?
  • #36 Courier has been a part of the “models” project in infra-services for over a month. The code generator runs before the compile task. You don’t need to change your development workflow. Courier will automatically generate classes as needed. If you make a change to a .pdsc, just run `compile`. If you want, you can run SBT `~compile`, which will watch for changes to .pdsc files and run Courier automatically. Courier emits contextual error messages like the one shown here if there are any errors in schemas.
  • #37 The Courier plugin is integrated with Play! Error messages, like the missing closing quote on line 7 here, will appear in the browser when using Play!