Introduction to Courier

Courier is a code generator
Pegasus Schema
{
"name": "Fortune",
"namespace": "org.example",
"type": "record",
"fields": [
{
"name": "message",
"type": "string"
}
]
}
Generated Scala
case class Fortune(message: String)
JSON Data
{ "message": "Today is your lucky day!" }
Courier generates the Scala class from the Pegasus schema. The generated class serializes to and deserializes from the JSON data.

● Extension of Apache Avro’s schema language built at LinkedIn.
● Designed for natural looking JSON.
● Rich type system maps well between JSON and type-safe
languages like Scala.
● Schema language is machine readable and easy to extend.
● Tooling and language support.
Pegasus Schema Language
Avro Schemas (core schema language): records, maps, arrays, unions, enums, primitives
Pegasus Schemas: Avro Schemas + optional record fields + typerefs
Courier

Increase transparency into the structure of our data.

With a common understanding of structure, type safety can span multiple languages and platforms.

Pegasus schemas have a rich type system.

Pegasus schemas are machine readable.

Pegasus schemas are easy to extend.

Pegasus schemas work with multiple data formats.

Pegasus Schema Types
Pegasus Type | Scala Type | Example JSON
int, long, float, double, boolean, string | Int, Long, Float, Double, Boolean, String | 1, 10000000, 3.14, 2.718281, true, "Coursera"
record | case class R(f1: T1, f2: T2, ...) | { "f1": 1, "f2": "Coursera" }
array | A extends IndexedSeq[T] | [1, 2, 3]
map | M extends Map[String, T] | { "key1": 1, "key2": 2 }
union | sealed abstract class U; case class M1(T1) extends U; case class M2(T2) extends U | { "org.example.M1": <T1 Value> }
enum | object E extends Enumeration | "SYMBOL"
* Unions and typerefs will be covered in more detail later.

org/example/Note.pdsc
{
"name": "Note",
"namespace": "org.example",
"doc": "A simple note.",
"type": "record",
"fields": [
{ "name": "title", "type": "string" },
{ "name": "body", "type": "Body", "optional": true }
]
}
Scala
case class Note(title: String, body: Option[Body])
Examples
Note("reminder", Some(Body(…))) => { "title": "reminder", "body": { … } }
Note("reminder", None) => { "title": "reminder" }
Records

Schema
{ "name": "WithOptionals", "type": "record", …
"fields": [
{ "name": "o1", "type": "string", "optional": true },
{ "name": "o2", "type": "string", "optional": true, "default": "b" },
{ "name": "o3", "type": "string", "optional": true, "defaultNone": true }
]
}
Generated Scala
case class WithOptionals(
o1: Option[String],
o2: Option[String] = Some("b"),
o3: Option[String] = None)
Examples
WithOptionals(o1 = None) => { "o2": "b" }
WithOptionals(o1 = Some("a")) => { "o1": "a", "o2": "b" }
WithOptionals(o1 = None, o2 = None) => {}
WithOptionals(o1 = None, o3 = Some("c")) => { "o2": "b", "o3": "c" }
Optional Fields and Defaults
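The interplay between Option fields and default arguments can be checked with a small hand-written sketch. The case class below mirrors the generated signature shown on the slide, but it is a stand-in, not Courier's actual output, and `WithOptionalsJson.toJson` is a hypothetical helper written only for this illustration (it does no string escaping).

```scala
// Hand-written stand-in with the same signature as the generated class on the slide.
final case class WithOptionals(
    o1: Option[String],
    o2: Option[String] = Some("b"),
    o3: Option[String] = None)

object WithOptionalsJson {
  // Hypothetical helper: writes only the fields that are present (Some),
  // and omits absent ones (None). No escaping, so use simple values only.
  def toJson(w: WithOptionals): String = {
    val fields = Seq("o1" -> w.o1, "o2" -> w.o2, "o3" -> w.o3).collect {
      case (name, Some(value)) => "\"" + name + "\": \"" + value + "\""
    }
    if (fields.isEmpty) "{}" else fields.mkString("{ ", ", ", " }")
  }
}
```

Any call that does not pass o2 picks up Some("b"), so "o2" appears in the output unless it is explicitly set to None.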

org/example/NotePad.pdsc
{
"name": "NotePad", "type": "record", …
"fields": [
{
"name": "notes",
"type": { "type": "array", "items": "Note" }
}
]
}
Generated Scala
case class NotePad(notes: NoteArray)
class NoteArray extends IndexedSeq[Note]
Example
NotePad(notes = NoteArray(Note(…), …)) =>
{ "notes": [ { "title": "…" }, … ] }
Arrays

org/example/TermWeights.pdsc
{
"name": "TermWeights", "type": "record", "namespace": "org.example",
"fields": [
{
"name": "byTerm",
"type": { "type": "map", "keys": "string", "values": "float" }
}
]
}
Generated Scala
case class TermWeights(byTerm: FloatMap)
class FloatMap extends Map[String, Float]
Example
TermWeights(byTerm = FloatMap("ninja" -> 0.25f, "pirate" -> 0.75f)) =>
{ "byTerm": { "ninja": 0.25, "pirate": 0.75 } }
Maps

Unions are the closest thing Pegasus has to a polymorphic type.

Unlike true subtype polymorphism, unions have a “sealed” set of member types.

org/example/Question.pdsc
{
"name": "Question", "type": "record", …
"fields": [
{ "name": "answerFormat", "type": [ "MultipleChoice", "TextEntry" ] }
]
}
Scala
case class Question(answerFormat: Question.AnswerFormat)
object Question {
sealed abstract class AnswerFormat()
case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
case class TextEntryMember(value: TextEntry) extends AnswerFormat
}
Example
Question(answerFormat = MultipleChoiceMember(MultipleChoice(…))) =>
{ "answerFormat": { "org.example.MultipleChoice": { … } } }
Unions
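Because the set of union members is sealed, Scala code can pattern match on a union value and the compiler can check that every member is handled. The following is a minimal sketch of that idea, not Courier's generated code: the fields of `MultipleChoice` and `TextEntry` are invented for illustration (the slides elide them), and `AnswerFormatSketch` is a hypothetical helper.

```scala
// Hypothetical member payloads; their real fields are not shown in the slides.
final case class MultipleChoice(choices: Seq[String])
final case class TextEntry(maxLength: Int)

// Same shape as the union classes on the slide: a sealed base with one wrapper per member.
sealed abstract class AnswerFormat
final case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
final case class TextEntryMember(value: TextEntry) extends AnswerFormat

object AnswerFormatSketch {
  // The fully qualified member name used as the JSON key for each union member.
  def memberKey(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(_) => "org.example.MultipleChoice"
    case TextEntryMember(_)      => "org.example.TextEntry"
  }

  // Exhaustive match: removing a case makes the compiler warn about a missing member.
  def describe(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(mc) => s"multiple choice with ${mc.choices.size} choices"
    case TextEntryMember(te)      => s"text entry up to ${te.maxLength} characters"
  }
}
```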

org/example/Fruits.pdsc
{
"type" : "enum",
"name" : "Fruits",
"namespace" : "org.example",
"symbols" : ["APPLE", "BANANA", "ORANGE"]
}
Generated Scala
object Fruits extends Enumeration {
val APPLE = Value
val BANANA = Value
val ORANGE = Value
}
Example
Fruits.APPLE => "APPLE"
Enums
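The mapping between enum symbols and JSON strings can be sketched with the standard scala.Enumeration API. This is a hand-written approximation rather than Courier's generated code: the symbol names are passed to `Value` explicitly, and `FruitsJson` is a hypothetical helper added for illustration.

```scala
import scala.util.Try

// Hand-written enumeration mirroring the schema's symbols.
object Fruits extends Enumeration {
  val APPLE  = Value("APPLE")
  val BANANA = Value("BANANA")
  val ORANGE = Value("ORANGE")
}

object FruitsJson {
  // An enum value is written as a JSON string holding its symbol.
  def write(fruit: Fruits.Value): String = "\"" + fruit.toString + "\""

  // Reading looks the symbol up by name; unknown symbols yield None
  // instead of the NoSuchElementException that withName throws.
  def read(symbol: String): Option[Fruits.Value] =
    Try(Fruits.withName(symbol)).toOption
}
```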

org/example/Timestamp.pdsc
{
"type" : "typeref",
"name" : "Timestamp",
"namespace" : "org.example",
"ref" : "long"
}
Generated Scala
Use Long directly.
Example
1434916256
Basic Typeref

Typerefs are Pegasus’s multi-tool.

org/example/AnswerFormats.pdsc
{
"name": "AnswerFormats", "namespace": "org.example", "type": "typeref",
"ref": [ "MultipleChoice", "TextEntry" ]
}
Scala
sealed abstract class AnswerFormats()
object AnswerFormats {
case class MultipleChoiceMember(v: MultipleChoice) extends AnswerFormats
case class TextEntryMember(v: TextEntry) extends AnswerFormats
}
Example
MultipleChoiceMember(MultipleChoice(…)) =>
{ "org.example.MultipleChoice": { … } }
Naming a Union with a Typeref

org/example/DateTime.pdsc
{
"name": "DateTime", "namespace": "org.example", "type": "typeref",
"ref": "string",
"scala": {
"class": "org.joda.time.DateTime",
"coercerClass": "org.coursera.models.common.DateTimeCoercer"
}
}
Scala
Use org.joda.time.DateTime directly.
Example
Record(createdAt = new org.joda.time.DateTime(…)) =>
{ "createdAt": "2015-06-21T18:24:18Z" }
Custom Bindings with a Typeref
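A coercer converts between the schema-level representation (here a string) and the custom class. The sketch below illustrates the idea only: the `Coercer` trait is a hypothetical stand-in, not the interface that Pegasus or Courier actually define, and java.time.Instant is used in place of org.joda.time.DateTime so the example needs no external dependency.

```scala
import java.time.Instant

// Hypothetical stand-in for a coercer interface; the real one differs.
trait Coercer[T] {
  def toData(value: T): String   // custom class -> schema-level string
  def fromData(data: String): T  // schema-level string -> custom class
}

// Assumed example: ISO-8601 strings bound to java.time.Instant.
object InstantCoercer extends Coercer[Instant] {
  def toData(value: Instant): String = value.toString
  def fromData(data: String): Instant = Instant.parse(data)
}
```

With such a pair of functions in place, application code works with the date type directly while the serialized form stays a plain string such as "2015-06-21T18:24:18Z".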

Pegasus System
Schema system: Schema based validation + custom validators.
Data system: JSON Object and Array equivalent types, support for binding to native types.
Code generators: Java, Scala (via Courier), Swift (in progress), Android Java (planned)
Codecs:
● via Pegasus:
  ○ JSON - Jackson streaming
  ○ PSON - non-standard JSON equivalent binary protocol
  ○ Avro binary - compact binary protocol
● via Courier:
  ○ StringKeyCodec - compatible with our legacy StringKeyFormats
  ○ InlineStringCodec - a new “URL Friendly” JSON compatible format
Hardened and performance optimized at LinkedIn. In large-scale production use for over 3 years.

Courier SBT Plugin
Code generation is integrated into the SBT build.
The plugin caches .pdsc file state and only runs the generator when the files change.

http://coursera.github.io/courier/
