Courier
Joe Betz @ Coursera
Courier is a code generator
{
  "name": "Fortune",
  "namespace": "org.example",
  "type": "record",
  "fields": [
    {
      "name": "message",
      "type": "string"
    }
  ]
}
{ "message": "Today is your lucky day!" }
case class Fortune(message: String)
Pegasus Schema => (generate) => Scala
JSON Data <=> (serialize / deserialize) <=> Scala
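The flow on this slide can be sketched by hand. The block below is a minimal illustration, not Courier's generated output or API: the `FortuneSketch` object, `toJson`, and `fromJson` are hypothetical names, and the parser only understands the single-field shape that `toJson` emits.

```scala
// Hand-written stand-in for the binding a generator would produce.
// All helper names here are hypothetical, not part of Courier.
object FortuneSketch {
  // The data binding for the "Fortune" record schema.
  final case class Fortune(message: String)

  // Escape the characters this sketch handles inside a JSON string.
  private def escape(s: String): String =
    s.flatMap {
      case '"'  => "\\\""
      case '\\' => "\\\\"
      case '\n' => "\\n"
      case c    => c.toString
    }

  // Reverse of escape: a backslash introduces the next character.
  private def unescape(s: String): String = {
    val out = new StringBuilder
    var i = 0
    while (i < s.length) {
      if (s(i) == '\\' && i + 1 < s.length) {
        out.append(s(i + 1) match { case 'n' => '\n'; case c => c })
        i += 2
      } else {
        out.append(s(i))
        i += 1
      }
    }
    out.toString
  }

  // Serialize: case class -> JSON text.
  def toJson(f: Fortune): String =
    "{ \"message\": \"" + escape(f.message) + "\" }"

  // Toy parser that only accepts an object with a single "message" string.
  private val Shape = """\{\s*"message"\s*:\s*"((?:[^"\\]|\\.)*)"\s*\}""".r

  // Deserialize: JSON text -> case class, or None if the shape does not match.
  def fromJson(json: String): Option[Fortune] = json match {
    case Shape(raw) => Some(Fortune(unescape(raw)))
    case _          => None
  }
}
```

A real binding would delegate parsing to a JSON codec; the point here is only that the schema fixes both the Scala type and the JSON shape.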
● Extension of Apache Avro’s schema language built at LinkedIn.
● Designed for natural-looking JSON.
● Rich type system maps well between JSON and type-safe languages like Scala.
● Schema language is machine-readable and easy to extend.
● Tooling and language support.
Pegasus Schema Language
Pegasus Schemas: Avro Schemas + optional record fields + typerefs
Core schema language: records, maps, arrays, unions, enums, primitives
Courier
Why Schemas?
Increase transparency into the structure of our data.
With a common understanding of structure, type-safety can span multiple languages and platforms.
Why Pegasus Schemas?
Pegasus schemas have a rich type system.
Pegasus schemas are machine-readable.
Pegasus schemas are easy to extend.
Pegasus schemas work with multiple data formats.
Pegasus Schema Types
Pegasus Type | Scala Type | Example JSON
int, long, float, double, boolean, string | Int, Long, Float, Double, Boolean, String | 1, 10000000, 3.14, 2.718281, true, "Coursera"
record | case class R(f1: T1, f2: T2, ...) | { "f1": 1, "f2": "Coursera" }
array | A extends IndexedSeq[T] | [1, 2, 3]
map | M extends Map[String, T] | { "key1": 1, "key2": 2 }
union | sealed abstract class U; case class M1(T1) extends U; case class M2(T2) extends U | { "org.example.M1": <T1 Value> }
enum | object E extends Enumeration | "SYMBOL"
* unions and typerefs will be covered in more detail later.
.pdsc
(pegasus data schema)
Records
org/example/Note.pdsc
{
  "name": "Note",
  "namespace": "org.example",
  "doc": "A simple note.",
  "type": "record",
  "fields": [
    { "name": "title", "type": "string" },
    { "name": "body", "type": "Body", "optional": true }
  ]
}
Scala
case class Note(title: String, body: Option[Body])
Examples
Note("reminder", Some(Body(…))) => { "title": "reminder", "body": { … } }
Note("reminder", None) => { "title": "reminder" }
Records
Schema
{ "name": "WithOptionals", "type": "record", …
  "fields": [
    { "name": "o1", "type": "string", "optional": true },
    { "name": "o2", "type": "string", "optional": true, "default": "b" },
    { "name": "o3", "type": "string", "optional": true, "defaultNone": true }
  ]
}
Generated Scala
case class WithOptionals(
  o1: Option[String],
  o2: Option[String] = Some("b"),
  o3: Option[String] = None)
Examples
WithOptionals(o1 = None) => { "o2": "b" }
WithOptionals(o1 = Some("a")) => { "o1": "a", "o2": "b" }
WithOptionals(o1 = None, o2 = None) => {}
WithOptionals(o1 = None, o3 = Some("c")) => { "o2": "b", "o3": "c" }
Optional Fields and Defaults
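To make the interaction between `Option` fields and defaults concrete, here is a small hand-rolled sketch. It reuses the `WithOptionals` case class from the slide, while `OptionalsSketch` and `toJson` are hypothetical helpers written for illustration (not Courier's serializer), and field values are assumed to need no JSON escaping.

```scala
object OptionalsSketch {
  // Same shape as the generated class shown on the slide.
  final case class WithOptionals(
    o1: Option[String],
    o2: Option[String] = Some("b"),
    o3: Option[String] = None)

  // Render only the fields that are present; a None field is left out of the JSON.
  // Assumption: values contain no characters that require JSON escaping.
  def toJson(w: WithOptionals): String = {
    val present = Seq("o1" -> w.o1, "o2" -> w.o2, "o3" -> w.o3).collect {
      case (name, Some(value)) => "\"" + name + "\": \"" + value + "\""
    }
    if (present.isEmpty) "{}" else present.mkString("{ ", ", ", " }")
  }
}
```

Because `o2` defaults to `Some("b")`, it appears in the output unless the caller passes `o2 = None` explicitly, whereas `o1` and `o3` appear only when a value is supplied.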
Collection Types
org/example/NotePad.pdsc
{
  "name": "NotePad", "type": "record", …
  "fields": [
    {
      "name": "notes",
      "type": { "type": "array", "items": "Note" }
    }
  ]
}
Generated Scala
case class NotePad(notes: NoteArray)
class NoteArray extends IndexedSeq[Note]
Example
NotePad(notes = NoteArray(Note(…), …)) =>
{ "notes": [ { "title": "…" }, … ] }
Arrays
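A typed array wrapper of this kind can be approximated by hand. The sketch below is an assumption about the general shape, not Courier's generated code: `ArraySketch` is a hypothetical container object, and `Note` is reduced to a single `title` field for brevity.

```scala
object ArraySketch {
  // Simplified stand-in for the Note record.
  final case class Note(title: String)

  // A named wrapper that behaves like an indexed sequence of Note values.
  // IndexedSeq only requires apply and length to be implemented.
  final class NoteArray private (items: Vector[Note]) extends IndexedSeq[Note] {
    def apply(index: Int): Note = items(index)
    def length: Int = items.length
  }

  object NoteArray {
    // Varargs factory, mirroring NoteArray(Note(…), …) from the slide.
    def apply(notes: Note*): NoteArray = new NoteArray(notes.toVector)
  }

  final case class NotePad(notes: NoteArray)
}
```

Since `NoteArray` is an `IndexedSeq[Note]`, the usual collection operations such as `map`, `filter`, and indexing work on it directly.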
org/example/TermWeights.pdsc
{
  "name": "TermWeights", "type": "record", "namespace": "org.example",
  "fields": [
    {
      "name": "byTerm",
      "type": { "type": "map", "keys": "string", "values": "float" }
    }
  ]
}
Generated Scala
case class TermWeights(byTerm: FloatMap)
class FloatMap extends Map[String, Float]
Example
TermWeights(byTerm = FloatMap("ninja" -> 0.25f, "pirate" -> 0.75f)) =>
{ "byTerm": { "ninja": 0.25, "pirate": 0.75 } }
Maps
Unions
Unions are the closest thing Pegasus has to a polymorphic type.
Unlike true subtype polymorphism, unions have a “sealed” set of member types.
org/example/Question.pdsc
{
  "name": "Question", "type": "record", …
  "fields": [
    { "name": "answerFormat", "type": [ "MultipleChoice", "TextEntry" ] }
  ]
}
Scala
case class Question(answerFormat: Question.AnswerFormat)
object Question {
  sealed abstract class AnswerFormat()
  case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
  case class TextEntryMember(value: TextEntry) extends AnswerFormat
}
Example
Question(answerFormat = MultipleChoiceMember(MultipleChoice(…))) =>
  { "answerFormat": { "org.example.MultipleChoice": { … } } }
Unions
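The practical benefit of a sealed member set is exhaustive pattern matching. The following sketch illustrates that under stated assumptions: the fields of `MultipleChoice` and `TextEntry` are invented for the example (the slides elide them), and `memberKey` and `describe` are hypothetical helpers.

```scala
object UnionSketch {
  // Hypothetical member record shapes; the slides do not show their fields.
  final case class MultipleChoice(choices: List[String])
  final case class TextEntry(maxLength: Int)

  // The union: a closed set of wrapper classes, one per member type.
  sealed abstract class AnswerFormat
  final case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
  final case class TextEntryMember(value: TextEntry) extends AnswerFormat

  // The key that tags the member in the union's JSON form.
  def memberKey(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(_) => "org.example.MultipleChoice"
    case TextEntryMember(_)      => "org.example.TextEntry"
  }

  // Because AnswerFormat is sealed, the compiler can warn if a member is missed here.
  def describe(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(mc) => s"pick one of ${mc.choices.size} choices"
    case TextEntryMember(te)      => s"free text, up to ${te.maxLength} characters"
  }
}
```

Adding a third member type to the union would make every non-exhaustive `match` a compile-time warning rather than a runtime surprise.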
Enums
org/example/Fruits.pdsc
{
  "type" : "enum",
  "name" : "Fruits",
  "namespace" : "org.example",
  "symbols" : ["APPLE", "BANANA", "ORANGE"]
}
Generated Scala
object Fruits extends Enumeration {
  val APPLE = Value
  val BANANA = Value
  val ORANGE = Value
}
Example
Fruits.APPLE => "APPLE"
Enums
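As a rough illustration of how an `Enumeration` maps to and from its JSON symbol, consider the sketch below. `EnumSketch`, `toJson`, and `fromSymbol` are hypothetical names; the symbol names are passed to `Value` explicitly so that `toString` does not rely on reflection.

```scala
object EnumSketch {
  object Fruits extends Enumeration {
    // Explicit names keep toString stable regardless of how the vals are declared.
    val APPLE  = Value("APPLE")
    val BANANA = Value("BANANA")
    val ORANGE = Value("ORANGE")
  }

  // An enum value serializes to its symbol as a JSON string.
  def toJson(fruit: Fruits.Value): String = "\"" + fruit.toString + "\""

  // Look a symbol up safely; unknown symbols yield None instead of throwing.
  def fromSymbol(symbol: String): Option[Fruits.Value] =
    Fruits.values.find(_.toString == symbol)
}
```

`Enumeration.withName` would also resolve a symbol, but it throws on unknown input, which is why the sketch uses `find`.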
Typerefs
org/example/Timestamp.pdsc
{
  "type" : "typeref",
  "name" : "Timestamp",
  "namespace" : "org.example",
  "ref" : "long"
}
Generated Scala
Use Long directly.
Example
1434916256
Basic Typeref
Typerefs are Pegasus’s multi-tool.
org/example/AnswerFormats.pdsc
{
  "name": "AnswerFormats", "namespace": "org.example", "type": "typeref",
  "ref": [ "MultipleChoice", "TextEntry" ]
}
Scala
sealed abstract class AnswerFormats()
object AnswerFormats {
  case class MultipleChoiceMember(v: MultipleChoice) extends AnswerFormats
  case class TextEntryMember(v: TextEntry) extends AnswerFormats
}
Example
MultipleChoiceMember(MultipleChoice(…)) =>
  { "org.example.MultipleChoice": { … } }
Naming a Union with a Typeref
org/example/DateTime.pdsc
{
  "name": "DateTime", "namespace": "org.example", "type": "typeref",
  "ref": "string",
  "scala": {
    "class": "org.joda.time.DateTime",
    "coercerClass": "org.coursera.models.common.DateTimeCoercer"
  }
}
Scala
Use org.joda.time.DateTime directly.
Example
Record(createdAt = new org.joda.time.DateTime(…)) =>
{ "createdAt": "2015-06-21T18:24:18Z" }
Custom Bindings with a Typeref
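The idea behind a coercer is a pair of conversions between the schema-level type (here a string) and the custom class. The sketch below shows that idea only: the `Coercer` trait is a hypothetical interface, not Courier's or Pegasus's actual coercer API, and `java.time.Instant` is swapped in for Joda's `DateTime` so the example has no external dependencies.

```scala
import java.time.Instant

object CoercerSketch {
  // Hypothetical interface: converts between the stored form and a custom class.
  trait Coercer[Raw, Custom] {
    def coerceInput(value: Custom): Raw   // custom class -> stored form
    def coerceOutput(raw: Raw): Custom    // stored form -> custom class
  }

  // Stores an Instant as an ISO-8601 string, the form shown in the slide's JSON.
  object InstantCoercer extends Coercer[String, Instant] {
    def coerceInput(value: Instant): String = value.toString
    def coerceOutput(raw: String): Instant = Instant.parse(raw)
  }
}
```

With a coercer in place, application code works with the rich date type while the serialized data stays a plain string.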
Pegasus System
Schema system: schema-based validation + custom validators.
Data system: JSON Object and Array equivalent types, support for binding to native types.
Code generators: Java, Scala (via Courier), Swift (in progress), Android Java (planned)
Codecs:
● via Pegasus:
  ○ JSON - Jackson streaming
  ○ PSON - non-standard, JSON-equivalent binary protocol
  ○ Avro binary - compact binary protocol
● via Courier:
  ○ StringKeyCodec - compatible with our legacy StringKeyFormats
  ○ InlineStringCodec - a new “URL Friendly” JSON-compatible format
Hardened and performance-optimized at LinkedIn. In large-scale production use for over 3 years.
Development with Courier
Courier SBT Plugin
Code generation integrated into the SBT build.
Caches .pdsc file state and only runs the generator when the files change.
Courier SBT Plugin
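The slides do not say how the plugin tracks .pdsc state, so the following is only a sketch of one plausible approach: compare a per-file fingerprint (assumed here to be the last-modified time) between runs. `RegenSketch`, `Fingerprints`, `changedFiles`, and `needsRegeneration` are all hypothetical names.

```scala
object RegenSketch {
  // Assumption: a fingerprint maps each schema file path to its last-modified time.
  type Fingerprints = Map[String, Long]

  // Files that were added, removed, or modified since the previous run.
  def changedFiles(previous: Fingerprints, current: Fingerprints): Set[String] =
    (previous.keySet ++ current.keySet).filter { path =>
      previous.get(path) != current.get(path)
    }

  // The generator only needs to run when at least one schema file differs.
  def needsRegeneration(previous: Fingerprints, current: Fingerprints): Boolean =
    changedFiles(previous, current).nonEmpty
}
```

A content hash could replace the timestamp if builds touch files without changing them.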
http://coursera.github.io/courier/

Introduction to Courier

  • 1. CourierJoe Betz @ Coursera
  • 2. Courier is a code generator { "name": "Fortune", "namespace": "org.example", "type": "record", "fields": [ { "name": "message", "type": "string" } ] } { "message": "Today is your lucky day!" } case class Fortune(message: String) JSON Data ScalaPegasus Schema generate serialize / deserialize
  • 3.
  • 4. ● Extension of Apache Avro’s schema language built at Linkedin. ● Designed for natural looking JSON. ● Rich type system maps well between JSON and type-safe languages like Scala. ● Schema language is machine readable and easy to extend. ● Tooling and language support. Pegasus Schema Language Pegasus Schemas Avro Schemas +optional record fields, +typerefs Core schema language: records, maps, arrays, unions, enums, primitives Courier
  • 6. Increase transparency into the structure of our data.
  • 7.
  • 8. With a common understanding of structure type-safety can span multiple languages and platforms.
  • 10. Pegasus schemas have a rich type system.
  • 13. Pegasus schemas work with multiple data formats.
  • 14. Pegasus Schema Types Pegasus Type Scala Type Example JSON int, long, float, double, boolean, string Int, Long, Float, Double, Boolean, String 1, 10000000, 3.14, 2.718281, true, “Coursera” record case class R(f1: T1, f1: T2, ...) { “f1”: 1, “f2”: “Coursera” } array A extends IndexedSeq[T] [1, 2, 3] map M extends Map[String, T] { “key1”: 1, “key2”: 2 } union sealed abstract class U case class M1(T1) extends U case class M2(T2) extends U { “org.example.M1”: <T1 Value> } enum object E extend Enumeration “SYMBOL” * unions and typerefs will be covered in more detail later.
  • 17. org/example/Note.pdsc { "name": "Note", "namespace": "org.example", "doc": "A simple note.", "type": "record", "fields": [ { "name": "title", "type": "string" }, { "name": "body", "type": "Body", "optional": true } ] } Scala case class Note(title: String, body: Option[Body]) Examples Note("reminder", Some(Body(…))) => { "title": "reminder", "body": { … } } Note("reminder", None) => { "title": "reminder" } Records
  • 18. Schema { "name": "WithOptionals", "type": "record", … "fields": [ { "name": "o1", "type": "string", "optional": true }, { "name": "o2", "type": "string", "optional": true, "default": "b" } { "name": "o3", "type": "string", "optional": true, "defaultNone": true } ] Generated Scala case class WithOptionals( o1: Option[String], o2: Option[String] = Some("b"), o3: Option[String] = None) Examples WithOptionals(o1 = None) => { "o2": "b" } WithOptionals(o1 = Some("a")) => { "o1": "a", "o2": "b" } WithOptionals(o1 = None, o2 = None) => {} WithOptionals(o1 = None, o3 = Some("c")) => { "o3": "c" } Optional Fields and Defaults
  • 20. org/example/NotePad.pdsc { "name": "NotePad", "type": "record", … "fields": [ { "name": "notes", "type": { "type": "array", "items": "Note" } } ] } Generated Scala case class NotePad(notes: NoteArray) class NoteArray extends IndexedSeq[Note] Example NotePad(notes = NoteArray(Note(…), …)) => { "notes": [ { "title": "…" }, … ] } Arrays
  • 21. org/example/TermWeights.pdsc { "name": "TermWeights", "type": "record", "namespace": "org.example", "fields": [ { "name": "byTerm", "type": { "type": "map", "keys": "string", "values": "float" } } ] } Generated Scala case class TermWeights(byTerm: FloatMap) class FloatMap extends Map[String, Float] Example TermWeights(byTerm = FloatMap("ninja" -> 0.25f, "pirate" -> 0.75f)) => { "byTerm": { "ninja": 0.25, "pirate": 0.75 } } Maps
  • 23. Unions are the closest thing Pegasus has to a polymorphic type.
  • 24. Unlike true subtype polymorphism, unions have a “sealed” set of member types.
  • 25. org/example/Question.pdsc { "name": "Question", "type": "record", … "fields": [ { "name": "answerFormat", "type": [ "MultipleChoice", "TextEntry" ] } ] } Scala case class Question(answerFormat: Question.AnswerFormat) object Question { sealed abstract class AnswerFormat() case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat case class TextEntryMember(value: TextEntry) extends AnswerFormat } Example Question(answerFormat = MultipleChoiceMember(MultipleChoice(…))) => Unions
  • 26. Enums
  • 27. org/example/Fruits.pdsc { "type" : "enum", "name" : "Fruits", "namespace" : "org.example", "symbols" : ["APPLE", "BANANA", "ORANGE"] } Generated Scala object Fruits extends Enumeration { val APPLE val BANANA val ORANGE } Example Fruits.APPLE => "APPLE" Enums
  • 29. org/example/Timestamp.pdsc { "type" : "typeref", "name" : "Timestamp", "namespace" : "org.example", "ref" : "long" } Generated Scala Use Long directly. Example 1434916256 Basic Typeref
  • 31. org/example/AnswerFormats.pdsc { "name": "AnswerFormats", "namespace": "org.example", "type": "typeref", "ref": [ "MultipleChoice", "TextEntry" ] } Scala sealed abstract class AnswerFormats() object AnswerFormats { case class MultipleChoiceMember(v: MultipleChoice) extends AnswerFormats case class TextEntryMember(v: TextEntry) extends AnswerFormats } Example MultipleChoiceMember(MultipleChoice(…)) => { "org.example.MultipleChoice": { … } } Naming a Union with a Typeref
  • 32. org/example/DateTime.pdsc { "name": "DateTime", "namespace": "org.example", "type": "typeref", "ref": "string", "scala": { "class": "org.joda.time.DateTime", "coercerClass": "org.coursera.models.common.DateTimeCoercer" } } Scala Use org.joda.time.DateTime directly. Example Record(createdAt = new org.joda.time.DateTime(…)) => { "createdAt": "2015-06-21T18:24:18Z" } Custom Bindings with a Typeref
  • 33. Pegasus System Schema system: Schema-based validation + custom validators. Data system: JSON Object and Array equivalent types, support for binding to native types. Code generators: Java, Scala (via Courier), Swift (in progress), Android Java (planned) Codecs: ● via Pegasus: o JSON - Jackson streaming o PSON - non-standard JSON equivalent binary protocol o Avro binary - compact binary protocol ● via Courier: o StringKeyCodec - compatible with our legacy StringKeyFormats o InlineStringCodec - a new “URL Friendly” JSON compatible format Hardened and performance optimized at LinkedIn. In large scale production use for over 3 years.
  • 35. Courier SBT Plugin Code generation integrated into SBT build. Caches .pdsc file state, only runs generator when they change.

Editor's Notes

  1. Hi, my name is “Joe”. I work on the infrastructure team and I’m here to talk to you about a new project called “Courier”.
  2. Courier is a code generator for Scala. This slide shows what it essentially does. If you only remember one slide from this presentation, remember this one. Courier takes Pegasus schema files as inputs. It generates Scala classes. These classes serialize/deserialize to JSON, much like we do today with our Scala case classes and Play! JSON’s Formats. Two things I should note about this slide. First, for Play JSON’s Formats, we have a utility called AutoSchema that is able to generate a schema from a Scala class, which is the opposite direction from Courier’s generation. However, AutoSchema is deeply flawed. It could only do this correctly for some of our case classes, and it would require significant changes both to AutoSchema and to our JsonFormats to get it to work properly. Still, we could have chosen the approach where we generate schemas from Scala classes instead of the other way around, and I’ll talk a bit more later about why we prefer generating code from schemas. Second, while JSON is shown here, Courier supports multiple message formats, not just JSON, and we’ll be looking at those in more detail shortly. Okay, at this point, some of you might be wondering, ...
  3. What in the world are these Pegasus schemas? I’ve never heard of this Pegasus thing!
  4. Pegasus is a schema language and data system used by the LinkedIn Rest.li open source project. Unless you’ve used Rest.li, odds are you’ve never heard of Pegasus. The main thing to know is that Pegasus’s schema language is just an extension of Avro’s schema language. In fact, Pegasus schemas are almost identical to Avro schemas with a few specific improvements, the most important one being direct support for optional fields. And if you’ve heard of Avro, it’s most likely you know of it as a binary protocol, usually associated with Hadoop. We won’t be looking at the binary protocol much as we’re more focused on JSON, but we will be looking at the schema language Avro uses. We’ll talk more about the specific advantages of the Pegasus schema language shortly.
  5. Before we dive into Pegasus in more detail, let’s look at why we care about schemas.
  6. If you’re a Web or Mobile developer, you should not need to study our Scala code to figure out the JSON structure of our REST resources; you should not need to guess the structure from sample responses; and you should not need to ask the developer who wrote the resource to explain the structure of the data it returns. Odds are, if you’re working with our current REST resources, you’ve got no choice but to do one of these things. With schemas, you can just open a file, and it tells you the exact structure of the data. And there is no possibility that some custom OFormat has been defined somewhere that manipulates the JSON structure in an unexpected way. Also, from schemas, we should be able to generate documentation like this ...
  7. This is actually a screenshot of our API Explorer. Unfortunately, the current API Explorer uses the broken AutoSchema utility, which often generates these documentation pages incorrectly. This is one it happened to get right. I think. It’s hard to be sure, AutoSchema is not particularly trustworthy. But with a schema language we will be able to generate documentation like this correctly. Documents like this are great when they’re correct. If you want to use a new REST API, you just pull up the API Explorer documentation and you immediately know what fields exist, if they are required or optional, and what their type is. Huge time saver.
  8. Currently, our backend Scala developers benefit from strong type safety and we’ve seen substantial productivity gains, but that type safety only spans service boundaries if both the client and server are written in Scala, even though mobile developers are also using statically typed languages. But if we instead generate bindings from schemas, we can generate bindings for Java and Swift as well, and extend that type safety and all the productivity gains it offers.
  9. Okay, so schemas are great, but why Pegasus Schemas? Why not something else like JSON Schema?
  10. We want a rich type system, richer than what JSON Schema offers. For numerics, JSON Schema offers an “integer” type and a “number” type, which is sort of weird: JSON only has a “number” type, so I’m not sure how the inventors of JSON Schema landed on “integer” and “number” as their numeric types. Pegasus has “int”, “long”, “float” and “double”, which matches up better with how most statically typed languages, not just Scala, define numeric types. JSON Schema is also missing direct support for type variance (no tagged union type like Thrift or Avro, no subtype polymorphism), so there is no natural and type-safe way to model polymorphic types. Pegasus, on the other hand, has a well thought out type system that is based on ADTs (Algebraic Data Types) and maps well to the level of type safety we expect in our Scala code. We’ll talk about this in a bit more detail when we review the Pegasus type system.
  11. Because the schemas themselves are written in JSON, we can read them in just about any programming language, because all you need is a JSON parser. This makes it easy to build tools like code generators, schema-based data validators, and documentation tools like our API explorer.
  12. Being JSON, we can add additional fields to Pegasus schemas for things we need, and schema readers that are unaware of the fields simply ignore them. We’ve already done this a couple of times with Courier; I’ll point those out as we go. If we had used a grammar-based schema language like Thrift or protobuf have, we would be unable to do this.
  13. With Play JSON, if we need to support an additional data format, we must add more implicit converters. With Pegasus, we define our schemas once, and then we can use the generated classes with any codecs that exist. If in the future we decide we need a new codec, we can add it and use it immediately with all Courier generated classes.
  14. This is a high-level overview of the Pegasus types. I’ll look at each one individually, so you don’t need to worry about understanding this all now. The left column shows the major types. The primitive types map directly to Scala types. All four numeric types map to JSON’s “number” type.
  15. Pegasus schemas have the .pdsc file extension.
  16. Let’s look at some examples, starting with Record types.
  17. In each example, the first section is the pegasus schema. The second section shows basically what the generated class looks like. The actual generated classes are a bit more sophisticated but they will be consistent with the declaration shown here. The third section shows a few examples of Scala code and the equivalent JSON. This particular example shows a simple record with two fields. The second field is optional.
  18. Here’s another record. This shows how optional and default fields work. Note how the defaults in the schema correspond to the constructor defaults in the generated Scala class. The “defaultNone” schema property, highlighted in green, is a custom property added by Courier; it is not part of the Pegasus schema language. Being able to add properties like this is an example of why we prefer having a schema language that is written in JSON.
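The correspondence between schema defaults and constructor defaults can be sketched in plain Scala. This is a hand-written approximation, not Courier's actual generated code: the `toJson` helper is hypothetical, does no string escaping, and only mimics how absent optional fields are left out of the JSON.

```scala
// Hand-written stand-in for the generated class; real Courier output is more elaborate.
case class WithOptionals(
    o1: Option[String],              // "optional": true with no default, so the caller must pass a value
    o2: Option[String] = Some("b"),  // "default": "b"
    o3: Option[String] = None) {     // "defaultNone": true (Courier-specific property)

  // Hypothetical helper (not part of Courier): render only the fields that are present.
  def toJson: String = {
    val fields = Seq("o1" -> o1, "o2" -> o2, "o3" -> o3).collect {
      case (name, Some(value)) => "\"" + name + "\": \"" + value + "\""
    }
    if (fields.isEmpty) "{}" else fields.mkString("{ ", ", ", " }")
  }
}

object WithOptionalsDemo {
  def main(args: Array[String]): Unit = {
    println(WithOptionals(o1 = None).toJson)            // o2 falls back to its default
    println(WithOptionals(o1 = Some("a")).toJson)       // o1 present, o2 defaulted
    println(WithOptionals(o1 = None, o2 = None).toJson) // default explicitly overridden with None
  }
}
```

Passing `o2 = None` explicitly overrides the default, which is why the third call renders an empty object.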
  19. Okay, moving on to arrays and maps.
  20. Here’s an array. Here, the type definition highlighted in green is an array. The rest is a record type that contains the array. In Pegasus, arrays are usually defined “inline” like this. It is possible to define an array as a top level type, and I’ll show that later. But for now we’ll look at this inline array. Courier generates a class for the array, highlighted in green, which extends IndexedSeq and has all the convenience methods one expects of a Scala seq type: map, filter, scan, fold, and so on. Note that we do not use generic collection types for all arrays; instead we generate a custom array class for each contained type. We do this so that we can attach some schema related information to the array type, which is needed for validation and some other Courier internals. This doesn’t end up being a problem in practice. It’s trivial to convert from Seq, List, etc. to the generated classes.
  21. Here’s a map. Currently all maps are string keyed, but we plan to add a feature to Courier called “Typed Map Keys”, which will allow key types to be defined as well, similar to how we define the type of the values here. I’ll show how we will encode the map keys later on. Just like with arrays, we generate a map class for each contained type.
  22. Pegasus unions are tagged unions, a.k.a. disjoint types. For those of you familiar with ADTs (Algebraic Data Types), unions are your sum type.
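A per-element-type array class of the kind described here might look roughly like the following. This is a minimal sketch, not Courier's generated code: the schema information the real class carries is omitted, and the `Note` class is reduced to a single field for brevity.

```scala
case class Note(title: String)

// Hand-written approximation of a generated array class for one element type.
final class NoteArray(private val items: Vector[Note]) extends IndexedSeq[Note] {
  // IndexedSeq only requires these two members; map, filter, fold, etc. are inherited.
  override def apply(index: Int): Note = items(index)
  override def length: Int = items.length
}

object NoteArray {
  // Varargs factory, so an existing Seq or List converts with `: _*`.
  def apply(notes: Note*): NoteArray = new NoteArray(notes.toVector)
}

object NoteArrayDemo {
  def main(args: Array[String]): Unit = {
    val fromList = NoteArray(List(Note("reminder"), Note("groceries")): _*)
    val titles = fromList.map(_.title) // standard collection method inherited from IndexedSeq
    println(titles.mkString(", "))
  }
}
```

In this sketch `map` returns a plain `IndexedSeq`, so results would need to be wrapped again to get back a `NoteArray`.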
  23. … There is no notion of sub-types. But type hierarchies can be mapped to unions quite easily.
  24. … this is a bit of a subtle point. The notion here is that, with a union, all the member types are declared directly in the union declaration, so any code written to process the union is guaranteed to know what all the possible types are. This allows, for example, for a compiler to check if the cases of a pattern matching expression against a union are exhaustive or not. Those of you coding in Scala regularly should feel right at home with this concept.
  25. Okay, here’s an example union. Unions are one of the most complex types in Courier, so we’ll spend a bit of extra time on this slide. Here the union type declaration is highlighted in green. Again, it’s inside a record. Note the structure. Unions are declared using a JSON array. Each item in the array is a union member type. If you look at the generated Scala, you’ll see a sealed AnswerFormat class inside the record class’s companion object, and under that a case class for each of the union members. In Scala, we have to “box” the member types. For example, “MultipleChoiceMember” boxes “MultipleChoice”. We do this because Scala does not yet directly support unions. Support for unions in Scala has been mentioned by Odersky in a recent talk (I believe he referred to them as “disjoint types”), so if they are added to Scala, we won’t need to box the member types like we do here. How unions are represented in JSON is worth looking at more closely. In the example, highlighted in red is a Pegasus type name. This is the union tag. It indicates that this JSON data contains a “MultipleChoice” answer format. If instead the answerFormat had been a TextEntry, this would be “org.example.TextEntry”. This is a union, so it’s either/or. Since this union has two member types, it can be either MultipleChoice or TextEntry. This is the one feature of Pegasus where the JSON looks “un-natural”. But it is also flexible and clear. Unions can contain all possible member types, including primitives, and the tag used for each type makes it very clear what the member data is. We’ve spoken with a few JavaScript experts here at Coursera and written some sample code to make sure that we’ll be able to bind to this union format from JavaScript in a reasonable way.
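The boxing and the union tag can be illustrated with a small hand-written hierarchy. This is a sketch only: the fields of `MultipleChoice` and `TextEntry` are placeholders invented for the example, and the `unionTag` function is a hypothetical helper, not something Courier exposes.

```scala
// Placeholder member types; their fields are assumptions made for this sketch.
case class MultipleChoice(choices: List[String])
case class TextEntry(maxLength: Int)

// The sealed parent plus one boxing case class per union member.
sealed abstract class AnswerFormat
case class MultipleChoiceMember(value: MultipleChoice) extends AnswerFormat
case class TextEntryMember(value: TextEntry) extends AnswerFormat

object UnionSketch {
  // Hypothetical helper: the union tag is the fully qualified Pegasus type name of the member.
  // Because AnswerFormat is sealed, the compiler can warn if a member is missing from this match.
  def unionTag(format: AnswerFormat): String = format match {
    case MultipleChoiceMember(_) => "org.example.MultipleChoice"
    case TextEntryMember(_)      => "org.example.TextEntry"
  }

  def main(args: Array[String]): Unit = {
    println(unionTag(MultipleChoiceMember(MultipleChoice(List("a", "b")))))
    println(unionTag(TextEntryMember(TextEntry(140))))
  }
}
```

Removing one of the two cases from the match would produce a non-exhaustive match warning at compile time, which is the practical benefit of the “sealed” member set.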
  26. I like to think of enums as degenerate unions. In pegasus they are a completely separate type, which is probably a good thing. They map to JSON strings. They’re pretty simple.
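Since the slide shows the generated enum only schematically, here is a minimal compilable version using the standard library's `scala.Enumeration`. The string conversion in the demo is written by hand to illustrate the mapping to JSON strings; it is not Courier's serialization code.

```scala
// Compilable form of the enum from the Fruits.pdsc example.
object Fruits extends Enumeration {
  val APPLE, BANANA, ORANGE = Value
}

object FruitsDemo {
  def main(args: Array[String]): Unit = {
    // Serialize: the symbol name becomes a JSON string.
    val json = "\"" + Fruits.APPLE.toString + "\""
    // Deserialize: look the symbol up by name (throws NoSuchElementException for unknown names).
    val parsed = Fruits.withName("BANANA")
    println(json + " " + parsed)
  }
}
```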
  27. We’ve covered all the base types, but there is one language feature left to go over: typerefs. Typerefs do not exist in Avro; they are added by Pegasus.
  28. Here’s a basic typeref. It just defines an alias to the long type called “Timestamp”. This is just the most lightweight possible alias to an existing type. It really doesn’t do anything other than introduce a new name for the referenced type. In Scala and JSON, a “Timestamp” is handled exactly like a long is. We will rarely if ever use basic typerefs just to provide lightweight aliases.
  29. While basic typerefs are not all that useful, Pegasus uses typerefs to activate other, more powerful features that are missing from Avro.
  30. Remember how we defined all unions, maps and arrays as inline types inside some other type? This is because Avro does not support defining unions, maps or arrays as top level types. Typerefs allow us to get around this. We can define a new typeref, and set the “ref” type to a union. Courier will then generate a top level class for the union. We can do the same thing for maps or arrays if we want.
  31. This is the most powerful feature that typerefs enable. With a typeref we can bind a Pegasus type to any Scala class we want. When we do this, Courier will not generate a class, but instead will use the existing class that we specify. In order for this to work, we must define a separate “coercer” class that converts the Pegasus type (a string here) into the Scala class. We expect custom type bindings to be used heavily for core types like CourseId and UUID. For arity-1 case classes we plan to add a feature to Courier to automatically handle the coercion, so a coercerClass will not need to be defined for simple “AnyVal” type case classes.
  32. In addition to the schema language, Pegasus provides a well-engineered Java implementation that we can take advantage of. One feature that we get “out of the gate” is schema-based data validation, as well as the ability to write custom validators. We also get Java and Scala code generators. Any Android developers might be interested to know that the Java generator is available via a Gradle plugin. LinkedIn is working on code generators for Swift and planning to write a specialized code generator specifically for Android. I checked with LinkedIn a few weeks ago, and at the time they were not willing to give me a firm date for when these will be available. I was told that the Swift generator was basically “code complete”, but they were still working with the mobile team to improve the generated code to address some specific concerns. I’m happy to let them flesh this stuff out and get something polished when it’s ready! I will continue to check in with the Rest.li team to make sure this is progressing. As I mentioned earlier, Pegasus supports multiple message formats. It calls these codecs. PSON is a binary format that is a bit more compact than JSON and is much cheaper, computationally, to serialize/deserialize. We don’t have any immediate plans to use PSON, but we have considered using it instead of JSON at the storage layer, and could turn it on between high-traffic services to reduce CPU usage if it becomes a bottleneck. Compatibility with Avro may have a number of advantages, as it is a particularly compact binary protocol. Again, we don’t have any immediate plans to use Avro, but could potentially use it at the storage layer. Courier adds two custom codecs for string keys that I’ll discuss in more detail later. At LinkedIn, Pegasus is the primary system for inter-service communication. It’s battle tested and performance tested. A huge amount of engineering effort has gone into productionizing this code and performance optimizing it.
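The role of a coercer can be sketched as a pair of conversions between the underlying Pegasus type and the custom Scala class. Everything named below is an assumption made for illustration: the `StringCoercer` trait and its method names are hypothetical (Courier's actual coercer interface is not shown in this talk), and `java.time.Instant` stands in for `org.joda.time.DateTime` so the example needs no external library.

```scala
import java.time.Instant

// Hypothetical interface for this sketch; not Courier's real coercer API.
trait StringCoercer[T] {
  def toPegasus(custom: T): String // custom Scala type -> underlying Pegasus string
  def fromPegasus(raw: String): T  // underlying Pegasus string -> custom Scala type
}

// Binds an ISO-8601 string in the data to an Instant in Scala code.
object InstantCoercer extends StringCoercer[Instant] {
  override def toPegasus(custom: Instant): String = custom.toString
  override def fromPegasus(raw: String): Instant = Instant.parse(raw)
}

object CoercerDemo {
  def main(args: Array[String]): Unit = {
    val createdAt = InstantCoercer.fromPegasus("2015-06-21T18:24:18Z")
    println(InstantCoercer.toPegasus(createdAt)) // round-trips to the same string
  }
}
```

The design point is that the schema only ever sees a string, while application code only ever sees the richer type; the coercer is the single place where the two meet.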
  33. Okay, so how does Courier integrate with SBT?
  34. Courier has been part of the “models” project in infra-services for over a month. The code generator runs before the compile task. You don’t need to change your development workflow. Courier will automatically generate classes as needed. If you make a change to a .pdsc, just run `compile`. If you want, you can run SBT `~compile`, which will watch for changes to .pdsc files and run Courier automatically. Courier emits contextual error messages like the one shown here if there are any errors in schemas.
  35. The Courier plugin is integrated with Play! Error messages, like the missing closing quote on line 7 here, will appear in the browser when using Play!