Go at Uber
Prashant Varanasi, Senior Engineer, RPC
April 26, 2016
● Previously at Google — my first production experience with Go
● At Uber, I work on the RPC team
○ Support the Go client library for TChannel, our open source RPC framework
○ Building out the new fleetwide routing and load balancing sidecar
● One of the core reviewers for internal Go libraries
● Maintainer of the internal Go build infrastructure, go-build
About me
Java-free for 2 years
● Traditionally used Python and Node.js
○ Node.js is popular in our marketplace
and dispatch services
○ Python is popular for business logic
as well as data analysis
● Almost all the business logic was
behind a monolithic Python API server
About Uber
From a large monolithic service...
The Uber Monolith
Users
Products
Background
Checks
Trips
Cities
Vehicles
Payments
Documents
Promos
● As Uber grew (in features and
engineers), continuous integration
turned into a liability as deploying the
codebase meant deploying everything
● We followed the lead of other hypergrowth
companies (e.g., Twitter, Netflix)
and broke up the monolith into many
microservices
● This allowed more flexibility in
languages
About Uber
...to over one thousand microservices
Users
Products
Trips
Cities
Communication
between services
uses HTTP/JSON or
TChannel/Thrift
● In 2014, Aiden Scandella, one of our early
engineers, experimented with Go
○ Deployed a sandbox marketplace for the Uber
External API
● By early 2015, Go was used across the
company:
○ Data Engineering: A cross-cluster query cache
○ Marketplace: The Geofences service
● Go was selected for the following characteristics:
○ Simple language, high developer productivity
○ Strong concurrency and parallelism built in
Introduction of Go
Cross-language microservices
Building up the ecosystem
● No standardized support for Go (build, lint, coverage) in our internal
infrastructure
○ Custom Makefiles integrated with go-junit-report and gocov
● Dependency management: used godep
○ Builds passed locally but not on build machines due to missing deps
○ Vendored dependencies hard to manage and review
● Lack of libraries for internal infrastructure
○ E.g., Kafka logging, dynamic configuration, background tasks
Making Go a first-class citizen
go-build
Old Way
Build Infrastructure
From copy+pasted Makefiles and shell scripts to a common submodule
● Before the GO15VENDOREXPERIMENT, we started with godep
○ Vendored dependencies in every repo
○ Even internal dependencies were vendored as godep restore was flaky
○ Noisy code reviews, hard to update dependencies
● The glide migration
○ Users can choose to check in their dependencies in the vendor folder
○ Or check in glide.yaml and glide.lock and have dependencies
installed before build
○ Use an internal GitHub mirror to avoid outages affecting internal builds
Dependency Management
From godep to glide
● Base libraries used by almost all services: config, metrics, logging, tracing,
RPC
● Utility and extension libraries:
○ Transport helpers, JSON + HTTP, TChannel + Thrift
○ Backoff, LRU cache, flags, worker pools
● Libraries for internal infrastructure built up over time
○ Storage (Schemaless, an internal key-value store built on MySQL)
○ Translations, Avro-encoded logging, dynamic config, experiments
Libraries for internal infrastructure
Monorepo of Go libraries
● Now have over a hundred services written in Go
● Two years of production experience
○ Integrated runtime metrics (#goroutines, GC stats)
○ Easy profiling of services running in production
○ Investigated and fixed file descriptor leaks in open
source libraries
○ Even found a compiler bug that led to stack corruption
● Strong library and infrastructure support
● Lots of resources: documentation, classes, mailing list
● Now the recommended language for new services
Experience with Go
Writing code is fun again
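The runtime metrics mentioned above (#goroutines, GC stats) can be read straight from the standard `runtime` package. What follows is a minimal sketch, not Uber's internal metrics library: the `RuntimeStats` type and the `Snapshot` function are hypothetical names chosen for illustration, and a real service would report these values to its metrics backend on a timer instead of printing them.

```go
package main

import (
	"fmt"
	"runtime"
)

// RuntimeStats is a hypothetical snapshot type used only for this sketch.
type RuntimeStats struct {
	Goroutines   int    // number of live goroutines
	NumGC        uint32 // completed GC cycles
	HeapAlloc    uint64 // bytes of allocated heap objects
	PauseTotalNs uint64 // cumulative GC stop-the-world pause time
}

// Snapshot reads the current values from the Go runtime.
func Snapshot() RuntimeStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return RuntimeStats{
		Goroutines:   runtime.NumGoroutine(),
		NumGC:        m.NumGC,
		HeapAlloc:    m.HeapAlloc,
		PauseTotalNs: m.PauseTotalNs,
	}
}

func main() {
	s := Snapshot()
	fmt.Printf("goroutines=%d gc_cycles=%d heap_alloc_bytes=%d gc_pause_total_ns=%d\n",
		s.Goroutines, s.NumGC, s.HeapAlloc, s.PauseTotalNs)
}
```

Note that `runtime.ReadMemStats` briefly stops the world, so it is usually sampled at a coarse interval.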
What’s built in Go?
Goal: given a lat/lng pair, find the list of geofences this location falls in
● HTTP/JSON interface
● High throughput, low latency (P99 < 100ms)
● No infrastructure support, so project handled everything:
○ Config: Used the standard “encoding/json” package
○ Logging: wrapped the standard “log” package, and reported errors to
Sentry
○ Metrics: Wrapped open source StatsD client
Blog Post
Geofences
One of the earliest Go services at Uber
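The lookup described in the goal (lat/lng in, list of matching geofences out) can be sketched with a ray-casting point-in-polygon test. This is an illustrative assumption, not the algorithm the Geofences service is known to use: the `Point`, `Geofence`, and `FindGeofences` names are hypothetical, coordinates are treated as planar, and a production service would likely add a spatial index in front of the per-polygon check.

```go
package main

import "fmt"

// Point is a location; Lat/Lng are treated as planar coordinates here.
type Point struct{ Lat, Lng float64 }

// Geofence is a named polygon (hypothetical type for this sketch).
type Geofence struct {
	Name     string
	Vertices []Point
}

// Contains applies the even-odd ray-casting rule: count how many polygon
// edges a ray from p crosses; an odd count means p is inside.
func (g Geofence) Contains(p Point) bool {
	inside := false
	n := len(g.Vertices)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := g.Vertices[i], g.Vertices[j]
		if (a.Lat > p.Lat) != (b.Lat > p.Lat) {
			// Longitude at which edge a-b crosses the latitude of p.
			crossLng := (b.Lng-a.Lng)*(p.Lat-a.Lat)/(b.Lat-a.Lat) + a.Lng
			if p.Lng < crossLng {
				inside = !inside
			}
		}
	}
	return inside
}

// FindGeofences returns the names of all geofences containing p.
func FindGeofences(fences []Geofence, p Point) []string {
	var names []string
	for _, f := range fences {
		if f.Contains(p) {
			names = append(names, f.Name)
		}
	}
	return names
}

func main() {
	fences := []Geofence{
		{Name: "downtown", Vertices: []Point{{0, 0}, {0, 10}, {10, 10}, {10, 0}}},
		{Name: "airport", Vertices: []Point{{20, 20}, {20, 30}, {30, 30}, {30, 20}}},
	}
	fmt.Println(FindGeofences(fences, Point{Lat: 5, Lng: 5}))
}
```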
Goal: match riders to drivers, sharding the matching across machines
● Ringpop for sharding the matching by location
● TChannel / Thrift for RPC interface
● Uses our internal libraries for:
○ Config (both static, and dynamic configuration)
○ Logging (includes logging to disk, Kafka, and Sentry)
○ Metrics (reported to M3, our internal Metrics infrastructure)
● Much more infrastructure support for Go services
Geobase
One of the more recent Go services at Uber
● Embeddable application-level sharding:
○ Fault detection: provides membership list of alive
nodes using a variation of SWIM (similar to Serf)
○ Consistent hashing: keys are hashed to a node,
and failures will not create load imbalances or
redistribute every key
○ Forwarding: provides forwarding of HTTP or
TChannel requests
● Used for sharding work, sharded cache, serializing
requests to a resource (in-order)
Ringpop
Scalable, fault-tolerant application-layer sharding. Open source!
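The consistent hashing bullet can be illustrated with a small hash ring. This is a generic textbook sketch under stated assumptions, not Ringpop's implementation or API: the `Ring` type, its methods, the FNV hash, and the replica count are all hypothetical choices. The point it shows is the property named above: when a node leaves, only the keys that node owned move.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// Ring is a hypothetical consistent-hash ring with virtual nodes.
type Ring struct {
	replicas int               // virtual points per node
	hashes   []uint32          // sorted points on the ring
	owners   map[uint32]string // point -> node
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func NewRing(replicas int, nodes ...string) *Ring {
	r := &Ring{replicas: replicas, owners: make(map[uint32]string)}
	for _, n := range nodes {
		r.Add(n)
	}
	return r
}

// Add places the node's virtual points on the ring.
func (r *Ring) Add(node string) {
	for i := 0; i < r.replicas; i++ {
		h := hashKey(fmt.Sprintf("%s#%d", node, i))
		if _, taken := r.owners[h]; taken {
			continue // rare hash collision: keep the existing point
		}
		r.owners[h] = node
		r.hashes = append(r.hashes, h)
	}
	sort.Slice(r.hashes, func(a, b int) bool { return r.hashes[a] < r.hashes[b] })
}

// Remove drops all of the node's points; other points are untouched.
func (r *Ring) Remove(node string) {
	kept := r.hashes[:0]
	for _, h := range r.hashes {
		if r.owners[h] == node {
			delete(r.owners, h)
			continue
		}
		kept = append(kept, h)
	}
	r.hashes = kept
}

// Lookup returns the node owning the first point at or after the key's hash,
// wrapping around to the start of the ring.
func (r *Ring) Lookup(key string) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := hashKey(key)
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0
	}
	return r.owners[r.hashes[i]]
}

func main() {
	ring := NewRing(100, "node-a", "node-b", "node-c")
	fmt.Println("owner before:", ring.Lookup("rider-42"))
	ring.Remove("node-c") // e.g. the membership layer reports node-c as failed
	fmt.Println("owner after: ", ring.Lookup("rider-42"))
}
```

In a system like the one described, the membership list from fault detection would drive the `Add` and `Remove` calls.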
● Strong forwarding performance, as it was built for Ringpop forwarding
● Built on top of TCP, provides multiplexing and framing
○ Supports out of order responses, no head of line blocking
○ Very simple protocol, easy to implement in multiple languages
● Provides high-level RPC features:
○ Timeouts, consistent retry semantics
○ Connection management and smart peer selection
● Integrates with Thrift as a first-class citizen
TChannel
Intra-datacenter RPC protocol. Open source, available for Go, Java, Node, and Python
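To make "multiplexing and framing" concrete, here is a minimal sketch of length-prefixed frames that carry a message ID, which is what lets responses come back out of order on one connection. The layout below (4-byte length, 4-byte ID, payload) is a hypothetical format invented for this example; it is not the TChannel wire format, and the `Frame`, `Encode`, and `Decode` names are assumptions.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const headerSize = 8 // 4-byte payload length + 4-byte message ID

// Frame is one message on the shared connection (hypothetical layout).
type Frame struct {
	ID      uint32 // matches a response to its request
	Payload []byte
}

var ErrShortFrame = errors.New("incomplete frame")

// Encode serializes a frame as [length][id][payload].
func Encode(f Frame) []byte {
	buf := make([]byte, headerSize+len(f.Payload))
	binary.BigEndian.PutUint32(buf[0:4], uint32(len(f.Payload)))
	binary.BigEndian.PutUint32(buf[4:8], f.ID)
	copy(buf[headerSize:], f.Payload)
	return buf
}

// Decode reads one frame from the front of buf and returns the remainder.
func Decode(buf []byte) (Frame, []byte, error) {
	if len(buf) < headerSize {
		return Frame{}, buf, ErrShortFrame
	}
	n := binary.BigEndian.Uint32(buf[0:4])
	if uint64(len(buf)-headerSize) < uint64(n) {
		return Frame{}, buf, ErrShortFrame
	}
	end := headerSize + int(n)
	f := Frame{ID: binary.BigEndian.Uint32(buf[4:8]), Payload: buf[headerSize:end]}
	return f, buf[end:], nil
}

func main() {
	// Two calls are in flight; the byte slice stands in for a TCP stream.
	pending := map[uint32]string{1: "slow call", 2: "fast call"}

	// The response to call 2 arrives first: no head-of-line blocking.
	stream := append(Encode(Frame{ID: 2, Payload: []byte("fast result")}),
		Encode(Frame{ID: 1, Payload: []byte("slow result")})...)

	for len(stream) > 0 {
		f, rest, err := Decode(stream)
		if err != nil {
			fmt.Println("decode error:", err)
			return
		}
		fmt.Printf("%s (id %d) -> %s\n", pending[f.ID], f.ID, f.Payload)
		delete(pending, f.ID)
		stream = rest
	}
}
```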
● Strong integration with Thrift
○ Uses net/context
○ Integrates with error values
    val, err := client.Get(ctx, key)
    if err != nil {
        switch err := err.(type) {
        case *keyvalue.KeyNotFound:
            // Handle Thrift exception
        default:
            // Unknown error
        }
    }
TChannel + Thrift
Intra-datacenter RPC protocol. Open source, available for Go, Java, Node, and Python
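The snippet on this slide can be expanded into a self-contained sketch of the pattern: a Thrift exception surfaces as a concrete Go error type, and the caller distinguishes it with a type switch. Everything here is a stand-in, assumed for illustration: `KeyNotFound` imitates what generated Thrift code might produce, `memoryClient` replaces a real TChannel client, and the standard library `context` package is used where the slide refers to net/context.

```go
package main

import (
	"context"
	"fmt"
)

// KeyNotFound stands in for a Thrift-generated exception type.
type KeyNotFound struct{ Key string }

func (e *KeyNotFound) Error() string { return "key not found: " + e.Key }

// memoryClient is a hypothetical in-memory replacement for an RPC client.
type memoryClient struct{ data map[string]string }

// Get honors context cancellation and returns a typed error on a miss.
func (c *memoryClient) Get(ctx context.Context, key string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	v, ok := c.data[key]
	if !ok {
		return "", &KeyNotFound{Key: key}
	}
	return v, nil
}

// describe shows the type switch: inside each case, err has the concrete type.
func describe(err error) string {
	switch err := err.(type) {
	case nil:
		return "ok"
	case *KeyNotFound:
		return "missing key " + err.Key // handle the Thrift exception
	default:
		return "unknown error: " + err.Error()
	}
}

// lookup performs one call and summarizes the outcome.
func lookup(c *memoryClient, key string) string {
	_, err := c.Get(context.Background(), key)
	return describe(err)
}

func main() {
	client := &memoryClient{data: map[string]string{"city": "SF"}}
	fmt.Println(lookup(client, "city"))
	fmt.Println(lookup(client, "vehicle"))
}
```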
● Often used internally with Python and Node.js
● Uses pprof output to generate the flamegraph
go-torch
Visualization of profiling output. Open source!
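Since the flamegraph is generated from pprof output, it may help to see where that output comes from. The sketch below captures a CPU profile in pprof format using the standard `runtime/pprof` package; the `CaptureCPUProfile` helper and the busy loop are hypothetical, and how the resulting bytes are handed to a visualization tool is left out because it depends on the tool.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// CaptureCPUProfile runs work while CPU profiling is enabled and returns
// the profile in pprof format. Hypothetical helper for this sketch.
func CaptureCPUProfile(work func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	work()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	sum := 0
	profile, err := CaptureCPUProfile(func() {
		// Stand-in for real request handling.
		for i := 0; i < 50000000; i++ {
			sum += i % 7
		}
	})
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("work result %d, captured %d bytes of pprof data\n", sum, len(profile))
}
```

Long-running services more commonly expose the same data over HTTP by importing `net/http/pprof`, which makes profiling a live production process straightforward.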
● Powers the majority of high QPS services at Uber
● Already the most popular language for new services
● Open source our core libraries and tools
■ TChannel, Ringpop, go-torch, yab (beta), zap (beta)
■ Working with open source community on opentracing
● Check out our engineering blog and GitHub page for more information
Go at Uber
Thanks
