Gophers Riding Elephants:
Writing PostgreSQL Tools in Go
by AJ Bahnken,
Senior Engineer @ Procore
Who am I?
Senior Engineer @ Procore
Work on availability, performance, and (now mostly)
security
Been writing Go code actively for 2 years
Twitter: @ajvbahnken
Email: aj.bahnken@procore.com
Who is this talk for?
Overview of Go
Go was created by Google
It was built for Google as well
Reliable
Good for teams
Good for building the kind of things they need to build
"Less is exponentially more" by Rob Pike
Go is purposefully lacking certain features.
Link: https://commandcenter.blogspot.com/2012/06/less-is-exponentially-more.html
Statically Typed
Compiled
Garbage Collected
package main 
import "fmt" 
func main() { 
    fmt.Println("Hello, 世界") 
} 
Why Go with Postgres?
1. Performance
It's pretty fast
Garbage collector is pretty solid
Concurrency is a breeze
package main 
import ( 
  "fmt" 
  "time" 
) 
func say(s string) { 
  for i := 0; i < 5; i++ { 
    time.Sleep(100 * time.Millisecond) 
    fmt.Println(s) 
  } 
} 
func main() { 
  go say("world") 
  say("hello") // Blocks, which gives the goroutine time to run. 
}
$ go run main.go 
world 
hello 
hello 
world 
world 
hello 
hello 
world 
world 
hello 
2. Reliability
Statically Typed (yay!)
bool 
string 
int  int8  int16  int32  int64 
uint uint8 uint16 uint32 uint64 uintptr 
byte // alias for uint8 
rune // alias for int32 
     // represents a Unicode code point 
float32 float64 
complex64 complex128 
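Static typing in Go is strict: there are no implicit numeric conversions, so mixing these types requires an explicit cast that the compiler enforces. A minimal sketch (the function names here are illustrative, not from the talk):

```go
package main

import "fmt"

// add mixes int32 and int64; the explicit int64(a) conversion is
// required, since `a + b` with different types will not compile.
func add(a int32, b int64) int64 {
	return int64(a) + b
}

func main() {
	var r rune = '世' // rune is an alias for int32: a Unicode code point
	fmt.Println(add(100, 200))
	fmt.Printf("%c is code point %d\n", r, r)
}
```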
Simple
type EventProcessor struct { 
    eventQueue chan Event 
} 
func (ep *EventProcessor) Add(event Event) { 
    ep.eventQueue <- event 
} 
func (ep *EventProcessor) Start() { 
  for { 
    event := <-ep.eventQueue 
    go event.Process() 
  } 
} 
Testing is simple and built in, plus a race detector
$ ls 
processing.go       processing_test.go  utils.go 
$ go test 
PASS 
ok      ~/pgnetdetective/processing        0.165s 
$ go test -race 
PASS 
ok      ~/pgnetdetective/processing        2.133s 
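A test is just a `TestXxx` function in a `*_test.go` file; `go test` finds and runs it with no framework or configuration. A minimal table-driven sketch (`Double` is an illustrative function, not from pgnetdetective):

```go
package main

import (
	"fmt"
	"testing"
)

func Double(n int) int { return n * 2 }

// TestDouble would normally live in a *_test.go file; `go test`
// runs every function shaped like TestXxx(*testing.T).
func TestDouble(t *testing.T) {
	cases := []struct{ in, want int }{{1, 2}, {0, 0}, {-3, -6}}
	for _, c := range cases {
		if got := Double(c.in); got != c.want {
			t.Errorf("Double(%d) = %d, want %d", c.in, got, c.want)
		}
	}
}

func main() { fmt.Println(Double(21)) }
```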
Error handling instead of exceptions
func MyFunc() (string, error) { 
  str, err := run() 
  if err != nil { 
    return "", err 
  } 
  return str, nil 
} 
func MustMyFunc() string { 
  str, err := run() 
  if err != nil { 
    panic("run() returned an err: " + err.Error()) 
  } 
  return str 
} 
3. Ease of Use
Tooling
(Gofmt, testing, godocs, go build/run, vim-go)
Familiarity
Library support and ease of installation
$ go get github.com/urfave/cli 
Distribute a single binary anywhere
$ go build 
$ file dbduke 
dbduke: Mach-O 64-bit executable x86_64 
$ GOOS=linux go build 
$ file dbduke 
dbduke: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),  
     statically linked, not stripped 
$ GOOS=linux GOARCH=386 go build 
$ file dbduke 
dbduke: ELF 32-bit LSB executable, Intel 80386, version 1  
    (SYSV), statically linked, not stripped 
Performance
Reliability
Ease of Use
Interacting with
Postgres in Go
database/sql
Docs: https://golang.org/pkg/database/sql/
Provides core interface for interacting with SQL databases
Open / Close
Begin / Rollback / Commit
Exec / Query / QueryRow
Ping / Connection Pooling
go get github.com/lib/pq
package main 
import ( 
  "database/sql" 
  "fmt" 
  _ "github.com/lib/pq" 
) 
func main() { 
  dbUrl := "postgres://postgres@localhost:5432/postgres" 
  db, err := sql.Open("postgres", dbUrl) 
  if err != nil { 
    panic(err) 
  } 
  var result int 
  err = db.QueryRow("SELECT 1").Scan(&result) 
  if err != nil { 
    panic(err) 
  } 
  fmt.Printf("1 == %d", result) 
} 
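The example above only touches Open and QueryRow. The Begin/Rollback/Commit and pooling pieces of `database/sql` fit together like this sketch; the `events` table and `insertEvent` helper are hypothetical, and a caller would open the pool with a driver such as lib/pq:

```go
package main

import (
	"database/sql"
	"fmt"
	"strings"
)

// placeholders builds "$1, $2, ..., $n", Postgres's style of
// parameterized-query placeholders.
func placeholders(n int) string {
	ps := make([]string, n)
	for i := range ps {
		ps[i] = fmt.Sprintf("$%d", i+1)
	}
	return strings.Join(ps, ", ")
}

// insertEvent wraps an INSERT in a transaction: any failure rolls the
// whole thing back, otherwise it commits.
func insertEvent(db *sql.DB, name, payload string) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	query := "INSERT INTO events (name, payload) VALUES (" + placeholders(2) + ")"
	if _, err := tx.Exec(query, name, payload); err != nil {
		tx.Rollback()
		return err
	}
	return tx.Commit()
}

func main() {
	// With a live database you would sql.Open with a driver, then
	// db.SetMaxOpenConns(10) to bound the pool and db.Ping to verify
	// connectivity before calling insertEvent.
	fmt.Println(placeholders(3))
	_ = insertEvent
}
```

Note that `sql.Open` only validates its arguments; connections are drawn lazily from the pool on first use, which is why `Ping` exists.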
http://go-database-sql.org/
Example #1
pgnetdetective
https://github.com/procore/pgnetdetective
?????
tcpdump -n -w ~/pg.cap -i any port 5432 
~1GB every 10 seconds
We needed something faster,
so I decided to rewrite it in Go
https://github.com/google/gopacket
Provides packet processing
capabilities for Go
// If the destination port is 5432... 
if tcp.DstPort == 5432 { 
  // And the packet payload starts with P or Q... 
  raw = fmt.Sprintf("%s", tcp.Payload) 
  if strings.HasPrefix(raw, "P") || strings.HasPrefix(raw, "Q") { 
    // It is a Parse or Query packet, therefore it contains a Query 
    combinedQueryMetrics.Add( 
      metrics.New( 
        NormalizeQuery(raw), 
        1, 
        ip.SrcIP, 
        tcp.Seq, 
      ), 
    ) 
  } 
} else if tcp.SrcPort == 5432 && tcp.ACK { 
  responses = append(responses, &ResponsePacket{
    DstIP: ip.DstIP, 
    Ack:   tcp.Ack, 
    Size:  uint64(len(tcp.Payload)), 
  }) 
} 
So I got some output like this:
******* Query ******* 
Query: SELECT attr.attname FROM pg_attribute attr 
    INNER JOIN pg_constraint cons ON attr.attrelid = cons.conrelid 
    AND attr.attnum = any(cons.conkey) WHERE cons.contype = p 
    AND cons.conrelid = "drawing_log_imports"::regclass 
TotalNetBytes: 170 MB 
TotalResponsePackets: 64041 
TotalQueryPackets: 63 
ummm, catalog queries?
Introducing: Resque
https://github.com/resque/resque
http://resque.github.io/
~10,000 jobs per hour
1-8 tables being touched per job
average of 20 columns per table.
During spikes, this can get up to 120MB per second.
On to Sidekiq we go...
Where Go won with pgnetdetective:
Performance
Community (Ease of Use)
Example #2 dbduke
(not open source + still under active development)
Context:
1. We restore staging/qa/testing databases frequently
2. It's important that they successfully restore
Problem:
1. When restores fail, productivity dies
2. The process of kicking restores off by hand is faulty
Further Context for a Solution:
1. Restores sometimes fail from easily recoverable errors
A tool for making restores of Postgres
databases manageable and fault tolerant.
A tool for making restores of Postgres
databases manageable and fault tolerant.
Manageable
Run dbduke as a daemon with jobs
$ dbduke jobs 
------------ 
   DBDuke 
------------ 
* restore - 35e1ca93-936b-4c73-8812-b1a69d708791 
   database: postgres 
   dumpfile: /data/lite-dump.dmp 
   started: 17:19:59 Tue Oct 11, 2016 -0700 
   flags: --no-big-tables --maintenance 
A tool for making restores of Postgres
databases manageable and fault tolerant.
Fault Tolerance
Treat restores as a state machine
and recover from failure states
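Since dbduke isn't open source, here is only a hypothetical sketch of the state-machine idea: states are a small enum, and the key transition is that a failed restore is re-queued rather than abandoned. All state names here are illustrative, not dbduke's actual states:

```go
package main

import "fmt"

// State models one step of a restore (illustrative names only).
type State int

const (
	Pending State = iota
	Restoring
	Failed
	Done
)

// next returns the transition out of a state. The important case is
// Failed -> Pending: a recoverable failure re-queues the restore
// instead of aborting it.
func next(s State) State {
	switch s {
	case Pending:
		return Restoring
	case Failed:
		return Pending
	default: // Restoring (or anything else) completes
		return Done
	}
}

func main() {
	s := Failed
	for s != Done {
		s = next(s) // Failed -> Pending -> Restoring -> Done
	}
	fmt.Println("restore recovered and completed")
}
```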
Error handling in practice:
1. Error out
2. Log a warning
3. Retry with timeout (with or without backoff)
Error out
db, err := sql.Open("postgres", dbUrl) 
if err != nil { 
  log.Fatalf("Could not open postgres db @ `%s`", dbUrl) 
} 
Log warning
query := "DROP SCHEMA IF EXISTS _migration CASCADE" 
_, err = db.Exec(query) 
if err != nil { 
  log.Warnf("Query `%s` failed with err: %v", query, err) 
} 
Retry with timeout (without backoff)
func (r *Restorer) BlockTillNotInUse() { 
    if r.State == state.InUse { 
        log.Warn("State is currently InUse. Going into retry loop.") 
        for { 
            time.Sleep(time.Second * 15) 
            r.QuitIfTimeout() 
            currentState, err := state.GetCurrentState() 
            if err != nil { 
                log.Errorf( 
                  "Error getting current state. Err: %v", 
                  err, 
                ) 
                break 
            } 
            if currentState != state.InUse { 
                r.State = currentState 
                break 
            } 
        } 
    } 
} 
Manageability + Fault Tolerance
Go makes it easy! ™
Where Go won with dbduke:
Error Handling (Reliability)
Concurrency (Performance/Ease of Use)
In Conclusion
In the context of tool building
Go = Reliability, Performance, and Ease of
Use
Procore is hiring! (big surprise)
http://procore.com/careers
Thank you! Questions?
aj.bahnken@procore.com / @ajvbahnken
Further Resources
Tour of Go
Effective Go (required reading)
Great introduction to using SQL within Go
Why we import drivers with '_'
Sources
Performance Graph