Research oriented presentation @Microxchg Berlin Feb 5th 2016. New code to collect histograms of response time and export them to monte-carlo simulation spreadsheet via getguesstimate.com
2. What does @adrianco do?
@adrianco
Technology Due
Diligence on Deals
Presentations at
Conferences
Presentations at
Companies
Technical
Advice for Portfolio
Companies
Program
Committee for
Conferences
Networking with
Interesting PeopleTinkering with
Technologies
Maintain
Relationship with
Cloud Vendors
7. Some tools can show
the request flow
across a few services
8. Interesting
architectures have a
lot of microservices!
Flow visualization is
a big challenge.
See http://www.slideshare.net/LappleApple/gilt-from-monolith-ruby-app-to-micro-service-scala-service-architecture
9. Simulated Microservices
Model and visualize microservices
Simulate interesting architectures
Generate large scale configurations
Eventually stress test real tools
See github.com/adrianco/spigo
Simulate Protocol Interactions in Go
Visualize with D3
ELB Load Balancer
Zuul API Proxy
Karyon
Business
Logic
Staash
Data
Access
Layer
Priam Cassandra
Datastore
Three
Availability
Zones
10. Spigo Nanoservice Structure
func Start(listener chan gotocol.Message) {
...
for {
select {
case msg := <-listener:
flow.Instrument(msg, name, hist)
switch msg.Imposition {
case gotocol.Hello: // get named by parent
...
case gotocol.NameDrop: // someone new to talk to
...
case gotocol.Put: // upstream request handler
...
outmsg := gotocol.Message{gotocol.Replicate, listener, time.Now(),
msg.Ctx.NewParent(), msg.Intention}
flow.AnnotateSend(outmsg, name)
outmsg.GoSend(replicas)
}
case <-eurekaTicker.C: // poll the service registry
...
}
}
}
Nanoservice simulation total about 200 lines of Go
12. Open Zipkin
A common format for trace annotations
A Java tool for visualizing traces
Standardization effort to fold in other formats
Driven by Adrian Cole (currently at Pivotal)
Extended to load Spigo generated trace files
20. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
21. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Load Balancer
Load Balancer
22. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Load Balancer
Load Balancer
Stream Service
Analytics Service
23. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Load Balancer
Load Balancer
Stream Service
Analytics Service
24. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Ingest Message Queue
Load Balancer
Load Balancer
Stream Service
Analytics Service
25. Single Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
Load Balancer
Normalization Services
Enrich Message Queue
Riak KV
Enricher Services
Ingest Message Queue
Load Balancer
Load Balancer
Stream Service Riak TS
Analytics Service
Ingester Service
26. Two Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
East Region Ingestion
West Region Ingestion
Multi Region TS Analytics
27. Two Region Riak IoT
IoT Ingestion Endpoint
Stream Endpoint
Analytics Endpoint
East Region Ingestion
West Region Ingestion
Multi Region TS Analytics
What’s the response
time of the stream
endpoint?
36. memcached hit %
memcached response mysql response
service cpu time
memcached hit mode
mysql cache hit mode
mysql disk access mode
Hit rates: memcached 40% mysql 70%
38. memcached hit %
memcached response mysql response
service cpu time
memcached hit mode
mysql cache hit mode
mysql disk access mode
Hit rates: memcached 60% mysql 70%
40. memcached hit %
memcached response mysql response
service cpu time
memcached hit mode
mysql cache hit mode
mysql disk access mode
Hit rates: memcached 20% mysql 90%
42. Changes made to codahale/hdrhistogram
Changes made to go-kit/kit/metrics (today!)
Implementation in adrianco/spigo/collect
43. What to measure?
Client Server
GetRequest
GetResponse
Client
Time
Client Send CS
Server Receive SR
Server Send SS
Client Receive CR
Server
Time
44. What to measure?
Client Server
GetRequest
GetResponse
Client
Time
Client Send CS
Server Receive SR
Server Send SS
Client Receive CR
Response
CR-CS
Service
SS-SR
Network
SR-CS
Network
CR-SS
Net Round Trip
(SR-CS) + (CR-SS)
(CR-CS) - (SS-SR)
Server
Time
45. Spigo Histogram Collection
func Start(listener chan gotocol.Message) {
...
for {
select {
case msg := <-listener:
flow.Instrument(msg, name, nethist)
switch msg.Imposition {
...
case gotocol.GetResponse:
// return path from a request, terminate and log response time in histograms
flow.End(msg, resphist, servhist, rthist)
case gotocol.Goodbye:
collect.SaveHist(nethist, name, "_net")
collect.SaveHist(resphist, name, "_resp")
collect.SaveHist(servhist, name, "_serv")
collect.SaveHist(rthist, name, “_rt")
collect.SaveAllGuesses(name)
gotocol.Message{gotocol.Goodbye, nil, time.Now(), gotocol.NilContext, name}.GoSend(parent)
return
}
case <-chatTicker.C:
...
sm = gotocol.Message{gotocol.GetRequest, listener, now, ctx, "Why"}
flow.AnnotateSend(sm, name)
sm.GoSend(microindex[m]) // send to a randomly chosen dependency
}
}
}
46. Go-Kit Histogram Collection
const (
maxHistObservable = 1000000
sampleCount = 500
)
func NewHist(name string) metrics.Histogram {
var h metrics.Histogram
if name != "" && archaius.Conf.Collect {
h = expvar.NewHistogram(name, 1000, maxHistObservable, 1, []int{50, 99}...)
if sampleMap == nil {
sampleMap = make(map[metrics.Histogram][]int64)
}
sampleMap[h] = make([]int64, 0, sampleCount)
return h
}
return nil
}
func Measure(h metrics.Histogram, d time.Duration) {
if h != nil && archaius.Conf.Collect {
if d > maxHistObservable {
h.Observe(int64(maxHistObservable))
} else {
h.Observe(int64(d))
}
s := sampleMap[h]
if s != nil && len(s) < sampleCount {
sampleMap[h] = append(s, int64(d))
}
}
}
Nanoseconds!
Median and 99%ile
Slice for first 500
values as samples for
export to Guesstimate
50. See http://www.getguesstimate.com
Response time distributions
exported directly from Spigo
as 500 samples to
json_metrics/storage.guess
then posted to guesstimate.
Conference driven
development not quite
complete, go-kit PR in
place to provide full
names of histograms
Relationship between
services will also be
exported soon.
54. Terabyte Memory Directions
Engulf dataset in memory for analytics
Balanced config for memory intensive workloads
Replace high end systems at commodity cost point
Explore non-volatile memory implications
55. Terabyte Memory Options
Now: Diablo DDR4 DIMM containing flash 64/128/256GB
Migrates pages to/from companion DRAM DIMM
Shipping now as volatile memory, future non-volatile
Announced but not shipped for 2016
AWS X1 Instance Type - over 2TB RAM
Easy availability should drive innovation
56. Diablo Memory1: Flash DIMM
NO CHANGES to CPU or Server
NO CHANGES to Operating System
NO CHANGES to Applications
✓ UP TO 256GB DDR4 MEMORY PER MODULE
✓ UP TO 4TB MEMORY IN 2 SOCKET SYSTEM
TM
58. Security
Visit http://www.battery.com/our-companies/ for a full list of all portfolio companies in which all Battery Funds have invested.
Palo Alto Networks
Enterprise IT
Operations &
Management
Big DataCompute
Networking
Storage