OpenCensus with Prometheus and Kubernetes
201904
김진웅
About Me
김진웅 @ddiiwoong
Cloud Platform, Data Lake Architect @SK C&C
Interested in Kubernetes and Serverless(FaaS), DevOps, SRE, ML/DL
Who am I and Where am I?
Ops
Data Center Virtual Machine Container Serverless
Weeks Minutes Seconds Milliseconds
Dev
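Stage 1: Self-managed. Install and operate the OS, patch the development platform, run backups, etc. yourself
Stage 2: Managed. Server-based, but delivered as a managed service (configuration and scale are managed)
Stage 3: Fully managed. Services with no server management (No-Ops)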
Complexity is inevitable
Microservices Containerization Orchestration Service Mesh
Bare Metal
Kernel
Network Stack
Cloud Stack
Libraries
Frameworks
Your Code
Monitoring and Troubleshooting with Prometheus
Monitoring and Troubleshooting
● Cluster (APIs, Etcd, Nodes, VMs or BMs)
● Network (Service, Ingress, NetworkPolicy, DNS, TLS)
● Storage (Volumes, PV, PVC, CSI)
● Code
○ Instrumentation
○ Tracing, Metrics
(Cloud Providers, APM Solution, OpenSource)
OpenCensus
A single distribution of libraries that
automatically collect traces and metrics
from your app, display them locally, and
send them to any backend.
A Stats Collection and Distributed Tracing Framework
backed by Google and Microsoft in Jan. 2018
[Diagram: OpenCensus Libraries. The Auth, Catalog, Search, and FrontEnd services each embed an oc lib and run in a VM or Kube Pod alongside an oc agent; an oc collector sits in front of the metrics + tracing backends.]
OpenCensus
Supports a variety of languages and backend applications
Tracing
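● Shows the application or service structure involved in serving a request
● Visualizes the data flow between all services to identify architectural bottlenecks
Metrics
Quantitative data that helps measure the performance and quality of applications and services
● Latency of databases and APIs
● Request content length
● Number of open file descriptors
● Number of cache hits/misses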
OpenCensus Agent
A daemon that enables a centralized exporter implementation for polyglot development and deployment
● Agent
● Sidecar
● Kubernetes DaemonSet
OpenCensus Agent Benefit
● A single exporter to manage
● Democratizes deployment
○ The choice of backend is up to the developer
● After the initial instrumentation, you can switch to any backend at any time
○ From Prometheus to Stackdriver, from Zipkin to Jaeger
● Reduced overhead
○ Restart only ocagent, without restarting the application
● Unifies observability signals (pass-through)
○ Routing - Zipkin, Jaeger, Prometheus data
○ Easier polyglot and poly-backend management
● Minimizes the ports to manage
○ TCP 55678
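The "switch backends at any time" point can be sketched as a small interface boundary. In this stdlib-only Go sketch, `span`, `exporter`, `agent`, and both exporter types are hypothetical stand-ins, not ocagent code; the assumption is only that applications talk to the agent and the agent alone knows the backend.

```go
package main

import "fmt"

// span is a hypothetical, minimal trace record.
type span struct {
	Name string
}

// exporter is the only contract the agent needs from a backend.
type exporter interface {
	export(s span) string
}

type zipkinExporter struct{}

func (zipkinExporter) export(s span) string { return "zipkin:" + s.Name }

type jaegerExporter struct{}

func (jaegerExporter) export(s span) string { return "jaeger:" + s.Name }

// agent receives spans from applications and forwards them to whichever
// exporter is currently configured.
type agent struct {
	backend exporter
}

// receive is what an instrumented application would call; it never needs
// to know which backend sits behind the agent.
func (a *agent) receive(s span) string {
	return a.backend.export(s)
}

func main() {
	a := &agent{backend: zipkinExporter{}}
	fmt.Println(a.receive(span{Name: "GET /cart"}))

	// Switching backends is an agent-side change only.
	a.backend = jaegerExporter{}
	fmt.Println(a.receive(span{Name: "GET /cart"}))
}
```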
OpenCensus Collector
Located close to the application (e.g., the same VPC or AZ)
OpenCensus Collector Benefit
● A single exporter to manage
● Democratizes deployment
○ The choice of backend is up to the developer
● After the initial instrumentation, you can switch to any backend at any time
○ From Prometheus to Stackdriver, from Zipkin to Jaeger
● Limits egress points
○ Centralizes management of the many API keys and TLS settings otherwise held inside applications
● Ensures data reaches the backend
○ built-in buffering and retry capabilities
● Enables intelligent (tail-based) sampling (e.g., by percentile)
● Annotation
○ Metadata can be added while spans are being collected
● Tagging
○ Tags carried by a span can be overridden or removed
Demo - Hipster Shop
Hipster Shop: Cloud-Native Microservices Demo Application
● A web-based e-commerce application where users can browse and purchase products
Demo - Hipster Shop
Hipster Shop: Cloud-Native Microservices Demo Application
● All internal communication uses gRPC; only external traffic uses HTTP
● Polyglot: Go, C#, Node.js, Python, Java
● Can be configured with Istio
● Deployed with Skaffold (https://skaffold.dev/)
● Backend Embedded
○ Stackdriver - https://github.com/GoogleCloudPlatform/microservices-demo
○ Prometheus - https://github.com/census-ecosystem/opencensus-microservices-demo
● The load generator (Locust) continuously calls the services
● A latency delay occurs in a specific service (CheckoutService/PlaceOrder)
● Identify the cause with the backend tools (Prometheus/Jaeger), then fix the code and redeploy
Demo - Tracing (Frontend, Go)
Add the exporter libraries and initialize the HTTP handler
import (
	...
	"go.opencensus.io/exporter/jaeger"
	"go.opencensus.io/exporter/prometheus"
	"go.opencensus.io/plugin/ochttp"
	"go.opencensus.io/plugin/ochttp/propagation/b3"
	...
)

func main() {
	...
	var handler http.Handler = r
	handler = &logHandler{log: log, next: handler}
	handler = ensureSessionID(handler)
	// Wrap the handler chain with ochttp so each request is traced,
	// propagating the trace context in B3 headers.
	handler = &ochttp.Handler{
		Handler:     handler,
		Propagation: &b3.HTTPFormat{},
	}
	log.Infof("starting server on " + addr + ":" + srvPort)
	log.Fatal(http.ListenAndServe(addr+":"+srvPort, handler))
}
https://godoc.org/go.opencensus.io/plugin/ochttp
Demo - Tracing (Frontend, Go)
Register the exporter; sampling (https://opencensus.io/stats/sampling/)
func initJaegerTracing(log logrus.FieldLogger) {
	exporter, err := jaeger.NewExporter(jaeger.Options{
		Endpoint: "http://jaeger:14268",
		Process: jaeger.Process{
			ServiceName: "frontend",
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	trace.RegisterExporter(exporter)
}

	// Tail of the calling function; its opening lines are not shown on the slide.
	trace.ApplyConfig(trace.Config{
		DefaultSampler: trace.AlwaysSample(),
	})
	initJaegerTracing(log)
	...
}
Supported Samplers
● AlwaysSample
● NeverSample
● Probability
● RateLimiting
https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Sampling.md#ratelimiting-sampler-implementation-details
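As an illustration of how a Probability sampler can work, the sketch below derives the decision from the trace ID, so the same ID always gets the same answer. `probabilitySampler` is a hypothetical stdlib-only function written for this deck; it is modeled on the general idea rather than copied from any OpenCensus library.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// probabilitySampler returns a decision function that samples roughly
// `fraction` of traces. The decision depends only on the trace ID, so
// every caller that sees the same ID reaches the same decision.
func probabilitySampler(fraction float64) func(traceID [16]byte) bool {
	if fraction <= 0 {
		return func([16]byte) bool { return false }
	}
	if fraction >= 1 {
		return func([16]byte) bool { return true }
	}
	// Threshold over a 63-bit space.
	upperBound := uint64(fraction * (1 << 63))
	return func(traceID [16]byte) bool {
		// Use the first 8 bytes of the ID as a uniformly distributed number.
		x := binary.BigEndian.Uint64(traceID[0:8]) >> 1
		return x < upperBound
	}
}

func main() {
	sample := probabilitySampler(0.25)
	low := [16]byte{0x00, 0x01}
	high := [16]byte{0xff, 0xff}
	fmt.Println(sample(low), sample(high))
}
```

This assumes trace IDs are generated uniformly at random; with skewed IDs the effective rate would drift from `fraction`.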
Demo - Tracing (AdService, Java)
Register the exporter
import io.opencensus.exporter.trace.jaeger.JaegerTraceExporter;

public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Jaeger Tracing.
    JaegerTraceExporter.createAndRegister("http://jaeger:14268/api/traces", "adservice");
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
There is no trace.AlwaysSample() here because the sampling decision is propagated from the Frontend.
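The propagation rule above can be modeled in a few lines. This Go sketch is a simplified assumption of how a downstream service decides; `decideSampling` is a hypothetical function, and real tracing libraries differ in the details (for example, some re-run a local sampler when the parent was not sampled).

```go
package main

import "fmt"

// decideSampling returns the sampling decision for a new span.
// hasParent reports whether the request carried a propagated trace context,
// parentSampled is that context's decision, and localDefault is what this
// service would decide on its own for a root span.
func decideSampling(hasParent, parentSampled, localDefault bool) bool {
	if hasParent {
		// Follow the upstream decision so the whole trace is kept or dropped together.
		return parentSampled
	}
	return localDefault
}

func main() {
	// AdService called by a Frontend that uses AlwaysSample: sampled even
	// though AdService configures no sampler of its own.
	fmt.Println(decideSampling(true, true, false))
	// A request without any incoming trace context falls back to the local default.
	fmt.Println(decideSampling(false, false, false))
}
```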
Demo - Metrics (Frontend, Go)
Register the exporter and the gRPC views
func initPrometheusStatsExporter(log logrus.FieldLogger) *prometheus.Exporter {
	exporter, err := prometheus.NewExporter(prometheus.Options{})
	if err != nil {
		log.Fatal("error registering prometheus exporter")
		return nil
	}
	view.RegisterExporter(exporter)
	return exporter
}

func startPrometheusExporter(log logrus.FieldLogger, exporter *prometheus.Exporter) {
	addr := ":9090"
	log.Infof("starting prometheus server at %s", addr)
	http.Handle("/metrics", exporter)
	log.Fatal(http.ListenAndServe(addr, nil))
}

func initStats(log logrus.FieldLogger) {
	// Start prometheus exporter
	exporter := initPrometheusStatsExporter(log)
	go startPrometheusExporter(log, exporter)
	if err := view.Register(ochttp.DefaultServerViews...); err != nil {
		log.Fatal("error registering default http server views")
	}
	if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
		log.Fatal("error registering default grpc client views")
	}
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
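What Prometheus scrapes from `/metrics` is plain text. The stdlib-only Go sketch below renders counters in the Prometheus text exposition format; `renderCounters`, `metricsHandler`, and the metric name `frontend_requests_total` are hypothetical, and the real OpenCensus exporter produces this output for registered views rather than for a raw map.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// renderCounters formats counters in the Prometheus text exposition format:
// a "# TYPE" line followed by "<name> <value>" for each counter.
func renderCounters(counters map[string]uint64) string {
	names := make([]string, 0, len(counters))
	for name := range counters {
		names = append(names, name)
	}
	sort.Strings(names) // stable output regardless of map iteration order

	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "# TYPE %s counter\n%s %d\n", name, name, counters[name])
	}
	return b.String()
}

// metricsHandler serves the rendered counters the way a scrape target would.
// The map is read without locking, so it must not be mutated concurrently.
func metricsHandler(counters map[string]uint64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/plain; version=0.0.4")
		fmt.Fprint(w, renderCounters(counters))
	})
}

func main() {
	counters := map[string]uint64{"frontend_requests_total": 42}
	fmt.Print(renderCounters(counters))
	// To serve it: http.Handle("/metrics", metricsHandler(counters))
	// followed by http.ListenAndServe(":9090", nil).
}
```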
Demo - Metrics (AdService, Java)
Register the exporter
import io.opencensus.exporter.stats.prometheus.PrometheusStatsCollector;
import io.prometheus.client.exporter.HTTPServer;

public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Prometheus exporters and export metrics to a Prometheus HTTPServer.
    PrometheusStatsCollector.createAndRegister();
    HTTPServer prometheusServer = new HTTPServer(9090, true);
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Metrics (AdService, Java)
gRPC views
/** Main launches the server from the command line. */
public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Registers all RPC views.
    RpcViews.registerAllViews();
    ...
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Trace Monitoring (problem scenario)
Demo - Stats Monitoring (problem scenario)
Demo - Code Tuning
Replace the repeated parseCatalog() calls with the products variable to cut time across the whole request path
Redeploy (skaffold)
$ skaffold run --default-repo=gcr.io/cloudrun-237814 -n default
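The tuning idea, parse once and reuse the result, can be sketched as follows. This is a stdlib-only Go illustration under stated assumptions: `catalogJSON`, `product`, `cachedProducts`, `getProduct`, and `parseCalls` are hypothetical, the catalog content is made up, and the demo's actual fix may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

type product struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// catalogJSON stands in for the catalog file; the content is made up.
const catalogJSON = `[{"id":"1","name":"Example A"},{"id":"2","name":"Example B"}]`

var parseCalls int // counts how often the expensive parse runs

// parseCatalog decodes the whole catalog; this is the costly step.
func parseCatalog() []product {
	parseCalls++
	var ps []product
	if err := json.Unmarshal([]byte(catalogJSON), &ps); err != nil {
		panic(err)
	}
	return ps
}

var (
	once     sync.Once
	products []product
)

// cachedProducts parses the catalog on first use and reuses the result.
func cachedProducts() []product {
	once.Do(func() { products = parseCatalog() })
	return products
}

// getProduct looks an item up without re-parsing on every request.
func getProduct(id string) (product, bool) {
	for _, p := range cachedProducts() {
		if p.ID == id {
			return p, true
		}
	}
	return product{}, false
}

func main() {
	for _, id := range []string{"1", "2", "1"} {
		p, ok := getProduct(id)
		fmt.Println(p.Name, ok)
	}
	fmt.Println("parse calls:", parseCalls)
}
```

The trade-off is staleness: a cached catalog does not pick up file changes until the process restarts or the cache is explicitly refreshed.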
Demo - Trace Monitoring
Demo - Stats Monitoring
Summary
● Consider adopting the OpenCensus Agent and Collector
● SRE - SLI (Service Level Indicator), SLO (Service Level Objective)
● Extend with custom application metrics
● Extend to Istio
○ https://github.com/census-instrumentation/opencensus-service/blob/master/DESIGN.md#implementation-details-of-agent-server
● OpenMetric + OpenCensus : ??
○ https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
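For the SLI/SLO item above, the collected request metrics can feed a simple calculation. This Go sketch assumes an availability SLI defined as good requests divided by total requests; `availabilitySLI` and `errorBudgetRemaining` are hypothetical helpers, and the numbers are illustrative only.

```go
package main

import "fmt"

// availabilitySLI is the fraction of requests that succeeded.
func availabilitySLI(good, total uint64) float64 {
	if total == 0 {
		return 1 // no traffic: treat the indicator as fully met
	}
	return float64(good) / float64(total)
}

// errorBudgetRemaining returns how many more failed requests the SLO
// tolerates in this window; a negative value means the SLO is violated.
func errorBudgetRemaining(good, total uint64, slo float64) float64 {
	allowed := (1 - slo) * float64(total)
	failed := float64(total - good)
	return allowed - failed
}

func main() {
	good, total := uint64(9990), uint64(10000)
	fmt.Printf("SLI: %.4f\n", availabilitySLI(good, total))
	fmt.Printf("budget left at 99.5%% SLO: %.1f requests\n",
		errorBudgetRemaining(good, total, 0.995))
}
```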
Q&A
@ddiiwoong
ddiiwoong@gmail.com
https://ddii.dev
