OpenCensus with Prometheus and Kubernetes

Korea DevOps MeetUp '19
김진웅 (ddiiwoong)
About Me
김진웅 @ddiiwoong
Cloud Native Platform Architect @SK C&C
Interested in Kubernetes and Serverless(FaaS), Dev(Data)Ops, SRE, ML/DL
Who am I and Where am I?
DevOps?
Data Center (weeks) -> Virtual Machine (minutes) -> Container (seconds) -> Serverless (milliseconds)
Evolution
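Stage 1: Self-managed. You install and operate the OS, patch the development platform, and handle backups yourself.
Stage 2: Managed. Still server-based, but delivered as a managed service (configuration and scaling are managed).
Stage 3: Fully managed. Services with no server management (NoOps).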
Complexity is inevitable
Microservices Containerization Orchestration Service Mesh
Bare Metal
Kernel
Network Stack
Cloud Stack
Libraries
Frameworks
Your Code
Monitoring and Troubleshooting with Prometheus
Monitoring and Troubleshooting
• Cluster (APIs, Etcd, Nodes, VMs or BMs)
• Network (Service, Ingress, NetworkPolicy, DNS, TLS)
• Storage (Volumes, PV, PVC, CSI)
• Code Instrumentation
• Metrics, Tracing
(Cloud Providers, APM Solution, OpenSource)
Metrics
Quantitative data that helps measure the performance and quality of applications and services
• Database, API Latency
• Request content length
• Open file descriptor
• Cache hit/miss
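To make the latency example concrete, here is a minimal sketch of how such a metric might be aggregated into cumulative buckets, similar in spirit to a Prometheus histogram. The `Histogram` type and the bucket bounds are hypothetical and are not part of OpenCensus or Prometheus.

```go
package main

import "fmt"

// Histogram counts latency observations into cumulative buckets.
// It is a hypothetical illustration, not a library type.
type Histogram struct {
	bounds []float64 // bucket upper bounds in milliseconds, ascending
	counts []int     // counts[i] = observations <= bounds[i]
	total  int       // all observations, including those above every bound
}

// NewHistogram creates a histogram with the given upper bounds (ms).
func NewHistogram(boundsMs ...float64) *Histogram {
	return &Histogram{bounds: boundsMs, counts: make([]int, len(boundsMs))}
}

// Observe records one latency measurement in milliseconds.
func (h *Histogram) Observe(ms float64) {
	h.total++
	for i, b := range h.bounds {
		if ms <= b {
			h.counts[i]++
		}
	}
}

func main() {
	h := NewHistogram(10, 100, 1000)
	for _, ms := range []float64{5, 40, 2000} {
		h.Observe(ms)
	}
	for i, b := range h.bounds {
		fmt.Printf("le=%gms count=%d\n", b, h.counts[i])
	}
	fmt.Printf("total=%d\n", h.total)
}
```

Cumulative buckets make it cheap to answer questions such as "what share of API calls finished within 100 ms" without storing every sample.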
Tracing
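Shows the structure of the applications and services involved in handling a request
Visualizes the data flow between all services to identify bottlenecks in the architecture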
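As a rough illustration of how trace data exposes a bottleneck, the sketch below models a trace as a list of spans and picks the span with the largest self time (its duration minus that of its direct children). The `Span` type, the span names, and the timings are hypothetical, and child spans are assumed not to overlap.

```go
package main

import "fmt"

// Span is a simplified, hypothetical span: one timed operation in a trace.
type Span struct {
	Name    string
	Parent  string // empty for the root span
	StartMs int
	EndMs   int
}

// selfTime returns the time spent in s excluding its direct children.
// It assumes the children do not overlap each other.
func selfTime(s Span, all []Span) int {
	d := s.EndMs - s.StartMs
	for _, c := range all {
		if c.Parent == s.Name {
			d -= c.EndMs - c.StartMs
		}
	}
	return d
}

// Bottleneck returns the name of the span with the largest self time.
func Bottleneck(spans []Span) string {
	best, bestTime := "", -1
	for _, s := range spans {
		if t := selfTime(s, spans); t > bestTime {
			best, bestTime = s.Name, t
		}
	}
	return best
}

// sampleTrace returns made-up spans for a single frontend request.
func sampleTrace() []Span {
	return []Span{
		{Name: "frontend", Parent: "", StartMs: 0, EndMs: 120},
		{Name: "productcatalog.GetProduct", Parent: "frontend", StartMs: 10, EndMs: 105},
		{Name: "currency.Convert", Parent: "frontend", StartMs: 106, EndMs: 115},
	}
}

func main() {
	fmt.Println("slowest span:", Bottleneck(sampleTrace()))
}
```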
OpenCensus
A Stats Collection and Distributed Tracing Framework
backed by Google and Microsoft since Jan. 2018
A single distribution of libraries that automatically collect traces and metrics from your app, display them locally, and send them to any backend (Prometheus, Stackdriver, Zipkin, Jaeger, ...).
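[Architecture diagram: the Auth, Catalog, Search, and FrontEnd services each embed an OpenCensus library (oc lib) and run in VMs or Kubernetes pods. Each VM or pod has an OpenCensus agent (oc agent); metrics and traces pass through the OpenCensus collector (oc collector) to the metrics and tracing backends.]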
OpenCensus
Supports a wide range of languages and backend applications
OpenCensus Agent
A daemon that provides a centralized exporter implementation for polyglot development and deployment
• Agent
• Sidecar
• Kubernetes DaemonSet
OpenCensus Agent Benefit
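• Single point of exporter management
• Democratized deployment
Developers choose which backend to send data to
• After the initial instrumentation, the backend can be changed at any time
For example, from Prometheus to Stackdriver, or from Zipkin to Jaeger
• Lower overhead
Restart only ocagent, without restarting the application
• Unified observability signals (pass-through)
Routing of Zipkin, Jaeger, and Prometheus data
Easier management of polyglot and poly-backend setups
• Fewer ports to manage
TCP 55678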
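The backend-switching benefit can be sketched as follows: the application only hands spans to the agent, and the agent decides which exporters receive them. The `Exporter` interface, the `Agent` type, and both exporter implementations below are hypothetical stand-ins and do not reflect the real ocagent configuration or API.

```go
package main

import "fmt"

// Exporter is a hypothetical interface for anything that can receive spans.
type Exporter interface {
	Export(span string) string
}

type zipkinExporter struct{}

func (zipkinExporter) Export(span string) string { return "zipkin<-" + span }

type jaegerExporter struct{}

func (jaegerExporter) Export(span string) string { return "jaeger<-" + span }

// Agent fans incoming spans out to whichever exporters it is configured with.
type Agent struct{ exporters []Exporter }

// SetExporters replaces the agent's backends; the application is untouched.
func (a *Agent) SetExporters(e ...Exporter) { a.exporters = e }

// Receive forwards one span to every configured exporter.
func (a *Agent) Receive(span string) []string {
	out := make([]string, 0, len(a.exporters))
	for _, e := range a.exporters {
		out = append(out, e.Export(span))
	}
	return out
}

// RunDemo sends the same span before and after a backend switch.
func RunDemo() []string {
	a := &Agent{}
	a.SetExporters(zipkinExporter{})
	first := a.Receive("GET /cart")
	a.SetExporters(jaegerExporter{}) // only the agent is reconfigured
	second := a.Receive("GET /cart")
	return append(first, second...)
}

func main() {
	for _, line := range RunDemo() {
		fmt.Println(line)
	}
}
```

The application code path (`Receive`) is identical before and after the switch, which is the point of keeping exporters out of the application.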
OpenCensus Collector
Runs close to the application (e.g., in the same VPC or Availability Zone)
OpenCensus Collector Benefit
• Single point of exporter management
• Democratized deployment
Developers choose which backend to send data to
• After the initial instrumentation, the backend can be changed at any time
For example, from Prometheus to Stackdriver, or from Zipkin to Jaeger
• Limited egress points
Centralizes the API keys and TLS settings that would otherwise be spread across applications
• Reliable delivery to the backend
Built-in buffering and retry capabilities
• Intelligent sampling (percentile-based)
• Annotation
Metadata can be added while spans are being collected
• Tagging
Tags on a span can be overridden or removed
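The buffering and retry behavior listed above can be illustrated with a small sketch. The `Collector` type and the `flakyBackend` helper are hypothetical; the real collector's queueing and retry logic is considerably more involved.

```go
package main

import (
	"errors"
	"fmt"
)

// Collector is a hypothetical buffer that keeps spans until the backend
// accepts them.
type Collector struct {
	buffer []string
	send   func(span string) error
}

// Enqueue adds a span to the buffer.
func (c *Collector) Enqueue(span string) { c.buffer = append(c.buffer, span) }

// Flush tries to send every buffered span. Spans that fail stay in the
// buffer for the next flush. It returns the number of spans delivered.
func (c *Collector) Flush() int {
	var pending []string
	sent := 0
	for _, s := range c.buffer {
		if err := c.send(s); err != nil {
			pending = append(pending, s)
			continue
		}
		sent++
	}
	c.buffer = pending
	return sent
}

// flakyBackend returns a send function that fails its first n calls.
func flakyBackend(n int) func(string) error {
	calls := 0
	return func(string) error {
		calls++
		if calls <= n {
			return errors.New("backend unavailable")
		}
		return nil
	}
}

func main() {
	c := &Collector{send: flakyBackend(2)}
	c.Enqueue("span-a")
	c.Enqueue("span-b")
	c.Enqueue("span-c")
	fmt.Println("delivered:", c.Flush(), "still buffered:", len(c.buffer))
	fmt.Println("delivered:", c.Flush(), "still buffered:", len(c.buffer))
}
```

No span is dropped while the backend is unavailable; it is simply delivered on a later flush.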
OpenCensus Collector Performance
• 1 collector (HA configuration) - 24 cores, 48 GB
• 17.3 billion spans/day (200K spans/second)
• 57 TB data a day (3.3KB per span)
• 0 dropped spans
• https://cloud.withgoogle.com/next/sf/sessions?session=268946-130602
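The figures above are consistent with each other, which a quick calculation confirms: 17.3 billion spans per day is roughly 200K spans per second, and 57 TB per day divided by that span count is roughly 3.3 KB per span. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes); the function names are illustrative only.

```go
package main

import "fmt"

const secondsPerDay = 24 * 60 * 60

// SpansPerSecond converts a daily span count into an average per-second rate.
func SpansPerSecond(spansPerDay float64) float64 {
	return spansPerDay / secondsPerDay
}

// BytesPerSpan divides the daily data volume by the daily span count.
func BytesPerSpan(bytesPerDay, spansPerDay float64) float64 {
	return bytesPerDay / spansPerDay
}

func main() {
	const spansPerDay = 17.3e9
	const bytesPerDay = 57e12 // 57 TB, decimal units assumed

	fmt.Printf("%.0f spans/second\n", SpansPerSecond(spansPerDay))
	fmt.Printf("%.2f KB per span\n", BytesPerSpan(bytesPerDay, spansPerDay)/1000)
}
```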
Demo - Hipster Shop
Hipster Shop: Cloud-Native Microservices Demo Application
• A web-based e-commerce application where users can search for and purchase products
• All internal communication uses gRPC; only external traffic uses HTTP
• Polyglot: Go, C#, Node.js, Python, Java
• Can be configured with Istio
• Deployed with Skaffold (https://skaffold.dev/)
• Embedded backends
Stackdriver - https://github.com/GoogleCloudPlatform/microservices-demo
Prometheus - https://github.com/census-ecosystem/opencensus-microservices-demo
• A load generator (Locust) calls the services continuously
• One service (productcatalog) shows added latency in the GetProduct function
• Find the cause with the backend tools (Prometheus/Jaeger), then fix the code and redeploy
Demo - Tracing (Frontend, Go)
Add the exporter libraries and initialize the HTTP handler
import (
	…
	"go.opencensus.io/exporter/jaeger"
	"go.opencensus.io/exporter/prometheus"
	"go.opencensus.io/plugin/ochttp"
	"go.opencensus.io/plugin/ochttp/propagation/b3"
	...
)

func main() {
	…
	var handler http.Handler = r
	handler = &logHandler{log: log, next: handler} // add request logging
	handler = ensureSessionID(handler)             // add a session ID
	// Wrap the handler with OpenCensus so that incoming requests are traced
	// and the trace context is propagated through B3 headers.
	handler = &ochttp.Handler{
		Handler:     handler,
		Propagation: &b3.HTTPFormat{},
	}
	log.Infof("starting server on " + addr + ":" + srvPort)
	log.Fatal(http.ListenAndServe(addr+":"+srvPort, handler))
}
https://godoc.org/go.opencensus.io/plugin/ochttp
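To show what B3 propagation amounts to on the wire, here is a standard-library sketch that writes and reads the B3 headers (`X-B3-TraceId`, `X-B3-SpanId`, `X-B3-Sampled`). It is not the `ochttp` or `b3` implementation; the `SpanContext` type and the `Inject`, `Extract`, and `RoundTrip` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
)

// Header names defined by the B3 propagation format.
const (
	headerTraceID = "X-B3-TraceId"
	headerSpanID  = "X-B3-SpanId"
	headerSampled = "X-B3-Sampled"
)

// SpanContext is a simplified stand-in for the context a tracing library
// carries from one service to the next.
type SpanContext struct {
	TraceID string
	SpanID  string
	Sampled bool
}

// Inject writes the span context into outgoing request headers.
func Inject(sc SpanContext, h http.Header) {
	h.Set(headerTraceID, sc.TraceID)
	h.Set(headerSpanID, sc.SpanID)
	if sc.Sampled {
		h.Set(headerSampled, "1")
	} else {
		h.Set(headerSampled, "0")
	}
}

// Extract reads the span context from incoming request headers.
// ok is false when no trace ID is present.
func Extract(h http.Header) (sc SpanContext, ok bool) {
	sc.TraceID = h.Get(headerTraceID)
	sc.SpanID = h.Get(headerSpanID)
	sc.Sampled = h.Get(headerSampled) == "1"
	return sc, sc.TraceID != ""
}

// RoundTrip injects sc into a fresh header set and extracts it again.
func RoundTrip(sc SpanContext) (SpanContext, bool) {
	h := http.Header{}
	Inject(sc, h)
	return Extract(h)
}

func main() {
	got, ok := RoundTrip(SpanContext{
		TraceID: "463ac35c9f6413ad48485a3953bb6124",
		SpanID:  "a2fb4a1d1a96d312",
		Sampled: true,
	})
	fmt.Println(got.TraceID, got.SpanID, got.Sampled, ok)
}
```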
Demo - Tracing (Frontend, Go)
Register the exporter and configure sampling (https://opencensus.io/stats/sampling/)
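func initJaegerTracing(log logrus.FieldLogger) {
	// Create the Jaeger exporter that receives the collected spans.
	exporter, err := jaeger.NewExporter(jaeger.Options{
		Endpoint: "http://jaeger:14268",
		Process: jaeger.Process{
			ServiceName: "frontend",
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	trace.RegisterExporter(exporter)
}

// The slide omits the header of the enclosing function; the name
// initTracing is assumed here.
func initTracing(log logrus.FieldLogger) {
	// Sample every trace, which suits a low-traffic demo.
	trace.ApplyConfig(trace.Config{
		DefaultSampler: trace.AlwaysSample(),
	})
	initJaegerTracing(log)
	...
}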
Supported Samplers
● AlwaysSample
● NeverSample
● Probability
● RateLimiting
https://github.com/census-instrumentation/opencensus-specs/blob/master/trace/Sampling.md#ratelimiting-sampler-implementation-details
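Of the samplers listed, the probability sampler is the least obvious, so here is a simplified sketch of the idea: derive the decision from the trace ID itself, so that the same trace ID always yields the same answer. `ShouldSample` and `TraceIDFromUint64` are hypothetical names; this is an approximation of the technique, not the library's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// ShouldSample makes a deterministic sampling decision from the trace ID.
// fraction is the share of traces to keep, between 0 and 1.
func ShouldSample(traceID [16]byte, fraction float64) bool {
	if fraction >= 1 {
		return true
	}
	if fraction <= 0 {
		return false
	}
	// Map the first 8 bytes of the trace ID onto [0, 2^63) and keep the
	// trace when it falls below fraction * 2^63.
	upperBound := uint64(fraction * (1 << 63))
	x := binary.BigEndian.Uint64(traceID[0:8]) >> 1
	return x < upperBound
}

// TraceIDFromUint64 builds a made-up trace ID whose first 8 bytes encode v.
func TraceIDFromUint64(v uint64) [16]byte {
	var id [16]byte
	binary.BigEndian.PutUint64(id[0:8], v)
	return id
}

func main() {
	low := TraceIDFromUint64(0)
	high := TraceIDFromUint64(^uint64(0))
	fmt.Println(ShouldSample(low, 0.5), ShouldSample(high, 0.5))
}
```

Because the decision depends only on the trace ID and the fraction, no coordination between services is needed to agree on it.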
Demo - Tracing (AdService, Java)
Register the exporter
import io.opencensus.exporter.trace.jaeger.JaegerTraceExporter;

public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Jaeger Tracing.
    JaegerTraceExporter.createAndRegister("http://jaeger:14268/api/traces", "adservice");
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
There is no trace.AlwaysSample() here because the sampling decision is propagated from the frontend.
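The inheritance of the sampling decision can be modeled in a few lines. This is a simplified model, assuming that a span whose parent was sampled is always sampled and that the local sampler is consulted only otherwise; `Decide` is a hypothetical helper, not an OpenCensus function.

```go
package main

import "fmt"

// Decide returns the sampling decision for a new span.
// parentSampled is the decision that arrived with the request, if any;
// local is the service's own sampler, consulted only when the parent
// was not sampled.
func Decide(parentSampled bool, local func() bool) bool {
	if parentSampled {
		return true
	}
	return local()
}

func main() {
	never := func() bool { return false }

	// The frontend sampled the request, so the downstream service samples
	// too, even though its own sampler would say no.
	fmt.Println(Decide(true, never))

	// Without a sampled parent, the local sampler decides.
	fmt.Println(Decide(false, never))
}
```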
Demo - Metrics (Frontend, Go)
Register the exporter and the HTTP and gRPC views
// initPrometheusStatsExporter creates the Prometheus exporter and registers
// it so that recorded view data is exported to it.
func initPrometheusStatsExporter(log logrus.FieldLogger) *prometheus.Exporter {
	exporter, err := prometheus.NewExporter(prometheus.Options{})
	if err != nil {
		log.Fatal("error registering prometheus exporter")
		return nil
	}
	view.RegisterExporter(exporter)
	return exporter
}

// startPrometheusExporter serves the exporter on /metrics for scraping.
func startPrometheusExporter(log logrus.FieldLogger, exporter *prometheus.Exporter) {
	addr := ":9090"
	log.Infof("starting prometheus server at %s", addr)
	http.Handle("/metrics", exporter)
	log.Fatal(http.ListenAndServe(addr, nil))
}

func initStats(log logrus.FieldLogger) {
	// Start prometheus exporter
	exporter := initPrometheusStatsExporter(log)
	go startPrometheusExporter(log, exporter)

	// Register the default HTTP server views and gRPC client views.
	if err := view.Register(ochttp.DefaultServerViews...); err != nil {
		log.Fatal("error registering default http server views")
	}
	if err := view.Register(ocgrpc.DefaultClientViews...); err != nil {
		log.Fatal("error registering default grpc client views")
	}
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
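For readers unfamiliar with what a Prometheus scrape of `/metrics` returns, the sketch below serves one counter in the Prometheus text exposition format using only the standard library. The metric name `demo_requests_total` and the helper functions are hypothetical; in the demo the OpenCensus Prometheus exporter generates this output from the registered views.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// requestCount is a hypothetical counter incremented once per request.
var requestCount int64

// RecordRequest increments the counter.
func RecordRequest() { atomic.AddInt64(&requestCount, 1) }

// metricsHandler renders the counter in the Prometheus text format.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprintf(w, "# TYPE demo_requests_total counter\n")
	fmt.Fprintf(w, "demo_requests_total %d\n", atomic.LoadInt64(&requestCount))
}

// Scrape simulates one scrape against the handler and returns the body.
func Scrape() string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/metrics", nil)
	metricsHandler(rec, req)
	return rec.Body.String()
}

func main() {
	RecordRequest()
	RecordRequest()
	RecordRequest()
	fmt.Print(Scrape())
}
```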
Demo - Metrics (AdService, Java)
Register the exporter
import io.opencensus.exporter.stats.prometheus.PrometheusStatsCollector;
import io.prometheus.client.exporter.HTTPServer;

public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Register Prometheus exporters and export metrics to a Prometheus HTTPServer.
    PrometheusStatsCollector.createAndRegister();
    HTTPServer prometheusServer = new HTTPServer(9090, true);
    ...
    final AdService service = AdService.getInstance();
    service.start();
    service.blockUntilShutdown();
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Metrics (AdService, Java)
gRPC views
/** Main launches the server from the command line. */
public static void main(String[] args) throws IOException, InterruptedException {
    ...
    // Registers all RPC views.
    RpcViews.registerAllViews();
    ...
}
https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/DataAggregation.md#view
Demo - Trace Monitoring (problem scenario)
Demo - Stats Monitoring (problem scenario)
Demo - Code Tuning
Store the result of parseCatalog() in the products variable to cut time across the whole code path
Redeploy (skaffold)
$ skaffold run --default-repo=gcr.io/cloudrun-237814 -n default
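The tuning step can be sketched as follows: instead of parsing the catalog on every request, parse it once and serve later requests from a `products` variable. The catalog contents, the `Product` type, and the helper names other than `parseCatalog` and `products` are hypothetical, and the actual fix in the demo may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// Product is a minimal, hypothetical catalog entry.
type Product struct {
	ID   string `json:"id"`
	Name string `json:"name"`
}

// catalogJSON stands in for the catalog file; the contents are made up.
const catalogJSON = `[{"id":"1","name":"Vintage Typewriter"},{"id":"2","name":"Film Camera"}]`

// parseCalls counts how often the expensive parse runs.
var parseCalls int

// parseCatalog decodes the whole catalog; this is the costly step.
func parseCatalog() []Product {
	parseCalls++
	var ps []Product
	if err := json.Unmarshal([]byte(catalogJSON), &ps); err != nil {
		panic(err)
	}
	return ps
}

func find(ps []Product, id string) (Product, bool) {
	for _, p := range ps {
		if p.ID == id {
			return p, true
		}
	}
	return Product{}, false
}

// getProductSlow parses the catalog on every call.
func getProductSlow(id string) (Product, bool) {
	return find(parseCatalog(), id)
}

var (
	once     sync.Once
	products []Product // parsed once, then reused
)

// getProductCached parses the catalog only on first use.
func getProductCached(id string) (Product, bool) {
	once.Do(func() { products = parseCatalog() })
	return find(products, id)
}

func main() {
	for i := 0; i < 3; i++ {
		getProductSlow("1")
	}
	fmt.Println("parses (per request):", parseCalls)

	parseCalls = 0
	for i := 0; i < 3; i++ {
		getProductCached("1")
	}
	fmt.Println("parses (cached):", parseCalls)
}
```

With the cached version, request latency no longer includes the parse time, which is what the traces and stats should show after redeployment.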
Demo - Trace Monitoring
Demo - Stats Monitoring
Wrap Up
• Consider how to use the OpenCensus Agent and Collector
• Application SRE - SLI (Service Level Indicator), SLO (Service Level Objective)
• Extend with custom application metrics
• Extend to Istio
https://github.com/census-instrumentation/opencensus-service/blob/master/DESIGN.md#implementation-details-of-agent-server
• OpenTracing + OpenCensus = OpenTelemetry
https://medium.com/opentracing/merging-opentracing-and-opencensus-f0fe9c7ca6f0
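As a small illustration of the SLI/SLO item in the list above, the sketch below computes a latency SLI (the share of requests served within a threshold) and compares it with an objective. The function names, the threshold, and the sample latencies are hypothetical.

```go
package main

import "fmt"

// LatencySLI returns the fraction of requests served at or under
// thresholdMs. With no requests it returns 1 (nothing violated).
func LatencySLI(latenciesMs []float64, thresholdMs float64) float64 {
	if len(latenciesMs) == 0 {
		return 1
	}
	good := 0
	for _, l := range latenciesMs {
		if l <= thresholdMs {
			good++
		}
	}
	return float64(good) / float64(len(latenciesMs))
}

// MeetsSLO reports whether the measured SLI satisfies the objective.
func MeetsSLO(sli, objective float64) bool {
	return sli >= objective
}

func main() {
	latencies := []float64{50, 80, 120, 300} // made-up request latencies in ms
	sli := LatencySLI(latencies, 100)
	fmt.Printf("SLI=%.2f meets 99%% objective: %v\n", sli, MeetsSLO(sli, 0.99))
}
```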
OpenTracing
Vendor-neutral APIs and instrumentation for distributed tracing.
Open standard for distributed tracing.
Libraries available in 9 languages
(Go, JavaScript, Java, Python, Ruby, PHP,
Objective-C, C++, C#)
OpenTelemetry
OpenTelemetry: the next major version of both OpenTracing and OpenCensus
OpenTracing + OpenCensus = OpenTelemetry
Q&A
@ddiiwoong
@ddiiwoong
ddiiwoong@gmail.com
https://ddii.dev