Tracing Adventures from PR - Production

Many things can go awry on the journey from opening a pull request (PR), through merge, to production deployment. Issues can arise from the application code, layers of YAML configuration, the underlying infrastructure, or the pipeline logic itself. How can distributed tracing and trace-derived metrics bring developers and operators together for troubleshooting paradise? I’ll unpack a deploy gone bad from both vantage points, building empathy for the engineer who needs to deploy their changes and for the ops engineer who is responsible for keeping the system up and running. With signals from OpenTelemetry, I will show how increasing the observability of your deploy system can facilitate better collaboration and quicker troubleshooting.

Technology

@paigerduty
Tracing Adventures
from Pull Request
to Production

@paigerduty
2
Can I Deploy?!
Weds, 4:30 pm

@paigerduty
3
Dev: UGH why is pipeline failing?
…better ping Ops

@paigerduty
4
#ops-help
Dev: :wave: Why does this deploy
keep failing?
<link>
I gotta deploy this before
the end of day

@paigerduty
5
#ops-help
Ops: Have you tried re-running
the pipeline?

@paigerduty
6
#ops-help
Dev: Yep. Tried re-running it 3x
before posting here.
Every time it keeps hanging
and spits out this
$BIG_ERROR

@paigerduty
7
Ops: OF COURSE there’s an urgent
issue to solve with only 30m left
in the day
…I can’t wait to log off

@paigerduty
8
#ops-help
Ops: Hmmm…OK let me take a
closer look

@paigerduty
Dev and Ops have distinct perspectives and data on the CI/CD platform, so it’s not always straightforward to know who is responsible for an issue
11

@paigerduty
16
It’s the
platform!
It’s your change!

@paigerduty
19
Dev
● Application
○ Code
○ Tests
● Deployment Pipeline
● Deploying changes

@paigerduty
20
Dev
Ops
● Infrastructure
○ Nodes / Capacity
○ Network
● CI/CD Platform
○ Availability / Reliability
○ Cost Efficiency

@paigerduty
21
Dev
Ops
● Infrastructure
○ Nodes / Capacity
○ Network
● CI/CD Platform
○ Availability / Reliability
○ Cost Efficiency
● Application
○ Code
○ Tests
● Deployment Pipeline
● Deploying changes

@paigerduty
2.
The Path to
Production
23

@paigerduty
Changes go on a long
opaque journey from
PR to Production.
Tracing provides a
vantage point both
dev and ops can use
24

@paigerduty
a Path to Production
25
1. PR checks

@paigerduty
a Path to Production
26
1. PR checks
2. PR Review

@paigerduty
a Path to Production
27
1. PR checks
2. PR Review
3. Merge PR

@paigerduty
a Path to Production
28
1. PR checks
2. PR Review
3. Merge PR
4. Deploy to Integration Env
5. Deploy to Staging Env
6. Deploy to Production Env

@paigerduty
29
1. PR checks
2. PR Review
3. Merge PR
4. Deploy to Integration Env
5. Deploy to Staging Env
6. Deploy to Production Env

@paigerduty
Integration
Production
Staging
31

@paigerduty
Production
Integration
Staging
32
Local
PR

@paigerduty
Traces bring people together: just as traces span service boundaries, they span team and organization boundaries.
35

@paigerduty
But we have metrics and logs...
36

@paigerduty
Pipeline Stages
39
Build, Test, Scan, Deploy

@paigerduty
Pipeline Steps: BUILD
40
Checkout repo
Run linter
Build image
Push image

@paigerduty
Pipeline as Trace Waterfall
41
integration_deploy
build_image
unit_test
scan_image
deploy
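
To make the waterfall idea concrete, here is a minimal sketch that models one pipeline run as a set of timed spans and draws them as text bars. It uses only the Python standard library, not the OpenTelemetry SDK. The span names come from the slide; the timings, the `Span` class, the `render_waterfall` helper, and the choice of `integration_deploy` as the root span are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """One timed unit of work; times are seconds since the run started."""
    name: str
    start: float
    end: float
    parent: Optional[str] = None  # name of the parent span, None for the root

    @property
    def duration(self) -> float:
        return self.end - self.start


def render_waterfall(spans: List[Span], width: int = 40) -> str:
    """Draw each span as a bar placed by its start time and sized by its duration."""
    total = max(s.end for s in spans)
    label_width = max(len(s.name) for s in spans)
    lines = []
    for s in sorted(spans, key=lambda span: span.start):
        offset = int(s.start * width / total)
        length = max(1, int(s.duration * width / total))
        lines.append(f"{s.name.ljust(label_width)} |{' ' * offset}{'#' * length}")
    return "\n".join(lines)


# Hypothetical timings for the span names shown on the slide.
pipeline = [
    Span("integration_deploy", 0, 100),
    Span("build_image", 0, 40, parent="integration_deploy"),
    Span("unit_test", 40, 60, parent="integration_deploy"),
    Span("scan_image", 60, 70, parent="integration_deploy"),
    Span("deploy", 70, 100, parent="integration_deploy"),
]

print(render_waterfall(pipeline))
```

Read top to bottom, the bars show which step started when and which one took the largest share of the run, which is the same question a tracing UI answers for a real pipeline.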

@paigerduty
Pipeline Steps - Platform POV
43
Receive webhook
Schedule
Run step 1
Run step 2
Clean up
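
The platform view above can also feed trace-derived metrics. The sketch below sums span time per owner and finds the slowest span, which is one way to move past "it's the platform" versus "it's your change". The span names mirror the slide; the durations and the `owner` labels ("platform" for webhook, scheduling, and cleanup, "dev" for the run steps) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (span name, owner, duration in seconds); every value here is hypothetical.
SpanRecord = Tuple[str, str, float]

run: List[SpanRecord] = [
    ("receive_webhook", "platform", 1.0),
    ("schedule", "platform", 45.0),
    ("run_step_1", "dev", 30.0),
    ("run_step_2", "dev", 20.0),
    ("clean_up", "platform", 4.0),
]


def seconds_by_owner(spans: List[SpanRecord]) -> Dict[str, float]:
    """Sum span durations per owner, a simple trace-derived metric."""
    totals: Dict[str, float] = defaultdict(float)
    for _name, owner, duration in spans:
        totals[owner] += duration
    return dict(totals)


def slowest_span(spans: List[SpanRecord]) -> SpanRecord:
    """Return the single span that took the longest."""
    return max(spans, key=lambda record: record[2])


print(seconds_by_owner(run))
print(slowest_span(run))
```

In this made-up run the totals are even, but the slowest single span is `schedule`, which would point the conversation toward platform capacity rather than the change itself.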

@paigerduty
a Path to Production
44
1. PR checks
4. Deploy to Integration Env
5. Deploy to Staging Env
6. Deploy to Production Env

@paigerduty
Data Flow
50
CI/CD System (OTel SDK/Integration): Generate & send
OTel Collector*: Process & route
O11y Backend/UI: Store & visualize
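
The three stages of this data flow can be sketched in memory to show who does what. This is not the OpenTelemetry SDK or Collector API: `generate`, `process`, and `InMemoryBackend` are hypothetical stand-ins, and the span data is invented.

```python
from typing import Dict, Iterable, List

Span = Dict[str, object]


def generate(pipeline_id: str) -> List[Span]:
    """CI/CD system side: emit one span per step (hypothetical data)."""
    return [
        {"pipeline": pipeline_id, "name": "build_image", "duration_s": 40, "status": "ok"},
        {"pipeline": pipeline_id, "name": "deploy", "duration_s": 30, "status": "error"},
    ]


def process(spans: Iterable[Span], environment: str) -> List[Span]:
    """Collector-like stage: enrich every span with a shared attribute."""
    return [{**span, "environment": environment} for span in spans]


class InMemoryBackend:
    """Stand-in for an observability backend: stores spans and answers queries."""

    def __init__(self) -> None:
        self.spans: List[Span] = []

    def store(self, spans: Iterable[Span]) -> None:
        self.spans.extend(spans)

    def failed(self) -> List[str]:
        """Names of spans whose status is 'error'."""
        return [str(span["name"]) for span in self.spans if span["status"] == "error"]


backend = InMemoryBackend()
backend.store(process(generate("run-1"), environment="integration"))
print(backend.failed())  # ['deploy']
```

Keeping enrichment in the middle stage means the CI/CD system only has to emit spans, while shared attributes such as the environment are added in one place.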

@paigerduty
DIY Instrumentation
● otel-cli
● Tracepusher
● OTel Registry
find your
stack :)
52
OpenTelemetry
Collector
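
When pipeline steps run as separate processes, DIY instrumentation needs a way to hand trace context from one step to the next so their spans land in the same trace. The sketch below does not call otel-cli or Tracepusher. It only shows the W3C Trace Context `traceparent` format (version `00`, 32-hex trace ID, 16-hex span ID, 2-hex flags) that such a handoff can use. The helper names and the `TRACEPARENT` environment variable are assumptions for illustration.

```python
import re
import secrets
from typing import Tuple

# Version 00 traceparent: 00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(value: str) -> Tuple[str, str, str]:
    """Split a traceparent into (trace_id, span_id, flags) or raise ValueError."""
    match = TRACEPARENT_RE.match(value)
    if match is None:
        raise ValueError(f"not a version-00 traceparent: {value!r}")
    return match.group(1), match.group(2), match.group(3)


def new_traceparent() -> str:
    """Start a new trace: random trace ID and span ID, sampled flag set."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"


def child_traceparent(parent: str) -> str:
    """Keep the parent's trace ID and flags, mint a fresh span ID."""
    trace_id, _parent_span_id, flags = parse_traceparent(parent)
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


root = new_traceparent()
# Hypothetical handoff: the next step would read this from its environment.
step_env = {"TRACEPARENT": child_traceparent(root)}
print(root)
print(step_env["TRACEPARENT"])
```

Both printed values share the same trace ID, which is what lets a backend stitch spans from different steps into one waterfall.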

@paigerduty
Integrations
● Ansible Tracing Callback
● Maven Extension
● Pytest
59

@paigerduty
Native Instrumentation
● Jenkins Plugin
● GitHub Actions Step
● Concourse CI (experimental)
● Zuul OTel Proposal
61

@paigerduty
Thanks!
You can find me at
- paigerduty@hachyderm.io
- paigerduty@chronosphere.io
Slides from SlidesCarnival, pics from Pixabay
66

@paigerduty
70
Paige’s tale of how a 1-line config change caused a SEV-1…

Tracing Adventures from PR - Production

1. @paigerduty Tracing Adventures from Pull Request to Production

2. @paigerduty 2 Can I Deploy?! Weds, 4:30 pm

3. @paigerduty 3 Dev: UGH why is pipeline failing? …better ping Ops

4. @paigerduty 4 #ops-help Dev: :wave: Why does this deploy keep failing? <link> I gotta deploy this before the end of day

5. @paigerduty 5 #ops-help Ops: Have you tried re-running the pipeline?

6. @paigerduty 6 #ops-help Dev: Yep. Tried re-running it 3x before posting here. Every time it keeps hanging and spits out this $BIG_ERROR

7. @paigerduty 7 Ops: OF COURSE there’s an urgent issue to solve with only 30m left in the day …I can’t wait to log off

8. @paigerduty 8 #ops-help Ops: Hmmm…OK let me take a closer look

9. @paigerduty 9

10. @paigerduty 1. Dev 10 Ops

11. @paigerduty Dev and Ops have distinct perspectives and data on the CI/CD platform, so it’s not always straightforward to know who is responsible for an issue 11

12. @paigerduty 12 Dev

13. @paigerduty 13 Ops

14. @paigerduty 14

15. @paigerduty 15 It’s the platform!

16. @paigerduty 16 It’s the platform! It’s your change!

17. @paigerduty 17

18. @paigerduty 18 Dev Ops

19. @paigerduty 19 Dev ● Application ○ Code ○ Tests ● Deployment Pipeline ● Deploying changes

20. @paigerduty 20 Dev Ops ● Infrastructure ○ Nodes / Capacity ○ Network ● CI/CD Platform ○ Availability / Reliability ○ Cost Efficiency

21. @paigerduty 21 Dev Ops ● Infrastructure ○ Nodes / Capacity ○ Network ● CI/CD Platform ○ Availability / Reliability ○ Cost Efficiency ● Application ○ Code ○ Tests ● Deployment Pipeline ● Deploying changes

22. @paigerduty 22

23. @paigerduty 2. The Path to Production 23

24. @paigerduty Changes go on a long opaque journey from PR to Production. Tracing provides a vantage point both dev and ops can use 24

25. @paigerduty a Path to Production 25 1. PR checks

26. @paigerduty a Path to Production 26 1. PR checks 2. PR Review

27. @paigerduty a Path to Production 27 1. PR checks 2. PR Review 3. Merge PR

28. @paigerduty a Path to Production 28 1. PR checks 2. PR Review 3. Merge PR 4. Deploy to Integration Env 5. Deploy to Staging Env 6. Deploy to Production Env

29. @paigerduty 29 1. PR checks 2. PR Review 3. Merge PR 4. Deploy to Integration Env 5. Deploy to Staging Env 6. Deploy to Production Env

30. @paigerduty 30

31. @paigerduty Integration Production Staging 31

32. @paigerduty Production Integration Staging 32 Local PR

33. @paigerduty 33 Dev & Ops

34. @paigerduty 3. Tracing Together 34

35. @paigerduty Traces bring people together: just as traces span service boundaries, they span team and organization boundaries. 35

36. @paigerduty But we have metrics and logs... 36

37. @paigerduty Why add tracing? 37

38. @paigerduty What’s a Trace? 38

39. @paigerduty 39 Pipeline Stages: Build, Test, Scan, Deploy

40. @paigerduty 40 Pipeline Steps (BUILD): Checkout repo, Run linter, Build image, Push image