SignalFx
Microservices and Devs in Charge:
Why Monitoring is an Analytics Problem
Phillip Liu
phillip@signalfx.com
@SignalFx - signalfx.com
Agenda
• My background
• Microservices, a review
• Analytics approach to monitoring
• Code push side effects, an example
• Summary
My Background
Experience
[2013 - ] SignalFx - Founder, CTO, Software Engineer
Microservices; Monitoring using Analytics
[2008 - 2012] Facebook - Software Engineer, Software Architect
Hyperscale SOA; Monitoring using Nagios, Ganglia, and in-house
Analytics
[2004 - 2008] Opsware - Chief Architect, Software Engineer
Monolithic Architecture; Monitoring using Ganglia, Nagios, Splunk
[2000 - 2004] Loudcloud - Software Engineer
LAMP, Application Server; Monitoring using SNMP, Ganglia, NetCool
[1998 - 2000] Marimba - Software Engineer
Client / Server; Monitoring using SNMP, FreshWater Software
[ … ]
Microservices, a Review
A Microservices Definition
Loosely coupled service
oriented architecture with
bounded context.
Adrian Cockcroft
SignalFx’s Microservices
More than 15 internal services.
Spanning hundreds of
instances.
Across 3 AZs.
Have dependencies on
tens of external services.
Monitoring Challenges
• High iteration rate leads to shortened test
cycles
• Integration test combinations are intractable
• Catch problems during rolling deployments
• Identify upstream/downstream side effects
• e.g. backpressure
• Identify brownouts before the customer
• etc.
Analytics Approach to Monitoring
Measure
Store
Analyze
Detect
Examples
Monitoring at SignalFx
• We use SignalFx to monitor SignalFx
• CollectD for OS and Docker metrics on all VMs
• Yammer metrics for all Java app servers
• Custom logger to count exception types
• All metrics are sent to an analytics service
• Each service deploys at its own cadence
• Push to lab, then to a canary in prod, then to the rest of the tier
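The "custom logger to count exception types" bullet can be sketched as follows. This is a minimal illustration in Python, not the logger used at SignalFx (the deck only says the app servers are Java); the class name `ExceptionCountingHandler`, the logger name, and the `demo` function are hypothetical. In a real setup the counts would be reported as metrics rather than returned.

```python
import logging
from collections import Counter


class ExceptionCountingHandler(logging.Handler):
    """Hypothetical handler: counts logged exceptions by class name."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def emit(self, record):
        # exc_info is populated when the caller uses logger.exception(...)
        # or passes exc_info=True.
        if record.exc_info and record.exc_info[0] is not None:
            self.counts[record.exc_info[0].__name__] += 1


def demo():
    """Log three failures and return the per-type exception counts."""
    logger = logging.getLogger("metadata-api-demo")  # made-up logger name
    logger.setLevel(logging.ERROR)
    logger.propagate = False  # keep the demo from printing tracebacks
    handler = ExceptionCountingHandler()
    logger.addHandler(handler)
    for bad_input in ("x", "y", None):
        try:
            int(bad_input)
        except Exception:
            logger.exception("parse failed")
    logger.removeHandler(handler)
    return dict(handler.counts)


if __name__ == "__main__":
    print(demo())
```

Counting by type keeps the metric cardinality small while still separating a new exception class from the usual background noise.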
Code Push Side Effects
Push a canary instance; the Metadata API
dashboard shows a healthy tier.
However, the upstream UI dashboard
showed an unusual number of timeouts.
In search of root cause.
Always safe to start by looking at exception counts.
Can’t derive much from all the noise.
Sum the # of exceptions to create a single signal.
Compare sum with time-shifted sum from a day ago.
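The two steps above (sum the exception counts into one signal, then compare it with the same signal shifted by a day) can be sketched roughly as below. The data layout, the host names, and the function names are assumptions made for illustration; the deck does not show how the analytics service expresses this.

```python
from collections import defaultdict

DAY = 24 * 60 * 60  # one day, in seconds

# Made-up sample data: (timestamp, host, exception_count)
SAMPLE_POINTS = [
    (0, "analytics-1", 2), (0, "ui-1", 1),
    (60, "analytics-1", 1), (60, "ui-1", 2),
    (DAY, "analytics-1", 40), (DAY, "ui-1", 2),
    (DAY + 60, "analytics-1", 55), (DAY + 60, "ui-1", 1),
]


def sum_across_hosts(points):
    """Collapse per-host points into a single timestamp -> total series."""
    total = defaultdict(int)
    for ts, _host, count in points:
        total[ts] += count
    return dict(total)


def compare_with_day_ago(series, shift=DAY):
    """Pair each value with the value one `shift` earlier, where both exist."""
    return {
        ts: (value, series[ts - shift])
        for ts, value in sorted(series.items())
        if ts - shift in series
    }


if __name__ == "__main__":
    summed = sum_across_hosts(SAMPLE_POINTS)
    for ts, (now, day_ago) in compare_with_day_ago(summed).items():
        print(ts, now, day_ago)
```

Comparing against the same time a day earlier cancels out daily traffic patterns, so a jump after a push stands out even when the absolute count looks unremarkable.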
Look at an outlier host - an Analytics
service host.
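One simple way to surface an outlier host is to total the exceptions per host and flag any host far above the median. This is a sketch under assumptions: the `factor` threshold, the host names, and the sample counts are invented, and the deck does not say which outlier method was actually used.

```python
from collections import Counter
from statistics import median

# Made-up sample data: (timestamp, host, exception_count)
SAMPLE_POINTS = [
    (0, "analytics-1", 40), (60, "analytics-1", 55),
    (0, "ui-1", 2), (60, "ui-1", 1),
    (0, "metadata-1", 3), (60, "metadata-1", 1),
    (0, "ingest-1", 1), (60, "ingest-1", 1),
]


def outlier_hosts(points, factor=5.0):
    """Return hosts whose total count exceeds `factor` times the median host."""
    per_host = Counter()
    for _ts, host, count in points:
        per_host[host] += count
    # Guard against a zero median so quiet fleets do not flag every host.
    baseline = max(median(per_host.values()), 1)
    return sorted(h for h, c in per_host.items() if c > factor * baseline)


if __name__ == "__main__":
    print(outlier_hosts(SAMPLE_POINTS))
```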
java.io.InvalidObjectException: enum constant MURMUR128_MITZ_64 does not exist in class com.google.common.hash.BloomFilterStrategies
    at java.io.ObjectInputStream.readEnum(ObjectInputStream.java:1743) ~[na:1.7.0_79]
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347) ~[na:1.7.0_79]
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) ~[na:1.7.0_79]
    …
Looking at the Analytics service's logs revealed
the source of the problem.
• Analytics across multiple microservices reduced the
time to identify the problem. From push to resolution
took ~15 min
• Service instrumentation helped narrow down the
root cause
• The discovery allowed us to create a detector that uses
analytics to notify us of similar problems in the future
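A detector of the kind described in the last bullet could look roughly like this: fire when the summed exception signal stays well above its day-ago baseline for several consecutive points. The `ratio` and `min_points` parameters and the sample values are assumptions chosen for illustration, not the detector SignalFx built.

```python
def detect(current, baseline, ratio=3.0, min_points=2):
    """Return the index at which the detector fires, or None if it never does.

    `current` and `baseline` are equal-length sequences of summed exception
    counts; `baseline` is the same signal time-shifted by one day.
    """
    streak = 0
    for i, (now, before) in enumerate(zip(current, baseline)):
        # max(before, 1) avoids firing on tiny counts over a zero baseline.
        if now > ratio * max(before, 1):
            streak += 1
            if streak >= min_points:
                return i
        else:
            streak = 0  # require consecutive breaches, not scattered spikes
    return None


if __name__ == "__main__":
    # Made-up values: the push lands at index 2.
    print(detect([3, 4, 42, 56, 60], [3, 3, 3, 3, 3]))
```

Requiring consecutive breaches trades a little detection latency for fewer false alarms from a single noisy data point.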
Other Examples
• A customer started dropping data because they
reverted to an unsupported API
• Compare tsdb write throughput of two different
write strategies
• Create per-service capacity reports
• Identify memory usage patterns across our
Analytics service
• Create a detector for every previously uncaught
error condition, as an output of each postmortem
Summary
• Measure and store as many metrics and events as
possible
• Use data analytics techniques to
• Identify problems
• Chase down root cause
• Create analytics-based detectors to notify you of
recurrence
Thank You!
Phillip Liu
phillip@signalfx.com
WE’RE HIRING
jobs@signalfx.com
@SignalFx - signalfx.com
