Discover Financial Technology
Kurt Schneider
InfluxData Journey
Kurt Schneider
Domain Architect – Observability
The selection process and rollout of InfluxDB (Telegraf) at Discover Financial.
InfluxData Journey
30+ years in Monitoring.
27 Years at Large Insurance Company
5 Years at Discover Financial
Discover Financial - Timeline
• Jan 2019: We decided to replace CA UIM, potentially with a different toolset.
• CA UIM was our infrastructure monitoring tool.
• It is an agent-based, on-prem solution (the old Nimsoft).
• We looked at SignalFx, IBM, Datadog, InfluxData and the new UIM solution.
• We ran proofs of concept (POCs) with installed Datadog and InfluxDB.
• We chose InfluxData as our partner and signed the contract in Dec 2019.
RFP Process – Reasons for Selection
• Price
• Technology
• Datadog gaps were the same
• No call back process
• Similar tools: both agents are written in Go
• InfluxDB data retrieval was fastest
• Datadog GUI and alerting were more refined
• We were not looking for APM or Synthetics (Datadog has many other capabilities we were not looking for).
Discover 2020-2021 Migration Plan (diagram)
• Synthetic Monitoring: Web, API, DNS, SSL, HTML, TCP, NTP; on-prem and cloud testing
• Infrastructure Monitoring: Windows, AIX, Unix, logs, network
• Application Performance Management and Real User Monitoring
• Event Management
• Machine Learning / AIOps
• Event Portal Strategy: shared API (2020); four "API Communication" links are shown
• Status labels on the diagram: 3/30/21; In Progress; Live 10/1/19; 700+ daily users; 18K JVM installed; OCP
• SaaS solution
• InfluxDB: a time series database designed from the ground up to collect time series data such as server metrics.
• Provably the fastest database for these types of metrics.
• Hundreds of extensions
• DevOps tool with easy access to metrics
A time series database is a software system that is optimized for
storing and serving time series through associated pairs of time
and value. In some fields, time series may be called profiles,
curves, traces or trends.
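
To make the "pairs of time and value" idea concrete, here is a minimal Python sketch of a single series held as time-ordered pairs and read back by time range. The `Series` class, its method names and the sample CPU readings are invented for illustration; this is not how InfluxDB stores data internally.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class Series:
    """One named series kept as parallel, time-ordered lists (illustrative only)."""
    name: str
    times: list[int] = field(default_factory=list)     # Unix seconds
    values: list[float] = field(default_factory=list)  # sampled readings

    def append(self, t: int, v: float) -> None:
        # Assumption: points arrive in time order, as periodic agent samples usually do.
        if self.times and t < self.times[-1]:
            raise ValueError("out-of-order point")
        self.times.append(t)
        self.values.append(v)

    def between(self, start: int, end: int) -> list[tuple[int, float]]:
        # Binary search on the sorted timestamps, so a range read
        # only touches the slice that falls inside [start, end].
        lo = bisect_left(self.times, start)
        hi = bisect_right(self.times, end)
        return list(zip(self.times[lo:hi], self.values[lo:hi]))


cpu = Series("cpu_usage_percent")
for t, v in [(1000, 12.5), (1010, 14.0), (1020, 80.2), (1030, 15.1)]:
    cpu.append(t, v)

window = cpu.between(1010, 1020)
print(window)  # prints [(1010, 14.0), (1020, 80.2)]
```

Keeping points sorted by time is what makes "give me the last N minutes" cheap, which is the typical query for server metrics.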
InfluxData replacing CA-UIM and Ganglia
InfluxData Architecture Discover Financial (diagram)
• Users, Chronograf UI (Cloud), OKTA (shown twice)
• AWS Hosted InfluxData: InfluxDB Cloud, Kapacitor alert engine
• Influx Gateway / Telegraf Forwarders:
  vclp006888.na.discoverfinancial.com
  vclp006889.na.discoverfinancial.com
  vclp007381.rw.discoverfinancial.com
  vclp007382.rw.discoverfinancial.com
• Enterprise Servers: Telegraf agents on Windows, Linux and AIX (8000+)
• Ports shown: TCP 48048, TCP 9092
• Zones shown: NA, RW
InfluxData
Hundreds of Input and Output Plugins Available
Telegraf Current Collection Progress
Columns: Using Today; Coming Soon (Actively Investigating); Future
Enterprise Server Metrics
Over 100 metrics out of the box
Sample collected:
Linux (8000+), AIX (300+) and Windows (1200+)
VMware (3 vSphere)
Citrix
Active Directory
Over 10,000 Telegraf agents are reporting to our Cloud instance.
We can alert on any combination of these metrics.
We will be offering some self-service capabilities and interested-party notification capabilities.
Moogsoft Integration
Chronograf Usage
Chronograf UI – Time Series in Real Time
Chronograf Forwarder Health
Telegraf Average Usage across Enterprise
Kapacitor
Grafana Usage
Grafana - Work from Home Dashboard
Grafana / InfluxDB Linux Dashboards
Hundreds of Community Dashboards Available
vSphere Visualization with Telegraf Data
Maturity Model for Infrastructure and Platforms
State
• Level 1: Deployment
  • Out-of-the-box configuration
  • Agents deployed
  • Top 10 KPIs monitored (CPU, disk, processes, etc.)
• Level 2: Standard
  • Integrated with event manager
  • Metrics, events, logs, and traces
  • Real-time visualization
  • Application process monitors
  • Infrastructure tied to applications (CMDB)
• Level 3: Preventative and Self Healing
  • Automatically kick off orchestration
  • Baseline monitoring and anomaly detection
Behaviors
• Learn the tool
• Work from tickets and calls
• Use tool to root cause
• Tune tool
• Ongoing maintenance
• Adding new
Tips and Tricks
• Forwarder configuration is important (we have four forwarders for 10K Telegraf agents):
    metric_batch_size = 30000
    metric_buffer_limit = 500000
    collection_jitter = "0s"
    flush_interval = "1s"
    flush_jitter = "0s"
• Batching is important when you ingest a lot of data.
• Don't duplicate configurations anywhere.
• The forwarders need to run a different local configuration as well; otherwise they risk forwarding data that is being filtered.
• Telegraf: choose metrics that matter.
• There are thousands of metrics in attribute groups; collect only the valuable ones.
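
To show why the batch size and buffer limit in the forwarder settings matter, here is a toy Python model of a bounded metric buffer drained in batches. The `OutputBuffer` class and the small numbers are hypothetical and chosen for readability. It mimics the general idea (flush at most a batch at a time, drop the oldest metrics once the buffer is full) and is not Telegraf's actual implementation.

```python
from collections import deque


class OutputBuffer:
    """Toy model of a bounded metric buffer flushed in batches (not Telegraf code)."""

    def __init__(self, batch_size: int, buffer_limit: int) -> None:
        self.batch_size = batch_size
        # A deque with maxlen discards the oldest entry when a new one
        # is appended to a full buffer.
        self.buffer: deque[str] = deque(maxlen=buffer_limit)
        self.dropped = 0

    def add(self, metric: str) -> None:
        # Count the metric that is about to be pushed out, if any.
        if len(self.buffer) == self.buffer.maxlen:
            self.dropped += 1
        self.buffer.append(metric)

    def flush(self) -> list[list[str]]:
        """Drain the buffer into batches of at most batch_size metrics."""
        batches = []
        while self.buffer:
            n = min(self.batch_size, len(self.buffer))
            batches.append([self.buffer.popleft() for _ in range(n)])
        return batches


# Ten metrics arrive between flushes, but the buffer only holds seven.
buf = OutputBuffer(batch_size=3, buffer_limit=7)
for i in range(10):
    buf.add(f"m{i}")

batches = buf.flush()
print([len(b) for b in batches], buf.dropped)  # prints [3, 3, 1] 3
```

In this model a larger buffer limit tolerates longer gaps between successful flushes at the cost of memory, while a larger batch size means fewer writes per flush. That is one plausible reading of why the forwarders above use large values for both.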
Questions

Kurt Schneider [Discover Financial] | How Discover Modernizes Observability with InfluxDB Cloud | InfluxDays Virtual Experience NA 2020


Editor's Notes

  1. InfluxData will be able to collect far more metrics than CA UIM, including individual process data.
     • Responsive GUI/DB (can use Grafana or Chronograf)
     • Can scale with on-prem GUIs as well for some cases
     • More tracking of what is not being monitored
     • More data points for CPU, memory, etc.
     • Better API for DevOps and GitHub self-service
     • No infrastructure patching
     • Time series database (fastest in market)