1. Predictive Monitoring & Efficient Incident Response
Tuesday, 20 November, 2018
DeFabrique, Utrecht, the Netherlands
Jan-Jaap Oosterwijk, Technology Evangelist
Constantin Bajireanu, Manager, Service Operations Center
2.
3. 70% OF EMPLOYEES ARE IN
ENGINEERING/RESEARCH/
DEVELOPMENT
+5 BILLION DEVICES AND
APPLICATIONS SECURED
SERVING 400+ CUSTOMERS
IN 75+ COUNTRIES
236 PATENTS & 268 PATENTS
PENDING
IRDETO IS THE WORLD
LEADER IN DIGITAL
PLATFORM SECURITY
NEARLY 1,000 SECURITY
EXPERTS EMPLOYED
+15 LOCATIONS COVERING
6 CONTINENTS
4. End-to-End Irdeto 360 Security Portfolio
Production, Content Aggregation, Distribution and promotion, Consumption
Content Owners, Broadcasters, Distributors, Devices, Consumer, Sport Rights Holders
5. IRDETO’S VISION
To build a secure future, where
people can embrace connectivity
without fear.
Irdeto protects platforms and
applications for media &
entertainment, games,
connected transport and IoT
connected industries.
6. Service Operations Center
Incident Management
▶ Incident registration
▶ Triaging and initial troubleshooting
▶ Standard resolution procedures
▶ Escalation
Monitoring
▶ Service availability
▶ Capacity
▶ Health-check and Performance
7. How We Got Started
▶ In 2013 we started offering access to our products as a managed
service and established a 24/7 Service Operations Center.
▶ Build monitoring framework
▶ Define and implement incident management process
▶ Monitoring infrastructure is important... but not enough... by far.
▶ Troubleshooting requires logs... logs... and more logs.
8. Our past monitoring framework
[Diagram. Sources: Applications, Networks, Servers, Public Cloud, Web Services/Global. Data: Metrics, Events, Alerts, Logs. Service Operations Center: Monitoring, Incident, Troubleshooting.]
9. Our present monitoring framework
[Diagram. Sources: Applications, Networks, Servers, Public Cloud, Web Services/Global. Data: Logs, Metrics, Events, Alerts. Service Operations Center: Monitoring, Dashboards, Incident, Troubleshooting.]
10. Present
▶ Using Splunk since 2013
▶ Currently at 100 GB a day
▶ What’s in Splunk today
▶ Application logs
▶ Web-server logs
▶ AWS ELB/ALB logs
▶ Infrastructure logs
▶ Some metrics
[Diagram: Ingest, Measure, Investigate, Dashboard, Alert. Cycle: Set Threshold, Observe Trend, Alert, Repeat.]
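
The threshold cycle on this slide (set a threshold, observe the trend, alert, repeat) can be illustrated with a minimal Python sketch. This is not how the thresholds are actually set in Splunk. The function names, the sample error counts, and the "mean plus k standard deviations" rule are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def suggest_threshold(history, k=3.0):
    """Assumed rule: threshold = mean + k sample standard deviations."""
    return mean(history) + k * stdev(history)

def run_cycle(history, incoming, k=3.0):
    """Check each new value against the threshold, then refine it.

    Values that breach the threshold are collected as alerts.
    Normal values are folded back into the history so the next
    threshold follows the observed trend.
    """
    history = list(history)          # do not mutate the caller's list
    alerts = []
    threshold = suggest_threshold(history, k)      # set threshold
    for value in incoming:
        if value > threshold:                      # alert
            alerts.append(value)
        else:                                      # observe trend
            history.append(value)
            threshold = suggest_threshold(history, k)  # repeat
    return alerts

# Hypothetical per-interval error counts, for illustration only.
baseline = [4, 5, 6, 5, 4, 6, 5, 5]
print(run_cycle(baseline, [5, 6, 20]))
```

Leaving alerting values out of the history is a deliberate choice in this sketch: an outage does not raise the threshold that is supposed to detect the next one.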
11. What’s been monitored
▶ Business metrics
▶ Number of requests
▶ Error rate
▶ Response time
▶ Trends
▶ Sudden drop in traffic
▶ Sudden increase in errors
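
The metrics and trends listed above can be sketched in Python as a rough illustration. The record fields (`minute`, `status`, `response_ms`), the choice to count HTTP 5xx as errors, and the 50% change ratio are hypothetical. They are not taken from the actual log format or alert rules.

```python
from collections import defaultdict
from statistics import mean

def summarize(records):
    """Group records by minute.

    Returns {minute: (requests, error_rate, avg_response_ms)}.
    """
    buckets = defaultdict(list)
    for r in records:
        buckets[r["minute"]].append(r)
    summary = {}
    for minute, rows in sorted(buckets.items()):
        # Assumption: a status of 500 or higher counts as an error.
        errors = sum(1 for r in rows if r["status"] >= 500)
        avg_ms = mean(r["response_ms"] for r in rows)
        summary[minute] = (len(rows), errors / len(rows), avg_ms)
    return summary

def sudden_change(series, window=3, ratio=0.5):
    """Compare the latest value with the mean of the previous `window` values.

    Returns "drop", "increase", or None when the change stays within `ratio`
    of the baseline or when there is not enough history.
    """
    if len(series) <= window:
        return None
    baseline = mean(series[-window - 1:-1])
    latest = series[-1]
    if latest < baseline * (1 - ratio):
        return "drop"
    if latest > baseline * (1 + ratio):
        return "increase"
    return None

# Hypothetical per-minute series, for illustration only.
print(sudden_change([100, 110, 90, 30]))         # request counts
print(sudden_change([0.01, 0.01, 0.01, 0.05]))   # error rates
```

Comparing against a short trailing baseline rather than a fixed number lets the same check work for services with very different traffic levels.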