2. 2
Splunk at Fastweb
Alessandro Bono, Network Operations Control Coordinator
Vincenzo Vignera, Network Operations Control Professional
3. 3
Fastweb Overview
Today FASTWEB is the Italian leader in Ultra Broadband
With ~500K customers connected at speeds up to 100 Mbps, FASTWEB has a ~70% share of the UBB market
~310K FTTH customers, of which FASTWEB ~300K
~400K FTTC customers, of which FASTWEB ~200K
~710K UBB customers, of which FASTWEB ~500K (~70%)
6. 6
Service Platforms
Monitoring Platforms
OSS Platforms
VAS & Mobile Data Platforms
~3.1 million mailboxes
815K MVNO USIMs
~200K DNS queries/sec
1.1 million ACS devices
2 million pay-per-use users
4K servers monitored with agents
200K network devices
4.5 million KPIs collected
8. 8
Reporting Delivered Services
Standard Reporting of Delivered Services
– Situation: The Service Platforms team and the Backbone team spend a lot of time reporting on delivered services
– Struggling with: Dozens of platforms, each reporting different KPIs
– Wanted: A centralized view for periodic reporting on delivered services
9. 9
Reporting Delivered Services
Before:
# Monitoring Software
# CLI Commands
# Database Queries
# Code
# …
After:
Enter Splunk: Splunk Enterprise enables reporting for different services with the same output
10. 10
Analyze Bypass SPAMMER Filters
– Situation: Real-time log analysis of transactions sent from a single IP address that satisfy at least two of the following conditions:
• 2 or more recipients
• At least 20 mails ("QUEUE From" with different IDs in 5 minutes)
• At least 2 different From addresses
• At least 1 e-mail known as spam (SPAM-BLOCKED)
- Next: starting from the mailbox used for «Auth», a drill-down report of mail sent and the % of «Subject» marked as SPAM
- Top spammers by source IP (latest 15m)
- Internet forwarding check vs the Fastwebnet domain (reporting mailboxes with more than 1 forward vs Fastwebnet; external database lookup to retrieve the customer account)
SPAM Finder: Analyzing Problems
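The "top spammers by source IP (latest 15m)" report could be sketched roughly as follows. This is only an illustration, not the query used at Fastweb: it reuses the index, sourcetype and field names from the deck's SPAM Finder query and assumes that `Relay` carries the sending IP address. The limit of 10 rows is arbitrary.

```spl
index="msr" sourcetype="c*_smtp" transaction_type=SPAM-BLOCKED earliest=-15m
| stats count AS blocked_mails, dc(Auth) AS nb_mailboxes BY Relay
| sort - blocked_mails
| head 10
```

Counting distinct `Auth` values next to the blocked mails shows whether a single source is abusing one mailbox or several.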
11. 11
index="msr" sourcetype="c*_smtp" (transaction_type=QUEUE OR transaction_type=SPAM-BLOCKED)
| stats first(_time) AS time, values(transaction_type) AS type, values(Recipient) AS Recipients, dc(Recipient) AS nb_recipients, values(Relay) AS Relay, values(Auth) AS Auth, values(From) AS From by transaction_id
| search Auth=*
| eval more_than_2_recipients=IF(nb_recipients>=2,1,0)
| eval spam_blocked=IF(type="SPAM-BLOCKED",1,0)
| stats first(time) AS first_time, dc(transaction_id) AS nb_mails, values(From) as Froms, dc(From) AS nb_froms, sum(more_than_2_recipients) AS nb_more_than_2_recipients, sum(spam_blocked) AS nb_spam_blocked BY Relay, Auth
| eval more_than_2_recipients=IF(nb_more_than_2_recipients>0,1,0)
| eval spam_blocked=IF(nb_spam_blocked>0,1,0)
| eval more_than_20_mails=IF(nb_mails>=20,1,0)
| eval more_than_2_froms=IF(nb_froms>=2,1,0)
| eval possible_spam=more_than_2_recipients+more_than_20_mails+more_than_2_froms+spam_blocked
| where possible_spam>=2
| eval first_sent_at=strftime(first_time, "%H:%M:%S")
| eval possible_spam="yes"
| table first_sent_at Relay Auth Froms more_than_2_recipients more_than_20_mails more_than_2_froms spam_blocked possible_spam
| sort - first_sent_at
SPAM Finder: Analyzing Problems
12. 12
Storming Detections
Detect Storming Network Devices
– Situation: Network devices can log thousands of syslog messages every second because of interface problems
– Wanted: A network devices dashboard to analyze trends
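A storming device, as described above, could be spotted with a search along these lines. This is a minimal sketch under stated assumptions: the index name `network`, the sourcetype `syslog` and the threshold of 1000 messages per second are all hypothetical and are not taken from the deck.

```spl
index="network" sourcetype="syslog"
| bin _time span=1s
| stats count AS msgs_per_sec BY _time, host
| where msgs_per_sec >= 1000
| stats max(msgs_per_sec) AS peak_msgs_per_sec, count AS storming_seconds BY host
| sort - peak_msgs_per_sec
```

The first `stats` computes a per-second message rate for each device. The second summarizes how bad and how long each storm was, which suits a dashboard panel.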
13. 13
Storming Detections
- Enter Splunk:
- Trend analysis supported by dashboards
- Automatic Actions
- Monitoring Deviations
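One possible way to monitor deviations is to compare each device's message count with its own baseline. The sketch below assumes a hypothetical `network` index and `syslog` sourcetype, and uses a "mean plus three standard deviations" rule purely as an example. The deck does not say which rule was actually applied.

```spl
index="network" sourcetype="syslog"
| bin _time span=5m
| stats count AS msgs BY _time, host
| eventstats avg(msgs) AS avg_msgs, stdev(msgs) AS stdev_msgs BY host
| where msgs > avg_msgs + 3 * stdev_msgs
| table _time host msgs avg_msgs stdev_msgs
```

Because `eventstats` computes the baseline per `host`, a normally chatty device is not flagged just for being chatty. Only intervals that stand out against that device's own history are returned.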
17. 17
Network Troubleshooting
Troubleshooting a Bug on Network Devices
– Situation: Problem on 15k network devices; every ADSL board serves 48 customers, so ~700K customers are affected and unable to surf until the board is reset
– Struggling with: Thousands of calls to the Customer Center to report the problem
– Wanted: Decrease recovery time from 3h to 1h
18. 18
Network Troubleshooting – First Step
Enter Splunk:
– Customer Care uses automated tools to check customer connectivity
– Splunk intercepts the actions of those automated tools
– We decreased reporting by 50%
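Intercepting the automated connectivity checks might look like the search below. This is only a sketch: the index `customer_care`, the sourcetype `connectivity_check`, the fields `device`, `board` and `customer_id`, and the threshold of 5 customers are all hypothetical names and values chosen for illustration.

```spl
index="customer_care" sourcetype="connectivity_check"
| bin _time span=15m
| stats dc(customer_id) AS customers_checked BY _time, device, board
| where customers_checked >= 5
| sort - customers_checked
```

The idea is that several distinct customers on the same board being checked within a short window points to a board-level fault rather than to individual line problems.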
19. 19
Splunk – Resolution
Enter Splunk:
– Find the bug
– Implement an automated system to detect the bug
– Splunk launches an automated script to reset the board
Customer Care calls decreased by 100%