Rpd
1. RPD consuming high CPU
If RPD is consuming high CPU, then perform the following checks and verify the
following parameters:
Check the interfaces: Check whether any interfaces on the router are flapping.
You can verify this in the output of the show log messages and show
interfaces ge-x/y/z extensive commands. Find out why they are flapping; if the
cause cannot be removed, consider configuring hold-time values for link up and
link down to damp the transitions. Also check the output of show log messages
for syslog error messages related to interfaces or any FPC/PIC.
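To see which interfaces flap most often, the link-down messages in the log can
be counted per interface. The following is a minimal sketch, not a Juniper
tool. It assumes that a copy of the messages log has been saved to a file, that
link-down events carry the SNMP_TRAP_LINK_DOWN tag, and that the interface
appears as "ifName <name>" on the same line. The function name count_link_down
is hypothetical.

```shell
#!/bin/sh
# Sketch: count link-down events per interface in a saved messages log.
# Assumption: events are tagged SNMP_TRAP_LINK_DOWN and include
# "ifName <interface>"; adjust both patterns if your logs differ.
count_link_down() {
    # $1: path to a saved copy of the messages log
    grep 'SNMP_TRAP_LINK_DOWN' "$1" |
        sed -n 's/.*ifName \([^ ,]*\).*/\1/p' |   # keep only the interface name
        sort | uniq -c | sort -rn                 # most frequent first
}
```

For example, count_link_down /var/tmp/messages.copy prints one line per
interface, with the number of link-down events in the first column.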
Check the routes: Verify the total number of routes learned by the router by
looking at the output of show route summary. Check whether the route count has
reached the maximum limit.
Check the RPD tasks: Identify what is keeping the process busy. To do this,
first enable task accounting with set task accounting on. Important: Task
accounting can itself increase the CPU load, so do not forget to turn it off
(set task accounting off) when you have finished collecting the required
output. Then run show task accounting and look for the task with the highest
CPU time:
user@router> show task accounting
Task                Started   User Time          System Time  Longest Run
Scheduler           146051    1.085              0.090        0.000
Memory              1         0.000              0            0.000
<omit>
BGP.128.0.0.4+179   268       13.975             0.087        0.328
BGP.0.0.0.0+179     18375163  1w5d 23:16:57.823  48:52.877    0.142
BGP RT Background   134       8.826              0.023        0.099
Find out why a task that is tied to a particular prefix or protocol is
consuming so much CPU. In the sample output above, BGP.0.0.0.0+179 has
accumulated far more user time than any other task.
You can also check whether routes are oscillating (route churn) by looking at
the output of the shell command:
% rtsockmon -t
sender flag type op
[12:12:24] rpd P route delete inet 110.159.206.28 tid=0 plen=30
type=user flags=0x180 nh=indr nhflags=0x4 nhidx=1051574 altfwdnhidx=0 filtidx=0
[12:12:24] rpd P route change inet 87.255.194.0 tid=0 plen=24
type=user flags=0x0 nh=indr nhflags=0x4 nhidx=1048903 altfwdnhidx=0 filtidx=0
[12:12:26] rpd P route delete inet 219.94.62.120 tid=0 plen=29
type=user flags=0x180 nh=indr nhflags=0x4 nhidx=1051583 altfwdnhidx=0 filtidx=0
Pay attention to the timestamp, the operation, and the prefix. If you see a
large number of updates for the same prefix or protocol, there could be route
oscillation or a bouncing interface.
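Bursts are easier to spot when the timestamps are tallied. The following is a
minimal sketch, assuming the rtsockmon -t output has been saved to a file and
that each event line begins with a [HH:MM:SS] timestamp as in the sample above.
The function name events_per_second is hypothetical.

```shell
#!/bin/sh
# Sketch: count routing-socket events per timestamp in a saved rtsockmon capture.
# Assumption: event lines start with a bracketed timestamp such as [12:12:24].
events_per_second() {
    # $1: path to the saved "rtsockmon -t" output
    awk '
        $1 ~ /^\[[0-9:]+\]$/ { n[$1]++ }            # tally by timestamp field
        END { for (t in n) print n[t], t }           # "<count> <timestamp>"
    ' "$1" | sort -rn                                # busiest seconds first
}
```

Seconds with an unusually high count are the ones worth correlating with
interface or protocol events in the logs.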
Check RPD memory: Sometimes high memory utilization can indirectly lead to
high CPU. To interpret RPD memory utilization in Junos, refer to
http://www.juniper.net/techpubs/en_US/junos/topics/reference/general/rpd-memory-faq.html.
If RPD is still consuming high CPU, collect the logs listed in the data
collection checklist
KB22637 - Data Collection Checklist - Logs/data to collect for troubleshooting
and open a case with your technical support representative.
Another way to check the rtsockmon output is as follows:
1. Capture the output to a file:
> start shell
% rtsockmon -t > /var/tmp/rtsockmon.txt
(wait 1 minute)
2. Press CTRL+C
Then copy rtsockmon.txt to a Unix-like OS which is not Junos OS and, from the
directory that contains the file, issue:
% cat rtsockmon.txt | grep inet | grep add | grep route | cut -c 50- | \
  awk '{print $1 " " $2}' | sort | uniq -c | rev | cut -b 7- | rev | sort
The output will look something like this:
3 10.51.11.66
151 10.51.145.116
2 10.51.147.28
The first column is the number of times the route was added. The second
column is the prefix. So in the above example, 10.51.145.116 flapped 151 times.
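The pipeline above depends on fixed character offsets (cut -c 50- and
cut -b 7-), so it can break if the column layout differs. A field-based
variant is sketched below. It assumes each event line contains the words
"route <operation> inet <prefix>" in that order, as in the sample rtsockmon
output, and it counts every operation rather than only adds. The function name
count_route_events is hypothetical.

```shell
#!/bin/sh
# Sketch: count route events per operation and prefix in a saved rtsockmon capture.
# Assumption: event lines contain "route <op> inet <prefix>" as consecutive fields.
count_route_events() {
    # $1: path to the saved "rtsockmon -t" output
    awk '
        {
            for (i = 1; i <= NF - 3; i++) {
                if ($i == "route" && $(i + 2) == "inet") {
                    count[$(i + 1) " " $(i + 3)]++   # key: "<op> <prefix>"
                    break
                }
            }
        }
        END { for (k in count) print count[k], k }   # "<count> <op> <prefix>"
    ' "$1" | sort -rn                                # most frequent first
}
```

Running count_route_events rtsockmon.txt prints lines of the form
"<count> <operation> <prefix>", so a prefix with high add and delete counts
stands out at the top.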