This document describes monitoring facilities in CMDS, including monitoring traffic at multiple levels, network elements, and hardware. It provides details on monitoring bearer traffic between CMDS and switches, RAN traffic, internet/WSN traffic, video request traffic, pilot packet monitoring, and analytics raw traffic. It also discusses monitoring the status of individual network elements like Guavus, ConteXtream, and Skyfire.
1. CMDS Operations Guide Alcatel-Lucent — Confidential 1
Use pursuant to applicable agreements
4 Monitoring
Overview
This section describes the monitoring facilities in CMDS. Monitoring can occur at
multiple levels within CMDS. The levels include:
Traffic – monitor bearer traffic between CMDS and the Juniper switches,
traffic toward the RAN, Internet, video optimization and analytical
packets.
Network Elements – monitor the status, logs and other information related
to the individual elements which comprise the CMDS solution.
Hardware – monitor the hardware executing CMDS.
Traffic
This section describes the monitoring of network traffic using the ConteXtream
management console. There are various views that the GUI provides. One should
keep in mind that all views are with respect to the interfaces on the ConteXtream
elements. Thus, many of the entities represented in the GUI are logical
representations on various network elements inside and outside the CMDS
solution. Please reference the ConteXtream SDG Operators Guide for more
information.
All CMDS Bearer Traffic
The Traffic screen displays all bearer traffic entering and leaving the CMDS.
Figure 4-1 Bearer Traffic screen
Select this screen by clicking Inventory | Service Delivery Grid, which represents
all the grid blades. On the far right are graphs showing the total received and
transmitted traffic in packets and Kbps. The Traffic tab displays the accumulated
traffic counts and the instantaneous traffic counts.
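One way to relate accumulated counts to instantaneous rates is sketched below. This is an illustration only, not a description of how the ConteXtream GUI computes its graphs: the `CounterSample` type and its field names are hypothetical, and Kbps is assumed to mean 1000 bits per second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    """Hypothetical snapshot of accumulated counters at one point in time."""
    timestamp_s: float  # seconds since an arbitrary reference
    packets: int        # accumulated packet count
    octets: int         # accumulated byte count


def rates(earlier: CounterSample, later: CounterSample) -> tuple[float, float]:
    """Return (pps, kbps) averaged over the interval between two samples."""
    interval = later.timestamp_s - earlier.timestamp_s
    if interval <= 0:
        raise ValueError("samples must be in chronological order")
    pps = (later.packets - earlier.packets) / interval
    # Assumption: Kbps means kilobits (1000 bits) per second.
    kbps = (later.octets - earlier.octets) * 8 / 1000 / interval
    return pps, kbps
```

The shorter the interval between the two samples, the closer the result is to an instantaneous rate.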
RAN Traffic
The HA_EBN5 screen displays the RAN traffic toward the subscribers.
Figure 4-2 HA_EBN5 screen
Select this screen by clicking the Inventory tab and then Access
Gateways | HA | HA_EBN5. The Traffic tab displays the total accumulated traffic
counts and the instantaneous traffic counts. The graphs on the right side show
the total transmitted and received traffic in Kbps and pps.
Internet/WSN Traffic
The FIREWALL_INTERNET_EBN5 screen displays the Core traffic toward the
Internet and WSN.
Figure 4-3 FIREWALL_INTERNET_EBN5
Select this screen by clicking the Inventory tab and then
Routers | FIREWALL_INTERNET | FIREWALL_INTERNET_EBN5. The Traffic tab
in the middle displays the total accumulated traffic counts and the instantaneous
traffic counts. The graphs on the right side show the total transmitted and
received traffic in Kbps and pps.
Video Request Traffic
The Counters screen displays the requests for video.
Pilot Packet Monitoring
On the ConteXtream SDG GUI, the Pilot Packet Monitoring information displays
all available KPI/counters associated with the Pilot Packets.
Figure 4-4 Pilot Packet monitoring
Select this screen by clicking “Service Delivery Grid” and then the “Pilot Packets” tab.
The graphs on the right side plot the KPI values displayed in the
“Pilot Packets” tab.
Analytics Raw Traffic
The Analytics Elements screen displays the IPFIX records from ConteXtream
toward Guavus Analytics.
Figure 4-5 Analytics Elements screen
Select this screen by clicking Inventory | Analytics Elements. On the far right are
graphs representing the various counters of traffic statistics.
Note: Unlike the other element representations, this one does not indicate the
health of the interface toward the Analytics elements, because the IPFIX records
are sent to Guavus Analytics over UDP. The information shown for this element
reflects only what ConteXtream sends toward Guavus and does not necessarily
reflect what Guavus receives.
The counter of interest is the TX (Total) (Pps) which indicates the number of
IPFIX records sent from ConteXtream to Guavus.
RSA Key Manager
The RSA Key Manager provides a new SALT key value every 7 days to the RSA
client which is part of the ConteXtream software. Thus, a connection must be
maintained between the CMDS/ConteXtream software and the VzW RSA Key
Manager to support the UIDH feature.
To verify the connection from the distribution center to the VzW RSA Key Manager
server, follow these steps:
1 Log in to the Active Management server using SSH, as “admin”.
2 Verify that it is indeed the Active Management server using the “status” command.
Figure 4-6 Active Management server screen
3 From the active Management Server, verify the counters.
mgmt counters getManagementStatistics
Use the space bar to scroll down the list. Capture the current values of the
“RsaCounters” group, in particular the “failed_hashed_data” and
“succeed_hashed_data” counters. In the example below, “failed_hashed_data” = 0
and “succeed_hashed_data” = 686.
Figure 4-7 Management Statistics screen
4 Create a static session. Note that this is a single-line command.
mgmt session-chain-manager createSessionChain name dummy ip 1.2.3.4 imsi 123456789 profileName Analytics,MSS,Enrichment
Figure 4-8 Session chain manager
5 Fetch the updated counter values of the RsaCounters group again:
mgmt counters getManagementStatistics
In the “RsaCounters” group, verify that the “succeed_hashed_data” counter
incremented by one and the “failed_hashed_data” counter did NOT increment. In the
sample below, “succeed_hashed_data” is incremented by one (to 687) and
“failed_hashed_data” remains at 0.
Figure 4-9 RSA Counters group
6 Remove the static session created:
mgmt session-chain-manager terminateSessionChain ip 1.2.3.4
Figure 4-10 Terminate session chain
To verify that the session is cleaned up, first confirm that the session exists in
the CXT GUI by creating a filter.
Figure 4-11 Verify session cleanup
Then, after deleting the session, run the filter again. The session should no
longer appear.
Figure 4-12 Post deletion
7 Fail over to the secondary management server. Run the following command from
the currently active management server:
config ha failover
Figure 4-13 Failover
8 Log in to the secondary management server as “admin” and run the following:
status
Wait for “HA State” to become “active_redundant”, and for all grid servers to
become NODE_HA_ACTIVE & NODE_SM_ACTIVE.
Figure 4-14 Secondary management server
9 Repeat steps 1 to 6 on the new Active Management Server.
10 Fail over back to the primary management server to return to the original
configuration. From the currently active management server, run:
config ha failover
11 Log in to the primary management server as “admin” and run:
status
Wait for “HA State” to become “active_redundant”, and for all grid servers to
become NODE_HA_ACTIVE & NODE_SM_ACTIVE
END OF STEPS
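The counter comparison in steps 3 and 5 can be summarised as a small check. This is a sketch only: it assumes the two “RsaCounters” readings have already been copied by hand into dictionaries, and the function name `rsa_connection_ok` is hypothetical. It does not parse the `getManagementStatistics` output, whose exact layout is not reproduced in this guide.

```python
def rsa_connection_ok(before: dict[str, int], after: dict[str, int]) -> bool:
    """Check the RsaCounters change produced by one static test session.

    The procedure expects succeed_hashed_data to rise by exactly one and
    failed_hashed_data to stay unchanged.
    """
    succeeded = after["succeed_hashed_data"] - before["succeed_hashed_data"]
    failed = after["failed_hashed_data"] - before["failed_hashed_data"]
    return succeeded == 1 and failed == 0


# Values from the example in the steps above: 686 -> 687, failures stay at 0.
step3 = {"succeed_hashed_data": 686, "failed_hashed_data": 0}
step5 = {"succeed_hashed_data": 687, "failed_hashed_data": 0}
print(rsa_connection_ok(step3, step5))
```

The same check applies when the steps are repeated on the other management server after failover.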
Verifying SALT value updates
The following steps can be used to check the time at which the SALT value was
last updated.
1 Log in to the Active Management server using SSH, as “root”.
2 Verify that it is indeed the Active Management Server.
3 Change to the “/var/log/contextream/mgmt/latest” directory.
4 Open jboss.log:
vi jboss.log
5 Look for the string “Key Class Name =” and note the corresponding
“Activation Date” and “Deactivation Date”. In the example below, the SALT was last
updated on 11 Nov and will expire on 18 Nov at the specified time.
Figure 4-15 Verify SALT update screen
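As an alternative to searching in vi, the lookup in step 5 can be sketched in a few lines of Python. The sketch assumes only that the log contains the string “Key Class Name =”. It further assumes, without confirmation from this guide, that the activation and deactivation dates appear within a few lines after that string, so it returns the raw lines rather than trying to parse dates. The function name and the `context` parameter are hypothetical.

```python
from typing import Iterable, Optional

MARKER = "Key Class Name ="  # string named in step 5


def last_key_entry(lines: Iterable[str], context: int = 5) -> Optional[list[str]]:
    """Return the last line containing MARKER plus up to `context` lines after it."""
    buffered = list(lines)
    # Walk backwards so the most recent SALT update is found first.
    for index in range(len(buffered) - 1, -1, -1):
        if MARKER in buffered[index]:
            window = buffered[index:index + 1 + context]
            return [line.rstrip("\n") for line in window]
    return None
```

For example, `last_key_entry(open("/var/log/contextream/mgmt/latest/jboss.log"))` returns the most recent matching entry, or `None` if the string is not present.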
Network elements
This section describes the monitoring capabilities for each of the network
elements.
Guavus
Monitoring Collector Activity
Command:
cli> collector stats ipfix
Output Example:
Last Freezed Bin : 1305550130
Last Freezed Bin Size : 279 KB
Average Bin Size : 78 KB
Average Flow/Sec : 39
Max Flow/Sec : 67
No. of Dropped Flows : 0
Monitoring with CLI
cli> ps get first-bin-time<BinType><BinClass>
cli> ps get last-bin-time<BinType><BinClass>
Dumping Collector IPFix Records
To dump the IPFix records from the collector, do the following:
Log in to the CLI and enable the shell.
cli> en_shell
Run the exporter on the required hourly output directory:
exporter -d/data/collector/output/<yyyy>/<mm>/<dd>/<hh> -P
Example:
exporter -d/data/collector/output/2011/06/23/20 -P
Result:
The above command shows all records received between 19:00 and 20:00 on
23 Jun 2011.
Note: The exporter command is not recursive.
Note: Make sure the file exists before executing this command.
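Because the exporter command is not recursive, dumping several hours of records means running it once per hourly directory. The following sketch generates those command lines. It assumes the directory layout and the `-d` and `-P` options exactly as shown in the example above (written with ASCII hyphens). The function names are hypothetical, and the sketch only builds strings; it does not run anything or check that the directories exist.

```python
from datetime import datetime, timedelta

OUTPUT_ROOT = "/data/collector/output"  # directory root shown in the example above


def exporter_command(hour: datetime) -> str:
    """Build the exporter command for one <yyyy>/<mm>/<dd>/<hh> directory."""
    return f"exporter -d{OUTPUT_ROOT}/{hour:%Y/%m/%d/%H} -P"


def exporter_commands(first_hour: datetime, count: int) -> list[str]:
    """One command per hourly directory, since exporter is not recursive."""
    return [exporter_command(first_hour + timedelta(hours=n)) for n in range(count)]


for command in exporter_commands(datetime(2011, 6, 23, 20), 3):
    print(command)
```

Using `timedelta` keeps the directory names correct when the range crosses midnight or a month boundary.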
Last Freezed Bin shows the time, in epoch seconds, at which the last collector bin
was written. To convert it to a readable time and compare it with the current time:
# date -d @1318220700
Mon Oct 10 04:25:00 UTC 2011
# date
Mon Oct 10 04:45:05
Note the difference between these two times. It is acceptable for the collector to
be ~15 minutes behind.
If No. of Dropped Flows increases greatly on successive stats outputs, it indicates
a potential time skew; check the NTP setup section to fix this problem. Also
verify the Max Flow/Sec value.
Max Flows/sec shows the maximum records/sec that the collector encountered
since it was last started.
Average Bin Size is the average size of the bin where raw IPFIX records are
stored on disk before processing.
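The checks described above can be sketched as follows. The sketch assumes the `collector stats ipfix` output has the “Name : value” layout shown in the Output Example and has been captured as text. The function names are hypothetical, and the interpretation of the result (for example the ~15 minute allowance) is left to the operator.

```python
from datetime import datetime, timezone


def parse_collector_stats(text: str) -> dict[str, str]:
    """Split 'Name : value' lines into a dictionary of strings."""
    stats = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            stats[name.strip()] = value.strip()
    return stats


def collector_lag_minutes(stats: dict[str, str], now: datetime) -> float:
    """Minutes between the last frozen bin and `now` (must be timezone-aware)."""
    last_bin = datetime.fromtimestamp(int(stats["Last Freezed Bin"]), tz=timezone.utc)
    return (now - last_bin).total_seconds() / 60


def dropped_flows_increase(previous: dict[str, str], current: dict[str, str]) -> int:
    """Change in 'No. of Dropped Flows' between two successive stats outputs."""
    key = "No. of Dropped Flows"
    return int(current[key]) - int(previous[key])
```

A typical use is to capture the stats twice, a few minutes apart, and compare both the lag and the dropped-flow increase between the two captures.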
Checking for incoming IPFIX records
# tcpdump -i any udp port <port>
The tcpdump packet capture program can be used to verify that packets are being
transmitted into the system. The collector listens for incoming IPFIX packets on
default ports 4000, 4001 and 4002.
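To watch all three default ports in one capture, the tcpdump filter can combine them with `or`. The sketch below only builds the argument list; the helper name is hypothetical, and the ports should be adjusted if the collector is not using its defaults. The `-n` option, which disables name resolution, is an addition that is not part of the command shown above.

```python
DEFAULT_IPFIX_PORTS = (4000, 4001, 4002)  # default collector ports listed above


def tcpdump_args(ports=DEFAULT_IPFIX_PORTS, interface="any"):
    """Return a tcpdump argument list matching UDP traffic on the given ports."""
    port_filter = " or ".join(f"port {port}" for port in ports)
    # The filter expression is passed as a single argument, so the
    # parentheses need no shell escaping when used with subprocess.
    return ["tcpdump", "-n", "-i", interface, f"udp and ({port_filter})"]


print(" ".join(tcpdump_args()))
```

When typing the resulting command directly into a shell, quote the filter expression so that the parentheses are not interpreted by the shell.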
ConteXtream
The ConteXtream CLI enables you to perform administrative tasks on individual
grid servers as well as on the management servers. Examples of the tasks that can
be performed using the CLI include performing the initial configuration of a grid
server’s IP address, management IP address and the interfaces on it. In general,
there is no need to use the CLI except immediately after installation or when
troubleshooting in conjunction with Technical Support. After grid server
installation, you must configure system parameters for that server, including its
management IP address, interfaces, SSH functionality, syslog and host name
using the CLI. Then, simply start the server to begin using it in the system.
The CLI can be accessed by logging into the grid server or the management server
console (either directly, by connecting the applicable cable directly to the PC’s
console port, or remotely via SSH). The user name and password are the same as
those for an Administrator or Expert user.
The ConteXtream CLI and GUI allow monitoring of CPU utilization. Each of the
12 cores on each Grid and Management Server can be monitored. It should be
noted that on the Grid Servers, CPU 0 (or Core 0) often exhibits high utilization.
This is normal.
This CPU core is used by the Grid Server for non-mission critical, non-real-time
management functions such as the CLI, GUI interface and various internal
consistency checks. These functions do not affect bearer traffic and are not
affected by subscriber scale or throughput and thus the CPU 0 utilization will not
increase as subscribers are added. As this CPU is not used for real-time functions,
short term high utilization has no adverse effect on the system.
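When reviewing per-core utilization, the guidance above amounts to ignoring CPU 0 on Grid Servers. A minimal sketch of that rule follows. The input is assumed to be a mapping of core number to utilization percentage collected by hand from the CLI or GUI, the function name is hypothetical, and the threshold is an illustrative value chosen by the operator, not one specified in this guide.

```python
def busy_cores(utilization: dict[int, float], threshold: float,
               is_grid_server: bool = True) -> list[int]:
    """Return cores at or above `threshold` percent.

    On grid servers core 0 is skipped, because high utilization
    on that core is described above as normal.
    """
    return sorted(
        core
        for core, percent in utilization.items()
        if percent >= threshold and not (is_grid_server and core == 0)
    )


# Illustrative readings only; the 90.0 threshold is an assumption.
print(busy_cores({0: 97.0, 1: 12.0, 5: 91.5}, threshold=90.0))
```

For a Management Server, pass `is_grid_server=False` so that core 0 is treated like any other core.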
For more information, see the CXT CMDS Operators Guide Ver. 3.4 and the VzW
CMDS CLI Guide Ver.3.4.
Skyfire
Skyfire consists of two components – the Controller (distribution center) and the
Optimizer (NEC). On the controller there is a single process SkyController which
handles all the incoming requests. The Optimizer has two types of processes:
SkyOptimizerServer and SkyOptimizerHost. The SkyOptimizerServer receives
the incoming requests for optimization. It allocates the request to one of several
SkyOptimizerHost processes.
For logging, there is one type of log file on the Controller and on the Optimizer
there are two types – one for the Server component and one for the Host
component.
Refer to SkyFire Rocket Video Optimization Ver. 2.0.