Fabric OS Command Reference
Fabric OS 7.0.1
53-1002447-01
documentation@brocade.com


Fabric OS Commands : bottleneckMon

bottleneckMon
Monitors and reports latency and congestion bottlenecks on F_Ports and E_Ports.
Synopsis
bottleneckmon --enable [-cthresh congestion_threshold]
[-lthresh latency_threshold] [-time seconds]
[-qtime seconds] [-alert | -noalert]
[-lsubsectimethresh time_threshold]
[-lsubsecsevthresh severity_threshold]
bottleneckmon --disable
bottleneckmon --config [-cthresh congestion_threshold]
[-lthresh latency_threshold] [-time seconds]
[-qtime seconds] [-alert | -noalert]
[-lsubsectimethresh time_threshold]
[-lsubsecsevthresh severity_threshold]
[[slot/]port_list]
bottleneckmon --configclear [slot/]port_list
bottleneckmon --exclude [slot/]port_list
bottleneckmon --include [slot/]port_list
bottleneckmon --show [-interval seconds] [-span seconds]
[-refresh] [-congestion | -latency] [[slot/]port | '*']
bottleneckmon --status
bottleneckmon --cfgcredittools -intport -recover
[off | onLrOnly | onLrThresh]
bottleneckmon --showcredittools
bottleneckmon --help
Description
Use this command to (1) detect latency and congestion bottlenecks on F[L]_Ports and E_Ports and (2) to manage credit recovery on back-end ports. Bottleneck detection and credit recovery are two independent functions; enabling credit recovery has no impact on bottleneck detection and vice versa.
Bottleneck Detection
For bottleneck detection, this command provides the following management functions:
Enabling or disabling bottleneck detection is a switch-wide operation. If Virtual Fabrics are enabled, the configuration is applied per logical switch and affects all ports on the current logical switch. After the (logical) switch-wide bottleneck detection parameters have been set, you can fine-tune the configuration for specific ports.
A bottleneck is defined as a condition where the offered load at a given port exceeds the throughput at the port. This command supports detection of two types of bottleneck conditions: congestion and latency.
A congestion bottleneck arises from link over-utilization. This happens when the offered load exceeds throughput and throughput is at 100%. Frames attempt to egress at a faster rate than the line rate allows. Link utilization is measured once every second at the port. When trunked ports are monitored, link utilization is measured for the entire trunk. A congestion bottleneck is assumed if the utilization during the measured second is 95% or more.
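The per-second congestion test described above can be modelled in a few lines. The following Python sketch is illustrative only and is not Fabric OS code; the function names and the byte-based units are assumptions chosen for the example. Only the 95% figure and the whole-trunk measurement come from the description.

```python
# Illustrative model only; not Fabric OS code.
CONGESTION_UTILIZATION = 0.95  # "95% or more", from the description above


def is_congested_second(bytes_sent: float, line_rate_bytes_per_sec: float) -> bool:
    """Return True if utilization during one measured second is 95% or more."""
    if line_rate_bytes_per_sec <= 0:
        raise ValueError("line rate must be positive")
    utilization = bytes_sent / line_rate_bytes_per_sec
    return utilization >= CONGESTION_UTILIZATION


def trunk_is_congested_second(bytes_per_link, rate_per_link) -> bool:
    """Trunks are measured as a whole: sum traffic and capacity over all links."""
    return is_congested_second(sum(bytes_per_link), sum(rate_per_link))
```

In this model a trunk with one saturated link and one lightly loaded link is not reported as congested, because utilization is computed over the entire trunk.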
A latency bottleneck occurs when egress throughput at a port is lower than the offered load because of latency in the return of credits from the other end of the link. This is not a permanent condition. The offered load exceeds throughput and throughput is less than 100%. In this case, the load does not exceed the physical capacity of the channel as such, but can occur because of an underperforming device connected to the F_Port, or because of back pressure from other congestion or latency bottlenecks on the E_Ports. Bottleneck detection can help identify these devices and pinpoint the upstream bottlenecks caused by these devices inside the fabric.
When bottleneck detection is enabled on a switch and -alert is specified, the command triggers an SNMP and a RASlog alert when the ports on the configured switch experience latency or congestion. Another alert is sent after the condition resolves. For a given averaging time, each second is marked as affected by latency and/or congestion or not. If the number of affected seconds crosses the configured threshold, an alert is triggered for the port. You can configure a severity threshold for each type of bottleneck and the time interval over which the bottlenecks are measured.
For example, setting a latency threshold of 0.8 and a time window of 30 seconds specifies that an alert should be sent when 80% of the one-second samples over any period of 30 seconds were affected by latency bottleneck conditions. The -qtime option can be used to throttle alerts by specifying the minimum number of seconds between consecutive alerts. Thresholds are configured separately for each type of bottleneck and statistical data are collected independently for each condition. The -qtime parameter applies to both types of bottleneck detection; there can be one latency alert and one congestion alert in a configured quiet time.
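The interaction of the severity threshold, the averaging time, and the quiet time can be sketched as a small Python model. This is not the switch's implementation. The class and method names are hypothetical, and two details are assumptions: an alert is raised when the affected fraction is greater than or equal to the threshold, and no alert is evaluated until a full window of samples exists.

```python
from collections import deque


class BottleneckAlerter:
    """Toy model of one bottleneck type (latency or congestion) on one port."""

    def __init__(self, threshold: float, window: int, quiet_time: int):
        self.threshold = threshold    # fraction between 0 and 1 (-lthresh / -cthresh)
        self.window = window          # averaging time in seconds (-time)
        self.quiet_time = quiet_time  # minimum seconds between alerts (-qtime)
        self.samples = deque(maxlen=window)  # most recent one-second samples
        self.last_alert = None        # time of the previous alert, if any

    def add_second(self, now: int, affected: bool) -> bool:
        """Record one sample; return True if an alert would be raised."""
        self.samples.append(affected)
        if len(self.samples) < self.window:
            return False              # assumption: wait for a full window
        fraction = sum(self.samples) / self.window
        if fraction < self.threshold:  # assumption: alert at ">= threshold"
            return False
        if self.last_alert is not None and now - self.last_alert < self.quiet_time:
            return False              # throttled by the quiet time
        self.last_alert = now
        return True
```

With a threshold of 0.8 and a window of 30 seconds, the model alerts once 24 of the last 30 samples are affected. Because statistics are kept independently for each condition, a model of a port would hold one such object for latency and one for congestion, which is how one alert of each type can occur within the same quiet time.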
Bottleneck detection works both in non-Virtual Fabric mode and in Virtual Fabric Mode. If Virtual Fabrics are enabled, bottleneck detection is configured per logical switch. If a port is removed from a logical switch after bottleneck detection is enabled on the logical switch, the configuration is retained in that logical switch. If the port is added again to the same logical switch, bottleneck detection is automatically re-enabled for this port using the retained configuration. This feature allows you to configure more than one logical switch to perform bottleneck detection on the same port, although only one logical switch performs the operation on the port at any given time.
The --show option displays a history of the bottleneck severity for a specified port or for all ports. Each line of output shows the percentage of one-second intervals affected by bottleneck conditions during the time window shown on that line. When issued for all ports, the union of all port statistics is displayed in addition to individual port statistics. The union value provides a good indicator for the overall bottleneck severity on the switch. You can filter the output to display only latency or congestion bottleneck statistics. When used without a port operand, the command displays the number of ports affected by bottleneck conditions. A "bottlenecked" port in this output is defined as any port that was affected by a bottleneck for one second or more in the corresponding interval.
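One plausible reading of the union statistic and of the "bottlenecked" port count is sketched below in Python. The reference does not define how the union is computed; this sketch assumes that a second counts toward the union when at least one port was affected in that second. The function names and the input layout (one list of per-second flags per port) are hypothetical.

```python
def union_percentage(per_port):
    """per_port: dict mapping port -> list of per-second booleans (equal lengths).

    Assumption: a second is affected in the union if any port was affected.
    """
    columns = list(per_port.values())
    if not columns or not columns[0]:
        return 0.0
    seconds = len(columns[0])
    hit = sum(any(second) for second in zip(*columns))
    return 100.0 * hit / seconds


def bottlenecked_ports(per_port):
    """Ports affected for one second or more in the interval."""
    return sorted(port for port, flags in per_port.items() if any(flags))
```

Under this assumption the union is never lower than the highest individual port percentage for the same interval.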
When using the --show command, you may see a "no data for x seconds" or "no data" message displayed at the end of a line of output. The "no data..." message in any interval means that there was no data to analyze for the stated number of seconds or for the entire interval if the remark is simply "no data." This typically means that there was no traffic on the link for the stated number of seconds. The percentage of affected seconds displayed takes this into account. For example, if there was no traffic for 6 seconds in an interval of 10 seconds, and 1 second out of the other 4 seconds was affected by a bottleneck, the display for that interval would show 25% as the percentage of affected seconds (1 out of 4), and state "no data for 6 seconds." However, if there is no traffic because the port is offline, the "no data..." message is displayed.
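The arithmetic behind the "no data" remark can be reproduced with a short Python sketch. The function name and the convention of using None for a second without data are assumptions made for the example; the 1-out-of-4 calculation follows the description above.

```python
def affected_percentage(samples):
    """samples: one entry per second; True/False = affected or not, None = no data.

    Returns (percentage or None, remark). Seconds without data are excluded
    from the denominator.
    """
    with_data = [s for s in samples if s is not None]
    missing = len(samples) - len(with_data)
    if not with_data:
        return None, "no data"
    pct = 100.0 * sum(with_data) / len(with_data)
    remark = f"no data for {missing} seconds" if missing else ""
    return pct, remark
```

For a 10-second interval with 6 seconds of no traffic and 1 affected second among the remaining 4, the function returns 25.0 and the remark "no data for 6 seconds".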
The --status option displays bottleneck configuration details for the current (logical) switch. If Virtual Fabrics are enabled, ports not belonging to the current logical switch are not displayed. The command output includes the following information:
Bottleneck detection
Enabled or disabled
Switch-wide sub-second latency bottleneck criterion
Displays the following parameters:
Time threshold
The value set with the -lsubsectimethresh operand.
Severity threshold
The value set with the -lsubsecsevthresh operand.
Switch-wide alerting parameters
Displays the following parameters:
Alerts?
Yes (enabled) or No (disabled).
Congestion threshold for alert
The severity threshold for triggering a congestion alert. This threshold indicates the percentage of one-second intervals affected by congestion conditions within a specified time window. The congestion threshold is expressed as a fraction between 0 and 1.
Latency threshold for alert
The severity threshold for triggering a latency alert. This threshold indicates the percentage of one-second intervals affected by latency conditions within a specified time window. The latency threshold is expressed as a fraction between 0 and 1.
Averaging time for alert
The time window in seconds over which the percentage of seconds affected by bottleneck conditions is computed and compared with the threshold.
Quiet time for alert
The minimum number of seconds between consecutive alerts. The value assigned to this parameter applies to both latency and congestion detection.
Per-port overrides for sub-second latency bottleneck criterion
Custom configuration for the above-mentioned sub-second latency bottleneck parameters. Note that everything above this line applies to all ports in the switch that do not have any custom configuration or exclusions.
Per-port overrides for alert parameters
Custom configuration for the above-mentioned alert parameters.
Excluded ports
List of ports excluded from bottleneck detection.
Credit recovery on back-end ports
Use the --cfgcredittools commands to enable or disable credit recovery of external back-end ports and to display the configuration. When this feature is enabled, credit is recovered on external back-end ports (ports connected to the core blade or core blade back-end ports) when credit loss has been detected on these ports. When used with the -recover onLrOnly option, the recovery mechanism takes the following escalating actions:
If the link reset fails to recover the port, the port reinitializes. A RASlog message is generated (RAS Cx-1015). Note that the port reinitialization does not fault the blade.
If a port is faulted and there are no more online back-end ports in the trunk, the core blade is faulted. (Note that the port blade will always be faulted). A RASlog message is generated (RAS Cx-1017).
When used with the -recover onLrThresh option, recovery is attempted through repeated link resets and a count of the link resets is kept. If the threshold of more than two link resets per hour is reached, the blade is faulted (RAS Cx-1018). Note that regardless of whether the link reset occurs on the port blade or on the core blade, the port blade is always faulted.
For more information on the RASlog messages, refer to the Fabric OS Message Reference.
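The link reset threshold used by -recover onLrThresh can be pictured with the following Python sketch. It is a model, not the switch's logic. The class name is hypothetical, and the use of a sliding one-hour window is an assumption, since the description states only "more than two link resets per hour".

```python
from collections import deque


class LinkResetCounter:
    MAX_RESETS_PER_HOUR = 2  # more than two resets per hour faults the blade
    WINDOW_SECONDS = 3600    # assumption: a sliding one-hour window

    def __init__(self):
        self.resets = deque()  # timestamps of recent link resets

    def record_reset(self, now: float) -> bool:
        """Record a link reset at time `now` (seconds).

        Returns True if the blade would be faulted under this model.
        """
        self.resets.append(now)
        # Discard resets that are an hour old or older.
        while self.resets and now - self.resets[0] >= self.WINDOW_SECONDS:
            self.resets.popleft()
        return len(self.resets) > self.MAX_RESETS_PER_HOUR
```

In this model the third link reset within an hour returns True, which corresponds to the point at which the blade is faulted.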
Notes
Command syntax predating Fabric OS v.6.4.0 is no longer supported as of Fabric OS v.7.0.0.
The execution of this command is subject to Virtual Fabric or Admin Domain restrictions that may be in place. Refer to Chapter 1, "Using Fabric OS Commands" and Appendix A, "Command Availability" for details.
The bottleneck detection commands are supported on F_Ports, FL_Ports, E_Ports, and EX_Ports.
The credit recovery commands are supported only on back-end ports of Condor, Condor 2, and Condor 3-based blades in the Brocade DCX, DCX-4S, DCX 8510-8, and DCX 8510-4 chassis.
Operands
Bottleneck detection commands
The following operands support bottleneck detection:
slot
On bladed systems only, specifies the slot number of the ports to be configured, followed by a slash (/).
port_list
Specifies one or more ports, relative to the slot on bladed systems. Use switchShow for a listing of valid ports. The --show option allows only a single port or all ports ('*') to be specified with this command, unless it is used without a port operand. A port list should be enclosed in double quotation marks and can consist of the following:
A port range where beginning and end port are separated by a dash, for example, "8-13" or "5/8-13" on bladed systems. A port range cannot span multiple slots.
A wildcard ('*') indicates all ports. The wildcard must be enclosed in single quotation marks and is not allowed with the --config option. To make switch-wide changes, use --config without a port specifier.
--enable
Enables bottleneck detection on the switch. This operation is switch-wide and affects all F[L]_Ports and E_Ports. This operation enables bottleneck detection on all eligible ports of a switch, no matter when they become eligible. If you have Virtual Fabrics enabled and you move ports into a bottleneck-enabled logical switch from another logical switch, bottleneck detection is enabled upon completion of the move. You can configure optional thresholds and alerts when you enable the feature, or you can change selected parameters later with the --config command.
--config
Modifies bottleneck detection parameters on specified ports or, when a port list is not specified, on the entire switch. Bottleneck detection must first be enabled before you can fine-tune the configuration with the --config command. The history of bottleneck statistics thus far will not be lost for the specified ports and can be viewed with the --show option. However, alert calculations restart on the specified ports when parameters change. This operation is allowed on excluded ports.
The following parameters can be optionally set with the --enable and --config commands; if omitted, default thresholds apply.
-alert
Enables alerts when configured thresholds are exceeded on the ports that are enabled for bottleneck detection. The alerting mechanism is by RASlog and SNMP traps. This operand is optional; if omitted, no alert is assumed. When -alert is specified, one or more of the following operands may be specified. If -alert is not specified and you try to specify additional configuration parameters, the command fails with an appropriate message.
-cthresh congestion_threshold
Specifies the severity threshold for congestion that triggers an alert. The threshold indicates the percentage of one-second intervals affected by the bottleneck condition within the specified time window. The threshold is expressed as the equivalent fraction between 0 and 1. The default value is 0.8.
-lthresh latency_threshold
Specifies the severity threshold for latency that triggers an alert. The threshold indicates the percentage of one-second intervals affected by the bottleneck condition within the specified time window. The threshold is expressed as the equivalent fraction between 0 and 1. The default value is 0.1.
-time window
Specifies the time window in seconds over which the percentage of seconds affected by bottleneck conditions is computed and compared with the threshold. The maximum window size is 10800 seconds (3 hours). The default is 300 seconds.
-qtime quiet_time
Specifies the minimum number of seconds between consecutive alerts. The default is 300 seconds. The maximum is 31556926 seconds (approximately one year).
-noalert
Disables alerts. This is the default state assumed if neither -alert nor -noalert is specified.
-lsubsectimethresh time_threshold
Sets the threshold for latency bottlenecks at the sub-second level. The time_threshold specifies the minimum fraction of a second that must be affected by latency in order for that second to be considered affected by a latency bottleneck. For example, a value of 0.75 means that at least 75% of a second must have had latency bottleneck conditions in order for that second to be counted as an affected second. The time threshold value must be greater than 0 and no greater than 1. The default value is 0.8. Note that the application of the sub-second numerical limits is approximate. This command erases the statistics history and restarts alert calculations (if alerting is enabled) on the specified ports. When used with the --config option, you must specify a port.
-lsubsecsevthresh severity_threshold
Specifies the threshold on the severity of latency in terms of the throughput loss on the port at the sub-second level. The severity threshold is a floating-point value in the range of no less than 1 and no greater than 1000. This value specifies the factor by which throughput must drop in a second in order for that second to be considered affected by latency bottlenecking. For example, a value of 20 means that the observed throughput in a second must be no more than 1/20th the capacity of the port in order for that second to be counted as an affected second. The default value is 50. This command erases the statistics history and restarts alert calculations (if alerting is enabled) on the specified ports. When used with the --config option, you must specify a port.
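The two sub-second criteria and their documented value ranges can be expressed as simple predicates. The Python sketch below is illustrative; the function names are hypothetical, and the two tests are deliberately kept separate because the reference does not state how the switch combines them. The reference also notes that the sub-second limits are applied approximately, which this exact model does not capture.

```python
def validate_subsecond(time_threshold: float, severity_threshold: float) -> None:
    """Check the documented ranges for -lsubsectimethresh and -lsubsecsevthresh."""
    if not (0 < time_threshold <= 1):
        raise ValueError("time threshold must be greater than 0 and no greater than 1")
    if not (1 <= severity_threshold <= 1000):
        raise ValueError("severity threshold must be between 1 and 1000")


def meets_time_threshold(affected_fraction: float, time_threshold: float = 0.8) -> bool:
    """True if enough of the second showed latency conditions (default 0.8)."""
    return affected_fraction >= time_threshold


def meets_severity_threshold(throughput: float, capacity: float,
                             severity_threshold: float = 50.0) -> bool:
    """True if throughput is no more than capacity / severity (default 50)."""
    return throughput <= capacity / severity_threshold
```

For example, with a severity threshold of 20 and a port capacity of 100 units, a second with throughput of 5 units or less satisfies the severity criterion.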
--exclude [slot/]port_list
Excludes the specified ports from bottleneck detection. No data will be collected from these ports, and no alerts will be triggered for these ports. All statistics history for a port is erased when a port is excluded. Alerting parameters are preserved. It is not recommended to exclude ports from monitoring except under special circumstances, for example, when a long-distance port is known to be a bottleneck because of credit insufficiency. The wildcard (*) port specifier is allowed but not recommended. Use --disable to exclude all ports on the switch.
--include [slot/]port_list
Includes previously excluded ports for bottleneck detection. Previously configured switch-wide alerts and threshold parameters reapply when bottleneck detection resumes. The wildcard (*) port specifier may be used as a shorthand for removing all exclusions.
--configclear [slot/]port_list
Removes any port-specific alert parameters from the specified ports and restores switch-wide parameters on these ports. You can still view the history of bottleneck statistics on these ports. However, alert calculations restart on the specified ports after the parameter reset. This operation is allowed on excluded ports.
--disable
Disables bottleneck detection on the entire switch. This operation erases all configuration details, including the list of excluded ports, all custom thresholds and alerting parameters for specific ports, and all historical data.
--show [[slot/]port |*]
Displays a history of the bottleneck severity for the specified ports. The output shows the percentage of one-second intervals affected by the bottleneck condition within the specified time interval. When a single port is specified, the command displays the bottleneck statistic for that port. When the wildcard (*) is specified, the same statistic is displayed for every port on the switch. Additionally, a combined "union" statistic for the switch as a whole is displayed. When used without a port specifier, the command displays the number of ports affected by bottleneck conditions. A "bottlenecked" port in this output is defined as any port that was affected by a bottleneck for one second or more in the corresponding interval. This command succeeds only on online ports.
The following operands are optional:
-interval seconds
Specifies the time window in seconds over which the percentage of seconds affected by bottleneck conditions is displayed in the output. When a port is specified with the --show command, the maximum interval is 10800 seconds (3 hours). When a wildcard (*) is specified, the maximum interval is defined such that the value of -span divided by the value of the interval cannot exceed 30. The interval value must be greater than 0. The default value is 10 seconds.
-span seconds
Specifies the total duration in seconds covered in the output. When a port is specified with the --show command, the maximum span is 10800 seconds (3 hours). When a wildcard (*) is specified, the maximum span is defined such that the value of -span divided by the value of the interval cannot exceed 30. The span value must be greater than 0. The default value is 10 seconds.
History data are maintained for a maximum of three hours per port, so the span can be 10800 seconds at most. When the show command is issued for all ports (*), the maximum duration is defined such that the value of -span divided by the value of the interval cannot exceed 30.
-refresh
Refreshes the display to continuously update with fresh data at a certain rate. The refresh rate is equal to the number of seconds specified in the interval.
-congestion | -latency
Restricts the display to congestion or latency data. If neither is specified, the command displays combined statistics for both types of bottlenecks.
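The limits on -interval and -span can be checked before a --show command is issued. The following Python sketch encodes only the limits stated above; the function name and the returned message strings are hypothetical and are not messages produced by the switch.

```python
MAX_HISTORY_SECONDS = 10800  # three hours of history per port
MAX_ROWS_ALL_PORTS = 30      # span / interval limit when '*' is used


def check_show_args(interval: int, span: int, all_ports: bool) -> list:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if interval <= 0:
        problems.append("interval must be greater than 0")
    if span <= 0:
        problems.append("span must be greater than 0")
    if problems:
        return problems
    if span > MAX_HISTORY_SECONDS:
        problems.append("span cannot exceed 10800 seconds")
    if all_ports:
        if span / interval > MAX_ROWS_ALL_PORTS:
            problems.append("span divided by interval cannot exceed 30")
    elif interval > MAX_HISTORY_SECONDS:
        problems.append("interval cannot exceed 10800 seconds")
    return problems
```

For instance, an interval of 5 and a span of 30 pass for both a single port and the wildcard, whereas an interval of 5 and a span of 300 fail for the wildcard because 300 / 5 is 60.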
--status
Displays the details of the bottleneck detection configuration for the current (logical) switch. Refer to the command description section for an explanation of the displays. If Virtual Fabrics are enabled, ports not belonging to the current logical switch are not displayed.
--help
Displays the command usage.
Back-end port credit recovery commands
The following operands support back-end port credit recovery:
--cfgcredittools -intport
Enables credit recovery for internal back-end ports. Use one of the following recovery options:
-recover onLrOnly
Enables the back-end port recovery feature in link reset mode.
-recover onLrThresh
Enables the back-end port recovery feature in link reset threshold mode.
-recover off
Disables the back-end port credit recovery feature.
--showcredittools
Displays the back-end port credit recovery configuration as enabled or disabled. In addition, the output indicates whether link reset mode or link reset threshold mode is configured.
--help
Displays the command usage.
Examples
Bottleneck detection examples
To enable bottleneck detection on the switch without alerts (statistics collected with default parameters are still available for viewing):
switch:admin> bottleneckmon --enable
To enable bottleneck detection on the switch with alerts using default values for thresholds and time (preferred use case):
switch:admin> bottleneckmon --enable -alert
To customize congestion bottleneck detection on a port range after default alerts are enabled switch-wide:
switch:admin> bottleneckmon --enable -alert
switch:admin> bottleneckmon --config -alert \
-cthresh .5 -time 240 1-15
To disable bottleneck detection on a specified port:
switch:admin> bottleneckmon --exclude 2/4
To disable bottleneck detection on all ports of a chassis:
switch:admin> bottleneckmon --disable
To display the number of ports affected by bottleneck conditions:
switch:admin> bottleneckmon --show
======================================================
Fri Feb 26 22:00:00 UTC 2010
======================================================
List of bottlenecked ports in most recent interval:
13 16
=======================================================
Number of
From To bottlenecked ports
=======================================================
Feb 26 21:59:50 Feb 26 22:00:00 2
Feb 26 21:59:40 Feb 26 21:59:50 0
Feb 26 21:59:30 Feb 26 21:59:40 0
Feb 26 21:59:20 Feb 26 21:59:30 0
Feb 26 21:59:10 Feb 26 21:59:20 0
Feb 26 21:59:00 Feb 26 21:59:10 0
Feb 26 21:58:50 Feb 26 21:59:00 0
Feb 26 21:58:40 Feb 26 21:58:50 0
Feb 26 21:58:30 Feb 26 21:58:40 0
Feb 26 21:58:20 Feb 26 21:58:30 2
Feb 26 21:58:10 Feb 26 21:58:20 3
Feb 26 21:58:00 Feb 26 21:58:10 3
Feb 26 21:57:50 Feb 26 21:58:00 3
Feb 26 21:57:40 Feb 26 21:57:50 3
Feb 26 21:57:30 Feb 26 21:57:40 2
Feb 26 21:57:20 Feb 26 21:57:30 2
Feb 26 21:57:10 Feb 26 21:57:20 0
Feb 26 21:57:00 Feb 26 21:57:10 0
Feb 26 21:56:50 Feb 26 21:57:00 0
Feb 26 21:56:40 Feb 26 21:56:50 0
Feb 26 21:56:30 Feb 26 21:56:40 0
Feb 26 21:56:20 Feb 26 21:56:30 0
Feb 26 21:56:10 Feb 26 21:56:20 0
Feb 26 21:56:00 Feb 26 21:56:10 0
Feb 26 21:55:50 Feb 26 21:56:00 0
Feb 26 21:55:40 Feb 26 21:55:50 0
Feb 26 21:55:30 Feb 26 21:55:40 0
To display bottleneck statistics for a single port:
switch:admin> bottleneckmon --show \
-interval 5 -span 30 2/4
=============================================
Wed Jan 13 18:54:35 UTC 2010
=============================================
Percentage of
From To affected secs
==============================================
Jan 13 18:54:05 Jan 13 18:54:10 20.00%
Jan 13 18:54:10 Jan 13 18:54:15 60.00%
Jan 13 18:54:15 Jan 13 18:54:20 0.00%
Jan 13 18:54:20 Jan 13 18:54:25 0.00%
Jan 13 18:54:25 Jan 13 18:54:30 40.00%
Jan 13 18:54:30 Jan 13 18:54:35 80.00%
To display the bottleneck statistic for every port in the switch including the union of all individual port statistics:
switch:admin> bottleneckmon --show -interval 5 -span 30 '*'
=============================================================
Wed Jan 13 18:54:35 UTC 2010
=============================================================
=================================================================
From To 0 1 2 3 4 5
=================================================================
Jan13 18:54:05 Jan13 18:54:10 20.00 20.00 0.00 80.00 20.00 100.00
=================================================================
From To 5 6 7 8 UNION
=================================================================
Jan13 18:54:05 Jan13 18:54:10 40.00 0.00 0.00 20.00 100.00
=================================================================
From To 0 1 2 3 4 5
=================================================================
Jan13 18:54:10 Jan13 18:54:15 0.00 0.00 20.00 40.00 20.00 0.00
=================================================================
From To 5 6 7 8 UNION
=================================================================
Jan13 18:54:10 Jan13 18:54:15 0.00 20.00 0.00 0.00 40.00
To display only the union statistic for the switch:
switch:admin> bottleneckmon --show -interval 5 -span 30
=============================================================
Wed Jan 13 18:54:35 UTC 2010
=============================================================
Percentage of
From To affected secs
=============================================================
Jan 13 18:54:05 Jan 13 18:54:10 80.00
Jan 13 18:54:10 Jan 13 18:54:15 20.00
Jan 13 18:54:15 Jan 13 18:54:20 80.00
Jan 13 18:54:20 Jan 13 18:54:25 0.00
Jan 13 18:54:25 Jan 13 18:54:30 0.00
Jan 13 18:54:30 Jan 13 18:54:35 40.00
To display bottleneck configuration details for the switch:
switch:admin> bottleneckmon --status
Bottleneck detection - Enabled
==============================
 
Switch-wide sub-second latency bottleneck criterion:
====================================================
Time threshold - 0.800
Severity threshold - 50.000
 
Switch-wide alerting parameters:
=================================
Alerts - Yes
Congestion threshold for alert - 0.800
Latency threshold for alert - 0.100
Averaging time for alert - 300 seconds
Quiet time for alert - 300 seconds
 
Per-port overrides for sub-second latency bottleneck criterion:
===============================================================
Slot  Port  TimeThresh  SevThresh
=========================================
0     3     0.500       100.000
0     4     0.600       50.000
0     5     0.700       20.000
 
Per-port overrides for alert parameters:
========================================
Slot  Port  Alerts?  LatencyThresh  CongestionThresh  Time(s)  QTime(s)
=================================================================
0     1     Y        0.990          0.900             3000     600
0     2     Y        0.990          0.900             4000     600
0     3     Y        0.990          0.900             4000     600
 
Excluded ports:
===============
Slot Port
============
0 2
0 3
0 4
Back-end port credit recovery examples
To enable back-end port credit recovery with the link reset only option and to display the configuration:
switch:admin> bottleneckmon --cfgcredittools \
-intport -recover onLrOnly
switch:admin> bottleneckmon --showcredittools
Internal port credit recovery is Enabled with LrOnly
To enable back-end port credit recovery with the link reset threshold option and to display the configuration:
switch:admin> bottleneckmon --cfgcredittools -intport \
-recover onLrThresh
switch:admin> bottleneckmon --showcredittools
Internal port credit recovery is Enabled with LrOnThresh
To disable back-end port credit recovery and to display the configuration:
switch:admin> bottleneckmon --cfgcredittools \
-intport -recover off
switch:admin> bottleneckmon --showcredittools
Internal port credit recovery is Disabled
See Also
None
