Symptoms
How to Interpret the Brocade porterrshow output
What do the porterrshow counters mean?
Interprets and explains the porterrshow output (port errors) of Brocade SAN switches and possible causes of the errors.
For the action to take when counters increase, see the copy of the KB article "Connectrix: How to troubleshoot Fibre Channel node to switch port or SFP communication problems by means of elimination?" below in the notes section of this KB.
/fabos/cliexec/porterrshow:
            frames    enc    crc    crc    too    too    bad    enc   disc   link   loss   loss   frjt   fbsy     c3timeout    pcs  uncor
         tx     rx     in    err  g_eof   shrt   long    eof    out     c3   fail   sync    sig                   tx     rx    err    err
  0: 575.2m   2.1g      0      0      0      0      0      0      0      1      0      0      0      0      0      0      0      0      0
  1: 576.7m   2.1g      0      0      0      0      0      0      0      1      0      0      0      0      0      0      0      0      0
  2: 611.3m   2.1g      0      0      0      0      0      0      0      1      0      0      0      0      0      0      0      0      0
  3: 613.6m   2.1g      0      0      0      0      0      0      0      1      0      0      0      0      0      0      0      0      0
This command displays an error summary for all ports.
One output line is displayed per port. It shows error counters in ones, thousands (the number is followed by k), millions (the number is followed by m), or billions (the number is followed by g, as in the frames rx column above).
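The suffix notation can be turned back into plain numbers when the output is post-processed. The following is a minimal sketch, not part of Fabric OS: the function names and the column labels are hypothetical, the labels are simply derived from the two header lines above, and the g suffix seen in the sample is assumed to mean billions. Because the switch rounds the displayed value, the result is approximate.

```python
from decimal import Decimal

# Multipliers for the suffixes porterrshow appends to large counters.
# "g" is assumed to mean billions, based on the sample output.
SUFFIXES = {"k": 1_000, "m": 1_000_000, "g": 1_000_000_000}

# Hypothetical column labels, derived from the two header lines.
COLUMNS = [
    "frames_tx", "frames_rx", "enc_in", "crc_err", "crc_g_eof",
    "too_shrt", "too_long", "bad_eof", "enc_out", "disc_c3",
    "link_fail", "loss_sync", "loss_sig", "frjt", "fbsy",
    "c3timeout_tx", "c3timeout_rx", "pcs_err", "uncor_err",
]


def parse_counter(text: str) -> int:
    """Convert a displayed counter such as '575.2m' to an approximate integer."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        # Decimal avoids binary floating-point rounding on values like 575.2.
        return int(Decimal(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def parse_row(line: str) -> tuple[int, dict[str, int]]:
    """Split one port line into (port number, {column label: value})."""
    port, _, rest = line.partition(":")
    values = [parse_counter(v) for v in rest.split()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} counters, got {len(values)}")
    return int(port), dict(zip(COLUMNS, values))


port, counters = parse_row("0: 575.2m 2.1g 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0")
print(port, counters["frames_tx"], counters["frames_rx"], counters["disc_c3"])
# prints: 0 575200000 2100000000 1
```

If a firmware release prints a different number of columns, parse_row raises an error rather than mislabeling the values.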
Cause
NA
Resolution
frames tx
Frames transmitted: The number of frames transmitted by the port. This number is a statistic that provides a baseline for the error counters.
frames rx
Frames received: The number of frames received by the port. This number is a statistic that provides a baseline for the error counters.
enc in
Encoding errors inside frames: (RX). The number of 8b/10b encoding errors that have occurred inside frame boundaries. This counter is generally a zero value, although occasional errors may occur on a normal link and give a nonzero result. Minimum compliance with the link-bit error rate specification on a link continuously receiving frames would allow approximately one error every 20 minutes for 1 Gb/s. Reinitialization and reboots of associated Nx-port can also cause these errors. These errors are in the sum for the LLI errors.
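The error interval quoted above can be estimated with a short calculation. This is a rough sketch under two assumptions that are not stated in the text: the link bit-error-rate limit is 1e-12, and a 1 Gb/s Fibre Channel link signals at 1.0625 Gbaud. The function name is hypothetical.

```python
# Assumptions (not from the text above): BER limit of 1e-12 and a
# 1.0625 Gbaud line rate for a 1 Gb/s Fibre Channel link.
BER_LIMIT = 1e-12
LINE_RATE_1G = 1.0625e9  # bits per second on the wire


def seconds_per_error(line_rate_bps: float, ber: float = BER_LIMIT) -> float:
    """Mean time between bit errors on a link running exactly at the BER limit."""
    return 1.0 / (line_rate_bps * ber)


minutes = seconds_per_error(LINE_RATE_1G) / 60
print(f"about {minutes:.1f} minutes between errors at 1 Gb/s")
# prints: about 15.7 minutes between errors at 1 Gb/s
```

Under these assumptions the result is of the same order as the roughly 20 minutes quoted above. The interval shrinks in proportion to the line rate, so a faster link at the same error rate would be allowed more errors per hour.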
crc err
Frames with Cyclic Redundancy Check errors: (RX) The number of frames that have failed a Cyclic Redundancy Check. The Cyclic Redundancy Check (CRC) is a four-byte field that shall immediately follow the Data Field and shall be used to verify the data integrity of the frame header and Data Field. SOF (Start-Of-Frame) and EOF (End-Of-Frame) delimiters shall not be included in the CRC verification. The CRC field shall be calculated on the frame header and Data Field prior to encoding for transmission and after decoding upon reception. The CRC field shall be aligned on a word boundary. For the purpose of CRC computation, the bit of the word-aligned four-byte field that corresponds to the first bit transmitted is the highest order bit. Frames that fail a CRC are noted but not modified, and the destination device is responsible for rejecting and/or re-requesting the frame. Statistically, enc out errors alone imply cable problems, while enc out and crc err in combination imply GBIC/SFP problems. These errors are in the sum for the LLI errors.
crc g_eof
CRC errors with a good EOF (End Of Frame) received (Rx). When a frame with a CRC error and a good EOF is detected, the switch increments the crc g_eof counter, tags the frame so that no other port counts this CRC frame, and forwards the frame on.
This allows a CRC frame with a good EOF to be traced quickly to the originating port.
too short
The "too short" counter is an error statistics counter that is incremented whenever a frame, bounded by an SOF (Start of Frame) and EOF (End of Frame), is received and the number of words between the SOF and EOF is less than 7 (6 words of header plus 1 word of CRC), i.e., the frame is shorter than 36 bytes including the SOF and EOF (7 words of 4 bytes plus two 4-byte delimiters). This could be caused by the transmitter or by an unreliable link. Data frame size is a variable from 0 to 2112. These errors are in the sum for the LLI errors.
too long
Frames longer than maximum: The number of frames that are longer than the maximum frame size (36 bytes + data frame size). Data frame size is a variable from 0 to 2112 bytes, so FC frames are 2148 bytes maximum. If an EOF is corrupted or data generation is incorrect, a too long error is generated. These errors are in the sum for the LLI errors.
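The two length checks can be summarized in a few lines. This is an illustrative sketch only, not how the ASIC implements the check. It assumes 4-byte SOF and EOF delimiters, which together with the 6-word header and 1-word CRC give the 36 bytes of overhead quoted under too long. The function and constant names are hypothetical.

```python
# Sizes in bytes. The 4-byte delimiters are an assumption; the header and
# CRC sizes follow from the word counts given for the too short counter.
SOF_BYTES = 4
HEADER_BYTES = 24   # 6 words
CRC_BYTES = 4       # 1 word
EOF_BYTES = 4
MAX_DATA_BYTES = 2112

MIN_FRAME_BYTES = SOF_BYTES + HEADER_BYTES + CRC_BYTES + EOF_BYTES  # 36
MAX_FRAME_BYTES = MIN_FRAME_BYTES + MAX_DATA_BYTES                  # 2148


def classify_frame_length(total_bytes: int) -> str:
    """Return which length counter, if any, a frame of this size would hit."""
    if total_bytes < MIN_FRAME_BYTES:
        return "too short"
    if total_bytes > MAX_FRAME_BYTES:
        return "too long"
    return "ok"


for size in (32, 36, 2148, 2152):
    print(size, classify_frame_length(size))
```

A frame with an empty Data Field is exactly 36 bytes and passes both checks, as does a full 2148-byte frame.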
bad eof
Frames with bad end-of-frame delimiters: The end-of-frame (EOF) delimiter is an ordered set that immediately follows the CRC. After a loss-of-synchronization error continuous-mode alignment allows the receiver to reestablish word alignment at any point in the incoming bit stream while the receiver is Operational. Such realignment is likely (but not guaranteed) to result in Code Violations and subsequent loss of Synchronization. Under certain conditions, it may be possible to realign an incoming bit stream without loss of Synchronization. If such a realignment occurs within a received frame, detection of the resulting error condition is dependent upon higher-level function (e.g., invalid CRC, missing EOF Delimiter).
The EOF delimiter shall designate the end of the frame content and shall be followed by idles. There are three categories of EOF delimiters. One category of delimiter shall indicate that the frame is valid from the sender's perspective and potentially valid from the receiver's perspective. The second category shall indicate that the frame content is valid. This category shall only be used by an F-Port that receives a complete frame and decodes it before forwarding that frame on to another destination. The third category shall indicate that the frame content is corrupted and the frame was truncated during transmission. The third category shall be used by both N-Ports and F-Ports to indicate an internal malfunction, such as a transmitter failure, that does not allow the entire frame to be transmitted normally. These errors are in the sum for the LLI errors.
enc out
Encoding errors outside of frames: The number of 8b/10b encoding errors that have occurred in words (ordered sets) outside FC frame boundaries. Words outside of frames are also encoded; if this encoding is corrupted or an error is detected, enc out is incremented.
This counter may become nonzero during link initialization, but it indicates a problem if it increments faster than the link bit-error rate allows (approximately once every 20 minutes for 1 Gb/s). The usual cause is corrupted primitive sequences, for example LIP f7,f7.
NOTE: loss sig, loss sync and enc out errors are expected every time a user brings the port down and up by rebooting a host, power cycling a storage subsystem, disconnecting and reconnecting a cable, or invoking the portDisable/portEnable commands. Keep in mind that these errors also increase while a 2 Gb/s switch negotiates the connection speed with its connected device. Statistically, enc out errors alone imply cable problems, while enc out and crc err in combination imply SFP problems. These errors are in the sum for the LLI errors.
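The statistical rule of thumb in the note above can be written as a small helper for scripts that compare two porterrshow snapshots. This is a sketch only: the function name and return strings are hypothetical, and the rule is a first hint for troubleshooting, not a diagnosis.

```python
def likely_cause(enc_out_delta: int, crc_err_delta: int) -> str:
    """Apply the cable-versus-SFP rule of thumb to counter increases.

    Both arguments are increases over an observation window that contains
    no port resets, because a reset raises enc out even on a healthy link.
    """
    if enc_out_delta > 0 and crc_err_delta > 0:
        return "suspect SFP"
    if enc_out_delta > 0:
        return "suspect cable"
    return "no indication from this rule"


print(likely_cause(enc_out_delta=120, crc_err_delta=0))   # suspect cable
print(likely_cause(enc_out_delta=120, crc_err_delta=15))  # suspect SFP
```

The helper works on deltas, not absolute values, because counters that were raised by an earlier port bounce say nothing about the current state of the link.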
disc c3
Number of Class 3 discarded frames (Rx). The counter is the sum of the following Class 3 discard counters reported by the portstatshow command:
er_rx_c3_timeout, er_tx_c3_timeout, er_c3_dest_unreach, and er_other_disc
Discard class 3 errors can be generated by the switch when devices send frames without first performing FLOGI or with an invalid destination. The counter simply reports that such a discard occurred.
Class 3 frames can be discarded because of timeouts or because the destination is invalid or unreachable. This counter can increment during normal operation. It can also show the effect of port congestion: good frames between S-IDs and D-IDs are normally routed directly port to port on the ASIC, but when the D-ID port suffers a buffer-full condition and cannot accept any more frames, an exception frame is routed through the internal port instead. Likewise, if the destination is blocked because of high ISL workload (that is, a long time with BB credit = 0), buffer-full conditions can result, and the S-ID port may, in extreme circumstances, reach a timeout, which increases the disc c3 counter. These errors are in the sum for the LLI errors.
Some further information: A port can only receive one frame at a time (outside of xWDM connections it is not possible to shine two light pulses down an optical cable at the same time). Therefore, if two light sources try to share a port, they have to use an arbitration algorithm in which one source goes through and the second waits for its turn. When the first source has completed, the second source is allowed to go. This means that the sources can only run at 50% utilization (equal busy and ready times). If the source is capable of streaming data at the speed of the D-ID (which many HBAs are these days), any attempt by another similarly fast HBA to use the same port will result in a 50% performance decrease.
er_unreachable are discards logged because the destination could not be reached or due to offline/online of devices on the destination.
er_other_disc are actual discards that do not fall into either of the other defined discard frame categories. According to Brocade, they are insignificant and have no impact on performance.
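Since disc c3 is described as a sum of portstatshow counters, the relationship can be checked in a script. This is a minimal sketch: the dictionary of counters is hypothetical input, and the key names follow the Class 3 spelling (er_rx_c3_timeout, er_tx_c3_timeout, er_c3_dest_unreach, er_other_disc), which is an assumption that should be adjusted to what the firmware in use actually prints.

```python
# Assumed portstatshow counter names that make up disc c3; adjust as needed.
DISC_C3_PARTS = (
    "er_rx_c3_timeout",
    "er_tx_c3_timeout",
    "er_c3_dest_unreach",
    "er_other_disc",
)


def disc_c3_total(portstats: dict[str, int]) -> int:
    """Sum the discard counters; a counter missing from the input counts as 0."""
    return sum(portstats.get(name, 0) for name in DISC_C3_PARTS)


# Made-up example values, for illustration only.
sample = {"er_rx_c3_timeout": 3, "er_tx_c3_timeout": 1, "er_other_disc": 2}
print(disc_c3_total(sample))  # prints: 6
```

Breaking the total down this way shows whether the discards come from timeouts, which point at congestion, or from unreachable destinations.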
link fail
Link failures (LF1 or LF2 states): The number of times the port entered the Link Failure 1 and/or Link Failure 2 state (Rx). If a port remains in the LR Receive State for longer than the timeout period (R_T_TOV), a Link Reset Protocol Timeout shall be detected, which results in a Link Failure condition (the port enters the NOS Transmit State).
A link failure also indicates that a loss of signal or loss of sync lasting longer than the R_T_TOV value was detected while the port was not in the Offline state.
loss sync
Loss of synchronization: The number of times synchronization was lost. Synchronization failures on either bit or Transmission-Word boundaries are not separately identifiable and cause loss-of-synchronization errors.
Note: "loss sig", "loss sync" and "enc out" errors are expected every time a user brings the port down and up (by rebooting a host, power cycling a storage subsystem, disconnecting and reconnecting a cable, or invoking the portDisable/portEnable commands).
loss sig
Loss of signal: The number of times the signal was lost. This occurs when the port is transmitting a signal but is receiving none. When a loss-of-signal condition is recognized by an operational receiver, the Loss-Of-Synchronization state shall be entered (if the receiver is not presently in that state). The receiver shall remain in this state until one of the following conditions occurs: the loss-of-signal condition is corrected and synchronization is regained, or the receiver is reset.
frjt
Frames rejected with F_RJT: The number of Fabric Port Reject Frames. These indicate that the delivery of a frame is being denied. Some reasons for issuing an F_RJT include: Class not supported; invalid header field(s); and N-Port unavailable.
fbsy
Frames busied with F_BSY (Tx): The number of Fabric Port Busy frames. If the fabric cannot deliver a Class 2 frame within E_D_TOV, the frame is discarded and an F_BSY is returned. This frame is issued by the fabric to indicate that a particular frame cannot be delivered because the fabric or the destination N-Port is busy.
c3-timeout tx
The number of transmit class 3 frames discarded at the transmission port due to timeout (platform- and port-specific).
c3-timeout rx
The number of receive class 3 frames received at this port and discarded at the transmission port due to timeout (platform- and port-specific).
pcs err
The number of Physical Coding Sublayer (PCS) block errors. This counter records encoding violations on 10 Gbps or 16 Gbps ports.
The counter (ER_PCS_BLK, shown as pcs err in porterrshow) applies only to platforms that support 10 Gbps or 16 Gbps ports (6505/6510/6520/DCX-8510). It was introduced with the Condor3 ASIC, the Gen 5 platform. It is the equivalent of enc out on 8 Gb/4 Gb links and is used only at 10 Gb and 16 Gb speeds.
10 Gb and 16 Gb links use 64b/66b encoding instead of 8b/10b for data transmission, and the pcs err (= er_pcs_blk) counter records the encoding violations detected during decoding on those ports.
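To illustrate the difference between the two line codes mentioned above, the share of transmitted bits that carry data can be computed from the code names alone (8 data bits per 10 coded bits, 64 data bits per 66 coded bits). This is a worked calculation with a hypothetical function name, not something the switch reports.

```python
from fractions import Fraction


def payload_fraction(data_bits: int, coded_bits: int) -> Fraction:
    """Fraction of the bits on the wire that carry data for a given line code."""
    return Fraction(data_bits, coded_bits)


eff_8b10b = payload_fraction(8, 10)    # 4/5
eff_64b66b = payload_fraction(64, 66)  # 32/33
print(f"8b/10b: {float(eff_8b10b):.1%}, 64b/66b: {float(eff_64b66b):.1%}")
# prints: 8b/10b: 80.0%, 64b/66b: 97.0%
```

Because the encoding differs, errors on these links are detected per 66-bit block, which is why they are counted under pcs err and not under enc in or enc out.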
uncor err
The number of errors that forward error correction (FEC) was unable to correct.