dis_diag is a diagnostic utility that can be used to diagnose the local PCI Express network controller and the network. It reports various events and situations as errors or warnings. On a well-configured, functional network, dis_diag should not report any errors or warnings.
dis_diag supports various options. Run dis_diag -h for details:
[root@scox-2 ~]# dis_diag -h
dis_diag version 5.2.0 ( Fri Jun 10 11:16:26 CEST 2016 )
Usage: dis_diag [options]
Options:
  -help         This Help
  -v            Verbose output (Max)
  -n            Skip diagnostic. Print local configuration information
  -P            Skip probing of remote nodes / topology investigation
  -A            Skip use of dishosts.conf file. Probe all nodes.
  -a (adapter)  Perform diagnostic only for selected adapter
  -V (level)    Verbose Level 0 - 9
  -clear        Clear IRM statistics only
  -noVer        Skip the driver version check
[root@scox-2 ~]#
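When dis_diag is called from a script, it can help to assemble the command line in one place. The following is a minimal sketch that uses only options listed in the help output above. The function name build_dis_diag_command and its parameters are hypothetical, and it assumes that the adapter number and verbose level are passed as separate arguments after -a and -V (as in dis_diag -V 9).

```python
from typing import List, Optional


def build_dis_diag_command(verbose_level: Optional[int] = None,
                           adapter: Optional[int] = None,
                           skip_probe: bool = False,
                           skip_version_check: bool = False) -> List[str]:
    """Build an argument list for dis_diag (hypothetical helper, not part of the tool)."""
    cmd = ["dis_diag"]
    if verbose_level is not None:
        # The help text documents verbose levels 0 - 9.
        if not 0 <= verbose_level <= 9:
            raise ValueError("verbose level must be between 0 and 9")
        cmd += ["-V", str(verbose_level)]
    if adapter is not None:
        # Assumption: the adapter number follows -a as a separate argument.
        cmd += ["-a", str(adapter)]
    if skip_probe:
        cmd.append("-P")        # skip probing of remote nodes
    if skip_version_check:
        cmd.append("-noVer")    # skip the driver version check
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, capture_output=True, text=True) on a node where dis_diag is installed, for example to save the output to a file.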
Below is an example of running dis_diag with verbose level 9. We have added comments (the lines starting with >>>>) to help explain some of the details. A high level of PCIe chip knowledge is required to fully understand all of them. If you have problems and need help, please send the output of dis_diag -V 9 to support.
[root@Dakar-F ~]# dis_diag -V 9
================================================================================
Dolphin diagnostic tool -- dis_diag version 5.5.1.0 ( Tue Feb 6 18:09:20 CET 2018 )
================================================================================
dis_diag compiled in 64 bit mode
Driver : Dolphin IRM (GX) 5.5.1.0-d Jan 24th 2018 (rev 3b2ede4)
Date : Tue Feb 6 18:54:50 CET 2018
System : Linux Dakar-F 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>> Version information for both the driver (release version and date, SVN revision) and the node software.
Number of configured local adapters found: 1
Adapter 0 > Type : MXH830
Mode : NTB
NodeId : 4
Serial number : MXH830-BC-000999
MXH chip family : Microchip - PFX
MXH chip vendorId : 0x11f8
MXH chip device : 0x8532
MXH chip revision : 0x0 (ZB)
EEPROM version : 05
EEPROM vendor info : 0x0000
EEPROM development ver : 1
Card revision : BC
>>>> Basic information on this adapter. Multiple adapters can be shown for systems with more than one adapter.
>>>> NodeId shows the NodeId of this adapter.
Topology type : Direct 2 nodes
Topology Autodetect : Yes
Number of enabled links : 1
Max payload size (MPS) : 256
Multicast group size : 2 MB
Prefetchable memory size : 256 MB (BAR2)
Non-prefetchable size : 64 MB (BAR4)
Clock mode slot : Port
Clock mode link : Global
>>>> Firmware-configurable settings for this adapter.
PCIe slot state : x16, Gen3 (8 GT/s)
PCIe slot capabilities : x16, Gen3 (8 GT/s)
>>>> Shows the connection the adapter has to the host system. This should match the documentation.
************************* MXH ADAPTER 0 LINK 0 STATE *************************
Link 0 uptime : 563 seconds
Link 0 state : ENABLED
Link 0 state : x16, Gen3 (8 GT/s)
Link 0 required : x16, Gen3 (8 GT/s)
Link 0 capabilities : x16, Gen3 (8 GT/s)
Link 0 cable inserted : 1
Link 0 active : 1
Link 0 configuration : NTB
>>>> Shows the connection the adapter has to this fabric (other adapter/SBC or switch/switchboard).
>>>> Also shows the state of this link: here, enabled, with an uptime of 563 seconds.
>>>> For adapters supporting more than one link, the information is displayed per link.
**************************** MXH ADAPTER 0 STATUS ****************************
Chip temperature : 45 C
Adapter state : 1
EEPROM init done : 0
*************** MXH ADAPTER 0, PARTNER INFORMATION FOR LINK 0 ***************
Partner adapter type : MXH830
Partner serial number : MXH830-000000
Partner link no : 0
Partner number of ports : 1
>>>> Shows the information collected from the other adapter/switch/switchboard on this link.
***************************** TEST OF ADAPTER 0 *****************************
OK: MXH chip alive in adapter 0.
OK: Link alive in adapter 0.
==> Local adapter 0 ok.
************************ TOPOLOGY SEEN FROM ADAPTER 0 ************************
Adapters found: 2
----- List of all nodes found:
Nodes detected: 0004 0008
>>>> Lists the NodeIds of the nodes detected on this fabric. The NodeId of this node (configured above) should be present.
>>>> Nodes that are expected to be reachable but are missing will cause a warning to be displayed.
******************* INTERRUPT INFORMATION FOR ADAPTER NO 0 *******************
Interrupt counters for MX adapter 0:
No of total interrupt ......................... : 0
No of doorbell interrupts ..................... : 0
No of message interrupts ...................... : 0
No of switch events ........................... : 0
No of slot up events .......................... : 0
No of slot down events ........................ : 0
No of uncorrectable slot error events ......... : 0
No of correctable slot error events ........... : 0
No of unclaimed interrupts .................... : 0
************** INTERRUPT INFORMATION FOR ADAPTER NO 0 - LINK 0 **************
Interrupt counters for MX adapter 0 - Link 0:
No of total interrupt ......................... : 0
No of link up events .......................... : 0
No of link down events ........................ : 0
No of uncorrectable link error events ......... : 0
No of correctable link error events .......... : 0
>>>> Various types of interrupts received by the card.
******** PCIe CABLE LINK ERROR INFORMATION FOR ADAPTER NO 0 - Link 0 ********
NACK error cnt - PCIe Cable Link 0 ............ : 0
NACK DLLP transmitted ......................: 0
NACK DLLP received .........................: 0
Uncorrectable error cnt - PCIe Cable Link 0 ... : 0
dlperr cnt ................................ : 0
sdoenerr cnt .............................. : 0
poisoned cnt .............................. : 0
fcperr cnt ................................ : 0
compto cnt ................................ : 0
cabort cnt ................................ : 0
uecomp cnt ................................ : 0
rcvovr cnt ................................ : 0
malformed cnt ............................. : 0
ecrc_cnt .................................. : 0
ur_cnt .................................... : 0
acsv_cnt .................................. : 0
uie_cnt ................................... : 0
mcblktlp cnt .............................. : 0
atopeb cnt ................................ : 0
tlppbe cnt ................................ : 0
Correctable error cnt - PCIe Cable Link 0 ..... : 0
rcverr cnt ................................ : 0
badtlp cnt ................................ : 0
baddllp cnt ............................... : 0
rplyrovr cnt .............................. : 0
rplyto cnt ................................ : 0
advisorynf cnt ............................ : 0
cie cnt ................................... : 0
hlo cnt ................................... : 0
**************** PCIe SLOT ERROR INFORMATION FOR ADAPTER NO 0 ****************
NACK error cnt - PCIe Slot .................... : 0
NACK DLLP transmitted ......................: 0
NACK DLLP received .........................: 0
Total uncorrectable error cnt - PCIe Slot ..... : 0
dlperr cnt ................................ : 0
sdoenerr cnt .............................. : 0
poisoned cnt .............................. : 0
fcperr cnt ................................ : 0
compto cnt ................................ : 0
cabort cnt ................................ : 0
uecomp cnt ................................ : 0
rcvovr cnt ................................ : 0
malformed cnt ............................. : 0
ecrc_cnt .................................. : 0
ur_cnt .................................... : 0
acsv_cnt .................................. : 0
uie_cnt ................................... : 0
mcblktlp cnt .............................. : 0
atopeb cnt ................................ : 0
tlppbe cnt ................................ : 0
Total correctable error cnt - PCIe Slot ....... : 0
rcverr cnt ................................ : 0
badtlp cnt ................................ : 0
baddllp cnt ............................... : 0
rplyrovr cnt .............................. : 0
rplyto cnt ................................ : 0
advisorynf cnt ............................ : 0
cie cnt ................................... : 0
hlo cnt ................................... : 0
********************** LINK RESET CNT FOR ADAPTER NO 0 **********************
Driver initialization : 0
Link check problems (RTG) : 0
Max link check attempts reached (RTG/FE) : 0
Fatal error detected : 0
Link watchdog failure : 0
IOCTL Enabled SCI link (IM) : 0
IOCTL Disable SCI link (IM) : 0
IOCTL Reset issued by user interface (IM) : 0
IOCTL Retrain issued by user interface (IM) : 0
IOCTL Prepare reconfiguration (IM) : 0
IOCTL Enable routing tables (IM) : 0
Stuck cable link : 0
Stuck link Serdes : 0
Unexpected link width : 0
Unexpected link speed : 0
Generic test module : 0
Production test : 0
Power management : 0
******************** LINK RESET HISTORY FOR ADAPTER NO 0 ********************
Reset reason[0] is the first occurrence
************************* DUMP OF ADAPTER NO 0 CSR'S *************************
Adapter 0: dump of CSR's in MXH chip
VID, DID, offset 0x0000, has the value 0x853211f8
PCICMD, PCISTS, offset 0x0004, has the value 0x00100406
RID, CCODE, offset 0x0008, has the value 0x06800000
CLS, LTIMER, HDR, BIST, offset 0x000c, has the value 0x00800010
BAR0, offset 0x0010, has the value 0xdc800000
BAR1, offset 0x0014, has the value 0x00000000
BAR2, offset 0x0018, has the value 0xc000000c
BAR3, offset 0x001c, has the value 0x00000000
BAR4, offset 0x0020, has the value 0xd8000000
BAR5, offset 0x0024, has the value 0xdc000000
SUBVID, SUBID, offset 0x002c, has the value 0x083011c8
EROMBASE, offset 0x0030, has the value 0x00000000
CAPPTR, offset 0x0034, has the value 0x00000040
INTRLINE, INTRPIN, MAXLAT, offset 0x003c, has the value 0x000000ff
PCIECAP, offset 0x0040, has the value 0x00845005
PCIEDCAP, offset 0x0044, has the value 0x00000000
PCIEDCTL, offset 0x0048, has the value 0x00000000
PCIELCAP, offset 0x004c, has the value 0x00000000
PCIELCTL, PCIELSTS, offset 0x0050, has the value 0x80035c11
PCIEDCAP2, offset 0x0064, has the value 0x00020010
PCIEDCTL2, PCIEDSTS2, offset 0x0068, has the value 0x112c8022
PCIELCAP2, offset 0x006c, has the value 0x0000512f
PCIELCTL2, PCIELSTS2, offset 0x0070, has the value 0x20437103
PCICTL, offset 0x0078, has the value 0x00000000
SSIDSSVIDCAP, offset 0x00f0, has the value 0x00000000
SSIDSSVID, offset 0x00f4, has the value 0x00000000
AERUES, offset 0x0104, has the value 0x00000000
SNUMCAP, offset 0x0180, has the value 0x00000000
SNUMLDW, offset 0x0184, has the value 0x00000000
SNUMUDW, offset 0x0188, has the value 0x00000000
*********************** SESSION STATUS FROM ADAPTER 0 ***********************
Node 4           TO SESSION   FROM SESSION:
Session ID     : 6669         6669
Disabled count : 0            0
Alive count    : 0            2817
Status : 3
timeout cnt : 0
bad probe count: 0
alive fail time: 0
************************* DUMP OF IRM CONFIG SETTING *************************
impl_session : TRUE
dmaAllowed : TRUE
ccoherprop : TRUE
stampSize : 0
linkMessagesEnabled : TRUE
linkMessagesEnabled : TRUE
maxVcNumber : 2048
useNonBlockBarriers : TRUE
notesOnLogFileOnly : FALSE
linkWatchdogMaxRetries : 1
linkWatchdogPeriod : 3
readyToGoDelay : 200
useOsConfig : FALSE
IRM Config : 0x1302
PSB Config : 0x8
LC Config : 0x1
Memory Pre-Alloc Size : 0 MB
Disable force link training : 0
Min link width : x16
Min link speed : Gen3
Interrupt polling : Immediate (1000, 100)
DMA polling : Immediate (1000, 100)
----------------------------------
dis_diag discovered 0 note(s).
dis_diag discovered 0 warning(s).
dis_diag discovered 0 error(s).
TEST RESULT: *PASSED*
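The summary at the end of the output, together with the "Nodes detected" line, is usually the first thing to check. The following minimal sketch shows one way to extract these values from saved dis_diag output. The function name summarize_dis_diag is hypothetical, and the patterns are based only on the example output above; other dis_diag versions may format these lines differently.

```python
import re


def summarize_dis_diag(output: str) -> dict:
    """Extract note/warning/error counts, the test verdict and detected NodeIds.

    Hypothetical helper based on the example output shown in this section.
    """
    summary = {"notes": None, "warnings": None, "errors": None,
               "result": None, "nodes": []}

    # Lines such as "dis_diag discovered 0 warning(s)."
    for count, kind in re.findall(
            r"dis_diag discovered (\d+) (note|warning|error)\(s\)", output):
        summary[kind + "s"] = int(count)

    # Line such as "TEST RESULT: *PASSED*"
    match = re.search(r"TEST RESULT:\s*\*(\w+)\*", output)
    if match:
        summary["result"] = match.group(1)

    # Line such as "Nodes detected: 0004 0008"; NodeIds are kept as printed.
    match = re.search(r"Nodes detected:((?:[ \t]+\d+)+)", output)
    if match:
        summary["nodes"] = match.group(1).split()

    return summary
```

A script could, for example, flag a node for closer inspection when the warning or error count is not zero, or when an expected NodeId is missing from the nodes list.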