The Dolphin eXpressWare drivers are designed to adapt to the environment they are operating in; therefore, manual configuration is rarely required. The upper limit for memory allocation of the low-level driver is the only setting that may need to be adapted for a cluster, but this is also done automatically during the installation.
Changing parameters in these files can affect reliability and performance of the PCI Express interconnect. Please carefully read the documentation before you make any changes.
The dis_px.conf file is located in the lib/modules directory of the DIS installation (default /opt/DIS) and contains options for the eXpressWare hardware driver.
Changing values other than those described below may cause the interconnect to malfunction. Only do so if instructed by Dolphin support.
Preallocation of memory is recommended on systems without an IOMMU (such as x86 and x86_64). Memory becomes fragmented over time, which can make it difficult to allocate large segments of contiguous physical memory after the system has been running for a while. To address this, options have been added that let the IRM driver allocate blocks of memory upon initialization and, under certain conditions, serve allocations of remotely accessible memory segments from this pool.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
ntb_memory_preallocation_size_mb | Defines the number of megabytes of memory the driver shall try to allocate upon initialization. | MB | 0: disable preallocation >0: MB to preallocate in as few blocks as possible | 160 |
ntb_cluster_page_size_kb sets the cluster page size in KB. The system page size may differ between the hosts in a cluster. In that case, the cluster page size must be set to the largest system page size found on any host, and it must be the same on all hosts in the cluster. The default page size is 4 KB on most systems; some ARM systems use a page size of 64 KB. If any node has a page size larger than 4 KB, all hosts communicating with that node must set the cluster page size parameter to the largest page size in the cluster.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
ntb_cluster_page_size_kb | Defines the cluster page size in kilobytes. | KB | 4 - 64 | 4 |
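The rule above (use the largest system page size of any host, within the documented 4 - 64 KB range) can be sketched as follows. This is a minimal illustration, not part of eXpressWare: the function names are hypothetical, and collecting the page size from each host is assumed to be done by other means.

```python
import os

# Documented valid range for ntb_cluster_page_size_kb.
VALID_MIN_KB = 4
VALID_MAX_KB = 64


def local_page_size_kb() -> int:
    """Return this host's system page size in KB (POSIX systems only)."""
    return os.sysconf("SC_PAGE_SIZE") // 1024


def cluster_page_size_kb(host_page_sizes_kb) -> int:
    """Pick the largest page size reported by any host in the cluster."""
    sizes = list(host_page_sizes_kb)
    if not sizes:
        raise ValueError("need at least one host page size")
    largest = max(sizes)
    if not VALID_MIN_KB <= largest <= VALID_MAX_KB:
        raise ValueError(f"{largest} KB is outside the documented 4 - 64 KB range")
    return largest
```

For a cluster of two x86_64 hosts with 4 KB pages and one ARM host with 64 KB pages, `cluster_page_size_kb([4, 4, 64])` yields 64, which would then have to be configured on all three hosts.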
By default, the maximum multicast segment size is 2 megabytes and 2 multicast groups are allocated. Modify the following parameters to change the multicast setup, and reboot the system after changing them. If the driver fails to allocate the specified amount of memory, the eXpressWare drivers will fail during initialization. If that happens, add more memory to the server or reduce the requirements. See Section 12, “Support” if you have any questions.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
ntb_mcast_group_size | Defines the maximum size of a single multicast group that should be allocated upon driver initialization. The default setting is 21, i.e. 2^21 bytes = 2 megabytes. | Integer | 0: disable preallocation; >0: log2(size) | 21 |
ntb_mcast_max_groups | Defines the number of multicast groups that should be allocated upon driver initialization. | Integer | 1-4 | 2 |
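Because ntb_mcast_group_size is given as a base-2 logarithm, converting between the setting and a size in bytes is easy to get wrong. The following sketch shows the arithmetic. The helper names are hypothetical, and the total is computed on the assumption that the driver allocates one block of the full group size per group.

```python
def mcast_group_bytes(ntb_mcast_group_size: int) -> int:
    """Size in bytes of one multicast group; 0 means preallocation is disabled."""
    if ntb_mcast_group_size <= 0:
        return 0
    return 1 << ntb_mcast_group_size


def group_size_setting_for(required_bytes: int) -> int:
    """Smallest ntb_mcast_group_size whose 2^n bytes cover required_bytes."""
    if required_bytes <= 0:
        raise ValueError("required_bytes must be positive")
    # bit_length() of (n - 1) is ceil(log2(n)) for n >= 1; never return 0,
    # because 0 is documented as "disable preallocation".
    return max(1, (required_bytes - 1).bit_length())


def total_mcast_bytes(ntb_mcast_group_size: int, ntb_mcast_max_groups: int) -> int:
    """Assumed total: one full-size block per multicast group."""
    return mcast_group_bytes(ntb_mcast_group_size) * ntb_mcast_max_groups
```

With the defaults (21 and 2), each group is 2 MB and the assumed total is 4 MB. A 3 MB segment does not fit in a 2 MB group, so `group_size_setting_for(3 * 1024 * 1024)` rounds up to 22.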
The dis_irm.conf file is located in the lib/modules directory of the DIS installation (default /opt/DIS) and contains options for the hardware driver (dis_irm kernel module).
Only a few options are to be modified by the user.
Changing values in dis_irm.conf other than those described below may cause the interconnect to malfunction. Only do so if instructed by Dolphin support.
Whenever a setting in this file is changed, the driver needs to be reloaded to make the new settings effective. Please note that some of the possible settings are commented out in the dis_irm.conf file. Please remove the leading # to change these settings.
These parameters control memory allocations that are only performed on driver initialization.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
max-vc-number | Maximum number of virtual channels (one virtual channel is needed per remote memory connection; i.e. 2 per SuperSocket connection) | n/a | integers > 0 The upper limit is the consumed memory; values > 16384 are typically not necessary. | 1024 |
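As a rough sizing aid, the rule stated in the table (one virtual channel per remote memory connection, two per SuperSockets connection) can be turned into a small calculation. This is only a sketch under those stated rules: the function names are hypothetical, and any additional channels the driver may use internally are not accounted for.

```python
def required_virtual_channels(supersocket_connections: int,
                              other_remote_memory_connections: int = 0) -> int:
    """Estimate virtual channels: 2 per SuperSockets connection, 1 per other connection."""
    if supersocket_connections < 0 or other_remote_memory_connections < 0:
        raise ValueError("connection counts must not be negative")
    return 2 * supersocket_connections + other_remote_memory_connections


def fits_max_vc_number(needed: int, max_vc_number: int = 1024) -> bool:
    """Check the estimate against max-vc-number (documented default: 1024)."""
    return needed <= max_vc_number
```

For example, 500 SuperSockets connections plus 24 other remote memory connections need an estimated 1024 virtual channels, which just fits the default; 600 SuperSockets connections would need 1200 and call for a larger max-vc-number.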
These parameters control the IRM driver interrupt polling mechanism, which is used to minimize system interrupt latencies. Polling reduces both the remote PCIe fabric interrupt latency and the interrupt latency associated with DMA transfers. Remote interrupt polling and DMA polling can be adjusted independently to suit specific requirements, or turned off. The polling thread consumes CPU resources, but it often also reduces DMA latency significantly.
The dis_tool commands "control-intr-polling" and "control-dma-polling" can be used to experiment with and change the values on a running system. Values set in dis_irm.conf are applied automatically each time the driver is loaded or reloaded.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
intr_poll_mode | Controls the interrupt poll mechanism available for optimizing remote interrupts. SISCI, SuperSockets and IPoPCIe communication will benefit from this. | n/a | 0 : Disabled. 1 : Delayed On. 2 : Immediate On. 3 : Always On. | 2 |
intr_poll_thresh_on | Controls the delayed on threshold. Interrupt polling will start if the number of interrupts per second exceeds the threshold on value. | interrupts/sec | Integers > 0 | 1000 |
intr_poll_thresh_off | Controls the delayed off threshold for interrupt polling. Interrupt polling will be turned off when the number of interrupts per second is lower than the threshold off value. | interrupts/sec | Integers > 0 | 100 |
dma_poll_mode | Controls the interrupt poll mechanism available for optimizing system interrupt latency with DMA transfers. SISCI, SuperSockets and IPoPCIe communication will benefit from this when DMA operations are used. | n/a | 0 : Disabled. 1 : Delayed On. 2 : Immediate On. 3 : Always On. | 2 |
dma_poll_thresh_on | Controls the delayed on threshold for DMA interrupt polling. DMA interrupt polling will start if the number of interrupts per second exceeds the threshold on value. | interrupts/sec | Integers > 0 | 1000 |
dma_poll_thresh_off | Controls the delayed off threshold for DMA interrupt polling. DMA interrupt polling will be turned off when the number of interrupts per second is lower than the threshold off value. | interrupts/sec | Integers > 0 | 100 |
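The on and off thresholds form a hysteresis band: polling starts above the on threshold and stops only once the rate falls below the off threshold. The following sketch models that behaviour to help when choosing values. It is an illustrative model only, not the driver's implementation; the function name is hypothetical, and it assumes one interrupt-rate sample per second.

```python
def polling_states(rates_per_sec, thresh_on=1000, thresh_off=100, polling=False):
    """Return the polling state (True = polling) after each rate sample."""
    states = []
    for rate in rates_per_sec:
        if not polling and rate > thresh_on:
            polling = True    # rate exceeded the on threshold: start polling
        elif polling and rate < thresh_off:
            polling = False   # rate dropped below the off threshold: stop polling
        states.append(polling)
    return states
```

In this model, a rate of 500 interrupts per second leaves polling unchanged: it stays off if it was off and stays on if it was on. Keeping the two thresholds well apart, as the defaults do, avoids switching back and forth when the load hovers around a single value.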
These parameters control some of the driver's real-time settings. Changes here are normally only needed if you run a real-time application or simulation using the SISCI API.
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
linkWatchdogEnabled | Controls the link watchdog behaviour. The link watchdog is a high availability feature that ensures detection of non-operational links. This feature is normally not needed but should be left on for additional high availability. The feature introduces microsecond-level jitter and should be turned off for real-time applications. | Seconds | 0 : Disabled. Integers > 0 : Watchdog period in seconds. | 3 |
sessionHeartbeatsEnabled | Controls the session heartbeats. The session heartbeat mechanism is used for internal end-to-end heartbeating. This feature introduces microsecond-level jitter and should be turned off for real-time applications. | n/a | 0 : Disabled. 1 : Enabled. | 1 (enabled) |
Option Name | Description | Unit | Valid Values | Default Value |
---|---|---|---|---|
link-messages-enabled | Control logging of non-critical link messages during operation. | n/a | 0: no link messages 1: show link messages | 0 |
notes-disabled | Control logging of non-critical notices during operation. | n/a | 0: show notice messages 1: no notice messages | 1 |
warn-disabled | Control logging of general warnings during operation. | n/a | 0: show warning messages 1: no warning messages | 0 |
dis_report_resource_outtages | Control logging of out-of-resource messages during operation. | n/a | 0: no messages 1: show messages | 0 |
notes-on-log-file-only | Control printing of driver messages to the system console | n/a | 0: also print to console 1: only print to kernel message log | 0 |
dis_ssocks.conf is a configuration file for the SuperSockets (dis_ssocks) kernel module. The values defined within this file are passed to the dis_ssocks kernel module when it is loaded (part of SuperSockets startup).
If a value different from the default is required, edit and uncomment (remove the #) the appropriate line.
#address_family=27;
#rds_compat=0;
The following keywords are valid:
address_family: AF_SSOCKS address family index. Default value is 27. If not set, the driver will automatically choose another index between 27 and 32 until it finds an unused index. The index currently in use can be retrieved via the /proc file system, for example with cat /proc/net/af_ssocks/family.
If this value is set explicitly in dis_ssocks.conf, exactly that index is used and no search for an unused index is performed. If the index is already taken, SuperSockets startup will fail.
Generally, this value is only required if SuperSockets should be used explicitly without the preload library, like when using SuperSockets within the kernel.
rds_compat: RDS compatibility level. Default is 0.
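To check which settings in dis_ssocks.conf are actually active (uncommented), the file can be scanned with a few lines of code. The sketch below is based only on the key=value; form shown above. It assumes that # starts a comment running to the end of the line, and the function name is hypothetical; it is not a tool shipped with eXpressWare.

```python
def active_settings(conf_text: str) -> dict:
    """Return the uncommented key=value; settings found in the given text."""
    settings = {}
    for raw_line in conf_text.splitlines():
        # Assumption: '#' comments out the remainder of the line.
        line = raw_line.split("#", 1)[0]
        for statement in line.split(";"):
            key, sep, value = statement.partition("=")
            if sep and key.strip():
                settings[key.strip()] = value.strip()
    return settings
```

Applied to the file as shipped, with both lines commented out, the result is an empty dictionary, meaning the driver defaults apply. After removing the # in front of address_family=27; the result contains that single entry.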