In addition to the different operating modes, a number of options are available that influence how the installer operates. Not all options affect all operating modes.
To specify the list of Cluster Nodes on the command line instead of interactively, use the --nodes option together with a comma-separated list of hostnames and/or IP addresses.
Example:
--nodes n01,n02,n03,n04
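If the node names are kept in a text file, the comma-separated list can be generated instead of typed. The following is a minimal sketch; the helper name join_nodes and the one-hostname-per-line input format are assumptions made for illustration and are not part of the SIA.

```shell
#!/bin/sh
# Join hostnames (one per line on stdin) into the comma-separated form
# that --nodes expects. Blank lines and '#' comment lines are skipped.
join_nodes() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' | paste -s -d, -
}

# Demo with the node names from the example above.
printf 'n01\nn02\nn03\nn04\n' | join_nodes
```

The demo prints n01,n02,n03,n04. With a hypothetical file nodes.txt, the result could be passed to the installer as --nodes "$(join_nodes < nodes.txt)".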
If this option is provided, existing configuration files like /etc/dis/dishosts.conf will not be considered.
By default, the complete software stack will be installed to /opt/DIS. To change the installation path, use the --prefix option.
Example:
--prefix /usr/dolphin
This will install into /usr/dolphin. It is recommended to install into a dedicated directory that is located on a local storage device (not mounted via the network). When doing a full cluster install (--install-all, or default operation), the same installation path will be used on all Cluster Nodes, the Cluster Management Node and potentially the installation machine (if different from the Cluster Management Node).
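Whether a candidate prefix is on local storage can be checked before running the installer. The sketch below is only an illustration: the function names are hypothetical, the list of network filesystem types is an assumption and not exhaustive, and check_prefix_local relies on GNU stat being available on the host.

```shell
#!/bin/sh
# Return 0 if the given filesystem type name looks network-backed.
# The list of types is an illustrative assumption, not exhaustive.
is_network_fstype() {
    case "$1" in
        nfs|cifs|smb|smb2|afs|ceph|lustre) return 0 ;;
        *) return 1 ;;
    esac
}

# Report whether the directory that will hold the prefix is on local storage.
# Uses GNU stat ('stat -f -c %T'); this is an assumption about the host.
check_prefix_local() {
    dir=$1
    # Walk up to the nearest existing directory; the prefix may not exist yet.
    while [ ! -d "$dir" ]; do dir=$(dirname "$dir"); done
    fstype=$(stat -f -c %T "$dir") || return 2
    if is_network_fstype "$fstype"; then
        echo "$1: on network filesystem ($fstype)"
        return 1
    fi
    echo "$1: on local filesystem ($fstype)"
}
```

For example, check_prefix_local /usr/dolphin could be run on each machine before passing --prefix /usr/dolphin to the installer.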
If you are re-running an installation for which the binary RPM packages have already been built, you can save time by reusing the existing packages instead of building them again. The packages have to be placed in two subdirectories, node_RPMS and frontend_RPMS, the same layout that the SIA creates. Then, provide the name of the directory containing these two subdirectories to the installer using the --use-rpms option.
Example:
--use-rpms $HOME/dolphin
The installer does not verify whether the provided packages match the installation target; if they do not match, the RPM installation itself will fail.
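Because the installer expects a specific directory layout for --use-rpms, it can be worth checking the layout beforehand. This is a minimal sketch with a hypothetical helper, check_rpm_dir; it only tests that the two subdirectories exist and does not inspect the packages themselves.

```shell
#!/bin/sh
# Check that a directory has the layout --use-rpms expects:
# two subdirectories named node_RPMS and frontend_RPMS.
check_rpm_dir() {
    status=0
    for sub in node_RPMS frontend_RPMS; do
        if [ ! -d "$1/$sub" ]; then
            echo "missing: $1/$sub" >&2
            status=1
        fi
    done
    return $status
}
```

A call such as check_rpm_dir "$HOME/dolphin" returns a non-zero status and names the missing subdirectory on standard error if the layout is incomplete.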
To replace the installed packages with the packages built from the SIA you are currently using, even if the installed packages are more recent (have a higher version number), use the --enforce option. This will enforce the installation of the same software version (the one delivered within this SIA) on all Cluster Nodes and the Cluster Management Node, no matter what might be installed on any of these machines.
Example:
--enforce
When doing a full cluster install, the installation script will automatically look for the cluster configuration files dishosts.conf and networkmanager.conf in the default path /etc/dis on the installation machine. If these files are not stored in the default path (e.g. because you have created them on another machine, or received them from Dolphin and stored them somewhere else), you can specify their location using the --config-dir option.
Example:
--config-dir /tmp
The script will look for both configuration files in /tmp.
If the two configuration files are stored in different locations, use the options --dishosts-conf <filename> and --networkmgr-conf <filename>, respectively, to specify where each configuration file can be found.
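The choice between --config-dir and the two per-file options can be sketched as a small helper. The function config_opts is hypothetical, and it assumes that --config-dir only applies when both files use their default names (dishosts.conf and networkmanager.conf) in the same directory. Paths containing spaces are not handled.

```shell
#!/bin/sh
# Print the installer options that locate the two configuration files.
# $1: path to the dishosts file, $2: path to the networkmanager file
config_opts() {
    if [ "$(dirname "$1")" = "$(dirname "$2")" ] &&
       [ "$(basename "$1")" = dishosts.conf ] &&
       [ "$(basename "$2")" = networkmanager.conf ]; then
        # Same directory and default names: one option is enough.
        echo "--config-dir $(dirname "$1")"
    else
        # Otherwise name each file explicitly.
        echo "--dishosts-conf $1 --networkmgr-conf $2"
    fi
}

config_opts /tmp/dishosts.conf /tmp/networkmanager.conf
config_opts /tmp/dishosts.conf /srv/cfg/networkmanager.conf
```

The first call prints --config-dir /tmp; the second prints the two per-file options with their respective paths.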
The Dolphin eXpressWare drivers have an option to try to enforce a minimum PCIe link width on the PCIe cable. The --link-width <1|2|4|8|16> option sets the minimum required link width in the /etc/dis/dishosts.conf configuration file. If your cluster consists of equipment with different capabilities, you need to edit /etc/dis/dishosts.conf to specify the link width for each host.
--link-width 8
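In a wrapper script, the value passed to --link-width can be validated against the listed choices before the installer is started. This is a minimal sketch; valid_link_width is a hypothetical helper and not part of the SIA.

```shell
#!/bin/sh
# Accept only the values listed for --link-width: 1, 2, 4, 8 or 16.
valid_link_width() {
    case "$1" in
        1|2|4|8|16) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo: check a requested width before building the command line.
width=8
if valid_link_width "$width"; then
    echo "--link-width $width"
else
    echo "unsupported link width: $width" >&2
fi
```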
To install the SISCI Development package, use the --enable-sisci-development option. The SISCI development files will by default be installed in /opt/DIS/src.
--enable-sisci-development
The Dolphin eXpressWare drivers have an option to integrate with the NVIDIA CUDA programming environment. With the --enable-cuda-support option, the installer configures the software so that CUDA applications can use SISCI functionality for GPU RDMA transfers to and from GPUs over the PCI Express network.
To use this option, the CUDA programming environment and drivers must be installed and configured. The eXpressWare drivers will fail to load if the CUDA environment is uninstalled or not available.
Please carefully follow the instructions and questions provided by the installer.
CUDA® is a parallel computing platform and programming model invented by NVIDIA.
--enable-cuda-support
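Since the CUDA environment and drivers must be present before this option is used, a quick pre-flight check can save a failed run. The sketch below treats the presence of nvcc and nvidia-smi on the PATH as a rough indicator; that choice is an assumption, and the installer itself may check for different things. The helper have_tools is hypothetical.

```shell
#!/bin/sh
# Return 0 only if every named command is found on PATH.
have_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found: $tool" >&2
            missing=1
        fi
    done
    return $missing
}

# nvcc (CUDA compiler) and nvidia-smi (driver utility) serve as stand-ins
# for "CUDA environment and drivers are installed"; this is an assumption.
if have_tools nvcc nvidia-smi 2>/dev/null; then
    echo "CUDA tools found"
else
    echo "CUDA tools not found; check the CUDA installation first"
fi
```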
Dolphin eXpressWare release 5.5.0 and newer also includes support for managing PCIe cards in transparent mode. With the --install-transparent option, the installer will install the software to manage transparent cards. This option is currently only available with PX cards.
# sh ./Dolphin_eXpressWare-<version> --install-transparent
Dolphin eXpressWare release 5.5.0 and newer also includes new functionality to access and manage Transparent Devices. With the --enable-smartio option, the installer will install the SmartIO software. This option is only available with newer Linux kernels and PCIe networks based on Broadcom or Microsemi PCIe technology. Please consult the software release notes for details.
# sh ./Dolphin_eXpressWare-<version> --enable-smartio
SuperSockets is an optional component of the Dolphin eXpressWare software. It enables networked applications to use PCIe as a network without any changes to the application. To install the SuperSockets software, use the --enable-supersockets option.
# sh ./Dolphin_eXpressWare-<version> --enable-supersockets
To run the installation unattended, use the --batch option to have the script assume the default answer for every question that is asked. Additionally, you can avoid most of the console output (but still have the full log file) by providing the --quiet option. This can be very useful if you are upgrading an already installed cluster. For example, to enforce the installation of newly compiled RPM packages and reboot the Cluster Nodes after the installation, you could issue the following command on the Cluster Management Node:
# sh ./Dolphin_eXpressWare-<version> --batch --reboot --enforce >install.log
After this command returns, your cluster is guaranteed to be freshly installed unless error messages can be found in the file install.log.
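After an unattended run, the log can be scanned automatically. This is a minimal sketch: the helper names are hypothetical, and matching on the word "error" in any case is an assumption about how the installer words its messages, so a clean result here does not replace reading the log.

```shell
#!/bin/sh
# Return 0 if the log contains lines that look like error messages.
# Matching the word "error" (any case) is an assumption about the log format.
log_has_errors() {
    grep -i -q 'error' "$1"
}

# Print the matching lines with their line numbers for inspection.
show_log_errors() {
    grep -i -n 'error' "$1"
}
```

For example, running log_has_errors install.log && show_log_errors install.log after the installer returns lists any suspicious lines.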
When building RPMs only (using the --build-rpm option), it is possible to specify that the GUI applications (dis_admin and dis_netconfig) should not be built. This is done by providing the --disable-gui option, which removes the dependency on the Qt libraries and header files for the build process.
Example:
--disable-gui
To remove all software that has been installed via SIA, simply use the --uninstall option:
--uninstall
This will remove all packages from the Cluster Node and stop all drivers (if they are not in use). A more thorough cleanup, including all configuration data and possible remnants of non-SIA installations, can be achieved with the --wipe option:
--wipe
This option is a superset of --uninstall.
--wipe-all
This will attempt to contact all nodes specified in the dishosts.conf file and execute --wipe on the whole cluster. The result is a cluster-wide uninstallation.