Merge pull request #2028 from cgoveas/devel-1.4.2.2
Local repos, input parameters and building clusters
DeepikaKrishnaiah committed May 24, 2023
2 parents d4b5603 + 7b0ba88 commit c7b386e
Showing 28 changed files with 87 additions and 6 deletions.
@@ -17,6 +17,8 @@ Using FreeIPA

Enter the following parameters in ``input/security_config.yml``.

.. warning:: Do not remove or comment any lines in the ``input/security_config.yml`` file.

+----------------------------+----------------------------------------------------------------------------------------------+
| Parameter | Details |
+============================+==============================================================================================+
2 changes: 2 additions & 0 deletions docs/source/InstallationGuides/BuildingClusters/BeeGFS.rst
@@ -58,6 +58,8 @@ Installing the BeeGFS client via Omnia

After the required parameters are filled in ``input/storage_config.yml``, Omnia installs BeeGFS on manager and compute nodes while executing the ``omnia.yml`` playbook.

.. warning:: Do not remove or comment any lines in the ``input/storage_config.yml`` file.

+---------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Name | Details |
+=================================+======================================================================================================================================================================================================================================================+
@@ -1,7 +1,9 @@
Input parameters for the cluster
-------------------------------------

These parameters are located in ``input/omnia_config.yml``
These parameters are located in ``input/omnia_config.yml``.

.. warning:: Do not remove or comment any lines in the ``input/omnia_config.yml`` file.

.. csv-table:: Parameters
:file: ../../Tables/scheduler.csv
2 changes: 2 additions & 0 deletions docs/source/InstallationGuides/ConfiguringStorage/index.rst
@@ -7,6 +7,8 @@ To configure powervault ME4 and ME5 storage arrays, follow the below steps:

Fill out all required parameters in ``storage/powervault_input.yml``:

.. warning:: Do not remove or comment any lines in the ``storage/powervault_input.yml`` file.

+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parameter | Details |
+================================+===========================================================================================================================================================================================================================================================+
@@ -4,6 +4,8 @@ Configuring ethernet switches (Z series)

* Edit the ``network/ethernet_zseries_input.yml`` file for all Z series PowerSwitches such as Z9332F-ON, Z9262-ON and Z9264F-ON. The default configuration is written for Z9264F-ON.

.. warning:: Do not remove or comment any lines in the ``network/ethernet_zseries_input.yml`` file.

+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Name | Details |
+============================+=====================================================================================================================================================================================+
@@ -3,6 +3,8 @@ Configuring ethernet switches (S3 and S4 series)

* Edit the ``network/ethernet_tor_input.yml`` file for all S3* and S4* PowerSwitches such as S3048-ON, S4048T-ON, S4112F-ON, S4048-ON, S4048T-ON, S4112F-ON, S4112T-ON, and S4128F-ON.

.. warning:: Do not remove or comment any lines in the ``network/ethernet_tor_input.yml`` file.

+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Name | Details |
+============================+=====================================================================================================================================================================================+
@@ -4,6 +4,8 @@ Configuring ethernet switches (S5 series)

* Edit the ``network/ethernet_sseries_input.yml`` file for all S5* PowerSwitches such as S5232F-ON.

.. warning:: Do not remove or comment any lines in the ``network/ethernet_sseries_input.yml`` file.

+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Name | Details |
+============================+=====================================================================================================================================================================================+
@@ -7,6 +7,8 @@ Depending on the number of ports available on your Infiniband switch, they can b

Input the configuration variables into the ``network/infiniband_edr_input.yml`` or ``network/infiniband_hdr_input.yml`` as appropriate:

.. warning:: Do not remove or comment any lines in the ``network/infiniband_edr_input.yml`` or ``network/infiniband_hdr_input.yml`` file.

+-------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parameters | Details |
+=========================+========================================================================================================================================================================+
@@ -26,6 +26,8 @@ For automatic provisioning of servers and discovery, the BMC method can be used.

The following parameters need to be populated in ``input/provision_config.yml`` to discover target nodes using BMC.

.. warning:: Do not remove or comment any lines in the ``input/provision_config.yml`` file.

.. csv-table:: Parameters
:file: ../../../Tables/bmc.csv
:header-rows: 1
@@ -15,6 +15,8 @@ Manually collect PXE NIC information for target servers and manually define them

The following parameters need to be populated in ``input/provision_config.yml`` to discover target nodes using a mapping file.

.. warning:: Do not remove or comment any lines in the ``input/provision_config.yml`` file.

.. csv-table:: Parameters
:file: ../../../Tables/mapping.csv
:header-rows: 1
@@ -17,7 +17,9 @@ Use ``show snmp community`` to verify your changes.

.. note:: The commands provided above sets the SNMP community string of the switch to ``public``. Ensure that the community string set above matches the value provided in ``pxe_switch_snmp_community_string`` in ``input/provision_config.yml``
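
As a minimal sketch of the matching requirement in the note above, and assuming the switch community string was set to ``public`` as described, the corresponding entry in ``input/provision_config.yml`` might look like the fragment below. Only the parameter name is taken from the documentation; the rest of the file is not shown and its layout is not implied here.

```yaml
# input/provision_config.yml (illustrative fragment, not the full file)
# This value must match the SNMP community string configured on the switch.
pxe_switch_snmp_community_string: "public"
```

If the switch is configured with a different community string, the same value would need to be entered here so that the two stay in sync.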

.. warning:: Target servers with LOM architecture is not supported.
.. warning::
* Target servers with LOM architecture is not supported.
* Do not remove or comment any lines in the ``input/provision_config.yml`` file.

.. csv-table:: Parameters
:file: ../../../Tables/snmpwalk.csv
@@ -50,6 +50,8 @@ switch_based

The following parameters need to be populated in ``input/provision_config.yml`` to discover target nodes using a mapping file.

.. warning:: Do not remove or comment any lines in the ``input/provision_config.yml`` file.

.. csv-table:: Parameters
:file: ../../../Tables/switch-based.csv
:header-rows: 1
@@ -1,6 +1,9 @@
After running the provision tool
--------------------------------

This script is optional. To skip over to Slurm, Kubernetes, NFS, BeeGFS and Authentication setup, `click here <../BuildingClusters/index.html>`_.


Once the **servers are provisioned**, run the post provision script to:

* Create ``node_inventory`` in ``/opt/omnia`` listing provisioned nodes. ::
@@ -3,6 +3,7 @@ Input parameters for the provision tool

Fill in all provision-specific parameters in ``input/provision_config.yml``

.. warning:: Do not remove or comment any lines in the ``input/provision_config.yml`` file.

+----------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Parameter | Details |
@@ -16,27 +16,28 @@ Before you run the provision tool

2. `RHEL 8.x <https://www.redhat.com/en/enterprise-linux-8>`_


Note the compatibility between cluster OS and control plane OS below:

+---------------------+--------------------+------------------+
| | | |
| Control Plane OS | Compute Node OS | Compatibility |
+=====================+====================+==================+
| | | |
| RHEL | RHEL | Yes |
| RHEL [2]_ | RHEL | Yes |
+---------------------+--------------------+------------------+
| | | |
| RHEL | Rocky | Yes |
| RHEL [2]_ | Rocky | Yes |
+---------------------+--------------------+------------------+
| | | |
| Rocky | RHEL | Yes[1]_ |
| Rocky | RHEL | Yes [1]_ |
+---------------------+--------------------+------------------+
| | | |
| Rocky | Rocky | Yes |
+---------------------+--------------------+------------------+

.. [1] For a Rocky control plane and RHEL compute nodes, it is mandatory to populate ``rhel_repo_local_path`` in ``input/provision_config.yml``.
.. [2] Ensure that control planes running RHEL have an active subscription or are configured to access local repositories. The following repositories should be enabled on the control plane: **AppStream**, **Code Ready Builder (CRB)**, **BaseOS**.
* To set up CUDA and OFED using the provisioning tool, download the required repositories from here:

4 changes: 4 additions & 0 deletions input/accelerator_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# This variable accepts the amd gpu version for the RHEL specific OS version
# Verify if the version provided is present in the repo for the OS version on your node
# Verify the url for the compatible version: https://repo.radeon.com/amdgpu/
4 changes: 4 additions & 0 deletions input/login_node_security_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Maximum number of consecutive failures before lockout
# The default value of this variable can't be changed
# Default value: 3
4 changes: 4 additions & 0 deletions input/monitor_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Username for Dockerhub account
# This will be used for Docker login and a kubernetes secret will be created and patched to service account in default namespace.
# This kubernetes secret can be used to pull images from private repositories
4 changes: 4 additions & 0 deletions input/network_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Absolute path to local copy of .tgz file containing mlnx_ofed package.
# The package can be downloaded from https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/
# Optional variable.
4 changes: 4 additions & 0 deletions input/omnia_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Password used for Slurm database.
# The Length of the password should be at least 8.
# The password must not contain -,\, ',"
4 changes: 4 additions & 0 deletions input/passwordless_ssh_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# This variable accepts the user name for which passwordless ssh needs to be setup
user_name: ""
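
A hedged illustration of the instruction added at the top of this file: every parameter line stays in place and only its value is filled in. The value ``omnia_user`` below is a hypothetical placeholder, not a default shipped with Omnia.

```yaml
# Intended edit: the line stays, only the value changes.
user_name: "omnia_user"   # hypothetical user name

# Edits the file header warns against (shown as comments for illustration only):
#   deleting the user_name line entirely
#   commenting it out, as in:  # user_name: ""
```

The same pattern would apply to the other input files touched by this commit, since each one carries the same header comment.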

4 changes: 4 additions & 0 deletions input/provision_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Mandatory
# This variable is used to depict the network type for the omnia cluster
# Lom is supported by discovery_mechanism: mapping, bmc and switch_based
4 changes: 4 additions & 0 deletions input/rhsm_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# ---------SUBSCRIPTION MANAGER OPTIONS-----------

# Method to use for activation: portal or satellite.
4 changes: 4 additions & 0 deletions input/security_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Boolean indicating whether FreeIPA is required or not
# It can be set to true or false
# By default it is set to true indicating FreeIPA will be installed on all the nodes
4 changes: 4 additions & 0 deletions input/storage_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---


# NFS bolt-on support, USER have to mount EXTERNAL NFS server, omnia will mount NFS client
# This variable is used for supporting NFS bolt-on on login_node, compute, and manager nodes
4 changes: 4 additions & 0 deletions input/telemetry_config.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# This variable is used to enable iDRAC telemetry support and visualizations
# Accepted values: "true" or "false"
idrac_telemetry_support: true
4 changes: 4 additions & 0 deletions storage/nfs_server_input.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

# Mandatory field when nfs_node group is defined with an IP and omnia is required to configure nfs server.
# IP of Powervault connected to NFS Server should be provided.
# In a single run of omnia, only one NFS Server is configured.
4 changes: 4 additions & 0 deletions storage/powervault_input.yml
@@ -13,6 +13,10 @@
# limitations under the License.
---

# Do NOT remove or comment out any lines in this file. Simply append the required values against the parameters of your choice.

---

### Usage: powervault ###

# This variable indicates the protocol used by powervault to connect to NFS node.