Step 2: Create groups and assign functional roles to the nodes
In Omnia, nodes are organized by group and by functional group. Combining the two gives you a flexible way to manage large-scale node infrastructures: groups capture the physical layout of the hardware, and functional groups capture the logical role of each node.
A group is based on the physical characteristics of the nodes: it contains nodes that are located in the same place or have similar hardware. For example, the nodes in one rack or SU (Scalable Unit) might form a group, which is then assigned a functional group such as Service Kube Node or Slurm Control Node. Groups help with the physical organization and management of nodes.
A functional group defines what a node does in the system. It is a way to categorize nodes based on their functionality. Functional groups help group nodes that perform similar tasks, making it easier to manage and assign resources. For example, a node could belong to a functional group such as:
Service Kube Node
Login Node
Login Compiler Node
Slurm Control Node
Slurm Node
Both functional groups and groups must be configured in the functional_groups_config.yml input file. This file defines how nodes are organized in Omnia, including their functional roles and group assignments.
Create Groups
Nodes that are located in the same place or have similar hardware can be grouped together. To do so, update the functional_groups_config.yml input file in the /opt/omnia/input/project_default directory. This file holds all necessary attributes for the nodes, based on their role within the cluster. Each group has the following attributes, as listed in the table below:
| Attribute | Mandatory/Conditional mandatory/Optional | Description |
|---|---|---|
| Group Name | Mandatory | |
| Location of the node | Mandatory | Note: This attribute is case-sensitive. Use uppercase characters only. |
| Parent of the node ("parent") | Conditional mandatory | |
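As a rough illustration of these attributes, a groups section might look like the sketch below. The keys location_id and parent and the value SU-1.RACK-1 are taken from the sample at the end of this page. The group names and the value SU-1.RACK-2 are assumptions chosen only to show the uppercase requirement, and parent is left empty as in that sample.

```yaml
groups:
  grp0:
    # Location values are case-sensitive: uppercase characters only
    location_id: SU-1.RACK-1
    parent: ""
  grp1:
    # Assumed second rack, shown only for illustration
    location_id: SU-1.RACK-2
    parent: ""
```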
Create Functional Groups
Nodes with similar functional roles or functionalities can be grouped together. The following table lists the functional groups available in Omnia.
Note
At least one functional group is mandatory, and you must not change the names of the functional groups.
Each group name must be unique across all functional groups in the functional_groups_config.yml file.
Functional group names are case-sensitive.
Omnia supports HA functionality for the service_cluster. For more information, see the HA section of the Omnia documentation.
To set up a service cluster, service_kube_node must be present in /opt/omnia/input/project_default/functional_groups_config.yml.
| Functional Group Name | Layer | Details (example entry under functional_groups) |
|---|---|---|
| Slurm control plane | Management | `name: "slurm_control_node_x86_64"`, `cluster_name: "slurm_cluster"`, `group: [grp0]` |
| Slurm worker node (x86_64) | Compute | `name: "slurm_node_x86_64"`, `cluster_name: "slurm_cluster"`, `group: [grp1]` |
| Slurm worker node (aarch64) | Compute | `name: "slurm_node_aarch64"`, `cluster_name: "slurm_cluster"`, `group: [grp2]` |
| Service Cluster Kubernetes worker node | Management | `name: "service_kube_node_x86_64"`, `cluster_name: "service_k8s_cluster"`, `group: [grp3]` |
| Login node (x86_64) | Management | `name: "login_node_x86_64"`, `cluster_name: "slurm_cluster"`, `group: [grp4]` |
| Login node (aarch64) | Management | `name: "login_node_aarch64"`, `cluster_name: "slurm_cluster"`, `group: [grp5]` |
| Login and Compiler node (x86_64) | Management | `name: "login_compiler_node_x86_64"`, `cluster_name: "slurm_cluster"`, `group: [grp6]` |
| Login and Compiler node (aarch64) | Management | `name: "login_compiler_node_aarch64"`, `cluster_name: "slurm_cluster"`, `group: [grp7]` |
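Several functional groups can be listed in the same file. The sketch below combines three entries from the table in block form. It assumes that grp0, grp1, and grp4 are defined under groups in the same file; the pairing of groups to roles is illustrative, not a recommendation.

```yaml
functional_groups:
  # Slurm control plane (Management layer)
  - name: "slurm_control_node_x86_64"
    cluster_name: "slurm_cluster"
    group:
      - grp0
  # Slurm worker nodes (Compute layer)
  - name: "slurm_node_x86_64"
    cluster_name: "slurm_cluster"
    group:
      - grp1
  # Login node (Management layer)
  - name: "login_node_x86_64"
    cluster_name: "slurm_cluster"
    group:
      - grp4
```

Note the space after the dash in each list item: "- grp1" is a list entry, whereas "-grp1" would be read by YAML as a plain string.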
Recommended Software by functional groups
The following table lists the functional groups along with the recommended software to be deployed on each group.
Caution
Ensure that the software_config.json file contains all required inputs for the software to be deployed on each functional group. For more information, see Input parameters for Local Repositories.
| Functional Group Name | Recommended Software |
|---|---|
| service_kube_node_x86_64 | service_k8s.json, nfs.json |
| slurm_control_node_x86_64 | slurm_custom.json, nfs.json, openldap.json |
| slurm_node_x86_64 | slurm_custom.json, nfs.json, openldap.json |
| slurm_node_aarch64 | slurm_custom.json, nfs.json, openldap.json |
| login_node_x86_64 | slurm_custom.json, nfs.json, openldap.json |
| login_node_aarch64 | slurm_custom.json, nfs.json, openldap.json |
| login_compiler_node_x86_64 | slurm_custom.json, nfs.json, openldap.json, ucx.json, openmpi.json |
| login_compiler_node_aarch64 | slurm_custom.json, nfs.json, openldap.json, ucx.json, openmpi.json |
Sample
Here’s a sample (using mapping file) for your reference:
groups:
  grp0:
    location_id: SU-1.RACK-1
    parent: ""
  grp1:
    location_id: SU-1.RACK-1
    parent: ""
functional_groups:
  - name: "slurm_control_node_x86_64"
    cluster_name: "slurm_cluster"
    group:
      - grp0
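If a service cluster is also being set up, the same file would need a service_kube_node entry, as the note above requires. The following is a minimal sketch under that assumption. The group grp3 and its location are illustrative and should be replaced with values that match your hardware.

```yaml
groups:
  grp0:
    location_id: SU-1.RACK-1
    parent: ""
  grp3:
    # Assumed location; use the rack that actually hosts these nodes
    location_id: SU-1.RACK-1
    parent: ""
functional_groups:
  - name: "slurm_control_node_x86_64"
    cluster_name: "slurm_cluster"
    group:
      - grp0
  # Required when a service cluster is configured
  - name: "service_kube_node_x86_64"
    cluster_name: "service_k8s_cluster"
    group:
      - grp3
```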
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.