Configuring Cluster DNS
This section describes how to enable and configure Cluster DNS for dynamic hostname resolution in Omnia.
Prerequisites
Before enabling Cluster DNS, ensure the following:
- Omnia is deployed on the OIM node with OpenCHAMI services running
- input/network_spec.yml is configured with valid admin_network.dns entries for upstream DNS forwarding
- input/provision_config.yml exists and is validated
- The OIM node is accessible on the admin network
- SMD (State Manager Daemon) is running and accessible from the OIM node
Enabling Cluster DNS
To enable Cluster DNS for dynamic hostname resolution:
Edit the input/provision_config.yml file on the OIM node:
vi input/provision_config.yml
Set the dns_enabled parameter to true:
dns_enabled: true
Note
The default value is false, which preserves the legacy /etc/hosts behavior.
Validate the configuration using the input validator:
python3 common/library/module_utils/input_validation/input_validator.py -i input/
Ensure no validation errors are reported.
Deploy or redeploy OpenCHAMI with coresmd (if not already deployed):
ansible-playbook prepare_oim/prepare_oim.yml
Run the discovery playbook to populate SMD with node inventory:
ansible-playbook discovery/discovery.yml
Run the provisioning playbook to provision nodes with cloud-init containing /etc/resolv.conf:
ansible-playbook provision/provision.yml
Reprovision (reboot) all compute nodes to apply the new cloud-init configuration.
Important
Nodes must be reprovisioned (rebooted) after setting dns_enabled: true for the change to take effect. Existing nodes retain their previous configuration until reprovisioned.
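As a quick pre-flight check before running the playbooks, the setting can be read back from the configuration text. The following is a minimal sketch, not part of Omnia: the function name is hypothetical, it assumes dns_enabled is a top-level key in input/provision_config.yml, and it only recognizes the literal value true (it is not a full YAML parser).

```python
import re

def dns_enabled(config_text: str) -> bool:
    """Return True only if a top-level 'dns_enabled: true' line is present.

    Assumption: dns_enabled is a top-level key. A missing key falls back to
    the documented default (false).
    """
    match = re.search(r"^dns_enabled:\s*(\S+)", config_text, re.MULTILINE)
    if match is None:
        return False  # documented default
    value = match.group(1).strip("\"'").lower()
    return value == "true"

if __name__ == "__main__":
    with open("input/provision_config.yml") as handle:
        print("Cluster DNS enabled:", dns_enabled(handle.read()))
```

This does not replace the input validator; it only confirms that the edit was saved as intended.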
Disabling Cluster DNS (Reverting to /etc/hosts)
To revert to the legacy /etc/hosts behavior:
Edit input/provision_config.yml and set dns_enabled to false:
dns_enabled: false
Re-run the provisioning playbook to regenerate cloud-init configs:
ansible-playbook provision/provision.yml
Reprovision (reboot) all compute nodes to apply the new cloud-init configuration.
Verify that /etc/hosts contains all peer entries on compute nodes.
Verify that OIM and Slurm node /etc/hosts are updated by the playbook.
Note
No coresmd or OpenCHAMI changes are needed for rollback. coresmd continues running but compute nodes no longer query it.
Configuration Parameters
User-Facing Configuration
dns_enabled (boolean, default: false)
- Location: input/provision_config.yml
- When true, nodes use coresmd for hostname resolution instead of /etc/hosts
- DNS records are auto-generated from SMD inventory
- The cluster domain is read from OIM metadata (domain_name)
Existing Parameters Used
The following existing parameters are used by Cluster DNS:
admin_network.dns
- Location: input/network_spec.yml
- DNS forwarders for coresmd and K8s CoreDNS external resolution
- Used to forward non-cluster DNS queries (e.g., google.com)
admin_network.primary_oim_admin_ip
- Location: input/network_spec.yml
- Nameserver IP written to compute node /etc/resolv.conf
- The IP address that coresmd listens on for DNS queries
admin_network.additional_subnets
- Location: input/network_spec.yml
- Triggers multi-subnet CoreDHCP config format (if defined)
- Does not directly affect DNS configuration
domain_name
- Location: OIM metadata (set during prepare_oim.yml)
- Cluster domain used as DNS zone and search domain in resolv.conf
- Example: hpc.cluster
cluster_shortname
- Location: OpenCHAMI config
- Hostname pattern prefix (e.g., nid)
- Used to generate DNS record names
cluster_nidlength
- Location: OpenCHAMI config
- Zero-padded node ID length (e.g., 3 produces nid001)
- Used to generate DNS record names
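The way cluster_shortname, cluster_nidlength, and domain_name combine into a record name can be illustrated with a short sketch. The function name and default values below are illustrative assumptions taken from the examples in this section (nid, 3, hpc.cluster); this is not the code coresmd uses.

```python
def node_fqdn(nid: int, shortname: str = "nid", nidlength: int = 3,
              domain: str = "hpc.cluster") -> str:
    """Build a node FQDN as {shortname}{zero-padded id}.{domain}."""
    if nid < 0:
        raise ValueError("node ID must be non-negative")
    # Zero-pad the node ID to nidlength digits, e.g. 1 -> 001.
    return f"{shortname}{nid:0{nidlength}d}.{domain}"

print(node_fqdn(1))  # nid001.hpc.cluster
```

IDs wider than nidlength are not truncated, so node 1234 with a length of 3 yields nid1234.hpc.cluster in this sketch.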
Verification
After enabling Cluster DNS, verify the configuration using the following commands.
Verify Compute Node Resolver Configuration
On any compute node, verify that /etc/resolv.conf is configured correctly:
cat /etc/resolv.conf
Expected output:
search <domain_name>
nameserver <admin_nic_ip>
options timeout:1 attempts:2
Replace <domain_name> with your cluster domain (e.g., hpc.cluster) and <admin_nic_ip> with the OIM admin IP.
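When checking many nodes, the comparison against the expected resolver settings can be scripted. The sketch below parses resolv.conf text and checks the search domain and nameserver; the function names are hypothetical, and the sample values are the example domain and OIM admin IP used elsewhere in this section.

```python
def parse_resolv_conf(text: str) -> dict:
    """Collect search domains, nameservers, and options from resolv.conf text."""
    conf = {"search": [], "nameserver": [], "options": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        key, *values = line.split()
        if key == "nameserver":
            conf["nameserver"].extend(values[:1])
        elif key == "search":
            conf["search"] = values  # the last search line wins
        elif key == "options":
            conf["options"].extend(values)
    return conf

def resolver_matches(text: str, domain: str, oim_ip: str) -> bool:
    """True if the text lists the cluster domain and the OIM admin IP."""
    conf = parse_resolv_conf(text)
    return domain in conf["search"] and oim_ip in conf["nameserver"]

sample = "search hpc.cluster\nnameserver 172.16.107.254\noptions timeout:1 attempts:2\n"
print(resolver_matches(sample, "hpc.cluster", "172.16.107.254"))
```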
Verify No Peer Entries in /etc/hosts
On any compute node, verify that /etc/hosts contains only localhost entries:
cat /etc/hosts
Expected output should show only localhost entries (e.g., 127.0.0.1 localhost.localdomain localhost). No peer node entries should be present.
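The same check can be automated by listing any /etc/hosts line whose address is not a loopback address. This is a minimal sketch with a hypothetical function name; an empty result corresponds to the expected state described above.

```python
import ipaddress

def peer_entries(hosts_text: str) -> list:
    """Return /etc/hosts lines whose address is not a loopback address."""
    peers = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        address = line.split()[0]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # skip lines that do not start with a valid IP
        if not ip.is_loopback:
            peers.append(line)
    return peers

if __name__ == "__main__":
    with open("/etc/hosts") as handle:
        leftover = peer_entries(handle.read())
    print("peer entries found:" if leftover else "only loopback entries", leftover)
```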
Verify Forward DNS Resolution
On any compute node, test forward resolution for a cluster hostname:
getent hosts <hostname>
Replace <hostname> with a cluster node hostname, either the short name (e.g., nid001) or the FQDN (e.g., nid001.hpc.cluster).
Expected output:
<admin_ip> <hostname>.<domain>
Example:
172.16.0.1 nid001.hpc.cluster
Query coresmd Directly
From the OIM node or any node with network access to the OIM, query coresmd directly using dig:
dig <hostname>.<domain> @<admin_nic_ip>
Replace <hostname> with a cluster node hostname, <domain> with your cluster domain, and <admin_nic_ip> with the OIM admin IP.
The output should show an A record containing the node's admin IP address.
Example:
dig nid001.hpc.cluster @172.16.107.254
; <<>> DiG <<>> nid001.hpc.cluster @172.16.107.254
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 12345
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; QUESTION SECTION:
;nid001.hpc.cluster. IN A
;; ANSWER SECTION:
nid001.hpc.cluster. 30 IN A 172.16.0.1
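For scripted checks, the A records can be pulled out of captured dig output. The sketch below assumes the standard five-column answer format shown in the example (name, TTL, class, type, address); the function name is hypothetical.

```python
def a_records(dig_output: str) -> dict:
    """Map each owner name to the A record addresses found in dig output."""
    records = {}
    for line in dig_output.splitlines():
        fields = line.split()
        # Comment and question lines start with ';' and are ignored.
        if line.startswith(";") or len(fields) != 5:
            continue
        name, _ttl, rclass, rtype, value = fields
        if rclass == "IN" and rtype == "A":
            records.setdefault(name.rstrip("."), []).append(value)
    return records

sample = """;; QUESTION SECTION:
;nid001.hpc.cluster. IN A
;; ANSWER SECTION:
nid001.hpc.cluster. 30 IN A 172.16.0.1
"""
print(a_records(sample))
```

An empty dictionary means no A record was returned, which matches the NXDOMAIN case covered under Troubleshooting.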
Verify Kubernetes CoreDNS Patching (if K8s is Deployed)
If Kubernetes is deployed, verify that the K8s CoreDNS ConfigMap contains the forward zone:
kubectl -n kube-system get configmap coredns -o yaml
Look for a block similar to:
hpc.cluster:53 {
    errors
    cache 30
    forward . 172.16.107.254
}
Replace hpc.cluster with your cluster domain and 172.16.107.254 with your OIM admin IP.
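The check for the forward zone can be sketched as a small parser over the Corefile text. This assumes the Corefile has already been extracted from the ConfigMap as plain text (for example from the data.Corefile key), and it does not handle plugin blocks nested inside the zone block. The function name is hypothetical.

```python
import re

def forward_targets(corefile: str, domain: str) -> list:
    """Return the upstream addresses on the 'forward .' line of the domain's block."""
    block = re.search(
        r"^\s*" + re.escape(domain) + r"(?::\d+)?\s*\{(.*?)^\s*\}",
        corefile,
        re.MULTILINE | re.DOTALL,
    )
    if block is None:
        return []  # no server block for this domain
    forward = re.search(r"^\s*forward\s+\.\s+(.+)$", block.group(1), re.MULTILINE)
    return forward.group(1).split() if forward else []

corefile = """.:53 {
    errors
    forward . /etc/resolv.conf
}
hpc.cluster:53 {
    errors
    cache 30
    forward . 172.16.107.254
}
"""
print(forward_targets(corefile, "hpc.cluster"))
```

An empty list indicates that the zone block or its forward line is missing, which is the condition described under "K8s Pods Cannot Resolve Compute Hostnames".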
Verify K8s Pod Resolution (if K8s is Deployed)
From a Kubernetes pod, test resolution of a compute node hostname:
kubectl exec -it <pod> -- getent hosts <hostname>.<domain>
Replace <pod> with a pod name and <hostname>.<domain> with a cluster node hostname.
Expected output:
<admin_ip> <hostname>.<domain>
Verify Slurm Functionality
Verify that Slurm starts successfully:
sinfo
Expected output should show all nodes in the expected state (e.g., IDLE or ALLOCATED).
Run a test Slurm job:
srun -N <N> hostname
Replace <N> with the number of nodes to test.
The job should complete without DNS errors.
Verify MPI Functionality
Run a test MPI job:
mpirun -np 4 -host <host1>,<host2> hostname
Replace <host1> and <host2> with cluster node hostnames.
The job should complete without DNS timeouts.
Verify New Node Auto-Resolution
Add a new node to SMD via the discovery playbook:
ansible-playbook discovery/discovery.yml
Wait up to 30 seconds for coresmd to refresh its cache.
From any compute node, test resolution of the new node:
getent hosts <new_hostname>
The output should show the new node’s IP address without any further playbook run.
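The wait-and-retry step can be sketched as a polling loop bounded by the 30-second refresh window mentioned above. The function name, the 2-second interval, and the example hostname nid009 are illustrative assumptions; the resolver, sleep, and clock are injectable so the loop can be exercised without a live cluster.

```python
import socket
import time

def wait_for_resolution(hostname, timeout=30.0, interval=2.0,
                        resolve=socket.gethostbyname, sleep=time.sleep,
                        clock=time.monotonic):
    """Poll until hostname resolves or timeout seconds elapse.

    Returns the resolved IP address, or None if the deadline passes.
    """
    deadline = clock() + timeout
    while True:
        try:
            return resolve(hostname)
        except OSError:
            pass  # not resolvable yet (socket.gaierror is an OSError)
        if clock() >= deadline:
            return None
        sleep(interval)

if __name__ == "__main__":
    # nid009 is a placeholder for the newly discovered node.
    print(wait_for_resolution("nid009"))
```

If the function returns None after the full window, the checks under "DNS Queries Failing" are a reasonable next step.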
Troubleshooting
DNS Queries Failing
Symptom: getent hosts <hostname> returns no results or times out.
Possible Causes:
coresmd container is not running
OIM node is unreachable
dns_enabled is not set to true on the compute node
Node is not registered in SMD
Resolution Steps:
Check coresmd status on the OIM node:
podman ps | grep coresmd
If not running, start it:
podman start coresmd
Verify OIM network connectivity from the compute node:
ping <admin_nic_ip>
Verify that /etc/resolv.conf is configured correctly on the compute node:
cat /etc/resolv.conf
Verify that the node is registered in SMD:
curl -k https://<oim_ip>:8443/v1/nodes | jq '.[] | select(.hostname=="<hostname>")'
Check coresmd logs for errors:
podman logs coresmd
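Where jq is not installed, the SMD registration check above can be done with a few lines of Python on the saved curl response. The sketch assumes the same response shape that the jq filter implies (a JSON array of objects with a hostname field); that shape and the function name are assumptions, not a documented SMD contract.

```python
import json

def find_node(nodes_json: str, hostname: str) -> list:
    """Return the node objects whose 'hostname' field equals hostname."""
    nodes = json.loads(nodes_json)
    return [
        node for node in nodes
        if isinstance(node, dict) and node.get("hostname") == hostname
    ]

# Illustrative payload only; real SMD responses may carry other fields.
sample = '[{"hostname": "nid001", "ip": "172.16.0.1"}, {"hostname": "nid002"}]'
print(find_node(sample, "nid001"))
```

An empty list means the node is not registered under that hostname.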
NXDOMAIN Errors
Symptom: DNS queries return NXDOMAIN (non-existent domain).
Possible Causes:
Node is not registered in SMD
Domain name mismatch
Incorrect hostname format
Resolution Steps:
Verify SMD inventory:
curl -k https://<oim_ip>:8443/v1/nodes
Verify the domain name in OIM metadata matches the query domain:
cat /etc/resolv.conf  # on compute node
# Check the 'search' domain
Verify the hostname format follows the pattern {cluster_shortname}{zero_padded_id}.{cluster_domain}
Check coresmd Corefile configuration:
podman exec coresmd cat /etc/coredns/Corefile
Slow DNS Resolution
Symptom: DNS queries take more than 1 second to respond.
Possible Causes:
coresmd cache miss (first lookup)
SMD API is slow or unreachable
Network latency between compute node and OIM
Resolution Steps:
Check if this is a cache miss by running the query twice (second should be fast)
Check SMD connectivity from coresmd:
podman exec coresmd curl -k https://<smd_url>:8443/v1/nodes
Check network latency:
ping <admin_nic_ip>
Monitor coresmd cache metrics:
curl http://<admin_nic_ip>:9153/metrics | grep coredns_cache
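To turn the raw metrics into a single number, the cache hit ratio can be computed from the saved /metrics output. The sketch below parses the Prometheus text format; the metric names coredns_cache_hits_total and coredns_cache_misses_total are assumptions based on the CoreDNS cache plugin and should be checked against what your coresmd build actually exports. The function names are hypothetical.

```python
def metric_total(metrics_text: str, name: str) -> float:
    """Sum all samples of one metric across its label sets."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        if "{" in line:
            metric = line[:line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            metric, _, rest = line.partition(" ")
        if metric == name:
            total += float(rest.split()[0])
    return total

def cache_hit_ratio(metrics_text: str,
                    hits="coredns_cache_hits_total",
                    misses="coredns_cache_misses_total"):
    """Return hits / (hits + misses), or None if no lookups were recorded."""
    hit_count = metric_total(metrics_text, hits)
    lookups = hit_count + metric_total(metrics_text, misses)
    return hit_count / lookups if lookups else None
```

A low ratio shortly after startup is expected; a persistently low ratio suggests most queries are falling through to SMD.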
K8s Pods Cannot Resolve Compute Hostnames
Symptom: kubectl exec <pod> -- getent hosts <hostname> fails.
Possible Causes:
K8s CoreDNS ConfigMap was not patched
Forward zone is missing or incorrect
Pod is using host network and lacks resolver configuration
Resolution Steps:
Verify K8s CoreDNS ConfigMap contains the forward zone:
kubectl -n kube-system get configmap coredns -o yaml
If missing, manually patch the ConfigMap or reprovision the first control plane node
Check if the pod is using host network:
kubectl get pod <pod> -o jsonpath='{.spec.hostNetwork}'
If true, the pod uses the node’s resolver configuration.
Mixed-State Cluster
Symptom: Some nodes resolve via DNS while others use /etc/hosts.
Possible Causes:
Only some nodes were reprovisioned after changing dns_enabled
Inconsistent cloud-init configurations
Resolution Steps:
Check /etc/resolv.conf on affected nodes to determine which mode they are using
Reprovision all nodes to ensure consistent configuration:
ansible-playbook provision/provision.yml
Reboot all nodes to apply the new cloud-init configuration
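To find which nodes still need reprovisioning, the resolv.conf contents collected from each node can be grouped by mode. The sketch below uses a simple heuristic that is an assumption rather than an Omnia rule: a node is treated as being in DNS mode when its resolv.conf lists the OIM admin IP as a nameserver. How the file contents are collected (for example over SSH) is left out.

```python
def dns_mode(resolv_conf_text: str, oim_ip: str) -> str:
    """Return 'dns' if the resolver points at the OIM admin IP, else 'hosts'."""
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver" and fields[1] == oim_ip:
            return "dns"
    return "hosts"

def group_by_mode(resolv_by_node: dict, oim_ip: str) -> dict:
    """Group node names by detected resolution mode."""
    groups = {"dns": [], "hosts": []}
    for node, text in sorted(resolv_by_node.items()):
        groups[dns_mode(text, oim_ip)].append(node)
    return groups

collected = {
    "nid001": "search hpc.cluster\nnameserver 172.16.107.254\n",
    "nid002": "nameserver 8.8.8.8\n",
}
print(group_by_mode(collected, "172.16.107.254"))
```

If both groups are non-empty, the cluster is in a mixed state and the nodes in the unexpected group are the ones to reprovision.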
Best Practices
Plan DNS Mode Before Deployment
- Decide on DNS mode (/etc/hosts vs DNS) before initial cluster deployment
- Changing mode after deployment requires reprovisioning all nodes
Monitor coresmd Health
- Monitor coresmd container status and logs
- Use Prometheus metrics (port 9153) to track DNS query performance
- Set up alerts for coresmd downtime
Configure Reliable Upstream DNS
- Configure at least two reliable upstream DNS servers in admin_network.dns
- Test upstream DNS connectivity before enabling Cluster DNS
- Monitor upstream DNS server availability
Test Resolution Before Production
- Verify DNS resolution from compute nodes before running production workloads
- Test Slurm and MPI job execution with DNS enabled
- Verify K8s pod resolution if Kubernetes is deployed
Document Domain Configuration
- Record the cluster domain name (domain_name) for reference
- Document the hostname pattern (cluster_shortname and cluster_nidlength)
- Share this information with cluster users for hostname reference
Plan for High Availability
- In the current implementation, the OIM node is a single point of failure for DNS
- Plan for OIM HA deployment when high availability is required
- Monitor OIM node health and have a recovery plan
Use Short-Name Resolution
- Leverage the search <domain_name> directive in /etc/resolv.conf
- Users can use short hostnames (e.g., nid001) instead of FQDNs (e.g., nid001.hpc.cluster)
- Simplifies Slurm and MPI job configuration
Validate After Node Changes
- After adding or removing nodes, verify DNS resolution within 30 seconds
- Check SMD inventory to confirm node registration
- Use dig or getent hosts to test resolution
Limitations and Considerations
- Be aware that reverse DNS (PTR records) is not supported
- Plan for workarounds if applications require reverse DNS
- Note that InfiniBand-specific DNS is not provided
- Ensure MPI workloads use UCX auto-detection for IB transport
Migration from /etc/hosts to DNS
To migrate an existing cluster from /etc/hosts to DNS:
Back Up Current Configuration
- Document current /etc/hosts entries on a sample node
- Record any manual hostname entries that may need special handling
Enable DNS Mode
- Set dns_enabled: true in input/provision_config.yml
- Validate the configuration
Reprovision Nodes
- Run ansible-playbook provision/provision.yml
- Reprovision all nodes (reboot into cloud-init)
- Monitor node boot and cloud-init execution
Verify DNS Resolution
- Test resolution from each node type (compute, Slurm controller, login, K8s)
- Verify Slurm functionality with sinfo and test jobs
- Verify MPI job execution
- Verify K8s pod resolution if applicable
Clean Up Stale /etc/hosts Entries
- After verification, /etc/hosts entries are no longer needed
- The playbook skips /etc/hosts updates when DNS is enabled
- Manual cleanup is not required but can be performed if desired
Update Documentation
- Update cluster documentation to reflect DNS mode
- Inform users about the change in hostname resolution method
- Provide troubleshooting guidance for DNS-related issues
Rollback from DNS to /etc/hosts
To roll back from DNS to /etc/hosts:
Disable DNS Mode
- Set dns_enabled: false in input/provision_config.yml
- Validate the configuration
Reprovision Nodes
- Run ansible-playbook provision/provision.yml
- Reprovision all nodes (reboot into cloud-init)
- Monitor node boot and cloud-init execution
Verify /etc/hosts Entries
- Verify that /etc/hosts contains all peer entries on compute nodes
- Verify that OIM and Slurm node /etc/hosts are updated by the playbook
Verify Functionality
- Test resolution from each node type using getent hosts
- Verify Slurm functionality
- Verify MPI job execution
- Verify K8s functionality (pods use node’s /etc/hosts)
Update Documentation
- Update cluster documentation to reflect /etc/hosts mode
- Inform users about the change in hostname resolution method
Note
coresmd continues running after rollback but compute nodes no longer query it. No coresmd or OpenCHAMI changes are needed for rollback.
If you have any feedback about Omnia documentation, please reach out at omnia.readme@dell.com.