Deploy iDRAC telemetry service on the service cluster

To deploy the telemetry service on the service cluster and collect iDRAC telemetry data using Kafka, follow the steps in this guide.

Prerequisites

  • Redfish must be enabled in iDRAC.

  • If an internet connection is required on the service kube node, configure it after the node has booted.

  • All service cluster nodes should have access to the Internet.

  • iDRAC firmware must be updated to the latest version.

  • The iDRAC Datacenter license must be installed on the nodes.

  • Ensure that the correct node service tags are being displayed on the iDRAC interface. Otherwise, telemetry data cannot be collected by the idrac_telemetry_receiver container.

  • For telemetry collection on the service cluster, all BMC (iDRAC) IPs must be reachable from the service cluster nodes.

  • Ensure that the service_k8s_cluster.yml playbook has been executed successfully and Kubernetes on the service K8s controller node is up and running.

  • Ensure that the discovery.yml playbook has been executed successfully with service_kube_node_x86_64 in the functional_groups_config.yml file, and that the bmc_group_data.csv file has been generated.

  • Before running the telemetry.yml playbook for the service cluster, ensure that all service K8s compute nodes are booted, reachable, and configured in the service K8s cluster.
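
The BMC reachability prerequisite can be spot-checked from a service cluster node before running any playbook. The following is a minimal Python sketch and is not part of Omnia: the helper names are hypothetical, and it assumes that iDRAC serves HTTPS on the default port 443. The only fixed value it relies on is the standard Redfish service root path, /redfish/v1/. A successful TCP connection shows that the port is reachable; it does not prove that Redfish is enabled or that credentials are valid.

```python
import socket


def redfish_root_url(bmc_ip: str) -> str:
    """Return the standard Redfish service root URL for a BMC address."""
    return f"https://{bmc_ip}/redfish/v1/"


def is_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unroutable addresses.
        return False


def unreachable_bmcs(bmc_ips, port: int = 443):
    """Return the subset of BMC addresses that did not accept a connection."""
    return [ip for ip in bmc_ips if not is_reachable(ip, port)]
```

Calling unreachable_bmcs with the list of BMC addresses returns the ones that need attention; an empty list means every listed BMC accepted the connection.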

Steps

  1. In the functional_groups_config.yml file, specify the service tag of the service kube node as the parent for the slurm nodes.

  2. Fill in the omnia_config.yml and telemetry_config.yml files:

    omnia_config.yml

    Variables (Mandatory/Optional) and details

    cluster_name (Mandatory)
    • Type: String
    • Name of the cluster on which you want to deploy Kubernetes.
    • This input is case-sensitive. Do not add any special characters except _ (underscore) in the cluster name.

    deployment (Mandatory)
    • Type: Boolean
    • Indicates whether Kubernetes will be deployed.
    • Accepted values: true or false

    k8s_cni (Mandatory)
    • Type: String
    • Kubernetes SDN network.
    • Accepted values: calico
    • Default value: calico

    pod_external_ip_range (Mandatory)
    • Type: String
    • These addresses will be used by the load balancer for assigning external IPs to Kubernetes services.
    • Ensure that the IP range provided is not assigned to any node in the cluster.
    • Sample values: 172.16.107.170-172.16.107.200

    k8s_service_addresses (Optional)
    • Type: String
    • Kubernetes internal network for services.
    • This network must be unused in your network infrastructure.
    • Default value: "10.233.0.0/18"

    k8s_pod_network_cidr (Optional)
    • Type: String
    • Kubernetes pod network CIDR for the internal network. When used, IP addresses from this range are assigned to individual pods.
    • This network must be unused in your network infrastructure.
    • Default value: "10.233.64.0/18"

    topology_manager_policy (Optional)
    • Type: String
    • Kubernetes Topology Manager policy.
    • Accepted values: none, best-effort, restricted, or single-numa-node.
    • Default value: none
    • Example: topology_manager_policy: "none"

    topology_manager_scope (Optional)
    • Type: String
    • Kubernetes Topology Manager scope.
    • Accepted values: container or pod.
    • Default value: container
    • Example: topology_manager_scope: "container"

    k8s_offline_install (Optional)
    • Accepted value: true
    • With the variable set to true, all packages and images necessary to set up a Kubernetes cluster are pulled from the OIM local repository.

    csi_powerscale_driver_secret_file_path (Optional)
    • Type: File path
    • If you want to deploy the CSI driver for PowerScale on your service cluster, add the file path of the secrets.yaml file to this variable.

    csi_powerscale_driver_values_file_path (Optional)
    • Type: File path
    • If you want to deploy the CSI driver for PowerScale on your service cluster, add the file path of the values.yaml file to this variable.

    nfs_storage_name (Mandatory)
    • Type: String
    • Must match an nfs_name value defined in storage_config.yml.

    telemetry_config.yml

    Parameter (Mandatory/Optional) and details

    idrac_telemetry_support (Mandatory)
    • Type: Boolean
    • If you want iDRAC telemetry support on your service cluster, set this variable to true before executing the telemetry.yml or service_k8s_cluster.yml playbook.
    • Accepted values: true or false
    • Default value: false

    Note

    If idrac_telemetry_support is set to true, the mysqldb_user, mysqldb_password, and mysqldb_root_password parameters in the omnia_config_credentials.yml file become mandatory.

    kafka_configurations > persistence_size (Conditional Mandatory)
    • Type: String
    • The amount of storage allocated for Kafka’s persistent volume.
    • Accepted values: a number followed by Ki, Mi, Gi, Ti, Pi, or Ei.
    • Default value: 8Gi

    kafka_configurations > log_retention_hours (Conditional Mandatory)
    • Type: Integer
    • The number of hours to retain Kafka logs before they are deleted.
    • Default value: 168 (7 days)

    kafka_configurations > log_retention_bytes (Conditional Mandatory)
    • Type: Integer
    • The maximum size of the Kafka log (in bytes) before old log segments are deleted.
    • Default value: -1 (unlimited)

    kafka_configurations > log_segment_bytes (Conditional Mandatory)
    • Type: Integer
    • The maximum size of a single Kafka log segment (in bytes) before a new segment is started.
    • Default value: 1073741824 (1 GiB)

    kafka_configurations > topic_partitions (Conditional Mandatory)
    • Type: Integer
    • The number of partitions for each Kafka topic.
    • Increasing this can improve throughput but also increases storage use and overhead.
    • Default value: 1

  3. Execute the telemetry.yml playbook.

    cd telemetry
    ansible-playbook telemetry.yml -i inventory
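
Before running the playbook, some of the values entered in omnia_config.yml and telemetry_config.yml can be checked offline. The following Python sketch is illustrative only and works under stated assumptions: the function names are hypothetical, the values are passed in directly rather than read from the Omnia files, and treating an overlap between pod_external_ip_range and the Kubernetes internal networks as an error is one reading of the requirement that those networks be unused elsewhere. The size check accepts whole numbers only, which is stricter than the general Kubernetes quantity syntax.

```python
import ipaddress
import re

# Binary suffixes listed for kafka_configurations > persistence_size.
_SIZE_RE = re.compile(r"^[1-9][0-9]*(Ki|Mi|Gi|Ti|Pi|Ei)$")


def parse_ip_range(value: str):
    """Split 'start-end' into two IPv4 addresses, rejecting reversed ranges."""
    start_text, end_text = value.split("-")
    start = ipaddress.IPv4Address(start_text.strip())
    end = ipaddress.IPv4Address(end_text.strip())
    if start > end:
        raise ValueError(f"range start {start} is after end {end}")
    return start, end


def range_overlaps_cidr(ip_range: str, cidr: str) -> bool:
    """Return True if any address in ip_range falls inside cidr."""
    start, end = parse_ip_range(ip_range)
    net = ipaddress.ip_network(cidr, strict=True)
    return start <= net.broadcast_address and end >= net.network_address


def is_valid_persistence_size(value: str) -> bool:
    """Return True for a positive whole number followed by a binary suffix."""
    return _SIZE_RE.match(value) is not None
```

With the sample and default values from this page, range_overlaps_cidr("172.16.107.170-172.16.107.200", "10.233.0.0/18") reports no overlap, and is_valid_persistence_size("8Gi") accepts the default size.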
    

Sample telemetry inventory:

[kube_control_plane]
192.168.10.151 bmc_ip=172.10.5.73
192.168.10.152 bmc_ip=172.10.5.74
192.168.10.153 bmc_ip=172.10.5.75

Note

For all nodes in the kube_control_plane group, ensure that the BMC IP address is defined using the bmc_ip variable, in addition to the admin IP address.
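
The requirement in this note can be checked mechanically before the playbook is run. The sketch below is a simplified, hypothetical parser for the INI-style sample inventory; it is not Ansible's own inventory loader. It handles only plain "host key=value" lines under a [group] header and ignores features such as :children or :vars sections.

```python
def hosts_missing_bmc_ip(inventory_text: str, group: str = "kube_control_plane"):
    """Return hosts in the given group that do not define bmc_ip=<address>."""
    missing = []
    current_group = None
    for raw_line in inventory_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            current_group = line[1:-1]
            continue
        if current_group != group:
            continue
        host, *pairs = line.split()
        host_vars = dict(p.split("=", 1) for p in pairs if "=" in p)
        if not host_vars.get("bmc_ip"):
            missing.append(host)
    return missing


# Same layout as the sample inventory, with bmc_ip deliberately left off the last host.
sample = """\
[kube_control_plane]
192.168.10.151 bmc_ip=172.10.5.73
192.168.10.152 bmc_ip=172.10.5.74
192.168.10.153
"""
print(hosts_missing_bmc_ip(sample))  # ['192.168.10.153']
```

An empty list means every host in the group carries a bmc_ip value.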

Result

The iDRAC telemetry pods, along with the mysqldb, activemq, telemetry_receiver, and kafka_pump containers, are deployed on the service_kube_node. The number of iDRAC telemetry pods deployed equals the number of service_kube_nodes listed as parents in functional_groups_config.yml, plus one extra telemetry pod that collects the metric data of the OIM, the management layer nodes, and the service cluster.
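
As a worked example of the pod count described here, the arithmetic can be sketched in Python. The function name and the input shape (a plain list of parent service tags, one per functional group entry) are assumptions for illustration, and the service tags shown are made up. The sketch also assumes that a service kube node listed as a parent several times is counted once.

```python
def expected_idrac_telemetry_pods(parent_service_tags) -> int:
    """Count distinct non-empty parent tags, plus one pod for OIM and management nodes."""
    distinct_parents = {
        tag.strip() for tag in parent_service_tags if tag and tag.strip()
    }
    return len(distinct_parents) + 1


# Hypothetical tags: two distinct parents, one repeated, one entry with no parent.
print(expected_idrac_telemetry_pods(["ABC1234", "ABC1234", "XYZ5678", ""]))  # 3
```

Comparing this number with the pods returned for the telemetry namespace gives a quick check that the deployment matches the configuration.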

iDRAC telemetry logs collected by the Kafka pump

After applying the telemetry.yml configuration using the Kafka collection type, iDRAC telemetry logs are published to a Kafka topic on the broker. To view the logs, do the following:

  1. Use the following command to view all telemetry pods:

    kubectl get pods -n telemetry
    
  2. Run the following command to access the Kafka pod from which you want to read the logs.

    kubectl exec -it <kafka-pod> -n telemetry -- bash
    
  3. To read the telemetry logs from the Kafka pod, run the following Kafka console consumer script. For details on using the Kafka consumer, see the Kafka console consumer documentation.

    kafka-console-consumer.sh \
     --bootstrap-server localhost:9092 \
     --topic idrac_telemetry \
     --from-beginning \
     --consumer.config /tmp/client.properties
    

Note

Metrics visualization using Grafana is not supported for iDRAC telemetry metrics on the service cluster.

Accessing the mysqldb database

After telemetry.yml has been executed for the service cluster, you can inspect the mysqldb database inside the mysqldb container. To access the database, do the following:

  1. Use the following command to get the names of all the telemetry pods:

    kubectl get pods -n telemetry -l app=idrac-telemetry
    

Note

The idrac-telemetry-0 pod will always be responsible for collecting the telemetry data of the management nodes (oim, service_kube_control_plane, service_kube_node_x86_64, login_node_x86_64, etc.).

  2. Execute the following command:

    kubectl exec -it -n telemetry <iDRAC_telemetry_pod_name> -c mysqldb -- mysql -u <MYSQL_USER> -p
    
  3. When prompted, enter the MySQL password to log in.

  4. To select the idrac_telemetrydb database, use the following command:

    use idrac_telemetrydb;
    
  5. To access the services table:

    SELECT * FROM services;
    
