Chapter 3. Configuring Compute nodes for performance

# (gibi): this chapter is good as is

3.1. Configuring CPU pinning on Compute nodes

# (gibi): mostly good. But the 3rd bullet point needs a reword as we don't have overcloud any more. I think we can even drop the 3rd bullet point

  1. Designate Compute nodes for CPU pinning.
  2. Configure the Compute nodes to reserve host cores for pinned instance vCPU processes, floating instance vCPU processes, and host processes.
  3. Deploy the overcloud.
  4. Create a flavor for launching instances that require CPU pinning.
  5. Create a flavor for launching instances that use shared, or floating, CPUs.
3.1.2. Designating Compute nodes for CPU pinning

# (gibi): we can simplify a lot here

# (gibi): the note makes sense but needs a bit of rewording

⚠️ Note The following procedure applies to new OpenStackDataPlaneNodeSet CRs that have not yet been provisioned. To reconfigure an OpenStackDataPlaneNodeSet that has already been provisioned, you must first drain the guest VMs from all the nodes in the NodeSet.
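For reference, draining a node typically means disabling the nova-compute service on that node and live migrating the remaining guests away. A minimal sketch with the OpenStack client (the host name and server ID are placeholders; a complete drain also waits for each migration to finish and confirms the node is empty):

    $ openstack compute service set --disable <hostname> nova-compute
    $ openstack server list --all-projects --host <hostname>
    $ openstack server migrate --live-migration <server-id>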

# (gibi): we can drop most of the original procedures here. We don't have a product-specific way to do HW inspection, so what I write below is a generic Linux procedure

  1. Collect the IP addresses of the compute nodes from the OpenStackDataPlaneNodeSet that you want to configure for CPU pinning:

    $ oc get OpenStackDataPlaneNodeSet/openstack-edpm -o go-template --template '{{range $_, $node := .spec.nodes}} {{$node.ansible.ansibleHost}} {{end}}'
    "192.168.122.100"
    "192.168.122.101"
  2. Check the available physical CPUs (pCPUs) and their NUMA placement on these hosts:

    $ IPS=$(oc get OpenStackDataPlaneNodeSet/openstack-edpm -o go-template --template '{{range $_, $node := .spec.nodes}} {{$node.ansible.ansibleHost}} {{end}}')
    $ for IP in $IPS; do echo $IP ; ssh root@$IP lscpu | grep -e 'NUMA node.* CPU(s):' ; done
    192.168.122.100
    NUMA node0 CPU(s):                  0-1,4-5
    192.168.122.101
    NUMA node0 CPU(s):                  2-3,6-7
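
    If simultaneous multithreading (SMT) is enabled, it also helps to record which logical CPUs are thread siblings, so that the shared and dedicated sets keep sibling threads together. A generic way to check this (plain lscpu, reusing $IPS from above):

    $ for IP in $IPS; do echo $IP ; ssh root@$IP lscpu -e=CPU,CORE,NODE ; done

    Logical CPUs that report the same CORE value are thread siblings.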

3.1.3. Configuring Compute nodes for CPU pinning

# (gibi): the paragraph and table before procedure looks good as is.

Procedure

# (gibi): the original step 2. can be dropped, the NUMATopologyFilter is already enabled by default in NG

  1. Create a ConfigMap with a nova-compute configuration snippet that configures the Compute nodes to reserve cores for pinned instances, floating instances, and host processes. For example, create nova-compute-cpu-pinning.yaml:

    apiVersion: v1
    data:
      25-nova-extra.conf: |
        [DEFAULT]
        reserved_host_memory_mb = <ram>

        [compute]
        cpu_shared_set = 2,6
        cpu_dedicated_set = 1,3,5,7
    kind: ConfigMap
    metadata:
      name: nova-compute-cpu-pinning
      namespace: openstack
    

    Use cpu_dedicated_set to reserve physical CPU cores for dedicated (pinned) instances, cpu_shared_set to reserve physical CPU cores for shared (floating) instances, and reserved_host_memory_mb to reserve RAM for host processes. Replace <ram> with the amount of RAM to reserve in MB.

    Create the ConfigMap from nova-compute-cpu-pinning.yaml:

    $ oc apply -f nova-compute-cpu-pinning.yaml
    configmap/nova-compute-cpu-pinning created
  2. Define an OpenStackDataPlaneService/nova-custom, based on the default OpenStackDataPlaneService/nova, that includes the new ConfigMap. The default OpenStackDataPlaneService/nova cannot be modified directly; any manual changes to it are overridden by the system. For example, create dataplaneservice_nova-custom.yaml:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneService
    metadata:
      name: nova-custom
      namespace: openstack
    spec:
      label: dataplane-deployment-nova
      playbook: osp.edpm.nova
      secrets:
      - nova-cell1-compute-config
      configMaps:
      - nova-compute-cpu-pinning
    

    Create the OpenStackDataPlaneService/nova-custom service:

    $ oc apply -f dataplaneservice_nova-custom.yaml
    openstackdataplaneservice.dataplane.openstack.org/nova-custom created
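
    Optionally, confirm that the custom service exists alongside the default one:

    $ oc get OpenStackDataPlaneService nova nova-custom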
  3. Modify the OpenStackDataPlaneNodeSet to use the newly created nova-custom service instead of the default nova service. If the current state is:

    $ oc get OpenStackDataPlaneNodeSet/openstack-edpm -o go-template --template '{{range $service := .spec.services}}{{println $service}}{{end}}'
    repo-setup
    download-cache
    configure-network
    validate-network
    install-os
    configure-os
    run-os
    ovn
    neutron-metadata
    libvirt
    nova
    telemetry

    then use oc edit OpenStackDataPlaneNodeSet/openstack-edpm and change the definition to look like this:

    $ oc get OpenStackDataPlaneNodeSet/openstack-edpm -o go-template --template '{{range $service := .spec.services}}{{println $service}}{{end}}'
    repo-setup
    download-cache
    configure-network
    validate-network
    install-os
    configure-os
    run-os
    ovn
    neutron-metadata
    libvirt
    nova-custom
    telemetry
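
    Alternatively, the same change can be applied non-interactively with a JSON patch. The index 10 below assumes nova is the eleventh entry in the services list, as in the example above; adjust it to match your actual list:

    $ oc patch OpenStackDataPlaneNodeSet/openstack-edpm --type json \
      -p='[{"op": "replace", "path": "/spec/services/10", "value": "nova-custom"}]'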
  4. Optional: To ensure that host processes do not run on the CPU cores reserved for instances, set the Ansible variable edpm_tuned_isolated_cores in the OpenStackDataPlaneNodeSet to the CPU cores that you have reserved for instances:

    $ oc patch -o yaml OpenStackDataPlaneNodeSet/openstack-edpm \
      -p='[{"op": "replace", "path": "/spec/nodeTemplate/ansible/ansibleVars/edpm_tuned_profile", "value":"cpu-partitioning"}]' \
      --type json
    $ oc patch -o yaml OpenStackDataPlaneNodeSet/openstack-edpm \
      -p='[{"op": "replace", "path":"/spec/nodeTemplate/ansible/ansibleVars/edpm_tuned_isolated_cores", "value":"1-3,5-7"}]' \
      --type json

    ⚠️ Note: If you apply this configuration, the compute nodes in the NodeSet need to be rebooted after the deployment succeeds. This reboot is not automated today.
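
    After the deployment succeeds and you have rebooted a node, you can check that the tuned profile is active and that the isolated cores appear on the kernel command line (plain tuned-adm, nothing product specific; the IP is from the earlier example):

    $ ssh root@192.168.122.100 tuned-adm active
    Current active profile: cpu-partitioning
    $ ssh root@192.168.122.100 cat /proc/cmdline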

  5. Define a new OpenStackDataPlaneDeployment that refers to the modified OpenStackDataPlaneNodeSet. For example, create deployment.yaml:

    apiVersion: dataplane.openstack.org/v1beta1
    kind: OpenStackDataPlaneDeployment
    metadata:
      name: openstack-edpm-2
      namespace: openstack
    spec:
      nodeSets:
      - openstack-edpm
    
  6. Execute the deployment:

    $ oc apply -f deployment.yaml
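
    You can watch the deployment until it completes, and then spot-check that the pinning configuration reached a compute node. The nova_compute container name and the config file path below are assumptions and may differ in your environment:

    $ oc get OpenStackDataPlaneDeployment/openstack-edpm-2 -w
    $ ssh root@192.168.122.100 podman exec nova_compute \
        cat /etc/nova/nova.conf.d/25-nova-extra.conf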

# (gibi): the rest of 3.1.x looks good as is but drop the reference to the "overcloud"

@igallagh-redhat

Fantastic, thank you @gibizer !!

@SeanMooney

The only real issue I have with this version is the reference to reserved_host_memory_mb.

We should either also reference the other parameters, like
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_huge_pages
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.reserved_host_disk_mb
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.file_backed_memory
or we should just remove reserved_host_memory_mb and cover it in a separate document.
That is not really related to configuring CPU pinning; it is about configuring host resource reservation.

As general feedback on the structure:
rather than just covering CPU pinning in 3.1, we could also include the other CPU-related options, like
cpu_mode=host-passthrough|host-model|custom
cpu_models=...
cpu_model_extra_flags
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_mode
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_models
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_model_extra_flags

We could also include Nova CPU governor/online state management here in 3.1:
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_power_management
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_power_management_strategy
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_power_governor_low
https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.cpu_power_governor_high

I would keep subchapter 3.1 strictly scoped to CPU configuration,
make 3.2 everything to do with memory,
3.3 everything related to disk configuration,
then add more chapters as appropriate.

I think that would be a more natural grouping rather than arbitrarily putting CPU pinning in its own chapter without the other related CPU features.

Chapter 3.2 (memory) could cover host memory reservation, hugepages, file-backed memory and swap.

3.3 (disk configuration) could cover image formats, image backends (qcow/rbd/flat/raw), disk reservation, and cache modes.

3.4 could cover everything device-passthrough related (PCI passthrough/SR-IOV, mdevs/vGPU).

@SeanMooney

We have tried to document much of the CPU topic upstream in
https://docs.openstack.org/nova/latest/admin/cpu-topologies.html and
https://docs.openstack.org/nova/latest/admin/cpu-models.html

Although those docs don't fully cover every CPU-related topic, they cover most of them.

If we want to provide better docs than upstream, I think this grouping would help operators better understand Nova: cover CPU, RAM and disk holistically in three chapters, and then go on to the more advanced topics later for functionality not directly related to CPU, RAM and disk, like device passthrough, vTPM, SEV, UEFI, etc.

@gibizer

gibizer commented Oct 24, 2023

@SeanMooney I agree with your sentiment here. However, I suggest doing such doc improvements as a separate step. Similar to how we tend not to commit refactoring and new features in the same commit, I would like to keep the pure translation of the doc from TripleO to NG separate from the new documentation structuring. I think it makes sense to file Jira(s) for your doc improvements. Those Jiras would be perfect for outlining the new structure by providing an expected table of contents. Jiras would also allow us (both dev and doc) to plan the actual work.

@igallagh-redhat

@gibizer WRT the rewritten note:

"The following procedure applies to new OpenStackDataPlaneNodeSet CRs that have not yet been provisioned. To reconfigure an existing OpenStackDataPlaneNodeSet that has already been provisioned, you must first drain the guest instances from all the nodes in the OpenStackDataPlaneNodeSet."

How likely is it that an admin would re-purpose an existing OpenStackDataPlaneNodeSet CR for this feature?

Would this note be better placed in the "Creating the data plane" chapter in the core install and deployment doc instead? Maybe in https://docs.google.com/document/d/1uEpLEF7fhH_ZvdTAwC6QSk3eax4HDRXgscX2EV5kAjM/edit#bookmark=id.50q542e1n3lt or https://docs.google.com/document/d/1uEpLEF7fhH_ZvdTAwC6QSk3eax4HDRXgscX2EV5kAjM/edit#bookmark=id.nuelvc6sqw8y?

@igallagh-redhat

@gibizer Does the 25-nova-extra.conf file already exist? If not, and we are creating it, are there rules around how to construct the filename?

@igallagh-redhat

@gibizer can a NodeSet only run one nova service? I.e. does the admin have to replace the nova service with the new custom cpu-pinning service, or can they add it in addition to the nova service?
