Last active
December 2, 2021 12:41
kubeconf RDMA setup
server side: ib_send_bw -d mlx5_2 -i 1 -x 0
    mlx5_2 is the PF or VF device name,
    -x (GID index) is needed for SR-IOV.
client side: ib_send_bw -d mlx5_4 -i 1 10.56.217.100 -x 0
    mlx5_4 is the VF device name inside the pod,
    10.56.217.100 is the server-side IP address of the VF or PF,
    -x (GID index) is needed for SR-IOV.
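The two command lines above differ only in the device name and in the server address that the client appends. A minimal sketch that keeps those values in one place follows; the variable names are illustrative, the values are copied from these notes, and the script only prints the commands rather than running them.

```shell
#!/usr/bin/env bash
# Values copied from the notes above; change them to match your environment.
SERVER_DEV=mlx5_2        # PF or VF device name on the server side
CLIENT_DEV=mlx5_4        # VF device name inside the pod
SERVER_IP=10.56.217.100  # IP address of the server-side VF or PF
IB_PORT=1                # value passed to -i
GID_INDEX=0              # value passed to -x, needed for SR-IOV

# Build (but do not run) the two command lines.
server_cmd="ib_send_bw -d ${SERVER_DEV} -i ${IB_PORT} -x ${GID_INDEX}"
client_cmd="ib_send_bw -d ${CLIENT_DEV} -i ${IB_PORT} ${SERVER_IP} -x ${GID_INDEX}"

echo "server: ${server_cmd}"
echo "client: ${client_cmd}"
```

Start the server command first, then run the client command from inside the pod.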
Errors when running ib_send_bw in a pod:
1) Memory error: the ib_send_bw application running inside the pod cannot pin enough memory
   client side error:
     Couldn't allocate MR
     failed to create mr
     Failed to create MR
     Couldn't create IB resources
   server side error:
     local address: LID 0000 QPN 0x00e4 PSN 0xbbb319
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:56:217:100
     ethernet_read_keys: Couldn't read remote address
     Unable to read to socket/rdam_cm
     Failed to exchange data between server and clients
   solution: assign enough hugepage memory when creating the pod.
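Before rerunning the test, it can help to check from inside the pod what memory is actually available for pinning. The sketch below is only a diagnostic aid, not part of the original setup: it prints the locked-memory limit and the hugepage counters from /proc/meminfo. `meminfo_field` is a hypothetical helper introduced here for illustration.

```shell
#!/usr/bin/env bash
# Print the value of one field from a meminfo-style file.
# Usage: meminfo_field FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_field() {
    local field=$1 file=${2:-/proc/meminfo}
    awk -v f="${field}:" '$1 == f { print $2 }' "$file"
}

# Locked-memory limit of the current shell ("unlimited" or a size in kB).
echo "max locked memory (ulimit -l): $(ulimit -l)"

# Hugepage counters as seen from inside the pod.
echo "HugePages_Total:   $(meminfo_field HugePages_Total)"
echo "HugePages_Free:    $(meminfo_field HugePages_Free)"
echo "Hugepagesize (kB): $(meminfo_field Hugepagesize)"
```

If HugePages_Total is 0, or the locked-memory limit is small, the "Couldn't allocate MR" failure above is a plausible outcome; the pod spec in this gist requests hugepages-1Gi and adds the IPC_LOCK capability for this reason.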
2) Connect error
   With firewalld running, the ib_send_bw client hangs while connecting to the server and never returns.
   solution: disable firewalld
3) Incompatibility issue with GID types
   An SR-IOV device may fail to run ib_send_bw because of the GID issue below:
   ---------------------------------------------------------------------------------------
     local address: LID 0000 QPN 0x04ac PSN 0x9da27
     GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:56:217:200
     remote address: LID 0000 QPN 0x04af PSN 0x8766ec
     GID: 254:128:00:00:00:00:00:00:236:133:54:255:254:205:125:133
     Found Incompatibility issue with GID types.
     Please Try to use a different IP version.
   solution: add the -x option when running ib_send_bw
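One way to read the two GID lines in the log above, assuming ib_send_bw prints each GID as 16 colon-separated decimal bytes: the local GID ends in 255:255:10:56:217:200, which looks like the IPv4-mapped form of 10.56.217.200, while the remote GID starts with 254:128 (fe80 in hex), which looks like an IPv6 link-local address. The sketch below classifies a GID string under that assumption. `gid_kind` is a hypothetical helper written for this note, not part of perftest.

```shell
#!/usr/bin/env bash
# Classify a GID given as 16 colon-separated decimal bytes (assumed format).
# Prints "ipv4-mapped" when bytes 0-9 are zero and bytes 10-11 are 255,
# "ipv6" for any other 16-byte GID, and "invalid" otherwise.
gid_kind() {
    local IFS=:
    local -a b
    local i
    read -r -a b <<< "$1"
    if [ "${#b[@]}" -ne 16 ]; then
        echo "invalid"
        return 1
    fi
    for i in 0 1 2 3 4 5 6 7 8 9; do
        if [ "${b[i]}" -ne 0 ]; then
            echo "ipv6"
            return 0
        fi
    done
    if [ "${b[10]}" -eq 255 ] && [ "${b[11]}" -eq 255 ]; then
        echo "ipv4-mapped"
    else
        echo "ipv6"
    fi
}

# The two GIDs from the log above.
gid_kind 00:00:00:00:00:00:00:00:00:00:255:255:10:56:217:200
gid_kind 254:128:00:00:00:00:00:00:236:133:54:255:254:205:125:133
```

When the two sides report different kinds, passing -x on both sides with an index that selects matching GID entries is the fix described above. On a typical Linux host the entries of the GID table can be read from /sys/class/infiniband/<device>/ports/<port>/gids/.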
---
apiVersion: v1
kind: Pod
metadata:
  name: testpod1
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net2
spec:
  containers:
  - name: appcntr1
    image: zenghui/centos-rdma
    imagePullPolicy: IfNotPresent
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]
    command: [ "/bin/bash", "-c", "--" ]
    args: [ "while true; do sleep 300000; done;" ]
    resources:
      requests:
        mellanox.com/mlnx_sriov_rdma: '1'
        hugepages-1Gi: 4Gi
        cpu: '6'
        memory: 100Mi
      limits:
        mellanox.com/mlnx_sriov_rdma: '1'
        hugepages-1Gi: 4Gi
        cpu: '6'
        memory: 100Mi
    volumeMounts:
    - mountPath: /mnt/huge
      name: hugepage
      readOnly: False
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
        "resourceList": [{
                "resourceName": "mlnx_sriov_rdma",
                "isRdma": true,
                "selectors": {
                    "vendors": ["15b3"],
                    "devices": ["1018"],
                    "pfNames": ["p4p1"]
                }
            }
        ]
    }
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sriov-device-plugin
  namespace: kube-system
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube-sriov-device-plugin-amd64
  namespace: kube-system
  labels:
    tier: node
    app: sriovdp
spec:
  selector:
    matchLabels:
      name: sriov-device-plugin
  template:
    metadata:
      labels:
        name: sriov-device-plugin
        tier: node
        app: sriovdp
    spec:
      hostNetwork: true
      hostPID: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      tolerations:
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      serviceAccountName: sriov-device-plugin
      containers:
      - name: kube-sriovdp
        image: nfvpe/sriov-device-plugin
        imagePullPolicy: Always
        args:
        - --log-dir=sriovdp
        - --log-level=10
        - --resource-prefix=mellanox.com
        securityContext:
          privileged: true
        volumeMounts:
        - name: devicesock
          mountPath: /var/lib/kubelet/
          readOnly: false
        - name: log
          mountPath: /var/log
        - name: config-volume
          mountPath: /etc/pcidp
      volumes:
        - name: devicesock
          hostPath:
            path: /var/lib/kubelet/
        - name: log
          hostPath:
            path: /var/log
        - name: config-volume
          configMap:
            name: sriovdp-config
            items:
            - key: config.json
              path: config.json
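The resource name requested in the pod spec and referenced in the NetworkAttachmentDefinition annotation appears to be the device plugin's `--resource-prefix` joined with the `resourceName` from the sriovdp-config ConfigMap. The sketch below spells that out; `show_allocatable` is a hypothetical helper, and the node name passed to it is whatever your cluster uses.

```shell
#!/usr/bin/env bash
RESOURCE_PREFIX=mellanox.com    # --resource-prefix argument of the DaemonSet
RESOURCE_NAME=mlnx_sriov_rdma   # resourceName in the sriovdp-config ConfigMap

# Full name as used in the pod's requests/limits and the network annotation.
FULL_RESOURCE="${RESOURCE_PREFIX}/${RESOURCE_NAME}"
echo "${FULL_RESOURCE}"

# Hypothetical helper: print a node's allocatable resources so you can look
# for FULL_RESOURCE once the device plugin pod is running on that node.
show_allocatable() {
    kubectl get node "$1" -o jsonpath='{.status.allocatable}'
}
```

If the full resource name is missing from the node's allocatable list, a pod that requests it will stay unschedulable, so this is worth checking before creating testpod1.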
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net1
  annotations:
    k8s.v1.cni.cncf.io/resourceName: mellanox.com/mlnx_sriov_rdma
spec:
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriov-network",
    "ipam": {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "routes": [{
        "dst": "0.0.0.0/0"
      }],
      "gateway": "10.56.217.1"
    }
  }'