- Build a robust iSER or SRP over RoCE environment.
- OS: CentOS7.1 (7.1.1503)
- IB/RoCE: Inbox Driver
- SCST: 6427(trunk)
- What configuration should I check?
- Should I switch to MLNX OFED?
- Should I update the HCA firmware to the latest version? (It is a Supermicro server with a built-in Mellanox HCA.)
- Anything else?
Configuration on both machines:
cat /etc/modprobe.d/mlx4.conf
options mlx4_en pfctx=3 pfcrx=3
MTU -> 4200
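For reference, the `pfctx=3 pfcrx=3` options above are, as far as I understand the mlx4_en module parameters, per-priority bitmasks (bit N enables PFC for priority N). A minimal sketch to decode such a value; the function name `decode_pfc_mask` is made up for illustration:

```shell
# Decode an mlx4_en pfctx/pfcrx value into the priorities it enables.
# Assumption: the value is an 8-bit mask where bit N means priority N.
decode_pfc_mask() (
    mask=$(( $1 ))
    out=""
    prio=0
    while [ "$prio" -le 7 ]; do
        if [ $(( (mask >> prio) & 1 )) -eq 1 ]; then
            # Append the priority, space-separated.
            out="${out:+$out }$prio"
        fi
        prio=$(( prio + 1 ))
    done
    printf '%s\n' "$out"
)
```

`decode_pfc_mask 3` prints `0 1`, so under that assumption only priorities 0 and 1 have PFC enabled here.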
ibv_devinfo
hca_id: mlx4_0
transport: InfiniBand (0)
fw_ver: 2.30.8000
node_guid: 0025:90ff:ffdf:82b8
sys_image_guid: 0025:90ff:ffdf:82bb
vendor_id: 0x02c9
vendor_part_id: 4099
hw_ver: 0x0
board_id: SM_2241000001000
phys_port_cnt: 1
port: 1
state: PORT_ACTIVE (4)
max_mtu: 4096 (5)
active_mtu: 4096 (5)
sm_lid: 0
port_lid: 0
port_lmc: 0x00
link_layer: Ethernet
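To compare the two machines quickly, the fields that matter for RoCE (port state, active MTU, link layer) can be pulled out of the `ibv_devinfo` output. This is only a sketch: `summarize_ports` is a hypothetical helper, and it assumes `link_layer` is the last field printed for each port, as in the output above.

```shell
# Print one summary line per port from `ibv_devinfo` output on stdin.
# Assumption: link_layer is the last field listed for each port.
summarize_ports() {
    awk '
        $1 == "hca_id:"     { hca = $2 }
        $1 == "port:"       { port = $2 }
        $1 == "state:"      { state = $2 }
        $1 == "active_mtu:" { mtu = $2 }
        $1 == "link_layer:" { print hca, "port", port, state, "mtu", mtu, $2 }
    '
}

# Usage (on each machine):
#   ibv_devinfo | summarize_ports
```

For the output above this should give `mlx4_0 port 1 PORT_ACTIVE mtu 4096 Ethernet`.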
- Both iSER and SRP connections fail with the following errors.
- Plain iSCSI (no RDMA) seems to work well, so the cable is probably not the problem.
Jul 28 20:44:59 srp_target kernel: isert_cm_evt:TIMEWAIT_EXIT(15) status:0 portal:ffff881050d95380 cm_id:ffff8808386f7c00
Jul 28 20:44:59 srp_target kernel: isert_conn_free conn:ffff880851b186c0
HANDLER vdisk_nullio {
    DEVICE disk_null
}
HANDLER vdisk_blockio {
    DEVICE SAS_DISK1 {
        filename /dev/disk/by-id/scsi-3600605b009e4f4e01d4241c13e907c3f
        t10_dev_id sasdisk1
    }
}
TARGET_DRIVER iscsi {
    enabled 1
    TARGET iqn.2006-10.tgt {
        allowed_portal 192.168.5.10
        QueuedCommands 128
        LUN 0 SAS_DISK1
        LUN 1 disk_null
        enabled 1
    }
}
iscsiadm -m discovery --op=new --op=delete --type sendtargets --portal 192.168.5.10:3260 -I iser
iscsiadm -m node -l
Jul 28 15:30:52 srp_target kernel: ib_srpt: receiving failed for idx 59 with status 5
Jul 28 15:30:52 srp_target kernel: ib_srpt: receiving failed for idx 60 with status 5
Jul 28 15:30:52 srp_target kernel: ib_srpt: receiving failed for idx 61 with status 5
...
Jul 28 15:30:54 srp_target kernel: ib_srpt: Received CM TimeWait exit for ch 192.168.5.10-615.
Jul 28 15:30:54 srp_target kernel: ib_srpt: Received CM TimeWait exit for ch 192.168.5.10-616.
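The status in the `receiving failed` lines appears to be the work completion status (`enum ib_wc_status` in the kernel), where 5 is `IB_WC_WR_FLUSH_ERR`. Flush errors usually mean the queue pair had already gone into the error state, so they are likely a symptom rather than the root cause. A rough sketch to tally these lines by status; the helper name and the short status table are my own, and only a few well-known codes are listed:

```shell
# Count "ib_srpt: receiving failed" kernel log lines per status code.
# Assumption: the last word of each such line is an ib_wc_status value.
count_srpt_failures() {
    awk '
        /ib_srpt: receiving failed/ { count[$NF]++ }
        END {
            # Partial table of ib_wc_status names (not exhaustive).
            name[5]  = "IB_WC_WR_FLUSH_ERR"
            name[12] = "IB_WC_RETRY_EXC_ERR"
            name[13] = "IB_WC_RNR_RETRY_EXC_ERR"
            for (s in count)
                printf "status %s (%s): %d\n", s, ((s in name) ? name[s] : "unknown"), count[s]
        }
    '
}

# Usage:
#   dmesg | count_srpt_failures
```

Any status other than 5 that shows up just before the flush errors would be the more interesting one to look at.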
cat /etc/modprobe.d/ib_srpt.conf
options ib_srpt rdma_cm_port=5000
HANDLER vdisk_blockio {
    DEVICE SAS_DISK1 {
        filename /dev/disk/by-id/scsi-3600605b009e4f4e01d4241c13e907c3f
        t10_dev_id sasdisk1
    }
}
TARGET_DRIVER ib_srpt {
    TARGET fe80:0000:0000:0000:0225:90ff:fedf:82b9 {
        enabled 1
        LUN 0 SAS_DISK1
        nv_cache 1
    }
}
echo dest=192.168.5.10:5000,id_ext=002590ffffdf82b8,ioc_guid=002590ffffdf82b8 > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
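The `id_ext` and `ioc_guid` values in the login string above are the `node_guid` from `ibv_devinfo` with the colons removed. A small sketch that builds the string, so a typo in the GUID can be ruled out; `srp_login_string` is a hypothetical helper, and using the same GUID for both fields is an assumption taken from the command above:

```shell
# Build the string written to add_target from a destination and a GUID.
# Assumption: id_ext and ioc_guid are both the target node_guid with the
# colons stripped, as in the command above.
srp_login_string() {
    dest=$1
    guid=$(printf '%s' "$2" | tr -d ':')
    printf 'dest=%s,id_ext=%s,ioc_guid=%s\n' "$dest" "$guid" "$guid"
}

# Usage (on the initiator):
#   srp_login_string 192.168.5.10:5000 0025:90ff:ffdf:82b8 \
#       > /sys/class/infiniband_srp/srp-mlx4_0-1/add_target
```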
udaddy
udaddy -s server_ip
udaddy: starting client
udaddy: connecting
udaddy: failure creating address handle
test complete
return status -1
rdma_server
rdma_server: start
rdma_server: end 0
rdma_client -s 192.168.5.10
rdma_client: start
rdma_client: end 0
ib_send_bw -d mlx4_0 -i 1 -F --report_gbits
************************************
* Waiting for client to connect... *
************************************
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
RX depth : 512
CQ Moderation : 100
Mtu : 2048[B]
Link type : Ethernet
Gid index : 0
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x00eb PSN 0xd32c4e
GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:30:210:10
remote address: LID 0000 QPN 0x00a6 PSN 0x57cfea
GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:30:210:20
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 2100.082000 != 1942.417000
Test integrity may be harmed !
65536 1000 0.00 31.59 0.060259
---------------------------------------------------------------------------------------
ib_send_bw -d mlx4_0 -i 1 -F --report_gbits 192.168.5.10
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx4_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 2048[B]
Link type : Ethernet
Gid index : 0
Max inline data : 0[B]
rdma_cm QPs : OFF
Data ex. method : Ethernet
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x00a6 PSN 0x57cfea
GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:30:210:20
remote address: LID 0000 QPN 0x00eb PSN 0xd32c4e
GID: 00:00:00:00:00:00:00:00:00:00:255:255:172:30:210:10
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[Gb/sec] BW average[Gb/sec] MsgRate[Mpps]
Conflicting CPU frequency values detected: 2098.687000 != 1583.039000
Test integrity may be harmed !
65536 1000 37.41 31.45 0.059977
---------------------------------------------------------------------------------------
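The GID lines in the `ib_send_bw` output look like IPv4-mapped RoCE GIDs printed as decimal bytes. Decoding them may help confirm which interface address GID index 0 corresponds to. A small sketch, where `gid_to_ipv4` is a hypothetical helper that assumes the layout seen above (ten zero bytes, then 255, 255, then the four address bytes):

```shell
# Convert a perftest-style GID (16 decimal bytes separated by colons)
# into an IPv4 address. Returns non-zero if the GID is not IPv4-mapped.
gid_to_ipv4() {
    printf '%s\n' "$1" | awk -F: '
        {
            ok = (NF == 16 && $11 == 255 && $12 == 255)
            # The first ten bytes must all be zero.
            for (i = 1; i <= 10 && ok; i++)
                if ($i != 0) ok = 0
            if (ok) {
                printf "%d.%d.%d.%d\n", $13, $14, $15, $16
                found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}
```

With the GIDs above this gives 172.30.210.10 and 172.30.210.20, which differ from the 192.168.5.10 portal used in the iSER and SRP tests.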
rping -s -C 10 -v
server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
server ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
server ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
server ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
server ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
server ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
server ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
server ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
server ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10
rping -c -a 192.168.5.10 -C 10 -v
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
ping data: rdma-ping-2: CDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrst
ping data: rdma-ping-3: DEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstu
ping data: rdma-ping-4: EFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuv
ping data: rdma-ping-5: FGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvw
ping data: rdma-ping-6: GHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwx
ping data: rdma-ping-7: HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxy
ping data: rdma-ping-8: IJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz
ping data: rdma-ping-9: JKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyzA
client DISCONNECT EVENT...
ucmatose
cmatose: starting server
initiating data transfers
completing sends
receiving data transfers
data transfers complete
cmatose: disconnecting
disconnected
test complete
return status 0
ucmatose -s 192.168.5.10
cmatose: starting client
cmatose: connecting
receiving data transfers
sending replies
data transfers complete
test complete
return status 0