@singlecheeze
Last active April 22, 2022 19:24
NVMeoTCP SAN with SPDK

Overview

The following is a test of the Storage Performance Development Kit (SPDK) on a Fedora 35 VM, presenting an NVMe over TCP (NVMe-oF/TCP, NVMe/TCP) LUN to a vSphere ESXi 7.0u3d host.

The Fedora 35 VM has a Paravirtual boot disk and a 100 GB virtual NVMe drive. Both VMDKs backing these two drives actually live on an iSCSI datastore on the ESXi host with a single TCP path (no MPIO/round-robin) over a 40 Gbps network.


The iSCSI storage for the benchmarks is served from a Synology NAS with Samsung 950 Pro NVMe read/write cache drives.


Benchmarks of two cloned Windows 10 VMs

The VMs were co-located on the same ESXi host to keep as much TCP traffic internal to the host as possible.

Note: Multiple times during benchmarking, ATTO Disk Benchmark (4.01.0f1) crashed at random steps in the full benchmark. When this occurred, SPDK reported the errors below and the Windows 10 VM would freeze, although SPDK itself did not crash. The duplicate QID messages suggest a reconnect from ESXi:

[2022-04-22 12:20:15.830642] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 5
[2022-04-22 13:20:01.854326] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 1
[2022-04-22 13:25:28.909940] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 2
[2022-04-22 13:35:09.981729] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 4
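To gauge how often these reconnects happen, the SPDK output can be grepped for the duplicate QID pattern. This is a small sketch assuming the target's stdout has been captured to a file named `spdk_tgt.log` (a hypothetical name, e.g. via `sudo build/bin/spdk_tgt | tee spdk_tgt.log`); two of the lines from the excerpt above are written out here so the snippet is self-contained:

```shell
# Hypothetical log file; in practice this would be the captured spdk_tgt output.
cat > spdk_tgt.log <<'EOF'
[2022-04-22 12:20:15.830642] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 5
[2022-04-22 13:20:01.854326] ctrlr.c: 255:ctrlr_add_qpair_and_update_rsp: *ERROR*: Got I/O connect with duplicate QID 1
EOF

# Count reconnect errors, then show which queue IDs were affected and how often.
grep -c 'duplicate QID' spdk_tgt.log
grep -o 'duplicate QID [0-9]*' spdk_tgt.log | sort | uniq -c
```

Correlating these timestamps with the ATTO crash times on the Windows guest would help confirm whether the freezes and the reconnects line up.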

Benchmarks were run twice to allow the cache drives on the Synology to warm up and the thin-provisioned disks to expand.

  1. Boot drive (VMDK) directly on the iSCSI datastore


  2. Boot drive (VMDK) on the NVMe/TCP LUN presented from the Fedora 35 VM running SPDK; the virtual NVMe drive backing that LUN is itself a VMDK on the iSCSI datastore. (Benchmarks still pending, because oddly, initial testing shows this VM is somehow faster, which makes no sense given its storage is "second-order".)

Whether during an active benchmark run or with no significant I/O, the SPDK process stays very busy. This is expected: SPDK uses poll-mode drivers, so it continuously polls its queues rather than waiting on interrupts.

Fedora 35 VM/SPDK Setup

Install dependencies:

sudo dnf install dkms kernel-devel kernel-headers

Download and compile SPDK sources:

  1. First clone the SPDK project
    git clone https://github.com/spdk/spdk.git

  2. Change current directory to project folder and update submodules
    cd spdk && git submodule update --init

  3. Install missing dependencies
    sudo scripts/pkgdep.sh

  4. Configure SPDK to use OCF
    ./configure --with-ocf

  5. Compile SPDK
    make

  6. Setup SPDK
    sudo scripts/setup.sh

Note: If the boot drive is on an NVMe drive, you might hit these issues:
spdk/spdk#318
spdk/spdk#1730
spdk/spdk#1658
spdk/spdk#1563

[dave@fedora spdk]$ sudo scripts/setup.sh status
Hugepages
node     hugesize     free /  total
node0   1048576kB        0 /      0
node0      2048kB     1024 /   1024

Type     BDF             Vendor Device NUMA    Driver           Device     Block devices
NVMe     0000:13:00.0    15ad   07f0   0       uio_pci_generic  -          -
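Hugepage state can also be sanity-checked without running `setup.sh`, straight from `/proc/meminfo` (the values shown will of course differ per system). The `HUGEMEM` variable below is the knob `scripts/setup.sh` reads to size the reservation in MB:

```shell
# Show hugepage totals, free count, and page size directly from the kernel.
grep -E 'HugePages_(Total|Free)|Hugepagesize' /proc/meminfo

# If none are reserved, setup.sh can allocate them, e.g. 2048 MB worth:
# sudo HUGEMEM=2048 scripts/setup.sh
```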
  7. Start the SPDK Target App
    sudo build/bin/spdk_tgt

Note: If hugepage issues are hit: spdk/spdk#430

  8. Open a separate terminal (since SPDK is not running as a service) and configure the NVMe drive (or another bdev type: https://spdk.io/doc/bdev.html)
sudo ./scripts/rpc.py nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192
sudo ./scripts/rpc.py nvmf_create_subsystem nqn.2016-06.io.spdk:cnode1 -a -s SPDK00000000000001 -d SPDK_Controller1
sudo ./scripts/rpc.py nvmf_subsystem_add_listener nqn.2016-06.io.spdk:cnode1 -t tcp -a 172.16.1.223 -s 4420
sudo ./scripts/rpc.py bdev_nvme_attach_controller -b NVMe1 -t PCIe -a 0000:13:00.0
sudo ./scripts/rpc.py nvmf_subsystem_add_ns nqn.2016-06.io.spdk:cnode1 NVMe1n1
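The five `rpc.py` calls above can be collected into one small script. This is a sketch only: the listener IP, NQN, serial, and PCI address are the values from this example and need changing per setup, and the `DRY_RUN` switch (my addition, defaulting to dry-run) just echoes the calls so the sequence can be checked before touching a live target:

```shell
#!/usr/bin/env bash
# Sketch: configure the NVMe/TCP target in one pass. Run from the spdk/ checkout.
set -euo pipefail

ADDR=172.16.1.223               # listener IP from this example
NQN=nqn.2016-06.io.spdk:cnode1  # subsystem NQN from this example
BDF=0000:13:00.0                # PCI address of the virtual NVMe drive

# Defaults to dry-run (echo only); set DRY_RUN=0 to issue the real RPCs.
DRY_RUN=${DRY_RUN:-1}

rpc() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "rpc.py $*"
    else
        sudo ./scripts/rpc.py "$@"
    fi
}

rpc nvmf_create_transport -t TCP -u 16384 -m 8 -c 8192
rpc nvmf_create_subsystem "$NQN" -a -s SPDK00000000000001 -d SPDK_Controller1
rpc nvmf_subsystem_add_listener "$NQN" -t tcp -a "$ADDR" -s 4420
rpc bdev_nvme_attach_controller -b NVMe1 -t PCIe -a "$BDF"
rpc nvmf_subsystem_add_ns "$NQN" NVMe1n1
```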
  9. (Optional) View config from SPDKCLI:
    sudo ./scripts/spdkcli.py
[dave@fedora spdk]$ sudo ./scripts/spdkcli.py
SPDK CLI v0.1

/> ls
o- / ......................................................................................................................... [...]
o- bdevs ................................................................................................................... [...]
| o- aio .............................................................................................................. [Bdevs: 0]
| o- error ............................................................................................................ [Bdevs: 0]
| o- iscsi ............................................................................................................ [Bdevs: 0]
| o- logical_volume ................................................................................................... [Bdevs: 0]
| o- malloc ........................................................................................................... [Bdevs: 0]
| o- null ............................................................................................................. [Bdevs: 0]
| o- nvme ............................................................................................................. [Bdevs: 1]
| | o- NVMe1n1 ...................................................... [eb620146-04a6-9387-000c-296d46fe65c8, Size=100.0G, Claimed]
| o- pmemblk .......................................................................................................... [Bdevs: 0]
| o- raid_volume ...................................................................................................... [Bdevs: 0]
| o- rbd .............................................................................................................. [Bdevs: 0]
| o- split_disk ....................................................................................................... [Bdevs: 0]
| o- virtioblk_disk ................................................................................................... [Bdevs: 0]
| o- virtioscsi_disk .................................................................................................. [Bdevs: 0]
o- iscsi ................................................................................................................... [...]
| o- auth_groups ..................................................................................................... [Groups: 0]
| o- global_params ......................................................................................................... [...]
| | o- allow_duplicated_isid: False ........................................................................................ [...]
| | o- chap_group: 0 ....................................................................................................... [...]
| | o- data_out_pool_size: 2048 ............................................................................................ [...]
| | o- default_time2retain: 20 ............................................................................................. [...]
| | o- default_time2wait: 2 ................................................................................................ [...]
| | o- disable_chap: False ................................................................................................. [...]
| | o- error_recovery_level: 0 ............................................................................................. [...]
| | o- first_burst_length: 8192 ............................................................................................ [...]
| | o- immediate_data: True ................................................................................................ [...]
| | o- immediate_data_pool_size: 16384 ..................................................................................... [...]
| | o- max_connections_per_session: 2 ...................................................................................... [...]
| | o- max_large_datain_per_connection: 64 ................................................................................. [...]
| | o- max_queue_depth: 64 ................................................................................................. [...]
| | o- max_r2t_per_connection: 4 ........................................................................................... [...]
| | o- max_sessions: 128 ................................................................................................... [...]
| | o- mutual_chap: False .................................................................................................. [...]
| | o- node_base: iqn.2016-06.io.spdk ...................................................................................... [...]
| | o- nop_in_interval: 30 ................................................................................................. [...]
| | o- nop_timeout: 60 ..................................................................................................... [...]
| | o- pdu_pool_size: 36864 ................................................................................................ [...]
| | o- require_chap: False ................................................................................................. [...]
| o- initiator_groups ...................................................................................... [Initiator groups: 0]
| o- iscsi_connections .......................................................................................... [Connections: 0]
| o- portal_groups ............................................................................................ [Portal groups: 0]
| o- target_nodes .............................................................................................. [Target nodes: 0]
o- lvol_stores .................................................................................................. [Lvol stores: 0]
o- nvmf .................................................................................................................... [...]
| o- subsystem ................................................................................................... [Subsystems: 2]
| | o- nqn.2014-08.org.nvmexpress.discovery ....................................................... [st=Discovery, Allow any host]
| | | o- hosts ........................................................................................................ [Hosts: 0]
| | | o- listen_addresses ......................................................................................... [Addresses: 0]
| | o- nqn.2016-06.io.spdk:cnode1 ............................................... [sn=SPDK00000000000001, st=NVMe, Allow any host]
| |   o- hosts ........................................................................................................ [Hosts: 0]
| |   o- listen_addresses ......................................................................................... [Addresses: 1]
| |   | o- 172.16.1.223:4420 ............................................................................................... [TCP]
| |   o- namespaces .............................................................................................. [Namespaces: 1]
| |     o- NVMe1n1 .................................................................................................. [NVMe1n1, 1]
| o- transport ................................................................................................... [Transports: 1]
|   o- TCP ................................................................................................................. [...]
o- vhost ................................................................................................................... [...]
  o- block ................................................................................................................. [...]
  o- scsi .................................................................................................................. [...]
  10. Add the new controller to vSphere.


  11. Add the datastore to vSphere.


  12. (Optional) Storage vMotion a VM onto the datastore and watch SPDK Top: sudo ./build/bin/spdk_top