@tanabarr
Last active August 28, 2019 13:09
Deployment workflow for DAOS using the control plane tools
## Prepare devices for use
[root@<hostname> tanabarr]# cd projects/daos_m/
[root@<hostname> daos_m]# source scons_local/utils/setup_local.sh
/home/tanabarr/projects/daos_m
Build vars file found: ./.build_vars.sh
OLD_PATH is /usr/lib64/qt-3.3/bin /usr/local/bin /usr/bin /usr/local/sbin /usr/sbin /usr/lib/go-1.10/bin /usr/local/go/bin /usr/local/go/bin /home/tanabarr/bin /usr/lib/go-1.10/bin /usr/local/go/bin /usr/local/go/bin
[root@<hostname> daos_m]# daos_server storage prep-scm
Memory allocation goals for SCM will be changed and namespaces modified, this will be a destructive operation. ensure namespaces are unmounted and SCM is otherwise unused.
Are you sure you want to continue? (yes/no)
yes
A reboot is required to process new memory allocation goals.
<perform host reboot>
[root@<hostname> daos_m]# daos_server storage prep-scm
Memory allocation goals for SCM will be changed and namespaces modified, this will be a destructive operation. Please ensure namespaces are unmounted and SCM is otherwise unused.
Are you sure you want to continue? (yes/no)
yes
persistent memory kernel devices:
[{UUID:5d2f2517-9217-4d7d-9c32-70731c9ac11e Blockdev:pmem1 Dev:namespace1.0 NumaNode:1} {UUID:2bfe6c40-f79a-4b8e-bddf-ba81d4427b9b Blockdev:pmem0 Dev:namespace0.0 NumaNode:0}]
[root@<hostname> daos_m]# daos_server storage prep-nvme --reset
[root@<hostname> daos_m]# daos_server storage prep-nvme
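The `persistent memory kernel devices:` line printed by the second `prep-scm` run lists each namespace with its block device and NUMA node. A minimal sketch of how that line could be parsed to pick the `/dev/pmemN` path for a given NUMA node follows. It assumes the output format is exactly what this transcript shows, which is not a documented stable interface; `parse_pmem_namespaces` and `pmem_path_for_numa` are hypothetical helper names, not part of DAOS.

```python
import re

# Sample line copied from the prep-scm output in this transcript.
SAMPLE = (
    "[{UUID:5d2f2517-9217-4d7d-9c32-70731c9ac11e Blockdev:pmem1 "
    "Dev:namespace1.0 NumaNode:1} "
    "{UUID:2bfe6c40-f79a-4b8e-bddf-ba81d4427b9b Blockdev:pmem0 "
    "Dev:namespace0.0 NumaNode:0}]"
)


def parse_pmem_namespaces(line):
    """Return one dict per {...} group, keyed by the field names in the output."""
    namespaces = []
    for group in re.findall(r"\{([^}]*)\}", line):
        # Each group is a run of space-separated Key:Value pairs.
        fields = dict(item.split(":", 1) for item in group.split())
        fields["NumaNode"] = int(fields["NumaNode"])
        namespaces.append(fields)
    return namespaces


def pmem_path_for_numa(namespaces, numa_node):
    """Hypothetical helper: device path of the namespace on the given NUMA node."""
    for ns in namespaces:
        if ns["NumaNode"] == numa_node:
            return "/dev/" + ns["Blockdev"]
    return None


if __name__ == "__main__":
    print(pmem_path_for_numa(parse_pmem_namespaces(SAMPLE), 1))
```

For the sample line this selects `/dev/pmem1` for NUMA node 1, the node that the storage scan later reports for all three NVMe SSDs.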
## Scan and set up for use with relevant device identifiers
[root@wolf-72 daos_m]# daos_server storage scan
Scanning locally-attached storage...
Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: spdk -c 0x1 --file-prefix=spdk_pid181813 ]
EAL: Detected 96 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /var/run/.spdk_pid181813_unix
EAL: Probing VFIO support...
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
EAL: PCI device 0000:87:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
EAL: PCI device 0000:da:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
NVMe SSD controller and constituent namespaces:
PCI Addr:0000:da:00.0 Serial:PHKS7505005Y750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:81:00.0 Serial:PHKS7505007J750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:87:00.0 Serial:CVFT5392000G1P6DGN Model:INTEL SSDPEDMD016T4 Fwrev:8DV10171 Socket:1
Namespace: id:1 capacity:1600
SCM modules:
PhysicalID:36 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:0 pos:1)
PhysicalID:40 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:1 pos:1)
PhysicalID:44 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:2 pos:1)
PhysicalID:50 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:0 pos:1)
PhysicalID:52 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:1 pos:0)
PhysicalID:55 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:2 pos:0)
PhysicalID:62 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:0 pos:1)
PhysicalID:66 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:1 pos:1)
PhysicalID:70 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:2 pos:1)
PhysicalID:76 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:0 pos:1)
PhysicalID:78 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:1 pos:0)
PhysicalID:81 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:2 pos:0)
[root@<hostname> daos_m]# daos_server storage prep-nvme --reset
[root@<hostname> daos_m]# daos_server storage prep-nvme --pci-whitelist="0000:da:00.0 0000:81:00.0"
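The whitelist passed to `prep-nvme` above is the pair of PCI addresses for the two `INTEL SSDPED1K750GA` controllers in the scan listing. As a rough sketch, those addresses could be pulled out of the scan text like this. It assumes the `PCI Addr:... Serial:... Model:... Fwrev:... Socket:...` line layout shown in this transcript; `parse_controllers` and `whitelist_for_model` are hypothetical helper names, not DAOS tooling.

```python
import re

# Controller lines copied from the scan output in this transcript.
SCAN_OUTPUT = """\
PCI Addr:0000:da:00.0 Serial:PHKS7505005Y750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:81:00.0 Serial:PHKS7505007J750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:87:00.0 Serial:CVFT5392000G1P6DGN Model:INTEL SSDPEDMD016T4 Fwrev:8DV10171 Socket:1
Namespace: id:1 capacity:1600
"""

# The model string contains a space, so it is matched lazily up to " Fwrev:".
CONTROLLER_RE = re.compile(
    r"PCI Addr:(?P<addr>\S+) Serial:(?P<serial>\S+) "
    r"Model:(?P<model>.+?) Fwrev:(?P<fwrev>\S+) Socket:(?P<socket>\d+)"
)


def parse_controllers(text):
    """Return one dict per controller line, in the order they were printed."""
    return [m.groupdict() for m in CONTROLLER_RE.finditer(text)]


def whitelist_for_model(controllers, model):
    """Space-separated PCI addresses of the controllers with the given model."""
    return " ".join(c["addr"] for c in controllers if c["model"] == model)


if __name__ == "__main__":
    ctrls = parse_controllers(SCAN_OUTPUT)
    print(whitelist_for_model(ctrls, "INTEL SSDPED1K750GA"))
```

Filtering by model is only one possible selection rule; filtering on the `Socket` field would work the same way.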
## Apply configuration file changes (pciaddress already in server config example)
diff --git a/utils/config/daos.yml b/utils/config/daos.yml
index 7ff0ec1..c26c30c 100644
--- a/utils/config/daos.yml
+++ b/utils/config/daos.yml
@@ -1,4 +1,5 @@
# DAOS client configuration file.
+access_points: ["wolf-72:10001"]
#
# Location of this configuration file is determined by first checking for the
# path specified through the -o option of the daos_agent and DMG command line.
diff --git a/utils/config/examples/daos_server_sockets.yml b/utils/config/examples/daos_server_sockets.yml
index a7e76f8..4983e4e 100644
--- a/utils/config/examples/daos_server_sockets.yml
+++ b/utils/config/examples/daos_server_sockets.yml
@@ -54,18 +56,18 @@ servers:
# When scm_class is set to ram, tmpfs will be used to emulate SCM.
# The size of ram is specified by scm_size in GB units.
scm_mount: /mnt/daos # map to -s /mnt/daos
- scm_class: ram
- scm_size: 6
+ #scm_class: ram
+ #scm_size: 6
# When scm_class is set to dcpm, scm_list is the list of device paths for
# AppDirect pmem namespaces (currently only one per server supported).
- # scm_class: dcpm
- # scm_list: [/dev/pmem0]
+ scm_class: dcpm
+ scm_list: [/dev/pmem1]
# If using NVMe SSD (will write /mnt/daos/daos_nvme.conf and start I/O
# service with -n <path>)
bdev_class: nvme
- bdev_list: ["0000:81:00.0"] # generate regular nvme.conf
+ bdev_list: ["0000:da:00.0", "0000:81:00.0"] # generate regular nvme.conf
# If emulating NVMe SSD with malloc devices
# bdev_class: malloc # map to VOS_BDEV_CLASS=MALLOC
## Reset SCM and start daos_server in maintenance mode for format
[root@<hostname> daos_m]# umount /mnt/daos ; rm -rf /mnt/daos
[root@<hostname> daos_m]# daos_server -i -o $(pwd)/utils/config/examples/daos_server_sockets.yml
<snip>
waiting for storage format on server 0
<snip>
## Format storage through management tool
[tanabarr@<hostname> daos_m]# daos_shell -l <hostname>:10001 -i storage format -f
Active connections: [<hostname>:10001]
This is a destructive operation and storage devices specified in the server config file will be erased.
Please be patient as it may take several minutes.
NVMe storage format results:
<hostname>:10001:
pci-address 0000:da:00.0: status CTRL_SUCCESS
pci-address 0000:81:00.0: status CTRL_SUCCESS
SCM storage format results:
boro-84:10001:
mntpoint /mnt/daos: status CTRL_SUCCESS
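Before moving on, it is worth confirming that every device in the format results reports `CTRL_SUCCESS`. The sketch below scans captured `storage format` output for result lines and returns any that did not succeed. It assumes the `pci-address <addr>: status <code>` and `mntpoint <path>: status <code>` line shapes seen in this transcript, and it treats any status other than `CTRL_SUCCESS` as a failure, which is an assumption since no other status codes appear here. `failed_format_results` is a hypothetical helper name.

```python
import re

# Matches the per-device result lines shown in this transcript.
RESULT_RE = re.compile(r"^\s*(pci-address|mntpoint) (\S+): status (\S+)\s*$")


def failed_format_results(output):
    """Return (kind, device, status) for every result line that is not CTRL_SUCCESS."""
    failures = []
    for line in output.splitlines():
        match = RESULT_RE.match(line)
        if match and match.group(3) != "CTRL_SUCCESS":
            failures.append(match.groups())
    return failures


if __name__ == "__main__":
    captured = """\
NVMe storage format results:
pci-address 0000:da:00.0: status CTRL_SUCCESS
pci-address 0000:81:00.0: status CTRL_SUCCESS
SCM storage format results:
mntpoint /mnt/daos: status CTRL_SUCCESS
"""
    print(failed_format_results(captured))
```

An empty list means every reported device formatted successfully.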
## daos_server continues and starts daos_io_server after storage format from management tool daos_shell
(continued listing of daos_server standard output...)
DAOS I/O server (v0.5.0) process 29671 started on rank 0 (out of 1) with 1 target xstream set(s), 0 helper XS per target, firstcore 0.
<snip>
## Create pools
[root@<hostname> daos_m]# daos_shell -o /home/tanabarr/projects/daos_m/utils/config/daos.yml -i pool create -s 1G -n 500G
Active connections: [<hostname>:10001]
SCM:NVMe ratio is less than 1%, DAOS performance will suffer!
Creating DAOS pool with 1GB SCM and 500GB NvMe storage (0.002 ratio)
Creating DAOS pool: scmbytes:1073741824 nvmebytes:536870912000 numsvcreps:1 user:"root@" usergroup:"root@" sys:"daos_server"
pool create command results:
<hostname>:10001:
uuid:"e9023bbd-c44f-4370-9c74-dd20fdab2c24" svcreps:"0"
...create more pools, connect and use DAOS.
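The byte counts and the ratio in the pool create output can be reproduced with simple arithmetic. The sketch below assumes `-s 1G` and `-n 500G` are interpreted in binary units (1G = 1024^3 bytes), which is what the reported `scmbytes` and `nvmebytes` values imply. The 1% threshold is taken from the warning text; `pool_sizes` is a hypothetical helper name, not the tool's own code.

```python
GIB = 1024 ** 3  # binary gigabyte, inferred from the byte counts in the output


def pool_sizes(scm_gib, nvme_gib):
    """Return (scm_bytes, nvme_bytes, scm_to_nvme_ratio) for sizes given in GiB."""
    scm_bytes = scm_gib * GIB
    nvme_bytes = nvme_gib * GIB
    return scm_bytes, nvme_bytes, scm_bytes / nvme_bytes


scm_bytes, nvme_bytes, ratio = pool_sizes(1, 500)
print(scm_bytes, nvme_bytes, round(ratio, 3))  # prints: 1073741824 536870912000 0.002

# The warning in the transcript fires because the ratio is below 1%.
print(ratio < 0.01)  # prints: True
```

By the same arithmetic, 500G of NVMe would need 5G of SCM to reach a 1% ratio.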
## Optional reset: Remove superblock to trigger storage format on restart
Kill daos_server, then remove the contents of the SCM mount and unmount it:
[root@<hostname> daos_m]# rm -f /mnt/daos/*; umount /mnt/daos
[root@<hostname> daos_m]# daos_server -i -o $(pwd)/utils/config/examples/daos_server_sockets.yml
<snip>
[tanabarr@<hostname> daos_m]# daos_shell -l <hostname>:10001 -i storage format -f
<snip>
... after the format completes, pools can be created as above, because the SPDK blobs should have been removed from the NVMe SSDs.
## Troubleshooting
- Refer to previous gists for more detailed information.