Last active
August 28, 2019 13:09
Deployment workflow for DAOS using the control plane tools
## Prepare devices for use
[root@<hostname> tanabarr]# cd projects/daos_m/
[root@<hostname> daos_m]# source scons_local/utils/setup_local.sh
/home/tanabarr/projects/daos_m
Build vars file found: ./.build_vars.sh
OLD_PATH is /usr/lib64/qt-3.3/bin /usr/local/bin /usr/bin /usr/local/sbin /usr/sbin /usr/lib/go-1.10/bin /usr/local/go/bin /usr/local/go/bin /home/tanabarr/bin /usr/lib/go-1.10/bin /usr/local/go/bin /usr/local/go/bin
[root@<hostname> daos_m]# daos_server storage prep-scm
Memory allocation goals for SCM will be changed and namespaces modified, this will be a destructive operation. Please ensure namespaces are unmounted and SCM is otherwise unused.
Are you sure you want to continue? (yes/no)
yes
A reboot is required to process new memory allocation goals.
<perform host reboot>
[root@<hostname> daos_m]# daos_server storage prep-scm
Memory allocation goals for SCM will be changed and namespaces modified, this will be a destructive operation. Please ensure namespaces are unmounted and SCM is otherwise unused.
Are you sure you want to continue? (yes/no)
yes
persistent memory kernel devices:
[{UUID:5d2f2517-9217-4d7d-9c32-70731c9ac11e Blockdev:pmem1 Dev:namespace1.0 NumaNode:1} {UUID:2bfe6c40-f79a-4b8e-bddf-ba81d4427b9b Blockdev:pmem0 Dev:namespace0.0 NumaNode:0}]
[root@<hostname> daos_m]# daos_server storage prep-nvme --reset
[root@<hostname> daos_m]# daos_server storage prep-nvme
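The second prep-scm run lists one pmem namespace per NUMA node. The server config later in this walkthrough points scm_list at /dev/pmem1, and the scan reports the NVMe SSDs on socket 1, so choosing the namespace by NUMA node looks like the intent; that reading is an assumption. A minimal sketch of pulling the device path out of the listing follows. The function name pmem_for_numa is hypothetical and not part of the DAOS tools; it only parses text in the format shown above.

```shell
# Hypothetical helper (not a DAOS command): read the prep-scm namespace
# listing on stdin and print the /dev path reported for NUMA node $1.
pmem_for_numa() {
    # Put each {...} entry on its own line, then keep the entry whose
    # NumaNode field matches and rewrite it to /dev/<Blockdev>.
    tr '{' '\n' \
        | sed -n "s|.*Blockdev:\([^ ]*\) .*NumaNode:${1}}.*|/dev/\1|p"
}

# Listing copied from the prep-scm output above.
listing='[{UUID:5d2f2517-9217-4d7d-9c32-70731c9ac11e Blockdev:pmem1 Dev:namespace1.0 NumaNode:1} {UUID:2bfe6c40-f79a-4b8e-bddf-ba81d4427b9b Blockdev:pmem0 Dev:namespace0.0 NumaNode:0}]'

printf '%s\n' "$listing" | pmem_for_numa 1   # prints /dev/pmem1
```

The listing format may differ between DAOS versions, so treat the sed pattern as tied to the output shown here.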
## Scan and set up for use with relevant device identifiers
[root@wolf-72 daos_m]# daos_server storage scan
Scanning locally-attached storage...
Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: spdk -c 0x1 --file-prefix=spdk_pid181813 ]
EAL: Detected 96 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /var/run/.spdk_pid181813_unix
EAL: Probing VFIO support...
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
EAL: PCI device 0000:87:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
EAL: PCI device 0000:da:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
NVMe SSD controller and constituent namespaces:
PCI Addr:0000:da:00.0 Serial:PHKS7505005Y750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:81:00.0 Serial:PHKS7505007J750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:87:00.0 Serial:CVFT5392000G1P6DGN Model:INTEL SSDPEDMD016T4 Fwrev:8DV10171 Socket:1
Namespace: id:1 capacity:1600
SCM modules:
PhysicalID:36 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:0 pos:1)
PhysicalID:40 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:1 pos:1)
PhysicalID:44 Capacity:539661172736 Location:(socket:0 memctrlr:0 chan:2 pos:1)
PhysicalID:50 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:0 pos:1)
PhysicalID:52 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:1 pos:0)
PhysicalID:55 Capacity:539661172736 Location:(socket:0 memctrlr:1 chan:2 pos:0)
PhysicalID:62 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:0 pos:1)
PhysicalID:66 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:1 pos:1)
PhysicalID:70 Capacity:539661172736 Location:(socket:1 memctrlr:0 chan:2 pos:1)
PhysicalID:76 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:0 pos:1)
PhysicalID:78 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:1 pos:0)
PhysicalID:81 Capacity:539661172736 Location:(socket:1 memctrlr:1 chan:2 pos:0)
[root@<hostname> daos_m]# daos_server storage prep-nvme --reset
[root@<hostname> daos_m]# daos_server storage prep-nvme --pci-whitelist="0000:da:00.0 0000:81:00.0"
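The whitelist passed to prep-nvme contains the two SSDPED1K750GA controllers from the scan and leaves out the SSDPEDMD016T4. As a rough sketch, the whitelist string can be derived from the scan text by filtering on the Model field. The function name pci_whitelist_for_model is hypothetical, and the parsing assumes the "PCI Addr:... Model:..." line layout shown above.

```shell
# Hypothetical helper (not a DAOS command): read "storage scan" output on
# stdin and print a space-separated list of PCI addresses for controllers
# whose line contains the model string given as $1.
pci_whitelist_for_model() {
    awk -v model="$1" '
        /PCI Addr:/ && index($0, model) {
            addr = $2                    # looks like "Addr:0000:da:00.0"
            sub(/^Addr:/, "", addr)
            out = (out == "" ? addr : out " " addr)
        }
        END { print out }
    '
}

# Controller lines copied from the scan output above.
whitelist=$(pci_whitelist_for_model "SSDPED1K750GA" <<'EOF'
PCI Addr:0000:da:00.0 Serial:PHKS7505005Y750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:81:00.0 Serial:PHKS7505007J750BGN Model:INTEL SSDPED1K750GA Fwrev:E2010325 Socket:1
Namespace: id:1 capacity:750
PCI Addr:0000:87:00.0 Serial:CVFT5392000G1P6DGN Model:INTEL SSDPEDMD016T4 Fwrev:8DV10171 Socket:1
Namespace: id:1 capacity:1600
EOF
)

echo "--pci-whitelist=\"$whitelist\""
```

The addresses come out in the order the scan lists them, which here matches the order used in the prep-nvme command above.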
## Apply configuration file changes (pciaddress already in server config example)
diff --git a/utils/config/daos.yml b/utils/config/daos.yml
index 7ff0ec1..c26c30c 100644
--- a/utils/config/daos.yml
+++ b/utils/config/daos.yml
@@ -1,4 +1,5 @@
# DAOS client configuration file.
+access_points: ["wolf-72:10001"]
#
# Location of this configuration file is determined by first checking for the
# path specified through the -o option of the daos_agent and DMG command line.
diff --git a/utils/config/examples/daos_server_sockets.yml b/utils/config/examples/daos_server_sockets.yml
index a7e76f8..4983e4e 100644
--- a/utils/config/examples/daos_server_sockets.yml
+++ b/utils/config/examples/daos_server_sockets.yml
@@ -54,18 +56,18 @@ servers:
# When scm_class is set to ram, tmpfs will be used to emulate SCM.
# The size of ram is specified by scm_size in GB units.
scm_mount: /mnt/daos # map to -s /mnt/daos
- scm_class: ram
- scm_size: 6
+ #scm_class: ram
+ #scm_size: 6
# When scm_class is set to dcpm, scm_list is the list of device paths for
# AppDirect pmem namespaces (currently only one per server supported).
- # scm_class: dcpm
- # scm_list: [/dev/pmem0]
+ scm_class: dcpm
+ scm_list: [/dev/pmem1]
# If using NVMe SSD (will write /mnt/daos/daos_nvme.conf and start I/O
# service with -n <path>)
bdev_class: nvme
- bdev_list: ["0000:81:00.0"] # generate regular nvme.conf
+ bdev_list: ["0000:da:00.0", "0000:81:00.0"] # generate regular nvme.conf
# If emulating NVMe SSD with malloc devices
# bdev_class: malloc # map to VOS_BDEV_CLASS=MALLOC
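The bdev_list in the server config names the same two PCI addresses that were passed to prep-nvme --pci-whitelist. A quick consistency check between the two can be sketched as below. Both function names are hypothetical, the check only does text matching on an uncommented bdev_list line, and it is not a YAML parser.

```shell
# Hypothetical helpers (not DAOS commands).

# Read server config text on stdin and print, sorted one per line, the PCI
# addresses on the active (uncommented) bdev_list line.
bdev_list_addrs() {
    grep -E '^[[:space:]]*bdev_list:' \
        | grep -oE '[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-9a-fA-F]' \
        | sort
}

# Succeed when the space-separated whitelist in $1 names exactly the
# addresses found in the config read from stdin.
whitelist_matches_config() {
    config_addrs=$(bdev_list_addrs)
    # $1 is left unquoted on purpose so it splits into one address per word.
    [ "$(printf '%s\n' $1 | sort)" = "$config_addrs" ]
}

if whitelist_matches_config "0000:da:00.0 0000:81:00.0" <<'EOF'
  bdev_class: nvme
  bdev_list: ["0000:da:00.0", "0000:81:00.0"] # generate regular nvme.conf
EOF
then
    echo "whitelist and bdev_list agree"
else
    echo "whitelist and bdev_list differ" >&2
fi
```

In practice the config would be fed in with a redirect from the file passed to daos_server -o rather than a here-document.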
## Reset SCM and start daos_server in maintenance mode for format
[root@<hostname> daos_m]# umount /mnt/daos ; rm -rf /mnt/daos
[root@<hostname> daos_m]# daos_server -i -o $(pwd)/utils/config/examples/daos_server_sockets.yml
<snip>
waiting for storage format on server 0
<snip>
## Format storage through management tool
[tanabarr@<hostname> daos_m]# daos_shell -l <hostname>:10001 -i storage format -f
Active connections: [<hostname>:10001]
This is a destructive operation and storage devices specified in the server config file will be erased.
Please be patient as it may take several minutes.
NVMe storage format results:
<hostname>:10001:
pci-address 0000:da:00.0: status CTRL_SUCCESS
pci-address 0000:81:00.0: status CTRL_SUCCESS
SCM storage format results:
<hostname>:10001:
mntpoint /mnt/daos: status CTRL_SUCCESS
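Every device in the format results above reports status CTRL_SUCCESS. When scripting this step, one option is to scan the results and stop if any line reports something else. The sketch below assumes the "... status <VALUE>" line layout shown above; format_succeeded is a hypothetical name, and CTRL_SUCCESS is the only status value this walkthrough shows.

```shell
# Hypothetical helper (not a DAOS command): read "storage format" results on
# stdin and succeed only if at least one status line is present and every
# status is CTRL_SUCCESS.
format_succeeded() {
    awk '
        / status / {
            seen++
            if ($NF != "CTRL_SUCCESS") bad++
        }
        END { exit !(seen > 0 && bad == 0) }
    '
}

if format_succeeded <<'EOF'
NVMe storage format results:
pci-address 0000:da:00.0: status CTRL_SUCCESS
pci-address 0000:81:00.0: status CTRL_SUCCESS
SCM storage format results:
mntpoint /mnt/daos: status CTRL_SUCCESS
EOF
then
    echo "all devices formatted"
else
    echo "at least one device failed to format" >&2
fi
```

Treating empty output as a failure is a design choice here, so that a dropped connection is not mistaken for a successful format.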
## daos_server continues and starts daos_io_server after storage format from management tool daos_shell
(continued listing of daos_server standard out...)
DAOS I/O server (v0.5.0) process 29671 started on rank 0 (out of 1) with 1 target xstream set(s), 0 helper XS per target, firstcore 0.
<snip>
## Create pools
[root@<hostname> daos_m]# daos_shell -o /home/tanabarr/projects/daos_m/utils/config/daos.yml -i pool create -s 1G -n 500G
Active connections: [<hostname>:10001]
SCM:NVMe ratio is less than 1%, DAOS performance will suffer!
Creating DAOS pool with 1GB SCM and 500GB NvMe storage (0.002 ratio)
Creating DAOS pool: scmbytes:1073741824 nvmebytes:536870912000 numsvcreps:1 user:"root@" usergroup:"root@" sys:"daos_server"
pool create command results:
<hostname>:10001:
uuid:"e9023bbd-c44f-4370-9c74-dd20fdab2c24" svcreps:"0"
... create more pools, then connect and use DAOS.
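The warning in the pool create output comes from the SCM:NVMe ratio: 1073741824 / 536870912000 = 0.002, which is below the 1% mentioned in the message. The arithmetic can be reproduced as below to size a pool before creating it. The function names are hypothetical, and the exact comparison the tool performs is not shown in this walkthrough, so the 1% threshold is taken from the warning text only.

```shell
# Hypothetical helpers (not DAOS commands). All sizes are in bytes.

# Print the SCM:NVMe ratio to three decimal places.
scm_nvme_ratio() {
    awk -v scm="$1" -v nvme="$2" 'BEGIN { printf "%.3f\n", scm / nvme }'
}

# Succeed when SCM is less than 1% of NVMe (integer arithmetic).
below_one_percent() {
    [ $(( $1 * 100 )) -lt "$2" ]
}

# Print the smallest SCM size that reaches 1% of the given NVMe size,
# rounding up.
min_scm_bytes() {
    echo $(( ($1 + 99) / 100 ))
}

scm=1073741824        # scmbytes from the output above (-s 1G)
nvme=536870912000     # nvmebytes from the output above (-n 500G)

scm_nvme_ratio "$scm" "$nvme"     # prints 0.002
if below_one_percent "$scm" "$nvme"; then
    echo "SCM needs at least $(min_scm_bytes "$nvme") bytes to reach 1%"
fi
```

For the 500G NVMe allocation used here, the 1% figure works out to 5368709120 bytes, which is 5G in the units the command line accepts.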
## Optional reset: Remove superblock to trigger storage format on restart
Kill daos_server, then ...
[root@<hostname> daos_m]# rm -f /mnt/daos/*; umount /mnt/daos
[root@<hostname> daos_m]# daos_server -i -o $(pwd)/utils/config/examples/daos_server_sockets.yml
<snip>
[tanabarr@<hostname> daos_m]# daos_shell -l <hostname>:10001 -i storage format -f
<snip>
... then it should be possible to create pools as above, because the SPDK blobs should have been removed from the NVMe SSDs.
## Troubleshooting
- Refer to previous gists for more detailed information.