@tanabarr
Last active October 8, 2019 17:20
Respawn daos_server as a normal user (the one specified in the server config) after starting it as root and formatting storage from the management tool (daos_shell).
---- SERVER: ----
root@wolf-72:/home/tanabarr/projects/daos_m$ umount /mnt/daos; rm -rf /mnt/daos; mkdir /mnt/daos
root@wolf-72:/home/tanabarr/projects/daos_m$ daos_server storage prepare -n -b
Preparing locally-attached NVMe storage...
2019/07/11 20:45:30 storage_nvme.go:98: debug: spdk setup with _NRHUGE=1024
2019/07/11 20:45:30 storage_nvme.go:102: debug: spdk setup with _TARGET_USER=root
+++ b/utils/config/examples/daos_server_sockets.yml
@@ -9,7 +9,7 @@ control_log_mask: DEBUG
control_log_file: /tmp/daos_control.log
## uncomment to drop privileges before starting data plane
## (if started as root to perform hardware provisioning)
-# user_name: daosuser
+user_name: tanabarr
# group_name: daosgroup # (optional)
[root@wolf-72 daos_m]# orterun -np 1 -H wolf-72 --mca oob ^ud --allow-run-as-root --report-uri $(pwd)/urifile daos_server start -o /home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml -i
[wolf-72.wolf.hpdd.intel.com:440196] pmix_mca_base_component_repository_open: unable to open mca_pnet_opa: libevent_pthreads-2.1.so.6: cannot open shared object file: No such file or directory (ignored)
no control log file specified; logging to stdout
DEBUG 16:28:54.546700 start.go:153: Switching control log level to DEBUG
DEBUG 16:28:54.546798 start.go:167: cfg: &server.Configuration{ControlPort:10001, TransportConfig:(*security.TransportConfig)(0xc00017cc40), Servers:[]*ioserver.Config{(*ioserver.Config)(0xc000086600)}, BdevInclude:[]string(nil), BdevExclude:[]string(nil), NrHugepages:4096, ControlLogMask:3, ControlLogFile:"", ControlLogJSON:false, UserName:"tanabarr", GroupName:"", SystemName:"daos_server", SocketDir:"/var/run/daos_server", Fabric:ioserver.FabricConfig{Provider:"ofi+sockets", Interface:"", InterfacePort:0, PinnedNumaNode:(*uint)(nil)}, Modules:"", Attach:"", AccessPoints:[]string{"localhost"}, FaultPath:"", FaultCb:"", Hyperthreads:false, Path:"/home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml", ext:(*server.ext)(0xc00015c7c0), NvmeShmID:238352853, validateProviderFn:(server.networkProviderValidation)(0x5c0120), validateNUMAFn:(server.networkNUMAValidation)(0x5c0e00)}
DEBUG 16:28:54.547335 server.go:61: Warning: active config could not be saved (open /home/tanabarr/projects/daos_m/utils/config/examples/.daos_server.active.yml: permission denied)
DEBUG 16:28:54.547521 server.go:61: Active config saved to /tmp/.daos_server.active.yml (read-only)
Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: spdk -c 0x1 --file-prefix=spdk238352853 --base-virtaddr=0x200000000000 --proc-type=auto ]
EAL: Detected 96 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /var/run/.spdk238352853_unix
EAL: Probing VFIO support...
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
EAL: PCI device 0000:87:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
EAL: PCI device 0000:da:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
DAOS control server listening on 0.0.0.0:10001
Waiting for I/O server instance storage to be ready...
DEBUG 16:28:59.583546 harness.go:182: /mnt/daos: checking formatting
DEBUG 16:28:59.583562 external.go:163: check if dir /mnt/daos is mounted
DEBUG 16:28:59.583579 harness.go:182: /mnt/daos: needs format (unmounted ramdisk)
DEBUG 16:28:59.583585 harness.go:197: SCM format required
---- CLIENT: ----
[tanabarr@wolf-72 daos_m]$ daos_shell -l wolf-72:10001 -i storage format -f
Active connections: [wolf-72:10001]
This is a destructive operation and storage devices specified in the server config file will be erased.
Please be patient as it may take several minutes.
NVMe storage format results:
wolf-72:10001:
PCI Addr:0000:87:00.0 Status:CTRL_SUCCESS
SCM storage format results:
wolf-72:10001:
Mntpoint:/mnt/daos Status:CTRL_SUCCESS
[tanabarr@wolf-72 daos_m]$
---- SUBSEQUENT SERVER LOG OUTPUT: ----
DEBUG 16:32:03.781773 ctl_storage_rpc.go:249: received StorageFormat RPC; proceeding to instance storage format
formatting storage for I/O server instance 0
DEBUG 16:32:03.781820 ctl_storage_rpc.go:175: /mnt/daos: checking formatting
DEBUG 16:32:03.781832 external.go:163: check if dir /mnt/daos is mounted
DEBUG 16:32:03.781860 ctl_storage_rpc.go:175: /mnt/daos: needs format (unmounted ramdisk)
DEBUG 16:32:03.781875 ctl_storage_rpc.go:204: performing device format on NVMe controllers
DEBUG 16:32:03.781892 ctl_storage_rpc.go:204: formatting nvme controller at 0000:87:00.0, may take several minutes!...
nvme_ctrlr.c: 710:spdk_nvme_ctrlr_reset: *NOTICE*: resetting controller
Formatted NVMe Controller: 0000:87:00.00
DEBUG 16:32:20.665498 ctl_storage_rpc.go:204: controller format successful (0000:87:00.0)
DEBUG 16:32:20.665549 ctl_storage_rpc.go:204: device format on NVMe controllers completed
DEBUG 16:32:20.665557 ctl_storage_rpc.go:214: performing SCM device reset, format and mount
DEBUG 16:32:20.665567 external.go:187: syscall: calling unmount with /mnt/daos, MNT_DETACH
DEBUG 16:32:20.665575 external.go:218: os: removeall /mnt/daos
DEBUG 16:32:20.665609 ctl_storage_rpc.go:214: no scm_size specified in config for ram tmpfs
DEBUG 16:32:20.665614 ctl_storage_rpc.go:214: mounting scm device tmpfs at /mnt/daos (tmpfs)...
DEBUG 16:32:20.665620 external.go:207: os: mkdirall /mnt/daos, 0777
DEBUG 16:32:20.665652 external.go:147: syscall: mount tmpfs, /mnt/daos, tmpfs, 0, size=6g
DEBUG 16:32:20.665788 ctl_storage_rpc.go:214: scm mount complete.
DEBUG 16:32:20.665796 ctl_storage_rpc.go:214: SCM device reset, format and mount completed
DEBUG 16:32:20.665802 ctl_storage_rpc.go:260: nvme formatted: true, scm formatted: true, has superblock: false
storage format successful on server 0
DEBUG 16:32:20.665812 ctl_storage_rpc.go:232: I/O server instance 0 notifying storage ready
I/O server instance 0 storage ready
DEBUG 16:32:20.665883 harness.go:136: /mnt/daos: checking superblock
DEBUG 16:32:20.665889 superblock.go:169: /mnt/daos: checking formatting
DEBUG 16:32:20.665892 external.go:163: check if dir /mnt/daos is mounted
DEBUG 16:32:20.665899 external.go:163: check if dir /mnt/daos is mounted
DEBUG 16:32:20.665903 superblock.go:177: /mnt/daos already mounted
DEBUG 16:32:20.665926 harness.go:136: /mnt/daos: needs superblock (doesn't exist)
DEBUG 16:32:20.668239 external.go:163: check if dir /mnt/daos is mounted
DEBUG 16:32:20.668252 superblock.go:125: /mnt/daos already mounted
DEBUG 16:32:20.668836 ownership.go:82: running as root, changing file ownership to uid/gid 10695475/10695475
DEBUG 16:32:20.668845 external.go:261: os: walk /var/run/daos_server chown 10695475 10695475
DEBUG 16:32:20.668871 external.go:261: os: walk /mnt/daos chown 10695475 10695475
DEBUG 16:32:20.668886 external.go:261: os: walk /tmp/server.log chown 10695475 10695475
formatting complete and file ownership changed,please rerun daos_server as user tanabarr
[root@wolf-72 daos_m]#
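The last few log lines above show the server, still running as root, walking /var/run/daos_server, /mnt/daos and /tmp/server.log and changing their ownership to the configured user before asking to be rerun as that user. A minimal sketch of that kind of recursive ownership change follows. It is not the DAOS implementation (which lives in the Go files named in the log); the function name `chown_tree` is hypothetical, and the behaviour for symlinks and missing paths is an assumption made for this example.

```python
import os


def chown_tree(root, uid, gid):
    """Set the owner of `root` and everything beneath it.

    Returns the list of paths whose ownership was set. Symlinks are
    changed themselves rather than followed (an assumption for this sketch).
    """
    changed = []
    if not os.path.lexists(root):
        return changed  # nothing to do for a missing path

    os.chown(root, uid, gid, follow_symlinks=False)
    changed.append(root)

    # Plain files (such as a log file) have nothing to walk.
    if os.path.isdir(root) and not os.path.islink(root):
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                path = os.path.join(dirpath, name)
                os.chown(path, uid, gid, follow_symlinks=False)
                changed.append(path)
    return changed
```

Run as root, this could be applied to the three paths from the log after looking up the target account, for example with `pwd.getpwnam("tanabarr")` and its `pw_uid` and `pw_gid` fields.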
---- SERVER: ----
[root@wolf-72 daos_m]# daos_server storage prepare -n -u tanabarr -b
Preparing locally-attached NVMe storage...
DEBUG 16:37:44.011805 ctl_storage.go:155: spdk setup with _NRHUGE=1024
DEBUG 16:37:44.011948 ctl_storage.go:155: spdk setup with _TARGET_USER=tanabarr
[root@wolf-72 daos_m]# rm -f /tmp/spdk_pci_lock_0000\:*
[root@wolf-72 daos_m]# su tanabarr
[tanabarr@wolf-72 daos_m]$ orterun -np 1 -H wolf-72 --mca oob ^ud --report-uri $(pwd)/urifile_tanabarr daos_server start -o /home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml -i
[wolf-72.wolf.hpdd.intel.com:441574] pmix_mca_base_component_repository_open: unable to open mca_pnet_opa: libevent_pthreads-2.1.so.6: cannot open shared object file: No such file or directory (ignored)
no control log file specified; logging to stdout
DEBUG 17:15:18.680811 start.go:153: Switching control log level to DEBUG
DEBUG 17:15:18.680894 start.go:167: cfg: &server.Configuration{ControlPort:10001, TransportConfig:(*security.TransportConfig)(0xc00017ac40), Servers:[]*ioserver.Config{(*ioserver.Config)(0xc000086600)}, BdevInclude:[]string(nil), BdevExclude:[]string(nil), NrHugepages:4096, ControlLogMask:3, ControlLogFile:"", ControlLogJSON:false, UserName:"tanabarr", GroupName:"", SystemName:"daos_server", SocketDir:"/var/run/daos_server", Fabric:ioserver.FabricConfig{Provider:"ofi+sockets", Interface:"", InterfacePort:0, PinnedNumaNode:(*uint)(nil)}, Modules:"", Attach:"", AccessPoints:[]string{"localhost"}, FaultPath:"", FaultCb:"", Hyperthreads:false, Path:"/home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml", ext:(*server.ext)(0xc00015c7e0), NvmeShmID:825161406, validateProviderFn:(server.networkProviderValidation)(0x5c0120), validateNUMAFn:(server.networkNUMAValidation)(0x5c0e00)}
DEBUG 17:15:18.681950 server.go:61: Active config saved to /home/tanabarr/projects/daos_m/utils/config/examples/.daos_server.active.yml (read-only)
Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: spdk -c 0x1 --file-prefix=spdk825161406 --base-virtaddr=0x200000000000 --proc-type=auto ]
EAL: Detected 96 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Multi-process socket /home/tanabarr/.spdk825161406_unix
EAL: Probing VFIO support...
EAL: PCI device 0000:81:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
EAL: PCI device 0000:87:00.0 on NUMA socket 1
EAL: probe driver: 8086:953 spdk_nvme
EAL: PCI device 0000:da:00.0 on NUMA socket 1
EAL: probe driver: 8086:2701 spdk_nvme
DEBUG 17:15:22.575504 server.go:107: Warning, SCM Setup: ipmctl module discovery: get_number_of_devices: rc=268
DAOS control server listening on 0.0.0.0:10001
DEBUG 17:15:22.575739 harness.go:136: /mnt/daos: checking superblock
DEBUG 17:15:22.575760 superblock.go:169: /mnt/daos: checking formatting
DEBUG 17:15:22.575797 external.go:163: check if dir /mnt/daos is mounted
DEBUG 17:15:22.575843 external.go:163: check if dir /mnt/daos is mounted
DEBUG 17:15:22.575850 superblock.go:177: /mnt/daos already mounted
DEBUG 17:15:22.576813 exec.go:124: daos_io_server:0 config: &ioserver.Config{Rank:(*ioserver.Rank)(nil), Modules:"", TargetCount:8, HelperStreamCount:2, ServiceThreadCore:0, SystemName:"daos_server", SocketDir:"/var/run/daos_server", AttachInfoPath:"", LogMask:"DEBUG,RPC=ERR,MEM=ERR", LogFile:"/tmp/server.log", Storage:ioserver.StorageConfig{SCM:storage.ScmConfig{MountPoint:"/mnt/daos", Class:"ram", RamdiskSize:6, DeviceList:[]string(nil)}, Bdev:storage.BdevConfig{ConfigPath:"/mnt/daos/daos_nvme.conf", Class:"nvme", DeviceList:[]string{"0000:87:00.0"}, DeviceCount:0, FileSize:0, ShmID:825161406, VosEnv:"", Hostname:"wolf-72.wolf.hpdd.intel.com"}}, Fabric:ioserver.FabricConfig{Provider:"ofi+sockets", Interface:"eth0", InterfacePort:31416, PinnedNumaNode:(*uint)(nil)}, EnvVars:[]string{"ABT_ENV_MAX_NUM_XSTREAMS=100", "ABT_MAX_NUM_XSTREAMS=100", "DAOS_MD_CAP=1024", "CRT_CTX_SHARE_ADDR=0", "CRT_TIMEOUT=30", "FI_SOCKETS_MAX_CONN_RETRY=1", "FI_SOCKETS_CONN_TIMEOUT=2000"}, Index:0x0}
DEBUG 17:15:22.576834 exec.go:124: daos_io_server:0 args: [-t 8 -x 2 -g daos_server -d /var/run/daos_server -s /mnt/daos -n /mnt/daos/daos_nvme.conf -i 825161406 -I 0]
DEBUG 17:15:22.576842 exec.go:124: daos_io_server:0 env: [CRT_PHY_ADDR_STR=ofi+sockets ABT_ENV_MAX_NUM_XSTREAMS=100 ABT_MAX_NUM_XSTREAMS=100 DAOS_MD_CAP=1024 CRT_CTX_SHARE_ADDR=0 D_LOG_FILE=/tmp/server.log CRT_TIMEOUT=30 FI_SOCKETS_MAX_CONN_RETRY=1 FI_SOCKETS_CONN_TIMEOUT=2000 OFI_INTERFACE=eth0 OFI_PORT=31416]
Starting I/O server instance 0: /home/tanabarr/projects/daos_m/install/bin/daos_io_server
daos_io_server:0 Using legacy core allocation algorithm
ERROR: daos_io_server:0 [wolf-72.wolf.hpdd.intel.com:441592] pmix_mca_base_component_repository_open: unable to open mca_pnet_opa: libevent_pthreads-2.1.so.6: cannot open shared object file: No such file or directory (ignored)
daos_io_server:0 Starting SPDK v18.07-pre / DPDK 18.02.0 initialization...
[ DPDK EAL parameters: daos -c 0x1 --file-prefix=spdk825161406 --base-virtaddr=0x200000000000 --proc-type=auto ]
ERROR: daos_io_server:0 EAL: Detected 96 lcore(s)
ERROR: daos_io_server:0 EAL: Auto-detected process type: SECONDARY
daos_io_server:0 EAL: Multi-process socket /home/tanabarr/.spdk825161406_unix_441592_24943e46e6cdbe
daos_io_server:0 EAL: Probing VFIO support...
daos_io_server:0 EAL: PCI device 0000:81:00.0 on NUMA socket 1
daos_io_server:0 EAL: probe driver: 8086:2701 spdk_nvme
daos_io_server:0 EAL: PCI device 0000:87:00.0 on NUMA socket 1
daos_io_server:0 EAL: probe driver: 8086:953 spdk_nvme
daos_io_server:0 EAL: PCI device 0000:da:00.0 on NUMA socket 1
daos_io_server:0 EAL: probe driver: 8086:2701 spdk_nvme
ERROR: daos_io_server:0 bdev_nvme.c: 977:attach_cb: *ERROR*: Failed to assign name to NVMe device
ERROR: daos_io_server:0 bdev_nvme.c: 977:attach_cb: *ERROR*: Failed to assign name to NVMe device
DEBUG 17:15:30.204775 mgmt_drpc.go:111: I/O server instance 0 ready: uri:"ofi+sockets://10.8.1.72:31416" nctxs:18 drpcListenerSock:"/var/run/daos_server/daos_io_server_441592.sock"
DEBUG 17:15:30.204856 harness.go:271: create MS (bootstrap=true)
DEBUG 17:15:30.437692 instance.go:351: start MS
Management Service access point started (bootstrapped)
daos_io_server:0 DAOS I/O server (v0.6.0) process 441592 started on rank 0 (out of 1) with 8 target, 2 helper XS per target, firstcore 0, host wolf-72.wolf.hpdd.intel.com.
---- ANOTHER TTY: ----
[tanabarr@wolf-72 daos_m]$ ps -efl| grep daos
0 S tanabarr 441574 441539 0 80 0 - 90520 poll_s 17:15 pts/1 00:00:00 orterun -np 1 -H wolf-72 --mca oob ^ud --report-uri /home/tanabarr/projects/daos_m/urifile_tanabarr daos_server start -o /home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml -i
0 S tanabarr 441579 441574 2 80 0 - 633513 futex_ 17:15 pts/1 00:00:04 daos_server start -o /home/tanabarr/projects/daos_m/utils/config/examples/daos_server_sockets.yml -i
0 S tanabarr 441592 441579 99 80 0 - 1515030 do_sig 17:15 pts/1 01:04:42 /home/tanabarr/projects/daos_m/install/bin/daos_io_server -t 8 -x 2 -g daos_server -d /var/run/daos_server -s /mnt/daos -n /mnt/daos/daos_nvme.conf -i 825161406 -I 0
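The `ps -efl` listing above is the evidence that orterun, daos_server and daos_io_server all run as tanabarr rather than root. A rough sketch of automating that check is shown below. The helper names `parse_ps_efl` and `all_owned_by` are hypothetical, and the parser assumes the 15-column Linux `ps -efl` layout seen above (F, S, UID, PID, PPID, C, PRI, NI, ADDR, SZ, WCHAN, STIME, TTY, TIME, CMD); `ps` may print a numeric UID for long user names, which this sketch does not resolve.

```python
def parse_ps_efl(line):
    """Parse one `ps -efl` data line into the fields needed for the check."""
    # The first 14 columns are whitespace-separated; the rest is the command.
    fields = line.split(None, 14)
    if len(fields) < 15:
        raise ValueError("unexpected ps -efl line: %r" % line)
    return {
        "uid": fields[2],
        "pid": int(fields[3]),
        "ppid": int(fields[4]),
        "cmd": fields[14],
    }


def all_owned_by(lines, user):
    """Return True if every non-blank process line is owned by `user`."""
    procs = [parse_ps_efl(line) for line in lines if line.strip()]
    return all(proc["uid"] == user for proc in procs)
```

Feeding it the output of `ps -efl | grep daos` (minus the header and the grep process itself) with `user="tanabarr"` would confirm the privilege drop seen in this session.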