@keisukefukuda
Created April 12, 2019 09:37
Fri Apr 12 07:58:03 UTC 2019
================================================================================
In process.sh:
CUDA_VERSION = 9.0
PYTHON_VERSION = 2.7.15
OPENMPI_VERSION = 2.1.3
Chainer = https://github.com/chainer/chainer@master
CuPy = https://github.com/cupy/cupy@master
OMPI_COMM_WORLD_LOCAL_RANK = 1
OMPI_COMM_WORLD_RANK = 3
OMPI_COMM_WORLD_NODE_RANK = 1
OMPI_COMM_WORLD_SIZE = 4
OMPI_COMM_WORLD_LOCAL_SIZE = 2
host = kokona-job000679-worker-1
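(The OMPI_* variables above describe a 4-process job with two processes per node, so each process is expected to drive one of the node's two V100 GPUs. As a minimal sketch of the usual ChainerMN device-assignment pattern implied by this layout; the communicator name below is illustrative and not taken from process.sh:

    import chainer
    import chainermn

    # Each MPI process picks the GPU that matches its intra-node rank, so the
    # two processes on kokona-job000679-worker-1 use GPU 0 and GPU 1.
    comm = chainermn.create_communicator('hierarchical')  # name is an assumption
    device_id = comm.intra_rank            # 1 for this process (local rank 1)
    chainer.cuda.get_device_from_id(device_id).use()
    print('global rank %d -> local GPU %d' % (comm.rank, device_id))
)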
/usr/local/cuda-9.0/bin:/usr/local/pyenv/shims:/usr/local/cuda-9.2/bin:/bin:/usr/bin:/usr/local/pyenv/bin:/usr/local/openmpi-2.1.3/bin:/usr/local/openmpi-2.1.3/bin:/usr/local/pyenv/shims:/usr/local/cuda-9.2/bin:/bin:/usr/bin:/usr/local/pyenv/bin:/usr/local/pyenv/shims:/usr/local/cuda-9.2/bin:/bin:/usr/bin:/usr/local/pyenv/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
/usr/local/cuda-9.2/bin/nvcc
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Tue_Jun_12_23:07:04_CDT_2018
Cuda compilation tools, release 9.2, V9.2.148
nvidia-smi
Fri Apr 12 07:58:04 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... On | 00000000:3F:00.0 Off | 0 |
| N/A 33C P0 26W / 250W | 11MiB / 32480MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... On | 00000000:40:00.0 Off | 0 |
| N/A 34C P0 27W / 250W | 11MiB / 32480MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
Fri Apr 12 07:58:04 UTC 2019
================================================================================
Setup Python, MPI, CuDNN, etc.
================================================================================
Python 2.7.15
Fri Apr 12 07:58:04 UTC 2019
================================================================================
Install Chainer / CuPy
================================================================================
+ export CPATH=/cudnnenv/.cudnn/active/cuda/include:/cudnnenv/active/cuda/include:
+ CPATH=/cudnnenv/.cudnn/active/cuda/include:/cudnnenv/active/cuda/include:
+ export LD_LIBRARY_PATH=/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/openmpi-2.1.3/lib:/cudnnenv/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
+ LD_LIBRARY_PATH=/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/openmpi-2.1.3/lib:/cudnnenv/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
+ export LIBRARY_PATH=/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/cuda/lib64/stubs
+ LIBRARY_PATH=/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/cuda/lib64/stubs
Waiting for Chainer repository to be set up...
+ export CPATH=/usr/local/nccl/2.4.2-1/include:/cudnnenv/.cudnn/active/cuda/include:/cudnnenv/active/cuda/include:
+ CPATH=/usr/local/nccl/2.4.2-1/include:/cudnnenv/.cudnn/active/cuda/include:/cudnnenv/active/cuda/include:
+ export LD_LIBRARY_PATH=/usr/local/nccl/2.4.2-1/lib:/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/openmpi-2.1.3/lib:/cudnnenv/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
+ LD_LIBRARY_PATH=/usr/local/nccl/2.4.2-1/lib:/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/openmpi-2.1.3/lib:/cudnnenv/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
+ export LIBRARY_PATH=/usr/local/nccl/2.4.2-1/lib:/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/cuda/lib64/stubs
+ LIBRARY_PATH=/usr/local/nccl/2.4.2-1/lib:/cudnnenv/.cudnn/active/cuda/lib64:/cudnnenv/active/cuda/lib64:/usr/local/cuda/lib64/stubs
+ '[' 1 -eq 0 ']'
+ echo 'Waiting for Chainer repository to be set up...'
+ sleep 30
+ true
+ '[' -f /tmp/chainer_install_done ']'
Still waiting for /tmp/chainer_install_done to be created
+ echo 'Still waiting for /tmp/chainer_install_done to be created'
+ sleep 5
+ true
+ '[' -f /tmp/chainer_install_done ']'
Still waiting for /tmp/chainer_install_done to be created
+ echo 'Still waiting for /tmp/chainer_install_done to be created'
+ sleep 5
+ true
+ '[' -f /tmp/chainer_install_done ']'
Still waiting for /tmp/chainer_install_done to be created
+ echo 'Still waiting for /tmp/chainer_install_done to be created'
+ sleep 5
+ true
+ '[' -f /tmp/chainer_install_done ']'
Found /tmp/chainer_install_done file
+ echo 'Found /tmp/chainer_install_done file'
+ break
+ set -e
+ date
Fri Apr 12 07:58:49 UTC 2019
+ cat
================================================================================
Chainer software versions:
================================================================================
+ echo 'Chainer versions:'
Chainer versions:
+ python -c 'import chainer; print('\''Chainer '\'', chainer.__version__)'
('Chainer ', '6.0.0rc1')
+ python -c 'import cupy; print('\''Cupy '\'', cupy.__version__)'
('Cupy ', '6.0.0rc1')
+ python -c 'import chainer; chainer.print_runtime_info()'
Platform: Linux-4.4.0-116-generic-x86_64-with-debian-stretch-sid
Chainer: 6.0.0rc1
NumPy: 1.16.2
CuPy:
  CuPy Version          : 6.0.0rc1
  CUDA Root             : /usr/local/cuda-9.2
  CUDA Build Version    : 9020
  CUDA Driver Version   : 10000
  CUDA Runtime Version  : 9020
  cuDNN Build Version   : 7500
  cuDNN Version         : 7500
  NCCL Build Version    : 2402
  NCCL Runtime Version  : 2402
iDeep: Not Available
+ date
Fri Apr 12 07:58:54 UTC 2019
+ echo =====================================================================
=====================================================================
+ echo ' ibstat'
ibstat
+ echo =====================================================================
=====================================================================
+ ibstat
CA 'mlx5_0'
    CA type: MT4115
    Number of ports: 1
    Firmware version: 12.23.1000
    Hardware version: 0
    Node GUID: 0x506b4b03001c9e1a
    System image GUID: 0x506b4b03001c9e1a
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 351
        LMC: 0
        SM lid: 1
        Capability mask: 0x2651e848
        Port GUID: 0x506b4b03001c9e1a
        Link layer: InfiniBand
CA 'mlx5_1'
    CA type: MT4115
    Number of ports: 1
    Firmware version: 12.23.1000
    Hardware version: 0
    Node GUID: 0x506b4b03001c9dce
    System image GUID: 0x506b4b03001c9dce
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 334
        LMC: 0
        SM lid: 1
        Capability mask: 0x2651e848
        Port GUID: 0x506b4b03001c9dce
        Link layer: InfiniBand
+ date
Fri Apr 12 07:58:54 UTC 2019
+ echo =====================================================================
+ echo ' nvidia-smi'
=====================================================================
nvidia-smi
+ echo =====================================================================
=====================================================================
+ nvidia-smi
Fri Apr 12 07:58:54 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79 Driver Version: 410.79 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla V100-PCIE... On | 00000000:3F:00.0 Off | 0 |
| N/A 33C P0 26W / 250W | 11MiB / 32480MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 1 Tesla V100-PCIE... On | 00000000:40:00.0 Off | 0 |
| N/A 34C P0 27W / 250W | 11MiB / 32480MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| No running processes found |
+-----------------------------------------------------------------------------+
+ nvidia-smi -x -q
+ grep uuid
<uuid>GPU-f2688668-b481-6666-6305-fc9f28487066</uuid>
<uuid>GPU-bfc28117-c095-cf68-cf17-14c444e92540</uuid>
+ date
Fri Apr 12 07:58:57 UTC 2019
+ echo =====================================================================
=====================================================================
+ echo ' chainermn-micro-benchmark'
chainermn-micro-benchmark
+ echo =====================================================================
=====================================================================
+ date
Fri Apr 12 07:58:57 UTC 2019
+ cat
================================================================================
Main task
================================================================================
+ cd /chainer
+ case $KOKONA_TARGET in
+ TIMEOUT=7200
+ timeout -s KILL -k 30 7200 python -m pytest --color=yes --full-trace --duration=10 -x --capture=no -s -v -m 'not slow' tests/chainermn_tests
============================= test session starts ==============================
platform linux2 -- Python 2.7.15, pytest-4.4.0, py-1.8.0, pluggy-0.9.0 -- /usr/local/pyenv/versions/2.7.15/bin/python
cachedir: .pytest_cache
rootdir: /chainer, inifile: setup.cfg
collecting ... mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted

collecting 0 items 
collecting 174 items 
collected 266 items / 4 deselected / 262 selected 
tests/chainermn_tests/communicator_tests/test_communication_utility.py::TestCommunicationUtility::test_chunked_bcast_objs PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_cpu[param0] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param0] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param1] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param2] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param3] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param4] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param5] PASSED
tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param6] FAILED
=================================== FAILURES ===================================
________________________ test_communicator_gpu[param6] _________________________
cls = <class '_pytest.runner.CallInfo'>
func = <function <lambda> at 0x7efd48829cf8>, when = 'call'
reraise = (<class '_pytest.outcomes.Exit'>, <type 'exceptions.KeyboardInterrupt'>)
    @classmethod
    def from_call(cls, func, when, reraise=None):
        #: context of invocation: one of "setup", "call",
        #: "teardown", "memocollect"
        start = time()
        excinfo = None
        try:
>           result = func()
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/runner.py:226:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>       lambda: ihook(item=item, **kwds), when=when, reraise=reraise
        )
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/runner.py:198:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_HookCaller 'pytest_runtest_call'>, args = ()
kwargs = {'item': <Function test_communicator_gpu[param6]>}, notincall = set([])
    def __call__(self, *args, **kwargs):
        if args:
            raise TypeError("hook calling supports only keyword arguments")
        assert not self.is_historic()
        if self.spec and self.spec.argnames:
            notincall = (
                set(self.spec.argnames) - set(["__multicall__"]) - set(kwargs.keys())
            )
            if notincall:
                warnings.warn(
                    "Argument(s) {} which are declared in the hookspec "
                    "can not be found in this hook call".format(tuple(notincall)),
                    stacklevel=2,
                )
>       return self._hookexec(self, self.get_hookimpls(), kwargs)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/hooks.py:289:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_pytest.config.PytestPluginManager object at 0x7efd5d434850>
hook = <_HookCaller 'pytest_runtest_call'>
methods = [<HookImpl plugin_name='runner', plugin=<module '_pytest.runner' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/...u[param6]>>>, <HookImpl plugin_name='logging-plugin', plugin=<_pytest.logging.LoggingPlugin object at 0x7efd5d73aa90>>]
kwargs = {'item': <Function test_communicator_gpu[param6]>}
    def _hookexec(self, hook, methods, kwargs):
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook, methods, kwargs)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/manager.py:68:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
hook = <_HookCaller 'pytest_runtest_call'>
methods = [<HookImpl plugin_name='runner', plugin=<module '_pytest.runner' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/...u[param6]>>>, <HookImpl plugin_name='logging-plugin', plugin=<_pytest.logging.LoggingPlugin object at 0x7efd5d73aa90>>]
kwargs = {'item': <Function test_communicator_gpu[param6]>}
        self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
            methods,
            kwargs,
>           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
        )
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/manager.py:62:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
item = <Function test_communicator_gpu[param6]>
    def pytest_runtest_call(item):
        _update_current_test_var(item, "call")
        sys.last_type, sys.last_value, sys.last_traceback = (None, None, None)
        try:
>           item.runtest()
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/runner.py:123:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Function test_communicator_gpu[param6]>
    def runtest(self):
        """ execute the underlying test function. """
>       self.ihook.pytest_pyfunc_call(pyfuncitem=self)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/python.py:1464:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_HookCaller 'pytest_pyfunc_call'>, args = ()
kwargs = {'pyfuncitem': <Function test_communicator_gpu[param6]>}
notincall = set([])
    def __call__(self, *args, **kwargs):
        if args:
            raise TypeError("hook calling supports only keyword arguments")
        assert not self.is_historic()
        if self.spec and self.spec.argnames:
            notincall = (
                set(self.spec.argnames) - set(["__multicall__"]) - set(kwargs.keys())
            )
            if notincall:
                warnings.warn(
                    "Argument(s) {} which are declared in the hookspec "
                    "can not be found in this hook call".format(tuple(notincall)),
                    stacklevel=2,
                )
>       return self._hookexec(self, self.get_hookimpls(), kwargs)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/hooks.py:289:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <_pytest.config.PytestPluginManager object at 0x7efd5d434850>
hook = <_HookCaller 'pytest_pyfunc_call'>
methods = [<HookImpl plugin_name='python', plugin=<module '_pytest.python' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/...=<module '_pytest.skipping' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/skipping.pyc'>>]
kwargs = {'pyfuncitem': <Function test_communicator_gpu[param6]>}
    def _hookexec(self, hook, methods, kwargs):
        # called from all hookcaller instances.
        # enable_tracing will set its own wrapping function at self._inner_hookexec
>       return self._inner_hookexec(hook, methods, kwargs)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/manager.py:68:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
hook = <_HookCaller 'pytest_pyfunc_call'>
methods = [<HookImpl plugin_name='python', plugin=<module '_pytest.python' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/...=<module '_pytest.skipping' from '/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/skipping.pyc'>>]
kwargs = {'pyfuncitem': <Function test_communicator_gpu[param6]>}
        self._inner_hookexec = lambda hook, methods, kwargs: hook.multicall(
            methods,
            kwargs,
>           firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
        )
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/pluggy/manager.py:62:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyfuncitem = <Function test_communicator_gpu[param6]>
    @hookimpl(trylast=True)
    def pytest_pyfunc_call(pyfuncitem):
        testfunction = pyfuncitem.obj
        iscoroutinefunction = getattr(inspect, "iscoroutinefunction", None)
        if iscoroutinefunction is not None and iscoroutinefunction(testfunction):
            msg = "Coroutine functions are not natively supported and have been skipped.\n"
            msg += "You need to install a suitable plugin for your async framework, for example:\n"
            msg += " - pytest-asyncio\n"
            msg += " - pytest-trio\n"
            msg += " - pytest-tornasync"
            warnings.warn(PytestWarning(msg.format(pyfuncitem.nodeid)))
            skip(msg="coroutine function and no async plugin installed (see warnings)")
        funcargs = pyfuncitem.funcargs
        testargs = {arg: funcargs[arg] for arg in pyfuncitem._fixtureinfo.argnames}
>       testfunction(**testargs)
/usr/local/pyenv/versions/2.7.15/lib/python2.7/site-packages/_pytest/python.py:178:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
param = {'allreduce_grad_dtype': None,
'batched_copy': False,
'communicator_class': ...one,
'gpu': False,
'model_dtype': None,
'multi_node': True,
'nccl1': False}
    @pytest.mark.parametrize('param', gpu_params)
    @chainer.testing.attr.gpu
    def test_communicator_gpu(param):
        check_send_recv(param, True)
>       check_collective_communication(param, True)
tests/chainermn_tests/communicator_tests/test_communicator.py:476:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
param = {'allreduce_grad_dtype': None,
'batched_copy': False,
'communicator_class': ...one,
'gpu': False,
'model_dtype': None,
'multi_node': True,
'nccl1': False}
use_gpu = True
    def check_collective_communication(param, use_gpu):
        communicator = create_communicator(param, use_gpu)
        mpi_comm.barrier()

        model = ExampleModel(param.model_dtype)
        if use_gpu:
            model.to_gpu()
        check_bcast_data(communicator, model)
>       check_allreduce_grad(communicator, model)
tests/chainermn_tests/communicator_tests/test_communicator.py:444:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
communicator = <chainermn.communicators.two_dimensional_communicator.TwoDimensionalCommunicator object at 0x7efd4882dc50>
model = <test_communicator.ExampleModel object at 0x7efd4882dbd0>
    def check_allreduce_grad(communicator, model):
        # We need to repeat twice for regressions on lazy initialization of
        # sub communicators.

        for _ in range(2):
            model.a.W.grad[:] = communicator.rank
            model.b.W.grad[:] = communicator.rank + 1
            model.c.b.grad[:] = communicator.rank + 2

            communicator.allreduce_grad(model)
            base = (communicator.size - 1.0) / 2

            chainer.testing.assert_allclose(model.a.W.grad,
                                            (base + 0) * np.ones((3, 2)))
            chainer.testing.assert_allclose(model.b.W.grad,
                                            (base + 1) * np.ones((4, 3)))
            chainer.testing.assert_allclose(model.c.b.grad,
>                                           (base + 2) * np.ones((5, )))
tests/chainermn_tests/communicator_tests/test_communicator.py:316:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
x = array([3.5, 3.5, 3.5, 5. , 5. ], dtype=float32)
y = array([3.5, 3.5, 3.5, 3.5, 3.5]), atol = 1e-05, rtol = 0.0001
verbose = True
    def assert_allclose(x, y, atol=1e-5, rtol=1e-4, verbose=True):
        """Asserts if some corresponding element of x and y differs too much.

        This function can handle both CPU and GPU arrays simultaneously.

        Args:
            x: Left-hand-side array.
            y: Right-hand-side array.
            atol (float): Absolute tolerance.
            rtol (float): Relative tolerance.
            verbose (bool): If ``True``, it outputs verbose messages on error.

        """
        x = backend.CpuDevice().send(utils.force_array(x))
        y = backend.CpuDevice().send(utils.force_array(y))
        try:
            numpy.testing.assert_allclose(
                x, y, atol=atol, rtol=rtol, verbose=verbose)
        except AssertionError as e:
            f = six.StringIO()
            f.write(str(e) + '\n\n')
            f.write(
                'assert_allclose failed: \n' +
                '  shape: {} {}\n'.format(x.shape, y.shape) +
                '  dtype: {} {}\n'.format(x.dtype, y.dtype))
            if x.shape == y.shape:
                xx = numpy.atleast_1d(x)
                yy = numpy.atleast_1d(y)
                err = numpy.abs(xx - yy)
                tol_err = atol + rtol * numpy.abs(yy).astype(numpy.float64)
                i = numpy.unravel_index(
                    numpy.argmax(err.astype(numpy.float64) - tol_err), err.shape)
                if yy[i] == 0:
                    rel_err = 'inf'
                else:
                    rel_err = err[i] / numpy.abs(yy[i])
                f.write(
                    '  i: {}\n'.format(i) +
                    '  x[i]: {}\n'.format(xx[i]) +
                    '  y[i]: {}\n'.format(yy[i]) +
                    '  relative error[i]: {}\n'.format(rel_err) +
                    '  absolute error[i]: {}\n'.format(err[i]))
            opts = numpy.get_printoptions()
            try:
                numpy.set_printoptions(threshold=10000)
                f.write('x: ' + numpy.array2string(x, prefix='x: ') + '\n')
                f.write('y: ' + numpy.array2string(y, prefix='y: ') + '\n')
            finally:
                numpy.set_printoptions(**opts)
>           raise AssertionError(f.getvalue())
E AssertionError: 
E Not equal to tolerance rtol=0.0001, atol=1e-05
E 
E Mismatch: 40%
E Max absolute difference: 1.5
E Max relative difference: 0.42857143
E x: array([3.5, 3.5, 3.5, 5. , 5. ], dtype=float32)
E y: array([3.5, 3.5, 3.5, 3.5, 3.5])
E 
E assert_allclose failed: 
E shape: (5,) (5,)
E dtype: float32 float64
E i: (3,)
E x[i]: 5.0
E y[i]: 3.5
E relative error[i]: 0.428571428571
E absolute error[i]: 1.5
E x: [3.5 3.5 3.5 5. 5. ]
E y: [3.5 3.5 3.5 3.5 3.5]
chainer/testing/array.py:59: AssertionError
========================== slowest 10 test durations ===========================
4.89s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param0]
2.30s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param1]
1.61s call tests/chainermn_tests/communicator_tests/test_communication_utility.py::TestCommunicationUtility::test_chunked_bcast_objs
0.99s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param4]
0.23s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_cpu[param0]
0.22s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param5]
0.20s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param2]
0.20s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param6]
0.10s call tests/chainermn_tests/communicator_tests/test_communicator.py::test_communicator_gpu[param3]
(0.00 durations hidden. Use -vv to show these durations.)
============== 1 failed, 8 passed, 4 deselected in 13.16 seconds ===============
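For reference, the expected values in the assert_allclose failure above follow directly from the averaging done in check_allreduce_grad. A minimal NumPy sketch of that arithmetic, assuming the 4 MPI ranks reported by OMPI_COMM_WORLD_SIZE at the top of this log:

    import numpy as np

    # check_allreduce_grad fills model.c.b.grad with (rank + 2) on each rank,
    # then allreduce_grad averages over ranks, so every element should become
    # mean(0, 1, 2, 3) + 2 = 1.5 + 2 = 3.5.
    size = 4
    base = (size - 1.0) / 2                 # 1.5, as computed in the test
    expected = (base + 2) * np.ones((5,))   # [3.5 3.5 3.5 3.5 3.5]

    # What this process (OMPI_COMM_WORLD_RANK = 3) holds if its buffer is
    # never reduced at all: 3 + 2 = 5.
    unreduced = (3 + 2) * np.ones((5,))     # [5. 5. 5. 5. 5.]

The observed array [3.5, 3.5, 3.5, 5., 5.] matches the expectation only in its first three elements; the trailing 5.0 entries equal this rank's own unreduced contribution, which suggests that part of model.c.b.grad was not reduced across all four ranks by the TwoDimensionalCommunicator.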
------------------------------------------------------------
Error occurred on /process.sh [Line 303]: Status 1
PID: 15
Current directory: /chainer
Command line: /process.sh
------------------------------------------------------------
++ onerror 303
++ status=1
++ script=/process.sh
++ line=303
++ shift
++ args=
++ echo ''
++ echo ------------------------------------------------------------
++ echo 'Error occurred on /process.sh [Line 303]: Status 1'
++ echo ''
++ echo 'PID: 15'
+++ id
++ echo 'Current directory: /chainer'
++ echo 'Command line: /process.sh '
++ echo ------------------------------------------------------------
++ echo ''