@ashkurti
Created October 3, 2014 15:42
DEBUG output for the gromacs/lsdmap workflow, which fails on the linux/stampede platform
[ExTASY-toolsOct2] ardita@poirot 201% extasy --RPconfig stampede.rcfg --Kconfig gromacslsdmap.wcfg
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] radical.pilot version: 0.20
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] using database url mongodb://ec2-184-72-89-141.compute-1.amazonaws.com:27017/
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] using database name radicalpilot
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/localhost.json
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/futuregrid.json
2014:10:03 16:41:15 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/lrz.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/epsrc.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/radical.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/das4.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/ncar.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/iu.json
2014:10:03 16:41:16 radical.pilot.MainProcess: [INFO ] Loaded resource configurations from /users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/configs/xsede.json
2014:10:03 16:41:20 radical.pilot.MainProcess: [INFO ] New Session created{'database_url': 'mongodb://ec2-184-72-89-141.compute-1.amazonaws.com:27017/', 'database_name': 'radicalpilot', 'last_reconnect': None, 'uid': '542ec39bf8cdba4854bad298', 'created': datetime.datetime(2014, 10, 3, 15, 41, 16, 536059)}.
Session UID: 542ec39bf8cdba4854bad298
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] Worker thread (ID: Thread-1[139844068046592]) for PilotManager 542ec3a0f8cdba4854bad299 started.
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] Connected to MongoDB. Serving requests for PilotManager 542ec3a0f8cdba4854bad299.
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] saga.utils.PTYShell ('sftp://stampede.tacc.utexas.edu/')
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] PTYShell init <saga.utils.pty_shell.PTYShell object at 0x2cc20d0>
2014:10:03 16:41:20 radical.pilot.MainProcess: [INFO ] PTY prompt pattern: [\$#%>\]]\s*$
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] open master pty for [ssh] [ardi@stampede.tacc.utexas.edu] ardi: /usr/bin/env TERM=vt100 "/usr/bin/ssh" -t -o IdentityFile=/users/ardita/.ssh/id_rsa -o ControlMaster=yes -o ControlPath=/tmp/saga_ssh_ardita_%h_%p.ardi.ctrl ardi@stampede.tacc.utexas.edu'
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] PTYProcess init <saga.utils.pty_process.PTYProcess object at 0x2cc2310>
2014:10:03 16:41:20 radical.pilot.MainProcess: [INFO ] running: /usr/bin/env TERM=vt100 /usr/bin/ssh -t -o IdentityFile=/users/ardita/.ssh/id_rsa -o ControlMaster=yes -o ControlPath=/tmp/saga_ssh_ardita_%h_%p.ardi.ctrl ardi@stampede.tacc.utexas.edu
2014:10:03 16:41:20 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 33] ( export PS1='$' ; set prompt='$'\n)
2014:10:03 16:41:21 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 51] ( export PS1='$' > /dev/null 2>&1 || set prompt='$'\n)
2014:10:03 16:41:21 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 28] ( printf 'HELLO_%d_SAGA\n' 1\n)
2014:10:03 16:41:22 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 51] ( export PS1='$' > /dev/null 2>&1 || set prompt='$'\n)
2014:10:03 16:41:22 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 28] ( printf 'HELLO_%d_SAGA\n' 2\n)
2014:10:03 16:41:22 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 113] (ControlSocket /tmp/saga_ssh_ardita_stampede.tacc.utexas.edu_22.ardi.ctrl already exists, disabling multiplexing\n)
2014:10:03 16:41:23 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 1023] (Last login: Fri Oct 3 05:13:2 ... r-guides/\nUser News: htt)
2014:10:03 16:41:23 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 1024] (p://www.tacc.utexas.edu/user-s ... e has three parallel file syst)
2014:10:03 16:41:23 radical.pilot.MainProcess: [DEBUG ] got initial shell prompt (5) (ControlSocket /tmp/saga_ssh_ardita_stampede.tacc.utexas.edu_22.ardi.ctrl already exists, disabling multiplexing
Last login: Fri Oct 3 05:13:21 2014 from ppapoirot.pharm.nottingham.ac.uk
------------------------------------------------------------------------------
Welcome to the Stampede Supercomputer
Texas Advanced Computing Center, The University of Texas at Austin
------------------------------------------------------------------------------
** Unauthorized use/access is prohibited. **
If you log on to this computer system, you acknowledge your awareness
of and concurrence with the UT Austin Acceptable Use Policy. The
University will prosecute violators to the full extent of the law.
TACC Usage Policies:
http://www.tacc.utexas.edu/user-services/usage-policies/
______________________________________________________________________________
Questions and Problem Reports:
--> XD Projects: help@xsede.org (email)
--> TACC Projects: portal.tacc.utexas.edu (web)
Documentation: http://www.tacc.utexas.edu/user-services/user-guides/
User News: http://www.tacc.utexas.edu/user-services/user-news/
______________________________________________________________________________
Welcome to Stampede, *please* read these important system notes:
--> Stampede is currently running the SLURM resource manager to
schedule all compute resources. Example SLURM job scripts are
available on the system at /share/doc/slurm
To run an interactive shell, issue:
srun -p development -t 0:30:00 -n 32 --pty /bin/bash -l
To submit a batch job, issue: sbatch job.mpi
To show all queued jobs, issue: showq
To kill a queued job, issue: scancel <jobId>
)
2014:10:03 16:41:23 radical.pilot.MainProcess: [DEBUG ] waiting for prompt trigger HELLO_2_SAGA: (5) (ControlSocket /tmp/saga_ssh_ardita_stampede.tacc.utexas.edu_22.ardi.ctrl already exists, disabling multiplexing
Last login: Fri Oct 3 05:13:21 2014 from ppapoirot.pharm.nottingham.ac.uk
------------------------------------------------------------------------------
Welcome to the Stampede Supercomputer
Texas Advanced Computing Center, The University of Texas at Austin
------------------------------------------------------------------------------
** Unauthorized use/access is prohibited. **
If you log on to this computer system, you acknowledge your awareness
of and concurrence with the UT Austin Acceptable Use Policy. The
University will prosecute violators to the full extent of the law.
TACC Usage Policies:
http://www.tacc.utexas.edu/user-services/usage-policies/
______________________________________________________________________________
Questions and Problem Reports:
--> XD Projects: help@xsede.org (email)
--> TACC Projects: portal.tacc.utexas.edu (web)
Documentation: http://www.tacc.utexas.edu/user-services/user-guides/
User News: http://www.tacc.utexas.edu/user-services/user-news/
______________________________________________________________________________
Welcome to Stampede, *please* read these important system notes:
--> Stampede is currently running the SLURM resource manager to
schedule all compute resources. Example SLURM job scripts are
available on the system at /share/doc/slurm
To run an interactive shell, issue:
srun -p development -t 0:30:00 -n 32 --pty /bin/bash -l
To submit a batch job, issue: sbatch job.mpi
To show all queued jobs, issue: showq
To kill a queued job, issue: scancel <jobId>
)
2014:10:03 16:41:23 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 352] (ems: $HOME (permanent,\n quota'd, backed-up) $WORK (permanent, quota'd, not backed-up) and\n $SCRATCH (high-speed purged storage). The "cdw" and "cds" aliases\n are provided as a convenience to change to your $WORK and $SCRATCH\n directories, respectively.\n______________________________________________________________________________\n\n)
2014:10:03 16:41:24 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 244] (----------------------- Project balances for user ardi ------------------------\n| Name Avail SUs Expires | Name Avail SUs Expires |\n| TG-MCB090174 105857 2015-09-30 | TG-TRA140016 -81080 2015-05-06 | \n)
2014:10:03 16:41:24 radical.pilot.MainProcess: [DEBUG ] write: [ 6] [ 28] ( printf 'HELLO_%d_SAGA\n' 2\n)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 405] (-------------------------- Disk quotas for user ardi --------------------------\n| Disk Usage (GB) Limit %Used File Usage Limit %Used |\n| /home1 1.3 5.0 25.49 50693 150000 33.80 |\n| /work 0.0 1024.0 0.00 0 3000000 0.00 |\n-------------------------------------------------------------------------------\n)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 169] (\nTip 33 (See "module help tacc_tips" for features or how to disable)\n\n Execute "module spider" to get a complete list of all installed software on the system.\n\n)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 37] (login1.stampede(1)$ $$HELLO_1_SAGA\n$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] got shell prompt trigger (4) (
See "man slurm" or the Stampede user guide for more detailed information.
--> To see all the software that is available across all compilers and
mpi stacks, issue: "module spider"
--> To see which software packages are available with your currently loaded
compiler and mpi stack, issue: "module avail"
--> Stampede has three parallel file systems: $HOME (permanent,
quota'd, backed-up) $WORK (permanent, quota'd, not backed-up) and
$SCRATCH (high-speed purged storage). The "cdw" and "cds" aliases
are provided as a convenience to change to your $WORK and $SCRATCH
directories, respectively.
______________________________________________________________________________
----------------------- Project balances for user ardi ------------------------
| Name Avail SUs Expires | Name Avail SUs Expires |
| TG-MCB090174 105857 2015-09-30 | TG-TRA140016 -81080 2015-05-06 |
-------------------------- Disk quotas for user ardi --------------------------
| Disk Usage (GB) Limit %Used File Usage Limit %Used |
| /home1 1.3 5.0 25.49 50693 150000 33.80 |
| /work 0.0 1024.0 0.00 0 3000000 0.00 |
-------------------------------------------------------------------------------
Tip 33 (See "module help tacc_tips" for features or how to disable)
Execute "module spider" to get a complete list of all installed software on the system.
login1.stampede(1)$ $$HELLO_1_SAGA)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] got initial shell prompt (5) (
$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] waiting for prompt trigger HELLO_2_SAGA: (5) (
$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] read : [ 6] [ 31] ($HELLO_2_SAGA\n$HELLO_2_SAGA\n$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] got shell prompt trigger (4) ($HELLO_2_SAGA
$HELLO_2_SAGA)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] got initial shell prompt (5) (
$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] Got initial shell prompt (5) (
$)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] PTYProcess init <saga.utils.pty_process.PTYProcess object at 0x2cbc250>
2014:10:03 16:41:26 radical.pilot.MainProcess: [INFO ] running: /usr/bin/env TERM=vt100 /usr/bin/ssh -t -o IdentityFile=/users/ardita/.ssh/id_rsa -o ControlMaster=no -o ControlPath=/tmp/saga_ssh_ardita_%h_%p.ardi.ctrl ardi@stampede.tacc.utexas.edu
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] write: [ 7] [ 33] ( export PS1='$' ; set prompt='$'\n)
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] read : [ 7] [ 55] (Shared connection to stampede.tacc.utexas.edu closed.\n)
Traceback (most recent call last):
File "/users/ardita/ExTASY-toolsOct2/bin/extasy", line 9, in <module>
load_entry_point('radical.ensemblemd.extasy==0.1', 'console_scripts', 'extasy')()
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/ensemblemd/extasy/bin/runme.py", line 113, in main
umgr,session=startPilot(Kconfig,RPconfig)
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/ensemblemd/extasy/bin/runme.py", line 64, in startPilot
pilot = pmgr.submit_pilots(pdesc)
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/pilot_manager.py", line 303, in submit_pilots
resource_config=resource_cfg)
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/radical/pilot/controller/pilot_manager_controller.py", line 357, in register_start_pilot_request
shell = sup.PTYShell (url, self._session, logger, opts={})
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/saga/utils/pty_shell.py", line 228, in __init__
self.pty_shell = self.factory.run_shell (self.pty_info)
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/saga/utils/pty_shell_factory.py", line 418, in run_shell
self._initialize_pty (sh_slave, info, is_shell=True)
File "/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/saga/utils/pty_shell_factory.py", line 384, in _initialize_pty
raise ptye.translate_exception (e)
saga.exceptions.NoSuccess: Insufficient system resources: Insufficient system resources: read from process failed '[Errno 5] Input/output error' : (Shared connection to stampede.tacc.utexas.edu closed.
) ((Shared connection to stampede.tacc.utexas.edu closed.
)) (/users/ardita/ExTASY-toolsOct2/lib/python2.6/site-packages/saga/utils/pty_exceptions.py +59 (translate_exception) : e = se.NoSuccess ("Insufficient system resources: %s" % cmsg))
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] PTYProcess del <saga.utils.pty_process.PTYProcess object at 0x2cbc250>
2014:10:03 16:41:26 radical.pilot.MainProcess: [DEBUG ] PTYShell del <saga.utils.pty_shell.PTYShell object at 0x2cc20d0>