@ashkurti
Created October 21, 2015 14:43
================================================================================
EnsembleMD (0.3.6)
================================================================================
Starting Allocation ok
Verifying pattern ok
Starting pattern execution ok
--------------------------------------------------------------------------------
Executing simulation-analysis loop with 2 iterations on 16 allocated core(s) on 'xsede.stampede'
Job waiting on queue...
2015-10-21 15:39:48,744: radical.saga.cpi : MainProcess : PilotLauncherWorker-1: ERROR : NoSuccess: Couldn't get job id from submitted job! sbatch output:
-----------------------------------------------------------------
Welcome to the Stampede Supercomputer
-----------------------------------------------------------------
--> Verifying valid submit host (login3)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02998/ardi)...OK
--> Verifying availability of your work dir (/work/02998/ardi)...OK
--> Verifying availability of your scratch dir (/scratch/02998/ardi)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...FAILED
[*] Too many simultaneous jobs in queue.
--> Max job limits for development = 1 jobs
removed `tmp_z8CmwQ.slurm'
2015-10-21 15:39:48,762: radical.pilot : MainProcess : PilotLauncherWorker-1: ERROR : Using bootstrapper /users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/radical/pilot/bootstrapper/default_bootstrapper.sh
Copying bootstrapper 'file://localhost/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/radical/pilot/bootstrapper/default_bootstrapper.sh' to agent sandbox (<saga.filesystem.directory.Directory object at 0x7f6ee0135d50>).
Copying sdist 'file://localhost/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/radical/utils/radical.utils-0.37.tar.gz' to sandbox (sftp://stampede.tacc.utexas.edu/work/02998/ardi/radical.pilot.sandbox/rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013-pilot.0000/).
Copying sdist 'file://localhost/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/saga-python-0.37.tar.gz' to sandbox (sftp://stampede.tacc.utexas.edu/work/02998/ardi/radical.pilot.sandbox/rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013-pilot.0000/).
Copying sdist 'file://localhost/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/radical/pilot/controller/..//radical.pilot-0.37.10.tar.gz' to sandbox (sftp://stampede.tacc.utexas.edu/work/02998/ardi/radical.pilot.sandbox/rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013-pilot.0000/).
Writing agent configuration to file '/tmp/rp_agent_cfg_4wRW2v.json'.
Copying agent configuration file 'file://localhost/tmp/rp_agent_cfg_4wRW2v.json' to sandbox (sftp://stampede.tacc.utexas.edu/work/02998/ardi/radical.pilot.sandbox/rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013-pilot.0000/).
Submitting SAGA job with description: {'Project': 'TG-MCB090174', 'Executable': '/bin/bash', 'TotalPhysicalMemory': None, 'WorkingDirectory': '/work/02998/ardi/radical.pilot.sandbox/rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013-pilot.0000/', 'Queue': 'development', 'Environment': {}, 'WallTimeLimit': 20, 'Arguments': ['-l bootstrap_1.sh', " -d 'radical.utils-0.37.tar.gz:saga-python-0.37.tar.gz:radical.pilot-0.37.10.tar.gz' -m 'create' -p 'pilot.0000' -r 'local' -s 'rp.session.moriarty.pharm.nottingham.ac.uk.ardita.016729.0013' -v '/work/02998/ardi/radical.pilot.sandbox/ve_stampede' -a 'multicore' -e 'module purge' -e 'module load TACC' -e 'module load intel/15.0.2' -e 'module load python/2.7.9' -e 'module unload xalt' -e 'source ~train00/ssi_sourceme' -e 'export TACC_DELETE_FILES=TRUE' -w 'export PATH=$PATH' -w 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH'"], 'ProcessesPerHost': None, 'Error': 'bootstrap_1.err', 'Output': 'bootstrap_1.out', 'TotalCPUCount': 16}
Pilot launching failed! (Couldn't get job id from submitted job! sbatch output:
-----------------------------------------------------------------
Welcome to the Stampede Supercomputer
-----------------------------------------------------------------
--> Verifying valid submit host (login3)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02998/ardi)...OK
--> Verifying availability of your work dir (/work/02998/ardi)...OK
--> Verifying availability of your scratch dir (/scratch/02998/ardi)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...FAILED
[*] Too many simultaneous jobs in queue.
--> Max job limits for development = 1 jobs
removed `tmp_z8CmwQ.slurm'
(/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/slurm/slurm_job.py +620 (_job_run) : " sbatch output:\n%s" % out)))
Traceback (most recent call last):
File "/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/radical/pilot/controller/pilot_launcher_worker.py", line 736, in run
pilotjob.run()
File "/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/job/job.py", line 462, in run
return self._adaptor.run (ttype=ttype)
File "/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/cpi/decorators.py", line 57, in wrap_function
return sync_function (self, *args, **kwargs)
File "/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/slurm/slurm_job.py", line 1207, in run
self._id = self.js._job_run (self.jd)
File "/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/slurm/slurm_job.py", line 620, in _job_run
" sbatch output:\n%s" % out)
NoSuccess: Couldn't get job id from submitted job! sbatch output:
-----------------------------------------------------------------
Welcome to the Stampede Supercomputer
-----------------------------------------------------------------
--> Verifying valid submit host (login3)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02998/ardi)...OK
--> Verifying availability of your work dir (/work/02998/ardi)...OK
--> Verifying availability of your scratch dir (/scratch/02998/ardi)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...FAILED
[*] Too many simultaneous jobs in queue.
--> Max job limits for development = 1 jobs
removed `tmp_z8CmwQ.slurm'
(/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/slurm/slurm_job.py +620 (_job_run) : " sbatch output:\n%s" % out))
2015-10-21 15:39:49,668: radical.enmd.SingleClusterEnvironment: MainProcess : Thread-1 : ERROR : Resource error: Pilot launching failed! (Couldn't get job id from submitted job! sbatch output:
-----------------------------------------------------------------
Welcome to the Stampede Supercomputer
-----------------------------------------------------------------
--> Verifying valid submit host (login3)...OK
--> Verifying valid jobname...OK
--> Enforcing max jobs per user...OK
--> Verifying availability of your home dir (/home1/02998/ardi)...OK
--> Verifying availability of your work dir (/work/02998/ardi)...OK
--> Verifying availability of your scratch dir (/scratch/02998/ardi)...OK
--> Verifying valid ssh keys...OK
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...FAILED
[*] Too many simultaneous jobs in queue.
--> Max job limits for development = 1 jobs
removed `tmp_z8CmwQ.slurm'
(/users/ardita/ExTASY_0.2_Oct_21/lib/python2.7/site-packages/saga/adaptors/slurm/slurm_job.py +620 (_job_run) : " sbatch output:\n%s" % out)))
2015-10-21 15:39:49,669: radical.enmd.SingleClusterEnvironment: MainProcess : Thread-1 : ERROR : Pattern execution FAILED.
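
The log above shows the submission being rejected by the pre-submission checks ("Too many simultaneous jobs in queue", with a limit of 1 job for the development queue) before sbatch ever returned a job id. As a rough aid for reading such output, the sketch below pulls out every check that ended in FAILED together with the "[*]" reason line that follows it. The function name find_failed_checks and the SAMPLE string are hypothetical, and the line format is assumed purely from the output captured in this log; it is not part of saga-python or RADICAL-Pilot.

```python
import re

# Assumed format (taken only from the log above):
#   --> <check description>...FAILED
#   [*] <reason>
FAILED_CHECK = re.compile(r"^-->\s*(.*?)\.\.\.FAILED\s*$")


def find_failed_checks(sbatch_output):
    """Return a list of (check, reason) pairs for each check marked FAILED."""
    failures = []
    lines = sbatch_output.splitlines()
    for i, line in enumerate(lines):
        match = FAILED_CHECK.match(line.strip())
        if not match:
            continue
        # The reason, if present, is on the next line and starts with "[*]".
        reason = ""
        if i + 1 < len(lines):
            following = lines[i + 1].strip()
            if following.startswith("[*]"):
                reason = following[3:].strip()
        failures.append((match.group(1), reason))
    return failures


# Excerpt copied from the sbatch output in the log above.
SAMPLE = """\
--> Verifying access to desired queue (development)...OK
--> Verifying job request is within current queue limits...FAILED
[*] Too many simultaneous jobs in queue.
--> Max job limits for development = 1 jobs
"""

if __name__ == "__main__":
    for check, reason in find_failed_checks(SAMPLE):
        print("%s: %s" % (check, reason))
```

In this run the only failed check is the queue-limit one, so the likely remedy is to wait for (or cancel) the job already sitting in the development queue before resubmitting.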