Discussion:
[torqueusers] (no subject)
Ioannis Botsis
2017-04-27 04:17:32 UTC
Hello there,

I have a cluster of 44 nodes, wn0[01-44].

I use the following script, mypbs.pbs, to submit jobs:
_____
#PBS -N simpledemo
#PBS -l nodes=7
#PBS -q tuc
#PBS -m abe -M ***@isc.tuc.gr
#PBS -k oe

myjob
_____



Instead of allocating 7 nodes, it allocates only node wn007. The same happens
with any other count: if I request N nodes (N between 1 and 44), the job
always gets only the N-th node.



Instead of

#PBS -l nodes=7

I try

#PBS -l nodes=wn002.grid.tuc.gr:ppn=2+wn038.grid.tuc.gr:ppn=2+wn017.grid.tuc.gr:ppn=2+wn020.grid.tuc.gr:ppn=2



It always allocates only the first node in the list, whichever one it is.

The pbsnodes command reports all nodes as running and free.

Any hints?



Ioannis Botsis
John Griffin-Wiesner
2017-04-27 14:03:57 UTC
It's probably defaulting to ppn=1 and giving you 7 cores on the
one node. Try the following where "X" equals the number of cores
on each node.

#PBS -l nodes=7:ppn=X
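
As a rough sketch of what the revised script might look like, the version below assumes 4 cores per node (the 4 is only a placeholder for "X" and should be replaced with the real core count). The helper name list_allocated_hosts is hypothetical and is there only to show which hosts the job actually received, by reading the node file that TORQUE exposes through $PBS_NODEFILE. Whether this changes the allocation may also depend on how the scheduler is configured.

```shell
#!/bin/bash
#PBS -N simpledemo
#PBS -l nodes=7:ppn=4
#PBS -q tuc
#PBS -k oe

# ASSUMPTION: ppn=4 above stands in for "X", the number of cores per node.

# Hypothetical helper: print each distinct host in a node file together
# with the number of slots (lines) assigned on that host.
list_allocated_hosts() {
    sort "$1" | uniq -c
}

# Inside a running job, TORQUE sets PBS_NODEFILE to a file that lists
# one line per allocated slot. Outside a job this step is skipped.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    list_allocated_hosts "$PBS_NODEFILE"
fi

# The real job command (myjob in the original script) would follow here.
```

If the output in the job's stdout file shows seven different host names, the request was spread across nodes; a single host name means all slots still landed on one node.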
_______________________________________________
torqueusers mailing list
http://www.supercluster.org/mailman/listinfo/torqueusers
--
John Griffin-Wiesner
HPC Systems Administrator
Minnesota Supercomputing Institute
http://www.msi.umn.edu
***@msi.umn.edu