Discussion:
[torqueusers] mismatch cpuset vs. pbsstat/qstat output with regard to allocated cores
Thomas Zeiser
2017-03-07 15:17:42 UTC
Hello,

we are running torque-6.0.2 on one of our clusters with cgroups
enabled. The scheduler is Maui. I just observed that there is, in
some cases, a mismatch between the qstat/pbsnodes output and the
cores actually allocated on the node:

/sys/fs/cgroup/cpuset/torque/112998.tgadm1/cpuset.cpus:0-3
/sys/fs/cgroup/cpuset/torque/113051.tgadm1/cpuset.cpus:8-15
/sys/fs/cgroup/cpuset/torque/113072.tgadm1/cpuset.cpus:4-7
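For reference, the per-job cpuset assignments above can be collected and expanded programmatically; the following is a minimal sketch (the `parse_cpus` helper is mine, and the cgroup path is simply the one shown above):

```python
import glob

def parse_cpus(spec):
    """Expand a cpuset.cpus string like '0-3' or '0-3,8' into a set of core IDs."""
    cores = set()
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        cores.update(range(int(lo), int(hi or lo) + 1))
    return cores

# Walk the per-job cpusets created by pbs_mom (path as on our nodes;
# the glob simply finds nothing on machines without these cgroups).
for path in sorted(glob.glob("/sys/fs/cgroup/cpuset/torque/*/cpuset.cpus")):
    jobid = path.split("/")[-2]
    with open(path) as f:
        print(jobid, sorted(parse_cpus(f.read().strip())))
```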

pbsnodes tg036
tg036
state = job-exclusive
power_state = Running
np = 16
properties = tiny
ntype = cluster
jobs = 0-3/112998.tgadm1,4-11/113051.tgadm1,12-15/113072.tgadm1
status = rectime=1488899215,macaddr=00:00:7a:e4:99:a4,cpuclock=OnDemand:2101MHz,varattr=,jobs=112998.tgadm1(cput=64117,energy_used=0,mem=493632kb,vmem=145076kb,walltime=16047,session_id=26960) 113051.tgadm1(cput=1812,energy_used=0,mem=23552000kb,vmem=98193120kb,walltime=2744,Error_Path=/dev/pts/0,Output_Path=/dev/pts/0,session_id=10220) 113072.tgadm1(cput=1388,energy_used=0,mem=361940kb,vmem=14687900kb,walltime=1400,session_id=12022),state=free,netload=32984659554,gres=,loadave=6.03,ncpus=16,physmem=65850036kb,availmem=128203960kb,totmem=131490480kb,idletime=347,nusers=3,nsessions=6,sessions=26960 9638 9740 10220 10973 12022,uname=Linux tg036 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64,opsys=linux
mom_service_port = 15002
mom_manager_port = 15003
total_sockets = 2
total_numa_nodes = 2
total_cores = 16
total_threads = 16
dedicated_sockets = 0
dedicated_numa_nodes = 0
dedicated_cores = 0
dedicated_threads = 16

qstat -an
112998.tgadm1 aaaa16 work iso 26960 1 4 15500mb 24:00:00 R 04:34:39
tg036/0-3
113051.tgadm1 aaaa15 work STDIN 10220 1 8 23000mb 23:00:00 R 00:52:56
tg036/4-11
113072.tgadm1 aaaa14 work sim 10220 1 4 15500mb 03:00:00 R 00:32:56
tg036/12-15


That is, 113051 actually runs on cores 8-15 while pbsnodes/qstat claim
4-11, and 113072 runs on cores 4-7 while pbsnodes/qstat claim 12-15.

It seems that pbsnodes/qstat assign the core ranges linearly in
submission order, while the actual cpuset placement tries to stay
within a NUMA domain.
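To make the mismatch concrete, here is a small cross-check of the two views using the values pasted above (a sketch only; it assumes one contiguous core range per job in the `jobs =` field, as in this output):

```python
# Reported ranges from the pbsnodes "jobs =" line vs. the actual
# cpuset.cpus contents, using the values observed on tg036.
pbsnodes_jobs = "0-3/112998.tgadm1,4-11/113051.tgadm1,12-15/113072.tgadm1"
cpuset_cpus = {
    "112998.tgadm1": "0-3",
    "113051.tgadm1": "8-15",
    "113072.tgadm1": "4-7",
}

def expand(spec):
    """Expand a core-range string like '4-11' into a set of core IDs."""
    lo, _, hi = spec.partition("-")
    return set(range(int(lo), int(hi or lo) + 1))

mismatches = []
for entry in pbsnodes_jobs.split(","):
    ranges, jobid = entry.split("/")
    if expand(ranges) != expand(cpuset_cpus[jobid]):
        mismatches.append(jobid)

print("mismatched jobs:", mismatches)
# Jobs 113051 and 113072 are reported on different cores than they use.
```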

Any comments?



Regards

Thomas
Thomas Zeiser
2017-03-21 09:51:29 UTC
Hello,

did no one else observe the discrepancy I described in my original message?

The Torque 6.0.3 release notes (torqueReleaseNotes6.0.3.pdf) list among
the resolved issues: "Torque was not allocating resources correctly for
cgroup jobs. (TRQ-3790)". However, I cannot find a related commit in the
GitHub repository. Can anyone elaborate on the fix?


Regards

Thomas