Discussion:
[torqueusers] let the big one in.
andrealphus
2016-12-09 22:29:08 UTC
As our resource's user base has grown, we're realizing we probably need
to introduce some type of throttling so that a single user isn't using
all the available nodes at any one time.

We're in a mixed situation where about half our users are submitting
job arrays of tiny jobs (1 or 2 processors per job) but with thousands
of jobs in the array.

On the other side, the other half of our users submit a single job,
where that one simulation might need 10-25 nodes to run (we have a
60-node resource).
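
For example, the two submission patterns look roughly like this (script
names and exact counts below are just illustrative):

  # array users: thousands of tiny tasks, 1 or 2 cores each
  qsub -t 1-5000 -l nodes=1:ppn=2 small_task.sh

  # simulation users: one big parallel job spanning many nodes
  qsub -l nodes=20 big_simulation.sh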

We could just set a maxnode limit of 20 nodes, but that doesn't seem to
be a well-received idea.

Is there any way we could set two maxnode resource limits, one for
single jobs and one for job arrays?

Torque Version: 4.2.6.1
Nicholas Lindberg
2016-12-09 22:36:49 UTC
Moab has a fairly robust fair share feature you can enable and tweak. Not sure about Maui. Do you use either of those or do you schedule with Torque?



andrealphus
2016-12-09 22:41:56 UTC
We schedule with Torque....
David Beer
2016-12-09 23:26:02 UTC
I don't know how many policies you intend to use, but the more policies
you need, the more likely you are to be happy with either Maui (free) or
Moab (not free, but you get support). From Torque's perspective, the
server has options such as max_user_queuable and max_user_run. You can
limit arrays with parameters like max_slot_limit and max_job_array_size.
These are all documented here:
http://docs.adaptivecomputing.com/torque/6-1-0/adminGuide/help.htm#topics/torque/13-appendices/serverParameters.htm%3FTocPath%3DAppendices%7C_____2
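
A minimal sketch of those server-level knobs via qmgr (the values here
are illustrative, and availability of individual parameters can vary by
Torque release, so check the documentation for your version):

  # cap how many jobs any one user may have queued server-wide
  qmgr -c "set server max_user_queuable = 2000"

  # cap how many jobs any one user may have running at once
  qmgr -c "set server max_user_run = 100"

  # cap how many sub-jobs of a single array may run concurrently
  qmgr -c "set server max_slot_limit = 20"

  # cap how many sub-jobs a single job array may contain
  qmgr -c "set server max_job_array_size = 5000"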

If you need something really advanced, I'd suggest you look into Moab
and/or Maui, as they offer a lot more control. The other option is to
get creative with Torque: you can also set max_user_queuable limits on
individual queues, so you could create queues that filter jobs by size
and then set per-queue limits for finer-grained control.
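
A rough sketch of the queue-filtering idea, with hypothetical queue
names and limits (how jobs get routed into the right queue, for example
via a routing queue or a submit filter, is left out here):

  # queue for small, array-style jobs (single node)
  qmgr -c "create queue small"
  qmgr -c "set queue small queue_type = Execution"
  qmgr -c "set queue small resources_max.nodect = 1"
  qmgr -c "set queue small max_user_queuable = 5000"
  qmgr -c "set queue small max_user_run = 40"
  qmgr -c "set queue small enabled = true"
  qmgr -c "set queue small started = true"

  # queue for large parallel jobs (2-25 nodes)
  qmgr -c "create queue large"
  qmgr -c "set queue large queue_type = Execution"
  qmgr -c "set queue large resources_min.nodect = 2"
  qmgr -c "set queue large resources_max.nodect = 25"
  qmgr -c "set queue large max_user_run = 1"
  qmgr -c "set queue large enabled = true"
  qmgr -c "set queue large started = true"

The different limits on each queue are what give you the two separate
policies you asked about: array jobs land in the small queue and are
throttled by job count, while the big runs land in the large queue and
are throttled by concurrent jobs per user.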

Cheers,
--
David Beer | Torque Architect
Adaptive Computing
andrealphus
2016-12-09 23:28:56 UTC
Thanks, David, I appreciate the insights!