I’ve long found that SGE users are perfectly willing to do the right thing when it comes to sharing a computing infrastructure among multiple competing workgroups. What has often been lacking are SGE features, accessible to non-admin users, that give those users more control over how their jobs run and are prioritized.
A very common example of this is a situation where a user will say:
“I need to submit 100,000 jobs but I don’t want to totally take over the cluster and upset my coworkers – can I limit how many of my jobs run at any given time so that resources are left free for others?”
As a Grid Engine consultant, trainer and administrator, I’ve personally found that helping people who want to be “good citizens” can be a challenge. Most of the common SGE methods for limiting or controlling job execution and policies are available only to users with SGE Administrator privileges. As nice as it is to handle one-off cluster resource allocation requests, they can consume a lot of admin time and can occasionally cause problems if people make SGE quota or scheduler changes without tight coordination and planning.
Well, it was undocumented in the initial release, but ever since SGE version 6.2u4 users have been able to limit the concurrent execution of tasks within the array jobs they submit. The syntax looks like:
$ qsub -t 1-20 -tc 5 test.sh
… where the “-tc” argument is new. The example above shows a 20-task array job being submitted with a request to run no more than 5 at any one time.
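The post does not show what test.sh contains, so here is a minimal sketch of what an array-task script could look like. It assumes one input file per task; the input.N.txt naming scheme and the task_input helper are hypothetical and used only for illustration. SGE_TASK_ID is the environment variable Grid Engine sets to the index of each array task.

```shell
#!/bin/sh
# Hypothetical stand-in for test.sh; the original post does not show its contents.

# Hypothetical helper: map a task index to the input file for that task.
task_input() {
    printf 'input.%s.txt\n' "$1"
}

# Grid Engine sets SGE_TASK_ID for each task of an array job.
# Fall back to 1 when the variable is unset so the script can be tried by hand.
task_id="${SGE_TASK_ID:-1}"

echo "task ${task_id} would process $(task_input "${task_id}")"
```

Submitted with the command above, each of the 20 tasks would see its own SGE_TASK_ID value from 1 to 20, and the scheduler would keep no more than 5 of them running at once.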
This feature is now documented as of SGE 6.2u5:
-tc max_running_tasks
allow users to limit concurrent array job task execution.
Parameter max_running_tasks specifies maximum number of simultaneously
running tasks. For example we have running SGE with 10 free slots. We
call qsub -t 1-100 -tc 2 jobscript. Then only 2 tasks will be
scheduled to run even when 8 slots are free.
This is a very welcome addition to Grid Engine, and I suspect it will be popular and well received by the user community.