In a recent mailing list post, Rui Ramos describes a commonly encountered resource allocation problem:
… I’m making some tests and if i have queue that’s full and have this list of jobs waiting
jobA 4 slots
jobB 1 slot
jobB.1 1 slot
jobB.2 1 slot
…
Let’s say that the jobs of type B are very quick and a user submits 2000 of them. On the other hand, we have a job that requires 4 slots. But each time we have a free slot it starts a job of type B. following this the jobA only executes when all jobB are finished. Unless the GridEngine can make some kind of slot reservation for jobs with higher priority ? Is this native in the N1GE scheduler, do we need to set it up ?
This is a common problem on clusters that run a mix of serial and parallel jobs. The serial jobs zip in and out of the execution slots so quickly that, at any given scheduling interval, there are never enough free slots to satisfy pending parallel jobs that need several slots at once.
The end result is that the larger parallel jobs languish or “starve” in the pending list for very long periods of time.
The mailing list thread contains some useful replies:
Reuti provides a solution:
what you need is “resource reservation”. Just turn on the reservation in the scheduler “qconf -msconf” by setting “max_reservation 20” or an appropriate value and submit the parallel job with “-R y”.
… and Andreas provides a link to the resource reservation specification document that provides more information about Rui’s problem under the heading of “large parallel job starvation problem”:
... Resource reservation can be used to guarantee resources are dedicated
to jobs in jobs priority order. A good example which helps to comprehend
the problem solved with resource reservation/backfilling is the so-called
"large parallel job starvation problem". In this scenario there is one
high priority pending job (possibly parallel) A that requires a larger quota
of a particular resource and a stream of smaller and lower priority jobs B(i)
requiring a smaller quota of the same resource.
Without resource reservation an assignment for A can not be guaranteed
assumed the stream of B(i) jobs does not stop - even if job A actually
has higher priority than the B(i) jobs:
    A
    |
+---+----+--------+--------+--------+--------+--------+   +----------+
|  B(0)  |  B(2)  |  B(4)  |  B(6)  |  B(8)  |  B(10) |   |          |
+---+----+---+----+---+----+---+----+---+----+---+----+---+    A     |
    |  B(1)  |  B(3)  |  B(5)  |  B(7)  |  B(9)  |  B(11) |          |
    +--------+--------+--------+--------+--------+--------+----------+-->
With resource reservation job A gets a reservation that blocks lower
priority B(i) jobs and thus guarantees resources will be available for
A as soon as possible:
    A
    |
+---+----+----------+--------+
|  B(0)  |          |  B(2)  | ...
+---+----+    A     +--------+--------+
    |    |          |  B(1)  |  B(3)  | ...
    +----+----------+--------+--------+------------------------------->
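
To make the difference concrete, here is a minimal toy simulation in Python. It is only a sketch under stated assumptions, not how the Grid Engine scheduler is implemented: time advances in whole ticks, the cluster starts full of 1-slot B jobs with staggered end times, job A needs every slot, and a "reservation" simply holds freed slots idle until A fits (backfilling is not modeled). The function name `simulate` and all of its parameters are hypothetical and chosen for illustration.

```python
def simulate(total_slots, a_slots, b_count, b_runtime, reservation):
    """Return the tick at which the parallel job A can start.

    Toy model only: every B job needs 1 slot and runs for b_runtime ticks;
    b_count B jobs are pending behind the ones already running.
    """
    # Assumption: the cluster starts full, with B jobs ending at staggered ticks.
    running = [1 + (i % b_runtime) for i in range(total_slots)]
    pending_b = b_count
    tick = 0
    while True:
        tick += 1
        # Release the slots of B jobs that have finished by this tick.
        running = [end for end in running if end > tick]
        free = total_slots - len(running)
        if free >= a_slots:
            return tick  # enough slots are free at the same time: A starts
        if not reservation:
            # Greedy dispatch: every free slot goes straight to a pending B job.
            started = min(free, pending_b)
            running += [tick + b_runtime] * started
            pending_b -= started
        # With reservation, freed slots stay idle until A fits.


if __name__ == "__main__":
    common = dict(total_slots=4, a_slots=4, b_count=2000, b_runtime=2)
    print("A starts at tick", simulate(reservation=False, **common))  # 1002
    print("A starts at tick", simulate(reservation=True, **common))   # 2
```

In this toy setup, job A waits until all 2000 B jobs have drained when slots are handed out greedily, but starts as soon as the jobs that were already running finish when freed slots are held for it.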