Queue Configuration

The scheduler on CyEnce is Torque, a follow-on to OpenPBS (Open Portable Batch System). It is an open-source product.

Job scheduling runs through Torque, Maui, and a local metascheduler. The metascheduler differentiates scheduling priority between groups with different account balances; in particular, it gives more resources to groups with a positive account balance than to groups with a negative account balance.

A user can see his/her account balance by issuing the command:

  /usr/local/bin/account_balance

Details of how the balance is calculated can be found at:

Allocation Policies

For groups with a negative balance

Queue Limits (limits for groups with a negative account balance are listed under Negative Group Balance)

              Positive Group Balance                                              Negative Group Balance
Queue Name    Max Time/Job  Max Nodes  Max Nodes  Max Jobs  Max Jobs  Max Jobs    Max Nodes  Max Jobs  Max Jobs  Max Jobs
              (hours)       /Job       /Queue     /Queue    /User     /Group      /Queue     /Queue    /User     /Group
long_large    504           100        128        4         3         4           100        1         1         1
long_medium   168           8          35         4         3         4           16         2         1         2
long_2node    1008          2          6          4         3         4           4          2         1         1
short_large   1             100        100        3         2         3           100        1         1         1
short_medium  24            8          64         14        13        14          16         2         1         1
short_2node   96            2          32         8         7         8           4          2         1         1
small         4             16         64         6         5         6           64         16        1         1
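
As a rough illustration of how the per-job limits in the table can be used, the following Python sketch lists the queues that would accept a job of a given size and duration. It is not part of the CyEnce tooling: the names QueueLimit, QUEUE_LIMITS, and queues_that_fit are hypothetical, and it assumes that Max Time/Job and Max Nodes/Job apply regardless of the group's balance. The per-queue, per-user, and per-group job counts are not modeled.

```python
from typing import List, NamedTuple


class QueueLimit(NamedTuple):
    max_hours: int      # "Max Time/Job (hours)" column
    max_nodes_job: int  # "Max Nodes/Job" column


# Values copied from the first two limit columns of the table above.
QUEUE_LIMITS = {
    "long_large":   QueueLimit(504, 100),
    "long_medium":  QueueLimit(168, 8),
    "long_2node":   QueueLimit(1008, 2),
    "short_large":  QueueLimit(1, 100),
    "short_medium": QueueLimit(24, 8),
    "short_2node":  QueueLimit(96, 2),
    "small":        QueueLimit(4, 16),
}


def queues_that_fit(nodes: int, hours: float) -> List[str]:
    """Return the queues whose per-job limits admit the request."""
    return [
        name
        for name, limit in QUEUE_LIMITS.items()
        if nodes <= limit.max_nodes_job and hours <= limit.max_hours
    ]


if __name__ == "__main__":
    # A 16-node, 4-hour job fits only the long_large and small queues.
    print(queues_that_fit(16, 4))
```

For example, a 100-node job of one hour fits only long_large and short_large, since every other queue caps the node count per job below 100.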

Adjustment to above Limits when CyEnce is under-utilized

To maximize system utilization, when the system is relatively idle, short-duration
jobs are allowed to run even beyond the above limits.

For now, this is accomplished as follows.
Define High Priority Groups (HPGs) as those with a positive allocation.

Every 5 minutes, the following checks are run, and one job per user
may be started under the following rules:

IF jobs from HPGs are waiting for compute nodes
THEN the following jobs are run:
  jobs which can run and leave at least 32 nodes free
ELSE (no HPGs are waiting)
  jobs of  2 hours or less which can run and leave at least  46 nodes free
  jobs of 24 hours or less which can run and leave at least  70 nodes free
  jobs of 48 hours or less which can run and leave at least 132 nodes free
ENDIF
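
The check above can be read as a reserve test: a job may start only if enough nodes remain free afterwards. The Python sketch below is one possible reading, not the actual metascheduler code. The names HPG_WAITING_RESERVE, IDLE_TIERS, required_reserve, and may_start are hypothetical. It assumes that a job is matched against the shortest time tier that covers its requested walltime, and that jobs longer than 48 hours are not eligible when no HPG jobs are waiting. The text does not say which jobs are eligible while HPG jobs are waiting, so the sketch applies the 32-node rule to any candidate job, and it does not model the one-job-per-user rule.

```python
from typing import Optional

# Nodes that must stay free when HPG jobs are waiting (from the rule above).
HPG_WAITING_RESERVE = 32

# (maximum walltime in hours, nodes that must stay free) when no HPG jobs wait.
IDLE_TIERS = [(2, 46), (24, 70), (48, 132)]


def required_reserve(walltime_hours: float, hpg_waiting: bool) -> Optional[int]:
    """Return the number of nodes that must remain free, or None if ineligible."""
    if hpg_waiting:
        return HPG_WAITING_RESERVE
    # Tiers are sorted by walltime, so the first match is the tightest tier.
    for max_hours, reserve in IDLE_TIERS:
        if walltime_hours <= max_hours:
            return reserve
    return None  # assumption: longer jobs are not started by this check


def may_start(job_nodes: int, walltime_hours: float,
              free_nodes: int, hpg_waiting: bool) -> bool:
    """True if starting the job still leaves the required reserve free."""
    reserve = required_reserve(walltime_hours, hpg_waiting)
    if reserve is None:
        return False
    return free_nodes - job_nodes >= reserve


if __name__ == "__main__":
    # 4-node, 2-hour job with 50 nodes free and no HPG jobs waiting:
    # 50 - 4 = 46 nodes remain, which meets the 46-node reserve.
    print(may_start(4, 2, 50, hpg_waiting=False))
```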

After those checks, we also check whether any jobs are waiting in the short_large queue.
IF there are none
THEN the following jobs are run:
  jobs of  2 hours or less which can run and leave at least  36 nodes free
  jobs of 24 hours or less which can run and leave at least 116 nodes free
ENDIF
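
The short_large check can be sketched in the same spirit. The Python below is only an illustration under stated assumptions. The names SHORT_LARGE_IDLE_TIERS and may_start_after_short_large_check are hypothetical, a job is matched against the shortest time tier that covers its walltime, and jobs longer than 24 hours are assumed not to be started by this check.

```python
# (maximum walltime in hours, nodes that must stay free), used only when
# no jobs are waiting in the short_large queue (from the rule above).
SHORT_LARGE_IDLE_TIERS = [(2, 36), (24, 116)]


def may_start_after_short_large_check(job_nodes: int, walltime_hours: float,
                                      free_nodes: int,
                                      short_large_waiting: int) -> bool:
    """True if the job may start under the short_large check."""
    if short_large_waiting > 0:
        return False  # the rule applies only when short_large has no waiting jobs
    for max_hours, reserve in SHORT_LARGE_IDLE_TIERS:
        if walltime_hours <= max_hours:
            # The job must leave at least `reserve` nodes free after starting.
            return free_nodes - job_nodes >= reserve
    return False  # assumption: jobs over 24 hours are not covered


if __name__ == "__main__":
    # 4-node, 24-hour job with 120 nodes free and an empty short_large queue:
    # 120 - 4 = 116 nodes remain, which meets the 116-node reserve.
    print(may_start_after_short_large_check(4, 24, 120, short_large_waiting=0))
```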

These rules do get tweaked as the overall load changes.