- Condo 2017
- Who can use CyEnce?
- CPU Allocation
- Available Software
- Login and File Transfer
- Managing jobs using Slurm Workload Manager
- Slurm Job Script Generator for CyEnce
- How to Use Accelerator Nodes
- Using MPI
- Using OpenMP or Auto-parallelism
- compiler options
- Queue Configuration
- Using DDT Parallel Debugger, MAP profiler and Performance Reports
- Classroom HPC Cluster
- UNIX Introduction
- Globus Online
- File Transfers
- Support & Contacts
- Systems & Equipment
- HPC Cluster Access Request
- FAQ: Frequently Asked Questions
- Resources for Obtaining Help
The scheduler on CyEnce is Torque, a follow-on of OpenPBS (Open Portable Batch System) and an open-source product.
Job scheduling runs through Torque, Maui, and a local metascheduler, which differentiates scheduling priority among jobs from groups with different account balances; in particular, it gives more resources to groups with a positive account balance than to groups with a negative one.
A user can see his/her account balance by issuing the command /usr/local/bin/account_balance. Details of how this is calculated can be found at:
For groups with a negative balance, the second set of limits below applies:

|Queue Name|Max Time/Job (hours)|Max Nodes/Job|Max Nodes/Queue|Max Jobs/Queue|Max Jobs/User|Max Jobs/Group|Max Nodes/Queue (negative balance)|Max Jobs/Queue (negative balance)|Max Jobs/User (negative balance)|Max Jobs/Group (negative balance)|
|---|---|---|---|---|---|---|---|---|---|---|
To maximize system utilization when the system is relatively idle, short-duration jobs are allowed to run even beyond the above limits. For now, this is accomplished as follows. Define High Priority Groups (HPGs) as those with a positive allocation. Every 5 minutes the following checks are run, and one job per user may be started under these rules:

IF jobs from HPGs are waiting for compute nodes THEN run:
- jobs which can run and leave at least 32 nodes free
ELSE (no HPG jobs are waiting) run:
- jobs of 2 hours or less which can run and leave at least 46 nodes free
- jobs of 24 hours or less which can run and leave at least 70 nodes free
- jobs of 48 hours or less which can run and leave at least 132 nodes free
ENDIF

After those checks, we also check whether any jobs are waiting in the short_large queue. IF there are none THEN run:
- jobs of 2 hours or less which can run and leave at least 36 nodes free
- jobs of 24 hours or less which can run and leave at least 116 nodes free
ENDIF

These rules are tweaked as the overall load changes.
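The idle-time backfill rules above can be sketched as a small decision function. This is a hypothetical illustration of the stated policy, not the actual metascheduler code; the function name and parameters are assumptions:

```python
def may_backfill(job_hours, free_nodes_after_start, hpg_jobs_waiting, short_large_waiting):
    """Return True if an idle-time job may start under the rules above.

    job_hours: requested walltime of the candidate job, in hours.
    free_nodes_after_start: nodes that would remain free if the job starts.
    hpg_jobs_waiting: True if High Priority Group jobs are waiting for nodes.
    short_large_waiting: True if any jobs are waiting in the short_large queue.
    """
    # Primary check: protect capacity for waiting HPG jobs.
    if hpg_jobs_waiting:
        if free_nodes_after_start >= 32:
            return True
    else:
        # No HPG jobs waiting: longer jobs must leave more nodes free.
        if job_hours <= 2 and free_nodes_after_start >= 46:
            return True
        if job_hours <= 24 and free_nodes_after_start >= 70:
            return True
        if job_hours <= 48 and free_nodes_after_start >= 132:
            return True

    # Secondary check: applies only when the short_large queue is empty.
    if not short_large_waiting:
        if job_hours <= 2 and free_nodes_after_start >= 36:
            return True
        if job_hours <= 24 and free_nodes_after_start >= 116:
            return True

    return False
```

Note that the node thresholds are simple constants here because the prose gives them as fixed values; in practice the site states they are tweaked as overall load changes.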