As of July 1, 2021, the entire Condo Cluster is under the Free Tier model. The cluster consists primarily of 134 SuperMicro servers, each with two 8-core Intel Haswell processors, 128 GB of memory, and 2.5 TB of available local disk. In addition to these compute nodes, there are three large-memory nodes: two with four 8-core Intel Ivy Bridge processors and 1 TB of main memory, and a third with four 10-core Ivy Bridge processors and 2 TB of main memory. One GPU node has two 10-core Intel Haswell processors, two NVIDIA Tesla K20c accelerator cards, 768 GB of memory, and 5.5 TB of available local disk. All nodes and storage are connected via an Intel/QLogic QDR InfiniBand (40 Gb/s) switch.

Detailed Hardware Specification

Number of Nodes | Processors per Node                   | Cores per Node | Memory per Node | Interconnect | Local $TMPDIR Disk | Accelerator Card | CPU-Hour Cost Factor | Partition
134             | Two 2.6 GHz 8-Core Intel E5-2640 v3   | 16             | 128 GB          | 40G IB       | 2.5 TB             | N/A              | 1.00                 |
2               | Four 2.6 GHz 8-Core Intel E5-4620 v2  | 32             | 1 TB            | 40G IB       | 1.8 TB             | N/A              | 4.50                 | fat
1               | Four 2.2 GHz 10-Core Intel E7-4830 v2 | 40             | 2 TB            | 40G IB       | 1.3 TB             | N/A              | 9.45                 | huge
1               | Two 2.3 GHz 10-Core Intel E5-2650 v3  | 20             | 768 GB          | 40G IB       | 5.5 TB             | 2x NVIDIA K20c   | 2.00                 | gpu
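The CPU-Hour Cost Factor scales the usage charged for a job. Assuming usage is computed as cores x wall-clock hours x cost factor, a 32-core job that runs for 10 hours on the fat partition would be charged 32 x 10 x 4.50 = 1,440 CPU-hours.

A minimal Slurm batch script requesting one of the special partitions might look like the sketch below; the job name, resource amounts, time limit, and executable are placeholders, not site-specific recommendations:

    #!/bin/bash
    #SBATCH --job-name=fat_example   # hypothetical job name
    #SBATCH --partition=fat          # large-memory partition from the table above
    #SBATCH --nodes=1
    #SBATCH --ntasks=32              # all 32 cores of one fat node
    #SBATCH --time=10:00:00          # requested wall-clock time (placeholder)

    # Replace ./my_app with the actual application; it is a placeholder here.
    srun ./my_app

For the gpu partition, a GPU request such as --gres=gpu:2 may also be required, depending on how the scheduler is configured at this site.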


The HPC group schedules regular maintenance every three months to update system software and to perform other tasks that require downtime.

The date of the next maintenance is listed in the message of the day displayed at login (when connecting to the cluster via ssh).

Note: Queued jobs will not start if they cannot complete before the maintenance begins. In the output of the squeue command, the reason shown for those jobs will be (ReqNodeNotAvail, Reserved for maintenance). The jobs will start after the scheduled outage completes.
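For instance, a pending job blocked by an upcoming maintenance reservation might appear in squeue output like this (the job ID, name, user, and partition below are hypothetical):

    $ squeue -u $USER
     JOBID PARTITION     NAME     USER ST  TIME NODES NODELIST(REASON)
    123456       fat   my_job     jdoe PD  0:00     1 (ReqNodeNotAvail, Reserved for maintenance)

No action is needed in this case; the job remains queued and is dispatched once the maintenance reservation ends.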