High Performance (Spiedie) Computing

About

The High-Performance Computing cluster, aptly named “Spiedie”, is a 2808 compute core and 512 Knights Landing core cluster housed in the Thomas J. Watson College of Engineering and Applied Science's data center at the Innovative Technologies Complex. This research facility offers computing capabilities for researchers across Binghamton University.

Raw Stats

  • 20 Core 128GB Head Node
  • 292TB Available FDR Infiniband Connected NFS Storage Node
  • 132 Compute Nodes
  • 2808 native compute cores
  • 512 Intel Knights Landing Cores
  • 8 NVIDIA P100 GPUs
  • 40/56Gb Infiniband to all nodes
  • 1GbE to all nodes for management and OS deployment

Since its initial deployment, the Spiedie cluster has gone through various expansions, growing from 32 compute nodes to its current (March 2023) 132 compute nodes. Most of these expansions came from individual researcher grant awards; these researchers recognized the cluster's importance to advancing their work and helped grow this valuable resource.

The Watson College continues to pursue opportunities to enhance the Spiedie cluster and to expand its reach to researchers in other transdisciplinary areas. Support for the cluster has come from Watson College and researchers in the Chemistry, Computer Science, Electrical & Computer Engineering, Mechanical Engineering, and Physics departments. A University Road Map 2016-17 Grant funded additional compute and storage capabilities.

2021 Calendar Year Job Details

As of December 1, 2021, 71,981 jobs had run on the Spiedie cluster, consuming 20,104,734.8 CPU hours across 828,937.3 run time hours.

  • On average, jobs were 20 cores in size.
  • Jobs consumed an average of 279.31 hours of CPU time.
  • Average wait time was 2.88 hours per job.

2020 Calendar Year Job Details

In 2020, the Spiedie cluster completed 199,557 jobs, consuming 26,933,923.9 CPU hours across 937,624.8 run time hours.

  • On average, jobs were 16 cores in size.
  • Jobs consumed an average of 134.67 hours of CPU time.
  • Average wait time was 3.95 hours per job.

Expansion History

  • Early 2012 - A head node, storage node, and 32 compute nodes were purchased with building infrastructure funds. This initial deployment created a cluster with 384 compute cores and 2.3TB of memory spread across 32 compute nodes, with 11TB of shared storage.
  • Late 2014 - 29 compute nodes were purchased through a collaboration between a final round of building infrastructure funds and a new researcher's start-up funds. This purchase added 464 compute cores and 1.5TB of memory across 29 compute nodes, totaling 848 cores and 5TB of RAM over 61 compute nodes.
  • October 2015 - 16 compute nodes were purchased along with storage for use by a researcher. This purchase added 256 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 77 and total available compute cores to 1104.
  • December 2015 - 3 compute nodes were purchased by a researcher. This purchase added 96 compute cores and 576GB of memory to the cluster. This brought the total cluster node count to 80 and total available compute cores to 1200.
  • July 2016 - 16 compute nodes were purchased by two researchers for a joint project. This purchase added 384 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 96 and total available compute cores to 1584.
  • July 2016 - 12 compute nodes were purchased by a researcher. This purchase added 288 compute cores and 1.5TB of memory to the cluster. This brought the total cluster node count to 108 and total available compute cores to 1872.
  • Through a Binghamton University Road Map Grant awarded to the High Performance and Data-Intensive Computing Facility proposal, the following items were or will be added to the Spiedie Cluster:
    • October 2016 - A new head node with dual Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and 128GB of DDR4 RAM replaced the 5-year-old head node from the initial deployment. Additional storage was added to the core operating system as well as dedicated SSD disks for supporting application delivery to the cluster.
    • October 2016 - A new storage node was purchased to replace the old storage, which had a capacity of 11TB. The new storage increased our spindle count from 24 disks to 52, with the ability to add more. Capacity was also increased from 11TB to 292TB. Storage continues to be accessible via NFS through a 56Gb/s FDR Infiniband interface.
  • November 2016 - 1 compute node was purchased by a researcher. This purchase added 20 compute cores, 1 NVIDIA K80 GPU, and 256GB of memory to the cluster. This brought the total cluster node count to 109 and total available compute cores to 1892.
  • January 2017 - 8 compute nodes were purchased by two researchers. This purchase added 128 compute cores and 512GB of memory to the cluster. This brought the total cluster node count to 117 and total available compute cores to 2020.
  • April 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores and 64GB of memory to the cluster. This brought the total cluster node count to 118 and 2048 compute cores.
  • May 2017 - 16 compute nodes were purchased with funds from a Road Map Grant. This purchase added 448 cores, 1.5TB of RAM, and eight NVIDIA Pascal P100 12GB GPGPU cards. This brought the total cluster node count to 135 and 2496 native compute cores.
  • May 2017 - 8 Intel Knights Landing nodes were purchased with funds from a Road Map Grant. This purchase added 512 native compute cores, 2,048 compute threads and 768GB of RAM. This brought the total cluster node count to 143, 3,008 native compute cores and 4,544 compute threads.
  • September 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores, 128GB of RAM, and 2 NVIDIA P100 12GB GPGPU cards. This brought the total cluster node count to 144 and 3,036 native compute cores.
  • Early May 2018 - 2 compute nodes were purchased by a researcher. This purchase added 80 compute cores and 512GB of RAM. This brought the total cluster node count to 146 and 3116 native compute cores.
  • Late May 2018 - 16 compute nodes were purchased by a researcher. This purchase added 320 compute cores and ~1TB of RAM. This brought the total cluster node count to 161 and 3436 native compute cores.
  • July 2018 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores and 384GB of RAM. This brought the total cluster node count to 162 and 3464 native compute cores.
  • 2019 - Three researchers purchased a total of 6 compute nodes. This brought the cluster node count to 166 and 3648 native compute cores.
  • 2020 - A researcher purchased 2 compute nodes. This brought the cluster node count to 168 and 3784 native compute cores.
  • 2021 - A researcher purchased 2 compute nodes. This brought the cluster node count to 170 and 3848 native compute cores.
  • 2022 - A number of systems were removed from the cluster, reducing the total node count to 160.

Head Node

The head node is a Red Barn HPC server with dual Intel(R) Xeon(R) E5-2640 v4 CPUs @ 2.40GHz, 128GB of DDR4 RAM, and dedicated SSD storage.

Storage Node

A common file system accessible by all nodes is hosted on a second Red Barn HPC server providing 292TB of capacity, with the ability to add additional storage drives. Storage is accessible via NFS through a 56Gb/s FDR Infiniband interface.

Compute Nodes

The 160 compute nodes are a heterogeneous mixture of varying processor architectures, generations, and capacity.

Management and Network

Networking between the head, storage, and compute nodes uses Infiniband for inter-node communication and Ethernet for management. Bright Cluster Manager provides monitoring and management of the nodes, with SLURM handling job submission, queuing, and scheduling. The cluster currently supports MATLAB jobs of up to 600 cores, along with VASP, COMSOL, R, and almost any *nix-based application.
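
A minimal sketch of a SLURM batch script, of the kind such a setup typically accepts, is shown below. The resource requests, module name, and program invocation are illustrative assumptions rather than Spiedie-specific values; actual partition names, available modules, and limits should be taken from the cluster's own documentation.

  #!/bin/bash
  #SBATCH --job-name=example_job     # name shown in the queue
  #SBATCH --ntasks=20                # number of cores/tasks requested (illustrative)
  #SBATCH --time=24:00:00            # wall-time limit in HH:MM:SS (illustrative)
  #SBATCH --mem-per-cpu=4G           # memory per allocated core (illustrative)
  #SBATCH --output=example_%j.out    # output file; %j expands to the job ID

  # Load the application's environment. Assumes an environment-modules setup;
  # the module name here is hypothetical.
  module load example_app

  # Run the (placeholder) application across the allocated cores.
  srun ./example_app input.dat

A script like this would normally be submitted with sbatch and monitored with squeue, both standard SLURM commands.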

Cluster Policy

High-Performance Computing at Binghamton is a collaborative environment where computational resources have been pooled together to form the Spiedie cluster.   

Access Options

Subsidized access (No cost)

  • Maximum of 48 cores per faculty group
  • Storage is monitored
  • Higher priority queues have precedence
  • 122 hr wall time
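
As a rough illustration, these limits map onto the SLURM resource requests a subsidized job can make. The sketch below is illustrative only: the 48-core and 122-hour figures come from the policy above, while the application being launched is a placeholder.

  #!/bin/bash
  #SBATCH --ntasks=48        # at or below the 48-core cap per faculty group
  #SBATCH --time=122:00:00   # at or below the 122 hr wall time limit

  srun ./my_application      # placeholder for the actual program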

Access application

Yearly subscription access 

  • $1,675/year per faculty research group
  • Queue core restrictions are removed
  • Queued ahead of lower priority jobs
  • Fair-share queue enabled
  • Storage is monitored 
  • 122 hr wall time
  • Access is granted per research group

Subscription application

Condo access

Purchase your own nodes to integrate into the cluster

  • High priority on your nodes
  • Fair-share access to other nodes
  • No limits on job submission to your nodes
  • Storage is monitored
  • Your nodes are accessible to others when not in use
  • ~$200/node annual support and maintenance
  • MATLAB users: ~$12/worker (core) annually for MDCS support

Watson IT will assist with quoting, acquisition, integration, and maintenance of purchased nodes. For more information on adding nodes to the Spiedie cluster, email Phillip Valenta with your inquiry.