High Performance (Spiedie) Computing

About

The High-Performance Computing Cluster, aptly named “Spiedie,” is a cluster of 3464 compute cores and 512 Knights Landing cores housed in the Thomas J. Watson School of Engineering and Applied Science's data center at the Innovative Technology Complex. This research facility offers computing capabilities for researchers across Binghamton University.

Raw Stats

  • 20 Core 128GB Head Node
  • 292TB Available FDR InfiniBand Connected NFS Storage Node
  • 162 Compute Nodes
  • 3464 native compute cores
  • 512 Intel Knights Landing Cores
  • 17 TB RAM
  • 9 NVIDIA P100 GPUs
  • 40/56Gb/s InfiniBand to all nodes
  • 1GbE to all nodes for management and OS deployment

Since its initial deployment, the Spiedie cluster has gone through several expansions, growing from 32 compute nodes to 143 compute nodes as of December 2017. Most of these expansions were funded by individual researchers' grant awards. These researchers recognized the importance of the cluster to their work and helped grow this valuable resource.

The Watson School continues to pursue opportunities to enhance the Spiedie cluster and to extend its reach to researchers in other transdisciplinary areas. Support for the cluster has come from the Watson School and from researchers in the Chemistry, Computer Science, Electrical & Computer Engineering, Mechanical Engineering, and Physics departments. A University Road Map 2016-17 Grant funded additional compute and storage capabilities.

2017 Calendar Year Job Details

As of September 21, 2017, the Spiedie cluster has run 1,053,157 jobs, consuming 9,148,253 CPU hours across 727,568 run time hours.

  • On average, jobs were 4 cores in size
  • Most jobs consumed 8.72 hours of CPU time
  • Average wait time was 0.16 hours per job

2016 Calendar Year Job Details

In 2016, the Spiedie cluster completed 138,981 jobs, consuming 7,144,200.6 CPU hours across 163,901.6 run time hours.

  • On average, jobs were 6 cores in size
  • Most jobs consumed an average of 51.40 CPU hours, running for 5.50 hours
  • Average wait time for any job was 1.44 hours

Expansion History

  • Early 2012 - A head node, a storage node and 32 compute nodes were purchased with building infrastructure funds. This initial deployment created a cluster with 384 compute cores and 2.3TB of memory spread across 32 compute nodes with 11TB of shared storage.
  • Late 2014 - 29 compute nodes were purchased jointly with a final round of building infrastructure funds and a new researcher's start-up funds. This purchase added 464 compute cores and 1.5TB of memory across 29 compute nodes, totaling 848 cores and 5TB of RAM over 61 compute nodes.
  • October 2015 - 16 compute nodes were purchased along with storage for use by a researcher. This purchase added 256 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 77 and total available compute cores to 1104.
  • December 2015 - 3 compute nodes were purchased by a researcher. This purchase added 96 compute cores and 576GB of memory to the cluster. This brought the total cluster node count to 80 and total available compute cores to 1200.
  • July 2016 - 16 compute nodes were purchased by two researchers for a joint project. This purchase added 384 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 96 and total available compute cores to 1584.
  • July 2016 - 12 compute nodes were purchased by a researcher. This purchase added 288 compute cores and 1.5TB of memory to the cluster. This brought the total cluster node count to 108 and total available compute cores to 1872.
  • Through a Binghamton University Road Map Grant awarded for the high-performance and data-intensive computing facility proposal, the following items were or will be added to the Spiedie cluster:
    • October 2016 - A new head node with dual Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and 128GB of DDR4 RAM replaced the 5-year-old head node from the initial deployment. Additional storage was added for the core operating system, along with dedicated SSD disks to support application delivery to the cluster.
    • October 2016 - A new storage node was purchased to replace the old storage node, which had a capacity of 11TB. The new node increased the spindle count from 24 disks to 52, with the ability to add more, and increased capacity from 11TB to 292TB. Storage continues to be accessible via NFS through a 56Gb/s FDR InfiniBand interface.
  • November 2016 - 1 compute node was purchased by a researcher. This purchase added 20 compute cores, 1 NVIDIA K80 GPU and 256GB of memory to the cluster. This brought the total cluster node count to 109 and total available compute cores to 1892.
  • January 2017 - 8 compute nodes were purchased by two researchers. This purchase added 128 compute cores and 512GB of memory to the cluster. This brought the total cluster node count to 117 and total available compute cores to 2020.
  • April 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores and 64GB of memory to the cluster. This brought the total cluster node count to 118 and total available compute cores to 2048.
  • May 2017 - 16 compute nodes were purchased with funds from a Road Map Grant. This purchase added 448 cores, 1.5TB of RAM and eight NVIDIA Pascal P100 12GB GPGPU cards. This brought the total cluster node count to 135 and 2496 native compute cores.
  • May 2017 - 8 Intel Knights Landing nodes were purchased with funds from a Road Map Grant. This purchase added 512 native compute cores, 2,048 compute threads and 768GB of RAM. This brought the total cluster node count to 143, with 3,008 native compute cores and 4,544 compute threads.
  • September 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores, 128GB of RAM and 2 NVIDIA P100 12GB GPGPU cards. This brought the total cluster node count to 143 and 3,036 native compute cores.
  • Early May 2018 - 2 compute nodes were purchased by a researcher. This purchase added 80 compute cores and 512GB of RAM. This brought the total cluster node count to 146 and 3116 native compute cores.
  • Late May 2018 - 16 compute nodes were purchased by a researcher. This purchase added 320 compute cores and ~1TB of RAM. This brought the total cluster node count to 161 and 3436 native compute cores.
  • July 2018 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores and 384GB of RAM. This brought the total cluster node count to 162 and 3464 native compute cores.
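
The running core totals in the timeline above can be re-derived by summing the cores added at each purchase. The following is a minimal sketch of that arithmetic; the list and function names are illustrative, and the per-purchase figures are transcribed from the entries above. Note that the timeline's running total includes the 512 Knights Landing cores.

```python
# (label, cores_added) pairs transcribed from the expansion timeline.
# The labels in parentheses are only there to tell same-month purchases apart.
EXPANSIONS = [
    ("Early 2012", 384),
    ("Late 2014", 464),
    ("October 2015", 256),
    ("December 2015", 96),
    ("July 2016 (joint project)", 384),
    ("July 2016", 288),
    ("November 2016", 20),
    ("January 2017", 128),
    ("April 2017", 28),
    ("May 2017 (P100 nodes)", 448),
    ("May 2017 (Knights Landing)", 512),
    ("September 2017", 28),
    ("Early May 2018", 80),
    ("Late May 2018", 320),
    ("July 2018", 28),
]


def running_totals(expansions):
    """Return (label, cumulative_cores) pairs in timeline order."""
    totals = []
    cumulative = 0
    for label, added in expansions:
        cumulative += added
        totals.append((label, cumulative))
    return totals


if __name__ == "__main__":
    for label, total in running_totals(EXPANSIONS):
        print(f"{label:<28} {total:>5}")
```

Summed this way, the total after the July 2018 purchase comes to 3464 cores, matching the figure quoted for the current cluster.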

Head Node

The head node is a Red Barn HPC server with dual Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and 128GB of DDR4 RAM with dedicated SSD storage.

Storage Node

A common file system accessible by all nodes is hosted on a second Red Barn HPC server providing 292TB, with the ability to add additional storage drives. Storage is accessible via NFS through a 56Gb/s FDR InfiniBand interface.

Compute Nodes

The 162 compute nodes are a heterogeneous mixture of varying processor architectures, generations, and capacity.

Management and Network

Networking between the head, storage and compute nodes uses InfiniBand for inter-node communication and Ethernet for management. Bright Cluster Manager provides monitoring and management of the nodes, with SLURM handling job submission, queuing, and scheduling. The cluster currently supports MATLAB jobs of up to 600 cores, along with VASP, COMSOL, R and almost any *nix-based application.
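
Because jobs go through SLURM, a submission generally takes the form of a batch script passed to `sbatch`. The sketch below shows one way to generate and submit such a script from Python. It is illustrative only: the job name, the command, and the optional partition argument are hypothetical, since this page does not list queue names or site-specific settings. The `#SBATCH` options used (`--job-name`, `--ntasks`, `--time`, `--output`, `--partition`) are standard SLURM options.

```python
import subprocess


def render_batch_script(job_name, ntasks, hours, command, partition=None):
    """Build the text of a SLURM batch script.

    `partition` is optional because queue names are site-specific
    and are not given on this page.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={hours}:00:00",        # hours:minutes:seconds
        f"#SBATCH --output={job_name}-%j.out",  # %j expands to the job ID
    ]
    if partition is not None:
        lines.append(f"#SBATCH --partition={partition}")
    lines.append("")
    lines.append(f"srun {command}")
    return "\n".join(lines) + "\n"


def submit(script_text):
    """Pipe the script to sbatch, which reads stdin when no file is given."""
    result = subprocess.run(
        ["sbatch"],
        input=script_text,
        text=True,
        capture_output=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical job: 4 tasks for up to 12 hours.
    print(render_batch_script("demo", 4, 12, "./my_solver input.dat"))
```

Calling `submit()` only works on a machine where `sbatch` is installed, such as the cluster head node; rendering the script has no such requirement and can be used to review the request before submitting it.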

Cluster Policy

High-Performance Computing at Binghamton is a collaborative environment where computational resources have been pooled together to form the Spiedie cluster.   

Access Options

Subsidized access (No cost)

  • Maximum of 24 cores per faculty group
  • The queue is limited to a maximum of 240 cores
  • Storage is monitored
  • Higher priority queues have precedence
  • 122 hr wall time
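
As a rough illustration of the subsidized limits listed above, the sketch below checks a planned job against the 24-core group cap and the 122-hour wall time. It assumes the core cap applies to a group's combined running cores, which this page does not state explicitly; the function and constant names are hypothetical, and the scheduler's actual enforcement may differ.

```python
# Limits taken from the subsidized access list above.
SUBSIDIZED_MAX_CORES_PER_GROUP = 24
SUBSIDIZED_MAX_WALL_HOURS = 122


def check_subsidized_request(cores_requested, wall_hours, group_cores_in_use=0):
    """Return a list of reasons the request exceeds the limits (empty if none).

    Assumption: the 24-core cap counts cores already in use by the same
    faculty group plus the cores requested by the new job.
    """
    problems = []
    total_cores = group_cores_in_use + cores_requested
    if total_cores > SUBSIDIZED_MAX_CORES_PER_GROUP:
        problems.append(
            f"group would use {total_cores} cores "
            f"(limit {SUBSIDIZED_MAX_CORES_PER_GROUP})"
        )
    if wall_hours > SUBSIDIZED_MAX_WALL_HOURS:
        problems.append(
            f"wall time of {wall_hours} h exceeds "
            f"{SUBSIDIZED_MAX_WALL_HOURS} h"
        )
    return problems


if __name__ == "__main__":
    print(check_subsidized_request(8, 48))
    print(check_subsidized_request(16, 130, group_cores_in_use=16))
```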

Access application

Yearly subscription access 

  • $3,120/year, faculty research group
  • Queued ahead of lower priority jobs
  • Fair-share queue enabled
  • Storage is monitored 
  • 122 hr wall time
  • per research group access

Subscription application

Condo access

Purchase your own nodes to integrate into the cluster

  • High priority on your nodes
  • Fair-share access to other nodes
  • No limits on job submission to your nodes
  • Storage is monitored
  • Your nodes are accessible to others when not in use
  • ~$50/node annual support maintenance
  • MATLAB users ~$12/worker (core) for annual MDCS support
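
The recurring condo fees above can be estimated with simple arithmetic. The sketch below uses the approximate per-node and per-worker figures from the list; the function name and the example node and worker counts are hypothetical, and the hardware purchase price is not included.

```python
# Approximate annual fees from the condo access list above.
NODE_SUPPORT_PER_YEAR = 50     # ~$50 per node
MDCS_PER_WORKER_PER_YEAR = 12  # ~$12 per MATLAB worker (core)


def condo_annual_cost(nodes, matlab_workers=0):
    """Estimate yearly support fees in dollars, excluding hardware purchase."""
    return nodes * NODE_SUPPORT_PER_YEAR + matlab_workers * MDCS_PER_WORKER_PER_YEAR


if __name__ == "__main__":
    # Hypothetical example: 4 owned nodes with 96 MATLAB workers.
    print(condo_annual_cost(4, matlab_workers=96))
```

For the hypothetical case of 4 nodes and 96 MATLAB workers, this gives 4 × 50 + 96 × 12 = 1,352 dollars per year in support fees.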

Watson IT will assist with quoting, acquisition, integration, and maintenance of purchased nodes. For more information on adding nodes to the Spiedie cluster, email Phillip Valenta with your inquiry.