The High-Performance Computing Cluster, aptly named “Spiedie,” is a 3,008 compute core & 512 Knights Landing core cluster housed in the Thomas J. Watson School of Engineering and Applied Science's data center located at the Innovative Technology Complex. This research facility offers computing capabilities for researchers not only in the Watson School, but in all departments across Binghamton University.
- 20 Core 128GB Head Node
- 292TB Available FDR Infiniband Connected NFS Storage Node
- 143 Compute Nodes
- 3,008 native compute cores (4,544 Compute Threads)
- 512 Intel Knights Landing Cores
- 13.95TB RAM
- 40/56Gb Infiniband to all nodes
- 1GbE to all nodes for management and OS deployment
Since its initial deployment, the Spiedie cluster has gone through various expansions, growing from 32 compute nodes to its current (Dec 2017) 143 compute nodes. Most of these expansions came from individual researcher grant awards. These individuals realized the importance of the cluster to furthering their research and helped grow this valuable resource.
The Watson School continues to pursue opportunities to enhance the Spiedie cluster and to expand its outreach to researchers in other transdisciplinary areas. Support for the cluster has come from the Watson School and researchers from the Chemistry, Computer Science, Electrical & Computer Engineering, Mechanical Engineering, and Physics departments. A University Road Map 2016-17 Grant funded additional compute and storage capabilities.
2017 Calendar Year Job Details
As of September 21, 2017, the Spiedie cluster has run 1,053,157 jobs, consuming 9,148,253 CPU hours across 727,568 run-time hours.
- On average, jobs used 4 cores
- The average job consumed 8.72 hours of CPU time
- Average wait time was 0.16 hours per job
2016 Calendar Year Job Details
In 2016, the Spiedie cluster completed 138,981 jobs, consuming 7,144,200.6 CPU hours across 163,901.6 run-time hours.
- On average, jobs used 6 cores
- The average job consumed 51.40 CPU hours, running for 5.50 hours
- Average wait time for any job was 1.44 hours
- Early 2012 - A head node, storage node, and 32 compute nodes were purchased with building infrastructure funds. This initial deployment created a cluster with 384 compute cores and 2.3TB of memory spread across 32 compute nodes with 11TB of shared storage.
- Late 2014 - 29 compute nodes were purchased through a collaboration between a final round of building infrastructure funds and a new researcher's start-up fund. This purchase added 464 compute cores and 1.5TB of memory across 29 compute nodes, totaling 848 cores and 5TB of RAM over 61 compute nodes.
- October 2015 - 16 compute nodes were purchased along with storage for use by a researcher. This purchase added 256 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 77 and total available compute cores to 1104.
- December 2015 - 3 compute nodes were purchased by a researcher. This purchase added 96 compute cores and 576GB of memory to the cluster. This brought the total cluster node count to 80 and total available compute cores to 1200.
- July 2016 - 16 compute nodes were purchased by two researchers for a joint project. This purchase added 384 compute cores and 2TB of memory to the cluster. This brought the total cluster node count to 96 and total available compute cores to 1584.
- July 2016 - 12 compute nodes were purchased by a researcher. This purchase added 288 compute cores and 1.5TB of memory to the cluster. This brought the total cluster node count to 108 and total available compute cores to 1872.
- Through a Binghamton University Road Map Grant awarded to the High Performance and Data Intensive Computing Facility proposal, the following items were or will be added to the Spiedie cluster.
- October 2016 - A new head node with dual Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and 128GB of DDR4 RAM replaced the five-year-old head node from the initial deployment. Additional storage was added for the core operating system, as well as dedicated SSD disks for supporting application delivery to the cluster.
- October 2016 - A new storage node was purchased to replace the old storage, which had a capacity of 11TB. The new storage increased the spindle count from 24 disks to 52, with the ability to add more. Capacity was also increased from 11TB to 292TB. Storage continues to be accessible via NFS through a 56Gb/s FDR Infiniband interface.
- November 2016 - 1 compute node was purchased by a researcher. This purchase added 20 compute cores, 1 NVidia K80 GPU, and 256GB of memory to the cluster. This brought the total cluster node count to 109 and total available compute cores to 1892.
- January 2017 - 8 Compute nodes were purchased by two researchers. This purchase added 128 compute cores and 512GB of memory to the cluster. This brought the total cluster node count to 117 and total available compute cores to 2020.
- April 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores and 64GB of memory to the cluster. This brought the total cluster node count to 118 and total available compute cores to 2048.
- May 2017 - 16 compute nodes were purchased with funds from a Road Map Grant. This purchase added 448 cores, 1.5TB of RAM, and eight NVidia Pascal P100 12GB GPGPU cards. This brought the total cluster node count to 135 and 2496 native compute cores.
- May 2017 - 8 Intel Knights Landing nodes were purchased with funds from a Road Map Grant. This purchase added 512 native compute cores, 2,048 compute threads and 768GB of RAM. This brought the total cluster node count to 143, 3,008 native compute cores and 4,544 compute threads.
- September 2017 - 1 compute node was purchased by a researcher. This purchase added 28 compute cores, 128GB of RAM, and 2 NVidia P100 12GB GPGPU cards. This brought the total cluster node count to 144 and 3,036 native compute cores.
Consists of a Red Barn HPC head node with dual Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz and 128GB of DDR4 RAM with dedicated SSD storage.
A common file system accessible by all nodes is hosted on a second Red Barn HPC server providing 292TB, with the ability to add additional storage drives. Storage is accessible via NFS through a 56Gb/s FDR Infiniband interface.
The 143 compute nodes are a heterogeneous mixture of varying processor architectures, generations, and capacities.
- 32 compute nodes consist of dual socket Intel 6 core X5690 CPUs running at 3.47GHz. 16 of these nodes have 48GB of DDR3 RAM, and 16 have 96GB of DDR3 RAM.
- 29 compute nodes consist of dual socket Intel 8 core E5-2667 v2 CPUs running at 3.3GHz, each with 96GB of DDR3 RAM.
- 16 compute nodes consist of dual socket Intel 8 core E5-2640 v3 CPUs running at 2.6GHz each with 128GB of DDR4 RAM.
- 3 compute nodes consist of dual socket Intel 16 core E5-2698 v3 CPUs running at 2.3GHz each with 192GB of DDR4 RAM.
- 28 compute nodes consist of dual socket Intel 14 core E5-2650 v4 CPUs running at 2.2GHz each with 128GB of DDR4 RAM.
- 1 compute node consists of dual socket Intel 10 core E5-2630 v4 CPUs running at 2.2GHz with 256GB of DDR4 RAM and a NVidia K80 GPU.
- 8 compute nodes with dual socket Intel E5-2620 v4 CPUs running at 2.1GHz with 64GB of DDR4 RAM
- 1 compute node with dual socket Intel E5-2680 v4 CPUs running at 2.4GHz with 64GB of DDR4 RAM
- 12 compute nodes with dual socket Intel E5-2660 v4 CPUs running at 2.0GHz with 64GB of DDR4 RAM
- 4 compute nodes with dual socket Intel E5-2660 v4 CPUs running at 2.0GHz with two NVidia Tesla P100 12GB GPGPUs, two with 256GB of DDR4 RAM and two with 128GB of DDR4 RAM
- 8 compute nodes with a single socket Xeon Phi Knights Landing 64 core 7210 CPU with 96GB of DDR4 RAM
- 1 compute node with dual socket Intel E5-2680 v4 CPUs running at 2.4GHz with 128GB of DDR4 RAM and two NVidia Tesla P100 12GB GPGPUs.
Management and Network
Networking between the head, storage, and compute nodes utilizes Infiniband for inter-node communication and Ethernet for management. Bright Cluster Manager provides monitoring and management of the nodes, with SLURM handling job submission, queuing, and scheduling. The cluster currently supports MATLAB jobs of up to 600 cores, along with VASP, COMSOL, R, and almost any *nix-based application.
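As a rough illustration of how work is submitted through SLURM, a minimal batch script might look like the following. The job name, module name, and application path are hypothetical placeholders; the core count matches the 2017 average job size and the time limit reflects the 122-hour wall time noted below. Consult the cluster documentation for the actual partition and module names.

```shell
#!/bin/bash
#SBATCH --job-name=example_job       # job name shown in the queue (placeholder)
#SBATCH --ntasks=4                   # 4 cores, the 2017 average job size
#SBATCH --mem-per-cpu=4G             # memory per core (adjust to your workload)
#SBATCH --time=122:00:00             # maximum wall time under the access policies
#SBATCH --output=example_job_%j.out  # %j expands to the SLURM job ID

# Load the application environment (module name is hypothetical)
# module load R

# Launch the application across the allocated tasks
srun ./my_application
```

The script is submitted with `sbatch script.sh`; `squeue` shows queued and running jobs, and `scancel <jobid>` removes one.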
High-Performance Computing at Binghamton is a collaborative environment where computational resources have been pooled together to form the Spiedie cluster.
No-cost subsidized access
- Maximum of 24 cores per faculty group
- Queue is limited to a maximum of 240 cores
- Storage is monitored
- Higher priority queues have precedence
- 122 hr wall time
Yearly subscription access
- $3,120/year per faculty research group
- Queued ahead of lower priority jobs
- Fair-share queue enabled
- Storage is monitored
- 122 hr wall time
- Access granted per research group
Purchase your own nodes to integrate into the cluster
- High priority on your nodes
- Fair-share access to other nodes
- No limits on job submission to your nodes
- Storage is monitored
- Your nodes are accessible to others when not in use
- ~$50/node annual support maintenance
- MATLAB users ~$12/worker (core) for annual MDCS support
Watson IT will assist with quoting, acquisition, integration, and maintenance of purchased nodes. For more information on adding nodes to the Spiedie cluster, email Phillip Valenta with your inquiry.