Cluster specs and docs

Computational requirements


  • 2GB memory per node
  • High bandwidth to MSS


  • 2-3GB per node
  • High bandwidth to MSS
  • 300,000 SUs (on Abe) over the course of 3-5 years.
    • Several mu passes, with a few mta passes per mu pass
    • One mu pass = 15,000 SUs, one mta pass = 2,000? SUs
    • For 2011 data, ~5 mu passes with ~5 mta passes per mu pass
    • Same for 2013 data
  • 150TB MSS space
    • ~75 TB per run
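
The SU and storage figures above can be cross-checked with a short calculation. This is only a rough sketch: it assumes exactly 5 mu passes and 5 mta passes per mu pass for each dataset (the list says "~5"), takes the tentative 2000 SU cost of an mta pass at face value, and assumes each dataset corresponds to one ~75 TB run. All names in the code are illustrative.

```python
# Rough SU and storage budget built from the figures in the list above.
# Assumptions (not confirmed by the list): exactly 5 mu passes per dataset,
# exactly 5 mta passes per mu pass, and 2000 SUs per mta pass ("2000?").

SU_PER_MU_PASS = 15_000
SU_PER_MTA_PASS = 2_000
MU_PASSES_PER_DATASET = 5
MTA_PASSES_PER_MU_PASS = 5
DATASETS = ("2011", "2013")
TB_PER_RUN = 75  # assumed: one run per dataset


def su_per_dataset(mu_passes=MU_PASSES_PER_DATASET,
                   mta_per_mu=MTA_PASSES_PER_MU_PASS,
                   su_mu=SU_PER_MU_PASS,
                   su_mta=SU_PER_MTA_PASS):
    """SUs for one dataset: each mu pass plus its associated mta passes."""
    return mu_passes * (su_mu + mta_per_mu * su_mta)


total_su = su_per_dataset() * len(DATASETS)
total_tb = TB_PER_RUN * len(DATASETS)

if __name__ == "__main__":
    print(f"SUs per dataset: {su_per_dataset():,}")
    print(f"Total SUs:       {total_su:,}")
    print(f"Total MSS space: {total_tb} TB")
```

Under these assumptions each dataset costs 125,000 SUs and the two datasets together cost 250,000 SUs, so a 300,000 SU request would leave about 50,000 SUs of headroom for reruns. The storage total comes to 150 TB.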

Draft of letter to Mike Pflugmacher

Dear Mike,

We have two experiments that currently make use of the Abe cluster. Our analysis software for both experiments operates in an embarrassingly parallel fashion, so we make no use of MPI or shared-memory functionality. The processors on any of the clusters you mentioned are sufficient for our needs, and we generally need about 2GB of memory per core. Given sufficient access to available cores, the bottleneck for running a pass over our dataset is the bandwidth from the MSS to staging scratch space.

One of our experiments, MuCap, will be concluding this summer, so only a few final dataset passes will be performed (fewer than 30,000 SUs required on Abe-equivalent hardware). The other, MuSun, is just now moving into full-dataset analysis and will require ~300,000 SUs (on Abe) over the course of the next 3-5 years. Given this time scale and the inconvenience of porting batch job scripts and the like to a new cluster, we prefer to move to the cluster that will remain active the longest.

Of the clusters you mentioned (QueenBee, Steele, Ember, Lonestar, Trestles), we have the following impressions:
QueenBee: shutting down in the summer, so not suitable.
Ember: hardware is acceptable, but the cluster is specialized for a highly parallel computing model that we would not use.
Steele: functionally similar to Abe for our purposes, but it has fewer available cores. What is its bandwidth to the MSS?
Trestles/Lonestar: hardware is acceptable, but do they have bandwidth to the MSS equivalent to Abe's?

