IRVINE, CA, July 25th, 2004 - Supercomputing power does not always relate directly to the number of processors. Instead, the most efficient high-performance computing systems are those specifically tuned for a particular application. For instance, HPC clusters running Computational Fluid Dynamics do not require the same hardware configuration as clusters searching a database. Superior cluster design is achieved by first understanding the application and then selecting hardware that maximizes computing power for your budget. The new PSSC Labs PowerWulf Linux cluster at the Rochester Institute of Technology (RIT), known as GalaxySimulator, is a perfect example of specialized computing to achieve superior performance. RIT's newest cluster has a peak performance of 4.0 Teraflops, placing it as the 60th fastest computer in the world. Amazingly, the cluster contains only 64 Intel Xeon processors. A machine of comparable performance on the Top500.org website has nearly 1000 processors.
How is this performance accomplished? David Merritt and other scientists at RIT understood that their cluster would be used primarily for n-body simulations. An n-body simulation models n particles in three-dimensional space and tracks their positions over time; in other words, it follows the motion of particles under a force that changes depending on position. RIT scientists discovered the micro-GRAPE6-A accelerator card, which was designed and built by a group of astrophysicists at the University of Tokyo. PSSC Labs integrated these GRAPE6-A cards into a PowerWulf Cluster based on Intel's Xeon processors.
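To make the n-body idea concrete, here is a minimal Python sketch of the direct-summation force calculation at the heart of such a simulation. It is an illustration only, not the GalaxySimulator or GRAPE code: the function name `accelerations`, the unit gravitational constant, and the optional `softening` parameter are assumptions chosen for this example.

```python
import math


def accelerations(positions, masses, G=1.0, softening=0.0):
    """Return the gravitational acceleration on each particle.

    positions: list of [x, y, z] coordinates, one per particle
    masses:    list of particle masses
    G:         gravitational constant (1.0 in these illustrative units)
    softening: optional length that avoids division by zero at tiny separations
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            # Separation vector pointing from particle i to particle j
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            # Inverse-square attraction: a_i += G * m_j * dx / |dx|^3
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc


# Two particles on the x-axis, separated by 2 length units
example = accelerations([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]], [1.0, 2.0])
```

Every particle interacts with every other particle, so the work grows as n squared. That cost is why offloading this one inner loop to dedicated hardware pays off so well.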
According to Dr. Merritt, GalaxySimulator is used "primarily for simulating the dynamical evolution of galactic nuclei, investigating such problems as the interaction of stars with supermassive black holes, the evolution of binary black holes, and the effects of mergers on the structure of galaxies. The RIT cluster is the largest of its kind in the world."
The GRAPE6-A cards contain multiple pipelines for computing inverse-square gravitational forces between particles; their peak performance is 130 Gflops per board, giving the cluster a speed of about 4 Tflops. The on-board memory of each GRAPE card can hold about 128K particles (positions, velocities, accelerations) for a total of 4M particles across the cluster; even greater particle numbers are possible if the gravitational forces are computed via a less-accurate 'tree' algorithm which puts most of the particles on the node memories.
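The cluster-wide figures follow from the per-board numbers by simple multiplication, as the short Python sketch below shows. The release does not state how many GRAPE6-A boards are installed; the value of 32 used here is an assumption inferred from the quoted totals (4M particles divided by 128K per board), and reading "128K" as 128 x 1024 is likewise an assumption.

```python
GFLOPS_PER_BOARD = 130            # peak per GRAPE6-A board, from the release
PARTICLES_PER_BOARD = 128 * 1024  # "about 128K"; binary K is assumed here
BOARDS = 32                       # assumption: inferred from 4M / 128K


def cluster_totals(boards=BOARDS):
    """Return (peak Tflops, particle capacity) for a given board count."""
    peak_tflops = boards * GFLOPS_PER_BOARD / 1000
    particles = boards * PARTICLES_PER_BOARD
    return peak_tflops, particles


peak, capacity = cluster_totals()
print(f"{peak:.2f} Tflops peak, {capacity:,} particles")
```

Under these assumptions the peak comes to 4.16 Tflops, which matches the quoted "about 4 Tflops", and the capacity comes to roughly 4.2 million particles.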
For complete details, please refer to "PSSC Labs Delivers GalaxySimulator to Rochester Institute of Technology".