Monkey Talk: Cluster Opinions and Insights from Cluster Monkey.
By Douglas Eadline
Who is Responsible?
The phrase "if I only had one throat to choke" often comes to mind when thinking about support for clusters. On the one hand, the history and nature of clusters promote a kind of do-your-own-thing methodology. On the other hand, how is the market ever going to grow beyond the pioneers if we do not have shrink-wrapped, turn-key systems with a single point of support?
In the past, I talked with a cluster user who kept paper files for each of the ten vendors he had to manage. From compilers to interconnects, a cluster is a collection of technologies with no single all-knowing vendor. How nice would it be to have one number, one voice, and dare I say one throat to choke when there are support issues? Traditional big iron supercomputers have a single point of contact. However, their single all-in-one price is a little too steep for most cluster users. I suspect that support is one area where clusters may increase rather than decrease the cost of HPC (High Performance Computing). Of course, many of the pioneering clusters were built without regard to the added support cost, and in many cases they thrive on a multi-vendor collaborative support model. Having built a few clusters in my day, I understand this attitude and have probably muttered on more than one occasion, "Support? I don't need no stinking support. I built the cluster, patched the software, integrated the middleware, and made it work. If it breaks, I will fix it."
Is single-source support possible for component systems? In reality, are we building custom cars and hoping the local dealerships will provide warranty service? Can we expect a single vendor to support the integration of everything we could possibly use to build a cluster? The price-to-performance curve of clusters is too compelling to ignore, and maybe a new support model will emerge that better suits the needs of production systems.
I believe the answers to these questions will begin to emerge as we move forward in the market. Finally, I would also like to think that, with a strong community and open standards, we do not have to choke throats to support clusters. We can continue to work together to develop best practices and have open discussions about how best to solve problems. After all, we really just want to crunch numbers faster than the next guy, right?
Douglas Eadline can be found swinging around the binary trees at Cluster Monkey.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.