SpyderByte.com ;Technical Portals 
      
 News & Information Related to Linux High Performance Computing, Linux Clustering and Cloud Computing
University of Alberta wins the Cluster Challenge
Posted by Michael Edwards, Monday November 26 2007 @ 02:00PM EST

The team from the University of Alberta took the checkered flag at the first-ever Cluster Challenge at SuperComputing 2007. The team consisted of six student members: Antoine Filion, Paul Greidanus (Student Lead), Gordon Willem Klok, Chris Kuethe, Andrew Nisbet, and Stephen Portillo. Dr Paul Lu, an associate professor from UA’s Department of Computing Science, coached the team.

Each team, made up of six undergraduates and one university staff member, was allowed a single 19” rack of equipment that was assembled at the conference center. Over the first three days of the conference, a combination of industry-standard benchmarks and current scientific modeling problems was run on the clusters. The teams were also limited to a single 30A, 110V power circuit, and penalties were given for excess draw. Results were displayed throughout the competition on a 42” display, which also counted against the power limit. Participants were judged by benchmark performance and the throughput of the scientific applications.

The university teams were paired with corporate sponsors that provided the computers and networking equipment used for the competition. SGI supplied the University of Alberta team with five Altix XE310 nodes, on which the team installed Scientific Linux 4.5 and OSCAR 5.0 (Open Source Cluster Application Resources, http://oscar.openclustergroup.org). The 1U XE310 consists of two motherboards, each of which is dual-socket running 2.66 GHz quad-core processors, for a total of 16 cores per chassis, and 16 GB of RAM. The team used a total of 48 cores for the HPCC/Linpack runs, and 64 for the rest of the competition (Linpack is expensive on the competition's 26A power budget). A Voltaire ISR-9024 InfiniBand switch provided a high-performance network for the cluster.
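The per-chassis core count follows directly from the figures above. A small sketch of that arithmetic follows; the constant and function names are illustrative only and are not anything the team is reported to have used.

```python
# Figures taken from the article; names are illustrative only.
MOTHERBOARDS_PER_CHASSIS = 2
SOCKETS_PER_MOTHERBOARD = 2
CORES_PER_SOCKET = 4  # quad-core processors

CORES_PER_CHASSIS = (
    MOTHERBOARDS_PER_CHASSIS * SOCKETS_PER_MOTHERBOARD * CORES_PER_SOCKET
)


def chassis_needed(cores: int) -> int:
    """Smallest number of whole chassis that provides `cores` cores."""
    return -(-cores // CORES_PER_CHASSIS)  # ceiling division


print(CORES_PER_CHASSIS)   # 16
print(chassis_needed(48))  # 3
print(chassis_needed(64))  # 4
```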

The software stack used by the University of Alberta team was composed of OSCAR, SystemImager, C3, Sun Grid Engine, Ganglia, and MVAPICH2 as core elements. Each piece was chosen for its usability, manageability and performance. "OSCAR allowed us to deploy the cluster quickly, and get onto the important work of benchmarking and characterizing our applications. It also sets up applications to allow our jobs to run, and have status displays with no effort," said Paul Greidanus, the team’s Student Lead.

OSCAR includes all the software necessary for building an HPC cluster. No prior experience is necessary, and its intuitive interfaces allow anyone to quickly deploy a supercomputer.

According to Bernard Li, an open source developer who has worked with a number of open source clustering-related applications, including OSCAR, "Open source clustering software has come a long way since they started appearing in the late 90's. Now these softwares have moved out of research/academia and into the enterprise as the code gets more mature and stable. These open source software will continue to contribute greatly in bringing HPC to the masses in the years to come."

However, as in all competitions, people and strategy are also important, and this is one place where the team's approach and effort before the competition paid off.

Team member Chris Kuethe had this to say about the team’s preparations: "Before the competition we tested each application thoroughly. This allowed us to ensure that the applications worked correctly, and gave us processes for managing runtime errors. The testing also revealed some critical properties about the application scaling: sometimes the cost of off-node communication was offset by access to a larger number of cores, in other cases a smaller number of cores would suffice if slow interconnects could be avoided. We were pleased to find that POV-Ray could be used to consume any unallocated CPU time without adversely impacting other running processes."

Gordon Willem Klok also thought that preparation was important to the team’s victory. "In particular [preparation] was emphasized by our coach Dr Paul Lu. Understanding the importance of speedup curves and correctly characterizing the applications ahead of time allowed us to find a good mix of work loads to maximize utilization."
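For readers unfamiliar with speedup curves, the following is a minimal sketch of how one might be computed from timing runs. The timings are invented for illustration and are not measurements from the competition; the function names are likewise hypothetical.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup relative to the single-core run time."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """Parallel efficiency: speedup divided by the number of cores used."""
    return speedup(t_serial, t_parallel) / cores


# Hypothetical wall-clock times in seconds, keyed by core count.
# These are NOT measurements from the competition.
timings = {1: 1000.0, 8: 140.0, 16: 80.0, 32: 55.0, 64: 50.0}

for cores, t in sorted(timings.items()):
    s = speedup(timings[1], t)
    e = efficiency(timings[1], t, cores)
    print(f"{cores:3d} cores: speedup {s:5.2f}, efficiency {e:4.2f}")
```

An application whose efficiency falls off sharply at higher core counts is a candidate for running on fewer cores, leaving the remainder for other work.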

Power was the major limiting factor between the teams, and was a constant source of challenges. The teams were given two metered power bars with 13A current limits which they were not permitted to exceed.
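As a rough sketch of what that limit means in watts, the calculation below combines the two 13A power bars with the 110V figure quoted earlier. It assumes the nominal voltage applies at the bars and a unity power factor unless stated otherwise; both are simplifying assumptions, and the names are illustrative.

```python
POWER_BARS = 2
AMPS_PER_BAR = 13.0
NOMINAL_VOLTS = 110.0  # assumed to apply at the metered power bars


def budget_amps() -> float:
    """Total current allowed across both metered power bars."""
    return POWER_BARS * AMPS_PER_BAR


def budget_watts(power_factor: float = 1.0) -> float:
    """Approximate real power available; unity power factor by default."""
    return budget_amps() * NOMINAL_VOLTS * power_factor


print(budget_amps())   # 26.0
print(budget_watts())  # 2860.0
```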

Power management was also very important in the team’s victory.

"We arrived very well prepared, having conducted a considerable number of tests to develop a power profile and gauge this profile against the contest limitations. We chose to characterize not just the cluster power consumption in aggregate, but attempted to ascertain what the cost of each of the removable hardware components was and weighed this carefully against its utility in terms of performance or ease of use," said team member Gordon Willem Klok.

A power outage in the section of Reno where the conference was being held caused the entire convention center to lose power for a time, interrupting the competition. The University of Alberta team recovered quickly from this outage, due mostly to the team's preparation and to their scheduler allocating jobs onto the nodes as soon as they came back.

The team spent most of the setup day running the challenge applications and the datasets provided by the challenge organizers. They were surprised to learn that one benchmark, the popular LINPACK benchmark, would consistently break the contest’s power limitations if they used the 56 cores they had planned in Edmonton. The team decided to risk losing points on the LINPACK benchmark by lowering the number of cores used during that portion of the competition rather than risk breaking the power limitations. They also discovered during the setup day that they could run the other applications on all 64 cores if they used the built-in throttling mechanism of the XE310 to even out the load on the processors and handle occasional power spikes caused by changing workloads. "It was a risky gamble that in the end paid off; we were drawing precisely 26 amps for a good portion of the competition," said team member Gordon Willem Klok.
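The kind of trade-off described above can be sketched as a lookup against a measured power profile: pick the largest core count whose draw stays within the limit. The profile values below are invented for illustration and are not measurements from the competition; the function name and the `margin` parameter are hypothetical.

```python
from typing import Dict, Optional

LIMIT_AMPS = 26.0  # two 13A power bars, per the article


def largest_safe_core_count(
    profile: Dict[int, float],
    limit: float = LIMIT_AMPS,
    margin: float = 0.0,
) -> Optional[int]:
    """Return the largest core count whose measured draw, plus an
    optional safety margin, stays within the limit; None if none fit."""
    safe = [cores for cores, amps in profile.items() if amps + margin <= limit]
    return max(safe) if safe else None


# Hypothetical draw in amps per core count for one benchmark.
# These numbers are made up and are NOT data from the competition.
example_profile = {32: 19.5, 48: 25.0, 56: 27.5, 64: 30.0}

print(largest_safe_core_count(example_profile))              # 48
print(largest_safe_core_count(example_profile, margin=2.0))  # 32
```

A nonzero margin leaves headroom for the power spikes that changing workloads can cause.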


     Copyright © 2001-2013 LinuxHPC.org
Linux is a trademark of Linus Torvalds
All other trademarks are those of their owners.
    