SpyderByte.com ;Technical Portals 
      
 News & Information Related to Linux High Performance Computing, Linux Clustering and Cloud Computing
LinuxHPC.org

Solving the programming problems of parallel Linux clusters
Posted by Ilya Mirman, Friday November 10 2006 @ 03:16PM EST

The case for promoting Linux clusters over traditional supercomputers has focused on hardware affordability. Proponents argue that open source architectures built on standards-based multicore CPUs promise to make high-performance computers (HPCs) affordable and accessible to a mass market of mainstream technical computing users. Yesterday’s large proprietary systems costing a quarter million dollars have given way to very cost-effective Linux clusters costing $20K or so.

A big reason is the increasing computational sophistication of commodity, standards-based microprocessors. The AMD Opteron processor and Intel Xeon, for example, have gained traction in the HPC market thanks to their ability to support 64-bit computations and large, high-speed memory capacity at an affordable price. In fact, the Opteron now powers approximately 10 percent of the world’s 500 most powerful supercomputers. But while commoditization and open source technology have certainly lowered costs, the enormous complexity of programming parallel Linux clusters remains the biggest barrier to their accessibility. The “software gap,” the gap between the hardware capabilities of a Linux cluster and the benefits we can practically extract through programming, is wide and growing. There is a lack of applications available for parallel computers, and custom development of parallel applications is fundamentally flawed.

Here’s why: technical computing spans two divorced realms – desktop computers and HPCs. Both environments have much to offer, but the disconnect must be overcome if the power of Linux clusters is to be harnessed.

Desktop computers have been the preferred platform for science and engineering, particularly during the early stages of new product or system modeling, simulation, and optimization. The interactivity offered by desktop tools lends itself well to the iterative process of research and discovery.

However, the single-core CPU limits desktop performance, and the demands of technical computing are outpacing what Moore’s Law can deliver on one core. Users understand that their success no longer depends on increasing CPU clock speeds, but on putting multiple processors to work simultaneously in clusters or other parallel architectures. Yet parallel architectures are inherently non-interactive, batch-mode beasts that stymie the real-time feedback needed for scientific and engineering productivity.

The interactive dilemma

Millions of engineers and scientists have access to a rich set of interactive high-level software applications in two general categories: 1) very high level languages (VHLLs) for custom application development, such as MATLAB, Python, Mathematica, Maple, or IDL; and 2) vertical applications developed by commercial independent software vendors (ISVs), such as SolidWorks for computer-aided design or Ansys for finite-element analysis.

The desktop tools offer an easy way to manipulate high-level objects (e.g., matrices with MATLAB, or parameter-driven geometric features in SolidWorks), hiding many of the underlying low-level programming complexities from the user. They also provide an interactive development and execution environment, the usage mode needed for productivity in science and engineering. But in the HPC world there are few commercial applications for parallel servers: fewer than 5 percent of desktop science and engineering applications run on Linux clusters, or on any other kind of parallel HPC server for that matter. This limited application availability is compounded by the specialized nature of the models and algorithms. Consequently, most technical applications for clusters are handed off to a parallel programming specialist to transform into custom code. Typically the handoff consists of a prototype program written in a desktop-based high-level application (MATLAB, etc.) and “prose” that attempts to capture the particular model, system, or algorithm. The parallel programmer then rewrites the application in C or Fortran, using MPI (Message Passing Interface) for inter-processor communication and synchronization. This is relatively complex low-level programming. Only after the application is developed for the HPC server can it be executed to allow testing and scaling with the real data.
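
To make the contrast concrete, here is a minimal sketch in Python, one of the VHLLs named above. The computation (a sum of squares) and all function names are assumptions chosen for illustration. The first function is what a desktop user would write; the second spells out the partition, compute, and combine steps that a message-passing program has to manage by hand. Threads from the standard library stand in for cluster nodes, so this shows the extra bookkeeping rather than any real speedup, and it is not MPI code.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares_desktop(values):
    """Desktop-style version: one high-level expression."""
    return sum(v * v for v in values)


def split_into_chunks(values, n_workers):
    """Partition the data into n_workers nearly equal contiguous chunks."""
    base, extra = divmod(len(values), n_workers)
    chunks, start = [], 0
    for rank in range(n_workers):
        # The first `extra` workers each take one additional element.
        size = base + (1 if rank < extra else 0)
        chunks.append(values[start:start + size])
        start += size
    return chunks


def sum_squares_decomposed(values, n_workers=4):
    """Hand-decomposed version: scatter chunks, compute partial sums, reduce."""
    chunks = split_into_chunks(values, n_workers)  # "scatter" step
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each simulated worker computes a partial result on its own chunk.
        partials = list(pool.map(sum_squares_desktop, chunks))
    return sum(partials)  # "reduce" step


data = list(range(1, 1001))
print(sum_squares_desktop(data) == sum_squares_decomposed(data))
```

Even for this trivial calculation, the decomposed version needs a partitioning rule, a worker pool, and a final reduction, none of which has anything to do with the science being modeled.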

This process is slow, expensive, inflexible, and remarkably error-prone. Because each of these steps can take several months, scientists and engineers are limited in how many iterations of the algorithms and models they can make. More than 75 percent of the “time to solution” is spent programming the models for use on HPCs, rather than developing and refining them up front, or using them in production to make decisions and discoveries.

Bridging the gap

The ideal solution to this problem is a “fusion technology” that combines the power of a Linux cluster with the desktop application. It must enable scientists and engineers to write applications in their favorite VHLLs on desktops, and have them automatically parallelized and able to run interactively on Linux clusters. In other words, let the end users continue to work in their preferred environments, hide the parallel programming challenges from them, and give them easier access to the Linux cluster.
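
One way to picture such a fusion layer is an array-like object that keeps the user’s code looking serial while the work is split up behind the scenes. The sketch below is purely illustrative: `DistributedVector`, its methods, and the thread pool standing in for cluster nodes are all hypothetical, and none of this is the API of any product mentioned in this article.

```python
from concurrent.futures import ThreadPoolExecutor


class DistributedVector:
    """Hypothetical vector that hides chunking behind ordinary operators."""

    def __init__(self, values, n_workers=4):
        values = list(values)
        self.n_workers = max(1, n_workers)
        # Split into contiguous chunks, roughly one per simulated worker.
        step = max(1, -(-len(values) // self.n_workers))  # ceiling division
        self.chunks = [values[i:i + step] for i in range(0, len(values), step)]

    def _map_chunks(self, fn, *other_chunk_lists):
        """Apply fn chunk by chunk on the pool and wrap the result."""
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            new_chunks = list(pool.map(fn, self.chunks, *other_chunk_lists))
        result = DistributedVector([], self.n_workers)
        result.chunks = new_chunks
        return result

    def __add__(self, other):
        # Element-wise add; assumes both vectors share the same chunk layout.
        return self._map_chunks(
            lambda a, b: [x + y for x, y in zip(a, b)], other.chunks)

    def __mul__(self, scalar):
        # Scale every element by a plain number.
        return self._map_chunks(lambda a: [x * scalar for x in a])

    def sum(self):
        # Partial sums per chunk, then a final reduction.
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            return sum(pool.map(sum, self.chunks))

    def to_list(self):
        return [x for chunk in self.chunks for x in chunk]


# User code reads like ordinary serial arithmetic.
x = DistributedVector(range(10))
y = DistributedVector(range(10))
z = (x + y) * 3
print(z.to_list())
print(z.sum())
```

The point of the design is that the last five lines contain no mention of workers, chunks, or communication; a real platform would replace the thread pool with cluster nodes and a much richer set of operations.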

With this approach, they could write just enough of the application in a VHLL to start testing with real data, and then refine the application incrementally. They could also take advantage of the many popular open source parallel libraries that are already publicly available, turning these traditionally batch-mode tools into interactive resources. With an interactive workflow, the time to “first calculation” can be minutes, rather than the several months or years required to first program the parallel application.

Recently, several VHLL software vendors have introduced new parallel solutions that bridge desktops to Linux clusters. They are typically hybrid platforms built with both proprietary and open source components. For example, Interactive Supercomputing’s (ISC’s) Star-P is an interactive parallel computing platform that incorporates a number of popular open source libraries. The company worked with the open source community to integrate and debug these libraries and to improve their performance. They include:

- ScaLAPACK, a software library for linear algebra computations on distributed-memory computers,

- ATLAS, the Automatically Tuned Linear Algebra Software designed to provide portably optimal linear algebra software,

- FFTW, a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions, and

- SuperLU, a general purpose library for the direct solution of large, sparse, nonsymmetric systems of linear equations on high performance machines.
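
For readers unfamiliar with the FFTW entry above, the discrete Fourier transform maps N samples x_t to N coefficients X_k = sum over t of x_t * exp(-2*pi*i*k*t/N). The sketch below evaluates that definition directly in Python in O(N^2) time. It only illustrates what the transform is; it does not use or reproduce FFTW, which relies on much faster O(N log N) algorithms. The function name `naive_dft` is made up for this example.

```python
import cmath
import math


def naive_dft(samples):
    """Direct evaluation of the unnormalized forward DFT definition."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]


# A cosine with exactly one cycle over 8 samples: its energy should
# land in frequency bins 1 and 7, with the other bins close to zero.
signal = [math.cos(2 * math.pi * t / 8) for t in range(8)]
spectrum = naive_dft(signal)
print([round(abs(c), 6) for c in spectrum])
```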

Interactive parallel computing platforms such as Star-P also feature software development toolkits (SDKs) that enable users to “plug in” existing codes from the open source community. This plug-in capability will allow the hundreds of thousands of scientists, engineers, and analysts who use high-performance computing at government, academic and commercial research facilities to easily string together open source libraries. Consider, for example, Trilinos, an open source project developed at Sandia National Laboratories to help facilitate the design, development, integration and ongoing support of mathematical software libraries. A Trilinos package is an integral unit usually developed by a small team of experts in a particular algorithms area such as algebraic preconditioners or nonlinear solvers. Ken Stanley of 500 Software, principal architect of the Amesos direct sparse solver package in the Trilinos framework, developed a Star-P interface to the framework to provide broad-ranging high-performance capabilities for solving the numerical systems at the heart of many complex multiphysics applications.

Once programming barriers are lowered, many more scientists and engineers can take advantage of the affordability and accessibility of Linux clusters and other high performance open source technology to experience supercomputing for the first time. The development of custom HPC codes that used to take months or years will become as interactive as our desktop PCs are today.

# # #

Ilya Mirman is vice president at Interactive Supercomputing (http://www.interactivesupercomputing.com) and can be reached at imirman@interactivesupercomputing.com.


     Copyright © 2001-2013 LinuxHPC.org
Linux is a trademark of Linus Torvalds
All other trademarks are those of their owners.