 News & Information Related to Linux High Performance Computing, Linux Clustering and Cloud Computing
LinuxHPC.org
Multiprocessor Performance Analysis
Posted by Pradosh K Roy, Tuesday December 28 2004 @ 07:34AM EST

Performance analysis remains an ambiguous term when applied to computer systems, irrespective of their organization and architecture. An early measure was MIPS (millions of instructions per second). MIPS has fallen out of use because it does not account for the fact that different systems require different numbers of instructions to implement a given program. Another metric is CPI (cycles per instruction). Both MIPS and CPI have severe limitations as measures of performance. With the advent of RISC and hardware/software ILP, a new measure, MFLOPS, was introduced. This measure led to the concept of Theoretical Peak Performance [TPP], which is simply the product of the number of floating-point results the FP pipes can deliver per cycle and the clock rate. TPP, however, is useful for predicting observed performance only when the workload consists of small programs that normally operate close to the peak; for real-world scientific applications it gives only a rough indication of the performance that can be obtained. A more meaningful metric is sustained performance.
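
The metrics above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the two machines A and B, and every number are hypothetical and do not come from any measured system. It shows how a machine with a higher MIPS rating can still be slower on the same program, and how TPP is obtained as a product.

```python
def mips(instruction_count, exec_time_s):
    """Millions of instructions executed per second."""
    return instruction_count / (exec_time_s * 1e6)


def cpi(cycles, instruction_count):
    """Average clock cycles per instruction."""
    return cycles / instruction_count


def theoretical_peak_mflops(flops_per_cycle, clock_mhz, processors=1):
    """TPP in MFLOPS: FP results per cycle x clock rate (MHz) x processors."""
    return flops_per_cycle * clock_mhz * processors


def fraction_of_peak(sustained_mflops, peak_mflops):
    """How much of the theoretical peak a real workload actually sustains."""
    return sustained_mflops / peak_mflops


if __name__ == "__main__":
    # Hypothetical: the same program compiled for two different machines.
    # Machine A needs 2e9 instructions and finishes in 2.0 s.
    # Machine B needs 3e9 instructions and finishes in 2.5 s.
    print("A MIPS:", mips(2e9, 2.0))   # lower MIPS rating, shorter run
    print("B MIPS:", mips(3e9, 2.5))   # higher MIPS rating, longer run

    # Hypothetical 4-processor node, 2 FP results per cycle at 1000 MHz.
    peak = theoretical_peak_mflops(2, 1000, processors=4)
    print("TPP (MFLOPS):", peak)
    print("fraction of peak at 2000 MFLOPS sustained:",
          fraction_of_peak(2000, peak))
```

Machine B wins on MIPS yet takes longer to finish, which is exactly why execution time, not instruction rate, is the quantity to compare.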

The execution time of real programs on an unloaded system, i.e. the latency to complete a job including disk access, memory access, I/O activity and OS overhead, can be taken as a reliable and reproducible measure of performance, following Hennessy & Patterson. CPU performance is defined as the user CPU time on an unloaded system, while system performance is defined as the elapsed time on an unloaded system.
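
The distinction between user CPU time and elapsed time can be observed directly. The following is a minimal sketch using only the Python standard library; the helper `measure` and the toy workload `busy_then_wait` are hypothetical names introduced for illustration, and the sleep merely stands in for time spent waiting on I/O.

```python
import os
import time


def measure(workload, *args):
    """Run workload once; return (elapsed_s, user_cpu_s, system_cpu_s)."""
    cpu_before = os.times()            # per-process CPU time counters
    wall_before = time.perf_counter()  # high-resolution wall clock
    workload(*args)
    wall_after = time.perf_counter()
    cpu_after = os.times()
    return (wall_after - wall_before,
            cpu_after.user - cpu_before.user,
            cpu_after.system - cpu_before.system)


def busy_then_wait():
    """Toy workload: some computation, then a wait standing in for I/O."""
    sum(i * i for i in range(200_000))
    time.sleep(0.05)


if __name__ == "__main__":
    elapsed, user, system = measure(busy_then_wait)
    print(f"elapsed time : {elapsed:.3f} s")
    print(f"user CPU time: {user:.3f} s")
    print(f"system time  : {system:.3f} s")
```

On an otherwise idle machine the elapsed time exceeds the user CPU time by roughly the time spent waiting, which is the gap between system performance and CPU performance in the sense defined above.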

For the overall performance evaluation of uniprocessor and multiprocessor systems, benchmark suites have been in use since the 1980s. The five levels of benchmark programs, in decreasing order of accuracy, are real applications, scripted applications, kernels, toy benchmarks and synthetic benchmarks. One attempt to create a standardized uniprocessor benchmark suite is that of SPEC; SPEC89, SPEC95 and SPECcpu2000 are successive releases. Unfortunately, it is unclear whether the current SPECcpu suite adequately represents today's high performance applications, let alone other application classes such as mobile computing, according to Skadron et al. For multiprocessor systems, benchmarking is more difficult for at least four reasons pointed out by Culler et al.: immaturity of parallel applications, immaturity of parallel programming languages, sensitivity of programs to behavioral differences, and limitations of simulation. Nevertheless, some illustrious benchmarks used for multiprocessor performance analysis since 1994 include NPB2.x, ParkBench, LINPACK and HPL. Microbenchmarks, which offer a way to isolate individual aspects of a processor's performance, have made some progress in recent times. For example, memory performance for data-intensive applications has recently been addressed using the Data Cube by Frumkin et al.

A known disadvantage of all these multiprocessor performance benchmarks is that they do not measure system-level aspects of performance. A new methodology proposed by David Bailey and the LBNL group offers a measure that covers system utilization, effectiveness of job scheduling, handling of large jobs and the level of process management on multiprocessor production systems.

The methodology suggested by Bailey defines the System Efficiency Ratio and Effective System Performance, which extend the idea of a throughput benchmark with features that mimic day-to-day supercomputer centre operations. The system efficiency ratio is independent of the computational rate and is also relatively independent of the number of processors used, thus permitting comparison between platforms.
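
The article does not state the formula for the efficiency ratio, so the sketch below rests on an assumption: the ratio is taken as the processor-seconds the job mix would need on a dedicated system, divided by the processor-seconds for which the whole machine was occupied while running that mix. The `Job` class, the function name and the job mix are hypothetical; the cited LBNL reports should be consulted for the exact definition.

```python
from dataclasses import dataclass


@dataclass
class Job:
    processors: int             # partition size requested by the job
    dedicated_runtime_s: float  # runtime when the job has the machine to itself


def efficiency_ratio(jobs, total_processors, observed_elapsed_s):
    """Assumed form: ideal processor-seconds / processor-seconds consumed."""
    ideal_work = sum(j.processors * j.dedicated_runtime_s for j in jobs)
    return ideal_work / (total_processors * observed_elapsed_s)


if __name__ == "__main__":
    # Hypothetical job mix on a hypothetical 64-processor system.
    mix = [Job(32, 100.0), Job(32, 100.0), Job(64, 50.0), Job(16, 200.0)]
    # With perfect packing the mix would need 12800 / 64 = 200 s.
    # Suppose scheduling gaps and contention stretch the run to 250 s.
    print("efficiency:", efficiency_ratio(mix, 64, 250.0))
```

Because both numerator and denominator scale with the runtimes, a machine that is uniformly twice as fast gets the same ratio under this form, which matches the claim that the measure is independent of the computational rate.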

The measures suggested by Bailey provide quantitative data on system contention effects and on the scheduling efficiency of the system, as well as useful insight into how to manage them better.

The methodology could be adopted by the Linux high performance cluster computing community for commercial and industrial benefit.

Selected References
1. Hennessy, Patterson, 2003, Computer Architecture: A Quantitative Approach.
2. NAS Parallel Benchmark suite, available from http://www.nas.nasa.gov
3. Wong, A., et al., 1999, Evaluating System Effectiveness in High Performance Computing Systems, LBNL Technical Report No. 44542.
4. Wong, A., et al., 2000, ESP: A System Utilization Benchmark, Proc. Supercomputing 2000 Conference, IEEE Press.
5. Skadron et al., 2003, Challenges in Computer Architecture Evaluation, IEEE Computer, pp. 30-36.

     Copyright © 2001-2013 LinuxHPC.org
Linux is a trademark of Linus Torvalds
All other trademarks are those of their owners.
    