Performance analysis remains an ambiguous term when used in the context of computer systems, irrespective of their organization and architecture. An early measure was MIPS. MIPS fell out of use because it does not account for the fact that different systems require different numbers of instructions to implement a given program. Another metric is CPI. Both MIPS and CPI have severe limitations as measures of performance. With the advent of RISC and hardware/software ILP, a new measure, MFLOPS, was introduced. This measure led to the concept of Theoretical Peak Performance (TPP), which is simply the product of the number of floating-point pipelines and the clock rate. TPP, however, is not useful in predicting observed performance unless the workload consists of small programs that normally operate close to the peak; it provides only a rough indication of what performance can be obtained on real-world scientific applications. A more meaningful metric is sustained performance.
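The relationships between these metrics can be sketched as follows; the machine parameters (2 floating-point pipelines, 500 MHz clock) are hypothetical numbers chosen purely for illustration:

```python
# Illustrative sketch: how MIPS, CPI, and theoretical peak performance (TPP)
# are derived from basic machine parameters. All numbers below are made up.

def mips(instruction_count, cpu_time_seconds):
    """Millions of instructions executed per second."""
    return instruction_count / (cpu_time_seconds * 1e6)

def cpi(clock_cycles, instruction_count):
    """Average clock cycles per instruction."""
    return clock_cycles / instruction_count

def tpp_mflops(num_fp_pipes, clock_rate_mhz, flops_per_pipe_per_cycle=1):
    """Theoretical peak: FP results per cycle times clock rate."""
    return num_fp_pipes * flops_per_pipe_per_cycle * clock_rate_mhz

# A hypothetical machine: 2 FP pipelines at 500 MHz
print(tpp_mflops(2, 500))                  # 1000 MFLOPS peak
print(mips(2_000_000_000, 4.0))            # 500.0 MIPS
print(cpi(8_000_000_000, 2_000_000_000))   # 4.0 cycles per instruction
```

Note that MIPS depends on the instruction count, which is exactly why it cannot compare systems with different instruction sets, as the text observes.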
Following Hennessy & Patterson, the execution time of real programs on an unloaded system, i.e. the latency to complete a job including disk access, memory access, I/O activity, and OS overhead, can be taken as a reliable and reproducible measure of performance. The term CPU performance is defined as the user CPU time on an unloaded system, while system performance is defined as the elapsed time on an unloaded system.
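The distinction between these two definitions can be demonstrated directly: a minimal sketch, using Python's standard timers, where a sleep stands in for I/O wait that contributes to elapsed time but not CPU time:

```python
# Sketch: user/system CPU time vs. elapsed (wall-clock) time, mirroring the
# Hennessy & Patterson distinction above. The workload here is a toy loop.
import time

def busy_work(n):
    """A CPU-bound loop: its cost shows up in both CPU and elapsed time."""
    total = 0
    for i in range(n):
        total += i * i
    return total

wall_start = time.perf_counter()   # elapsed time -> "system performance"
cpu_start = time.process_time()    # CPU time     -> "CPU performance"

busy_work(1_000_000)
time.sleep(0.1)                    # I/O-like wait: elapsed time only

elapsed = time.perf_counter() - wall_start
cpu = time.process_time() - cpu_start
print(f"elapsed: {elapsed:.3f}s, CPU: {cpu:.3f}s")  # elapsed exceeds CPU time
```

On a loaded system the gap between the two grows further, which is why both definitions above stipulate an unloaded system.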
For the overall performance evaluation of uniprocessor and multiprocessor systems, benchmark suites have been introduced since the 1980s. The five levels, in decreasing order of accuracy, are real applications, scripted applications, kernels, toy benchmarks, and synthetic benchmarks. One of the attempts to create standardized uniprocessor benchmark suites is by SPEC; SPEC89, SPEC95, and SPECcpu2000 are typical benchmarks introduced in succession. Unfortunately, it is unclear whether the current SPECcpu suite adequately represents today's high-performance applications, let alone other application classes such as mobile computing, according to Skadron et al. For multiprocessor systems, benchmarking is more difficult for at least four reasons pointed out by Cueller et al.: the immaturity of parallel applications, the immaturity of parallel programming languages, the sensitivity of programs to behavioral differences, and the limitations of simulation. Nevertheless, some notable benchmarks developed since 1994 for multiprocessor performance analysis include NPB 2.x, ParkBench, LINPACK, HPL, etc. Microbenchmarks, which offer a way to isolate individual aspects of a processor's performance, have registered some progress in recent times. For example, memory performance for data-intensive applications has recently been addressed using the Data Cube by Frumkin et al.
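To illustrate the microbenchmark idea in miniature (this is not one of the suites named above, just a toy sketch): time the same arithmetic over the same data with two different access patterns, so that any timing difference isolates the memory-access aspect alone. The array size and stride are arbitrary choices.

```python
# Toy microbenchmark sketch: sequential vs. strided traversal of one array.
# Identical arithmetic work; only the memory access pattern differs.
import time

def traverse(data, stride):
    """Sum every `stride`-th element, covering all elements across passes."""
    total = 0
    n = len(data)
    for start in range(stride):
        for i in range(start, n, stride):
            total += data[i]
    return total

data = list(range(1 << 18))  # ~256K elements

for stride in (1, 64):
    t0 = time.perf_counter()
    s = traverse(data, stride)
    dt = time.perf_counter() - t0
    print(f"stride {stride:3d}: sum={s}, {dt:.4f}s")
```

Both traversals compute the same sum, so any difference in the reported times is attributable to locality alone; a real microbenchmark would use a lower-level language to avoid interpreter overhead masking the effect.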
A known disadvantage of all these multiprocessor performance benchmarks is that they do not measure system-level aspects of performance. A new methodology suggested by David Bailey and the LBNL group includes measures of system utilization, effectiveness of job scheduling, handling of large jobs, and level of process management for multiprocessor production systems.
Bailey's methodology defines a System Efficiency Ratio and Effective System Performance (ESP), which extend the idea of a throughput benchmark with features that mimic day-to-day supercomputer centre operations. The system efficiency ratio is independent of the computational rate and is also relatively independent of the number of processors used, thus permitting comparison between platforms.
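A sketch of how such an efficiency ratio might be computed, in the spirit of ESP: the processor-time actually delivered to jobs as a fraction of the processor-time available over the run. The job mix below is invented for illustration; the actual ESP job mix and definition are given in the Wong et al. papers cited in the references.

```python
# Sketch of a system efficiency ratio in the spirit of ESP: delivered
# processor-seconds over available processor-seconds. Job data are made up.

def system_efficiency(jobs, total_processors, elapsed_time):
    """jobs: list of (processors_used, runtime_seconds) tuples."""
    delivered = sum(p * t for p, t in jobs)
    available = total_processors * elapsed_time
    return delivered / available

# Hypothetical workload on a 64-processor system over a 1000-second window
jobs = [(16, 600), (32, 400), (8, 900), (64, 100)]
print(system_efficiency(jobs, 64, 1000))  # 0.5625
```

Because the ratio is dimensionless, doubling the computational rate of every job (or the machine size, with a matching workload) leaves it roughly unchanged, which is what permits the cross-platform comparison noted above.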
The measures suggested by Bailey provide quantitative data on system contention effects and scheduling efficiency, as well as useful insight into how to manage them better.
The methodology could be adopted by the Linux High Performance Cluster Computing Community for commercial/industrial benefits.
Selected References
1. Hennessy, J. L. & Patterson, D. A., 2003, Computer Architecture: A Quantitative Approach.
2. NAS Parallel Benchmark Suite, available from http://www.nas.nasa.gov.
3. Wong, A., et al., 1999, Evaluating System Effectiveness in High Performance Computing Systems, LBNL Technical Report No. 44542.
4. Wong, A., et al., 2000, ESP: A System Utilization Benchmark, Proc. Supercomputing 2000 Conference, IEEE Press.
5. Skadron et al., 2003, Challenges in Computer Architecture Evaluation, IEEE Computer, pp. 30-36.