by Allen Goldberg, Jan Prins, John Reif, Rik Faith, Zhiyong Li, Peter Mills, Lars Nyland, Dan Palmer, James Riely, Stephen Westfold, April, 1994 (rev. Aug 1994).
In recent years technological advances have made the construction of large-scale parallel computers economically attractive. These machines have the potential to provide fast solutions to computationally demanding problems that arise in computational science, real-time control, computer simulation, large database manipulation, and other areas. However, applications that exploit this performance potential have been slow to appear; such applications have proved exceptionally difficult to develop and, once written, too often fail to deliver the expected speed-up.
This state of affairs may be explained by the proliferation of parallel architectures and the simultaneous lack of effective high-level architecture-independent programming languages. Parallel applications are currently developed using low-level parallel programming notations that reflect specific features of the target architecture (e.g., shared vs. distributed memory, SIMD vs. MIMD, exposed vs. hidden interconnection network). These notations have significant drawbacks:
The problem is a fundamental one: abstract models of parallel computation lead to unrealistic algorithms, whereas machine-specific models lead to intractable analysis of even the simplest programs. The effect of the target architecture is pervasive: at a high level, different architectures often require fundamentally different algorithms to achieve optimal performance; at a low level, overall performance exhibits great sensitivity to changes in communication topology and memory hierarchy. The result is that the developer of a parallel application must explore a complex and poorly-understood design space that contains significant architecture-dependent trade-offs. Moreover, this algorithmic exploration must be conducted using programming languages that offer exceptionally low levels of abstraction in their mechanisms for expressing concurrency. While a reasonable solution to these problems is to trade reduced access to architecture-specific performance for improved abstract models of computation, this trade may not always be the right one: the whole point of parallelism, for most applications, is performance.
The goal of the DUNCK (Duke University, University of North Carolina at Chapel Hill, and the Kestrel Institute) ProtoTech effort is to provide improved capabilities for