A compiler-driven supercomputer

Abstract

The overall performance of supercomputers is low relative to the speed of their underlying logic technology. This discrepancy is due to several bottlenecks: memories are slower than the CPU, conditional jumps limit the usefulness of pipelining and pre-fetching mechanisms, and functional-unit parallelism is limited by the speed of hardware scheduling. This paper describes a supercomputer architecture called Ring of Pre-fetch Elements (ROPE) that attempts to solve the problems of memory latency and conditional jumps without hardware scheduling. ROPE consists of a deeply pipelined CPU data path with a new instruction pre-fetching mechanism that supports general multi-way conditional jumps. An optimizing compiler based on a global code transformation technique, Percolation Scheduling (PS), delivers high performance without scheduling hardware.

Review history: Available online 1 April 2002.

Official paper URL: https://doi.org/10.1016/0096-3003(86)90128-1