Modeling the performance of general purpose instruction level parallel architectures in image processing

Migliardi, Mauro; Maresca, Massimo

RISC instruction-level parallel systems are today the most commonly used high-performance computing platform. On such systems, Image Processing and Pattern Recognition (IPPR) tasks, if not thoroughly optimized to fit each architecture, exhibit a performance level up to one order of magnitude lower than expected. In this paper we identify the sources of this behavior and model them by defining a set of indices that measure their influence. Our model allows planning program optimizations, assessing the results of those optimizations, and evaluating the efficiency of a CPU's architectural solutions in IPPR tasks. A case study combining a specific IPPR task and a RISC workstation demonstrates these capabilities. We analyze the sources of inefficiency of the task, plan some source-level program optimizations, namely data type optimization and loop unrolling, and assess the impact of these transformations on the task's performance. The results of our study allow us to obtain an eightfold performance improvement and to conclude that, in low- to medium-level IPPR tasks, it is more difficult to exploit superscalarity efficiently than pipelining.
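
The two source-level transformations named above can be illustrated with a minimal sketch. The abstract does not say which IPPR task, data types, or unroll factor the case study used, so everything below is an assumption made for illustration: the task (summing 8-bit pixels), the function names `sum_pixels_baseline` and `sum_pixels_unrolled`, the unroll factor of 4, and the reading of "data type optimization" as keeping accumulators in a word-sized type.

```c
#include <stddef.h>
#include <stdint.h>

/* Baseline: one pixel per iteration and a single accumulator, so every
 * addition depends on the result of the previous one. */
uint32_t sum_pixels_baseline(const uint8_t *img, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += img[i];
    return sum;
}

/* Hypothetical optimized variant (not the paper's code):
 *  - loop unrolled by 4 (assumed factor), which cuts loop overhead;
 *  - four independent word-sized accumulators (assumed interpretation of
 *    "data type optimization"), which removes the dependency between
 *    consecutive additions so that several can be in flight at once. */
uint32_t sum_pixels_unrolled(const uint8_t *img, size_t n)
{
    uint32_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main body: four pixels per iteration. */
    for (; i + 4 <= n; i += 4) {
        s0 += img[i];
        s1 += img[i + 1];
        s2 += img[i + 2];
        s3 += img[i + 3];
    }

    /* Remainder: the last n % 4 pixels, if any. */
    for (; i < n; i++)
        s0 += img[i];

    return s0 + s1 + s2 + s3;
}
```

Both functions return the same value for any input whose pixel sum fits in 32 bits. Whether the unrolled form is actually faster depends on the compiler and the target CPU, which is the kind of question the indices proposed in the paper are meant to answer.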