DOACROSS loops are a significant part of many important scientific and engineering applications, and they are generally parallelized by exploiting pipeline/wave-front parallelism through loop transformations. However, most previous work statically assigns iterations to parallel threads, which wastes computing resources on thread synchronization. This paper proposes a new parallel strategy for DOACROSS loops that provides dynamic task assignment with reduced dependences to achieve wave-front parallelism through loop tiling. The proposed strategy uses a master-slave parallel mode and some customized structures to realize dynamic and flexible parallelization, which effectively prevents threads from waiting in communication. An efficient tile size selection (TSS) approach is also proposed to preserve data reuse in cache for tiled codes. The experimental results show that the proposed parallel strategy obtains good and stable speedups over six typical benchmarks with different problem sizes and different numbers of threads on an Intel(R) Xeon(R) 32-core server. It outperforms two static strategies, a barrier-based strategy and a post/wait-based strategy, by 32% and 20% in average performance, respectively. The strategy also yields better performance than a mutex-based dynamic strategy. In addition, the proposed TSS approach is shown to achieve near-optimal performance and to be comparable with a state-of-the-art TSS approach.
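To make the idea of dynamic, master-slave tile scheduling concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes a simple 2-D DOACROSS recurrence, `grid[i][j] = grid[i-1][j] + grid[i][j-1]`, chosen only for illustration. The function name `wavefront_tiled`, the two queues, and the per-tile dependence counters are all hypothetical. A master thread hands a tile to the workers as soon as its upper and left neighbour tiles have finished, so no worker waits at a barrier.

```python
import queue
import threading


def wavefront_tiled(n, tile, num_workers=4):
    """Fill an n x n grid with grid[i][j] = grid[i-1][j] + grid[i][j-1].

    Row 0 and column 0 are fixed to 1. The interior is cut into square
    tiles that are dispatched dynamically in wave-front order.
    """
    grid = [[1] * n for _ in range(n)]
    nt = (n - 1 + tile - 1) // tile  # tiles per dimension over interior 1..n-1
    ready = queue.Queue()  # master -> workers: tiles whose dependences are met
    done = queue.Queue()   # workers -> master: finished tiles

    def worker():
        while True:
            task = ready.get()
            if task is None:  # sentinel: no more work
                return
            ti, tj = task
            for i in range(1 + ti * tile, min(n, 1 + (ti + 1) * tile)):
                for j in range(1 + tj * tile, min(n, 1 + (tj + 1) * tile)):
                    grid[i][j] = grid[i - 1][j] + grid[i][j - 1]
            done.put(task)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # Master: each tile counts only its upper and left neighbour tiles.
    pending = {(ti, tj): (ti > 0) + (tj > 0)
               for ti in range(nt) for tj in range(nt)}
    if nt > 0:
        ready.put((0, 0))
    finished = 0
    while finished < nt * nt:
        ti, tj = done.get()
        finished += 1
        for succ in ((ti + 1, tj), (ti, tj + 1)):
            if succ in pending:
                pending[succ] -= 1
                if pending[succ] == 0:  # all predecessors done: release tile
                    ready.put(succ)

    for _ in threads:  # shut the workers down
        ready.put(None)
    for t in threads:
        t.join()
    return grid
```

Calling `wavefront_tiled(9, 3, num_workers=3)` produces the same grid as the sequential double loop. Because of CPython's global interpreter lock, this sketch illustrates only the scheduling logic and would not show the speedups reported above; a native-thread implementation would be needed for that.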