HW/SW Co-optimization for Stencil Computation: Beginning with a Customizable Core

Source: Tsinghua Science and Technology | Citations: 0 | Uploaded by: greenman

Authors: Yanhua Li, Youhui Zhang, Weiming Zheng

Affiliation: Department of Computer Science, Tsinghua University

Journal: Tsinghua Science and Technology

Publication date: 2016, Issue 05

Keywords: cache, processor, SIMD, overcome, hardware, processors, instruction, impressive, exploit

Abstract: Energy efficiency is one of the most important issues in High Performance Computing (HPC) today. A heterogeneous HPC platform with energy-efficient customizable cores (serving as application-specific accelerators) is believed to be one of the promising solutions for meeting ever-increasing computing needs and overcoming power density limitations. In this paper, we focus on using customizable processor cores to optimize typical stencil computations, the kernel of many high-performance applications. We develop a series of effective software/hardware co-optimization strategies to exploit instruction-level and memory-computation parallelism, as well as to decrease energy consumption. These optimizations include loop tiling, prefetching, cache customization, Single Instruction Multiple Data (SIMD), and Direct Memory Access (DMA), as well as the necessary ISA extensions. Detailed tests of power efficiency are given to evaluate the effect of all these optimizations comprehensively. The results are impressive: the combination of these optimizations improved application performance by 341% while decreasing energy consumption by 35%; a preliminary comparison with x86, GPU, and FPGA platforms also showed that the design could achieve an order of magnitude higher performance efficiency. We believe this work can help in understanding the sources of inefficiency in general-purpose chips and can serve as a starting point for customizing an energy-efficient CMP for further improvement.
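To make the loop tiling named in the abstract concrete, the following is a minimal sketch, not the authors' kernel. The 5-point averaging stencil, the function names `stencil_naive` and `stencil_tiled`, and the default tile size of 4 are all assumptions chosen for illustration; the abstract does not specify the stencil shape or tile dimensions used in the paper.

```python
def stencil_naive(grid):
    """One sweep of a 5-point averaging stencil (illustrative choice).

    Boundary cells are copied unchanged; interior cells become the mean
    of the cell and its four neighbours.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return out


def stencil_tiled(grid, tile=4):
    """Same sweep, with the interior traversed tile by tile.

    The two outer loops pick a tile origin; the two inner loops stay
    inside that tile, so each block of the grid is finished before the
    traversal moves on. The tile size is a hypothetical parameter.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):
        for jj in range(1, m - 1, tile):
            for i in range(ii, min(ii + tile, n - 1)):
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                                 + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return out
```

Both functions read only from the input grid and write to a separate output, so the tiled traversal visits the same cells in a different order and produces the same result. On real hardware the tile size would presumably be tuned to the cache or local memory capacity, which is where the cache customization and DMA mentioned in the abstract would come in.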
