POPPUR爱换

 找回密码
 注册

QQ登录

只需一步,快速开始

手机号码,快捷登录

搜索
查看: 1394|回复: 0
打印 上一主题 下一主题

Compilation for Explicitly Managed Memory Hierarchies (On CBE)

[复制链接]
跳转到指定楼层
1#
发表于 2006-10-21 14:41 | 只看该作者 回帖奖励 |倒序浏览 |阅读模式
Compilation for Explicitly Managed Memory Hierarchies

Stanford University

Timothy J. Knight
Ji Young Park
Manman Ren
Mike Houston
Mattan Erez
Kayvon Fatahalian
Alex Aiken
William J. Dally
Pat Hanrahan

This paper presents the design and implementation of an optimizing compiler for architectures with software-managed memory hierarchies. We model programs as hierarchies of bulk operations with explicit parallelism. Bulk operations are either bulk data transfers or kernels (bulk computations) on bulk data. Our focus here is on the backend compiler phases; we begin with a relatively lowlevel intermediate representation close to the target class of machines. In this representation all parallelism is explicit, every bulk operation is assigned to a speci c level of the memory hierarchy, and all sizes of data sets are known.We then illustrate the important optimizations that should be applied to such programs to produce ef cient code for hierarchical memory machines. In this paper we do not consider the problem of mapping a source-level language to the intermediate representation (but see [16] for an example). Any performance-oriented compiler must address the issues we consider regardless of programming model; a compiler for Pro Agramming model further removed from the target class of machines simply faces a superset of the issues we address.

We make the following contributions:  

  • We present a general model for machines with explicitly managed memory hierarchies (Section 2). This model illustrates the level of abstraction at which our compiler works and provides the conceptual framework in which we develop our optimizations.
  • We present an intermediate representation (IR) expressing programs as hierarchies of bulk operations. An IR that represents operations at multiple scales enables optimizations to be applied uniformly across different levels of the machine: the same optimizations apply to coarse-grained computation near the root and ner-grained computation at the leaves.
  • We also give an operational semantics for IR programs executing on an abstract machine, which both makes clear our notions of hierarchical programs and bulk operations and provides the basis for understanding the correctness of our optimizations (Section 3).
     
  • We describe a number of optimizations that are key to acheiving high performance on our target class of machines (Section 4). We discuss some important details of compiling for the Cell architecture (Section 5) and give experimental results with a Cellbased compiler showing the effectiveness of our optimizations for utilizing execution and bandwidth resources (Section 6).


http://theory.stanford.edu/~aiken/publications/new/hierarchy.pdf
您需要登录后才可以回帖 登录 | 注册

本版积分规则

广告投放或合作|网站地图|处罚通告|

GMT+8, 2024-5-15 23:00

Powered by Discuz! X3.4

© 2001-2017 POPPUR.

快速回复 返回顶部 返回列表