The global data store (GDS) in the Stream architecture cannot be used yet

1#
Posted on 2008-12-11 16:17
http://forums.amd.com/devforum/m ... 0&enterthread=y

GDS is not currently supported in CAL, even though the hardware has the feature. The main reason is that there is no global locking mechanism to synchronize on, and therefore no good way to use it. A global GPR is a register shared by the threads that have the same thread index within their respective wavefronts. These are the shared registers in IL.
For example, suppose you have two wavefronts whose thread IDs are 0-63 and 64-127. If you declare one shared register in your IL kernel, threads n and n + 64 can both read and write sr0. Shared registers guarantee atomic access only within a single instruction, so you can do a read-modify-write (RMW) operation on the register and another wavefront will see the updated value. This is useful for simple reductions in a compute shader: you can reduce a data set of any size in 3 passes instead of the log(n) passes currently required.
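To make the n / n + 64 pairing concrete, here is a minimal CPU-side sketch in Python. It is not IL or CAL code, only a simulation of the mapping described above. The names `shared_register_slot` and `accumulate_into_sr0` are hypothetical, and the sketch assumes that the slot a thread sees is simply its thread ID modulo the wavefront size of 64.

```python
import operator

WAVEFRONT_SIZE = 64  # threads per wavefront, as in the example above


def shared_register_slot(thread_id, wavefront_size=WAVEFRONT_SIZE):
    """Return the slot of sr0 that a thread sees (assumed: its index within the wavefront)."""
    return thread_id % wavefront_size


def accumulate_into_sr0(values, op, identity, wavefront_size=WAVEFRONT_SIZE):
    """Simulate one thread per value doing a read-modify-write on its slot of sr0."""
    sr0 = [identity] * wavefront_size
    for thread_id, value in enumerate(values):
        slot = shared_register_slot(thread_id, wavefront_size)
        # The read-modify-write happens in a single step here, standing in for
        # the single-instruction atomicity mentioned in the post.
        sr0[slot] = op(sr0[slot], value)
    return sr0


# Threads 5 and 69 (n and n + 64) land on the same slot.
sr0 = accumulate_into_sr0(list(range(128)), operator.add, 0)
```

With 128 threads whose values equal their thread IDs, each slot n ends up holding n + (n + 64), which is the partial sum the two wavefronts leave behind for a later pass.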
It would go something like this:
First pass:
Run one thread per data point and have each thread update a globally shared register (min, max, sum, etc.).
Second pass:
Run one wavefront per SIMD, use the LDS to share data between threads, gather the results of all threads into a single thread, and write that result out to a global buffer.
Third pass:
Run one wavefront and have it reduce the data in the global buffer to a single value.
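
The three passes can be sketched as a CPU-side simulation in Python. This is only an illustration of the data flow, not CAL or IL code. Several things here are assumptions that the post does not state: each SIMD is given its own bank of shared registers with one slot per thread index, wavefronts are placed on SIMDs round-robin, and `NUM_SIMDS = 4` is an arbitrary value. The function name `three_pass_reduce` is hypothetical.

```python
import operator
from functools import reduce

WAVEFRONT_SIZE = 64  # threads per wavefront, as in the post
NUM_SIMDS = 4        # hypothetical SIMD count, chosen only for illustration


def three_pass_reduce(data, op, identity,
                      num_simds=NUM_SIMDS, wavefront_size=WAVEFRONT_SIZE):
    # Pass 1: one thread per data point; each thread folds its value into
    # the shared-register slot for its thread index on its SIMD.
    shared = [[identity] * wavefront_size for _ in range(num_simds)]
    for thread_id, value in enumerate(data):
        simd = (thread_id // wavefront_size) % num_simds  # assumed round-robin placement
        lane = thread_id % wavefront_size
        shared[simd][lane] = op(shared[simd][lane], value)

    # Pass 2: one wavefront per SIMD; the 64 slots are combined (the LDS step
    # on real hardware) and one result per SIMD goes to the global buffer.
    global_buffer = [reduce(op, lanes, identity) for lanes in shared]

    # Pass 3: a single wavefront reduces the global buffer to one value.
    return reduce(op, global_buffer, identity)


total = three_pass_reduce(list(range(1000)), operator.add, 0)
largest = three_pass_reduce([3, 17, -4, 9], max, float("-inf"))
```

The pass count stays at three regardless of the input size because the first pass does all the size-dependent work through read-modify-write updates; the later passes only see a fixed number of partial results.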

This is only guaranteed to work if you use calCtxRunProgramGridArray and set the array to contain the three passes.
2#
Posted on 2008-12-11 16:50
What is the global data store used for?

3#
Original poster | Posted on 2008-12-11 16:54

btbaby (this user has been deleted)
4#
Posted on 2010-1-16 11:49
Notice: the author has been banned or deleted, so the content is hidden automatically.

5#
Posted on 2010-1-17 21:00
I can't understand any of this technical stuff.