POPPUR爱换

Original poster: Vendicare

Why does Tegra 4's GeForce ULP use custom cores?

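Posted 2013-1-8 20:16
Vendicare posted 2013-1-8 19:56
That's true, but hardly any games actually use the GS.

That work used to be done by the VS, so I don't see much point in splitting out a separate GS.

So as long as the pipeline gains the corresponding dedicated functional units, a non-unified (separate-shader) architecture could also be extended to cover DX11?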

Posted 2013-1-8 21:53
The SPs in a non-unified design like the GF7 are just rubbish.

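OP | Posted 2013-1-8 21:54
Xenomorph posted 2013-1-8 20:16
So as long as the pipeline gains the corresponding dedicated functional units, a non-unified (separate-shader) architecture could also be extended to cover DX11?

In theory, yes. David Kirk, NVIDIA's former chief scientist, has said as much.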

Posted 2013-1-8 22:00
Vendicare posted 2013-1-8 21:54
In theory, yes. David Kirk, NVIDIA's former chief scientist, has said as much.

Understood, thanks. So would Wayne's front end be 24VS-48PS-12TMU?

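OP | Posted 2013-1-8 22:01
This post was last edited by Vendicare on 2013-1-8 22:01
GTX999 posted 2013-1-8 21:53
The SPs in a non-unified design like the GF7 are just rubbish.

Where would a non-unified architecture get a "shader processor" from?

Also, what exactly is rubbish about the GF7 series? I'd like to hear the details.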

OP | Posted 2013-1-8 22:01
Xenomorph posted 2013-1-8 22:00
Understood, thanks. So would Wayne's front end be 24VS-48PS-12TMU?

I have no data on it, so I don't dare make a call off the top of my head!

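Posted 2013-1-8 22:05
Vendicare posted 2013-1-8 22:01
I have no data on it, so I don't dare make a call off the top of my head!

Tegra 2's front end is 4VS-4PS-1TMU; Tegra 3's is 4VS-8PS-2TMU. A certain gentle, kind, pretty, and adorable girl earlier in the thread said it is "scaled up 6x from Tegra 3", so doesn't that make it 24VS-48PS-12TMU...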
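
The scaling arithmetic in this post can be checked with a short Python sketch. The unit counts and the 6x factor are the poster's claims from this thread, not confirmed specifications; the `FrontEnd` class and its field names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontEnd:
    """Hypothetical container for the unit counts quoted in the thread."""
    vs: int   # vertex shader units
    ps: int   # pixel shader units
    tmu: int  # texture mapping units

    def scaled(self, factor: int) -> "FrontEnd":
        # Scale every unit type by the same integer factor.
        return FrontEnd(self.vs * factor, self.ps * factor, self.tmu * factor)

    def __str__(self) -> str:
        return f"{self.vs}VS-{self.ps}PS-{self.tmu}TMU"


# Figures as claimed in the thread (unverified).
TEGRA2 = FrontEnd(vs=4, ps=4, tmu=1)
TEGRA3 = FrontEnd(vs=4, ps=8, tmu=2)

# "Scaled up 6x from Tegra 3", as the thread puts it.
guess = TEGRA3.scaled(6)
print(guess)  # 24VS-48PS-12TMU
```

Note that a uniform 6x scaling keeps Tegra 3's 1:2 VS:PS ratio, which is one reason the guess only holds if the design really was scaled evenly.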

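OP | Posted 2013-1-8 22:11
This post was last edited by Vendicare on 2013-1-8 22:14
Xenomorph posted 2013-1-8 22:05
Tegra 2's front end is 4VS-4PS-1TMU; Tegra 3's is 4VS-8PS-2TMU. A certain gentle, kind, pretty, and adorable girl earlier in the thread said it is " ...

So far I have seen neither a white paper nor actual hardware for this.

24VS-48PS-12TMU is very, very likely, but I'm not certain. For one thing, 12 TMUs looks suspicious. For another, if T4 is meant to support DX10 or even DX11, it is hard to say whether GS and CS would have to be added (I think GS could be handled by extending the VS, while CS would be easier to implement through the PS, since that makes output simpler).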

Posted 2013-1-8 22:17
Vendicare posted 2013-1-8 22:11
So far I have seen neither a white paper nor actual hardware for this.

24VS-48PS-12TMU is very, very likely, but I'm not ...

What kind of shader is CS?

OP | Posted 2013-1-8 22:23
Xenomorph posted 2013-1-8 22:17
What kind of shader is CS?

CS = Compute Shader != Counter-Strike

The compute shader is a new DX11 feature and, like the GS, brings nothing really new.

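Posted 2013-1-8 22:30
Vendicare posted 2013-1-8 22:11
So far I have seen neither a white paper nor actual hardware for this.

24VS-48PS-12TMU is very, very likely, but I'm not ...

To drive high resolutions, going with 12 TMUs would be understandable, wouldn't it?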

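Posted 2013-1-8 22:32
Vendicare posted 2013-1-8 22:23
CS = Compute Shader != Counter-Strike

The compute shader is a new DX11 feature and, like the GS, brings nothing really new.

So the Compute Shader is more like a general-purpose unit... it isn't tied to any particular stage of the graphics rendering pipeline...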

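OP | Posted 2013-1-8 22:42
This post was last edited by Vendicare on 2013-1-8 22:42
Xenomorph posted 2013-1-8 22:32
So the Compute Shader is more like a general-purpose unit... it isn't tied to any particular stage of the graphics rendering pipeline...

In fact, pixel shaders have had full-fledged arithmetic capability since the FX5800 era; early on, people were already using the FX5800 to compute protein folding (see GPU Gems, volume 1).

What a compute shader needs is arithmetic units; the rest isn't very important.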
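
To make the "pixel shader as a math engine" idea concrete, here is a minimal Python emulation of how pre-compute-shader GPGPU was commonly structured: inputs are packed into 2D textures, a full-screen quad is drawn, and the pixel shader runs once per output pixel. This is only a CPU-side sketch of the programming model, not real Cg or Direct3D code; `pack`, `render_quad`, `add_shader`, and `unpack` are hypothetical helper names.

```python
from typing import Callable, List

Texture = List[List[float]]  # a 2D grid of floats standing in for a texture


def pack(values: List[float], width: int) -> Texture:
    """Pack a 1D array into rows of `width` texels, zero-padding the last row."""
    padded = values + [0.0] * (-len(values) % width)
    return [padded[i:i + width] for i in range(0, len(padded), width)]


def render_quad(shader: Callable[..., float], width: int, height: int,
                *inputs: Texture) -> Texture:
    """Emulate drawing a full-screen quad: run the shader once per output pixel."""
    return [[shader(x, y, *inputs) for x in range(width)] for y in range(height)]


def add_shader(x: int, y: int, a: Texture, b: Texture) -> float:
    """A 'pixel shader' that may only read inputs and write its own pixel."""
    return a[y][x] + b[y][x]


def unpack(tex: Texture, n: int) -> List[float]:
    """Flatten the output texture and drop the padding texels."""
    return [texel for row in tex for texel in row][:n]


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
width = 4  # arbitrary texture width chosen for the example

ta, tb = pack(a, width), pack(b, width)
out = render_quad(add_shader, width, len(ta), ta, tb)
result = unpack(out, len(a))
print(result)  # [11.0, 22.0, 33.0, 44.0, 55.0]
```

The arithmetic itself is trivial; the packing, padding, and per-pixel output restriction are the parts a compute shader removes.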

 楼主| 发表于 2013-1-8 22:45 | 显示全部楼层
coollab 发表于 2013-1-8 22:30
为了上高分辨率,搞12个TMU也可以理解吧?

额,12个TMU自然是有可能啦。不过这样就要赶上GT630了,T4到底要闹哪样。

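Posted 2013-1-8 22:47
Vendicare posted 2013-1-8 22:45
Well, 12 TMUs is certainly possible. But that would put it on par with the GT630, so what on earth is T4 trying to do?

8 seems more reasonable, doesn't it?
Just my guess...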

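Posted 2013-1-8 22:49
Vendicare posted 2013-1-8 22:42
In fact, pixel shaders have had full-fledged arithmetic capability since the FX5800 era; early on, people were already using the FX5800 to compute protein folding (see GP ...

Speaking of which, could NVIDIA get nostalgic and build a GPU out of close to 2,000 dedicated-function ALUs in total that beats GK104 in both performance per watt and absolute performance?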

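OP | Posted 2013-1-8 22:56
Xenomorph posted 2013-1-8 22:49
Speaking of which, could NVIDIA get nostalgic and build a GPU out of close to 2,000 dedicated-function ALUs in total that beats GK104 in both performance per watt and absolute ...

That would be hard, mainly because it runs against NVIDIA's GPGPU direction. When the hardware gets simpler, life gets harder for software developers.

Doing general-purpose computing in Cg is a painful way to program; excruciating, you could say.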

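Posted 2013-1-8 22:59
Vendicare posted 2013-1-8 22:56
That would be hard, mainly because it runs against NVIDIA's GPGPU direction. When the hardware gets simpler, life gets harder for software developers.

Doing general-purpose computing in Cg ...

Split the engineers: one track for GPGPU, another for pushing graphics performance per watt to the limit...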

OP | Posted 2013-1-8 23:14
This post was last edited by Vendicare on 2013-1-8 23:24
Xenomorph posted 2013-1-8 22:59
Split the engineers: one track for GPGPU, another for pushing graphics performance per watt to the limit...

D. Kirk: Our DirectX 10 GPU may be Unified-Shader, or not. Everyone thinks I said "we won't go there (Unified-Shader)." But what I said is just you can't know it until (our GPU) debuts.

D. Kirk: When is the right time for Unified-Shader hardware? That's the problem. I agree that in the future GPUs will be simpler, with fewer kinds of processors. Different hardware pieces such as the Vertex Shader, Pixel Shader, ROP, frontend processor and Tessellator will one day turn into a single piece that can do everything. But it takes time and can't be done all at once. The change will happen progressively.

D. Kirk: The cost (of US) is huge. For example, (an updated architecture of) G71 can support "Unified" programming model, but (even in that case) execution is not Unified. The performance/mm^2 (die size) of G71 is very high. On the other hand, The performance/mm^2 of Xbox 360 GPU (with Unified-Shader) (Xenos) is lower. Which do you prefer?

D. Kirk: It's true that Unified-Shader is flexible, but it's more flexible than you actually need. It's like a 200-inch belt. At 200 inches it fits you however overweight you are, but if you're not overweight the extra length is useless.

One of the reasons that support Unified-Shader is it enables better load balancing. You can assign Shader to pixel processing if required, and to vertex processing too. But, in the end, in most cases pixel processing is required. For example you may render 100 million pixels but not 100 million polygons. Of course, even if the setup unit can draw 100 million polygons.

D. Kirk: In the logical diagram of D3D 10, Vertex Shader, Geometry Shader and Pixel Shader are placed side by side. What happens if they are placed in the same box? Each Shader is a different part. If they get unified they become wasteful.

Besides, it requires more I/O (wires) because all connections with memory concentrate on the box. Registers and constants are put in a single box too. It's because you have to keep all vertex states, pixel states and geometry states together while doing load balancing. A bigger register array requires more ports.

D. Kirk: Let's take a look at the computation trend. A simple CPU of 20 years ago had only 1 function unit. In other words, it was Unified-Shader. (laugh) But now even Intel doesn't design such a CPU.

Complicated operations always give us the possibility of running many operations in parallel. So we've evolved the GPU by keeping different pieces busy at the same time in a pipeline approach. If you distribute (a pipeline) over 20 operations, each piece can do its operation in parallel with the others. But if everything is Unified you have to do the 20 operations on 20 processors (Shaders).

I'm not saying Unified-Shader is not a good idea. But to enable (a single Shader) to do everything is a lot more difficult than expected. So I think it will go progressively.

D. Kirk: Even though they say it's a unified pipeline I think it's a hybrid and not completely unified. It's possible that it's an incomplete Unified-Shader with some parts unified but other parts shared.

It's not that I have proof of that. But it would be the right decision for them. They are clever, so I don't think they would build waste into their Unified-Shader.

D. Kirk: We want to remove special-purpose units from GPU. On the other hand, we also want to run (special graphics functions) really fast. If you remove all special-purpose implementations from GPU it's just a Pentium.

--------- In short, the great man himself says: that's what I think too...
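
Kirk's trade-off between load balancing and area cost, quoted above, can be illustrated with a toy Python model. It assumes every work item costs one unit of time on any shader unit, that stages overlap perfectly, and that a unified unit is some fixed percentage larger than a dedicated one. `overhead_percent` is a made-up illustrative parameter, not a figure from Kirk, and the 4VS/8PS split is the Tegra 3 claim from earlier in this thread.

```python
def fixed_time(vertex_work: float, pixel_work: float, n_vs: int, n_ps: int) -> float:
    """Frame time for dedicated units: the slower stage is the bottleneck."""
    return max(vertex_work / n_vs, pixel_work / n_ps)


def unified_time(vertex_work: float, pixel_work: float, n_units: int) -> float:
    """Frame time for a unified pool: any unit can take any work item."""
    return (vertex_work + pixel_work) / n_units


def unified_units_for_same_area(n_vs: int, n_ps: int, overhead_percent: int) -> int:
    """Unified units that fit in the area of n_vs + n_ps dedicated units,
    assuming each unified unit is overhead_percent larger (hypothetical)."""
    return (n_vs + n_ps) * 100 // (100 + overhead_percent)


N_VS, N_PS = 4, 8  # the Tegra 3 split claimed earlier in this thread
N_UNIFIED = unified_units_for_same_area(N_VS, N_PS, overhead_percent=20)

# Workload that matches the fixed 1:2 split: dedicated units stay fully busy.
matched = (fixed_time(4, 8, N_VS, N_PS), unified_time(4, 8, N_UNIFIED))

# Pixel-heavy workload: vertex units sit idle in the fixed design.
skewed = (fixed_time(1, 11, N_VS, N_PS), unified_time(1, 11, N_UNIFIED))

print("matched workload (fixed, unified):", matched)
print("skewed workload  (fixed, unified):", skewed)
```

Under these assumptions the dedicated design wins when the workload matches its unit ratio, and the unified pool wins once the workload drifts away from that ratio, which is roughly the tension Kirk describes.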

Posted 2013-1-8 23:45
Vendicare posted 2013-1-8 23:14
D. Kirk: Our DirectX 10 GPU may be Unified-Shader, or not. Everyone thinks I said "we won't go the ...

Mm-hm, keep it up...
