POPPUR爱换

Title: Multi-core PhysX FluidMark 1.2: a 4-core CPU beats a single GTX 275

Author: Edison Time: 2010-3-18 22:25
New PhysX FluidMark 1.2: First Tests

As we mentioned previously, the upcoming FluidMark 1.2, the next version of the popular GPU PhysX testing and benchmarking application, will include support for multi-core CPU PhysX calculations, along with overall multi-threading optimizations.

Jerome Guinot, the FluidMark developer, was kind enough to provide us with the latest beta of the new FluidMark 1.2, and we will finally try to answer which is faster: GPU PhysX or properly optimized CPU PhysX.

But first, let's take a closer look at the new FluidMark.

The main control panel now includes several additional options, such as “Force PhysX CPU”, which switches between GPU and CPU PhysX without having to use the Nvidia Control Panel.

The “Multi-core PhysX” checkbox enables all multi-threading optimizations, the vital and most interesting part of the new FluidMark.

“# of CPU cores” specifies the number of CPU cores dedicated to the simulation (up to 32 in the current version). However, this option is not as transparent as it looks: increasing the number of cores adds fluid emitters to the scene (in general, one emitter per core or two), and with an equal number of particles, a different number of emitters can affect performance.

The application window has also changed. The benchmark is still based on the SPH fluid simulation built into the PhysX SDK (the latest version, 2.8.3.21, is used), but the scene includes additional static objects, the particles look different and, as mentioned earlier, several emitters can be used simultaneously. A nice addition is the GPU temperature overlay, useful for GPU stress testing.

The final Global score in benchmarking mode is now calculated differently and can't be compared with scores from previous versions of FluidMark. It consists of two components: the GraphX score (graphics frames per second) and the PhysX score (physics simulations per second).

Thus, Global score = (GraphX_score * 0.3 + PhysX_score * 1.7) / 2
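
As a quick illustration of the formula above, here is a minimal Python sketch. The function name `global_score` and the sample inputs are hypothetical, chosen only to show the weighting; they are not measured results.

```python
def global_score(graphx_score, physx_score):
    """Weighted combination quoted in the article:
    (GraphX_score * 0.3 + PhysX_score * 1.7) / 2
    """
    return (graphx_score * 0.3 + physx_score * 1.7) / 2


if __name__ == "__main__":
    # Hypothetical inputs, not measured results.
    print(global_score(100.0, 100.0))
    print(global_score(200.0, 50.0))
```

The weights 0.3 and 1.7 sum to 2, so equal GraphX and PhysX scores give that same value back, and the PhysX score accounts for 85% of the result.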

Now, let's do some testing.

The Wonder of Multi-Threading!

Take a look at the following graph:

[Three emitters were used (# CPU cores = 4) with a fixed number of particles: 15,000. Time range: 60 sec. 800x600 rendering window. System: C2Q 9400 @ 2.66 GHz CPU, Nvidia GTX 275 + GTX 260 (192 sp) GPUs, 4GB RAM, Win XP, PhysX System Software 9.09.1112]

When the “Multi-core PhysX” option is off, PhysX simulation and scene rendering run in the same thread and, more importantly, the PhysX SDK multi-threading flags are not set.

But when “Multi-core PhysX” is enabled, all PhysX simulation runs in separate threads. Since rendering still has its own thread, scene rendering gets a boost because PhysX no longer runs in the scene thread. The same holds for PhysX: one or several threads are completely dedicated to physics simulation.
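
The thread layout described above can be sketched roughly as follows. This is a structural illustration only, not FluidMark's code and not the PhysX API: `step_emitter`, `render` and `simulate` are hypothetical names, and the "physics" is a toy update rather than SPH. Note also that in CPython the global interpreter lock keeps pure-Python work like this from running on several cores at once, so the sketch shows the structure, not the speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def step_emitter(heights, dt):
    """Toy stand-in for one emitter's physics step (not real SPH)."""
    return [h - 9.81 * dt for h in heights]


def render(frame):
    """Toy stand-in for drawing one frame."""
    return f"frame {frame}"


def simulate(emitters, frames, dt, workers):
    """Step every emitter on worker threads while this thread renders."""
    rendered = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for frame in range(frames):
            # Hand each emitter's physics step to the worker pool...
            futures = [pool.submit(step_emitter, e, dt) for e in emitters]
            # ...so the calling thread is free to render in the meantime.
            rendered.append(render(frame))
            # Collect the physics results before starting the next frame.
            emitters = [f.result() for f in futures]
    return emitters, rendered


if __name__ == "__main__":
    state, drawn = simulate([[1.0, 2.0], [3.0]], frames=2, dt=0.5, workers=2)
    print(state)
    print(drawn)
```

With the option off, the equivalent structure would call `step_emitter` and `render` one after the other in the same loop, so each frame would wait for the physics step to finish.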

While the SPH fluid simulation runs on the CPU with “Multi-core PhysX” off, the load is distributed across several cores (probably due to Windows' internal thread management), but in total it amounts to 26%, roughly one full core out of four.

With the multi-threaded optimizations enabled, the application fully utilizes all four cores at 100%, which results in a great speed boost.

In addition, one interesting detail was discovered: fluid simulation runs faster on the GPU when one emitter is used, while the CPU, conversely, prefers multiple emitters (with an equal number of particles). That is probably a peculiarity of the PhysX SDK itself.

For example, with one emitter and multi-core PhysX switched off, CPU simulation scores 36 global points (64 with 3 emitters, as on the graph above), while the GTX 275 GPU scores 247 points (128 with 3 emitters). But since one emitter can't utilize more than two cores, the number of emitters was increased to even out the comparison.
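
To make the emitter effect concrete, the four scores quoted above can be turned into ratios. The Python sketch below uses only those four numbers; the names `SCORES`, `emitter_scaling` and `gpu_over_cpu` are made up for this illustration.

```python
# Global scores quoted in the paragraph above, "Multi-core PhysX" off.
SCORES = {
    "CPU": {1: 36, 3: 64},
    "GTX 275": {1: 247, 3: 128},
}


def emitter_scaling(device):
    """Score with 3 emitters divided by the score with 1 emitter."""
    return SCORES[device][3] / SCORES[device][1]


def gpu_over_cpu(emitters):
    """GTX 275 score divided by CPU score for a given emitter count."""
    return SCORES["GTX 275"][emitters] / SCORES["CPU"][emitters]


if __name__ == "__main__":
    for device in SCORES:
        print(f"{device}: x{emitter_scaling(device):.2f} from 1 to 3 emitters")
    for n in (1, 3):
        print(f"{n} emitter(s): GPU/CPU = {gpu_over_cpu(n):.2f}")
```

Going from one to three emitters multiplies the CPU score by about 1.78 and the GTX 275 score by about 0.52, so the GPU's lead shrinks from about 6.86x to 2.0x.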

Therefore, benchmarking seems to be a little tricky in the new FluidMark. We are curious whether someone will come up with a solid method after the app is released.

P.S. Thanks to Jerome for the beta FluidMark and the detailed explanations.

Author: pds21 Time: 2010-3-18 22:31
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 餐具 Time: 2010-3-18 22:35
So AMD's remark wasn't wasted after all; the effect was immediate

Author: 绿色世界 Time: 2010-3-18 23:07
Notice: the author has been banned or the content deleted; content automatically hidden

Author: ft5555 Time: 2010-3-18 23:29
Could the poster above explain why, once Multi-core PhysX is enabled,

the GPU PhysX score also multiplies?

Author: rickerlian Time: 2010-3-18 23:53
What a strange world

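Author: ft5555 Time: 2010-3-19 00:00

No idea. Equally odd is that with Multi-core PhysX off, CPU+GPU shows almost no gain over GPU alone

PS: I remember some expert once saying ...
纳尼？ posted on 2010-3-18 23:36

Even stranger, in the graph the 275+260 score is more than twice that of the single 275...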

Author: AlanLW Time: 2010-3-19 00:10
One more thing: the Q9400 is much cheaper than a GTX 275

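Author: nfsking2 Time: 2010-3-19 00:11
This post was last edited by nfsking2 on 2010-3-19 00:12

I have a question; maybe I'm just being a newbie
The accusation AMD raised is: Multi-Core CPU Support is Disabled in PhysX
PhysX acceleration, in turn, comes in CPU and GPU forms
So the crux of the issue should be whether multiple cores are used when PhysX runs physics on the CPU

Since this software is called PhysX FluidMark, it presumably goes through the PhysX API to use either the CPU or the GPU for physics acceleration.
Judging from version 1.2, the software can successfully use a multi-core CPU for physics acceleration, and can even saturate every core, so AMD's original claim that "PhysX disabled multi-core physics acceleration in order to exaggerate GPU physics acceleration" doesn't hold up very well.

So the current situation is: a 4-core CPU running physics at full load is more efficient than a mid-to-high-end graphics card, and NV has not deliberately blocked PhysX support for multi-core CPUs.

So the whole problem lies with the game developers?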

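Author: nfsking2 Time: 2010-3-19 01:18
This post was last edited by nfsking2 on 2010-3-19 02:56

Reply to nfsking2

Obviously FluidMark got around certain restrictions through some special hack; as long as the functionality exists, this kind of hack for a program ...
ArthurMa posted on 2010-3-19 00:41

That depends on whether FluidMark's author is an NV person or a neutral party
Because according to him: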

I’m currently updating Geeks3D’s PhysX FluidMark tool and from my last tests, multi-core CPU support in PhysX seems to be ok (that confirms what NVIDIA said in this news)…

And this "what NVIDIA said in this news" refers to:

Here is the reply of Nadeem Mohammad, NVIDIA’s PhysX director of product management, to AMD’s accusations:

I have been a member of the PhysX team, first with AEGIA, and then with Nvidia, and I can honestly say that since the merger with Nvidia there have been no changes to the SDK code which purposely reduces the software performance of PhysX or its use of CPU multi-cores.

Our PhysX SDK API is designed such that thread control is done explicitly by the application developer, not by the SDK functions themselves. One of the best examples is 3DMarkVantage which can use 12 threads while running in software-only PhysX. This can easily be tested by anyone with a multi-core CPU system and a PhysX-capable GeForce GPU. This level of multi-core support and programming methodology has not changed since day one. And to anticipate another ridiculous claim, it would be nonsense to say we “tuned” PhysX multi-core support for this case.

PhysX is a cross platform solution. Our SDKs and tools are available for the Wii, PS3, Xbox 360, the PC and even the iPhone through one of our partners. We continue to invest substantial resources into improving PhysX support on ALL platforms–not just for those supporting GPU acceleration.

As is par for the course, this is yet another completely unsubstantiated accusation made by an employee of one of our competitors. I am writing here to address it directly and call it for what it is, completely false. Nvidia PhysX fully supports multi-core CPUs and multithreaded applications, period. Our developer tools allow developers to design their use of PhysX in PC games to take full advantage of multi-core CPUs and to fully use the multithreaded capabilities.

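The gist is that since NV acquired PhysX from AEGIA, it has made no changes to the SDK that "restrict multi-core CPUs", and it cites the SDK being shared across platforms such as the XB360 and PS3 to show that PhysX runs well on multi-core CPUs.

So, regarding AMD's accusation, there are again several possibilities:
1. FluidMark's author, Jerome Guinot, is an NV person and has some special means of bypassing PhysX's multi-threading limit (hard to say; anything is possible)
2. When FluidMark uses the CPU for physics acceleration, it doesn't really call the PhysX API at all (most people have no way of knowing, but for some, verifying this shouldn't be hard)
3. NV removed the multi-core CPU restriction by updating the PhysX runtime (try swapping in an older PhysX runtime)
4. PhysX never restricted the CPU in the first place, and AMD's accusation was just envy (this should be about as likely as points 1 and 2 combined)

As for the performance difference between CPU and GPU acceleration, there are 3 possibilities
1. The CPU really is more efficient than the graphics card at physics acceleration
2. FluidMark is insufficiently optimized for GPU physics acceleration
3. With a single card doing GPU physics acceleration, the GPU is not 100% devoted to physics; part of its resources go to rendering.

Of these, the last two are more likely, because in FluidMark 1.1.1, when GPU physics acceleration runs on an AMD+NVIDIA hybrid setup (AMD card rendering, NVIDIA card for physics), GPU-Z shows very low GPU usage on the physics card.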

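At idle, this 8800GTX 768M serving as the physics card (flashed with an Ultra-clock BIOS, so please ignore the model GPU-Z reports) uses 132M of video memory (the hybrid setup requires extending the desktop onto the NV card, which takes up some video memory)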
[attach]1239642[/attach]

During physics acceleration at 1080P resolution, GPU usage is 34% and video memory usage is 297M (with 8xAA, GPU usage is still 34% and video memory is 311M)
[attach]1239643[/attach]

In other words, FluidMark can use at most about 180M of video memory and about 34% of the GPU.
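
The "about 180M" figure follows from subtracting the idle reading from the readings under load. A small Python sketch of that arithmetic, using only the GPU-Z numbers reported in this post (the names `IDLE_MB`, `LOAD_MB` and `extra_memory_mb` are made up for this illustration):

```python
# GPU-Z video memory readings reported in the post, in MB.
IDLE_MB = 132
LOAD_MB = {"1080P": 297, "1080P + 8xAA": 311}


def extra_memory_mb(load_mb, idle_mb=IDLE_MB):
    """Video memory attributable to FluidMark: load reading minus idle."""
    return load_mb - idle_mb


if __name__ == "__main__":
    for setting, load in LOAD_MB.items():
        print(f"{setting}: {extra_memory_mb(load)} MB above idle")
```

This gives 165 MB without anti-aliasing and 179 MB with 8xAA, so the rounded 180M corresponds to the 8xAA reading.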

Meanwhile, the primary card, a 5870, is at 96% GPU usage
[attach]1239647[/attach]

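And the 5870's performance is not far from that of the current NV flagship, the GTX295
So judging from the tests above, asking any single NV card to handle rendering and physics acceleration at the same time really is more than it can manage

Moreover, in current PhysX games, the game allocates only 128M of video memory (Batman, for example); for details see http://we.pcinlife.com/thread-1367643-1-1.html, or search for PhysX acceleration currently allocates 128MB of device memory in default
(Anyone who doubts this can look for a certain parameter in a certain ini file in the Batman install directory; search keyword: gpuHeapSize)

The big efficiency gain when a 260+ and a 275 run FluidMark rendering and acceleration as a dual-card setup also seems to say something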

Author: Kevsun Time: 2010-3-19 01:50
Insufficient software support for the hardware?

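Author: westlee Time: 2010-3-19 06:57
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 86751213 Time: 2010-3-19 07:22
Notice: the author has been banned or the content deleted; content automatically hidden

Author: fuxingchina Time: 2010-3-19 07:50
Notice: the author has been banned or the content deleted; content automatically hidden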

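Author: zblskj Time: 2010-3-19 08:04
Obviously NV sponsors game studios with engineers and money; if the studios still ran PHX on the CPU, NV's GPU PHX would lose its whole point. Judging from Edison's repost, PHX can support multi-core CPUs, and its compute efficiency is not low. It suddenly dawned on me why NV is in such a hurry to develop Fermi (much of Fermi's architecture closely resembles a multi-core CPU)... because Fermi's characteristics make it the best compute architecture for PHX physics and CUDA. Think of the 480: with 480 cores working like a many-core CPU, physics efficiency goes up. Of course, NV still has to solve the problem of the game and PHX physics working together in-game.
But I trust NV should be able to manage that.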

Author: defv4 Time: 2010-3-19 08:18
No way; Lao Huang is not someone who would bow to Intel

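Author: fuxingchina Time: 2010-3-19 08:45
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 380 Time: 2010-3-19 08:51
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 380 Time: 2010-3-19 09:00
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 380 Time: 2010-3-19 09:03
Notice: the author has been banned or the content deleted; content automatically hidden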

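Author: 2ndWeapon Time: 2010-3-19 09:07
With a single card doing GPU physics acceleration, deciding how much of the GPU goes to physics and how much to graphics must be hard to get right. This result looks like insufficient optimization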

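Author: westlee Time: 2010-3-19 09:17
Notice: the author has been banned or the content deleted; content automatically hidden

Author: westlee Time: 2010-3-19 09:18
Notice: the author has been banned or the content deleted; content automatically hidden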

Author: NV30F0 Time: 2010-3-19 09:18
Reply to 17# westlee

And the CPU has quite a bit of potential too...

Author: 66666 Time: 2010-3-19 09:38
Is this low single-card GPU efficiency related to the amount of video memory?

Author: westlee Time: 2010-3-19 09:44
Notice: the author has been banned or the content deleted; content automatically hidden

Author: Fanjet Time: 2010-3-19 09:55
Physics effects again. How many games support them, and how many of those are actually fun to play?

Author: 380 Time: 2010-3-19 10:05
Notice: the author has been banned or the content deleted; content automatically hidden

Author: garou Time: 2010-3-19 10:52
The OP's stance isn't neutral at all~

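Author: kaven Time: 2010-3-19 11:01
Lao Huang is feigning weakness, deliberately holding down NV card scores so that Intel and AMD let their guard down. Right now Intel and AMD are both hostile to PhysX;
once PhysX becomes the industry standard, Lao Huang changes the driver and... heh, just take it as a joke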

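Author: nfsking2 Time: 2010-3-19 11:13

Reply to nfsking2

This NV PhysX manager's statement can, on some level, be read as alluding to the earlier issue.
As for Multi t ...
ArthurMa posted on 2010-3-19 09:38

The cockroach theory can't explain FluidMark 1.2's support for multiple cores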

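Author: slr Time: 2010-3-19 12:13
Is the 275+260 more than 2x faster because, when a single card does rendering and physics at the same time, part of the GPU's resources are taken up?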

Author: westlee Time: 2010-3-19 12:57
Notice: the author has been banned or the content deleted; content automatically hidden

Author: 湛江一哥 Time: 2010-3-19 14:41
What is this?

Author: rtyou Time: 2010-3-19 14:45

In my view, BC2's physics effects don't even match Red Faction's.
westlee posted on 2010-3-19 12:57

Both of the games you mention use Havok as their physics engine

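Author: nfsking2 Time: 2010-3-19 16:45

Reply to nfsking2

The cockroach theory was aimed at people blaming the game developers, not at FluidMark 1.2; please read what I said carefully.
ArthurMa posted on 2010-3-19 12:01

So you mean the game developers are blameless, that they shouldn't have to optimize multi-core CPU physics acceleration at all, that good PhysX support for CPUs must be done entirely by NV, and that NV should act like a charity rather than a business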

Author: 鱼儿水中游 Time: 2010-3-19 20:22
Edison, please come out and sum this up.

Author: cenxuebin Time: 2012-8-14 11:02

Author: a9988a Time: 2012-8-14 14:39
You even dug up a thousand-year-old tomb like this one

Welcome to POPPUR爱换 (https://we.poppur.com/)