POPPUR爱换

With Istanbul's help, the quad-socket Opteron finally beats the dual-socket Nehalem EP

1#
Posted 2009-2-26 16:15
How AMD's Istanbul might close the gap with Nehalem EP





Date: February 25th, 2009
Author: Johan De Gelas
The Istanbul cores are the same as those found in AMD's latest Shanghai CPU, but the "uncore" part of Istanbul is more interesting. By now, you have probably heard about AMD's "HT Assist" technology, a probe (or snoop) filter. On the current Shanghai platform, every time a new cacheline is brought into the L3 cache of, say, CPU 1, a broadcast message is sent to the L3 caches of all the other CPUs, and CPU 1 has to wait until those CPUs answer.

In the case of Istanbul, the CPU simply checks the snoop filter in its own L3 cache, and if none of the other CPUs holds that cacheline, it can go ahead without waiting. This lowers the latency of bringing in a new cacheline and raises the effective bandwidth.
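
The difference between broadcasting and consulting a filter can be sketched with a toy model. This is only an illustration of the idea described above, not AMD's implementation: the `ProbeFilter` class, its dictionary layout, and the function names are all hypothetical.

```python
# Toy model only: real HT Assist hardware is far more involved than this.
from dataclasses import dataclass, field


@dataclass
class ProbeFilter:
    """Hypothetical directory: cacheline address -> set of CPU ids caching it."""
    holders: dict = field(default_factory=dict)

    def record_fill(self, line, cpu):
        # Remember that `cpu` now caches `line`.
        self.holders.setdefault(line, set()).add(cpu)

    def cpus_to_probe(self, line, requester):
        # Only CPUs recorded as holding the line need to be asked.
        return self.holders.get(line, set()) - {requester}


def probes_broadcast(num_cpus):
    """Without a filter, every other CPU is probed on each cacheline fill."""
    return num_cpus - 1


def probes_filtered(pf, line, requester):
    """With a filter, probe only the recorded holders, then record the fill."""
    targets = pf.cpus_to_probe(line, requester)
    pf.record_fill(line, requester)
    return len(targets)
```

In a four-socket system the broadcast case always costs three probes per fill, while the filtered case costs none when no other CPU has the line, which is the common case when each thread works on private data.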



To better understand this, we combined our own Stream benchmarking with the results that AMD presented. All AMD systems use DDR2-800.
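
For context, the theoretical ceiling of such a memory configuration is simple arithmetic. The sketch below assumes two 64-bit (8-byte) memory channels per socket, which the article does not state; the function name and parameters are made up for illustration, and measured Stream numbers will always sit below this figure.

```python
def peak_bandwidth_gbs(mega_transfers_per_s, bytes_per_transfer, channels, sockets):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_s = mega_transfers_per_s * 10**6 * bytes_per_transfer * channels * sockets
    return bytes_per_s / 10**9


# DDR2-800: 800 million transfers per second, 8 bytes per transfer.
# Assumption (not from the article): 2 channels per socket.
per_socket = peak_bandwidth_gbs(800, 8, channels=2, sockets=1)
quad_socket = peak_bandwidth_gbs(800, 8, channels=2, sockets=4)
```

Under these assumptions, one socket peaks at 12.8 GB/s and a four-socket system at 51.2 GB/s.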





As each Stream thread works on its own data, there is no reason to send out coherency synchronization requests. These requests slow down the fetching of new cachelines into the L3 and hence lower the effective memory bandwidth. What is interesting is that the filter will benefit not only applications that use the HT interconnects heavily for coherency traffic, but also applications like Stream that do not need the HT interconnects. Also notice that HT 3.0 does not improve memory bandwidth, as Stream tries to keep each thread's data local. Our testing used SUSE SLES 10 SP2, and AMD used Windows Server 2008. Both operating systems are well optimized and NUMA-aware.
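
To make concrete what a Stream thread does, here is a minimal sketch of a triad-style kernel and the way its bandwidth is counted. It is not the official benchmark code, the helper names are hypothetical, and pure Python is dominated by interpreter overhead, so the measured figure says nothing about real hardware.

```python
import time


def triad(a, b, c, scalar):
    """Stream-style triad: a[i] = b[i] + scalar * c[i]."""
    for i in range(len(a)):
        a[i] = b[i] + scalar * c[i]


def triad_bytes(n, bytes_per_element=8):
    """Bytes counted per triad pass: two arrays read, one array written."""
    return 3 * n * bytes_per_element


def measure_triad_gbs(n=200_000, scalar=3.0):
    """Time one triad pass over private arrays and return the rate in GB/s."""
    a, b, c = [0.0] * n, [1.0] * n, [2.0] * n
    start = time.perf_counter()
    triad(a, b, c, scalar)
    elapsed = time.perf_counter() - start
    return triad_bytes(n) / elapsed / 1e9
```

Each thread in a real run would own its three arrays, so no other CPU ever caches those lines, which is why coherency probes for them are pure overhead.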


This means that HPC applications in particular, with many threads all working on their own data, will benefit from the higher effective bandwidth. Besides HT Assist, AMD has now confirmed to us that the memory controller has been tuned quite a bit. This extra bandwidth will allow the quad Istanbul to stay out of reach of the dual Nehalem EP Xeons in many HPC applications.


HT Assist might also improve the SAP and OLTP scores quite a bit, but for a different reason. SAP and OLTP applications perform a lot of cache coherency synchronization requests, so the snoop filter will substantially lower the average latency of such requests, because in some cases:
  • the CPU will only wait on one other CPU (instead of waiting for all responses to come back)
  • the CPU won't have to wait at all, as the other CPUs don't have this line.
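
The effect of these two cases on average latency can be sketched as a weighted sum. The function below is only an illustrative model: its name, the split into three cases, and any latency values fed into it are assumptions, not measurements from AMD or from our testing.

```python
def average_probe_latency(p_none, p_one, lat_local, lat_one, lat_all):
    """Expected coherency latency with a snoop filter (arbitrary time units).

    p_none: fraction of requests where no other CPU holds the line
    p_one:  fraction of requests where exactly one other CPU must answer
    The remaining requests still wait for every CPU to respond (lat_all).
    """
    p_all = 1.0 - p_none - p_one
    if min(p_none, p_one, p_all) < 0:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    return p_none * lat_local + p_one * lat_one + p_all * lat_all


# Made-up numbers: without a filter every request pays the full broadcast cost.
without_filter = average_probe_latency(0.0, 0.0, 10, 60, 100)
with_filter = average_probe_latency(0.5, 0.25, 10, 60, 100)
```

With these made-up inputs the average drops from 100 units to 45 units, purely because fewer requests have to wait for all responses.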

Secondly, this will also lower memory latency, which is a bonus for almost every multi-threaded application.


Lower memory latency, higher bandwidth, lower "cache coherency" latency and more interconnect bandwidth: the improved "uncore" of Istanbul will be vital to closing the gap with Nehalem. Much will depend on how quickly Intel introduces its own hexacore 32 nm Xeons, but that probably won't happen before 2010. Istanbul is shaping up to be a really good alternative to Intel's quad-core Nehalem. We might see a good fight after all...

Don't forget to check it.anandtech.com (IT portal) often, as many of our blog posts (for example the VMworld 2009 coverage) are not published on the front page of AnandTech.com.
2#
Posted 2009-2-26 17:07
4-socket, 26 cores versus 2-socket, 8 cores???

3#
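Original poster | Posted 2009-2-26 17:11
4-socket, 26 cores versus 2-socket, 8 cores???
Sirlion, posted 2009-2-26 17:07


It is exactly this Stream benchmark that AMD talked up with such relish at its launch event...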
http://news.mydrivers.com/1/128/128324.htm

4#
Posted 2009-2-26 17:31

AMD, so you too have come to this, hahahaha.

5#
Posted 2009-2-26 18:45
The correct way to put it: a 4-socket, 24-core, 24-thread Opteron soundly beats a 2-socket, 8-core, 16-thread Nehalem EP

6#
Posted 2009-2-26 19:26
The result of 2 vs 2 is obvious

Looking forward to the 8-core Nehalem

7#
Posted 2009-2-26 19:35
Doesn't AMD love playing the "AMD xxxxx, unbeaten and looking for a worthy opponent" game? Well, bring it on

8#
Posted 2009-2-26 19:36
DDR2 versus DDR3 yet again. A meaningless test

9#
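Posted 2009-2-26 19:52
AMD's architecture this generation just isn't good enough...
Piling on cores at such a high cost and then selling them cheap, isn't that suicide?!
Time to tighten the belt and switch to something new...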

10#
Posted 2009-2-26 20:24
11# Sakray

You say it's meaningless, but AMD thinks it's very meaningful

11#
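Posted 2009-2-26 20:52
The result of 2 vs 2 is obvious

Looking forward to the 8-core Nehalem
Asuka, posted 2009-2-26 19:26


Nehalem-EX has quad-channel FBD2, so a 2S platform should have no trouble beating Istanbul in the Stream test. A 4S platform could make a run at the 100 GB/s mark.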

12#
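Posted 2009-2-26 21:14
11# Sakray

You say it's meaningless, but AMD thinks it's very meaningful
YsMilan, posted 2009-2-26 20:24


What does it matter to me whether AMD thinks it's meaningful? AMD can just keep throwing up smokescreens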

13#
Posted 2009-2-26 22:31
And they still have the nerve to put this up as a comparison??

14#
Posted 2009-2-26 23:01
The comments under the article are all accusing AnandTech of being AMD fans, hahahaha

15#
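Posted 2009-11-1 18:04
This is just like HP PA-RISC vs IBM PowerPC: it takes two cores to match one of theirs
Ask about it and the answer is: well, each core also costs half as much as theirs   -____-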

16#
Posted 2009-11-1 22:53
Looks pretty interesting.