POPPUR爱换

Looking forward to Penryn: let's wait for the NDA to lift together

1#
Posted 2007-7-10 11:17
Word is that the Penryn NDA lifts on July 15, less than a week away. Speculate to your heart's content!
  w00t)
2#
OP | Posted 2007-7-10 11:20
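Penryn's enhanced core architecture: the quad-core version of Penryn will have 820 million transistors (Kentsfield has 582 million), with a die area of only 107 mm². That makes the new chip's die 25% smaller than that of the current 65nm quad-core processors.
The new Penryn processors also add support for Intel Streaming SIMD Extensions 4 (SSE4) instructions. It has been confirmed that Penryn will deliver higher IPC and higher clock speeds. Intel would say only "more than 3 GHz", but since the FSB has reached 1600 MHz, a 3.2 GHz core clock is plausible. Several Intel people have confirmed that, if needed, the 45nm processors can clock higher still (3.6 GHz is probably a safe estimate, given how well current Core 2 processors overclock).
On power, Intel will introduce what it calls "Deep Power Down" technology, a new lower power state named C6. C6 drops the core voltage to the minimum the process technology allows, shuts off the core clock and turns off all of the caches. It is the lowest power state attainable and will be introduced on mobile Penryn processors.
Penryn family processors will be compatible with the existing socket, so on the desktop Intel will ship them as LGA-775 parts. We expect Intel's new chipsets will be required, though we do not yet know whether they will support the 1600 MHz FSB.
Penryn-based processors will also get a much better divider unit, which roughly doubles divide speed using a technique called Radix 16. The shuffle engine has been improved as well: Intel's "Super Shuffle Engine" is a 128-bit, single-pass shuffle unit that can perform a full-width shuffle in a single cycle, speeding up SSE2, SSE3 and SSE4 instructions with shuffle-like operations such as pack, unpack and wider packed shifts.
The last improvement is the new "Split Load Cache Enhancement", which reduces the penalty for data that is not aligned to cache line boundaries. This is common in some SSE-heavy imaging applications. The quad-core desktop and Xeon processors will be rated at 120W, 80W and 50W, unchanged from today, while the dual-core processors will come in at 40W/65W and 80W TDP.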
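Intel EDAT: the end of the multi-core clock speed disadvantage?
Intel also mentioned "Enhanced Dynamic Acceleration Technology" (EDAT), which is effectively built-in overclocking based on load. If you are running a single-threaded application (or a multi-threaded one that mostly uses a single thread), EDAT can power down the second core and raise the clock of the working core while staying within the same thermal envelope at all times.
EDAT could also close the clock speed gap between single-core and multi-core processors. With all cores busy, a multi-core chip runs at a lower clock; when some cores are idle, it could reach the same clock a single-core processor would. Single-core processors have all but vanished from the roadmaps, but many applications are still single-threaded by nature and will benefit, and future processors will offer this option as well.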
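Performance:
Intel has not revealed much about Penryn's performance, but Pat did leave us with a few comments. We do not know the test conditions and we did not take part in the testing, but the figures are still worth a look.
Comparing a 3.2GHz Penryn (1.6GHz FSB) with a 3.0GHz Conroe (1.33GHz FSB), Intel measured a gaming performance gain of more than 20% (with no code changes). For video encoding, if SSE4 is used, the same comparison shows a gain of more than 40%.
Finally, Intel touched on servers: the fastest quad-core Penryn (>3GHz) versus a 2.67GHz quad-core Xeon is said to be more than 45% faster in bandwidth- and FP-intensive applications. Everything here comes from Intel, so it will be a while before we can verify it. Given the Penryn improvements described above, though, there is good reason to believe Penryn will be faster than today's Conroe. Whether that is 10%, 20% or more, only time will tell.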

3#
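Posted 2007-7-10 11:21
I'm getting ready to switch platforms. I had planned to skip Penryn and wait for Nehalem, but I'm afraid I can't hold out that long... :(

I'll pick up a P35 board in a couple of days.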

4#
Posted 2007-7-10 11:23
1600 FSB at stock? Is there any room left to overclock? :mad: :mad: :mad: :mad:

5#
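Posted 2007-7-10 11:25
The step from Conroe to Penryn should look a lot like the step from Deschutes to Coppermine, except that the clock speed gain probably won't be as large as it was back then.
What I care about more, though, is when the Barcelona NDA expires.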

6#
Posted 2007-7-10 11:25
This looks like a dual-core die any way you look at it. So the quad-core is still two dies glued together? w00t)


7#
Posted 2007-7-10 11:25
1600 FSB at stock? That's a bit ahead of its time.

8#
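OP | Posted 2007-7-10 11:28
Originally posted by HeavenPR at 2007-7-10 11:23:
1600 FSB at stock? Is there any room left to overclock? :mad: :mad: :mad: :mad:

That figure is for Xeon.
No way the desktop parts get a 1600 FSB; it should be 1333.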

9#
Posted 2007-7-10 11:35
New products at last. Oh yeah!

10#
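OP | Posted 2007-7-10 11:39
Originally posted by 飞鸟真 at 2007-7-10 11:25:
This looks like a dual-core die any way you look at it. So the quad-core is still two dies glued together? w00t)

http://diy.yesky.com/imagelist/2007/093/77q3917s3wk2.jpg

Yes, it is still two dies in one package.
If you want a native quad-core, you'll have to wait for Nehalem next year.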

11#
OP | Posted 2007-7-10 13:07

We'll see then whether it's true.


12#
Posted 2007-7-10 13:14
The translation quality of the article in post #2 is really.......

13#
Posted 2007-7-10 18:50
Performance that strong? Just a few more days until the reveal!
Life is spent waiting.

14#
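OP | Posted 2007-7-10 23:11
Originally posted by boris_lee at 2007-7-10 13:14:
The translation quality of the article in post #2 is really.......

I'll repost the English original, then.
I'm afraid some people will find it hard going, though. Sigh.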

15#
OP | Posted 2007-7-10 23:13
You knew it had to be coming. A year ago Intel previewed its first Core 2 processors ahead of their release, and with Penryn due out before the end of the year the boys in blue are back again.
Penryn is still pretty early, although Intel was able to reach over 3GHz on all of the samples we tested. Not surprisingly, the number of benchmarks we were able to run was quite limited. Intel also provided us with a handful of its own test results demonstrated at IDF Beijing which we have reproduced here as well.


Penryn in action
As a recap, Penryn is the 45nm micro-architectural update to Intel's current Core 2 processors. The slide below shows most of the improvements to Penryn:
A faster divider and super shuffle engine both improve IPC in very specific applications. As we mentioned in our IDF day 1 coverage, faster FSB speeds appear to be reserved for Penryn based Xeon processors at this point as desktop Penryn cores will use a 1333MHz FSB. Penryn takes the total amount of L2 cache up to 6MB per two cores, giving the quad core Penryn chips a total of 12MB of on-die L2 cache. Penryn also has improved power management technologies, but only for mobile Penryn chips.

Penryn up and running
First off we'll start with the results we ran ourselves under Intel's supervision. Intel set up three identical systems, one based on a Core 2 Extreme X6800 (dual core, 2.93GHz/1066MHz FSB), one based on a Wolfdale processor (Penryn, dual core, 3.20GHz/1066MHz FSB) and one based on Yorkfield (Penryn, quad core, 3.33GHz/1333MHz FSB).

The modified BadAxe 2 board; can you spot the mod?

Can't find it? It's under that blue heatsink
The processors were plugged into a modified Intel BadAxe2 motherboard, with the modification being necessary to support Penryn. Each system had 2GB of DDR2-800 memory and a GeForce 8800 GTX. All of our tests were run under Windows XP.

Wolfdale - 2 cores

Yorkfield - 4 cores
The Cinebench 9.5 test is the same one we run in our normal CPU reviews, with the dual core Penryn (Wolfdale) scoring about 20% faster than the dual core Conroe. Keep in mind that the Wolfdale core is running at a 9.2% higher clock speed, but even if Cinebench scaled perfectly with clock speed there's still at least a 10% increase in performance due to the micro-architectural improvements found in Penryn.
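The clock-speed adjustment in the paragraph above can be checked with a few lines of arithmetic. This is only a sketch: it assumes perfect scaling with clock speed, as the article does, and the function name is made up for illustration.

```python
def clock_normalized_gain(speedup: float, clock_ratio: float) -> float:
    """Per-clock gain left after dividing out the clock advantage.

    Assumes the benchmark scales perfectly with clock speed.
    """
    return speedup / clock_ratio - 1.0


measured_speedup = 1.20      # Wolfdale "about 20% faster" in Cinebench 9.5
clock_ratio = 3.20 / 2.93    # Wolfdale vs. Core 2 Extreme X6800 clocks

gain = clock_normalized_gain(measured_speedup, clock_ratio)
print(f"clock advantage: {clock_ratio - 1:.1%}")
print(f"per-clock gain:  {gain:.1%}")
```

With these inputs the per-clock gain comes out just under 10%, so the article's "at least 10%" is a rounded figure.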
Next up was Intel's Half Life 2 Lost Coast benchmark which was run with the following settings:
Setting | Value
Model Detail | High
Texture Detail | High
Shader Detail | High
Water Detail | Reflect World
Shadow Detail | High
Texture Filtering | Trilinear
HDR | Full

Half Life 2 performance at a very CPU bound 1024 x 768 has Wolfdale just under 19% faster than Conroe. Once again, clock speed does play a part here but we'd expect at least a 10% increase in performance just due to the advancements in Penryn.
At 1600 x 1200 the performance difference shrinks to 10.6%, still quite respectable.

16#
OP | Posted 2007-7-10 23:14
Next up are Intel's Penryn benchmark results revealed at IDF Beijing. The system configuration is a little different, as both Penryn systems run at 3.33GHz and the systems are running Windows Vista Ultimate 32-bit. The exact config is listed below:
Test System Configuration | Wolfdale 3.33GHz | Yorkfield 3.33GHz | Core 2 Extreme QX6800 (2.93GHz)
CPU | Pre-production dual core Penryn 3.33GHz/1333MHz FSB 6MB L2 | Pre-production quad core Penryn 3.33GHz/1333MHz 12MB L2 | Core 2 Extreme QX6800 quad core 2.93GHz/1066MHz 8MB L2
Motherboard | Pre-production BadAxe2 975X | Pre-production BadAxe2 975X | BadAxe2 975X
BIOS | Pre-production BIOS | Pre-production BIOS | Pre-production BIOS
Chipset Driver | 8.1.1.1010 | 8.1.1.1010 | 8.1.1.1010
Video Card | GeForce 8800 GTX (all systems)
Video Driver | NVIDIA 100.65 (all systems)
Memory | 2 x 1GB DDR2-800 5-5-5-15 (all systems)
Hard Drive | Seagate 7200.10 320GB (all systems)

And now the results:
Benchmark | Wolfdale 3.33GHz | Yorkfield 3.33GHz | Core 2 Extreme QX6800 (2.93GHz)
3DMark '06 V1.1.0 Pro CPU (score) | 3061 | 4957 | 4070
3DMark '06 V1.1.0 Pro Overall (score) | 11015 | 11963 | 11123
Mainconcept H.264 Encoder (seconds) | 119 | 73 | 89
Cinebench R9.5 (CPU test) | 1134 | 1935 | 1549
Cinebench R10 Beta (CPU test) | 7045 | 13068 | 10416
HL2 Lost Coast Build 2707 (fps) | 210 | 210 | 153
DivX 6.6 Alpha w/ VirtualDub 1.7.1 (seconds) | 22 | 18 | 38

For easier comparison we took the two quad-core chips (Yorkfield vs. Kentsfield) and looked at performance scaling between the two:
Benchmark | Yorkfield Performance Advantage
3DMark '06 V1.1.0 Pro CPU (score) | 21.8%
3DMark '06 V1.1.0 Pro Overall (score) | 7.6%
Mainconcept H.264 Encoder (seconds) | 18.0%
Cinebench R9.5 (CPU test) | 24.9%
Cinebench R10 Beta (CPU test) | 25.5%
HL2 Lost Coast Build 2707 (fps) | 37.3%
DivX 6.6 Alpha w/ VirtualDub 1.7.1 (seconds) | 111%
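The advantage column can be recomputed from the raw Yorkfield and QX6800 results listed earlier in this post. The sketch below is an editorial check, not part of the article; the helper names are invented. Scores and fps are treated as higher-is-better, and timed tests as lower-is-better. One observation, offered as an inference only: the 18.0% quoted for the H.264 encoder matches the fraction of time saved, (89 - 73) / 89, whereas the DivX figure of 111% matches a speedup ratio, 38 / 18 - 1.

```python
# benchmark: (Yorkfield, QX6800, higher_is_better), values from the results table
results = {
    "3DMark '06 CPU": (4957, 4070, True),
    "3DMark '06 Overall": (11963, 11123, True),
    "Mainconcept H.264 (s)": (73, 89, False),
    "Cinebench R9.5": (1935, 1549, True),
    "Cinebench R10 Beta": (13068, 10416, True),
    "HL2 Lost Coast (fps)": (210, 153, True),
    "DivX 6.6 Alpha (s)": (18, 38, False),
}


def advantage(new: float, old: float, higher_is_better: bool) -> float:
    """Speedup of `new` over `old`, expressed as a fraction (0.25 means 25% faster)."""
    ratio = new / old if higher_is_better else old / new
    return ratio - 1.0


def time_saved(new_seconds: float, old_seconds: float) -> float:
    """Fraction of the old run time that the new system saves."""
    return (old_seconds - new_seconds) / old_seconds


for name, (yorkfield, qx6800, hib) in results.items():
    print(f"{name:24s} {advantage(yorkfield, qx6800, hib):7.1%}")

print(f"H.264 time saved:        {time_saved(73, 89):7.1%}")
```

Under the speedup definition the H.264 row would read about 21.9% rather than 18.0%; every other row agrees with the table.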

The Yorkfield system runs at a 13.6% higher clock speed than the Kentsfield system giving it an inherent advantage, but that's clearly not all that's making it faster. Half-Life 2 went up 37.3% (we're assuming that Intel ran these numbers at 1024 x 768), the 3DMark '06 CPU score went up 21.8%, and Cinebench saw a 25% increase in performance.
The DivX 6.6 test is particularly strong for Intel because it is using an early alpha version of DivX with support for SSE4. With SSE4 support, the quad-core Yorkfield processor finishes in less than half the time Kentsfield needs (18 vs. 38 seconds), which bodes very well for Penryn if applications like DivX can bring SSE4 support in time for launch.
Final Words
Obviously we'll reserve final judgments on Penryn for our official review of the CPU, but these initial results look very promising. We would expect to see clock for clock Penryn vs. Conroe improvements to be in the 5 - 10% range at minimum depending on the application. Factor in higher clock speeds and you can expect our CPU performance charts to shift up by about 20% by the end of this year.
Intel has shown its cards, now it's time for AMD to respond with those long overdue Barcelona tests...

17#
OP | Posted 2007-7-10 23:16
Penryn's Enhanced Core Architecture

The quad core version of Penryn contains 820 million transistors (Kentsfield has 582 million) in two very small dies of 107 mm². That makes the new design 25 percent smaller than Intel's current 65nm Quad core (143 mm²).
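A quick check of the die-size claim, plus a rough transistor density comparison. This is a sketch under two assumptions that the article does not state outright: the areas are per die, and each quad-core package holds two dies, so the quoted transistor counts are spread over two dies.

```python
# Figures quoted in the article
penryn_quad_transistors_m = 820      # millions, whole quad-core package
kentsfield_transistors_m = 582       # millions, whole quad-core package
penryn_die_mm2 = 107
current_die_mm2 = 143

DIES_PER_QUAD = 2  # assumption: both quad-core parts are two dual-core dies

area_reduction = 1 - penryn_die_mm2 / current_die_mm2

# Millions of transistors per square millimetre
density_45nm = penryn_quad_transistors_m / (DIES_PER_QUAD * penryn_die_mm2)
density_65nm = kentsfield_transistors_m / (DIES_PER_QUAD * current_die_mm2)

print(f"die area reduction: {area_reduction:.1%}")
print(f"density 45nm: {density_45nm:.2f} M/mm2, 65nm: {density_65nm:.2f} M/mm2")
print(f"density ratio: {density_45nm / density_65nm:.2f}x")
```

The area reduction works out to roughly 25%, matching the text. Under the stated assumptions the density ratio is a little under 2x, which is in the range one would expect from a full process shrink combined with a larger share of dense cache.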


The new Penryn CPU also has yet another addition to the x86 ISA: Intel Streaming SIMD Extensions 4 (SSE4) instructions. It has also been confirmed that Penryn will deliver higher IPC and higher clock speeds. Intel wouldn't say more than "more than 3 GHz", but considering that the FSB is bumped up to 1600 MHz, 3.2 GHz is likely. However, several Intel people confirmed that if necessary ("depending on what the competition does"), the 45nm CPUs can go quite a bit higher (3.6 GHz is probably a safe estimate, considering how far current Core 2 CPUs are able to overclock).


With regards to power, Intel will be introducing what it is calling "Deep Power Down Technology", or a new lower power state, C6. The new C6 state reduces core voltage down to the absolute minimum for the given process technology, shuts down the core clock as well as turns off all of the caches. It is the absolute lowest power state that can be attained and will be introduced on Mobile Penryn family processors.

Penryn family processors are supposed to be socket-compatible, meaning that on the desktop we will see them introduced as LGA-775 CPUs. We'd expect that Intel's new lineup of chipsets will be required, but we are not sure if the new chipsets will support the 1600MHz FSB out of the box or if a refresh will be required.


Penryn-based processors also have a much better divider unit, roughly doubling the divider speed using a faster divide technique called Radix 16. Also, the shuffle engine has been improved. Intel's "Super Shuffle Engine" is a 128-bit, single-pass shuffle unit that can perform full-width shuffles in a single cycle, improving performance for SSE2, SSE3 and SSE4 instructions that have shuffle-like operations such as pack, unpack and wider packed shifts.
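To see why a higher radix can roughly double divide speed, here is a minimal software sketch of digit-by-digit long division that retires a configurable number of quotient bits per step. It is not a model of Intel's hardware. The function name and the 64-bit width are illustrative, and comparing against 2 bits per step (radix 4) is an assumption about the older divider that the article does not confirm.

```python
def radix_divide(dividend: int, divisor: int, bits_per_step: int, width: int = 64):
    """Long division producing `bits_per_step` quotient bits per iteration.

    Returns (quotient, remainder, steps). Radix 4 is bits_per_step=2,
    radix 16 is bits_per_step=4.
    """
    if divisor <= 0 or dividend < 0 or dividend >= 1 << width:
        raise ValueError("expects 0 <= dividend < 2**width and divisor > 0")

    radix = 1 << bits_per_step
    steps = -(-width // bits_per_step)          # ceiling division
    total_bits = steps * bits_per_step

    quotient = 0
    remainder = 0
    for step in range(steps):
        # Bring down the next group of dividend bits, most significant first.
        shift = total_bits - (step + 1) * bits_per_step
        chunk = (dividend >> shift) & (radix - 1)
        remainder = remainder * radix + chunk

        # Select the quotient digit (0 .. radix-1). Hardware would do this
        # with comparators or a lookup table; here we subtract repeatedly.
        digit = 0
        while remainder >= divisor:
            remainder -= divisor
            digit += 1

        quotient = quotient * radix + digit
    return quotient, remainder, steps


q4, r4, steps_radix4 = radix_divide(1_000_003, 97, bits_per_step=2)
q16, r16, steps_radix16 = radix_divide(1_000_003, 97, bits_per_step=4)
print(q16, r16, steps_radix4, steps_radix16)
```

Both calls return the same quotient and remainder, but the radix-16 version needs half as many iterations for the same operand width, which is the effect the article describes.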


The last improvement is the "Split Load Cache Enhancement" which lowers the impact of data which is not aligned to cacheline boundaries. This seems to happen in some SSE intensive imaging applications.
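A split load is simply a load whose bytes straddle two cache lines. The sketch below shows how to detect one from an address and an access size; the 64-byte line size is an assumption based on current Core 2 parts, and the function name is illustrative.

```python
CACHE_LINE_BYTES = 64  # assumption: 64-byte cache lines


def is_split_load(address: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """True if the bytes [address, address + size) span two cache lines."""
    first_line = address // line
    last_line = (address + size - 1) // line
    return first_line != last_line


# 16-byte SSE loads walking a buffer that starts 8 bytes past a line boundary
base = 8
loads = [base + 16 * i for i in range(16)]
splits = [addr for addr in loads if is_split_load(addr, 16)]
print(f"{len(splits)} of {len(loads)} loads are split: {splits}")
```

In this example one load in four crosses a line boundary, which illustrates how a misaligned image buffer can turn a steady stream of SSE loads into a steady stream of split loads.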

The Quad core desktop and the quad core Xeon products will need 120W, 80W and 50W (LV) just like today. The dual core products will get a 40W/65W and 80W TDP.

Better Virtualization

Intel's current hardware support for virtualization in the current Core architecture is lackluster to say the least. To understand this you must understand what happens in a "pure" software-based virtualization solution such as VMware ESX 2.5.3 running on older Intel CPUs.

A technique called "ring deprivileging" is used as the guest OS cannot be allowed to run in the lowest ring 0 where it normally runs; the Virtual Machine Manager or hypervisor now runs there. That means that every time the guest application asks the help of the guest OS, which needs to run instructions which are only available in ring 0, the VMM must intercept that "SYSENTER" and emulate the normal execution. This is quite costly in performance terms.
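The trap-and-emulate idea behind ring deprivileging can be illustrated with a toy model. Everything here is hypothetical: the instruction names, the `ToyVMM` class and the tiny set of privileged operations are invented for illustration and do not describe VMware ESX or real x86 behavior in any detail.

```python
from dataclasses import dataclass, field

# Illustrative subset of operations that require ring 0
PRIVILEGED_OPS = {"cli", "sti", "write_cr3"}


class PrivilegeTrap(Exception):
    """Raised when deprivileged code attempts a ring 0 operation."""


@dataclass
class VirtualCPU:
    """State the guest OS believes it controls."""
    interrupts_enabled: bool = True
    cr3: int = 0


@dataclass
class ToyVMM:
    vcpu: VirtualCPU = field(default_factory=VirtualCPU)
    traps: int = 0

    def execute_native(self, op: str, arg, ring: int) -> None:
        # Unprivileged operations run directly at full speed (not modelled).
        if op in PRIVILEGED_OPS and ring != 0:
            raise PrivilegeTrap(op)

    def emulate(self, op: str, arg) -> None:
        # The VMM applies the effect to the virtual CPU, not the real one.
        if op == "cli":
            self.vcpu.interrupts_enabled = False
        elif op == "sti":
            self.vcpu.interrupts_enabled = True
        elif op == "write_cr3":
            self.vcpu.cr3 = arg

    def run_guest(self, program, guest_ring: int = 1) -> None:
        for op, arg in program:
            try:
                self.execute_native(op, arg, guest_ring)
            except PrivilegeTrap:
                self.traps += 1          # each trap is a costly round trip
                self.emulate(op, arg)


guest_kernel_code = [
    ("add", None),
    ("cli", None),
    ("write_cr3", 0x1000),
    ("add", None),
    ("sti", None),
]

vmm = ToyVMM()
vmm.run_guest(guest_kernel_code)
print(f"traps taken: {vmm.traps}, virtual CR3: {vmm.vcpu.cr3:#x}")
```

Three of the five instructions trap into the VMM here. The point of the sketch is only that every privileged operation in the guest kernel costs an intercept plus emulation, which is the overhead the paragraph above refers to.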

Hardware assisted virtualization does not have that problem: both the OS and the VMM have their own ring 0. Despite this, Intel's HW assisted solutions didn't give any speed boost. It has not been discussed in detail, but Penryn speeds up virtual machine transition (entry/exit) times by 25% to 75%, and this requires no virtual machine software changes. This might be similar to AMD's nested page technology, although we don't have any clear details at present.

Last but not least, the dual core Penryn processors get a 6 MB shared cache and the quad versions get 12 MB cache. Both new designs will also come with a "higher degree of associativity". Considering the current designs are 16-way set associative, most likely the newer chips will feature a 24-way set associative L2 cache.
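The 24-way guess follows from simple cache geometry: the number of sets is the cache size divided by (ways x line size). The sketch below assumes 64-byte lines and a 4 MB, 16-way L2 per current dual-core die; the 4 MB figure is inferred from the 8 MB total quoted for the quad-core QX6800.

```python
LINE_BYTES = 64          # assumption: 64-byte cache lines
MB = 1024 * 1024


def num_sets(cache_bytes: int, ways: int, line_bytes: int = LINE_BYTES) -> int:
    """Number of sets in a set-associative cache."""
    return cache_bytes // (ways * line_bytes)


current_sets = num_sets(4 * MB, ways=16)      # current dual-core die
penryn_24way = num_sets(6 * MB, ways=24)      # the article's guess
penryn_16way = num_sets(6 * MB, ways=16)      # if associativity stayed at 16

print(current_sets, penryn_24way, penryn_16way)
```

Going from 4 MB to 6 MB at 24 ways keeps the set count unchanged, so the index bits stay the same and only the number of ways grows. Staying at 16 ways would give a set count that is not a power of two, which makes simple address indexing awkward.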

Intel EDAT: the End of the Multi-core Clock Speed Disadvantage?

Intel also talked about its "Enhanced Dynamic Acceleration Technology" which is effectively integrated overclocking based on load. If you are running a single threaded application (or a multi-threaded application that's predominantly using a single thread), Intel's EDAT can power down the second core and increase the frequency of the working core to maintain the same thermal envelope at all times.

Intel's EDAT could spell the end of the clock speed differential between single and multi-core processors. With all cores running workloads, the multi-core system would be clocked lower, but when some cores are idle the chip could potentially run at the same speed as a single core solution would. Single core designs have pretty much disappeared from roadmaps already, but considering there are still applications that are single threaded in nature and benefit more from clock speed improvements, future processors will offer both options in a single package.
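The trade-off EDAT exploits can be sketched with a toy power model. This is not Intel's algorithm. It assumes a fixed package power budget shared equally by the active cores, and that per-core power grows with the cube of frequency (voltage scaling linearly with clock), which is a common textbook simplification. The function name and the exponent parameter are illustrative.

```python
def boosted_clock_ghz(base_ghz: float, total_cores: int, active_cores: int,
                      exponent: float = 3.0) -> float:
    """Clock the active cores could reach within the same power budget.

    Toy model: power per core ~ frequency ** exponent, and the budget freed
    by powered-down cores is handed to the cores still running.
    """
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")
    budget_per_active_core = total_cores / active_cores
    return base_ghz * budget_per_active_core ** (1.0 / exponent)


both_busy = boosted_clock_ghz(3.0, total_cores=2, active_cores=2)
one_busy = boosted_clock_ghz(3.0, total_cores=2, active_cores=1)
print(f"both cores busy: {both_busy:.2f} GHz, one core busy: {one_busy:.2f} GHz")
```

The absolute numbers from such a model should not be taken seriously, since real parts are also limited by voltage, hot spots and validated clock bins. The sketch only shows the direction of the effect: idle cores free up thermal headroom that a single busy core can spend on clock speed.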

Performance

Intel hasn't revealed too much about the performance of Penryn but Pat did leave us with a few comments. We don't know anything more about the test conditions than what we are presenting, and we didn't do the measurements ourselves, so take it for what it's worth.

Comparing a 3.2GHz Penryn (1.6GHz FSB) to a 3.0GHz Conroe (1.33GHz FSB), Intel has measured more than 20% increase in gaming performance (with no code changes). For video encoding applications, if SSE4 is utilized, the same Penryn vs. Conroe comparison can offer more than a 40% increase in performance.

Finally, Intel mentioned that in the server space, the fastest quad core Penryn available (>3GHz) vs. a 2.67GHz quad core Xeon resulted in a greater than 45% increase in performance in "bandwidth and FP intensive applications". It's incredibly vague (and oddly similar to AMD's claims of Barcelona vs. Xeon performance), but Pat mentioned that STREAM and certain benchmarks in SpecFP could be considered to be "bandwidth and FP intensive".

Again, we are just reporting what Intel told us. It will be a while before we can actually verify any of these claims or put them in the right context. Given the various enhancements that we've reported on, however, it's only reasonable to expect Penryn to be faster than Conroe, clock-for-clock. Whether that's 10% faster, 20% faster, or something else will be made clear in the future.

18#
Posted 2007-7-10 23:24
I got bored and fiddled around a bit.


19#
Posted 2007-7-10 23:57
w00t) Any hope for the 2407?

20#
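OP | Posted 2007-7-11 00:05
Originally posted by enmaai at 2007-7-10 23:57:
w00t) Any hope for the 2407?

The 2407 has nothing to do with Penryn.
Anyone who would bet a 2407 on Penryn shipping this year has either been kicked by a donkey or caught in an elevator door.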