POPPUR爱换

Views: 2739 | Replies: 10

Next to AVX, AMD's SSE5 is yet another dud...

1#
Posted 2008-6-18 13:23
http://www.realworldtech.com/forums/index.cfm?action=detail&id=91619&threadid=91619&roomid=2

When AMD published their new ISA extension named SSE5 in late August 2007, they also introduced a new instruction code format for instructions with 3 or 4 operands. When Intel presented their AVX extension in April this year they introduced another code format that also supports 3 or 4 operands. These two formats are very different. We are now in a position where AMD and Intel are using completely different coding schemes for the same instructions. This is every programmer's nightmare! I cannot imagine any significant number of programmers making three versions of their code: one for AMD, one for Intel, and one for compatibility with older processors.

The forking of instruction sets and coding schemes is one of the less desirable consequences of free competition. We would all prefer some kind of international standardization committee that could approve new instruction codes. Such a committee would be reluctant to accept new shortsighted patches that add just another complication to instruction decoding. They would have weeded out the bizarre undocumented instructions from the old 8086 days that are still supported. And they might not accept the addition of new instructions to the already bulging instruction set mainly for marketing reasons with little technical benefit. Unfortunately, there is little hope that such a committee will be formed.

I have looked into the details of the two competing instruction formats and made a comparison:

* Both ISA extensions are compatible with all existing code.
Both are compatible with existing code.

* SSE5 supports 3 operands for new instructions only. AVX extends existing instructions to 3 operands as well. Almost all existing instructions on XMM registers are extended to 3 operands, and the code format makes room for also extending general-purpose register instructions to 3 operands.
SSE5 supports 3 operands only for new instructions; AVX extends existing instructions to 3 operands as well.

* SSE5 supports instructions with 4 operands, but only if two of the operands are the same register. AVX supports any combination of 4 registers by adding an extra code byte. Future extension to 5 operands is possible.
SSE5 supports 4 operands, but two of them must be the same register. AVX supports any combination of four registers and can be extended to five.

* SSE5 makes instructions longer. AVX makes some instructions longer and some instructions shorter, but most instructions keep the same length as before despite containing one more register operand and other new information.
SSE5 makes instructions longer; with AVX, instruction length stays the same on average.

* SSE5 adds yet another complication to the already very complicated instruction decoding procedure. AVX makes instruction decoding simpler by sanitizing a lot of old patches. The many prefixes and escape bytes that pester the current instruction set are joined together into a single "VEX" prefix that is 2 or 3 bytes long.
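SSE5 makes the already complicated instruction decoding even more cumbersome, whereas AVX relates to the existing instructions more cleanly and is simpler and more elegant.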

* AVX supports the extension of the 128-bit vector registers (XMM registers) to 256 bits (YMM registers) with room for further extensions in the future. SSE5 has no room for new extensions.
AVX can extend the existing vector registers to 256 bits; SSE5 cannot.

* AVX has 3 unused bits for future extensions to the now overloaded opcode map. This means no new shortsighted patches for a foreseeable future.
AVX has three undefined bits, leaving room for further extensions later.
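
The claim about instruction length in the list above can be checked by hand for a few register-to-register cases. The Python sketch below is not from the original post. It assumes the VEX field layout as Intel documents it (2-byte form: C5, then R, vvvv, L, pp; 3-byte form: C4, then R, X, B, mmmmm, then W, vvvv, L, pp, with R, X, B and vvvv stored inverted), and it only handles 128-bit XMM register operands. The helper names `legacy_encode` and `vex_encode` are made up for illustration.

```python
# Sketch: encode register-to-register XMM instructions in the legacy SSE
# format and in the VEX format, so the byte counts can be compared.
# Assumptions: registers are numbered 0..15, vector length is 128 bits
# (VEX.L = 0), VEX.W = 0, and there is no memory operand.

LEGACY_ESCAPE = {"0F": b"\x0f", "0F38": b"\x0f\x38", "0F3A": b"\x0f\x3a"}
VEX_MAP = {"0F": 1, "0F38": 2, "0F3A": 3}       # mmmmm field of 3-byte VEX
VEX_PP = {None: 0, 0x66: 1, 0xF3: 2, 0xF2: 3}   # pp field replaces the prefix


def modrm(reg, rm):
    """ModRM byte for a register-register form (mod = 11b)."""
    return 0xC0 | ((reg & 7) << 3) | (rm & 7)


def legacy_encode(prefix, opmap, opcode, dst, src):
    """Two operands: dst = dst OP src."""
    out = bytearray()
    if prefix is not None:
        out.append(prefix)                      # 66 / F3 / F2 prefix
    rex = 0x40 | ((dst >> 3) << 2) | (src >> 3)  # REX.R and REX.B
    if rex != 0x40:
        out.append(rex)                         # only needed for xmm8..xmm15
    out += LEGACY_ESCAPE[opmap]                 # escape byte(s)
    out.append(opcode)
    out.append(modrm(dst, src))
    return bytes(out)


def vex_encode(prefix, opmap, opcode, dst, src1, src2):
    """Three operands: dst = src1 OP src2."""
    r_inv = 1 - (dst >> 3)                      # inverted register extension
    b_inv = 1 - (src2 >> 3)
    vvvv = (~src1) & 0xF                        # extra source, inverted
    pp = VEX_PP[prefix]
    if opmap == "0F" and b_inv == 1:
        # 2-byte VEX is enough: map 0F, no X/B extension, W = 0
        head = bytes([0xC5, (r_inv << 7) | (vvvv << 3) | pp])
    else:
        head = bytes([0xC4,
                      (r_inv << 7) | (1 << 6) | (b_inv << 5) | VEX_MAP[opmap],
                      (vvvv << 3) | pp])
    return head + bytes([opcode, modrm(dst, src2)])
```

Under these assumptions, ADDPS xmm0, xmm1 is 3 bytes in legacy form (0F 58 C1) and 4 bytes as VADDPS xmm0, xmm0, xmm1 (C5 F8 58 C1); ADDPD stays at 4 bytes (66 0F 58 C1 versus C5 F9 58 C1); and ADDPD xmm8, xmm1 shrinks from 5 bytes to 4 because the REX prefix is folded into the VEX prefix.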

Before I saw the AVX documentation, I would have denied that it was possible to add so much new information without making instructions longer. The trick is that it makes one long prefix instead of many short prefixes. One or a few bits in the new VEX prefix contain the same information as a whole 8-bit or even 16-bit prefix or escape code in the current coding scheme. The two VEX prefixes are made out of two obsolete instructions, LDS and LES, which are valid in 16- and 32-bit mode but invalid in 64-bit mode. Certain bits in the VEX prefix that indicate register extensions available only in 64-bit mode are placed in such a way that the only values valid in 32-bit mode form an invalid register operand if interpreted as a legacy LDS or LES instruction. This is a solution no less ingenious than the x64 extension invented by AMD.
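
The LDS/LES trick described above can be written down as a short rule. The following Python sketch is not from the original post; it assumes the opcode bytes C4 (LES) and C5 (LDS) and relies on the fact that these two instructions only accept a memory operand, so a following byte whose top two bits are 11 (the ModRM "register" form) can never be a valid LES or LDS. The function name `classify` is hypothetical.

```python
def classify(code, long_mode=False):
    """Decide how a byte sequence starting with C4 or C5 is interpreted.

    code      -- bytes, at least two long
    long_mode -- True for 64-bit mode, False for 16/32-bit mode
    """
    first, second = code[0], code[1]
    if first not in (0xC4, 0xC5):
        raise ValueError("not an LES/LDS or VEX lead byte")
    if long_mode:
        # LES and LDS do not exist in 64-bit mode, so the byte is free
        # to be used as a VEX prefix unconditionally.
        return "VEX"
    # In 16/32-bit mode the next byte would be the ModRM byte of LES/LDS.
    # Its top two bits are the 'mod' field; 11b means a register operand,
    # which LES/LDS do not allow. VEX stores the (inverted) 64-bit-only
    # register extension bits there, and those are always 1 in 32-bit mode.
    if (second >> 6) == 0b11:
        return "VEX"
    return "LES" if first == 0xC4 else "LDS"
```

For instance, the bytes C5 F8 58 C1 classify as VEX in either mode, while C5 39 58 C1 (a VEX encoding that names xmm8, which only exists in 64-bit mode) reads as a legacy LDS when `long_mode` is False.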

Looking at the advantages of AVX over SSE5 there can be no doubt that AMD has no choice but to adopt AVX. There is no way AMD can stay in competition without supporting the new 256-bit vectors and the 3-operand version of all existing XMM instructions. And, incidentally, it will be easier to implement the new 3-operand instructions for AMD than it is for Intel because the current Intel microarchitecture does not allow micro-operations with more than two inputs, while the AMD microarchitecture has no such limitation.  

Let me explain the advantage of 3-operand instructions to those who don't know what this is about. Most of the current instructions place the result of a calculation in the same register as one of the input operands, e.g.:
A = A * B.
With a 3-operand version, you can do:
C = A * B.
This gives the programmer the freedom to reuse the original value of A in other calculations without having to copy it to another register. The result is fewer register-to-register moves and hence more efficient and compact code.  
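
To make the saving concrete, here is a toy register machine in Python, added as an illustration and not taken from the original post. The instruction names `mov`, `mul2` and `mul3` are invented for this sketch; they stand for a register copy, a destructive 2-operand multiply, and a non-destructive 3-operand multiply.

```python
def run(program, regs):
    """Execute a list of toy instructions and return the final registers."""
    regs = dict(regs)                      # do not modify the caller's dict
    for op, *args in program:
        if op == "mov":                    # mov dst, src : dst = src
            dst, src = args
            regs[dst] = regs[src]
        elif op == "mul2":                 # mul2 dst, src : dst = dst * src
            dst, src = args
            regs[dst] = regs[dst] * regs[src]
        elif op == "mul3":                 # mul3 dst, a, b : dst = a * b
            dst, a, b = args
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


START = {"A": 3, "B": 4, "C": 0}

# Goal: C = A * B while keeping the original value of A.
TWO_OPERAND = [("mov", "C", "A"),          # extra copy to protect A
               ("mul2", "C", "B")]
THREE_OPERAND = [("mul3", "C", "A", "B")]  # no copy needed
```

Both programs leave A and B untouched and put the product in C, but the 2-operand version needs one extra register-to-register move.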

The SSE5 instructions will suffer the same fate as AMD's 3DNow instructions. Nobody ever used the 3DNow instructions because they are not supported in Intel processors. They are superseded by the more efficient SSE instructions, but AMD have to keep supporting them in all their future processors for the sake of backwards compatibility. Let's hope that AMD have the guts to drop SSE5 altogether before it's too late. There has been some speculation that they might.  

Too bad that AMD haven't seen this coming before they published their SSE5 spec. Intel must have been able to keep their plans secret despite the patent sharing agreement between AMD and Intel. Maybe there is no patent on AVX?  

See also my second posting on the software and hardware consequences of extending the size of the vector registers in the thread "Consequences of extending XMM registers to YMM".

[ Last edited by itany on 2008-6-18 13:26 ]
2#
Posted 2008-6-18 13:43
So which processors include the AVX extension? Don't the current 45nm chips have it?
3#
Posted 2008-6-18 14:02
Notice: the author has been banned or deleted; the content is hidden automatically.
4#
Posted 2008-6-18 14:09
Notice: the author has been banned or deleted; the content is hidden automatically.
5#
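Original poster | Posted 2008-6-18 20:22
Originally posted by daniel_k on 2008-6-18 13:43
So which processors include the AVX extension? Don't the current 45nm chips have it?

Nehalem doesn't have it, and neither does Westmere; it won't arrive until Sandy Bridge.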
6#
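Posted 2008-6-19 08:18
Originally posted by itany on 2008-6-18 20:22
Nehalem doesn't have it, and neither does Westmere; it won't arrive until Sandy Bridge.

Then I can relax and wait for 45nm prices to drop :loveliness:
PS: Intel's prices are inflated right now. Boss, do you know when Intel plans to adjust them?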
7#
Posted 2008-6-19 08:46
Notice: the author has been banned or deleted; the content is hidden automatically.
8#
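Original poster | Posted 2008-6-19 08:47
Originally posted by daniel_k on 2008-6-19 08:18
Then I can relax and wait for 45nm prices to drop :loveliness:
PS: Intel's prices are inflated right now. Boss, do you know when Intel plans to adjust them?

Whoa, I hardly deserve to be called "boss".
Prices will definitely come down in the third quarter...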
9#
Posted 2008-6-19 10:24
Notice: the author has been banned or deleted; the content is hidden automatically.
10#
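Original poster | Posted 2008-6-19 10:55
Originally posted by 少年包青天 on 2008-6-19 10:24
Then be number two, hahaha

With AMD already holding the number-two spot, calling anyone else number two is a grave insult to their character and a serious slight to their intelligence.
If I have to be anything, make it number three.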
11#
Posted 2008-7-6 01:13
Notice: the author has been banned or deleted; the content is hidden automatically.