|
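Originally posted by Ricepig on 2007-9-29 22:42
It's really just a matrix multiplication; with SSE it would be a lot faster.
SSE itself is fast, that much is true, but so far it seems that not many algorithms can be adapted to SSE; that is, a large share of problems cannot be solved with SIMD-style algorithms.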
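For reference, the matrix-multiplication case from the quote is one of the problems that does map well onto SIMD. Below is a minimal sketch, assuming 4x4 row-major `float` matrices and the standard SSE intrinsics from `<xmmintrin.h>`; the function names `mat4_mul_scalar` and `mat4_mul_sse` are made up for illustration and are not from the original thread.

```c
#include <assert.h>
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_mul_ps, ... */

/* Plain scalar reference: out = a * b, 4x4 row-major. */
void mat4_mul_scalar(const float a[16], const float b[16], float out[16])
{
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            float acc = a[4 * i + 0] * b[0 + j];
            acc += a[4 * i + 1] * b[4 + j];
            acc += a[4 * i + 2] * b[8 + j];
            acc += a[4 * i + 3] * b[12 + j];
            out[4 * i + j] = acc;
        }
    }
}

/* SSE version: each output row is a linear combination of the rows of b,
 * so four multiply-adds on 4-wide vectors produce a whole row at once.
 * out must not alias a or b. */
void mat4_mul_sse(const float a[16], const float b[16], float out[16])
{
    assert(a && b && out);

    /* Unaligned loads, so callers need no special alignment. */
    __m128 b0 = _mm_loadu_ps(b + 0);
    __m128 b1 = _mm_loadu_ps(b + 4);
    __m128 b2 = _mm_loadu_ps(b + 8);
    __m128 b3 = _mm_loadu_ps(b + 12);

    for (int i = 0; i < 4; ++i) {
        /* Broadcast a[i][k] to all four lanes and scale row k of b. */
        __m128 r = _mm_mul_ps(_mm_set1_ps(a[4 * i + 0]), b0);
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 1]), b1));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 2]), b2));
        r = _mm_add_ps(r, _mm_mul_ps(_mm_set1_ps(a[4 * i + 3]), b3));
        _mm_storeu_ps(out + 4 * i, r);
    }
}
```

The inner `j` loop of the scalar version disappears because all four columns are computed in one vector operation. Problems with data-dependent branching or irregular memory access do not restructure this easily, which is the limitation described above.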
http://en.wikipedia.org/wiki/AMD64
The original AMD64 architecture adopted Intel's SSE and SSE2 as core instructions. SSE3 instructions were added in April 2005. SSE2 replaces the x87 instruction set's IEEE 80-bit precision, with the choice of either IEEE 32-bit or 64-bit floating-point mathematics. This provides floating-point operations compatible with many other modern CPUs. The SSE and SSE2 instructions have also been extended to support the eight new XMM registers. SSE and SSE2 are available in 32-bit mode in modern x86 processors; however, if they're used in 32-bit programs, those programs will only work on systems with processors that support them. This is not an issue in 64-bit programs, as all processors that support AMD64 support SSE and SSE2, so using SSE and SSE2 instructions instead of x87 instructions does not reduce the set of machines on which the programs will run. Since SSE and SSE2 are generally faster than, and duplicate most of the features of, the traditional x87 instructions, MMX, and 3DNow!, the latter are redundant under AMD64.
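In practice, the 32-bit versus 64-bit point in the excerpt might look like the sketch below. It assumes GCC or Clang, whose `__builtin_cpu_init` and `__builtin_cpu_supports` builtins perform the runtime check on x86; the helper name `sse2_available` is hypothetical, and other compilers would need their own CPUID query.

```c
/* Returns 1 if SSE2 code paths may be used, 0 otherwise. */
int sse2_available(void)
{
#if defined(__x86_64__) || defined(_M_X64)
    /* Every AMD64-capable processor supports SSE and SSE2,
     * so a 64-bit build needs no runtime check. */
    return 1;
#elif defined(__i386__) && (defined(__GNUC__) || defined(__clang__))
    /* 32-bit x86: older processors may lack SSE2, so ask the CPU. */
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#else
    /* Unknown target or compiler: assume no SSE2 and use a scalar path. */
    return 0;
#endif
}
```

A 32-bit program would call this once at startup and pick between an SSE2 routine and an x87 fallback; a 64-bit program can use the SSE2 routine unconditionally.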
[ Last edited by acqwer on 2007-9-29 22:48 ] |
|