Disadvantages
Not all algorithms can be vectorized easily. For example, a flow-control-heavy task such as code parsing may not benefit much from SIMD.
SIMD implementations also require large register files, which increases power consumption and chip area.
Currently, implementing an algorithm with SIMD instructions usually requires human labor; most compilers do not generate SIMD instructions from a typical C program, for instance. Automatic vectorization in compilers is an active area of computer science research. (Compare vector processing.)
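As a hedged sketch of what that manual work looks like, the following C fragment adds two float arrays first with a plain scalar loop and then with hand-written SSE intrinsics. The function names and the four-at-a-time structure are illustrative only; the intrinsics themselves (_mm_loadu_ps, _mm_add_ps, _mm_storeu_ps from <xmmintrin.h>) are standard SSE, assuming an x86 compiler such as GCC or Clang with SSE enabled.

 /* Scalar vs. hand-vectorized SSE addition of two float arrays.
  * Assumes an x86 target with SSE, e.g. compiled with `gcc -msse`. */
 #include <stddef.h>
 #include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */

 /* Plain C loop: the compiler may or may not vectorize this on its own. */
 void add_scalar(float *dst, const float *a, const float *b, size_t n)
 {
     for (size_t i = 0; i < n; i++)
         dst[i] = a[i] + b[i];
 }

 /* Hand-vectorized loop: four single-precision additions per iteration. */
 void add_sse(float *dst, const float *a, const float *b, size_t n)
 {
     size_t i = 0;
     for (; i + 4 <= n; i += 4) {
         __m128 va = _mm_loadu_ps(a + i);    /* load 4 floats (unaligned) */
         __m128 vb = _mm_loadu_ps(b + i);
         _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
     }
     for (; i < n; i++)                      /* scalar tail for leftovers */
         dst[i] = a[i] + b[i];
 }

The scalar tail loop is needed because the vector loop only covers element counts that are multiples of four; bookkeeping of this kind is part of what makes hand vectorization laborious.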
Programming with particular SIMD instruction sets can involve numerous low-level challenges:
SSE imposes restrictions on data alignment (many instructions require 16-byte-aligned memory operands); programmers familiar with the x86 architecture may not expect this.
Gathering data into SIMD registers and scattering it to the correct destination locations is tricky and can be inefficient; the first sketch after this list illustrates both the alignment restriction and the lack of a gather instruction.
Specific instructions, such as rotations or three-operand addition, are absent from some SIMD instruction sets.
Instruction sets are architecture-specific: old processors and non-x86 processors lack SSE entirely, for instance, so programmers must provide non-vectorized implementations (or different vectorized implementations) for them; the second sketch after this list shows one common way of coping with this. Similarly, the next-generation instruction sets from Intel and AMD will be incompatible with each other (see SSE5 and AVX).
The early MMX instruction set shared a register file with the floating-point stack, which caused inefficiencies when mixing floating-point and MMX code. However, SSE2 corrects this.
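Two of the challenges above, alignment and gather/scatter, can be made concrete with a short, hedged sketch. The struct layout and values below are hypothetical; the intrinsics and C11 aligned_alloc are real. _mm_load_ps expects a 16-byte-aligned address, so the buffer is allocated with explicit alignment, and because SSE has no gather or scatter instructions, collecting strided struct fields into a register has to be spelled out element by element with _mm_set_ps.

 /* Alignment and gather pitfalls, assuming SSE on x86 and a C11 compiler. */
 #include <stdlib.h>      /* aligned_alloc, free (C11) */
 #include <xmmintrin.h>   /* SSE intrinsics */

 struct sample { float x, y, z; };   /* hypothetical strided data layout */

 int main(void)
 {
     /* _mm_load_ps requires a 16-byte-aligned address; plain malloc gives
      * no such guarantee, so aligned storage is requested explicitly
      * (with C11 aligned_alloc the size must be a multiple of the alignment). */
     float *buf = aligned_alloc(16, 16 * sizeof(float));
     if (!buf) return 1;
     for (int i = 0; i < 16; i++) buf[i] = (float)i;

     __m128 v = _mm_load_ps(buf);     /* fine: buf is 16-byte aligned      */
     /* _mm_load_ps(buf + 1) would fault at run time: misaligned address.  */
     /* _mm_loadu_ps tolerates misalignment but was slower on older CPUs.  */

     /* SSE has no gather instruction, so pulling the .x field out of four
      * structs into one register is done element by element.              */
     struct sample s[4] = {{1,2,3},{4,5,6},{7,8,9},{10,11,12}};
     __m128 xs = _mm_set_ps(s[3].x, s[2].x, s[1].x, s[0].x);

     _mm_storeu_ps(buf, _mm_add_ps(v, xs));  /* scattering back out is just as manual */
     free(buf);
     return 0;
 }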
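Portability across processors and instruction-set generations is usually handled by choosing an implementation at run time. The sketch below shows one hedged way to do that using GCC/Clang's __builtin_cpu_supports, a compiler-specific built-in rather than part of any SIMD instruction set; add_scalar and add_sse refer to the illustrative functions in the earlier sketch.

 /* Run-time dispatch between scalar and SSE code paths, assuming GCC or
  * Clang on x86 (__builtin_cpu_supports is compiler-specific). */
 #include <stddef.h>

 void add_scalar(float *dst, const float *a, const float *b, size_t n);
 void add_sse(float *dst, const float *a, const float *b, size_t n);

 void add_dispatch(float *dst, const float *a, const float *b, size_t n)
 {
     if (__builtin_cpu_supports("sse"))   /* does this CPU implement SSE?        */
         add_sse(dst, a, b, n);
     else
         add_scalar(dst, a, b, n);        /* fallback for processors without it  */
 }

Projects that must also build for non-x86 processors typically hide the same choice behind a function pointer or a build-time configuration switch instead of a compiler built-in.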