Streaming SIMD Extensions 2 (SSE2)

The Intel(R) Pentium(R) 4 processor introduces the SSE2 extensions, which offer several enhancements to the Intel MMX(TM) technology and Streaming SIMD extensions. These enhancements include:

New SIMD instructions introduced in the IA-32 architecture include floating-point SIMD instructions, integer SIMD instructions, conversion between SIMD floating-point data and SIMD integer data, and conversion of packed data between XMM registers and MMX registers.

New floating-point SIMD instructions allow computations to be performed on packed double-precision floating-point values (two double-precision values per XMM register). The computation of SIMD floating-point instructions and the single-precision and double-precision floating-point formats are compatible with IEEE Standard 754 for Binary Floating-Point Arithmetic.

New integer SIMD instructions provide flexible and higher dynamic range computational power by supporting arithmetic operations on packed doubleword and quadword data as well as other operations on packed byte, word, doubleword, quadword and double quadword data.

In addition to new 128-bit SIMD instructions described in the previous paragraph, there are 128-bit enhancements to 68 integer SIMD instructions, which operated solely on 64-bit MMX registers in the Pentium II and Pentium III processors. Those 64-bit integer SIMD instructions are enhanced to support operation on 128-bit XMM registers in the Pentium 4 processor. These enhanced integer SIMD instructions allow software developers to deliver new performance levels when implementing floating-point and integer algorithms, and to have maximum flexibility by writing SIMD code with either XMM registers or MMX registers.

To speed up processing and improve cache usage, the SSE2 extensions offers several new instructions that allow application programmers to control the cacheability of data. These instructions provide the ability to stream data in and out of the registers without disrupting the caches and the ability to prefetch data before it is actually used.

The new architectural features introduced with the SSE2 extensions do not require new operating system support. This is because the SSE2 extensions do not introduce new architectural states, and the FXSAVE/FXRSTOR instructions, which supports the SSE extensions, also supports SSE2 extensions and are sufficient for saving and restoring the state of the XMM registers, the MMX registers, and the x87 FPU registers during a context switch. The CPUID instruction

has been enhanced to allow operating system or applications to identify for the existence of the SSE and SSE2 features.

The SSE2 extensions are accessible in all IA-32 architecture operating modes in the Intel Pentium 4 processor. The Pentium 4 processor maintains IA-32 software compatibility. All existing software continues to run correctly, without modification on the Pentium 4 processor and future IA-32 processors that incorporate the SSE2 extensions. Also, existing software continues to run correctly in the presence of applications that make use of the SSE2 instructions.