PADDB/PADDW/PADDD--Packed Add

Opcode         Instruction              Description

0F FC /r       PADDB mm, mm/m64         Add packed byte integers from mm/m64 and mm.
66 0F FC /r    PADDB xmm1, xmm2/m128    Add packed byte integers from xmm2/m128 and xmm1.
0F FD /r       PADDW mm, mm/m64         Add packed word integers from mm/m64 and mm.
66 0F FD /r    PADDW xmm1, xmm2/m128    Add packed word integers from xmm2/m128 and xmm1.
0F FE /r       PADDD mm, mm/m64         Add packed doubleword integers from mm/m64 and mm.
66 0F FE /r    PADDD xmm1, xmm2/m128    Add packed doubleword integers from xmm2/m128 and xmm1.

Description

Performs a SIMD add of the packed integers from the source operand (second operand) and the destination operand (first operand), and stores the packed integer results in the destination operand. The source operand can be an MMX™ technology register or a 64-bit memory location, or it can be an XMM register or a 128-bit memory location. The destination operand can be an MMX or an XMM register. See Figure 9-4 in the IA-32 Intel(R) Architecture Software Developer's Manual, Volume 1 for an illustration of a SIMD operation.

The PADDB instruction adds packed byte integers. When an individual result is too large to be represented in 8 bits (overflow), the result is wrapped around and the low 8 bits are written to the destination operand (that is, the carry is ignored).

The PADDW instruction adds packed word integers. When an individual result is too large to be represented in 16 bits (overflow), the result is wrapped around and the low 16 bits are written to the destination operand.

The PADDD instruction adds packed doubleword integers. When an individual result is too large to be represented in 32 bits (overflow), the result is wrapped around and the low 32 bits are written to the destination operand.

Note that the PADDB, PADDW, and PADDD instructions can operate on either unsigned or signed (two's complement notation) packed integers; however, they do not set bits in the EFLAGS register to indicate overflow and/or a carry. To prevent undetected overflow conditions, software must control the ranges of values operated on.
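As an illustration of the wraparound behavior and of the range checks left to software, the following C sketch models a single PADDB lane. The helper names (paddb_lane, paddb_lane_carries, paddb_lane_overflows_signed) are hypothetical and are not part of any Intel header. The same pattern should carry over to the 16-bit and 32-bit lanes of PADDW and PADDD.

```c
#include <stdint.h>

/* Hypothetical model of one PADDB lane: an 8-bit add whose carry is
   discarded, so only the low 8 bits of the sum are kept. */
static uint8_t paddb_lane(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Hypothetical pre-check: would the unsigned sum carry out of 8 bits? */
static int paddb_lane_carries(uint8_t a, uint8_t b)
{
    return (unsigned)a + (unsigned)b > 0xFFu;
}

/* Hypothetical pre-check: would the signed sum fall outside int8_t? */
static int paddb_lane_overflows_signed(int8_t a, int8_t b)
{
    int wide = (int)a + (int)b;   /* computed in a wider type */
    return wide > INT8_MAX || wide < INT8_MIN;
}
```

The stored bit pattern is the same for both interpretations: 0x7F + 0x01 yields 0x80, which reads as 128 when treated as unsigned and as -128 when treated as signed. Because the instruction reports neither case, checks of this kind have to be done by software before or instead of the packed add.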

Operation

PADDB instruction with 64-bit operands:
DEST[7..0] ← DEST[7..0] + SRC[7..0];
* repeat add operation for 2nd through 7th byte *;
DEST[63..56] ← DEST[63..56] + SRC[63..56];

PADDB instruction with 128-bit operands:
DEST[7..0] ← DEST[7..0] + SRC[7..0];
* repeat add operation for 2nd through 15th byte *;
DEST[127..120] ← DEST[127..120] + SRC[127..120];

PADDW instruction with 64-bit operands:
DEST[15..0] ← DEST[15..0] + SRC[15..0];
* repeat add operation for 2nd and 3rd word *;
DEST[63..48] ← DEST[63..48] + SRC[63..48];

PADDW instruction with 128-bit operands:
DEST[15..0] ← DEST[15..0] + SRC[15..0];
* repeat add operation for 2nd through 7th word *;
DEST[127..112] ← DEST[127..112] + SRC[127..112];

PADDD instruction with 64-bit operands:
DEST[31..0] ← DEST[31..0] + SRC[31..0];
DEST[63..32] ← DEST[63..32] + SRC[63..32];

PADDD instruction with 128-bit operands:
DEST[31..0] ← DEST[31..0] + SRC[31..0];
* repeat add operation for 2nd and 3rd doubleword *;
DEST[127..96] ← DEST[127..96] + SRC[127..96];
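The 128-bit pseudocode above can be approximated by a small portable reference model. This is only a sketch under stated assumptions: the function name padd_model and the representation of an operand as 16 bytes in little-endian order (byte 0 holds bits 7..0) are choices made for this example, not part of the instruction definition.

```c
#include <stdint.h>

/* Hypothetical reference model of 128-bit PADDB/PADDW/PADDD.
   dest and src each hold 16 bytes, least significant byte first.
   lane_bytes selects the element size: 1 (PADDB), 2 (PADDW), 4 (PADDD). */
static void padd_model(uint8_t dest[16], const uint8_t src[16], int lane_bytes)
{
    for (int lane = 0; lane < 16; lane += lane_bytes) {
        unsigned carry = 0;
        /* Add the lane byte by byte, propagating the carry inside the lane. */
        for (int i = 0; i < lane_bytes; i++) {
            unsigned sum = (unsigned)dest[lane + i] + src[lane + i] + carry;
            dest[lane + i] = (uint8_t)sum;
            carry = sum >> 8;
        }
        /* The carry out of the lane's top byte is dropped (wraparound),
           so it never reaches the neighboring lane. */
    }
}
```

With dest starting as FFH, 00H, ... and src as 01H, 00H, ..., a call with lane_bytes equal to 1 leaves byte 1 of dest at 00H, while a call with lane_bytes equal to 2 carries into byte 1 and leaves it at 01H. That difference is the only thing that distinguishes the three instructions in this model.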

Intel(R) C++ Compiler Intrinsic Equivalents

PADDB __m64 _mm_add_pi8(__m64 m1, __m64 m2)

PADDB __m128i _mm_add_epi8(__m128i a, __m128i b)

PADDW __m64 _mm_add_pi16(__m64 m1, __m64 m2)

PADDW __m128i _mm_add_epi16(__m128i a, __m128i b)

PADDD __m64 _mm_add_pi32(__m64 m1, __m64 m2)

PADDD __m128i _mm_add_epi32(__m128i a, __m128i b)
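A minimal usage sketch for the 128-bit intrinsics follows. It assumes an x86 target with SSE2 and a compiler that provides emmintrin.h. The wrapper names add_bytes_16 and add_dwords_4 are hypothetical. Unaligned loads and stores are used so that the callers' arrays need no particular alignment, and the compiler remains free to choose the exact instruction sequence.

```c
#include <emmintrin.h>  /* SSE2 intrinsics: __m128i, _mm_add_epi8, ... */
#include <stdint.h>

/* Hypothetical wrapper: lane-wise 8-bit add of two 16-byte arrays. */
static void add_bytes_16(const uint8_t a[16], const uint8_t b[16],
                         uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}

/* Hypothetical wrapper: lane-wise 32-bit add of two 4-element arrays. */
static void add_dwords_4(const uint32_t a[4], const uint32_t b[4],
                         uint32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}
```

Adding a byte array filled with 250 to one filled with 10 gives 4 in every lane, since 260 wraps modulo 256. The 64-bit MMX forms are left out of this sketch because they additionally require clearing the MMX state before any x87 floating-point code runs.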

Flags Affected

None.

Protected Mode Exceptions

#GP(0) - If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit. If a memory operand is not aligned on a 16-byte boundary, regardless of segment (128-bit operations only).

#SS(0) - If a memory operand effective address is outside the SS segment limit.

#UD - If EM in CR0 is set. If OSFXSR in CR4 is 0 (128-bit operations only). If CPUID feature flag SSE2 is 0 (128-bit operations only).

#NM - If TS in CR0 is set.

#MF (64-bit operations only.) - If there is a pending x87 FPU exception.

#PF(fault-code) - If a page fault occurs.

#AC(0) (64-bit operations only.) - If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
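The 16-byte alignment requirement for 128-bit memory operands can be respected from C along the following lines. This is a sketch only, assuming an x86 target with SSE2. The helpers is_aligned_16, load_128, and add_words_8 are hypothetical names, and whether the compiler folds a load into the memory-operand form of PADDW is its own decision.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: use the aligned load only when the address allows
   it, and fall back to the unaligned load otherwise. */
static __m128i load_128(const void *p)
{
    return is_aligned_16(p) ? _mm_load_si128((const __m128i *)p)
                            : _mm_loadu_si128((const __m128i *)p);
}

/* Hypothetical wrapper: lane-wise 16-bit add of two 8-element arrays
   that may or may not be 16-byte aligned. */
static void add_words_8(const uint16_t *a, const uint16_t *b, uint16_t out[8])
{
    _mm_storeu_si128((__m128i *)out,
                     _mm_add_epi16(load_128(a), load_128(b)));
}
```

In practice, declaring the buffers with 16-byte alignment and using the aligned load throughout is usually simpler than testing each address at run time.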

Real-Address Mode Exceptions

#GP(0) - If a memory operand is not aligned on a 16-byte boundary, regardless of segment (128-bit operations only). If any part of the operand lies outside of the effective address space from 0 to FFFFH.

#UD - If EM in CR0 is set. If OSFXSR in CR4 is 0 (128-bit operations only). If CPUID feature flag SSE2 is 0 (128-bit operations only).

#NM - If TS in CR0 is set.

#MF (64-bit operations only.) - If there is a pending x87 FPU exception.

Virtual-8086 Mode Exceptions

Same exceptions as in Real-Address Mode, plus the following:

#PF(fault-code) - For a page fault.

#AC(0) (64-bit operations only.) - If alignment checking is enabled and an unaligned memory reference is made.

For details, see Volume 2A and Volume 2B of the Intel(R) 64 and IA-32 Architectures Software Developer's Manual. For the latest updates to the instruction set information, see the Intel web site.