Integer Arithmetic Operations for Streaming SIMD Extensions 2

The integer arithmetic operations for Streaming SIMD Extensions 2 (SSE2) are listed in the following table, followed by their descriptions. The floating-point packed arithmetic intrinsics for SSE2 are listed in the Floating-point Arithmetic Operations topic.

The results of each intrinsic operation are placed in registers. The information about what is placed in each register appears in the tables below, in the detailed explanation of each intrinsic. R, R0, R1...R15 represent the registers in which results are placed.

The prototypes for SSE2 intrinsics are in the emmintrin.h header file.

Intrinsic         Operation                    Instruction
_mm_add_epi8      Addition                     PADDB
_mm_add_epi16     Addition                     PADDW
_mm_add_epi32     Addition                     PADDD
_mm_add_si64      Addition                     PADDQ
_mm_add_epi64     Addition                     PADDQ
_mm_adds_epi8     Addition                     PADDSB
_mm_adds_epi16    Addition                     PADDSW
_mm_adds_epu8     Addition                     PADDUSB
_mm_adds_epu16    Addition                     PADDUSW
_mm_avg_epu8      Computes Average             PAVGB
_mm_avg_epu16     Computes Average             PAVGW
_mm_madd_epi16    Multiplication and Addition  PMADDWD
_mm_max_epi16     Computes Maxima              PMAXSW
_mm_max_epu8      Computes Maxima              PMAXUB
_mm_min_epi16     Computes Minima              PMINSW
_mm_min_epu8      Computes Minima              PMINUB
_mm_mulhi_epi16   Multiplication               PMULHW
_mm_mulhi_epu16   Multiplication               PMULHUW
_mm_mullo_epi16   Multiplication               PMULLW
_mm_mul_su32      Multiplication               PMULUDQ
_mm_mul_epu32     Multiplication               PMULUDQ
_mm_sad_epu8      Computes Difference/Adds     PSADBW
_mm_sub_epi8      Subtraction                  PSUBB
_mm_sub_epi16     Subtraction                  PSUBW
_mm_sub_epi32     Subtraction                  PSUBD
_mm_sub_si64      Subtraction                  PSUBQ
_mm_sub_epi64     Subtraction                  PSUBQ
_mm_subs_epi8     Subtraction                  PSUBSB
_mm_subs_epi16    Subtraction                  PSUBSW
_mm_subs_epu8     Subtraction                  PSUBUSB
_mm_subs_epu16    Subtraction                  PSUBUSW

 

__m128i _mm_add_epi8(__m128i a, __m128i b)

Adds the 16 signed or unsigned 8-bit integers in a to the 16 signed or unsigned 8-bit integers in b.

R0 R1 ... R15
a0 + b0 a1 + b1 ... a15 + b15
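
The following is a minimal sketch of how _mm_add_epi8 might be used, not a definitive implementation. The helper name add_bytes_wrap is hypothetical. The unaligned load and store intrinsics (_mm_loadu_si128, _mm_storeu_si128) also come from emmintrin.h but are not described in this topic. Because the addition is not saturating, each lane wraps modulo 256 on overflow.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: adds two 16-byte arrays lane by lane.
   Each 8-bit lane wraps around on overflow (no saturation). */
void add_bytes_wrap(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}
```

The same bit pattern is produced whether the lanes are interpreted as signed or unsigned, which is why the description above says "signed or unsigned".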

 

__m128i _mm_add_epi16(__m128i a, __m128i b)

Adds the 8 signed or unsigned 16-bit integers in a to the 8 signed or unsigned 16-bit integers in b.

R0 R1 ... R7
a0 + b0 a1 + b1 ... a7 + b7

 

__m128i _mm_add_epi32(__m128i a, __m128i b)

Adds the 4 signed or unsigned 32-bit integers in a to the 4 signed or unsigned 32-bit integers in b.

R0 R1 R2 R3
a0 + b0 a1 + b1 a2 + b2 a3 + b3

 

__m64 _mm_add_si64(__m64 a, __m64 b)

Adds the signed or unsigned 64-bit integer a to the signed or unsigned 64-bit integer b.

R0
a + b

 

__m128i _mm_add_epi64(__m128i a, __m128i b)

Adds the 2 signed or unsigned 64-bit integers in a to the 2 signed or unsigned 64-bit integers in b.

R0 R1
a0 + b0 a1 + b1

 

__m128i _mm_adds_epi8(__m128i a, __m128i b)

Adds the 16 signed 8-bit integers in a to the 16 signed 8-bit integers in b using saturating arithmetic.

R0 R1 ... R15
SignedSaturate (a0 + b0) SignedSaturate (a1 + b1) ... SignedSaturate (a15 + b15)

 

__m128i _mm_adds_epi16(__m128i a, __m128i b)

Adds the 8 signed 16-bit integers in a to the 8 signed 16-bit integers in b using saturating arithmetic.

R0 R1 ... R7
SignedSaturate (a0 + b0) SignedSaturate (a1 + b1) ... SignedSaturate (a7 + b7)
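
As an illustration of signed saturation, the sketch below mixes two buffers of 16-bit samples. The function name mix_samples and the audio-mixing scenario are assumptions made for this example only. Sums outside the range -32768 to 32767 are clamped to the nearest limit instead of wrapping.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = saturate(x[i] + y[i]) for signed 16-bit data. */
void mix_samples(const int16_t *x, const int16_t *y, int16_t *out, size_t n)
{
    size_t i = 0;
    /* Process 8 samples per iteration. */
    for (; i + 8 <= n; i += 8) {
        __m128i vx = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i vy = _mm_loadu_si128((const __m128i *)(y + i));
        _mm_storeu_si128((__m128i *)(out + i), _mm_adds_epi16(vx, vy));
    }
    /* Scalar tail with the same clamping behavior. */
    for (; i < n; i++) {
        int32_t s = (int32_t)x[i] + y[i];
        if (s > INT16_MAX) s = INT16_MAX;
        if (s < INT16_MIN) s = INT16_MIN;
        out[i] = (int16_t)s;
    }
}
```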

 

__m128i _mm_adds_epu8(__m128i a, __m128i b)

Adds the 16 unsigned 8-bit integers in a to the 16 unsigned 8-bit integers in b using saturating arithmetic.

R0 R1 ... R15
UnsignedSaturate (a0 + b0) UnsignedSaturate (a1 + b1) ... UnsignedSaturate (a15 + b15)
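
A common use of unsigned saturating addition is brightening 8-bit image data, where a sum above 255 should stay at 255 rather than wrap to a small value. The sketch below assumes a hypothetical helper named brighten_pixels; it is one possible arrangement, not a prescribed one.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds `amount` to every byte, clamping at 255. */
void brighten_pixels(uint8_t *pixels, size_t n, uint8_t amount)
{
    __m128i vamount = _mm_set1_epi8((char)amount);  /* broadcast to 16 lanes */
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(pixels + i));
        _mm_storeu_si128((__m128i *)(pixels + i), _mm_adds_epu8(v, vamount));
    }
    /* Scalar tail for the remaining bytes. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)pixels[i] + amount;
        pixels[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```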

 

__m128i _mm_adds_epu16(__m128i a, __m128i b)

Adds the 8 unsigned 16-bit integers in a to the 8 unsigned 16-bit integers in b using saturating arithmetic.

R0 R1 ... R7
UnsignedSaturate (a0 + b0) UnsignedSaturate (a1 + b1) ... UnsignedSaturate (a7 + b7)

 

__m128i _mm_avg_epu8(__m128i a, __m128i b)

Computes the rounded average of the 16 unsigned 8-bit integers in a and the 16 unsigned 8-bit integers in b. One is added to each sum before it is halved, so a result ending in .5 rounds up.

R0 R1 ... R15
(a0 + b0 + 1) >> 1 (a1 + b1 + 1) >> 1 ... (a15 + b15 + 1) >> 1
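
The rounding behavior can be checked with a short sketch. The helper name average_bytes is hypothetical. Each lane is computed as (a + b + 1) >> 1 using an intermediate wide enough that the sum does not overflow, so averaging 255 with 255 yields 255.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: rounded average of two 16-byte arrays. */
void average_bytes(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_avg_epu8(va, vb));
}
```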

 

__m128i _mm_avg_epu16(__m128i a, __m128i b)

Computes the rounded average of the 8 unsigned 16-bit integers in a and the 8 unsigned 16-bit integers in b. One is added to each sum before it is halved, so a result ending in .5 rounds up.

R0 R1 ... R7
(a0 + b0 + 1) >> 1 (a1 + b1 + 1) >> 1 ... (a7 + b7 + 1) >> 1

 

__m128i _mm_madd_epi16(__m128i a, __m128i b)

Multiplies the 8 signed 16-bit integers from a by the 8 signed 16-bit integers from b. Adds the signed 32-bit integer results pairwise and packs the 4 signed 32-bit integer results.

R0 R1 R2 R3
(a0 * b0) + (a1 * b1) (a2 * b2) + (a3 * b3) (a4 * b4) + (a5 * b5) (a6 * b6) + (a7 * b7)
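
Because _mm_madd_epi16 multiplies and then adds adjacent pairs, it maps naturally onto a 16-bit dot product. The following sketch assumes a hypothetical helper named dot_product_i16 and assumes the total fits in a signed 32-bit integer; no overflow handling is attempted.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of two int16_t arrays.
   Assumes the result fits in 32 bits. */
int32_t dot_product_i16(const int16_t *x, const int16_t *y, size_t n)
{
    __m128i acc = _mm_setzero_si128();  /* four 32-bit partial sums */
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m128i vx = _mm_loadu_si128((const __m128i *)(x + i));
        __m128i vy = _mm_loadu_si128((const __m128i *)(y + i));
        /* Each 32-bit lane receives x[2k]*y[2k] + x[2k+1]*y[2k+1]. */
        acc = _mm_add_epi32(acc, _mm_madd_epi16(vx, vy));
    }
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    int32_t sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    /* Scalar tail for the remaining elements. */
    for (; i < n; i++)
        sum += (int32_t)x[i] * y[i];
    return sum;
}
```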

 

__m128i _mm_max_epi16(__m128i a, __m128i b)

Computes the pairwise maxima of the 8 signed 16-bit integers from a and the 8 signed 16-bit integers from b.

R0 R1 ... R7
max(a0, b0) max(a1, b1) ... max(a7, b7)

 

__m128i _mm_max_epu8(__m128i a, __m128i b)

Computes the pairwise maxima of the 16 unsigned 8-bit integers from a and the 16 unsigned 8-bit integers from b.

R0 R1 ... R15
max(a0, b0) max(a1, b1) ... max(a15, b15)

 

__m128i _mm_min_epi16(__m128i a, __m128i b)

Computes the pairwise minima of the 8 signed 16-bit integers from a and the 8 signed 16-bit integers from b.

R0 R1 ... R7
min(a0, b0) min(a1, b1) ... min(a7, b7)
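
Used together, _mm_max_epi16 and _mm_min_epi16 can clamp signed 16-bit values to a range. The sketch below is illustrative only; the helper name clamp_i16 is hypothetical, and it assumes lo is not greater than hi.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamps each element of v to [lo, hi] in place.
   Assumes lo <= hi. */
void clamp_i16(int16_t *v, size_t n, int16_t lo, int16_t hi)
{
    __m128i vlo = _mm_set1_epi16(lo);
    __m128i vhi = _mm_set1_epi16(hi);
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m128i x = _mm_loadu_si128((const __m128i *)(v + i));
        x = _mm_max_epi16(x, vlo);  /* raise values below lo */
        x = _mm_min_epi16(x, vhi);  /* lower values above hi */
        _mm_storeu_si128((__m128i *)(v + i), x);
    }
    /* Scalar tail for the remaining elements. */
    for (; i < n; i++) {
        if (v[i] < lo) v[i] = lo;
        if (v[i] > hi) v[i] = hi;
    }
}
```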

 

__m128i _mm_min_epu8(__m128i a, __m128i b)

Computes the pairwise minima of the 16 unsigned 8-bit integers from a and the 16 unsigned 8-bit integers from b.

R0 R1 ... R15
min(a0, b0) min(a1, b1) ... min(a15, b15)
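
One way to combine the unsigned 8-bit maxima and minima is to compute a per-lane absolute difference as max(a, b) - min(a, b). This is a sketch under that assumption, with a hypothetical helper name abs_diff_u8; since the maximum is never smaller than the minimum, the plain _mm_sub_epi8 cannot wrap here.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[i] = |a[i] - b[i]| for unsigned bytes. */
void abs_diff_u8(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i hi = _mm_max_epu8(va, vb);
    __m128i lo = _mm_min_epu8(va, vb);
    _mm_storeu_si128((__m128i *)out, _mm_sub_epi8(hi, lo));
}
```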

 

__m128i _mm_mulhi_epi16(__m128i a, __m128i b)

Multiplies the 8 signed 16-bit integers from a by the 8 signed 16-bit integers from b. Packs the upper 16 bits of the 8 signed 32-bit results.

R0 R1 ... R7
(a0 * b0)[31:16] (a1 * b1)[31:16] ... (a7 * b7)[31:16]

 

__m128i _mm_mulhi_epu16(__m128i a, __m128i b)

Multiplies the 8 unsigned 16-bit integers from a by the 8 unsigned 16-bit integers from b. Packs the upper 16 bits of the 8 unsigned 32-bit results.

R0 R1 ... R7
(a0 * b0)[31:16] (a1 * b1)[31:16] ... (a7 * b7)[31:16]

 

__m128i _mm_mullo_epi16(__m128i a, __m128i b)

Multiplies the 8 signed or unsigned 16-bit integers from a by the 8 signed or unsigned 16-bit integers from b. Packs the lower 16 bits of the 8 signed or unsigned 32-bit results.

R0 R1 ... R7
(a0 * b0)[15:0] (a1 * b1)[15:0] ... (a7 * b7)[15:0]
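
The high and low halves returned by _mm_mulhi_epi16 and _mm_mullo_epi16 can be recombined into full 32-bit signed products. The sketch below does this by interleaving the two results with _mm_unpacklo_epi16 and _mm_unpackhi_epi16, which are SSE2 intrinsics from emmintrin.h that are not covered in this topic. The helper name multiply_i16_to_i32 is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[i] = (int32_t)a[i] * b[i] for 8 signed 16-bit pairs. */
void multiply_i16_to_i32(const int16_t a[8], const int16_t b[8], int32_t out[8])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i lo = _mm_mullo_epi16(va, vb);  /* bits [15:0] of each product  */
    __m128i hi = _mm_mulhi_epi16(va, vb);  /* bits [31:16] of each product */
    /* Interleave low and high halves to rebuild the 32-bit products. */
    _mm_storeu_si128((__m128i *)out,       _mm_unpacklo_epi16(lo, hi));
    _mm_storeu_si128((__m128i *)(out + 4), _mm_unpackhi_epi16(lo, hi));
}
```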

 

__m64 _mm_mul_su32(__m64 a, __m64 b)

Multiplies the lower unsigned 32-bit integer from a by the lower unsigned 32-bit integer from b, and returns the unsigned 64-bit integer result.

R0
a0 * b0

 

__m128i _mm_mul_epu32(__m128i a, __m128i b)

Multiplies the 2 even-indexed unsigned 32-bit integers from a (elements 0 and 2) by the corresponding unsigned 32-bit integers from b. Packs the 2 unsigned 64-bit integer results. Elements 1 and 3 of a and b are ignored.

R0 R1
a0 * b0 a2 * b2
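
The element selection can be seen in a short sketch. The helper name multiply_u32_even_lanes is hypothetical. Only elements 0 and 2 of each input contribute; elements 1 and 3 do not affect the result.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: out[0] = a[0] * b[0], out[1] = a[2] * b[2],
   each computed as a full unsigned 64-bit product. */
void multiply_u32_even_lanes(const uint32_t a[4], const uint32_t b[4], uint64_t out[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mul_epu32(va, vb));
}
```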

 

__m128i _mm_sad_epu8(__m128i a, __m128i b)

Computes the absolute differences of the 16 unsigned 8-bit integers from a and the 16 unsigned 8-bit integers from b. Sums the upper 8 differences and the lower 8 differences, and packs the resulting 2 unsigned 16-bit sums into the low 16 bits of the upper and lower 64-bit elements. The remaining bits of each 64-bit element are zero.

R0 R1 R2 R3 R4 R5 R6 R7
abs(a0 - b0) + abs(a1 - b1) +...+ abs(a7 - b7) 0x0 0x0 0x0 abs(a8 - b8) + abs(a9 - b9) +...+ abs(a15 - b15) 0x0 0x0 0x0
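
To obtain a single sum of absolute differences over all 16 bytes, the two partial sums (R0 and R4 in the table above) can be extracted and added. The following sketch assumes a hypothetical helper named sum_abs_diff_16 and uses _mm_cvtsi128_si32 and _mm_extract_epi16, both SSE2 intrinsics from emmintrin.h that are not covered in this topic.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: sum of |a[i] - b[i]| over 16 unsigned bytes. */
uint32_t sum_abs_diff_16(const uint8_t a[16], const uint8_t b[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i sad = _mm_sad_epu8(va, vb);
    /* The lower sum sits in 16-bit element 0, the upper sum in element 4. */
    uint32_t low  = (uint32_t)_mm_cvtsi128_si32(sad);
    uint32_t high = (uint32_t)_mm_extract_epi16(sad, 4);
    return low + high;
}
```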

 

__m128i _mm_sub_epi8(__m128i a, __m128i b)

Subtracts the 16 signed or unsigned 8-bit integers of b from the 16 signed or unsigned 8-bit integers of a.

R0 R1 ... R15
a0 - b0 a1 - b1 ... a15 - b15

 

__m128i _mm_sub_epi16(__m128i a, __m128i b)

Subtracts the 8 signed or unsigned 16-bit integers of b from the 8 signed or unsigned 16-bit integers of a.

R0 R1 ... R7
a0 - b0 a1 - b1 ... a7 - b7

 

__m128i _mm_sub_epi32(__m128i a, __m128i b)

Subtracts the 4 signed or unsigned 32-bit integers of b from the 4 signed or unsigned 32-bit integers of a.

R0 R1 R2 R3
a0 - b0 a1 - b1 a2 - b2 a3 - b3

 

__m64 _mm_sub_si64(__m64 a, __m64 b)

Subtracts the signed or unsigned 64-bit integer b from the signed or unsigned 64-bit integer a.

R0
a - b

 

__m128i _mm_sub_epi64(__m128i a, __m128i b)

Subtracts the 2 signed or unsigned 64-bit integers in b from the 2 signed or unsigned 64-bit integers in a.

R0 R1
a0 - b0 a1 - b1

 

__m128i _mm_subs_epi8(__m128i a, __m128i b)

Subtracts the 16 signed 8-bit integers of b from the 16 signed 8-bit integers of a using saturating arithmetic.

R0 R1 ... R15
SignedSaturate (a0 - b0) SignedSaturate (a1 - b1) ... SignedSaturate (a15 - b15)

 

__m128i _mm_subs_epi16(__m128i a, __m128i b)

Subtracts the 8 signed 16-bit integers of b from the 8 signed 16-bit integers of a using saturating arithmetic.

R0 R1 ... R7
SignedSaturate (a0 - b0) SignedSaturate (a1 - b1) ... SignedSaturate (a7 - b7)

 

__m128i _mm_subs_epu8(__m128i a, __m128i b)

Subtracts the 16 unsigned 8-bit integers of b from the 16 unsigned 8-bit integers of a using saturating arithmetic.

R0 R1 ... R15
UnsignedSaturate (a0 - b0) UnsignedSaturate (a1 - b1) ... UnsignedSaturate (a15 - b15)
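
The difference between wraparound and saturating subtraction can be sketched by computing both for the same inputs. The helper name subtract_both_ways is hypothetical. Where b exceeds a, _mm_sub_epi8 wraps modulo 256, while _mm_subs_epu8 clamps the result to 0.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: computes a - b per byte in two ways.
   wrapped[i]   uses _mm_sub_epi8  (wraps modulo 256).
   saturated[i] uses _mm_subs_epu8 (clamps at 0). */
void subtract_both_ways(const uint8_t a[16], const uint8_t b[16],
                        uint8_t wrapped[16], uint8_t saturated[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)wrapped,   _mm_sub_epi8(va, vb));
    _mm_storeu_si128((__m128i *)saturated, _mm_subs_epu8(va, vb));
}
```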

 

__m128i _mm_subs_epu16(__m128i a, __m128i b)

Subtracts the 8 unsigned 16-bit integers of b from the 8 unsigned 16-bit integers of a using saturating arithmetic.

R0 R1 ... R7
UnsignedSaturate (a0 - b0) UnsignedSaturate (a1 - b1) ... UnsignedSaturate (a7 - b7)