Concatenate Intrinsics

Use the following SSSE3 intrinsics for concatenation.

 

extern __m128i _mm_alignr_epi8 (__m128i a, __m128i b, int n);

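Concatenates a (upper half) and b (lower half) into a 32-byte intermediate value, shifts it right by n bytes, and returns the low 16 bytes. n must be a compile-time constant.

Interpreting t1 as a 256-bit unsigned integer and a, b, and r as 128-bit unsigned integers: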

t1[255:128] = a;

t1[127:0] = b;

t1[255:0] = t1[255:0] >> (8 * n); // unsigned shift

r[127:0] = t1[127:0];
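
The pseudocode above can be sketched as a portable scalar model. This is only an illustration of the semantics, not the intrinsic itself: the helper name alignr_epi8_model is hypothetical, and it assumes each 128-bit value is stored as a 16-byte array in little-endian order (index 0 holds bits 7:0).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of any Intel header): scalar model of
   _mm_alignr_epi8. a, b and r are 16-byte arrays in little-endian
   order, so index 0 holds bits 7:0 of the 128-bit value. */
static void alignr_epi8_model(uint8_t r[16], const uint8_t a[16],
                              const uint8_t b[16], unsigned n)
{
    uint8_t t1[32];
    uint8_t out[16];

    memcpy(t1, b, 16);       /* t1[127:0]   = b */
    memcpy(t1 + 16, a, 16);  /* t1[255:128] = a */

    /* Shifting by 32 or more bytes clears every byte of the result. */
    if (n > 32)
        n = 32;

    /* An unsigned right shift by 8*n bits moves byte i+n down to byte i;
       bytes shifted in from beyond t1 are zero. */
    for (unsigned i = 0; i < 16; ++i) {
        unsigned src = i + n;
        out[i] = (src < 32) ? t1[src] : 0;
    }

    /* Copy through a temporary so r may alias a or b. */
    memcpy(r, out, 16);
}
```

In this model n = 0 returns b unchanged, n = 16 returns a, and n >= 32 returns all zeros. With the intrinsic itself, the equivalent call would be _mm_alignr_epi8(a, b, n) with the SSSE3 header tmmintrin.h included.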

 

extern __m64 _mm_alignr_pi8 (__m64 a, __m64 b, int n);

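Concatenates a (upper half) and b (lower half) into a 16-byte intermediate value, shifts it right by n bytes, and returns the low 8 bytes. n must be a compile-time constant.

Interpreting t1 as a 128-bit unsigned integer and a, b, and r as 64-bit unsigned integers: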

t1[127:64] = a;

t1[63:0] = b;

t1[127:0] = t1[127:0] >> (8 * n); // unsigned shift

r[63:0] = t1[63:0];
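
The 64-bit variant can be modeled the same way with plain integer arithmetic. This is a minimal sketch under the assumption that uint64_t stands in for __m64; the helper name alignr_pi8_model is hypothetical and does not appear in any Intel header.

```c
#include <stdint.h>

/* Hypothetical helper: scalar model of _mm_alignr_pi8, with uint64_t
   standing in for __m64. The 128-bit intermediate t1 is never built;
   each case computes the low 64 bits of (a:b) >> (8 * n) directly. */
static uint64_t alignr_pi8_model(uint64_t a, uint64_t b, unsigned n)
{
    if (n == 0)
        return b;                                   /* no shift: r = b */
    if (n < 8)
        return (b >> (8 * n)) | (a << (64 - 8 * n)); /* bytes from both */
    if (n == 8)
        return a;                                   /* exactly a        */
    if (n < 16)
        return a >> (8 * (n - 8));                  /* only bytes of a  */
    return 0;                                       /* all shifted out  */
}
```

The separate cases for n == 0 and n == 8 avoid shifting a 64-bit value by 64 bits, which is undefined behavior in C.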