pub fn _mm512_castps512_ps128(a: __m512) -> __m128
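Casts a vector of type __m512 to type __m128, keeping the lower 128 bits (the four lowest f32 lanes) and discarding the rest. This intrinsic exists only to satisfy the compiler's type system and does not generate any instructions, so it has zero latency.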
Intel’s documentation