Intel has released version 1.27 of ISPC (the Intel Implicit SPMD Program Compiler), which adds full support for the AVX10.2 instruction set. The release also improves element-wise functions on short vectors, cross-lane operations on unsigned types, and dot product operations. On the performance side, masked load/store operations are approximately 10x faster on AVX-512 targets. On AVX2 targets, the packed_store_active2 standard library function is about 65% faster for int32 types and about 45% faster for int64 types.
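
For readers unfamiliar with packed stores, the following is a minimal scalar sketch in C++ of what a packed (compacting) store computes: the values held by active lanes are written contiguously to the output, and the number of values written is returned. The function name `packed_store_active_ref`, its signature, and the use of `std::vector` are assumptions made for illustration. This is not ISPC's implementation, and it does not model any differences between packed_store_active and packed_store_active2.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference for a packed (compacting) store.
// Appends the value of every lane whose mask entry is true to `out`,
// preserving lane order, and returns how many values were appended.
std::size_t packed_store_active_ref(const std::vector<std::int32_t>& lanes,
                                    const std::vector<bool>& mask,
                                    std::vector<std::int32_t>& out) {
    std::size_t count = 0;
    // Only lanes that have both a value and a mask entry are considered.
    for (std::size_t i = 0; i < lanes.size() && i < mask.size(); ++i) {
        if (mask[i]) {
            out.push_back(lanes[i]);  // active lane: store contiguously
            ++count;
        }
    }
    return count;
}
```

A vectorized compiler target performs the same compaction across all lanes of a SIMD register at once, which is why the cost of this operation depends heavily on the instructions available on the target.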
