> You can link the exact same intrinsics in a rust binary to get the same intrinsics.
Intrinsics are not library functions. You don’t link them anywhere. They’re processed by the compiler, not the linker, and for SIMD math each one usually becomes a single instruction. Linked functions are too slow for that.
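To be concrete, here’s a minimal sketch (the exact output depends on compiler and flags, but the point is there’s no call anywhere):

    #include <xmmintrin.h>

    /* With optimizations on, this typically compiles to a single
       addps instruction plus the return -- the compiler expands
       the intrinsic in place, no library call involved. */
    __m128 add4(__m128 a, __m128 b) {
        return _mm_add_ps(a, b);
    }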
> This is nothing about rust or not.
When I code C and write y=_mm_rsqrt_ps(x) I know I’ll get my rsqrtps instruction. When I write y=_mm_div_ps(_mm_set1_ps(1), _mm_sqrt_ps(x)) I know I’ll get the slower, more precise version. I don’t want the compiler to choose one for me while converting a formula into machine code.
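Side by side, as a sketch:

    #include <xmmintrin.h>

    /* Fast version: a single rsqrtps instruction, accurate to
       roughly 12 bits (max relative error 1.5 * 2^-12). */
    __m128 rsqrt_fast(__m128 x) {
        return _mm_rsqrt_ps(x);
    }

    /* Precise version: sqrtps followed by divps -- full single
       precision, but considerably higher latency. */
    __m128 rsqrt_precise(__m128 x) {
        return _mm_div_ps(_mm_set1_ps(1.0f), _mm_sqrt_ps(x));
    }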
Sorry to disappoint, but Rust can’t include C++ headers. Even if it could, they wouldn’t work, because intrinsics are not library functions.
> See the explicit section of the faster project.
These aren’t C intrinsics, they are library functions exported from the stdsimd crate, which in turn forwards them to LLVM. That requires Rust nightly. Also, I’m not sure that many levels of indirection are good for performance. You usually want these m128/m256 values to stay in registers. In C++, I sometimes have to write __forceinline to achieve that, or the compiler breaks performance by making function calls or referencing RAM.
It looks like significant overhead over C intrinsics: two calls to transmute() for every instruction, plus other calls for every instruction, stuff like as_i32x4.
It’s technically possible that every last one of them compiles into nothing at all and emits just the single desired instruction. I don’t believe these optimizations are 100% reliable, however. They aren’t reliable in clang or VC++; I sometimes have to use trickery to force those compilers to inline stuff, keep data in registers instead of loads/stores, and otherwise not screw up the performance.
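The kind of trickery I mean looks like this (a sketch; FORCE_INLINE is just my own macro name here):

    #include <xmmintrin.h>

    #ifdef _MSC_VER
    #define FORCE_INLINE __forceinline
    #else
    #define FORCE_INLINE inline __attribute__((always_inline))
    #endif

    /* Forcing the inline keeps v in an xmm register at the call
       site; left as an out-of-line function, the calling
       convention may bounce the value through memory. */
    static FORCE_INLINE __m128 scale(__m128 v, float s) {
        return _mm_mul_ps(v, _mm_set1_ps(s));
    }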
Right, I know there’s some support.
> You can use intrinsics from a lot of languages.
Yes. However, Intel (the guys making the CPUs that actually implement these instructions) only supports them for C/C++. Just because you can use them from other languages (e.g. modern .NET has them as well, in System.Numerics.Vectors) doesn’t necessarily mean it’s a good idea to do so.
> In fact, in Rust, they are easier to use.
That’s not “in fact”, that’s your opinion. Personally, I don’t think simpler is automatically better.
When I code at that level of abstraction, I want to get whatever instructions are implemented by the CPU. No more, no less.
I’ve looked at the example on the front page. There are two ways to compute rsqrt in SSE/AVX: a fast approximate one (rsqrtps) and a precise one (sqrtps + divps). There are several ways to compute ceil/floor, again with different tradeoffs. Do you know which instructions their example compiles into? Neither do I.
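To show what I mean by floor tradeoffs, a sketch of two of the options:

    #include <emmintrin.h>  /* SSE2 */
    #include <smmintrin.h>  /* SSE4.1 */

    /* With SSE4.1: one exact roundps instruction. */
    __m128 floor_sse41(__m128 x) {
        return _mm_floor_ps(x);
    }

    /* SSE2-only fallback: truncate toward zero through an int
       conversion, then subtract 1 where truncation went the wrong
       way. Only valid while values fit in a 32-bit integer. */
    __m128 floor_sse2(__m128 x) {
        __m128 t = _mm_cvtepi32_ps(_mm_cvttps_epi32(x));
        __m128 m = _mm_cmplt_ps(x, t);  /* all-ones where x < t */
        return _mm_sub_ps(t, _mm_and_ps(m, _mm_set1_ps(1.0f)));
    }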
Also, one tricky part of CPU SIMD is cross-lane operations (shuffle, movelh/movehl, unpack, etc.). Another is integers: the instruction set doesn’t correspond to any programming language. There are saturated versions of + and -, relatively high-level operations like psadbw, pmaddubsw and palignr, and gaps where something simple is missing (e.g. you can’t compare unsigned bytes for greater/less, only signed ones — see the workaround sketched below).
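The usual workaround for that last one, as a sketch: flip the sign bit of both operands, which maps unsigned order onto signed order, then use the signed compare that does exist.

    #include <emmintrin.h>

    /* SSE2 only has pcmpgtb, a signed byte compare. XORing both
       operands with 0x80 flips the sign bit, turning an unsigned
       greater-than into a signed one. */
    __m128i cmpgt_epu8(__m128i a, __m128i b) {
        const __m128i flip = _mm_set1_epi8((char)0x80);
        return _mm_cmpgt_epi8(_mm_xor_si128(a, flip),
                              _mm_xor_si128(b, flip));
    }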
For trivially simple algorithms that compute the same math over wide vectors of float values, you’re better off using OpenCL and running on the GPU. It will likely be faster.
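For comparison, that rsqrt example as an OpenCL C kernel (a sketch with the host setup code omitted; native_rsqrt is the fast approximate built-in, roughly analogous to rsqrtps):

    /* One work-item per element; the GPU supplies the parallel
       width instead of a fixed 4- or 8-wide register. */
    __kernel void rsqrt_all(__global const float* in,
                            __global float* out) {
        size_t i = get_global_id(0);
        out[i] = native_rsqrt(in[i]);
    }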