TL;DR: They want to squeeze every drop of performance out of the CPU when processing media, and maintaining a mixture of intrinsics code and assembly is not worth the trade-off when doing 100% assembly offers better performance guarantees, readability, and ease of maintenance and developer onboarding.
Intrinsics have the disadvantages of asm (non-portable) but don't reliably have its advantages (compilers are pretty unpredictable about optimizing them), and they're ugly (especially x86 with its weird Hungarian-notation names).
There is just a little bit of intrinsics code in ffmpeg, which I wrote, that does memory copies.
It's written that way because we didn't want to hide the memory accesses from the compiler: that hurts optimization, and it also blinds memory tools like ASan.
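For anyone wondering what that looks like in practice, here is a rough sketch of the idea, not the actual ffmpeg code. The helper name `copy_bytes` and the choice of SSE2 are my assumptions for illustration. The point is that intrinsic loads and stores are ordinary memory accesses as far as the compiler is concerned, so it can optimize around them and, as far as I understand, ASan can instrument them, which is generally not true of accesses buried in hand-written asm.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper for illustration only; not ffmpeg's implementation. */
static void copy_bytes(uint8_t *dst, const uint8_t *src, size_t n)
{
    size_t i = 0;
#if defined(__SSE2__)
    /* 16 bytes per iteration. The unaligned load/store intrinsics are
       visible to the compiler as regular memory accesses. */
    for (; i + 16 <= n; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        _mm_storeu_si128((__m128i *)(dst + i), v);
    }
#endif
    /* Scalar tail, and the whole copy on targets without SSE2. */
    for (; i < n; i++)
        dst[i] = src[i];
}
```

Building something like this with `-fsanitize=address` and passing a too-short buffer should get the overrun reported at the offending access, whereas the same loop in a separate .asm file would run uninstrumented.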
Intrinsics have the huge advantage of enabling wrapper functions, which hide the ugly names and let you write user code only once, so that it is even portable (or at least multi-platform).
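To make the wrapper idea concrete, here is a minimal sketch. All the `vec_*` names and `add_saturate_u8` are made up for this example; only the underlying SSE2 and NEON intrinsics are real. Each platform gets a thin layer of inline wrappers, and the user code at the bottom is written once against those.

```c
#include <stddef.h>
#include <stdint.h>

/* Thin per-platform wrappers (hypothetical names) over the real intrinsics. */
#if defined(__SSE2__)
#include <emmintrin.h>
typedef __m128i vec_u8x16;
static inline vec_u8x16 vec_load(const uint8_t *p) { return _mm_loadu_si128((const __m128i *)p); }
static inline void vec_store(uint8_t *p, vec_u8x16 v) { _mm_storeu_si128((__m128i *)p, v); }
static inline vec_u8x16 vec_adds(vec_u8x16 a, vec_u8x16 b) { return _mm_adds_epu8(a, b); }
#elif defined(__ARM_NEON)
#include <arm_neon.h>
typedef uint8x16_t vec_u8x16;
static inline vec_u8x16 vec_load(const uint8_t *p) { return vld1q_u8(p); }
static inline void vec_store(uint8_t *p, vec_u8x16 v) { vst1q_u8(p, v); }
static inline vec_u8x16 vec_adds(vec_u8x16 a, vec_u8x16 b) { return vqaddq_u8(a, b); }
#else
/* Plain C fallback so the same user code still builds everywhere. */
typedef struct { uint8_t b[16]; } vec_u8x16;
static inline vec_u8x16 vec_load(const uint8_t *p)
{
    vec_u8x16 v;
    for (int i = 0; i < 16; i++) v.b[i] = p[i];
    return v;
}
static inline void vec_store(uint8_t *p, vec_u8x16 v)
{
    for (int i = 0; i < 16; i++) p[i] = v.b[i];
}
static inline vec_u8x16 vec_adds(vec_u8x16 a, vec_u8x16 b)
{
    vec_u8x16 r;
    for (int i = 0; i < 16; i++) {
        unsigned s = (unsigned)a.b[i] + b.b[i];
        r.b[i] = (uint8_t)(s > 255 ? 255 : s);
    }
    return r;
}
#endif

/* User code, written once: saturating add of two byte arrays. */
static void add_saturate_u8(uint8_t *dst, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16)
        vec_store(dst + i, vec_adds(vec_load(a + i), vec_load(b + i)));
    for (; i < n; i++) {            /* scalar tail */
        unsigned s = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(s > 255 ? 255 : s);
    }
}
```

Whether the compiler turns this into code as good as hand-written asm is exactly what the rest of this thread argues about; the sketch only shows that the naming and portability complaints can be papered over.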
Good point about ASan and other instrumentation :) Hm, I'd think that is very important for codecs in particular?
Well, that was more true when you had to care about the 8 registers of x86, CPUs were only about 2-4 wide, and codecs preferred to operate on 8x8 blocks and a single bit depth.
Nowadays the impact of a compiler's suboptimal register allocation and address calculations is almost unmeasurable, given 16/32 available registers and CPUs that are 8-10 wide in the frontend but have only 3-4 vector units in the backend. But the added complexity of newer codecs has strained the nasm/gas macro systems to the point where the resulting code is far less readable or maintainable than intrinsics. Think of how unmaintainable complex C macros are, and double that.
And it's not uncommon to find asm in ffmpeg or related projects written suboptimally in a way a compiler wouldn't, either because the author didn't fully read/understand CPU performance manuals or because rewriting/twisting the existing macros to fix a small suboptimality is more work than it's worth.
(yes, I have written some asm for ffmpeg in the past)