
While modern embeddings are indeed more robust:

All embeddings are the first layer of a DNN. In the case of word2vec, this is a shallow 2-layer network. Selecting an embedding is a multiplication of the embedding matrix by a one-hot vector, which is usually optimized as an array lookup.
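A minimal NumPy sketch of that equivalence, using a toy vocabulary size and embedding dimension chosen for illustration:

```python
import numpy as np

# Toy sizes: vocabulary of 5 words, 3-dimensional embeddings (illustrative only).
V, d = 5, 3
rng = np.random.default_rng(0)
E = rng.standard_normal((V, d))  # embedding matrix: one row per word

word_id = 2
one_hot = np.zeros(V)
one_hot[word_id] = 1.0

# Multiplying the one-hot vector by the embedding matrix...
via_matmul = one_hot @ E
# ...selects exactly one row, so frameworks optimize it as an array lookup.
via_lookup = E[word_id]

assert np.allclose(via_matmul, via_lookup)
```

Because the one-hot vector zeroes out every row but one, the matmul and the row index return the same vector.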



That's not "all embeddings"; that's just implementations like word2vec/fastText. And even though they are fast, they don't capture context as well and require significant preprocessing (e.g. stemming and stop-word removal).
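A rough sketch of the kind of preprocessing meant here; the stop-word list and suffix rules are hypothetical toy stand-ins (a real pipeline would use something like a Porter stemmer and a full stop list):

```python
# Illustrative stop list -- not a complete one.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of"}

def crude_stem(token: str) -> str:
    # Strip a few common English suffixes (a toy stand-in for real stemming).
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text: str) -> list[str]:
    tokens = text.lower().split()
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

print(preprocess("The embeddings are trained"))  # → ['embedding', 'train']
```

Static embeddings need this normalization because each surface form gets its own vector; contextual models tokenize into subwords and handle inflection themselves.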

Implementations that use an LLM require a full forward pass, but the many optimizations in inference speed make this unnoticeable for small applications.



