All embeddings are the first layer of a DNN. In the case of word2vec, that network is a shallow two-layer one. Selecting an embedding amounts to multiplying the embedding matrix by a one-hot vector, which is usually optimized away as an array lookup.
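To make that concrete, here's a minimal NumPy sketch (illustrative only, with made-up toy dimensions) showing that the one-hot multiplication and the array lookup give the same row of the embedding matrix:

    # Illustrative sketch: embedding "selection" as one-hot matmul vs. row lookup.
    import numpy as np

    vocab_size, embed_dim = 10_000, 300              # assumed toy sizes
    rng = np.random.default_rng(0)
    E = rng.standard_normal((vocab_size, embed_dim))  # embedding matrix (first-layer weights)

    token_id = 42
    one_hot = np.zeros(vocab_size)
    one_hot[token_id] = 1.0

    via_matmul = one_hot @ E    # multiplication by a one-hot vector
    via_lookup = E[token_id]    # what frameworks actually do

    assert np.allclose(via_matmul, via_lookup)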
That's not "all embeddings", that's just implementations like word2vec/fastText. And even though they are fast, neither captures context well, and both require significant preprocessing (e.g. stemming and stop-word removal).
Implementations that use an LLM require a full forward pass, but the many optimizations in inference speed make the cost barely noticeable for small applications.
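For example, here's a rough sketch assuming the sentence-transformers package and its 'all-MiniLM-L6-v2' model; each encode() call runs a full transformer forward pass over the text, yet is fast enough for small workloads:

    # Sketch, assuming sentence-transformers is installed and the model is available.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    # A full forward pass per sentence, producing a context-aware embedding.
    embeddings = model.encode(["a contextual embedding needs the whole sentence"])
    print(embeddings.shape)  # (1, 384) for this particular model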