For production workloads, I generally agree. It's an unsupported hack with a questionable future; I wouldn't run anything money-making on it.
However, for tinkering and consumer workloads, it already works pretty well. Enough of cuDNN and cuBLAS works to run PyTorch, and in turn Stable Diffusion, via https://github.com/lshqqytiger/ZLUDA - there's even a fairly user-friendly setup process in https://github.com/vladmandic/automatic .
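Because ZLUDA presents the AMD GPU as an ordinary CUDA device, a stock PyTorch build can exercise it with no code changes. A minimal sanity-check sketch, assuming a Python environment where the ZLUDA libraries are already on the loader path (the tensor sizes are arbitrary):

    import torch

    # ZLUDA's shim answers the usual CUDA runtime queries, so stock
    # torch.cuda calls are all that's needed - nothing ZLUDA-specific here.
    assert torch.cuda.is_available(), "no CUDA device visible (ZLUDA not loaded?)"
    print(torch.cuda.get_device_name(0))  # reports the AMD card via the shim

    a = torch.randn(1024, 1024, device="cuda")
    b = torch.randn(1024, 1024, device="cuda")
    c = a @ b                 # matmul dispatched to the cuBLAS implementation
    torch.cuda.synchronize()  # wait for the kernel before reading results
    print(c.norm().item())

If that runs, most cuBLAS-backed PyTorch code has a decent chance of working too.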
I was able to get a personal non-ML project working on my AMD card in just a few minutes, which saved me a lot of development time before I deployed the production workload on NV hardware. (This is probably why AMD pulled the plug on the project: it's almost more of a boost to NV than anything else, and AMD really needs people writing code on ROCm so they'll deploy on AMD datacenter hardware.)
As for the Stable Diffusion use case, it's a silly edge case: it only exists because MIOpen (and therefore PyTorch-on-ROCm) doesn't work on Windows yet (it's slated to ship next month, I think).