If I'm being honest, I care more about multiple local AI apps on my desktop all hooking into the same Ollama instance than about each app downloading its own models, which leaves me with tens of GBs of repeated weights all over the place because the apps don't talk to each other.
I haven't used a local model in a while, but Ollama was the only one I've seen convert models into its own format (I think for deduplication; it stores weights as content-addressed blobs). You should be able to just download a GGUF file and point a bunch of frontends at that same file.
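For frontends that read a GGUF straight off disk, you can get that sharing today with symlinks: keep one copy of the weights and link each app's model directory at it. A minimal sketch, where the file name and app directories are made up and each app's model path is assumed to be configurable or at a known location:

```shell
#!/bin/sh
# One shared GGUF, many frontends -- via symlinks.
# All paths and app names below are hypothetical.
set -e

ROOT="$(mktemp -d)"                     # stand-in for $HOME
SHARED="$ROOT/models"                   # the single shared copy lives here
mkdir -p "$SHARED"
: > "$SHARED/llama-3-8b.Q4_K_M.gguf"    # placeholder for the real download

# Each frontend that insists on its own model directory gets a symlink
# instead of a second multi-GB copy of the weights.
for APP in app-one app-two; do
  mkdir -p "$ROOT/.$APP/models"
  ln -sf "$SHARED/llama-3-8b.Q4_K_M.gguf" "$ROOT/.$APP/models/"
done

ls -l "$ROOT/.app-one/models" "$ROOT/.app-two/models"
```

For apps that talk to Ollama over HTTP instead of reading files directly, the equivalent move is pointing them all at the same server (Ollama listens on http://localhost:11434 by default), so only one copy of the weights ever exists.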
What does it take for THAT to finally happen?