This is a fascinating idea. Have Stable Diffusion generate an image from a random seed, feed the output to an adversarial network that compares the source image (the one you'd like to "compress") against the output and scores the match, then try again with a new seed.
After running for a while, you keep the best-scoring seed, and you now have a few characters representing a reasonable approximation of your image.
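The search loop above can be sketched in a few lines. This is only a toy: `generate()` here is a deterministic pseudo-random vector standing in for a Stable Diffusion call, and plain MSE stands in for the adversarial scoring network, just to show the shape of the seed search.

```python
import random

def generate(seed: int, n: int = 64) -> list[float]:
    # Stand-in for Stable Diffusion: a deterministic pseudo-random
    # vector derived from the seed.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

def score(candidate: list[float], target: list[float]) -> float:
    # Lower is better; MSE stands in for the adversarial similarity score.
    return sum((c - t) ** 2 for c, t in zip(candidate, target)) / len(target)

def compress(target: list[float], n_seeds: int = 1000) -> int:
    # The "compressed" representation is just the best-matching seed.
    return min(range(n_seeds),
               key=lambda s: score(generate(s, len(target)), target))

target = generate(42)      # pretend this is the image we want to compress
best_seed = compress(target)
print(best_seed)           # the search recovers seed 42 exactly here
```

With a real diffusion model you'd never get an exact match like this toy does; the interesting question is how close the best of N seeds can get, and whether guided search beats brute force.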
I expect whatever comes after JPEG XL will be a neural-network-based compression scheme, where the client has an n-GB neural net attached. Several such schemes already show promising results (it's likely to be more of a standards issue than a technical issue).
In the '80s there was a man (I forget his name) who claimed that one day you could store an entire high-res movie on a floppy disc. He might yet be right, once AI can regenerate images/video from sequences of seeds. You just need a petabyte of models stored somewhere.