1) that the aggregate savings from compressing the images need to outweigh the initial cost of distributing the decompressor.
2) to be lossless, decompression must be deterministic and unambiguous, so you can't compress _everything_ down to zero bits; you can compress only _one_ thing down to zero bits, because otherwise you wouldn't be able to unambiguously determine which thing is represented by your zero bits (see the counting sketch below).
You have to convey which algorithm to use, which takes bits. And at the very least you need a pointer to a file, which also takes bits. You’d do well to look for archives of alt.comp.compression.
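To make that counting argument concrete, here's a toy sketch (my own illustration, nothing from the paper): for n-bit inputs there are 2^n possible originals but only 2^n - 1 strictly shorter bitstrings, and exactly one of those has zero bits, so a deterministic decompressor can map at most one original to zero bits.

```python
from itertools import product

# Toy counting check: 2**n originals of length n, but only 2**n - 1 strictly
# shorter bitstrings, of which exactly one is the empty (zero-bit) string.
# A lossless decompressor maps each compressed string to a single original,
# so at most one original can ever be "compressed" down to zero bits.
def shorter_strings(n):
    return [''.join(bits) for k in range(n) for bits in product('01', repeat=k)]

for n in range(1, 10):
    shorter = shorter_strings(n)
    assert len(shorter) == 2 ** n - 1               # one output short of covering every input
    assert sum(1 for s in shorter if s == '') == 1  # only one zero-bit string exists
```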
There was also a classic thread that surfaced recently on HN about a compression challenge in which someone disingenuously tried to compress a file of random data (incompressible by definition) by splitting it on a character and then deleting that character from each file. It was a simple algorithm that appeared to require fewer bits to encode. The problem is, all this person did was shift the bits into the filesystem’s metadata, which is not obvious from the command line. The final encoding ended up taking more bits once you take said metadata into account.
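The trick (and why it fails) is easy to reproduce. A rough sketch of the idea, reconstructed from memory rather than taken from that thread:

```python
import os

data = os.urandom(1 << 16)   # stand-in for the challenge file: random, so incompressible on average
delim = data[0:1]            # pick some byte value to split on

chunks = data.split(delim)   # the "trick": every occurrence of that byte is deleted
payload = sum(len(c) for c in chunks)
assert payload == len(data) - data.count(delim)   # the chunk bytes alone really are smaller...

# ...but rejoining requires every boundary back. Stored as separate files, each
# chunk's length lives in filesystem metadata; that is exactly the information
# that was "deleted", and recording it costs at least as many bits.
assert delim.join(chunks) == data
```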
"I will custom write a new program to (somehow) generate each image and then distribute that instead of my image" is not a compression algorithm. But I think you'd do well over at the halfbakery.
I'm arguing it's the same as this image compression technique: both rely on a huge neural network which must exist wherever the image is to be decompressed.
If I'm allowed to bring along an unlimited amount of background data, then I can compress everything down to zero bits.
In contrast, an algorithm like LZ78 can be expressed in a five-line Python script and perform decently on a wide variety of data types.
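For reference, here's a rough LZ78 sketch of my own (a bit longer than five lines once decompression and the trailing-phrase edge case are included), just to show how little machinery it needs:

```python
# Each token is (index of the longest previously seen phrase, next byte);
# the phrase dictionary is rebuilt identically on both sides.
def lz78_compress(data: bytes):
    dictionary = {b"": 0}                 # phrase -> index; index 0 is the empty phrase
    phrase, out = b"", []
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            phrase = candidate            # keep extending the current match
        else:
            out.append((dictionary[phrase], byte))
            dictionary[candidate] = len(dictionary)
            phrase = b""
    if phrase:                            # leftover phrase is already in the dictionary
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

def lz78_decompress(tokens):
    phrases, out = [b""], bytearray()
    for index, byte in tokens:
        phrase = phrases[index] + bytes([byte])
        phrases.append(phrase)
        out.extend(phrase)
    return bytes(out)

assert lz78_decompress(lz78_compress(b"abracadabra abracadabra")) == b"abracadabra abracadabra"
```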
> If I'm allowed to bring along an unlimited amount of background data, then I can compress everything down to zero bits.
If by "background data" you mean the decompressor, this is patently false. No matter how much information is contained in the decompressor (The Algorithm + stable weights that don't change), you can only compress one thing down to any given new representation ( low resolution image + differential from rescale using stable weights ).
If by "background data" you mean new data that the decompressor doesn't already have, then you're ignoring the definition of compression. Your compressed data is all bits sent on the fly that aren't already possessed by the side doing the decompression regardless of obtuse naming scheme.
> I'm arguing it's the same as this image compression technique.
That's wrong, because this scheme doesn't claim to send a custom image generator instead of each image, which is what you're proposing.
Unless it's overfit on some particular inputs, but if so, it's bad science.
Ideally they would have trained the network on a collection of images that doesn't overlap with their test set, but if they did that, I don't see it mentioned in the paper.
The model is only 15.72 MiB (after compression with xz), so it would amortize pretty quickly... even if it was trained on the input, it looks like it may still be pretty competitive at a fairly modest collection size.
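Back-of-the-envelope, with a made-up number (the per-image savings below is purely an assumption for illustration; only the 15.72 MiB figure comes from above): if the scheme saved around 50 KiB per image over a conventional codec, the one-time model cost would break even after a few hundred images.

```python
# Hypothetical break-even sketch; only the 15.72 MiB model size is from the
# thread, the per-image savings figure is an assumed number for illustration.
model_cost_bytes = 15.72 * 1024 * 1024    # one-time cost of shipping the decompressor
assumed_savings_per_image = 50 * 1024     # assumption: ~50 KiB saved per image vs. a normal codec

break_even = model_cost_bytes / assumed_savings_per_image
print(f"breaks even after ~{break_even:.0f} images")   # ~322 images
```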