Our Docker image for running some ML code at work is 6GiB. That does not include the model files. What the fuck, Nvidia? Am I downloading thousands of permutations of generated code for things I’ll never use?
If you look at the build, yeah, it includes everything and the kitchen sink. No one bothers to pare it down because the big GPU servers running this usually have plenty of disk space, and since it's typically a long-running container, the pull time isn't considered a big enough problem to fix.
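If you're curious where the bulk actually lives, something like this will show it (the image tag here is just an example, sub in your own):

    # per-layer sizes
    docker history --no-trunc nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04

    # biggest directories inside the image
    docker run --rm nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 \
      du -h -d1 /usr/local | sort -h

In my experience /usr/local/cuda is usually the standout.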
Prime example of "Hardware is cheap and inefficiencies be damned"
I can show you some of the big, unnecessary files: all the .a files in /usr/local/cuda* (unless you're building software inside your container). That's, IIRC, at least a gig.
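You can measure that yourself; rough sketch (example tag again, paths may differ per image):

    # total size of the CUDA static libraries in a devel image
    docker run --rm nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 \
      sh -c "find /usr/local/cuda* -name '*.a' -exec du -ch {} + | tail -1"

One caveat if you strip them: deleting files in a later Dockerfile layer doesn't shrink the image, since the bytes stay in the earlier layer. The delete has to happen in the same RUN that installed the toolkit, or in a multi-stage build.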
Look at the stargz snapshotter if you have access to deploy it. It pulls small blobs of the image as they're used by the host, and it tries to use heuristics to put startup data in the initial blob.
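To take advantage of it, the image has to be converted to the eStargz format first; a minimal sketch with nerdctl (registry and image names are placeholders):

    # re-pack an existing image so layers can be pulled lazily
    nerdctl image convert --estargz --oci \
      registry.example.com/ml-app:latest \
      registry.example.com/ml-app:esgz

The project's ctr-remote tool can also profile a trial run of the container and pack the files touched at startup toward the front, which is where the heuristics come in. The host side needs containerd with the stargz snapshotter plugged in.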
As far as I understand, one of the things it includes literally is the permutations of generated code for every different model of supported Nvidia hardware; that is a major (and desirable!) part of the driver+CUDA deployment.
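You can see those permutations directly: the CUDA libraries are fat binaries carrying compiled code for each supported GPU generation, plus PTX for forward compatibility. A quick look, assuming the CUDA toolkit is installed:

    # list the per-architecture code objects embedded in cuBLAS
    cuobjdump --list-elf /usr/local/cuda/lib64/libcublas.so

    # and on the build side, every -gencode flag adds another copy
    nvcc -gencode arch=compute_70,code=sm_70 \
         -gencode arch=compute_80,code=sm_80 \
         -gencode arch=compute_90,code=sm_90 \
         kernel.cu -o kernel

Multiply that across cuBLAS, cuDNN, cuFFT, etc. and the gigabytes add up fast.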