
Our docker file for running some ML code at work is 6GiB. That does not include the model files. What the fuck, Nvidia? Am I downloading thousands of permutations of generated code for things I’ll never use?


If you look at the build, yeah, it includes everything and the kitchen sink. No one bothers to pare it down because in most cases the big GPU servers running this have plenty of disk space, and since it's usually a long-running container, the pull time isn't considered a big enough problem to be worth fixing.

Prime example of "Hardware is cheap and inefficiencies be damned"


If only it were viable to analyze which files get used. Then you could cut the image down to just what's needed.
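
It's viable, at least roughly. A minimal sketch of the idea, assuming strace is installed in the container and that "python train.py" is a stand-in for whatever the image actually runs: trace the workload, collect every file it successfully opens, and use that list to decide what a slimmer image would need to keep.

    import re
    import subprocess

    WORKLOAD = ["python", "train.py"]  # hypothetical entrypoint

    # Trace successful openat() calls across the workload and its children.
    subprocess.run(
        ["strace", "-f", "-e", "trace=openat", "-o", "/tmp/trace.log", *WORKLOAD],
        check=True,
    )

    opened = set()
    pat = re.compile(r'openat\([^,]+, "([^"]+)"')
    with open("/tmp/trace.log") as log:
        for line in log:
            if "= -1" in line:  # skip failed opens (ENOENT and friends)
                continue
            if (m := pat.search(line)):
                opened.add(m.group(1))

    # The sorted list is a starting point for a trimmed image.
    for path in sorted(opened):
        print(path)

The catch is that anything dlopen'd lazily on a code path the trace never hits gets missed, which is part of why nobody trusts this enough to ship it.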


I can show you some of the big, unnecessary files: all the .a files in /usr/local/cuda* (unless you're building software inside your container). That's, IIRC, at least a gig.
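
For reference, a sketch of stripping them out, assuming the usual /usr/local/cuda-* layout and that nothing gets compiled inside the container afterwards:

    import glob
    import os

    freed = 0
    for lib in glob.glob("/usr/local/cuda*/**/*.a", recursive=True):
        freed += os.path.getsize(lib)
        os.remove(lib)

    print(f"freed {freed / 2**30:.2f} GiB of static libraries")

One caveat: if the .a files come from the base image, deleting them in a later layer hides them but doesn't shrink the download; you need a multi-stage build (or a squashed image) for the savings to be real.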


Look at the stargz snapshotter if you have access to deploy it. It pulls small blobs of the image as they're used by the host, and it tries to use heuristics to put startup data in the initial blob.

https://github.com/containerd/stargz-snapshotter


As far as I understand, one of the things it includes literally is the permutations of generated code compiled for every different model of supported Nvidia hardware; that is a major (and desirable!) part of the driver+CUDA deployment.
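
You can see this from inside the image if it ships PyTorch: torch.cuda.get_arch_list() reports which compute capabilities the bundled kernels were compiled for (a quick check, assuming a CUDA-enabled torch build):

    import torch

    # Each entry is a separately compiled copy of every kernel,
    # all baked into the same fat binaries.
    print(torch.cuda.get_arch_list())  # e.g. ['sm_50', 'sm_60', ..., 'sm_90']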


If you're using C++, clang and boost are going to take up a substantial portion of that 6 GB.


*Docker image, not Dockerfile


Spoiler: they're inlining the weights with a heredoc.



