
Enjoyed the insight, but the title makes my eye twitch. How about "LLM weights are pieces of history"?


Small LLM weights are not really interesting, though. I am currently training GPT-2-small-sized models for a scientific project right now, and their world models are just not good enough to generate any kind of real insight about the world they were trained on, beyond corpus biases.
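To make the point concrete, here's a minimal sketch of the kind of probing I mean (assuming the Hugging Face transformers package and the public "gpt2" 124M checkpoint; the prompts are just made-up examples): sample completions and you mostly see corpus statistics rather than a coherent world model.

    # Sketch: sample from GPT-2 small and inspect the completions.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")  # 124M-parameter "small"

    prompts = [
        "The capital of France is",
        "In 1923, the president of the United States was",
    ]
    for p in prompts:
        out = generator(p, max_new_tokens=20, do_sample=True, top_p=0.9)
        print(p, "->", out[0]["generated_text"])

The completions tend to read fluently but drift into generic or biased continuations, which is what I mean by the model reflecting its corpus rather than the world.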


Small large language models? This sounds like the apocryphal headline when a spiritualist with dwarfism escaped prison: "Small medium at large." Do you also have some dehydrated water and a secure key escrow system?


A collection of newspapers is generally a better source than a single leaflet, but even a leaflet is a piece of history.



