
> Nothing guarantees optimal placement, but all the mainstream schedulers attempt an approximation of it. The general assumption in mainstream schedulers is that servers are placement-sensitive, and batch jobs aren't.

I don't know the open source schedulers well, but modern Borg batch scheduling works quite differently from (my understanding of) your description. Batch jobs are still often placement-sensitive (needing to run near a large database, for example, for bandwidth reasons). The big distinction I see is that serving tasks get tight SLOs on scheduling latency, evictions/preemptions, and availability of resources on the machine they're scheduled on, while batch tasks basically don't. They can take a while to schedule, they can get preempted frequently, and they cram into the gap between the serving tasks' current usage and their limit.

E.g., if a serving job says it needs 10 cores but is only using 1, a batch job might use 8 of those and then just kinda suffer if the serving job starts using more than 2, because CPU is "compressible", or get evicted if things get really bad. In the same situation with RAM (mostly "incompressible"), the batch job gets evicted ASAP if the serving job needs the RAM, or the system involves some second-class RAM solution (cross-NUMA node, Optane, zramfs, RDMA, page to SSD, whatever). Batch doesn't get better service in any respect, but it's cheaper.
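To make that usage-vs-limit gap concrete, here's a minimal sketch (hypothetical types and thresholds, not Borg's actual data model) of how a node might expose slack to batch work and pick eviction victims under memory pressure:

```go
package main

import "fmt"

// Hypothetical shapes -- not Borg's actual data model.
type Task struct {
	Name        string
	Serving     bool
	CPULimit    float64 // cores reserved
	CPUUsage    float64 // cores actually in use right now
	RAMLimitGiB float64
	RAMUsageGiB float64
}

type Machine struct {
	CPUCores float64
	RAMGiB   float64
	Tasks    []Task
}

// CPU is "compressible": batch can be admitted against whatever is reserved
// but unused, and simply runs slower if serving ramps back up.
func (m *Machine) batchCPUSlack() float64 {
	used := 0.0
	for _, t := range m.Tasks {
		used += t.CPUUsage
	}
	return m.CPUCores - used
}

// RAM is "incompressible": once actual usage nears capacity, something has
// to be evicted rather than throttled.
func (m *Machine) ramPressure() bool {
	used := 0.0
	for _, t := range m.Tasks {
		used += t.RAMUsageGiB
	}
	return used > 0.9*m.RAMGiB // hypothetical pressure threshold
}

// Batch tasks absorb all the evictions; serving tasks keep their SLOs.
func (m *Machine) evictionCandidates() []string {
	var names []string
	for _, t := range m.Tasks {
		if !t.Serving {
			names = append(names, t.Name)
		}
	}
	return names
}

func main() {
	m := Machine{CPUCores: 16, RAMGiB: 64, Tasks: []Task{
		{Name: "frontend", Serving: true, CPULimit: 10, CPUUsage: 1, RAMLimitGiB: 40, RAMUsageGiB: 30},
		{Name: "mapreduce", Serving: false, CPULimit: 8, CPUUsage: 8, RAMLimitGiB: 30, RAMUsageGiB: 30},
	}}
	fmt.Printf("CPU slack available to batch: %.1f cores\n", m.batchCPUSlack())
	fmt.Printf("RAM pressure? %v -> evict %v first\n", m.ramPressure(), m.evictionCandidates())
}
```

The asymmetry is the whole point: overcommitted CPU degrades gracefully (batch just gets throttled), while overcommitted RAM has to be resolved by evicting the lowest tier.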

> As for cost: we rent out server space. We rack servers to keep up with customer load. The more load we have, the more money we're making. If we're racking a bunch of new servers in FRA, that means FRA is making us more money.

and in the article, you wrote:

> It was designed for a cluster where 0% utilization was better, for power consumption reasons, than < 40% utilization. Makes sense for Google. Not so much for us.

IIUC, you mean that whoever you're renting space from doesn't charge you by power usage, so you have no incentive to prefer fully packing 1 machine before scheduling something on every machine available. Spreading is fine. Makes sense economically (although I'm a little sad to read it environmentally because the power usage difference should still be real).

I think another aspect to consider is avoiding "stranded" resources: situations in which, say, a task that needs most/all of a machine's remaining RAM but very little CPU gets scheduled on a machine with a whole bunch of CPU available, effectively making that CPU unusable until something terminates. You've got headroom, but I presume that's sized from forecasted need, and if the forecast has to grow because some of those resources will still be stranded when the need arrives, the stranding is costing you real money.
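For what it's worth, one common mitigation is a scoring pass that penalizes placements which leave a machine's remaining resources lopsided (kube-scheduler's NodeResourcesBalancedAllocation plugin does something in this spirit). A toy sketch, with made-up types and numbers:

```go
package main

import (
	"fmt"
	"math"
)

// Made-up shapes and numbers, purely illustrative.
type Request struct{ CPU, RAMGiB float64 }

type Node struct {
	Name                string
	FreeCPU, FreeRAMGiB float64
	CapCPU, CapRAMGiB   float64
}

// balanceScore prefers placements that leave CPU and RAM utilization close to
// each other, so neither resource gets stranded behind the other.
// Higher is better; -1 means the task doesn't fit at all.
func balanceScore(n Node, r Request) float64 {
	if r.CPU > n.FreeCPU || r.RAMGiB > n.FreeRAMGiB {
		return -1
	}
	cpuUtil := (n.CapCPU - n.FreeCPU + r.CPU) / n.CapCPU
	ramUtil := (n.CapRAMGiB - n.FreeRAMGiB + r.RAMGiB) / n.CapRAMGiB
	return 1 - math.Abs(cpuUtil-ramUtil) // 1 = perfectly balanced after placement
}

func main() {
	task := Request{CPU: 1, RAMGiB: 30} // RAM-heavy, CPU-light
	nodes := []Node{
		{Name: "mostly-empty", FreeCPU: 30, FreeRAMGiB: 32, CapCPU: 32, CapRAMGiB: 64},
		{Name: "cpu-loaded", FreeCPU: 4, FreeRAMGiB: 32, CapCPU: 32, CapRAMGiB: 64},
	}
	for _, n := range nodes {
		fmt.Printf("%-13s score %.2f\n", n.Name, balanceScore(n, task))
	}
	// The RAM-heavy task scores higher on "cpu-loaded", leaving the idle CPU on
	// "mostly-empty" available for CPU-heavy work instead of stranding it.
}
```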

Maybe this problem is avoided well enough just by spreading things out? Or maybe you don't allow weird task shapes? Or maybe (I'm seeing your final paragraph now about growth) it's just not worth optimizing for yet?



> Makes sense economically (although I'm a little sad to read it environmentally because the power usage difference should still be real).

Does fully loading 4 cores in one server save power over fully loading 2 cores in 2 servers? If you turn off the idle server, probably yes? If not, I'd have to see measurements, but I could imagine it going either way. Lower activity means less heat means lower voltage means less power per unit of work, maybe.

You're likely to get better performance out of the two servers though (which might not be great, because then you have a more variable product).
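To make "it could go either way" concrete, the comparison reduces to a tiny calculation once you pick numbers; the figures below are invented purely for illustration, and the answer flips depending on how idle-plus-fan draw compares to per-core active draw and how much per-core efficiency changes at partial load:

```go
package main

import "fmt"

func main() {
	// All numbers invented for illustration; substitute real measurements.
	const (
		idleWatts    = 45.0 // powered-on but idle server (fans, PSU overhead, etc.)
		perCoreWatts = 12.0 // marginal draw per fully loaded core
	)

	// A: 4 loaded cores on one server, the second server powered off entirely.
	packed := idleWatts + 4*perCoreWatts
	// B: 2 loaded cores on each of two powered-on servers.
	spread := 2 * (idleWatts + 2*perCoreWatts)

	fmt.Printf("packed (2nd box off): %.0f W\n", packed) // 93 W
	fmt.Printf("spread over 2 boxes:  %.0f W\n", spread)  // 138 W
	// With these made-up numbers packing wins; if spreading lets the active
	// cores run in more efficient states (lower effective perCoreWatts) or cuts
	// fan power, the gap narrows -- hence "I'd have to see measurements".
}
```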


With modern CPUs and modern thermal management, it's probable that fully loading two cores in each of two servers is much more efficient than loading four cores on one server, even with the second server powered off: in each machine the primary delta in power draw between idle and max load is thermal management (fans), spreading the load out gets you more passive cooling, and the cores that are in use can run in more energy-efficient modes.

That said, I haven't done the actual math here; I've just seen power draw benchmarks that show idle -> single-core -> all-core draw as a curve that rises much more slowly than the core count, without even accounting for the fact that each core is more performant under single-core workloads.


> Does fully loading 4 cores in one server save power over fully loading 2 cores in 2 servers?

That's the premise, and I have no particular reason to doubt it. There are several levels at which it might be true, from using deeper sleep states (core level? socket level?) to going wild and de-energizing entire PDUs.

> You're likely to get better performance out of the two servers though (which might not be great, because then you have a more variable product).

Yeah, exactly, it's a double-edged sword. The fly.io article says the following...

> With strict bin packing, we end up with Katamari Damacy scheduling, where a couple overworked servers in our fleet suck up all the random jobs they come into contact with. Resource tracking is imperfect and neighbors are noisy, so this is a pretty bad customer experience.

...and I've seen problems along those lines too. State-of-the-art isolation is imperfect. E.g., some workloads gobble up the shared last-level CPU cache and thus cause neighbors' instructions-per-cycle to plummet. (It's not hard to write such an antagonist if you want to see this in action.) Still, ideally you find the limits ahead of time, so you don't think you have more headroom than you really do.
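For anyone who wants to see it in action, such an antagonist really is just a few lines; here's a toy sketch (not any production tool): a dependent, pseudo-random pointer chase over a buffer much larger than the last-level cache, which defeats the prefetcher and keeps evicting co-tenants' cache lines:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	// 32M uint32 slots = 128 MiB, comfortably larger than typical LLCs.
	const n = 32 << 20
	buf := make([]uint32, n)

	// Link the slots into one random cycle so every load depends on the
	// previous one (pointer chasing); the prefetcher can't hide the misses.
	perm := rand.Perm(n)
	for i := 0; i < n; i++ {
		buf[perm[i]] = uint32(perm[(i+1)%n])
	}

	idx := uint32(perm[0])
	const steps = 1 << 25
	start := time.Now()
	for i := 0; i < steps; i++ {
		idx = buf[idx] // ~one LLC miss per iteration, evicting neighbors' lines
	}
	elapsed := time.Since(start)
	fmt.Printf("final index %d, %.1f ns per access\n", idx, float64(elapsed.Nanoseconds())/steps)
}
```

Pin it to a core that shares an LLC with a latency-sensitive neighbor and watch the neighbor's instructions-per-cycle drop in `perf stat`.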


No, it's not that power usage for us is free, it's that the business is growing (like any startup), so there is already a constant expansionary pressure on our regions; for the foreseeable future (years), our regions will tend to have significantly more servers than a scheduler would tell us we needed. Whatever we save in power costs by keeping some of those servers powered off, we lose in technical operational costs by keeping the rest of the servers running artificially hot.



