> For transformers, a v4 chip has 70-100% of the compute capacity and 40% of the memory of an A100, for pretty much the same price.
Note there are added costs when using v4 nodes, such as the VM, storage, and logging, which can get $$$.
> where for GPU model need to fit in NVlink connected GPUs
Huh, where is this coming from? You can definitely scale transformers efficiently across multiple servers with parallelism, and 1T parameters is entirely feasible if you have the $. Nvidia demonstrated this back in 2021.
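To make the point concrete: with pipeline-style model parallelism, consecutive layers live on different devices and only activations cross the interconnect, so no single device (or NVLink island) ever needs to hold the whole model. Here's a toy sketch in plain NumPy; the `Device`/`LinearLayer` names are illustrative stand-ins, not any real framework's API:

```python
import numpy as np

rng = np.random.default_rng(0)

class LinearLayer:
    """One layer's weights; in real training this shard lives on one GPU."""
    def __init__(self, d_in, d_out):
        self.w = rng.standard_normal((d_in, d_out)) * 0.01
    def forward(self, x):
        return np.maximum(x @ self.w, 0.0)  # linear + ReLU

class Device:
    """Stands in for one GPU/server; holds only its slice of the layers."""
    def __init__(self, layers):
        self.layers = layers
    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x

# An 8-layer "model" partitioned across 4 devices, 2 layers each.
# No device ever materializes all the weights, which is why the full
# model does not need to fit inside a single NVLink-connected group.
d = 16
layers = [LinearLayer(d, d) for _ in range(8)]
devices = [Device(layers[i:i + 2]) for i in range(0, 8, 2)]

x = rng.standard_normal((4, d))
for dev in devices:  # activations hop from device to device
    x = dev.forward(x)
print(x.shape)
```

Real systems (e.g. Nvidia's 2021 work) combine this with tensor parallelism inside a node and data parallelism across replicas, and keep the pipeline busy with micro-batches, but the partitioning idea is the same.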