Could models such as DALL-E 2 (with their huge parameter counts) be trained on a distributed/decentralized network of volunteer or paid computers, in the style of LHC@home [0]?
What does backprop look like in that setting? How long would a single training iteration take? What happens if a worker drops out mid-loop? And how much does it cost to move the training data around?
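To make the dropout question concrete, here is a toy sketch (all names and numbers hypothetical, not any real system's protocol) of data-parallel training with unreliable workers: each "volunteer" computes a gradient on its own data shard, and the coordinator averages whatever gradients actually arrive, so a worker vanishing mid-iteration just shrinks the effective batch for that step.

```python
import random

def local_gradient(w, shard):
    # Gradient of mean squared error for the toy model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train(shards, steps=200, lr=0.01, drop_prob=0.3, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        # Only workers that survive this iteration report a gradient.
        grads = [local_gradient(w, s) for s in shards
                 if rng.random() > drop_prob]
        if not grads:          # every worker dropped out: skip the step
            continue
        w -= lr * sum(grads) / len(grads)
    return w

# Data generated by y = 2x, split across 4 hypothetical volunteer workers.
data = [(x, 2.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards)
print(round(w, 2))  # converges near the true slope 2.0
```

This only shows the fault-tolerance idea; the hard parts the question is really asking about (synchronizing billions of parameters over home-internet bandwidth, straggler deadlines, verifying untrusted gradients) are exactly what make the real version expensive.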