Why? They have no idea how many users they will get. If they buy A100s on the assumption they will get 50M daily active users, they run the risk of wasting an enormous amount of money if they get 1M users instead. And it's not like these GPUs grow on trees. Clearly MSFT is struggling to set up compute fast enough, see the decreasing rate limits on GPT-4.
So you're saying they might be incompetent enough for their estimation to be off by 50x? You're also saying Google, _a cloud computing provider selling access to A100s_, can't scale this dynamically?
> Clearly MSFT is struggling to set up compute fast enough, see the decreasing rate limits on GPT-4.
Well, even so, this is how you'd deal with capacity problems, rather than by arbitrarily shutting out parts of the world.
Predicting the future correctly isn't a matter of competence... And yes, even Google can't scale infinitely in the face of an actual physical resource constraint. When I worked there, there was a period when there was a shortage of memory chips, which required a lot of creativity. I suspect the current period is very constrained by how fast AI/ML focused chips can be manufactured.
> Predicting the future correctly isn't a matter of competence... And yes, even Google can't scale infinitely in the face of an actual physical resource constraint.
In order to not geo-block, Google would need to be able to scale infinitely? What kind of straw man is this?
> there was a shortage of memory chips, which required a lot of creativity.
Geo-blocking doesn't strike me as particularly creative as far as solutions go. Whatever problem they're trying to address, the excuses being made here seem weak, and they don't make Google look any less incompetent.
:shrug: Yeah it's actually very difficult to launch a very resource intensive product globally all at once. It's not really an excuse, it's just a real problem that Google has. It limits what they can launch and how they can do it, because anything they launch will get used by a huge number of people right away, and everyone expects them to achieve a magical level of performance at that scale. Rolling out region by region based on where data centers are located and what hardware is deployed in them is one possible way to deal with this. I don't think it's good, really, I just think it's unsurprising.
>So you're saying they might be incompetent enough for their estimation to be off by 50x?
Not sure about incompetence, it's just a very hard problem. Sam Altman's estimates for ChatGPT were off by 10x, for reference. Apparently some employees thought it would be a total flop that would barely make the last page of major newspapers.
> You're also saying Google, _a cloud computing provider selling access to A100s_, can't scale this dynamically?
I'm saying that Google cannot acquire enormous quantities of A100s overnight, yes.