I think just common sense and less bullshit rationalisation would have been enough.
They had a billion dollars in cash to burn, so they hired more than they needed. They should have hired as needed, not as requested by Masayoshi Son.
They shouldn't be so dogmatic. Some teams were too overworked, most were underworked (which means over-engineering will ensue), but no mobility was allowed because "ideally teams have N people".
They shouldn't be so dogmatic pt 2. Services were one-per-team, instead of one-per-subject. So yeah, our internal tool for putting balloons and clowns into images lived together with the authentication micro-service, because it's the same team.
Rewriting everything twice without analysis was wrong. The rewrites were because previous versions were "too complex" and too custom-made but newer ones had an even more complex architecture, but "this time it's right, software sometimes need complexity".
Believing that some things were terrible would have gone a long way. Launching the main node.js server locally would take 10 to 20 minutes to launch, while something of the same complexity would often take about 2 or 3 seconds. Of course it would blow up in production! Maybe try to fix instead of ordering another rewrite.
They were good people, I miss the company and still use the product, but it didn't need to be like this.
It comes from a dogmatic reaction against microservices. Microservices were problematic in certain ways, but instead of analysing what went wrong and why, they just went the opposite direction and started doing "big services only". It was a misguided approach, plain and simple.
Interestingly due to internal bureaucracy and understaffing in some teams, there was a lot of "multiple-teams-per-service", which yeah, is another issue in itself.
They had a billion dollars in cash to burn, so they hired more than they needed. They should have hired as needed, not as requested by Masayoshi Son.
They shouldn't be so dogmatic. Some teams were too overworked, most were underworked (which means over-engineering will ensue), but no mobility was allowed because "ideally teams have N people".
They shouldn't be so dogmatic pt 2. Services were one-per-team, instead of one-per-subject. So yeah, our internal tool for putting balloons and clowns into images lived together with the authentication micro-service, because it's the same team.
Rewriting everything twice without analysis was wrong. The rewrites were because previous versions were "too complex" and too custom-made but newer ones had an even more complex architecture, but "this time it's right, software sometimes need complexity".
Believing that some things were terrible would have gone a long way. Launching the main node.js server locally would take 10 to 20 minutes to launch, while something of the same complexity would often take about 2 or 3 seconds. Of course it would blow up in production! Maybe try to fix instead of ordering another rewrite.
They were good people, I miss the company and still use the product, but it didn't need to be like this.