Another implicit social contract is that you can tell whether a request is coming from a commercial or non-commercial source based on the originating ISP. This was always just a heuristic, but it used to be more reliable than it is now.
If 1000 AWS boxes start hammering your API you might raise an eyebrow, but 1000 requests coming from residential ISPs around the world could be an organic surge in demand for your service.
Residential proxy services break this heuristic. They have existed on some level for a long time, but the AI-training-set arms race has driven up demand, and supply has followed.
It's quite easy to block all of AWS, for example, but it's less easy to figure out which residential IPs are part of a commercially-operated botnet.
IPs you've never seen before that hit a single random page deep within your site are either bots or first-time visitors following a search engine link. Grey-list them and respond slowly. If they are seen again at normal human rates, unthrottle them.
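A minimal sketch of that grey-listing policy, assuming an in-memory store and made-up thresholds (the names `GREYLIST`, `KNOWN_IPS`, and the specific time windows are all illustrative, not from any particular framework):

```python
# Grey-list state: illustrative in-memory structures. A real deployment
# would use something shared and expiring, e.g. Redis with TTLs.
GREYLIST: dict[str, list[float]] = {}  # IP -> recent request timestamps
KNOWN_IPS: set[str] = set()            # IPs that have proven human-ish

THROTTLE_SECONDS = 3.0   # artificial delay for grey-listed IPs (assumption)
HUMAN_WINDOW = 60.0      # look at the last minute of activity
HUMAN_MAX_RATE = 10      # more hits per minute than this looks automated
MIN_RETURN_GAP = 5.0     # a genuine return visit is at least this far apart

def handle_request(ip: str, now: float) -> float:
    """Return the delay (in seconds) to apply before responding."""
    if ip in KNOWN_IPS:
        return 0.0  # previously unthrottled: serve at full speed

    if ip not in GREYLIST:
        # First-ever sighting: grey-list and respond slowly.
        GREYLIST[ip] = [now]
        return THROTTLE_SECONDS

    # Seen before: keep only hits inside the observation window.
    hits = [t for t in GREYLIST[ip] if now - t < HUMAN_WINDOW] + [now]
    GREYLIST[ip] = hits

    if len(hits) > HUMAN_MAX_RATE:
        return THROTTLE_SECONDS  # hammering: stay throttled

    if now - hits[0] >= MIN_RETURN_GAP:
        # Came back at a normal human pace: unthrottle permanently.
        KNOWN_IPS.add(ip)
        del GREYLIST[ip]
        return 0.0

    return THROTTLE_SECONDS  # too soon to tell: keep the brakes on
```

The point of responding slowly rather than blocking outright is that the penalty for a false positive (a real first-time visitor from a search engine) is a one-off slow page, not an error.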