This is specifically what they said about sharding
> The primary rationale is that sharding existing application workloads would be highly complex and time-consuming, requiring changes to hundreds of application endpoints and potentially taking months or even years
On one hand OAI sells coding agents and constantly hypes how easily they'll replace developers, claiming most code is now written by agents; on the other hand they claim it would take years to refactor their own application.
Genuinely sounds like the kind of challenge that could be solved with a swarm of Codex coding agents. I'm surprised they aren't treating this as an ideal use-case to show off their stack!
Getting the sharding in place, yes, but maintaining it operationally would still be a headache: things like schema migrations across shards, resharding, and even observability.
Sharding can be made mostly transparent, but it's not purely a DB-level concern in practice. Once data is split across nodes, join patterns, cross-shard transactions, global uniqueness, hot keys that take disproportionate traffic, etc. all matter a lot. Even if partitioning handles routing, the application's query patterns and its consistency/latency requirements can still force application-level changes.
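To make that concrete, here's a minimal sketch (all names hypothetical, assuming hash sharding on user_id) of what happens to a query that used to be a single statement with a global ORDER BY:

```python
# Hypothetical sketch: once `orders` is sharded on user_id, one query
# becomes a per-shard scatter plus an in-application merge.

def shard_for(user_id: int, num_shards: int) -> int:
    # Simplistic hash routing; real systems use consistent hashing
    # or a directory service.
    return user_id % num_shards

def recent_orders(user_ids, conns):
    """conns: mapping of shard id -> DB-API connection (assumed helper)."""
    by_shard = {}
    for uid in user_ids:
        by_shard.setdefault(shard_for(uid, len(conns)), []).append(uid)

    rows = []
    for shard, uids in by_shard.items():
        cur = conns[shard].cursor()
        # This ORDER BY is only correct *within* one shard...
        cur.execute(
            "SELECT user_id, id, total, created_at FROM orders"
            " WHERE user_id = ANY(%s) ORDER BY created_at DESC LIMIT 50",
            (uids,),
        )
        rows.extend(cur.fetchall())

    # ...so the global ordering (and any join against tables living on
    # other shards) has to be redone in application code.
    rows.sort(key=lambda r: r[3], reverse=True)
    return rows[:50]
```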
> Once data is split across nodes, join patterns, cross-shard transactions, global uniqueness, hot keys that take disproportionate traffic
If you're having trouble there, then a proxy "layer" between your application and the sharded database makes sense, meaning your application still keeps its naive understanding of the data (as it should) and the proxy/database access layer handles that messiness... shirley
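Something like this toy access layer (all names illustrative), where the app keeps calling per-user methods and the routing messiness stays in one place:

```python
import psycopg2  # assuming a Postgres driver; any DB-API connection works

# Illustrative shard DSNs.
SHARD_DSNS = [
    "dbname=app_shard0",
    "dbname=app_shard1",
    "dbname=app_shard2",
    "dbname=app_shard3",
]

class UserStore:
    """The application talks to this; it never sees shard topology."""

    def __init__(self):
        self._conns = [psycopg2.connect(dsn) for dsn in SHARD_DSNS]

    def _conn(self, user_id: int):
        return self._conns[user_id % len(self._conns)]

    def get_user(self, user_id: int):
        with self._conn(user_id).cursor() as cur:
            cur.execute("SELECT id, email FROM users WHERE id = %s",
                        (user_id,))
            return cur.fetchone()

    def set_email(self, user_id: int, email: str):
        conn = self._conn(user_id)
        with conn.cursor() as cur:
            cur.execute("UPDATE users SET email = %s WHERE id = %s",
                        (email, user_id))
        conn.commit()
```

This holds up as long as every call carries the shard key; the moment a query doesn't (find-by-email, cross-user joins), you're back to the parent comment's problems.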
When sharded, anything crossing a shard boundary becomes non-transactional.
I.e. if you shard by userId, then a "share" feature that lets a user share data with another user via a "SharedDocuments" table cannot be consistent.
That in turn means you're probably going to have to rewrite the application to handle cases like a shared document having one or the other user attached to it disappear or reappear. There are loads of bugs that can happen with weak consistency like this, and at scale, every very rare bug is going to happen and need dealing with.
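Concretely, a minimal sketch (hypothetical schema, plain DB-API connections keyed by shard) of why the "share" write can't be atomic:

```python
# Sharded by user_id: the document row lives on the owner's shard and the
# "shared with" row on the recipient's shard. All names are illustrative.

def share_document(conns, owner_id, doc_id, recipient_id):
    owner = conns[owner_id % len(conns)]
    recipient = conns[recipient_id % len(conns)]

    with owner.cursor() as cur:
        cur.execute(
            "UPDATE documents SET shared = true"
            " WHERE id = %s AND owner_id = %s",
            (doc_id, owner_id),
        )
    owner.commit()
    # <- crash here and the recipient never gets their row: the document
    #    is marked shared but invisible. There is no rollback spanning
    #    both shards.
    with recipient.cursor() as cur:
        cur.execute(
            "INSERT INTO shared_documents (user_id, doc_id)"
            " VALUES (%s, %s)",
            (recipient_id, doc_id),
        )
    recipient.commit()
    # Deleting the owner's account later is the mirror image: shard-local
    # deletes leave shared_documents rows pointing at nothing, which is
    # exactly the "disappear or reappear" class of bug above.
```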
Two-phase commit only provides an eventual consistency guarantee...
Other clients (readers) have to be able to deal with inconsistencies in the meantime.
Also, 2PC in Postgres is incompatible with temporary tables, which rules out use with long-running batch analysis jobs that might use temporary tables for intermediate work and then save results. E.g. "We want to send this marketing campaign to the top 10% of users" doesn't work with the naive approach.
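For illustration, roughly what that looks like with psycopg2's DB-API two-phase-commit calls (DSNs made up; Postgres needs max_prepared_transactions > 0). The window between the two commits is the inconsistency readers must tolerate, and PREPARE is the step that rejects transactions that touched temporary tables:

```python
import psycopg2

# Illustrative connections to two shards.
a = psycopg2.connect("dbname=shard_a")
b = psycopg2.connect("dbname=shard_b")

xid_a = a.xid(0, "share-doc-42", "shard-a")
xid_b = b.xid(0, "share-doc-42", "shard-b")

a.tpc_begin(xid_a)
b.tpc_begin(xid_b)
with a.cursor() as cur:
    cur.execute("UPDATE documents SET shared = true WHERE id = 42")
with b.cursor() as cur:
    cur.execute(
        "INSERT INTO shared_documents (user_id, doc_id) VALUES (7, 42)")

# Phase 1: both sides issue PREPARE TRANSACTION. This is where Postgres
# errors out if either transaction has operated on temporary objects.
a.tpc_prepare()
b.tpc_prepare()

# Phase 2: commit each prepared transaction. A reader querying both shards
# between these two lines sees shard_a committed and shard_b not -- the
# "eventual consistency only" window described above. (A crash here also
# needs a recovery process to finish the prepared transactions later.)
a.tpc_commit()
b.tpc_commit()
```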
Sorry, what am I missing here? This complaint is true of all architectures, because readers are always going to be out of sync with the state in the database until they do another read.
The nanosecond a system has readers and writers as different processes/people/whatever, it has multiple copies: the one held by the database, and the copies held by the readers from when they last read.
It does not matter whether there is a single DB lock or a multi-shard distributed lock.
These are limitations in the current PostgreSQL implementation. It's quite possible to have consistent commits and snapshots across sharded databases. Hopefully some day in PostgreSQL too.