These sorts of heat-balancing sharding schemes are very difficult to implement and very expensive. As you see hot keys, you need to split the hash space and rebalance within it by reshuffling the shard data.
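To make the cost concrete, here is a minimal sketch of that kind of range split, assuming a hypothetical in-memory shard map; the `Shard` type and `split_hot_shard` function are illustrative, not any particular system's API:

```python
from dataclasses import dataclass

@dataclass
class Shard:
    lo: int    # inclusive start of the hash range
    hi: int    # exclusive end of the hash range
    node: str  # node currently serving this range

def split_hot_shard(shards: list[Shard], hot: Shard, new_node: str) -> list[Shard]:
    """Split the hot shard's hash range in half and assign the upper half
    to a new node. Note the metadata change is the cheap part: the data in
    [mid, hi) still has to be physically copied/reshuffled before the new
    shard can serve traffic."""
    mid = (hot.lo + hot.hi) // 2
    halves = [Shard(hot.lo, mid, hot.node), Shard(mid, hot.hi, new_node)]
    return [s for s in shards if s is not hot] + halves
```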
I’d also note that Google doesn’t bother keeping a perfect index because perfect fidelity isn’t necessary, unlike in a lot of analytic or similar systems where replication of ground truth is important. It’s much more important for Google to maintain high fidelity at the less frequent end of the token distribution and very low fidelity at the high-frequency end. Logs can’t do that.
It’s actually quite hard. It starts with being able to detect a hot key at all. It’s also not the case that heat is symmetric with size, in fact in an inverted index single entries can be very hot. Then it’s not about simply shuffling data (which isn’t as simple as you outline: you need to salt the keys so they shuffle randomly, otherwise you don’t get uniformity); then you need to create eventually consistent write replicas to balance write load while still answering queries online in a strongly consistent way. Add to this that any dynamic change to the index like this requires consistent online behavior (i.e., ingestion and queries don’t stop because you need to rebalance), and the hot keys are necessarily “large” in volume, so back pressure can be enormous and queue draining itself can be expensive. On top of that, you need stateful elastic infrastructure.
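For the salting point specifically, here is a hedged sketch of the idea: writes to a hot key are spread uniformly over a fixed number of salted variants, and reads then have to fan out to every variant, which is part of why this is expensive. The salt count and function names are illustrative assumptions, not from any real system:

```python
import hashlib
import random

NUM_SALTS = 16  # assumption: how many sub-shards a hot key is spread across

def salted_write_shard(key: str, num_shards: int) -> int:
    """Pick a shard for a write: salting makes repeated writes to the same
    hot key land uniformly across NUM_SALTS different hash slots."""
    salt = random.randrange(NUM_SALTS)
    digest = hashlib.sha256(f"{key}#{salt}".encode()).hexdigest()
    return int(digest, 16) % num_shards

def read_shards(key: str, num_shards: int) -> set[int]:
    """Reads for a salted key must query every possible salt variant
    and merge the results."""
    return {
        int(hashlib.sha256(f"{key}#{s}".encode()).hexdigest(), 16) % num_shards
        for s in range(NUM_SALTS)
    }
```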
There are definitely products that offer these characteristics. S3 and DynamoDB both do, even if you can’t see it. But it took many years of very intensive engineering to get it to work, and they have total control over the infrastructure and runtime behind an opaque API. Elasticsearch and Splunk are general-purpose software packages that are installed by customers, and their data models are much more complex than objects or tables.
> It’s also not the case that heat is symmetric with size, in fact in an inverted index single entries can be very hot.
I think you mixed two orthogonal topics: you first talked about frequent tokens, and now you've switched to hot keys (tokens which are frequently queried).
As for frequent tokens, I think I described the algorithm well, and it looks simple, and I don't see any issues there, as long as your metadata store (where you keep info about shards and replicas) allows some kind of transactions (e.g. CockroachDB or similar).
For hot keys/shards, as you pointed out, the solution is to increase the replication factor, but I think if the shard is relatively small (10M IDs as in my example), adding another replica online is also fast, can be done in a single transaction, and may not require all the movement you described.
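A minimal sketch of what that single metadata transaction could look like, assuming a hypothetical SQL schema (`shards` and `replicas` tables) on a transactional metadata store like CockroachDB, and a DB-API style connection (e.g. psycopg2) where `with conn:` commits on success; all table, column and function names are illustrative:

```python
def add_replica(conn, shard_id: int, new_node: str) -> None:
    """Register an extra replica for a hot shard atomically. The actual data
    copy to new_node still happens out of band; queries should only start
    hitting the new replica once its state is flipped from 'copying' to
    'active' in a later transaction."""
    with conn:  # one transaction: both statements commit or neither does
        cur = conn.cursor()
        cur.execute(
            "INSERT INTO replicas (shard_id, node, state) VALUES (%s, %s, 'copying')",
            (shard_id, new_node),
        )
        cur.execute(
            "UPDATE shards SET replication_factor = replication_factor + 1 WHERE id = %s",
            (shard_id,),
        )
```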