>Far be it from me to tell anyone how to write software, but why build a databas...

zetalyrae · on July 27, 2023

>Cost

It's a bad trade. Thousands of hours of a high human capital computer scientist vs. a few tens of dollars a month for RDS.

>Reliability

Empirically false: none of this would have happened if Tarsnap used Postgres instead of a home-spun database.

nonethewiser · on July 27, 2023

>> Cost

> It's a bad trade.

Maybe. But that's the reason. You never acknowledged that advantage in your question, so it needed to be emphasized.

zetalyrae · on July 27, 2023

It never occurred to me that anyone would need it explained to them that RDS is cheaper than the time of any software engineer.

The opportunity cost of building your own database is 10,000x the cost of running RDS for a year.

foldr · on July 27, 2023

That sort of logic doesn't really apply here because:

* RDS costs obviously scale linearly with ongoing time and probably scale linearly with the total amount of data being backed up. So depending on the revenue of the business, these extra costs could easily end up outweighing the (notional) cost of the time saved, which is mostly a one-off expense.

* The cost of a software engineer's time is notional in the context of a one-person business. The author of Tarsnap isn't going to be able to employ fewer than zero additional software engineers to maintain Tarsnap because of the time saved by using RDS.
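To make the first point concrete, here is a minimal break-even sketch. Every figure in it is a made-up placeholder, not Tarsnap's or AWS's actual pricing, and the function names are hypothetical; it only shows how a recurring cost that grows with stored data eventually overtakes a one-off cost.

```python
def monthly_hosted_cost(base_monthly: float, per_gb_monthly: float, gb_stored: float) -> float:
    """Recurring cost per month: a flat fee plus a per-GB charge (both hypothetical)."""
    return base_monthly + per_gb_monthly * gb_stored


def cumulative_hosted_cost(base_monthly: float, per_gb_monthly: float,
                           gb_stored: float, months: int) -> float:
    """Total recurring cost after `months`, assuming the stored volume stays constant."""
    return monthly_hosted_cost(base_monthly, per_gb_monthly, gb_stored) * months


def break_even_months(one_off_cost: float, monthly_cost: float) -> float:
    """Number of months after which the recurring cost equals the one-off cost."""
    if monthly_cost <= 0:
        raise ValueError("monthly_cost must be positive")
    return one_off_cost / monthly_cost


if __name__ == "__main__":
    # Placeholder inputs: a notional one-off engineering cost and a notional hosted bill.
    one_off = 12_000.0
    monthly = monthly_hosted_cost(base_monthly=50.0, per_gb_monthly=0.25, gb_stored=1_000.0)
    print(f"monthly hosted cost: {monthly:.2f}")
    print(f"break-even after {break_even_months(one_off, monthly):.1f} months")
```

Which side wins depends entirely on the inputs: a small data volume pushes the break-even point out for decades, while a large one can bring it within a few years.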

zetalyrae · on July 27, 2023

[flagged]

gus_massa · on July 27, 2023

Perhaps I'm biased, but I read every part like:

> I realized that this was introduced by some code I wrote in 2014: Occasionally Tarsnap users need to move a machine between accounts,

with an implicit "and I fixed that code now" or "and I will fix that tomorrow when I get enough sleep". Let's hope he writes a follow-up later explaining the details.

> [a] postmortem on the front page of HN

Is that bad? I upvoted this almost instantly, then went to the comment section to upvote cperciva if he was here, then I read the full post and verified my first upvote was correct.

foldr · on July 27, 2023

The outage was caused by a hardware failure and (I assume) the lack of any redundancy. Using RDS wouldn't have made a difference as far as I can see.

zetalyrae · on July 27, 2023

RDS can have replication.

But more than that: servers should be stateless! A server going down should never take down your business.

If you use Postgres and stateless servers, then a server going down is no problem: it gets rebooted, and there may be other servers and a load balancer to pick up the load. If Postgres goes down, you have a replica, or it gets rebooted. Postgres always recovers from crashes (in my experience), and if it doesn't, you have PITR.

AWS has everything under the sun to prevent this kind of thing happening. This is a 1990s outage. This didn't have to happen.
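The stateless-server argument can be sketched as a toy round-robin balancer that skips backends marked unhealthy. This is an illustration only: the class and method names are hypothetical, health checking is reduced to manual `mark_down`/`mark_up` calls, and it does not represent how any AWS load balancer or Tarsnap itself works.

```python
class RoundRobinBalancer:
    """Toy load balancer: rotates over backends and skips unhealthy ones."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        """Record that a backend failed (e.g. a health check timed out)."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record that a known backend is serving again (e.g. after a reboot)."""
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        """Return the next healthy backend, trying each backend at most once."""
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    lb.mark_down("app-2")  # simulate a hardware failure on one server
    print([lb.pick() for _ in range(4)])
```

Because the servers hold no state of their own in this model, any healthy backend can take the next request; the balancer only fails when every backend is down at once.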

foldr · on July 27, 2023

The hardware failure was on the server running the application code, so RDS replication wouldn’t have helped. You’re right of course that this failure points to a lack of redundancy – but that’s a separate issue from choosing S3 vs. RDS as the data layer.

By the way, S3 is insanely reliable and in fact more reliable than a replicated RDS setup. So switching from S3 to RDS would almost certainly reduce the basic reliability of the data layer, however many conveniences it might bring.

zetalyrae · on July 27, 2023

Once again, the problem is not S3, it is reinventing a database on top of S3, the logic of which runs on EC2.

catiopatio · on July 27, 2023

Once again, no, that’s not the problem.

PostgreSQL and RDS are quite a bit more than just a log-structured data store, and are not prima facie the correct solution for this problem domain, regardless of how much arrogant ignorance you bring to bear on the debate.

dang · on July 27, 2023

You broke the site guidelines badly in more than one place in this thread. I realize you're trying to defend someone's work against what you feel is unfair criticism, but breaking the site guidelines yourself, with swipes and name-calling and flamewar, is exactly the wrong way to do this.

If you'd please review https://news.ycombinator.com/newsguidelines.html and stick to the rules when posting here, we'd appreciate it.

catiopatio · on July 27, 2023

You’re not wrong, and thank you for holding me to account.

In retrospect, I’d delete and/or edit the comments if I could.

dang · on July 27, 2023

No need to delete - the only thing we care about is fixing things going forward. Thanks for the kind reply!

zetalyrae · on July 27, 2023

[flagged]

dang · on July 27, 2023

Would you please stop posting flamewar comments and breaking the site guidelines? You've been doing that repeatedly and badly in this thread. We end up having to ban such accounts, and I don't want to ban you.

Fortunately it doesn't look from your recent comments that you've been in the habit of posting this way, so it should be easy to fix.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.

catiopatio · on July 27, 2023

Just … stop. This is the first outage in 11 years.

You’re being unnecessarily arrogant and antagonistic up and down this thread.

You’re not smarter than everyone else here, you don’t have better or more perfect knowledge, and you almost certainly wouldn’t have built a better or more reliable system.