Hacker News

A lot of comments here seem to be along the lines of "you can hire one more engineer," but given the current economic situation remember that might be "keep one more engineer." Would you lay off someone on your team to keep AWS?

Keeping a few racks of servers happily humming along isn't the massive undertaking that most people here seem to think it is. I think lots of "cloud native" engineers are just intimidated by having to learn lower levels to keep things running.



> Keeping a few racks of servers happily humming along isn't the massive undertaking that most people here seem to think it is

Keeping them humming along redundantly, with adequate power and cooling, and protection against cooling- and power failures is more of an undertaking, though. Now you are maintaining generators, UPSs and multiple HVAC systems in addition to your 'few racks of servers'.

You also need to maintain full network redundancy (including ingress/egress) and all the cost that entails.

All the above hardware needs maintenance and replacement when it becomes obsolete.

Now you are good in one DC, but not protected against tornadoes, fire and flood like you would be if you used AWS with multiple availability zones.

So, you have to build another DC far enough away, staff it, and buy tons of replication software, plus several FTEs to manage cross-site backups and deal with sync issues.


You don't need to build your own datacenter unless your workload requires a datacenter's worth of hardware. Colocation is a feasible and popular option for handling all of the hands-on stuff you mention. Ship the racks to a colo center, they'll install them for you. Ship them replacement peripherals, and the operators will hot-swap them for you. If you need redundancy, that's just a matter of sending your hardware to multiple places instead of one. Slightly more involved, but it's hardly rocket science.


That all takes time. With cloud, you can have a system up and running in literal seconds, which is very nice when you find out that you severely underestimated how much traffic your web app will get.

But yeah, in the long run, colocation becomes significantly cheaper than cloud. You use AWS, and you'll find yourself paying $200/month for hardware you could buy once for $2,000.
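To put rough numbers on that claim: a minimal break-even sketch using the figures above ($200/mo cloud vs. a one-time $2,000 hardware buy). The colo-fee parameter is my own assumption, not a figure from this thread.

```python
import math

def breakeven_months(cloud_monthly, hardware_once, colo_monthly=0):
    """Months until cumulative cloud spend exceeds buying the hardware
    outright plus any ongoing colo fees (hypothetical helper)."""
    savings = cloud_monthly - colo_monthly  # monthly savings after buying
    if savings <= 0:
        return None  # cloud never costs more under these assumptions
    return math.ceil(hardware_once / savings)

print(breakeven_months(200, 2000))      # -> 10 months, with no colo fee
print(breakeven_months(200, 2000, 50))  # -> 14 months, assuming $50/mo colo
```

Even with a made-up colo fee folded in, the crossover lands within the typical 3-5 year useful life of the hardware.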

Sometimes I think people forgot colocation is an option that exists.


And that's purely in hardware costs. It's more like a $200/mo payment can be replaced by $100-130 in hardware.

For $600 I can get hardware that outperforms a $1200/mo ec2. Easy.


Do you realize how few pennies $200 a month is compared to $2000 for any decent-size company?


No, I'm afraid we can't do basic math. Nobody will ever know how long it takes for $200 a month to end up costing more than a single outlay of $2000.


I’m saying that little savings is a rounding error to any business of any size


He is giving an example for a single server. You can multiply both numbers by 100 if you want.


> Now you are maintaining generators, UPSs and multiple HVAC systems in addition to your 'few racks of servers'.

I don't mean to call you out specifically, but I believe your comment is a perfect example of how the majority of developers have no clue what being on bare metal actually involves. You literally need to do none of these things. Cloud vendors make it sound overly complex and the myth just kinda self perpetuates because nobody knows better.

If anyone in the Bay Area is considering the move out of the cloud and wants to see in person what is really involved, I might consider putting together a group tour of one of my rack locations.


What do your racks do if they lose site power?

What happens when your own HVAC dies and your DC has about 4 hours until it overheats?

(I'm a software engineer who previously built and maintained racks of bare metal. Never again).


Unless you are a double digit billion dollar company, you don't build your own datacenters. You lease space from a colocation provider like Equinix, CoreSite, or DRT. Even AWS and GCP themselves are partially hosted in buildings operated by these companies.

These datacenters are fed by multiple power substations, have onsite battery and generators, and contracts for delivery of fuel in the event of a disaster. But none of these things are any more your problem than if a power plant explodes knocking out an AWS region.


I was replying with my interpretation of what you said above. Did I misunderstand you when you said you don't need generators, UPS, and multiple HVAC systems? If I read it correctly, you're saying that developers have no clue what being on bare metal means, and that you don't need generators, UPS, or multiple HVAC. But then you said the colo facilities have those (along with other disaster plans).

> > Now you are maintaining generators, UPSs and multiple HVAC systems in addition to your 'few racks of servers'.

> I don't mean to call you out specifically, but I believe your comment is a perfect example of how the majority of developers have no clue what being on bare metal actually involves. You literally need to do none of these things.

By the way, I worked for a double digit billion dollar company that built its own datacenters as well as placing resources in colos. They started out purely in colos, and put several colos out of business over HVAC and power costs (back when rack space was billed by area, not cooling). Even after that, they stayed in colos, and when I worked there, we constantly had to deal with the unreliability of colos- not just that they were smaller, with less cooling, and inadequate power, but also because they often didn't actually fulfill their contractual requirements. ATL was a great example.

If colos work for you, that's great. I just don't think they are prepared to handle disasters nearly as well as the megascale cloud providers.


> Did I misunderstand you when you said you don't need generators, UPS, and multiple HVAC systems?

No, you seem to misunderstand how leased space works. If you rent a floor of an office building, the toilets flush without you having to own a water plant or redundant water pipes.

> when I worked there, we constantly had to deal with the unreliability of colos- not just that they were smaller,

It sounds like whoever was in charge of picking datacenters was shit at their job. The colocation market isn't what it used to be; it isn't just a dude with some warehouse space and swamp coolers. Colos are publicly traded companies or REITs and have good SLAs.

> I just don't thinnk they are prepared to handle disasters nearly as well as the megascale cloud providers.

I worked for a megascale cloud provider. I'm intimately familiar with the nuts and bolts of a few others. Some of it is the big owned and operated campuses you see in the glossy brochure photos, but a substantial part is also in the same colos you can lease yourself. They don't bring in any additional cooling or power over and above what the datacenter provides.


You let the colo handle it.


Most of those requirements cease to exist if you decide to colo. It's not cloud or "run your own DC".


This is why you check out decent datacenters. A good DC is already BUILT above the 500 year or 1000 year flood plain, has N+1 generators, and tornadoes are not present in all locations.


You just come across as someone who is scared because you don't understand.

Seriously, a facility with multiple Internet connections, adequate power and cooling, passive cooling in case of outage, protection against power failures, adequate UPSes, backup generator power, and so on could be my Mom's house. There's nothing special about any of those things that makes them somehow frightening.

"buy tons of replication software"? What industry do you work in? Seriously, nobody who isn't in some clueless "enterprise" would pay good money for things that're widely available in open source.

I'm dismissive because these things aren't difficult if you've actually done them, so I can only assume you've never done them.


> I think lots of "cloud native" engineers are just intimidated by having to learn lower levels to keep things running.

Rightly so, because they're cloud native engineers, not system administrators. They're intimidated by the things they don't know. It'll be a very individual calculation whether or not it's worth it for your enterprise to organize and maintain hardware yourself.


There's certainly no shortage of sysadmins comfortable with their on-prem skillsets that have their heads in the sand about the cloud.

And there are plenty of us who've spent time managing hardware and physical networks, transitioned to cloud, and are very happy to not be looking back.


And in other places around the world those would be closer to 3 or 4 good engineers for the same money. And while each engineer costs some money, they probably bring in close to double of what they are being paid.


Especially given the low unemployment rate, laying somebody off seems quite risky, if it doesn’t work out you’ll have trouble hiring some replacement I guess.


The current hiring market in tech is the easiest (for employers) that it has been in a really long time. It used to take 3-4 months to fill a role. In the current market it's more like 2-4 weeks.


Not for SRE and DBRE lol. 4-6 months easy.

I made the correct career choice. Downturn? What downturn?


The cost of one salary can probably be achieved by profiling and improving some software. Moving to graviton. Find and delete a few expensive queries. Add a database index. Switching from the cloud to on prem is a dramatic and expensive migration that changes a lot of constraints for your infra that you might not need to change in the first place.


eh, to a degree, having to deal with failed hardware and worse buggy hardware is just a pain and really time consuming.


>Keeping a few racks of servers happily humming along isn't the massive undertaking that most people here seem to think it is.

It isn't until hardware failures happen, and dealing with those effectively requires a different set of skills. Like the time a core networking switch presented with a dead PSU and no spares, so you Frankenstein it back to temporarily working with another switch to pull the config off of it.

With bare metal you may have to have more generalized, Jack of all trades staff to account for the unaccountable.



