
Doesn't radiation mostly cause random one-off errors rather than permanent defects? If so, and if rad-hardened hardware is 100x slower (which I think is approximately right?), it is almost certainly better to use error correction on non-rad-hardened hardware.
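To make "error correction" concrete for the simplest case (a single flipped bit in stored data), here is a minimal sketch of a Hamming(7,4) code. It is only an illustration: the function names are made up for this example, and real ECC memory uses wider codes implemented in hardware.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7 = p1 p2 d1 p3 d2 d3 d4)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit, or 0 if none.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = hamming74_encode([1, 0, 1, 1])
    word[5] ^= 1  # simulate a single event upset
    print(hamming74_decode(word))
```

This only handles a single upset per codeword in stored data; it does nothing for the destructive or logic-level failures discussed in the replies.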


In addition to SEUs (single event upsets), which are bit flips, there are also the following Single Event Effects (SEEs) that are destructive:

- Single Event Burnout, SEB

- Single Event Gate Rupture, SEGR

- Single Event Latch-up, SEL (these can be recoverable)

In addition, there are also Total Ionizing Dose (TID) effects: https://radhome.gsfc.nasa.gov/radhome/tid.htm


Sure, no doubt they occur. But at what rate?


Many types of bit error are not recoverable without a full system reset. It isn't a matter of a simple "this bit in RAM got corrupted", but more "this floating point unit has got into a state where it will not produce a result, and will therefore hang the entire processor".

Therefore boot time becomes critical - if you end up rebooting due to bit errors multiple times per second, you can't afford to wait for Linux to start up each time...


Run 9 systems in parallel and reset the ones that give less common results or no results at all.

You still have 10% of the surface area, power usage and weight, and 10 times the speed, of the radiation-hardened ones.
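A minimal sketch of that voting scheme, assuming each replica reports a hashable result and that `None` stands for "no result" (a hang or timeout). The `vote` function and its return shape are made up for this illustration; a real system also needs a voter that is itself trustworthy.

```python
from collections import Counter


def vote(results):
    """Majority-vote over replica outputs.

    results: one entry per replica; None means the replica gave no result.
    Returns (winning_value, indices_of_replicas_to_reset).
    If no value has a strict majority, returns (None, all indices).
    """
    everyone = list(range(len(results)))
    answered = [r for r in results if r is not None]
    if not answered:
        return None, everyone

    winner, votes = Counter(answered).most_common(1)[0]
    # Require a strict majority of all replicas, not just of those that answered.
    if votes * 2 <= len(results):
        return None, everyone

    # Reset any replica that disagreed with the majority or stayed silent.
    to_reset = [i for i, r in enumerate(results) if r != winner]
    return winner, to_reset


if __name__ == "__main__":
    # Nine replicas: one bit-flipped result and one hang.
    print(vote([42, 42, 42, 42, 7, 42, None, 42, 42]))
```

With nine replicas, up to four can be wrong or silent in a given round before the vote fails.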


And that’s why it’s wise to have multiple systems running at the same time: if one fails, you hopefully still have another online. There’s a reason airplanes and now cars are designed this way. I’m sure they’re working towards this too.


Well, I suppose they do not have to load all the kernel modules and drivers that Linux provides today.

I wonder how one could use microkernels to further improve startup time and have a mini distributed OS/kernel for each component.


This is a problem with floating point operations happening at a lower level than the error correction you're imagining. In principle, that's not at all necessary. Are you arguing that it's infeasibly expensive to design a chip with operations that are error correctable?


It's possible - but you'll end up reinventing nearly every step of the IC design process, which will cost a lot.



