
Unless one in every 33 asteroids that have 3% impact probability at some point in time actually impacts earth, there is clearly some unwarranted assumption in the error bar/distribution calculation.

"The measurement data has noise" does not explain why the noise has a bias towards "the asteroid will hit earth" whereas reality so far has been biased towards "the asteroid will not hit earth".

(This assumes that significantly more than 33 asteroids have had >= 3% impact probability predicted at some point. The opposite would not be less concerning.)



To simplify, let's assume you have perfect knowledge of everything else and that the only variable that matters is the asteroid's current position. By triangulating observations you get a point estimate. From calibrating your instruments in the past you know that they tend to have uniform additive noise that is the same in each dimension. Let's say it shifts measurements by up to 1 km randomly.

So the best guess you have is that the true asteroid is somewhere within a 2 km box centered at the observation point (with uniform noise of up to 1 km, it's certain to be in that box).

For each possible location in this box you use it as a hypothetical starting point and run a simulation forward creating a trajectory. In 3% of these trajectories the asteroid hits the earth.
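That forward-simulation procedure can be sketched as a toy Monte Carlo, collapsing everything down to a single hypothetical miss distance. All the numbers below (nominal miss distance, noise width) are invented for illustration, and the noise is wildly exaggerated so the hit fraction is visible; a real pipeline propagates uncertainty through full orbital dynamics, not a 1-D geometry check.

```python
import random

# Toy Monte Carlo version of the procedure described above.
# The numbers are made up for illustration only.

EARTH_RADIUS_KM = 6371.0
measured_miss_km = 6470.0    # hypothetical nominal closest approach (geocentric)
noise_half_width_km = 200.0  # hypothetical uniform measurement noise (exaggerated)

def impact_probability(n_samples: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sample a hypothetical "true" miss distance consistent with the
        # noise model, then check whether that trajectory intersects Earth.
        true_miss = measured_miss_km + rng.uniform(-noise_half_width_km,
                                                   noise_half_width_km)
        if true_miss < EARTH_RADIUS_KM:
            hits += 1
    return hits / n_samples

print(impact_probability(100_000))  # analytically (6371 - 6270) / 400 ≈ 0.25
```

The reported "impact probability" is just the fraction of sampled starting states whose trajectory ends in a hit, which is exactly the epistemic reading described above.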

The 3% is only a probability over the measurement uncertainty. It represents our knowledge about the system in a Bayesian sense. The true asteroid was always either going to hit the earth or not; there is no uncertainty inherent in the system itself.

That many asteroids have non-negligible probability only means the physics is sensitive to initial conditions or that the measurements are loose. (Both are true.)


Given everything you said is true, under those assumptions 3% of those asteroids that we identify as being in said 2km box will hit earth, unless the forward simulation is wrong (implausible) or the measurement error distribution is substantially wrong (also seems unlikely).

What your analysis is not touching on is the prior probability that an asteroid will hit earth (you collapse this to "any asteroid will either hit or not", but that is not helpful for "model calibration" or whatever you want to call this) - or, equivalently, the prior probability of making (a series of) observations with a certain uncertainty/error distribution. If that prior were actually as uniform as each measurement error suggests, I don't see any Bayesian wiggle room left for why we don't have those 3% of impact actually happen.

(I'm no expert, but presumably you need multiple measurements to predict a trajectory, and while their measurement error distributions may be independent, it seems plausible to me that the prior probability of making two specific noise-affected observations, i.e. of the asteroid being on a certain trajectory, is most likely not so uniform. That's the part that I'd like to learn more about though.)


I think some of the confusion here comes from the following interpretations:

- Then what does 3% mean? Surely it means "given the data we have, one in every 33 will hit"

- Given everything you said is true, under those assumptions 3% of those asteroids that we identify as being in said 2km box will hit earth.

Both of these statements are false. The probability density is over our knowledge of the state variables/state space for this asteroid, not over asteroids. The hypothetical sample of asteroids is not drawn from the distribution I'm talking about.

Going back to the simplified example: With the uniform prior on the box, our probability means that 3% of the volume of this box would lead to an impact if an asteroid was centered at a point in that volume at this time of measurement.

It doesn't say anything about hypothetical realizations of this asteroid (it is not clear what this would be sampled from or what it means in a precise sense to repeat a 1 time event) and says even less about the sample of (nearly) independent asteroids observed in the past. The probability measure only describes the measurement uncertainty on properties of this particular asteroid. It is not conditioned on or related to statistics on impacts of "general asteroids".

But "presumably you need multiple measurements to predict a trajectory" and your notes about independence and uniformity being bad assumptions are absolutely correct, though. I agree 100%.

My comment above is mostly an attempt to make a simple example to clarify what the probability measure being measured here is. It's not a physically realistic example :) and definitely doesn't make good assumptions about what information is needed and what error distributions that information would have! I don't do space and didn't want to make guesses

Calibration here would have to be over multiple measurements of the same asteroid (which my example doesn't touch on). Likely by predicting trajectories at different intervals and matching the likelihood of later observations.

Verifying multiple observations leading up to a one-time event is very different from, say, verifying simulations of an internal combustion engine design, where measurements of a real-world prototype can be conducted repeatedly and independently to learn/calibrate fundamental properties or initial conditions like chemical kinetic coefficients and such.

For general interest/lectures/fun, the general field that studies how to push uncertainties forwards/backwards/calibrate a mathematical model and simulation is called "Uncertainty Quantification". Also not an expert lol, I was just surrounded by a bunch in my cohort


> Unless one in every 33 asteroids that have 3% impact probability at some point in time actually impacts earth

There would be a ~63.4% chance that at least one would hit us if there were 33 such asteroids. To compute this, take 1-(0.97^33). I agree with your broader point though.
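The arithmetic behind that figure, spelled out:

```python
# Probability that at least one of 33 independent asteroids, each with a 3%
# impact probability, actually hits: the complement of all 33 missing.
p_miss_all = 0.97 ** 33
p_at_least_one = 1 - p_miss_all
print(round(p_at_least_one, 3))  # 0.634
```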


That's because Earth has gravity: an asteroid that comes close enough can be deflected onto the planet even if right now it seems to be on a trajectory to miss it entirely. The closer it gets and the lower the relative speed, the larger the chance of a collision, and that's not a linear relationship. Within a certain boundary, impact is certain; then the question is when the impact will happen, and how precise the observations up to that point are, so you can figure out where and when exactly it will come down. You won't get that precision very long before the impact itself, even if you could say some time in advance roughly in which hemisphere and roughly when, which isn't precise enough to be very useful.
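The nonlinear relationship between approach speed and collision chance can be quantified with the standard gravitational-focusing formula: an object approaching with speed v_inf (far from Earth) hits if its impact parameter satisfies b < R·sqrt(1 + (v_esc/v_inf)²), where v_esc ≈ 11.2 km/s is Earth's escape velocity. The approach speeds below are just illustrative choices.

```python
import math

# Gravitational focusing: Earth's gravity bends nearby trajectories inward,
# so the effective collision cross-section exceeds Earth's geometric size,
# and the effect grows sharply as the approach speed drops.

V_ESC_KM_S = 11.2  # Earth's escape velocity

def effective_radius_factor(v_inf_km_s: float) -> float:
    """Ratio of the effective capture radius to Earth's physical radius."""
    return math.sqrt(1.0 + (V_ESC_KM_S / v_inf_km_s) ** 2)

for v in (5.0, 10.0, 20.0):
    print(f"v_inf = {v:5.1f} km/s -> "
          f"effective radius = {effective_radius_factor(v):.2f} x Earth radius")
```

A slow 5 km/s approach more than doubles the effective target size, while a fast 20 km/s approach barely changes it, which is why "coming in slowly" matters so much.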


I wouldn't expect Earth's gravity to affect it enough to cause it to crash unless it was moving very slowly, but I'm not sure asteroids ever move that slowly?


We're not talking about the asteroid stopping with a screech of tires and then taking a hard left turn to crash into earth.

It's just that anything traveling through the earth/moon gravitational sphere of influence will have its trajectory tweaked just a bit. How close to the center of gravity the pass is determines exactly how much of a tweak. There is a small section of space, we'll call it the keyhole, where if the asteroid happens to pass exactly through that area the tweak will result in a collision next time the asteroid comes around. That could be decades hence.

There could even be a case where an unlucky keyhole pass this time sets up another unlucky keyhole pass the next time, leading to an eventual collision in the distant future.

The technology to nudge the asteroid just far enough to miss a critical keyhole pass is within the realm of possibility with today's knowhow. We just need to have these missions ready to go on short (order of a few months to a year) notice.


We see big ones with a few days to hours of notice, sometimes we see them when they hit.

Most likely: this will never come up.

Less likely: if it does we're fucked.

Even less likely: if it does and surprisingly we see it in time we will act for the good of all and not bicker about who pays and we'll make things better rather than worse. If not, see above.


Like the moon is moving slowly? About 1 km/second for an object that weighs ~10^20 tons at a distance of ~380,000 km?

Now think of what that kind of force would do on a much lighter object that moves faster.


I'm not sure I understand your point... The object mass does not impact its trajectory (unless it either touches our atmosphere or is so massive as to measurably change earth's orbit). The gravitational force earth exerts on the moon and some asteroid is also very different, because the force is proportional to both object masses.


Think 'gravitational slingshot' but without missing the planet. The object will change direction and accelerate toward us. It could go from grazing the atmosphere, or even from a clean miss, to an impact.


Imagine you see a car 1 mile away as you're preparing to cross the street. 1 sec later, it's a bit closer. You wonder "will this car hit me?". It's hard to say since the car is so far away and your measurements of its speed are so poor.

You wait 5 sec and it's still only imperceptibly closer. You realize there is no way it could possibly hit you. You cross the street unconcerned.


That makes perfect sense. Where it breaks down is when you put percentages on it. If you say the car has a 3% chance of hitting you, it doesn't, and you repeat the process a thousand times and it never hits you, then something is wrong with your math.


I wonder if it's the difference between "this asteroid" and "all asteroids". As we learn more about it, we can start to treat it like a process that has repeated, but initially we can't be sure if it's like other asteroids.

Consider a roll of a 6-sided die. What is the chance it will come up 1?

A person might think, "1 in 6". But what if this is a loaded die? In that case, we need more information before we can classify it as "a die like other dice". We can observe two rolls, and try to ascertain whether or not it is like other fair 6-sided dice; however, two rolls is not enough to be sure.

So as we're gathering data, we start to classify this instance of a thing (a die, an asteroid) as part of a series of things we already know about. The more rolls we observe, the more sure we can be that this is a fair die or a loaded die, for instance.
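One way to make the die-updating idea concrete is a Beta-Binomial sketch of the probability that the die shows a 1. The prior Beta(1, 5) has mean 1/6 (consistent with a fair die), and the roll counts below are hypothetical:

```python
# Minimal sketch of "is this die like other dice?" as Bayesian updating.
# Prior Beta(1, 5) has mean 1/6; observed counts shift the posterior mean.

def posterior_mean_p_one(ones: int, others: int,
                         prior_a: float = 1.0, prior_b: float = 5.0) -> float:
    """Posterior mean of P(roll == 1) under a Beta prior, after observing rolls."""
    return (prior_a + ones) / (prior_a + prior_b + ones + others)

print(posterior_mean_p_one(2, 0))   # two 1s in two rolls: mean rises to 3/8
print(posterior_mean_p_one(20, 0))  # twenty straight 1s: mean ≈ 0.81, likely loaded
print(posterior_mean_p_one(3, 15))  # 3 ones in 18 rolls: stays at the fair 1/6
```

Two suspicious rolls nudge the estimate but prove nothing; it takes many observations before the posterior confidently separates "fair" from "loaded", which mirrors the few-observations problem with a freshly discovered asteroid.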

If I'm understanding how asteroids' trajectories are calculated, we can simulate THIS asteroid's trajectory (3% chance of hitting you, based on a little data), or we can just decide to classify it (perhaps prematurely?) in the series "an asteroid like every other asteroid that we've observed" and arrive at a 0.000001% chance of hitting you (I'm making up a number here).


I think you're right. The 3% number must be ignoring repeat sampling bias. This is basically the same issue as p-hacking or false positives in medical testing.

You have one confidence margin for a single measurement and a different confidence margin if you make 1 million measurements.

Let's say you can measure marble diameters and your tool has a calibrated standard deviation of 1 mm.

If you pull one marble and measure it to be 10 mm larger than expected, you can calculate the chance you are wrong using only the standard deviation of your measurement tool.

However, if you pull 1 million marbles and measure one to be 10 mm larger than expected, you need to take into account the number of marbles you have measured.
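The marble point can be quantified with the normal tail probability. (Using a 3 mm, i.e. 3-sigma, threshold here instead of the 10 mm in the example, so the single-measurement probability isn't vanishingly small; the look-elsewhere logic is the same.)

```python
import math

# A single marble 3 sigma over nominal is rare; among a million marbles,
# such readings are fully expected.

def normal_tail(z: float) -> float:
    """P(measurement exceeds nominal by more than z standard deviations)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 3.0
n_marbles = 1_000_000

p_single = normal_tail(z)
expected_outliers = n_marbles * p_single
p_at_least_one = 1 - (1 - p_single) ** n_marbles

print(f"one marble:      P(> {z} sigma) = {p_single:.5f}")
print(f"million marbles: expect {expected_outliers:.0f} such readings; "
      f"P(at least one) = {p_at_least_one:.6f}")
```

A roughly 1-in-740 event per marble becomes over a thousand expected "outliers" across a million marbles, so a single extreme reading in a large batch carries much less evidence than the same reading in isolation.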


The uncertainty is epistemic, not aleatoric. The percentage represents our knowledge about the system at the time of measurement propagated through the forward model; it is not an inherent random process in the system/model itself.


If your model is consistently wrong in a statistically predictable way, either your measurement or model is inaccurate.

A 3% chance that never occurs is an inaccurate prediction.


Right! Yes absolutely!

It's wrong because the measurements are suggestive of possibility, rather than certain of it.

If we observe an asteroid that with two poor measurements is determined to be headed away from Earth, that's the end. Look no further.

If we observe an asteroid with two poor measurements that has some significant chance of hitting, more and better measurements are made. Then very often those better measurements show it was never actually going to hit anyhow.

But we never would have known without the better measurements, and we never would have devoted more time to making better measurements without a reason to do so.

A 3% chance that never occurs is because that 3% is based on data that's at the limit of what the telescopes can provide, not based upon bad math.


Then what does 3% mean? Surely it means "given the data we have, one in every 33 will hit". Since that empirically doesn't happen, it must be that "the data we have" has a very low prior probability of being real. In other words, the measurement noise seems distributed in a way that over-represents unlikely trajectories.

Hence it seems that it would lead to more accurate predictions if the measurements and their uncertainties were fitted to a model that corrects for the prior probability of observing an asteroid on a given trajectory/making a certain observation.

This discrepancy between distribution of measurement error vs distribution of actual trajectories is what people are wondering about, because it seems interesting to know more about (e.g. "why are certain trajectories less likely?").


Despite all the people coming out of the woodwork with weird theories, my best theory is that the 3% number doesn't take into account the entire measurement and sampling process.

It's similar to p-hacking.


I don't think you understand how this works at all. You might read up on this here if you want to learn more. https://astronomy.stackexchange.com/questions/8450/how-is-th...

If you just want to argue with people, feel free. But based on how this conversation has been going it doesn't seem like you want to learn.


Setting your condescension aside, I browsed the thread.

I understand that calculating trajectories is difficult.

If someone claims something like a 3% impact probability, and they are wrong 99.999% of the time, that speaks to a methodological error in how the numbers are conveyed and/or defined.

I work in medical devices and testing. I perform tests that estimate, statistically, what percentage of patients will die. You may undergo treatment with a medical device that I have worked on.


Calculating trajectories is easy. Getting good data points is hard. Two pictures using a telescope on back-to-back nights is probably the smallest reasonable sample one could get. Take another picture the 3rd night and you've just doubled the size of the arc.

Wait a week and get another sample and your arc is now approx 5x as long. Wait a month and get another and now your arc is 30x as long as the original. More observations shrink your error bars.

There are systemic errors here for sure. Two kinds, really:

1. Limits of resolution of telescopes

2. Short sample lengths

You absolutely can't do anything about error type 1. You can fix 2 by getting more data. But there's no point in getting data on asteroids that have absolutely no possibility of hitting. So only asteroids that have some probability with limited measurements get enough better measurements that are high quality in order to find out where they're really headed.

All of these measurements of trajectories are completely uncorrelated, so you can't use the priors to adjust probabilities. I mean you can do whatever you want, but we haven't been hit by a big asteroid yet since we've had telescopes and tracking databases.

If we made adjustments based on priors we'd have to discount all collisions down to 0 irrespective of the trajectories. Seems absurd, so there must be something else going on here.


This is a statistics problem, not a measurement problem. The problem is that there are different well-understood formulas that must be applied depending on whether a measurement is taken of a single sample in isolation, or is one measurement of many.

To illustrate the point, imagine a pass/fail HIV test with a 99% true-positive rate and a 1% false-positive rate. If you test one patient only and they come up positive, you can conclude that result is 99% likely to be correct. However, if you test a hundred different people and one of them comes up positive, you can no longer claim 99% certainty for that patient. You administered a hundred different tests to different people and would have to reduce your confidence accordingly, because you expect about one false positive. This second statistical approach is what is not happening with the asteroids, and why asteroids with a 3% chance of hitting Earth suspiciously get revised down to zero more than 97% of the time.
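The screening scenario can be checked with a quick simulation: 100 people, none actually positive, each test with a 1% false-positive rate. How often does a batch contain at least one (false) positive?

```python
import random

# Simulate batches of 100 tests on uninfected people with a 1% false-positive
# rate, and measure how often a batch reports at least one positive.

def batch_has_false_positive(rng: random.Random,
                             n_people: int = 100,
                             fp_rate: float = 0.01) -> bool:
    return any(rng.random() < fp_rate for _ in range(n_people))

rng = random.Random(42)
trials = 20_000
frac = sum(batch_has_false_positive(rng) for _ in range(trials)) / trials
print(f"fraction of batches with >= 1 false positive: {frac:.3f}")
```

Analytically this is 1 - 0.99^100 ≈ 0.634: even a very accurate test produces a positive in most 100-person batches, which is the multiple-testing effect being described.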

>If we made adjustments based on priors we'd have to discount all collisions down to 0 irrespective of the trajectories. Seems absurd, so there must be something else going on here

Not quite true. If you measure a million asteroids and the data from one says it has a trajectory towards Earth, you need to discount that observation by the fact that you made 1 million different measurements. The probability for the outlier might still be close to zero statistically, but it did produce outlier data. This would be a reason to remeasure the asteroid multiple times. It is only through that process that the number will climb from zero, or stay at zero.

It's not that you're applying the prior that we have never observed Earth colliding asteroid. You're simply accounting for the fact that with the error bars on your measurement system, you expect one false positive in 1 million measurements.

My inference is that the 3% number we are talking about for this specific asteroid was not calculated using the proper statistical treatment, and that's why it wasn't published in the first place.

This is also why it is similar to p-hacking. If you run 20 experiments and analyze each one as if it were the only experiment you did, you can expect one of them to report a wrong result at 95% confidence, which is the common threshold for publishing outside of physics.


That conclusion may be too early to reach with confidence, based on the limited data!


You're standing in a four-lane road and see a car approaching. You're looking at an angle and the lanes are poorly marked, so you can't tell which one it's in. Your observation lets you estimate the chance you need to move at 25%.

When it gets a little closer, you can tell at least which half it's on, the left or the right. Now your estimate is either 0% or 50%.

Closer still and you tell which lane it's in, so now you're sure.
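The lane example is just a sequence of conditional-probability updates as the set of possible lanes shrinks. A minimal sketch, with the four lanes and observer position made up to mirror the comment:

```python
# Four equally likely lanes; you stand in one of them. Each observation rules
# out lanes, and the hit probability is recomputed over what remains.

lanes = {"far_left", "left", "right", "far_right"}
my_lane = "left"

def p_hit(possible_lanes: set) -> float:
    """P(car is in my lane), given the lanes it could still be in."""
    return (my_lane in possible_lanes) / len(possible_lanes)

print(p_hit(lanes))                 # far away: 1/4 = 0.25
print(p_hit({"far_left", "left"}))  # can tell it's on the left half: 0.5
print(p_hit({"far_left"}))          # can tell the exact lane: 0.0 in this case
```

Each better observation pushes the estimate toward 0 or 1; the intermediate percentages only ever described your ignorance, not randomness in the car.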


Again, that makes perfect sense.

What wouldn't make sense is if you repeat this 1000 times and a car is never in your lane.

That means that something is wrong about how you are modeling the road and cars.

The claim people are confused by is that asteroids with a 3% chance of hitting get the chance revised to 0% more than 97% of the time.


3% seems much higher, though. If I crossed the street at 3% odds, I'd probably be dead by now. Cars may not be a great analogy, because they swerve, but 3% is quite high. Space is pretty damn big, so the odds of being hit by space things are really low. But unlike cars, space stuff tends to swerve towards the larger bodies.


> But unlike cars, space stuff tend to swerve towards the larger bodies.

That's exactly it. And at the speeds these objects are going, and given the uncertainty of the observations, you would have to observe an object for a really long time to get the kind of accuracy required to pick a mitigation method that would work. And even then, assuming you could nail the point of impact of something going 20 km/second, of unknown mass, in a strong gravity field: given the COVID response, I have a hard time believing that the response to 'Houston, Texas is going to be obliterated on June 1st 2024' would be met with anything but skepticism and laughter. Right up to the moment of impact.


One explanation would be the Anthropic Principle. In 3% of universes you were killed today, you're just not living in one of those.


In 97% of the universes dinosaur descendants rule planet earth. But on this one they got unlucky.


This only works if there is nothing between "no impact" and "you die the same day as the impact". But we know that's not the case.


Why "same day" and not same week or month? If it's not the same instant, then you're hypothesizing some kind of back-propagation (where alternate futures in which you die influence the likelihood of current events)[0]. Under that hypothesis, it would only matter whether some event would cause you to die sooner than you otherwise would.

[0] https://arxiv.org/abs/0707.1919

[edit] FWIW, I actually corresponded with one of the authors of this paper back in 2007, and from what I could tell, this wasn't an attempt at parody, although now it might be dismissed as one. Personally I'm not willing to declare my (non)commitment to the theory either way.


Most likely the estimation is conservative.

In many situations, erring on one side results in worse outcomes than erring on the other side. In our case, a false positive has pretty much zero consequences, while a false negative could wipe out the dinosaurs.

