
Setting your condescension aside, I browsed the thread.

I understand that calculating trajectories is difficult.

If someone claims something like a 3% impact probability, and claims like that turn out to be wrong 99.999% of the time, that points to a methodological error in how the numbers are conveyed and/or defined.

I work in medical devices and testing. I run tests where statistical calculations determine things like what percentage of patients will die. You may one day undergo treatment with a medical device that I have worked on.



Calculating trajectories is easy. Getting good data points is hard. Two pictures from a telescope on back-to-back nights is probably the smallest reasonable sample one could get. Take another picture on the third night and you've just doubled the length of the arc.

Wait a week and get another sample and your arc is now approx 5x as long. Wait a month and get another and now your arc is 30x as long as the original. More observations shrink your error bars.
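
As a rough sketch of why a longer arc helps (a toy 1-D Python model with made-up noise numbers, not real astrometry): fit a line to noisy nightly positions and extrapolate a year out.

    import numpy as np

    rng = np.random.default_rng(0)

    def prediction_spread(arc_days, n_trials=1000):
        """Fit a line to one noisy position per night over `arc_days` days,
        extrapolate 365 days ahead, and report the spread of predictions."""
        true_slope, noise = 1.0, 0.01          # toy trajectory and per-night error
        t = np.arange(arc_days + 1)            # observation times (nights)
        preds = []
        for _ in range(n_trials):
            y = true_slope * t + rng.normal(0, noise, t.size)
            slope, intercept = np.polyfit(t, y, 1)
            preds.append(slope * 365.0 + intercept)
        return np.std(preds)

    for arc in (1, 2, 7, 30):                  # two nights, three nights, a week, a month
        print(f"{arc:>2}-day arc -> 1-year prediction spread ~ {prediction_spread(arc):.3f}")

The spread drops quickly as the arc grows: same telescope, same per-night error, much smaller error bars on where the thing is actually headed.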

There are systematic errors here for sure. Two kinds, really:

1. Limits of resolution of telescopes
2. Short sample lengths

You absolutely can't do anything about error type 1. You can fix 2 by getting more data. But there's no point in getting data on asteroids that have absolutely no possibility of hitting. So only the asteroids that show some impact probability from the limited initial measurements get the additional, higher-quality measurements needed to find out where they're really headed.

All of these trajectory measurements are uncorrelated with one another, so you can't use priors to adjust the probabilities. I mean, you can do whatever you want, but we haven't been hit by a big asteroid in the time we've had telescopes and tracking databases.

If we made adjustments based on priors, we'd have to discount all collisions down to zero irrespective of the trajectories. Seems absurd, so there must be something else going on here.


This is a statistics problem, not a measurement problem. There are different, well-understood formulas that must be applied depending on whether a measurement is taken of a single sample in isolation or is one measurement among many.

To illustrate the point, imagine a pass/fail AIDS test with 99% accuracy and a 1% false-positive rate. If you test only one patient and they come up positive, you can conclude the result is 99% likely to be correct. However, if you test a hundred different people and one of them comes up positive, you can no longer claim 99% certainty for that patient. You administered a hundred different tests to a hundred different people, and you have to reduce your confidence accordingly, because you expect roughly one false positive in that batch. This second statistical treatment is what is not happening with the asteroids, and it's why asteroids with a 3% chance of hitting Earth suspiciously get revised down to zero more than 97% of the time.
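
To put rough numbers on that (Python, with an assumed prevalence of about 1 in 100 among the people tested, purely for illustration):

    # Toy numbers: 99% sensitive test, 1% false-positive rate, 100 people tested.
    sensitivity = 0.99
    false_positive_rate = 0.01
    n_tested = 100
    prevalence = 0.01   # assumption: ~1 genuinely positive person per 100 tested

    expected_true_pos = n_tested * prevalence * sensitivity
    expected_false_pos = n_tested * (1 - prevalence) * false_positive_rate

    # If one result in the batch comes back positive, how likely is it real?
    p_real = expected_true_pos / (expected_true_pos + expected_false_pos)
    print(f"expected true positives:  {expected_true_pos:.2f}")   # ~1.0
    print(f"expected false positives: {expected_false_pos:.2f}")  # ~1.0
    print(f"chance a given positive is real: {p_real:.2f}")       # ~0.50, not 0.99

With those assumed numbers, the single positive in the batch is closer to a coin flip than to 99% certainty.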

>If we made adjustments based on priors we'd have to discount all collisions down to 0 irrespective of the trajectories. Seems absurd, so there must be something else going on here

Not quite true. If you measure a million asteroids and the data from one of them says it has a trajectory towards Earth, you need to discount that observation by the fact that you made a million different measurements. The probability for that outlier might still be close to zero statistically, but it did produce outlier data. That would be a reason to remeasure that asteroid multiple times. It is only through that process that the number will climb away from zero, or stay at zero.

It's not that you're applying the prior that we have never observed an Earth-colliding asteroid. You're simply accounting for the fact that, given the error bars on your measurement system, you expect about one false positive in a million measurements.
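
Sketching that in Python (the 1-in-a-million false-alarm rate is the figure from above; treating repeat measurements as independent is an assumption):

    n_asteroids = 1_000_000
    false_alarm_rate = 1e-6   # assumed per-measurement chance of a spurious "hit" flag

    # Survey a million asteroids once each: expect about one spurious flag.
    print("expected spurious flags in the survey:", n_asteroids * false_alarm_rate)

    # Remeasure the flagged asteroid. If the flag was pure noise, the chance that
    # k independent follow-ups also flag it falls off as false_alarm_rate ** k.
    for k in (1, 2, 3):
        print(f"chance a noise-only flag survives {k} follow-up(s): {false_alarm_rate ** k:.0e}")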

My inference is that the 3% number we are talking about for this specific asteroid was not calculated using the proper statistical treatment, and that's why it wasn't published in the first place.

This is also why it is similar to p-hacking. If you run 20 experiments and analyze each as if it were the only experiment you did, you should expect roughly one of them to report a spurious result at 95% confidence, which is the common threshold for publishing outside of physics.
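
The arithmetic behind the 20-experiment example (a standard multiple-comparisons calculation, nothing specific to asteroids):

    alpha = 0.05        # per-experiment false-positive rate at 95% confidence
    n_experiments = 20

    # Chance that at least one of 20 independent null experiments looks "significant"
    p_any_spurious = 1 - (1 - alpha) ** n_experiments
    print(f"P(at least one spurious 95%-confidence result in {n_experiments} runs): "
          f"{p_any_spurious:.2f}")   # ~0.64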



