Error in scientific testing, and how it applies to NASA's recent publication on their seemingly-impossible, physics-breaking 'EM-Drive'

There's basically two types of error: statistical and systematic.

Statistical errors are caused by random fluctuations. You can never make them go away, but you can reduce them by taking more and more measurements, and you can manage them by optimizing your experiment. For example, you can use the concept of the Fisher information and the Cramér-Rao bound to figure out the best-case scenario for your statistical uncertainty before you even run your experiment. That tells you how to design your measurements to get the smallest statistical uncertainty on the parameter(s) you're trying to measure. Eagleworks at least attempted to handle their statistical errors: they made a little table with the uncertainties from the specs of all their instruments, and added the relative errors in quadrature (standard error propagation).
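To make "adding relative errors in quadrature" concrete, here's a minimal sketch. The instrument names and percentages below are hypothetical placeholders, not Eagleworks' actual numbers; the point is just the standard rule that for a product or quotient of independent quantities, relative uncertainties add in quadrature:

```python
import math

# Hypothetical relative uncertainties from instrument spec sheets
# (illustrative values, NOT the actual Eagleworks figures).
relative_errors = {
    "displacement_sensor": 0.02,   # 2%
    "calibration_force":   0.015,  # 1.5%
    "rf_power_meter":      0.03,   # 3%
}

# Standard propagation for independent multiplicative error sources:
# total relative error = sqrt(sum of squared relative errors).
total_relative = math.sqrt(sum(r ** 2 for r in relative_errors.values()))
print(f"combined relative uncertainty: {total_relative:.4f}")  # about 3.9%
```

Note that the combined error is dominated by the largest single contribution, which is why quadrature sums reward fixing your worst instrument first.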

But then there's the entire other kind of uncertainty, which is not accounted for at all. They wrote a few little paragraphs about possible sources of systematic errors, but they didn't quantify any of them. And there are ways they could have tried to control for them, but they didn't do so. Anyway, so statistical errors are random fluctuations about the mean of your estimator (which is hopefully the true value, if the estimator is unbiased and you have negligible systematic uncertainty). But systematic errors are an offset, or bias, away from the true value. Think of it this way: if you could completely eliminate all statistical error (clearly impossible), the systematic error of your measurement is how far off your estimator is from the true value.

If the thrust of the drive is exactly zero, but some kind of systematic effect makes you measure a constant 1 uN (consider that to be the mean value of your estimator after infinitely many trials, so we can ignore statistical uncertainty), then your data has a bias of 1 uN.
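A quick simulation makes the distinction vivid. The numbers here are made up for illustration: true thrust of zero, a 1 uN systematic offset, and 0.5 uN of per-measurement noise. Averaging beats the statistical noise down like sigma/sqrt(N), but no amount of averaging touches the bias:

```python
import random

random.seed(0)

TRUE_THRUST = 0.0    # uN: the drive actually produces nothing
BIAS = 1.0           # uN: hypothetical systematic offset (e.g. thermal drift)
NOISE_SIGMA = 0.5    # uN: statistical fluctuation on each measurement

n = 100_000
measurements = [TRUE_THRUST + BIAS + random.gauss(0, NOISE_SIGMA)
                for _ in range(n)]
mean = sum(measurements) / n

# The sample mean converges to BIAS, not to TRUE_THRUST:
# statistical error shrinks with n, systematic error does not.
print(f"mean of {n} measurements: {mean:.3f} uN")
```

With 100,000 samples the statistical uncertainty on the mean is about 0.0016 uN, so the result sits at essentially exactly 1 uN, a confident-looking measurement of a thrust that isn't there.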

If you know exactly what your bias is, you can subtract it off. Or if you know exactly what's causing it, you can try to fix it and run the experiment again. But both of these things are generally very hard to do.

The ideal situation for an experimental physicist is to estimate some theoretical parameter with minimum statistical uncertainty, in a way that the mean squared error is dominated by statistical uncertainty. That is, design your experiment such that systematic errors are negligible and statistical errors are manageable.
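The "mean squared error" framing above has a clean decomposition: MSE = variance + bias². The following sketch (with arbitrary illustrative parameters) verifies the identity numerically and shows why a small-looking error bar can hide a dominant systematic term:

```python
import random

random.seed(1)

TRUE_VALUE = 0.0   # what we're trying to measure
BIAS = 0.2         # hypothetical systematic offset
SIGMA = 0.5        # statistical spread per measurement

n = 200_000
samples = [TRUE_VALUE + BIAS + random.gauss(0, SIGMA) for _ in range(n)]

mean = sum(samples) / n
variance = sum((x - mean) ** 2 for x in samples) / n
mse = sum((x - TRUE_VALUE) ** 2 for x in samples) / n

# Exact algebraic identity: MSE = variance + (bias of the sample)^2
print(f"MSE      = {mse:.4f}")
print(f"var+bias^2 = {variance + (mean - TRUE_VALUE) ** 2:.4f}")
```

Designing the experiment so that systematics are negligible means making the bias² term small compared to the variance term, because only the variance term can then be reduced by taking more data.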

Unfortunately that's not the kind of experiment Eagleworks is doing. In fact their error could very well be dominated by systematics. If they more carefully considered their systematic errors, it could be that their error bars extend to below zero, meaning that the whole measurement is consistent with zero thrust: a null result.

So not handling your errors properly can literally mean the difference between "I see thrust" and "I don't see thrust", the entire purpose of the experiment. I hope that drives home how absolutely necessary it is, and I hope that clarifies why I said above that their number is meaningless without proper error bars.

Eagleworks is attempting to measure a very small quantity, very close to a physical boundary (since the magnitude of the thrust force can't be negative).

The standard statistical approach here, assuming they can't make a measurement whose full error bars are inconsistent with zero, would be to set an upper limit on the thrust the drive produces. This is where you'd use confidence intervals, and you'd say something like "with 95% confidence, the thrust is below 2 uN."

That way you're not guaranteeing that it's nonzero. Rather you're saying that if it's nonzero, it's likely less than 2 uN.
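As a sketch of how such a limit is computed in the simplest case: for a Gaussian measurement far enough from the boundary, a one-sided 95% upper limit is the central value plus 1.645 sigma. The central value and uncertainty below are hypothetical, not Eagleworks' numbers (and near a physical boundary physicists often use more careful constructions, like Feldman-Cousins intervals, instead of this naive recipe):

```python
# Naive one-sided 95% Gaussian upper limit.
Z_95 = 1.645           # one-sided 95% quantile of the standard normal
measured = 1.2         # uN: hypothetical central value
sigma_total = 0.9      # uN: hypothetical combined (stat + syst) uncertainty

upper_limit = measured + Z_95 * sigma_total
print(f"95% CL upper limit on thrust: {upper_limit:.2f} uN")
```

The key point: once systematics are honestly folded into sigma_total, the statement you can defend may shift from "we measured a thrust" to "the thrust, if any, is below some bound."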

It seems to me like EW is coming at this with a different mindset than physicists go at their experiments. It seems to me like Harold White's thought process is something along the lines of "I have a pet project, and I'm going to prove it works." Whereas a physicist would be thinking "I have an idea, I need to try to prove it wrong in any way I possibly can, and if it survives, it's worthy of being reported to the physics community."