Thursday, 29 May 2008

A lesson in how to get a false positive result

Back to Rustum Roy and his ongoing attempt to show that water can have a 'healing' structure (whatever that means) imposed on it by the power of human intention. As Le Canard Noir said, in the comments to a previous post, this is car crash science. Although I know I should look away, I can't quite manage to do it.

Previously, Roy and his team tried to show that the Raman spectrum of a beaker of water could be changed merely by people all over the world thinking about it changing. Despite the horrible design of the experiment, the results were inconclusive. What Roy needs is a way of increasing the chances of getting a positive result. Another experiment is now planned, details of which are beginning to emerge at the Intention Experiment blog:

As you remember, our experiment isn’t conclusive – largely because we’re only looking at one parameter to see if it has changes.

This is a little like looking at an elephant from one side. If you look from the front, you will mainly see a trunk. Look from bottom, and you only see a giant mass hovering over you like a dark grey cloud.

Rainen’s new equipment consists of three separate devices that examine, respectively, the light scattering, the thermal expansion and any infrared changes in a sample of water. Once these measurements are taken, they are sent into a computer, and from this handful of data points, the computer can determine some 1000 parameters of the sample.

“This equipment represents a revolution in characterizing water,” says Roy.

From this, it sounds as though Roy and his team are going to be comparing 1,000 variables. This raises the issue of multiple comparisons. If you compare 1,000 variables between two populations, using a hypothesis test at the 5% level, you would expect to get 'positive' results for about 50 of the variables on average (1,000 × 0.05), even if there was no real difference between the two populations. Statisticians apply corrections to account for this effect. Since Roy's team have previously published a paper that purported to show differences between graphs without applying any statistical analysis at all, it's not certain that this will be done. Another issue is that these 1,000 parameters are derived from 'a handful' of measurements, so they presumably cannot be independent parameters. It seems that false positive results are much more likely from the new experiment. Result!
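To put rough numbers on the multiple comparisons problem, here is a minimal sketch in Python. It is an illustration only, not anything Roy's team has described: the function names are made up for this post, and it assumes 1,000 independent tests with no real effect anywhere, so that each p-value is uniformly distributed between 0 and 1. The Bonferroni threshold is included as one example of the kind of correction statisticians apply.

```python
import random


def count_false_positives(n_tests=1000, alpha=0.05, seed=0):
    """Simulate n_tests hypothesis tests where the null is true for all of them.

    Assumption: under the null, each p-value is uniform on [0, 1).
    Returns how many tests come out 'significant' purely by chance.
    """
    rng = random.Random(seed)
    p_values = [rng.random() for _ in range(n_tests)]
    return sum(p < alpha for p in p_values)


def familywise_error_rate(n_tests=1000, alpha=0.05):
    """Chance of at least one false positive, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests=1000, alpha=0.05):
    """Per-test significance level under the Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    print("Expected false positives:", 1000 * 0.05)
    print("Simulated false positives:", count_false_positives())
    print("Chance of at least one:", familywise_error_rate())
    print("Bonferroni per-test level:", bonferroni_threshold())
```

The independence assumption matters for the "at least one" figure but not for the expected count: the average of 50 false positives holds however the parameters are correlated, although correlated parameters would tend to turn up false positives in clusters rather than evenly spread.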


HolfordWatch said...

I, for one, look forward to the details of how they shall calibrate and benchmark "Rainen’s new equipment [which] consists of three separate devices" so that there is a fair evaluation of both the samples.

Any word on whether they have solved the time-lag problem for intention or are they advising the faithful to consult a world-time converter website?

Derrik said...

Nice post

You’re right, of course, that there are ways to correct for the multiple testing problem, but they don’t need to do anything so sophisticated. Given how easy the experiment is to repeat, it might be better to do a validation instead. They can measure their thousand parameters, identify the most significant ones (say, 5), then repeat the experiment recording only those. If they remain significant, it would be fair to say it wasn’t by chance alone.

Of course they won’t do that; this will be a one-experiment, one-inference, once-only kind of thing.
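The screen-then-validate idea above can be sketched in Python as follows. This is a toy simulation under stated assumptions, not a description of any real analysis: the function names are hypothetical, there is no real effect in the simulated data, and the second run is treated as independent of the first, with p-values uniform on [0, 1).

```python
import random


def screen_then_validate(n_params=1000, n_keep=5, alpha=0.05, seed=1):
    """Pick the n_keep most significant parameters, then re-test only those.

    Assumption: the null is true everywhere, so every p-value is just a
    uniform random number, and the second run is independent of the first.
    Returns (shortlisted parameter indices, indices still significant).
    """
    rng = random.Random(seed)
    # First run: one p-value per parameter, paired with its index.
    first_run = [(rng.random(), i) for i in range(n_params)]
    # Keep the parameters with the smallest first-run p-values.
    shortlisted = [i for _, i in sorted(first_run)[:n_keep]]
    # Second run: fresh p-values for the shortlisted parameters only.
    second_run = {i: rng.random() for i in shortlisted}
    survivors = [i for i in shortlisted if second_run[i] < alpha]
    return shortlisted, survivors


def chance_all_survive_under_null(n_keep=5, alpha=0.05):
    """Chance that every shortlisted parameter replicates by luck alone,
    assuming the repeat tests are independent of one another."""
    return alpha ** n_keep


if __name__ == "__main__":
    shortlisted, survivors = screen_then_validate()
    print("Shortlisted:", shortlisted)
    print("Still significant on repeat:", survivors)
    print("Chance all replicate by luck:", chance_all_survive_under_null())
```

Requiring all five to replicate is a strict standard. If only one of the five had to replicate, the chance of that happening by luck would be 1 - 0.95^5, or roughly 23%, so the repeat would still need a correction of its own.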

Anonymous said...

Re "multiple comparisons ... 1,000 variables ... hypothesis test at the 5% level" -
I got quite excited by this mention of multiple comparisons. AP Gaylard referred to the Bonferroni correction in his post on 'The Rudolph Effect' last week. I assume Rustum Roy is aware of methods of alpha adjustment for multiple comparisons and hope he will use one of these methods. I doubt it though. [/cynical]

Paul Wilson said...

Ah yes, the excitement of statistics. I'm certainly not any kind of expert on statistics, but I do know that multiple comparisons are an issue.

I assume that Bonferroni would be too conservative, given that the 1,000 parameters we're talking about are probably not independent? What are the chances of there being any discussion of this once the results are in?

So far, the intention experiment blog hasn't been particularly strong on showing actual results. So I'm not sure if there will be any way of knowing what Roy has actually done with the data.

mugsandmoney said...

I've worked out what's wrong with the experiment - they shouldn't be using water, a nice cup of tea would be much better.

Anonymous said...

An experiment with enough in-built obfuscation to allow the hypothesis to live for another face-saving day.