Wednesday 19 November 2008

Science by press release (epic fail)

While I was messing about at the University of Google, doing some research for the thing that I've just sent to Homeopathy, I came across this [pdf]. Yes, the homeopaths (in the form of the International Homeopathic Medical League and the European Committee for Homeopathy) have put together a press release, based on the study in Homeopathy by Rutten and Stolper, and the study by Lüdtke and Rutten in the Journal of Clinical Epidemiology. Both studies criticise (mostly wrongly) a perfectly good meta-analysis of homeopathy from the Lancet. The headline is "New evidence for homeopathy". No question mark, no caveats. You might guess that the International Homeopathic Medical League and the European Committee for Homeopathy would not be entirely unbiased sources of information. In fact, these studies are very far from being evidence for homeopathy, as I've tried to point out (and not just me). I suppose two studies sounds better than one, but the truth is that these are essentially the same study, with extra nonsense in the part that got published in Homeopathy. The studies apply a post-hoc analysis to the data from the Lancet article, and claim to be able to produce a positive result for homeopathy if you squint at the data and stand on one leg. Then they accuse the Lancet paper of data dredging. What a laugh riot.

Unfortunately, lots of scientific studies get reported based on their press releases. This means that the people who read the reports get no sense of the flaws in the study and the caveats that should be applied to its conclusions; these come from independent scientific scrutiny of the study once it has been peer-reviewed and published. Pushing press releases on your research may be a good way to get brownie points from your university, but it's not usually a good way of fostering an improved understanding of the scientific process. To be fair, it's by no means only homeopaths and their ilk who do this, as Ben Goldacre has pointed out.

Still, the good thing about this is that a search on Google News for this release shows that, at the time of writing, it has only been picked up by a few woo-ish magazines: mainstream western news seems to have ignored this more or less completely. Maybe the press has got fed up of this particular manufactured controversy, at least for now.

I know I said life was too short...

I wrote a little about a paper by Rutten and Stolper recently published in the amusing pseudo-journal Homeopathy. The paper performed the usual homeopath party trick of throwing incorrect allegations of research misconduct at the Shang et al. meta-analysis of homeopathy that was published in the Lancet, while also engaging in dubious statistical analysis. I've now had a little time to put together something a bit more meaty, with proper references and everything, and send it off as a letter to the editor of Homeopathy. I reproduce the text below.

Rutten and Stolper [1] have conducted a re-analysis of the data used in the landmark Lancet meta-analysis (Shang et al.) [2] of trials of homeopathy and conventional medicine. However, their approach to this work seems to have been influenced by a belief that the Shang analysis was deliberately skewed against homeopathy, and in favour of conventional medicine. I argue here that the evidence does not support that contention, and that the re-analysis by Rutten and Stolper does not show that the Shang et al. study was invalid.

Rationale for the re-analysis

In the abstract of their paper, Rutten and Stolper state “There is a discrepancy between the outcome of a meta-analysis published in 1997 of 89 trials of homeopathy by Linde et al and an analysis of 110 trials by Shang et al published in 2005, these reached opposite conclusions”, and on page 170 they write “The contradiction between Linde's conclusion based on 89 trials, and Shang et al's conclusion, based on 110 trials seems odd”. But there is nothing particularly surprising about this discrepancy. The Linde paper referred to was published in the Lancet in 1997 [3]. The same team re-analysed the data in a paper published in 1999 [4]. They concluded that, because trials of higher methodological quality had smaller effect sizes and because a number of newly published high-quality trials showed negative results for homeopathy, their 1997 meta-analysis had over-estimated the effectiveness of homeopathy. Hence there is no reason to see the discrepancy between Shang et al. and Linde et al. (1997) as being particularly “odd”.

Trial quality

Rutten and Stolper make statements about the “pre-specified hypotheses” of the Shang et al. study, but these are not consistent through the paper. In the introduction, they state:

“The hypotheses predefined mentioned in the introduction of Shang et al's paper were: ‘Bias in conduct and reporting of trials is a possible explanation for positive findings of placebo-controlled trials of both homeopathy and allopathy (conventional medicine)’; and: ‘These biases are more likely to affect small than large studies; the smaller a study, the larger the treatment effect necessary for the results to be statistically significant, whereas large studies are more likely to be of high methodological quality and published even if their results are negative’.”

Yet, in Rutten and Stolper’s section on “Pre-specified hypotheses” they include “quality in homeopathy is worse than in conventional medicine” as a hypothesis of Shang et al., and say that this hypothesis was falsified in the Shang et al. study. This is a straw man: it is not a hypothesis that was discussed in the Shang et al. paper, and Rutten and Stolper have missed the point of including a matched set of trials of conventional medicine. As Rutten and Stolper state (p. 170) “Pooling of results is…questionable if homeopathy works for some conditions and not for others”. This is a reasonable point. However, it is clear that some experimental conventional treatments work and some do not. The results of the analysis of conventional medicine were not consistent with the placebo hypothesis, showing that it is possible to obtain a positive result using the methods of Shang et al., even if there is considerable heterogeneity in the results [5].

Post-hoc analysis?

Rutten and Stolper make the claim that the sub-sets of larger, higher quality studies were chosen post-hoc, presumably to make homeopathy appear less effective than it really is. In their paper, Rutten and Stolper state [p. 172-173]:

“Cut-off values for sample size were not mentioned or explained in Shang el al's [sic] analysis. Why were eight homeopathy trials compared with six conventional trials? Was this choice predefined or post-hoc? Post-publication data showed that cut-off values for larger higher quality studies differed between the two groups. In the homeopathy group the cut-off value was n = 98, including eight trials (38% of the higher quality trials). The cut-off value for larger conventional studies in this analysis was n = 146, including six trials (66% of the higher quality trials). These cut-off values were considerably above the median sample size of 65. There were 31 homeopathy trials larger than the homeopathy cut-off value and 24 conventional trials larger than the conventional cut-off value. We can think of no criterion that could be common to the two cut-off values. This suggests that this choice was post-hoc.”

The first thing to note is that it is not true that cut-off values for sample size were not mentioned or explained in the Shang et al. analysis. In the original Shang paper, on page 728, it is stated that “Trials with SE [standard error] in the lowest quartile were defined as larger trials”. In other words, the cut-off was not defined in terms of numbers of subjects, but in terms of standard error. It might be argued that this is a strange way of defining “larger” trials (and perhaps it should have been phrased as “lower standard error”). But it makes sense when criteria must be stated a priori. If a number of subjects were stated as a cut-off value, there would be no way of knowing how many studies would meet that criterion before looking at the data. You might find that a very large or very small number of studies met the criterion, making further analysis difficult. So, there is no mystery as to why the “cut-off values” were different between trials of homeopathy and trials of conventional medicine: it is because the distribution of standard errors is different between the two populations. This could be discovered simply by reading the original paper, and the conclusion that the groups were chosen post-hoc cannot be sustained.
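To make the point about quartiles concrete, here is a minimal sketch in Python. The trial sizes and standard errors are invented purely for illustration (they are not data from Shang et al.), and the function name and the choice of the "inclusive" quantile method are my own assumptions. The same pre-specified rule, "standard error in the lowest quartile", is applied to two sets of trials, and the smallest trial that qualifies as "larger" comes out differently in each set.

```python
import statistics

def larger_trials(trials):
    """Return the trials whose standard error lies in the lowest quartile.

    `trials` is a list of (n_subjects, standard_error) pairs.
    """
    ses = [se for _, se in trials]
    # First quartile cut point of the standard errors.
    q1 = statistics.quantiles(ses, n=4, method="inclusive")[0]
    return [(n, se) for n, se in trials if se <= q1]

def smallest_larger_trial(trials):
    """Smallest sample size among the trials classed as 'larger'."""
    return min(n for n, _ in larger_trials(trials))

# Invented numbers for illustration only.
homeopathy = [(30, 0.60), (45, 0.50), (60, 0.42), (80, 0.36),
              (98, 0.30), (120, 0.27), (150, 0.24), (200, 0.20)]
conventional = [(50, 0.48), (90, 0.35), (120, 0.30), (146, 0.26),
                (200, 0.22), (300, 0.18), (500, 0.14), (900, 0.10)]

for name, trials in [("homeopathy", homeopathy),
                     ("conventional", conventional)]:
    print(name, "smallest 'larger' trial: n =", smallest_larger_trial(trials))
```

The rule is identical for both sets and is fixed before looking at any outcome; the implied cut-off in numbers of subjects differs only because the two distributions of standard errors differ.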

A further point here is that the group of “larger” homeopathy trials contains smaller trials that would not have made the cut for “larger” trials in the conventional medicine group. Those smaller trials are more likely to show spurious positive results. It follows that if the authors had engineered the groups to get the result they wanted, they would have engineered them in favour of homeopathy.

Another paragraph in Rutten and Stolper states “We did not further investigate possible selection bias by excluding trials, but we were surprised by the exclusion of Wiesenauer's trial on chronic polyarthritis. This was a larger trial (n = 176), of good quality according to Linde, with positive results. This trial would have contributed positively to the outcome of the larger higher quality trials. Shang excluded this trial because no matching trial could be found” (page 171). Since the trial was excluded on the basis of the clearly stated, pre-specified exclusion criteria, what is surprising about it having been excluded? Including it would have made a nonsense of the design of the study and violated the pre-specified exclusion criteria, and would have been a gross error.

Another possible outcome?

Rutten and Stolper conduct a sensitivity analysis, but, as they note, the decisions they make in this analysis are highly subjective. They decide to exclude all trials of homeopathy for muscle soreness [6-9], on the grounds that “treatment of healthy individuals is very rare in homeopathic practice [and] this outcome has low external validity to judge the effect of homeopathy as a method” (page 173). Yet, the trials were conducted with the participation of prominent homeopaths, and some were published in homeopathic or alternative medicine journals [8, 9], so at least some homeopaths seem to be of the opinion that there is enough external validity for it to be worth conducting a trial. So how can the external validity of the trials be judged in a transparent way? In a meta-analysis based on clear, pre-specified criteria, there could be no justification for omitting the studies.

It is also notable that one of the authors was a co-author of another re-analysis published in the Journal of Clinical Epidemiology [10]. That analysis showed that if random-effects meta-analysis is used, it is possible to add smaller trials to Shang’s set of “larger, higher quality” trials of homeopathy, and get a statistically significant (although clinically unimpressive) benefit for homeopathy. All this really shows is that a finding in favour of homeopathy is not robust, and as Shang et al. showed, including smaller trials also decreases the reliability of the findings. The re-analysis also showed that the benefit for homeopathy was not statistically significant when a meta-regression analysis was used: this negative finding was strangely not mentioned in the Homeopathy paper. Because the results differed between meta-regression and random-effects analyses, and because Shang et al. showed highly significant evidence of funnel-plot asymmetry in their complete dataset of 110 trials of homeopathy, it is arguable that meta-regression analysis is a more appropriate choice.
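The reason the choice of method matters can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of either analysis: the effect sizes are invented, the function names are mine, and I have used the DerSimonian-Laird estimator as a common form of random-effects pooling without knowing that it is the one used in [10]. When small trials report larger effects than big ones, random-effects pooling spreads the weight more evenly across trials, so the small trials pull the pooled estimate further from zero than inverse-variance (fixed-effect) pooling does.

```python
import math

def fixed_effect(effects, ses):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in ses]
    estimate = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return estimate, math.sqrt(1 / sum(weights))

def random_effects_dl(effects, ses):
    """DerSimonian-Laird random-effects pooled estimate, SE and tau^2."""
    weights = [1 / se ** 2 for se in ses]
    fe_estimate, _ = fixed_effect(effects, ses)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fe_estimate) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-trial variance, floored at zero
    # Adding tau^2 to every variance flattens the weights across trials.
    re_weights = [1 / (se ** 2 + tau2) for se in ses]
    estimate = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return estimate, math.sqrt(1 / sum(re_weights)), tau2

# Invented log odds ratios (negative = benefit) for illustration only:
# two large trials near zero, three small trials with big apparent effects.
effects = [-0.05, -0.02, -0.9, -1.1, -0.8]
ses = [0.10, 0.12, 0.45, 0.50, 0.40]

fe_est, fe_se = fixed_effect(effects, ses)
re_est, re_se, tau2 = random_effects_dl(effects, ses)
print("fixed effect:   %.3f (SE %.3f)" % (fe_est, fe_se))
print("random effects: %.3f (SE %.3f), tau^2 = %.3f" % (re_est, re_se, tau2))
```

With data like these, where the small trials are the ones showing the effect, the random-effects estimate looks more favourable simply because it gives those trials more say, which is exactly the situation that funnel-plot asymmetry warns about.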

Overall, it is clear that “another outcome” (i.e. one favourable to homeopathy) is possible, as long as negative studies are excluded without good reason, smaller and less reliable studies are included, and a particular method of statistical analysis is used. In a paper that (wrongly) criticises a study for analysing data based on criteria established post-hoc, this seems like an odd point to make.


The analysis by Rutten and Stolper contains misconceptions of Shang et al., contains some important errors, and does not show that the Shang et al. study was an invalid analysis. In particular, there is no evidence that the Shang et al. study involved post-hoc choice of subgroups. The results of meta-analyses can be debated, but scientists should not be accused of research misconduct on the basis of no evidence, or on the basis of having failed to read their work properly.


1. Rutten ALB and Stolper CF. The 2005 meta-analysis of homeopathy: the importance of post-publication data. Homeopathy 2008; 97: 169–177.

2. Shang A, Huwiler-Müntener K, Nartey L et al. Are the clinical effects of homeopathy placebo effects? Comparative study of placebo-controlled trials of homeopathy and allopathy. Lancet 2005; 366: 726–732.

3. Linde K, Clausius N, Ramirez G et al. Are the clinical effects of homeopathy placebo effects? A meta-analysis of placebo-controlled trials. Lancet 1997; 350: 834–843.

4. Linde K, Scholz M, Ramirez G, Clausius N, Melchart D, Jonas WB. Impact of study quality on outcome in placebo-controlled trials of homeopathy. J Clin Epidemiol 1999; 52: 631–636.

5. Shang A, Jüni P, Sterne JAC, Huwiler-Müntener K, Egger M. Are the clinical effects of homeopathy placebo effects? A meta-analysis of placebo-controlled trials: Author’s reply. Lancet 2005; 366: 2083–2084.

6. Vickers AJ, Fisher P, Wyllie SE, Rees R. Homeopathic Arnica 30X is ineffective for muscle soreness after long-distance running – A randomized, double-blind, placebo-controlled trial. Clin J Pain 1998; 14: 227–231.

7. Vickers AJ, Fisher P, Smith C, Wyllie SE, Lewith GT. Homoeopathy for delayed onset muscle soreness - A randomised double blind placebo controlled trial. Brit J Sports Med 1997; 31: 304–307.

8. Jawara N, Lewith GT, Vickers AJ, Mullee MA, Smith C. Homoeopathic Arnica and Rhus toxicodendron for delayed onset muscle soreness - A pilot for a randomized, double-blind, placebo-controlled trial. Brit Hom J 1997; 86: 10–15.

9. Tveiten D, Bruset S, Borchgrevink CF, Norseth J. Effects of the homeopathic remedy Arnica D30 on marathon runners: A randomized, double-blind study during the 1995 Oslo Marathon. Complement Ther Med 1998; 6(2): 71–74.

10. Lüdtke R, Rutten ALB. The conclusions on the effectiveness of homeopathy highly depend on the set of analyzed trials. J Clin Epidemiol 2008; 61: 1197–1204.

Thursday 6 November 2008

Journal of Structural Geology paper

If anyone is interested in the paper I recently got accepted in the Journal of Structural Geology, there is a short summary of it here...

Hazel Blears talks rubbish about the blogosphere

I was struck by this article in yesterday's Grauniad. Hazel Blears, the MP for Salford and communities minister (who I incidentally saw having a swift half in Manchester's City Arms the other day), weighs in on the culture of "corrosive cynicism" which is supposedly damaging political discourse in the UK. This, of course, is all the fault of the media, and in particular the blogosphere. The following quote is from an address that Blears is giving today at a Hansard Society conference on growing political disengagement in Britain:

"Perhaps because of the nature of the technology, there is a tendency for political blogs to have a 'Samizdat' style. The most popular blogs are rightwing, ranging from the considered Tory views of Iain Dale, to the vicious nihilism of Guido Fawkes. Perhaps this is simply anti-establishment. Blogs have only existed under a Labour government. Perhaps if there was a Tory government, all the leading blogs would be left-of-centre? But mostly, political blogs are written by people with disdain for the political system and politicians, who see their function as unearthing scandals, conspiracies and perceived hypocrisy. Until political blogging 'adds value' to our political culture, by allowing new voices, ideas and legitimate protest and challenge, and until the mainstream media reports politics in a calmer, more responsible manner, it will continue to fuel a culture of cynicism and despair."

Now, I don't want to claim that all of what Blears says here is total nonsense. Clearly the media in general does have a lot to answer for. But perhaps there could be other reasons for political disengagement in the UK? I can think of a handful off the top of my head.

1. The growing reliance of the Labour party on rich donors, which has led to a number of scandals, including the "cash for honours" affair. The corollary of this is a decrease in party membership, an erosion of internal party democracy, a lack of connection between the party grassroots and the government, and a perception amongst the electorate that the government is corrupt.

2. The Iraq war, opposed by a large proportion of the population, and launched on the basis of statements that were not true (whether or not they were strictly "lies"), has now led to the deaths of more than 175 British soldiers and serious injuries to many more. This is aside from the civilian death toll in Iraq, the best that can be said of which is that we don't know what it is, but it is a hell of a lot.

3. A failure to apply appropriate regulation to financial markets allowed the inflation of a credit and asset bubble which has now burst, and will lead to perhaps hundreds of thousands of people losing their jobs and their homes.

4. A massive proliferation in anti-terror laws, which despite re-assurances when they were brought in, are now being used to suppress legitimate protest and freeze the assets of Icelandic banks.

No doubt you can think of a few more. So perhaps this corrosive cynicism has as much to do with the cynicism and incompetence of the government as that of the media and the blogosphere?

Quite apart from that, it is interesting to look at what Blears says about the blogosphere. She says that political bloggers "see their function as unearthing scandals, conspiracies and perceived hypocrisy", and then adds that "until political blogging 'adds value' to our political culture ... it will continue to fuel a culture of cynicism and despair." To me, unearthing scandals does add value to our political culture. Or should government wrong-doing just be hidden? An important aspect of democratic government is that those who govern us can be held to account. That can't happen if no-one knows what they're up to.

It's obviously true that a large number of blogs are dreadful and useless. But it's equally true that many are valuable. By indulging in a rant against the media and the blogosphere, without addressing the contribution of government corruption and incompetence, Blears is not doing much to further the debate on political disengagement.