
March 08, 2008

Robert Waldmann Has a Big Problem with the Anti-Prozac Meta-Analysis Study

Robert Waldmann has a big problem with, and talks back to, the anti-Prozac meta-analysis study of Kirsch et al.

Mark Liberman summarizes the meta-analysis data in a nice picture:

Language Log: Listening to Prozac, hearing effect sizes

The x's are studies. The vertical axis shows the improvement in mood for people being given the placebo--the sugar pill. The horizontal axis shows the improvement in mood for people being given the antidepressant, both according to the Hamilton Scale of Depression.

People being given the placebo improved their mood a lot--by 7.8 points, which is a relatively big deal on the Hamilton Scale: feeling that you are taking control over your Depression by getting involved in a cutting-edge medical study, the fact that a group of research scientists are paying attention to you, and the passage of time together do a lot of good. But the people being given the actual anti-depressants improved their mood by even more. Let's turn the mike over to Kirsch et al.:

[W]eighted mean improvement was 9.60 points on the HRSD in the drug groups and 7.80 in the placebo groups, yielding a mean drug-placebo difference of 1.80 on [Hamilton] improvement scores.... [which] easily attained statistical significance [at the 0.001 level, in fact--much better than the 0.05 level]...

Subjects given Prozac improved their mood by an extra 1.8 points on the Hamilton scale. This difference is not due to chance sampling error--it is, statistically, very significant. The pills are really cheap to make. There is an upside. Better Living Through Chemistry.

So what's the problem with Prozac? The problem, according to Kirsch et al., is that the difference of 1.8 points on the Hamilton Scale:

does not meet the three-point drug–placebo criterion for clinical significance used by NICE [Britain's National Institute for Health and Clinical Excellence]...

Where does this requirement that no therapy for Depression is worthwhile unless it improves the Hamilton Scale score by three points come from? The weblog "Pyjamas in Bananas" finds a quote:

Pyjamas in Bananas: No research evidence or consensus is available about what constitutes a clinically meaningful difference in Hamilton scores, but it seems unlikely that a difference of less than 2 points could be considered meaningful. NICE required a difference of at least 3 points as the criterion for clinical importance but gave no justification for this figure...

Who wrote this? Irving Kirsch, lead author on the anti-Prozac study.

And it is at this point that the economist in me wants to reach for his revolver. A declaration that a real-world solid statistically-significant improvement in people's quality of life is not "clinically significant" is inadmissible unless it is motivated by a proper analysis of opportunity costs: a conclusion that the resources devoted to this therapy would have a higher value and a better alternative use in some other therapy. It cannot rest on an arbitrary number that some organization pulls out of its a--.

Even worse, Robert Waldmann points out, is that the Guardian's health editor Sarah Boseley doesn't understand the article she is reporting on:

Prozac, used by 40m people, does not work say scientists: Analysis of unseen trials and other data concludes it is no better than placebo: Prozac, the bestselling antidepressant taken by 40 million people worldwide, does not work and nor do similar drugs in the same class, according to a major review released today.... When all the data was pulled together, it appeared that patients had improved - but those on placebo improved just as much as those on the drugs...

Waldmann comments that Boseley is:

totally dishonest, totally innumerate or both. 1.8 > 0. Patients on Placebo did not improve just as much as patients on SSRI's... this isn't even a case of treating a statistically insignificant difference... as... proof that the true value is zero.... "Irving Kirsch, Brett J. Deacon, Tania B. Huedo-Medina, Alan Scoboria, Thomas J. Moore & Blair T. Johnson" find a significant additional benefit of taking a SSRI rejecting the null of no benefit with a p value of "<0.001"... overwhelmingly strong evidence that SSRI's cause improvement in depression.... Oddly big Pharma, which spends huge amounts of money on advertising, doesn't seem to have managed to hire anyone intelligent enough to point out that 1.8 > 0...

Comments


Does the 1.8 improvement counterbalance the side effects of Prozac? This is above and beyond the question of being better than a placebo, which likely has no side effects.

Thanks for linking. Your post is excellent. How did you find pyjamas in bananas?

[Google, of course]

How did he or she (or they, I guess) find that great quote? [I have no clue. It's not on the net...]

Glancing at the Hamilton scale (warning: a PDF, and not for self-diagnosis)
http://healthnet.umassmed.edu/mhealth/HAMD.pdf
I notice that one way to get a 3 point improvement is to give the same answers on the other questions and on question 3 (suicide) move from

"suicidal ideas or gesture" (worth 3 points)
past
"Wishes he were dead or any thoughts of possible death to self" (worth 2 points)
past
"Feels life not worth living." (worth 1 point)
to
absent (worth zero points).

According to CHILD, for an improvement to be beyond borderline clinically significant, if there are on average no other improvements, the patient would have to move from at least one "serious" attempt to nothing like suicidal thoughts. A mere suicidal gesture would not be enough.

Merely moving from thinking life is not worth living to thinking life is worth living wouldn't come close to clinical significance, if that were the only change (or if other changes were balanced by the patient reporting a depressed mood without questioning when, in the first interview, the patient said nothing until questioned).

Clearly the level of 3 for clinical significance is mixing concern about the magnitude of the change with a concern about meaningless fluctuation due to the measurement method (an interview). CHILD has no idea that the second problem can be addressed by using a large sample.


I'd guess that a large part of the positive response in the patients who received the Placebo is just natural remission of depression. That is it would have happened even if they had not participated in the study. It is a comparison of the score after the Placebo treatment to the score at the beginning of the trial.

To get the true placebo effect, one would have to compare this to the change in score over the same period of people who weren't part of a study of a new drug. I'm sure data on the normal course of depression are available somewhere. I'm sure the meta analysis would be tricky (I would guess the period of treatment before the second measurement of depression varied across studies).

Rob

Which side effects of Prozac worry you? The needing less sleep or the losing weight?

There were effective anti-depressants before Prozac. Prozac became a mega selling drug, because people like the side effects (the older anti depressants made patients eat and sleep a lot).

Side effects are not necessarily bad. Taking an occasional aspirin for a headache reduces the risk of heart attack. Certainly the presumption should be that powerful chemicals are dangerous and to be avoided until they are tested. Prozac has now been beta tested by tens of millions (many of whom probably weren't depressed to begin with and just wanted to lose some weight).

> A declaration that a real-world solid statistically-significant
> improvement in people's quality of life is not "clinically
> significant" is inadmissible unless it is motivated by a
> proper analysis of opportunity costs: a conclusion that
> the resources devoted to this therapy would have a higher
> value and a better alternative use in some other therapy.

Rob already got to it above, but anti-depressant drugs have significant side effects that have to be taken into account in the analysis. Of course these side effects are poorly understood and usually ignored.

Cranky

The Last Psychiatrist says that the reason for the publicity these new studies are receiving is that Pharma wants people to switch to anti-psychotics (are these higher profit?)

http://thelastpsychiatrist.com/2008/02/yet_another_study_on_antidepre.html

"There were effective anti-depressants before Prozac."

My word. You so don't know what you are talking about. There are two main families: tricyclics (now heterocyclics) and MAOIs. Both made people so miserable that they were only used by desperate people, and the MAOIs had the exceptionally troublesome side effect of making patients vulnerable to stroke after eating some common foods (among others chocolate and aged cheeses), which is an enormous risk for a potentially suicidal patient.

Sheesh.

How much of the placebo improvement was caused by regression to the mean, operating over time? After all, both groups (placebo and Prozac) started out depressed. This would, I think, be the medical equivalent of Horace Secrist's _The Triumph of Mediocrity in Business_; ably discussed by Stephen Stigler in his article "The History of Statistics in 1933".
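A rough way to see how large regression to the mean could be is to simulate it. The sketch below assumes each person has a stable underlying severity plus transient noise at every interview, and that only people who score above a cutoff at screening are enrolled. Every number in it (the mean of 15, the spreads of 5 and 4, the cutoff of 20) is a hypothetical choice for illustration, not a figure from the trials.

```python
import random
import statistics

def regression_to_mean(n_people=100_000, cutoff=20.0, seed=1):
    """Average 'improvement' of enrolled people who receive no treatment at all."""
    rng = random.Random(seed)
    improvements = []
    for _ in range(n_people):
        stable = rng.gauss(15.0, 5.0)            # long-run severity (hypothetical)
        baseline = stable + rng.gauss(0.0, 4.0)  # score at the screening interview
        followup = stable + rng.gauss(0.0, 4.0)  # score weeks later, untreated
        if baseline >= cutoff:                   # only high scorers are enrolled
            improvements.append(baseline - followup)
    return statistics.mean(improvements), len(improvements)

mean_improvement, enrolled = regression_to_mean()
print(f"enrolled: {enrolled}, mean untreated improvement: {mean_improvement:.2f} points")
```

Under these assumptions the enrolled group improves by several points on average with no pill and no placebo, purely because people were selected on a noisy high reading. How much of the real 7.8 points this accounts for depends on the true noise level, which the sketch does not know.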

"Does the 1.8 improvement counterbalance the side effects of Prozac?"

Like many things in life, it depends. 1.8 points on the right part of the scale can take you from not being able to function at a job (or not caring enough to bother) to being able to do so. That's a huge win right there for many people, who rely on their income to keep a roof over their head and food on the table.

I'll take a flatlined libido over not being able to make a decision without flipping a coin any day (not a hypothetical).

Robert Waldmann hit what I think is the relevant point here. Kirsch suggests that the variance in Hamilton scores is something around 2 points. What that side seems to be missing, however, is that taking a large number of samples, under standard conditions, will beat down the variance and provide a much better estimate, so that the 9.6 and 7.8 can have a difference which is statistically significant, even if on one test it's hard to tell if an individual is at 7.8 or 9.6.
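The point about sample size can be sketched numerically. The simulation below draws improvement scores for a drug arm and a placebo arm with the means quoted in the post (9.6 and 7.8) and reports the standard error of the difference as the arms grow. The per-patient standard deviation of 8 points is an assumption chosen for illustration, not a number from the study.

```python
import math
import random
import statistics

def simulate_trial(n_per_arm, drug_mean=9.6, placebo_mean=7.8, sd=8.0, seed=0):
    """Return (observed difference in means, its standard error) for one simulated trial."""
    rng = random.Random(seed)
    drug = [rng.gauss(drug_mean, sd) for _ in range(n_per_arm)]
    placebo = [rng.gauss(placebo_mean, sd) for _ in range(n_per_arm)]
    diff = statistics.mean(drug) - statistics.mean(placebo)
    # Standard error of a difference between two independent sample means.
    se = math.sqrt(statistics.variance(drug) / n_per_arm
                   + statistics.variance(placebo) / n_per_arm)
    return diff, se

for n in (10, 100, 1000, 10000):
    diff, se = simulate_trial(n)
    print(f"n per arm = {n:>5}: difference = {diff:5.2f}, standard error = {se:4.2f}")
```

The standard error falls roughly as one over the square root of the sample size, so a 1.8 point gap that is invisible in any single patient becomes easy to distinguish from zero once thousands of patients are pooled.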

Yeah, you gotta remember that 1.8 is the average improvement attributable to the Prozac. The effect on some people will be much larger.

The poor reporting of this study is not just sensationalism on the journalists' part - as pointed out at http://tugboatpotemkin.blogspot.com/2008/02/prozac-vs-placebo-or-teh-spin-vs.html - the report's summary by the authors was confused at best, and mendacious at worst.

It seems to me that it is Waldmann who is not understanding neurology rather than NICE that is not understanding math. Go talk to any neurologist and they'll tell you: 3 points is too little to matter. Most neurologists only consider a treatment for depression successful when a person's score has been reduced by 50%. 3 points is a very liberal measure of minimal improvement. And arbitrariness is inescapable when you have to convert a continuum into binary.

There is a second point that Waldmann seems to have missed entirely, and it is a math point: this difference in group mean covers a huge amount of variation. By disaggregating the scores with analysis, the article shows that for most depressives, from mild to severe (>22 on the HRSD) the improvement is even smaller than 1.8. The mean scores are pulled up by the larger effect that it has on people who are at the upper end of severe depression (>27). Those patients did have improvements that were greater than the minimal NICE measure of 3 points. It is for that reason that the Kirsch study recommended continuing treatment for them.

But this brings us to the third issue, which is kind of mind blowing, and which no one seems to be addressing: the only reason the score differential went up for the severely depressed was because their reaction to the placebo went down, and not because the drug was more effective.

Now at first sight, this seems to make perfect sense: if placebos depend on expectations and hope, you should expect to see less of a placebo effect in people who feel hopeless and affectless, right? Except for one thing: the relation between placebo effect and drug effect has always been conceived of additively. That is, if the placebo effect improves the average score 7.8 points, and the drug group gets a score of 9.6, we assume that 7.8 of that score is the placebo effect, and the extra 1.8 is the drug effect. That makes perfect sense.

But what doesn't make sense is a scenario in which severely depressed people get the same 9.6 score for the drugs but a 6.2 for the placebo group. And yet using illustratively crude numbers, that's exactly what happened. How to explain it *if the placebo effect is assumed to be the same in both groups?* Are we saying the drug works more if the placebo effect can be lowered? That can't be true if it's additive. So what is their relation? There's probably a concept that would answer this, but I don't know it. I can't come up with any physiological or psychological framework that makes sense.
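The additive bookkeeping in this comment can be written out directly. The sketch below takes the comment's own numbers (9.6 versus 7.8 overall, and the "illustratively crude" 9.6 versus 6.2 for the most severe patients) and subtracts placebo improvement from drug improvement, which is all the additive model does. The function name and group labels are mine.

```python
def implied_drug_effect(drug_improvement, placebo_improvement):
    """Drug-specific effect under the additive model: total minus placebo response."""
    return drug_improvement - placebo_improvement

# (drug arm improvement, placebo arm improvement), as quoted in the comment above.
groups = {
    "all patients": (9.6, 7.8),
    "most severe (crude numbers)": (9.6, 6.2),
}

for name, (drug, placebo) in groups.items():
    effect = implied_drug_effect(drug, placebo)
    print(f"{name}: implied drug effect = {effect:.1f} points")
```

On this arithmetic the same total improvement with a smaller placebo response simply implies a larger drug-specific share. Whether that split is meaningful is exactly what the additivity assumption cannot settle on its own.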

And this brings us to the next point: the placebo effect is the cornerstone of modern drug science. But we have few studies that address it directly and almost no theory. And worst of all, our standard operating procedure is explicitly designed to minimize it. For example in the studies that were meta-studied here, all the subjects went through a 1-2 week "washout" period where everyone was given a placebo. Subjects who exhibited a 20% improvement were then removed from the study. That makes sense if you're not interested in the placebo effect and regard it as noise. But if you really want to see its effect and how it interacts with the drug, then those people have to be included. It might not change the group means (since if they are randomly distributed and under current theory they are getting the same placebo effect whether they take the placebo or not). But then again it might, if the placebo effect indeed turns out not to be additive. And of course people like this aren't weeded out of the actual patients treated. All those large placebo responders in the normal population have improvements that are credited to the drug.

And this brings us to the last point that is missing if we really want to understand the placebo effect: you need a second control group of people who aren't in the study, and who report their scores at the beginning and end of the same period of time. Because much more important than the narrow difference between the placebo and the drug is the fact that basically everybody in these studies got well -- all the people in the drug group and all the people in the placebo group. And we have no way of comparing that to what happens on average to a similar group of people over the 4-8 week period that these studies take. If most of them showed zero improvement, then that placebo effect is some amazing thing: it cures depression. It's not a fake thing to be dismissed. It's a real thing that's effective. Conversely, if most of them showed substantial improvement, period, then much of what we're attributing to the placebo effect shouldn't be.
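To make the role of that second control group concrete, here is a small sketch of the three-way split it would allow, still assuming the pieces add up. The 7.8 and 9.6 are the meta-analysis means quoted in the post. The untreated-arm improvements of 0.0 and 6.0 are purely hypothetical, since no such arm existed in these trials.

```python
def decompose(untreated, placebo, drug):
    """Split the drug arm's improvement into three additive pieces."""
    return {
        "natural course": untreated,
        "placebo effect": placebo - untreated,
        "drug effect": drug - placebo,
    }

# Two hypothetical outcomes for the missing untreated arm.
for untreated in (0.0, 6.0):
    parts = decompose(untreated, placebo=7.8, drug=9.6)
    summary = ", ".join(f"{name} = {value:.1f}" for name, value in parts.items())
    print(f"untreated improvement {untreated:.1f}: {summary}")
```

The drug effect is the same in both scenarios, but the share credited to the placebo swings from nearly everything to very little, which is why the missing arm matters so much for interpreting the 7.8.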

In short, the theory of this is still in its infancy. And I rather suspect it won't ever get any farther. Because who would we be helping by unmasking a placebo effect that cures people? Removing their cure? Because one thing we can probably all agree on is that for a placebo effect to work, people have to believe in it. And for people with our upbringing to believe in a medical treatment, it's got to be based on science, wrapped in technology and cost an appreciable amount of money. In other words: exactly what we already have. Whatever its mechanism, it cures depression in lots of people. Why would we want to take that cure away from them if we had nothing as effective to offer in its place?

And that's without even counting the opposition of the powers that be.

So depressives who react angrily to any suggestion that the drugs that cured them are placebos are effectively defending their interests. If it's true, they don't want to know, and I don't blame them a bit.

"Because who would we be helping by unmasking a placebo effect that cures people?"

Some of the side effects of these placebos are terrible, and are long lived, and seem to be evidence of real brain changing physical effects. Some of these "placebos" form PHYSICAL addictions.

There are the relatively well-known suicidal tendencies. There is the flat-lined libido. In between there are "brain shivers". http://en.wikipedia.org/wiki/Brain_zaps . Vertigo. There is also constant tinnitus, and lots of other crap too.

The brain shivers are no fun whatsoever. And many of the other symptoms are much worse for me, than the disease.

How did I find that Kirsch quote? From the 2005 article "Efficacy of antidepressants in adults" by Moncrieff and Kirsch in the BMJ (it should be free to access):

http://www.bmj.com/cgi/content/full/331/7509/155

If anyone wants to read my views on why this study is a bit more complicated than is being made out, in particular why Kirsch et al have underestimated the effect of anti-depressants because of the statistical methods used, why their recommendations about milder forms of depression are based on minimal evidence, and how looking at the raw HAM-D scores shows that decreasing placebo response does not actually underlie the increasing response to anti-depressants in severe depression, then have a look at this:

http://www.plos.org/cms/node/335#comment-490

***[which] easily attained statistical significance [at the 0.001 level, in fact--much better than the 0.05 level]...*** Hmmmmm. Are these statistical operations on data where the data values are probably non-linear, and quite possibly somewhat arbitrary, valid? My impression is that some analytic procedures are fairly robust, but are numerical values of statistical significance one of them?

Rob wrote: "Does the 1.8 improvement counterbalance the side effects of Prozac?"

http://www.cnn.com/2005/HEALTH/01/03/prozac.documents/index.html

How does a person ending their life change their score on the Hamilton scale?

Does anyone know if this suicide link has been shown to be accurate?

I read that one big problem with the studies is that drug companies screen the subjects to get a subpopulation they know or guess will respond more strongly to the medicine than the population at large.

Now, a chemical imbalance in the brain can equally well be the result or cause of depression. A healthy person changes his or her mood by producing chemicals.

Prozac is not all that cheap; even a generic version costs more than 500 dollars per year. It could be a good policy to try to replace Prozac with "more natural" pills that have no side-effects, or use decreased dosage, and return to the initial dose if the symptoms do not get worse. The problem with the placebo effect is that it takes some effort to create it, e.g. follow-up visits etc., but this effort is probably more important than the medication.

For example, it would make sense to have some kind of psychiatric registered nurse who could see patients 4-6 times a year, and decide what strength of medicine is most appropriate. Even if the most appropriate strength is zero, if it is chosen professionally and is subject to a follow-up evaluation, it could have full placebo strength.

"Prozac is not all that cheap, even a generic version cost more than 500 dollars per year."

Other recent studies show the placebo effect is enhanced by the cost of the drug. http://abcnews.go.com/Health/MindMoodNews/story?id=4386984&page=1

Andrew Weil speaking about placebos:

"Q: What is your view of the placebo effect?

A: I think the placebo effect is really the meat of medicine. Placebo responses are pure healing responses from within that are elicited by belief ... If we could get that by using methods that are less expensive, less invasive, less technological, it seems to me that's the goal we should be working toward."

"How do you respond to critics who say that the anecdotal results you see from some alternative therapies are simply the "placebo effect?"

Well, I think when they say it's placebo, what they're saying is that it has no intrinsic effect, so it's a way of dismissing. I think they're contemptuous of placebo effects, also, but it's a way of saying the therapy is worthless, just people believe in it. And it's odd that at the same time that they say that, they're maintaining the mind has no great influence on the body.

… My vision of this is that you've got an intrinsic effect and then you've got a halo of placebo effect and that the placebo effect is very important. In fact to me the best medicine is getting the maximum placebo response with the minimum intervention. Placebo responses are pure healing responses from within that are elicited by belief. The best medicine is getting the maximum healing response with the least intervention. So we should be finding ways of giving people gentler and gentler treatments or at least the gentlest treatments demanded by the situation and getting the maximum placebo response."

Nothing has no side effects. Even a sugar pill will "cause" some "side" effects in some patients. If you compare the measured benefit of a treatment with that of a placebo, you also need to compare the side effect rate and severity of the treatment with that of a placebo. You can't just assume that the placebo that shows a positive effect would have no side effects.

Presumably a bit of matrix math would spit out the number of lives saved by that 1.8 shift in the depressed population, allowing a bit of estimation of how many people an article like that can kill.

Meanwhile, re. piotr's estimate of the cost/year: $500 looks quite high, as shown in this chart http://web.wxyz.com/extras/040205-drugchart.html, though it says something more interesting about the efficiency of markets :)

You cannot extrapolate this study to all SSRIs. Talk to any psychiatrist, or any patient who has tried multiple SSRIs, and you will find out that different SSRIs can vary dramatically with respect to efficacy as well as to side effects.

Did all subjects receive the same time and length of psychotherapy? Or any psychotherapy? Because there are other studies that show that combined psychopharmacological treatment with psychotherapy was most effective.

You can get generic fluoxetine at Wal-Mart for $4 for a 30 day supply.

I assume the arbitrary number of 3 is inserted because this is where the cost of the side effects of the meds is outweighed by the benefits. Shouldn't we let the patient decide that?
