Posted by: thealienist | August 1, 2011

Critique of the Double-Blind, Randomized, Controlled Trial

The crown jewel of biological psychiatry is the randomized, controlled trial.  If it is well-constructed, it can be simple, elegant, and powerful.  But in assessing medications, it can have some problems.

How can we assess the value of a tool independently of the one who uses it?  If I give golf clubs to 100 people, the better golfers will use them better and have better scores.  If I give paintbrushes to 100 people, the better artists will make better paintings.  If I give hammers to 100 people, the better craftsmen will make better structures.  What will happen if I give better medications to 100 doctors?

The randomized, controlled trial is designed to determine how efficacious medications are.  By randomizing subjects into the various groups (control, drug 1, drug 2, etc.) all the groups should be identical (or at least similar enough) so that they would respond to a given drug the same way.  This allows us to compare the responses between groups.  The use of a placebo (a treatment with no known efficacy) allows us to see what the effects of the treatment setting would be.  By making the medication the only thing that varies between groups, we can isolate the effect of the medication.  If the study participants and researchers are blind with respect to which patients get which medication, then there is reduced bias in the study, since patients and researchers will not be tempted to exaggerate the effect of the drug being studied.  All of this makes logical sense, so what could be a problem?
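For readers who like to see the logic run, here is a minimal sketch of the idea in Python. Everything in it is an assumption made up for illustration (the function name `simulate_trial`, the improvement scale, and the numbers for the placebo response and drug effect); it is not taken from any real trial. It only shows that when subjects are randomly split and the drug is the one thing that differs, the gap between the two group averages recovers the drug effect.

```python
import random
import statistics


def simulate_trial(n_subjects=2000, placebo_response=5.0, drug_effect=3.0, seed=0):
    """Toy two-arm trial. All numbers are hypothetical points of improvement."""
    rng = random.Random(seed)
    # Each subject improves by some amount just from being in the treatment
    # setting (the placebo response), with person-to-person variation.
    subjects = [rng.gauss(placebo_response, 2.0) for _ in range(n_subjects)]
    # Randomization: shuffle, then split into two arms of equal size.
    rng.shuffle(subjects)
    half = n_subjects // 2
    placebo_arm = subjects[:half]
    # The drug arm gets the same setting plus the (assumed additive) drug effect.
    drug_arm = [s + drug_effect for s in subjects[half:]]
    return statistics.mean(placebo_arm), statistics.mean(drug_arm)


placebo_mean, drug_mean = simulate_trial()
estimated_effect = drug_mean - placebo_mean
print(f"placebo arm improved {placebo_mean:.2f}, drug arm improved {drug_mean:.2f}")
print(f"estimated drug effect: {estimated_effect:.2f}")
```

Note that the sketch quietly assumes the drug effect simply adds on top of the placebo response, which is exactly the kind of assumption questioned later in this post.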

Well, there has been a lot written lately about the lack of effectiveness of antidepressants.  This has led some to criticize their continued widespread use.  But the problem with the antidepressant studies is not that people did not get better; it was that too many people were getting better.  The people who were receiving placebos were also improving, so that it was difficult to tell whether the groups receiving medication were showing additional improvement.

The problem of having too large a placebo response is not uncommon in some types of research but is especially problematic in studies of mental illness.  In an effort to reduce the placebo response, study designers try to standardize the interactions between researchers and study subjects.  They try to make sure that no non-medicine therapeutic interactions are taking place that might obscure the effect of the medication.

This effort to remove the tool user from the evaluation of the tool is akin to giving a golf pro a new set of clubs and telling him “be sure not to use your skill when you try these clubs.”  How much difference would we expect to see between the old (control) clubs and the new (experimental) clubs?  Would this be a valuable test of the quality of the new clubs?

Now, study designers might protest that if we had groups of golfers of similar skill and they were given the old and new clubs, then we would be able to see the effects of the clubs.  O.K., but who is assessing the skill of the users?  Who indeed ARE the users?  The physicians are one set of users.  So how do we rate a physician’s skill?  The study subjects are also users.  How do we rate a study subject’s skill?  The physician and the study subjects are expected to work together.  How do we measure their joint skill?  Study designers might try to eliminate the skill of the physician by comparing the improvement when a particular physician prescribes a placebo with the improvement when the same physician prescribes the study medication.  However, this still leaves the patient skill and physician-patient interaction unmeasured.

The double-blind, randomized, controlled trial also frequently assumes that the clinical change from medication effects and the clinical change from non-medication effects are independent.  This means that change due to one factor does not affect the change attributable to the other.  The problem is that we do not know if this is true.  If the changes due to these two factors overlap, then it is possible that a large placebo response might limit the amount of change that is available to be demonstrated by the medication.  If this proves true, it provides further pressure for study designers to limit the placebo response (and thus remove the tool user from the evaluation of the tool).
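A toy calculation may make the overlap worry concrete. The sketch below is purely hypothetical: the symptom scale, the numbers, and the function names `observed_improvement` and `measured_drug_benefit` are all invented for illustration. It assumes a subject cannot improve by more than his or her baseline symptom score (full remission is the ceiling), so the placebo response and the drug effect compete for the same limited room.

```python
def observed_improvement(baseline, placebo_response, drug_effect):
    """Improvement actually seen, capped at full remission (the baseline score)."""
    return min(baseline, placebo_response + drug_effect)


def measured_drug_benefit(baseline, placebo_response, drug_effect):
    """Drug arm minus placebo arm, which is what the trial reports."""
    drug_arm = observed_improvement(baseline, placebo_response, drug_effect)
    placebo_arm = observed_improvement(baseline, placebo_response, 0.0)
    return drug_arm - placebo_arm


# Hypothetical: baseline severity of 20 points, a drug "worth" 6 points.
for placebo_response in (4.0, 12.0, 18.0):
    benefit = measured_drug_benefit(20.0, placebo_response, 6.0)
    print(f"placebo response {placebo_response:>4}: measured drug benefit {benefit}")
```

With a small or moderate placebo response the full 6 points show up, but when the placebo response reaches 18 of the 20 available points, only 2 points are left for the drug to demonstrate. The drug has not changed; the room to show its effect has.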

It seems to me that double-blind, randomized, controlled trials are best designed to detect harmful effects.  If I give 100 people hammers, their skill will determine the quality of their buildings.  However, I can enumerate the amount of destruction the hammers are capable of by simply noting the destructive events.  With this, I can estimate the likely problems encountered when people of similar background use these hammers.

So, if double-blind, randomized, controlled trials have these problems, why are they being relied on for drug approval?  Simply, because we do not have a better tool.  People will continue to want to know whether medications are worth their time, money, and trouble.  This is currently the best (if still flawed) way of giving an answer.



  1. You got me nervous for a while there, John. At the end, though, you reach the right conclusion: RCTs are the best we can do. In fact, we can do a little better if we use so-called “active placebos”. For example, if in SSRI trials we were able to use a placebo that gave dry mouth or somnolence (or even decreased libido!) we might narrow the magnitude of the therapeutic effect. In fact, Irving Kirsch documents such effects in The Emperor’s New Drugs. Say! Any chance you could review Kirsch?


    • Rob,

      Thanks for your post. I would like to take a good look at Kirsch and see more closely what he has to say. We’ll see if I can get the time to do it. The active placebo experiments are a problem for a reason I think you could appreciate well. If we don’t know exactly what is causing the problem we are treating, it is very dicey trying to pick an “active placebo” that we are sure has no therapeutic effect.


      • Excellent point. This too occurred to me.


  2. There’s a heck of a lot more sources of bias in today’s randomized controlled studies than what you mention. Problems with subject selection, use of symptom checklists instead of clinical interviews, outcome measures, researchers who are paid off, etc. When brand-named drugs are compared head to head in a pharma-funded study, in 90% of cases, the sponsor’s drug comes out ahead. Try explaining how that is good evidence.

    Rob – I thought you thought that if we don’t know how a drug works, then an RCT is useless.


    • David,

      I agree. The biases I mentioned were those I see in a WELL-designed RCT. The compromised ones can have many more. (Also, you may even be referring to biases in well-designed studies that I did not see or mention).

