Large medical studies are not a good idea.

on Thursday, March 1, 2012

This is a response to NoahSD's excellent blog post, in which Noah first makes clear that the human body is wondrously complex and that we understand relatively little about its intricate, convoluted inner workings, and then explains that large, randomized, double-blind studies are our best hope of separating effective medical treatments from placebo-inducing quackery. Notably, he compares self-experimentation and anecdotal evidence to a poker player who wins a short session (a statistically meaningless sample) and concludes that his playing style is profitable. I have some objections to these lines of reasoning.

Large, randomized, double-blind studies would be a good idea only if all human beings were sufficiently identical. Large studies are best suited to identical machines; they are poorly suited to testing machines (or animals) with large variability. (This is one reason that animal studies use rats with identical genes.)

In examining exactly why large trials are not a good fit for human research, it is important to keep the stated goal of the AMA at the forefront of our awareness. The AMA controls what is taught at medical schools, and its goal is to increase the wages of medical doctors. Objectively speaking, what is the best way to increase the wages of doctors? As I see it, there are a few ways, but the subject is so tainted with propaganda and morality that it is almost impossible to consider them in any even remotely objective manner. To make this more palatable, let us consider an analogy: imagine that patients are in fact automobiles, and doctors are auto mechanics.

1. Is it a good idea to accept only the treatments that work for all cars, in general? Or is it best to tailor repairs (treatments) to each individual type of car? A study that finds 5W-30 motor oil to be the most effective at preventing engine wear would not apply to engines designed to accept a different grade, even if large studies show that the oil is effective for 75% of engines. We make a fundamental error in assuming that all human bodies are the same machine. They could easily be as different (and as similar) as the myriad makes and models of modern cars. That we are all “human” does not, per se, make us the same, any more than being “cars” means all cars require the same replacement parts.

2. If mechanics were somehow able to garner a monopoly on the education of auto mechanics, with their only goal being to increase their wages, regardless of morality, do you imagine that they would train mechanics to teach their customers to avoid riding their brakes? Would they train them to change their oil regularly? Would they train mechanics to instruct their customers to fuel their cars with high-quality gasoline of the correct octane level? I think not. Were their goal to make money, they would focus only on the necessary repairs and ignore any and all preventative measures.

New mechanics would come to see themselves as repair specialists only, with no obligation to ask why some customers’ brake pads wore out prematurely. Instead they would blame it on genetics, or manufacturing defects (make and model).

Additionally, if consumers began experimenting on their own, and anecdotally discovered that switching to 10W-40 as the engine became older helped to prevent engine death, would mechanics listen, or would they tell customers that such evidence is not valid unless millions of old cars are tested? And would there be any encouragement or support from professional mechanics to test this theory? Why would there be? Keeping engines alive longer with a simple solution lowers the income of auto mechanics, because they won’t be repairing and replacing as many engines. The error in the case for large studies is threefold:

1. There is an assumption that all people are (mostly) identical. The truth may well be that we are widely divergent (just like cars), and that huge studies do indeed identify workable cures but do not find the best cure. They only find cures that work across all makes and models, ignoring repairs specific to individuals.

2. There is an assumption that the AMA’s stated goal (increasing wages for doctors) does not actually drive its decisions, and that morality comes first in setting the course for future doctors.

3. There is an assumption that any attention will be paid to cheap solutions, or preventative measures.

Additionally, the analogy to poker is not appropriate, because the edge a drug enjoys in changing the outcome of an experiment is far greater than the edge a professional poker player enjoys.

A good poker player may enjoy a 3% edge in affecting the outcome of a single hand; a good drug (or treatment) may enjoy a 90% chance of successfully affecting the health or mood of a random human body.

In this way, we can see that even a single person experimenting with a huge edge can arrive at statistically significant results. And the confounding variables involved in self-experimentation can be mitigated to a very large degree. Generally, when we think about natural remedies and self-medication with herbs or supplements, we imagine that the experimenter expects the remedies to work. That is not necessarily the case.

Most often, I think, we don’t really expect herbs and supplements to work, and are surprised when they do. (This has been my experience; most don’t work). In this way we can see that the placebo effect should not be expected to confound results at all; if anything we should see a nocebo effect, where the efficacy of treatments is negatively impacted, because natural cures are generally expected to be ineffective.
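The earlier comparison between a poker player's edge and a treatment's edge can be made roughly concrete. The following is only a sketch under stated assumptions, not a rigorous analysis: it reads the "3% edge" as winning 53% of hands against a 50% break-even rate, and it assumes (hypothetically) that the condition being treated would improve on its own 50% of the time, so that a "90%" treatment is compared against that baseline. The function name `trials_needed` and the 5% significance and 80% power settings are illustrative choices, and the formula is the standard normal approximation for a one-sided test of a single proportion.

```python
from math import ceil, sqrt
from statistics import NormalDist


def trials_needed(p_null: float, p_true: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of independent trials needed for a one-sided
    test of a proportion to detect p_true against p_null.

    Uses the normal approximation to the binomial, so very small
    results should be read as rough orders of magnitude only.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # critical value for the test
    z_power = NormalDist().inv_cdf(power)       # quantile for the desired power
    numerator = (z_alpha * sqrt(p_null * (1 - p_null))
                 + z_power * sqrt(p_true * (1 - p_true)))
    return ceil((numerator / (p_true - p_null)) ** 2)


# ASSUMPTION: both baselines are set to 50%; neither figure comes from data.
poker_trials = trials_needed(p_null=0.50, p_true=0.53)
treatment_trials = trials_needed(p_null=0.50, p_true=0.90)

print(f"poker player (53% vs 50%): about {poker_trials} hands")
print(f"treatment (90% vs 50%): about {treatment_trials} trials")
```

Under these assumptions the poker player needs well over a thousand hands, while the treatment needs only a handful of trials. The result depends heavily on the assumed baseline and on the trials being independent of one another: if the baseline rate of spontaneous improvement is raised toward 90%, the number of trials required grows rapidly.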

"Large sample size, double blind, placebo controlled, randomized. If something is proven by a study that has all of those characteristics, it’s probably true."

Suppose you took a random sample of one million cars that all suffered from the same problem (perhaps their wheels have all fallen off unexpectedly) and tried to cure the problem with a new drug: a certain type of improved lug nut. What do you imagine we would be able to determine from this study?

Even if the mechanics installing the new lug nuts don’t know which are the new type and which are standard, control lug nuts, and if the cars are sufficiently randomized into two statistically identical groups, we should expect to find that the treatment is effective for some percentage of cars–they no longer experience the original problem (wheels falling off).

But the remaining cars will be no better off than before. Perhaps their wheels will fall off even more often, because the new lugs fit normal cars but not Ferraris and BMWs.

We could end up finding a treatment that improves the problem for say, 60% of cars (patients), but will have failed to understand the problem. And this is what we see, today, with many treatments used in modern medicine. If one antibiotic doesn’t work, try another, and another, and another, until the infection goes away. If it doesn’t go away, we can’t help you. Same for antidepressant drugs, which are foisted upon patients in a very similar manner.

This is a bit like throwing lug nuts at automobiles with loose wheels. It may eventually work, but it is much less effective than examining the wheels, determining which size they require, and installing the proper hardware.
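The lug-nut study can be sketched with a little arithmetic. Every number below is hypothetical and chosen only to illustrate the argument: the 60/40 split between car types, the failure rates, and the names `CarGroup` and `pooled_rate` are all assumptions, not data from any real study. The sketch shows how a pooled result can look like a clear success while one group of cars is actually made worse off.

```python
from dataclasses import dataclass


@dataclass
class CarGroup:
    name: str
    share: float            # fraction of all cars in the study
    control_failure: float  # wheel-loss rate with standard lug nuts
    treated_failure: float  # wheel-loss rate with the new lug nuts


# ASSUMPTION: hypothetical figures, invented purely for illustration.
groups = [
    CarGroup("ordinary cars", 0.60, 0.50, 0.05),
    CarGroup("Ferraris and BMWs", 0.40, 0.50, 0.60),
]


def pooled_rate(car_groups, attr):
    """Failure rate across the whole sample, weighted by group share."""
    return sum(g.share * getattr(g, attr) for g in car_groups)


pooled_control = pooled_rate(groups, "control_failure")
pooled_treated = pooled_rate(groups, "treated_failure")

print(f"pooled: {pooled_control:.0%} fail with standard nuts, "
      f"{pooled_treated:.0%} with new nuts")
for g in groups:
    verdict = "helped" if g.treated_failure < g.control_failure else "harmed"
    print(f"{g.name}: {g.control_failure:.0%} -> {g.treated_failure:.0%} ({verdict})")
```

With these made-up figures the pooled failure rate drops substantially, so the trial would report that the new lug nuts work, yet the per-group lines show that the cars the lugs do not fit fare worse than before. Seeing this requires recording the make of each car, which is exactly the kind of examination the pooled study skips.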

Large-scale studies are doing us a disservice in this way, by moving away from personalized solutions, moving away from truly understanding the problem, and moving away from innovative solutions that are custom-tailored to individuals. The situation is further complicated if the best lug nuts are standard issue and available ubiquitously, and the improved lugs are titanium or rare earth metals that are patented and cost much more. In the latter case, the financially rewarding path for drug companies (lug manufacturers) would be to promote the expensive, patented solution and to ignore the cheap, ubiquitous solution. I think this is what happens all too often today; there are simple solutions to many medical problems that are largely ignored by the medical community, most often because there exist no large, randomized trials that support them. Large trials are not necessary for generating innovative solutions to medical problems and, in my opinion, can easily be counterproductive.

If doctors are prone to throwing random lug nuts at the broken cars that visit their shops, why do we consider it ignorant for common people to throw random treatments at their own bodies, especially when the treatments are harmless, when solutions are often discovered even though most attempts don't work, and when statistically valid results can often be obtained with limited confounding variables?