Virtual Mentor. March 2012, Volume 14, Number 3: 227-231.
State of the Art and Science
The sham effect is alive and well in surgery today. More surgical procedures should be compared to sham controls, or the closest thing we can devise, to help us identify procedures of little or no intrinsic merit.
Richard J. Rohrer, MD
Those of us who work in transplantation are not prone to existential angst. Heart and liver transplants save lives where no other viable option exists. Kidney transplantation has proven itself time and again to add years and quality to the lives of renal failure patients, and in a highly cost-effective way. Aside from occasional concerns over comorbidities and patient selection, we rarely think twice about offering our services. And yet, humility is a virtue, even for us.
A few years back I was attracted to a review of the book Charlatan, by Pope Brock. It recounts the story of a quack surgeon from the 1920s, one John R. Brinkley. His signature operation was testicular xenografting (goat donors, human recipients), making him, I suppose, a pioneer of transplant surgery. The book tells the tale of how he was finally taken down by the legendary Morris Fishbein, editor of the Journal of the American Medical Association.
A saga like this leads a thoughtful person to many different considerations. One such consideration for me was how we ever really know what a surgery accomplishes. Surely the claims of individual surgeons and the testimonials of selected patients are inadequate, and they may be positively misleading. Even that staple of the surgery literature, the case series—whether single-center or multi-institutional—commonly suffers from selection bias in design and groupthink in analysis.
The sources of bias in studies of surgery are legion. In addition to the straightforward patient selection bias inherent in most of them, there is a variety of more subtle forms of bias to consider. The Hawthorne effect describes changes in general behavior (and in the care of a control group, if any) that are related to participation in the study rather than to the intervention itself. The Pygmalion effect describes how investigators are predisposed to see the outcome they seek, even if it is objectively absent. The Will Rogers effect (“When the Okies left Oklahoma for California, they raised the average intelligence of both states.”) describes a unique but not uncommon form of allocation bias (i.e., how patients are assigned to diagnostic groups or stages). And publication bias is everywhere; trials with positive outcomes are 3 to 8 times more likely to be published than trials with negative outcomes.
Moving beyond the literature to everyday practice, even more elements of bias come into play. Action bias (“Don’t just stand there, do something”) can be particularly hard to resist. It may take various forms: “Dr. Jones sent me this patient, so she must want me to operate”; “The procedure may be a bit hard to justify in this patient, but everybody’s doing it”; or even “If I don’t operate on this patient, someone else will.” Provider bias (“If all you have is a hammer, the whole world looks like nails”) naturally influences the specifics of any recommended procedure, particularly in this age of rapidly evolving technology (and individual surgical skill sets that may not have kept pace). And we are disingenuous if we don’t acknowledge the timeless effect of straightforward economic bias in surgical practice. As Rene Descartes said in the seventeenth century, “A man is incapable of understanding any argument that interferes with his revenue.”
Randomized, controlled, double-blind studies go a long way toward answering the question of how we really know what a surgery accomplishes. But an observer of the literature immediately notices a few problems. First, and most conspicuous, there are very few of them, and blinding is difficult. In addition, they are often statistically underpowered, and what’s more, they are rarely repeated by another group for confirmation. But perhaps even more daunting is the fact that the control arm of these studies is usually some other mode of surgery, which is itself untested in the first instance. That is to say, sham-controlled surgical trials are rare.
Placebo-controlled trials are well-known in pharmaceutical studies (though even there, they are not the rule). It is at least easy to conceptualize how a “sugar pill” can be used to create a control arm for a study of, say, a new antihypertensive medication. Sham “placebo” surgery controls (as opposed to sham “bogus” surgery, like goat-testicle transplants) are another matter. The sham control patient would at least need anesthesia and an incision somewhere, and that would seem to be simple enough in principle. But it is highly dependent upon the specific surgery and may not be logically possible. For example, if I want to study arteriovenous fistulae for patients heading onto dialysis, including a sham control group in my study would make no sense, since there is no way for high-volume vessels to spontaneously appear on the arm of a patient who had just a skin incision and nothing more.
Even if a sham control is logically possible, it may not be practical. Though you might plausibly design a sham control for a study of amputation for rest pain (due to ischemia of the lower leg)—one group gets a below-knee amputation, the other just anesthesia and a circumferential incision—it would be impractical to conceal the outcome from the patient, or anyone else for that matter. Finally, even if a sham control group were both logical and practical, it may not be ethical. There will never be a sham control group to evaluate surgery for colon obstruction, since it could never be ethical to leave patients with colons that remain obstructed, not to mention putting them through the risks attendant on anesthesia and the incision. And sham surgery in organ transplant would seem to be equally difficult to justify on ethical grounds.
Furthermore, we surgeons (and proceduralists in general) have a fundamental problem with the null hypothesis that is implicit in a sham-controlled study. We believe in our operations. This is quite natural: we live and breathe in a world where we actually do things to people. When we make errors, they are usually errors of commission, which contrast qualitatively with the errors of omission that are seen among our medical (i.e., nonproceduralist) brethren. When we stick a knife or a needle into a patient, we are carried along by confidence that the risk-benefit calculation for this patient favors action, and the same mindset infuses both pre-op preparation and post-op management. If it were otherwise, we’d risk a kind of psychological inertia approaching paralysis. We are only human, and though we live in the twenty-first century, we have Stone Age brains, which benefit from overriding confidence (captured in the aphorism “often wrong, but never in doubt”).
So it should come as no surprise that sham-controlled studies of surgery are rare. In fact, there are only a dozen or so, and most of them involve what might better be described as “minimally invasive procedures” than “traditional surgeries.” The classic is a study of internal mammary artery ligation for angina pectoris, by Cobb and colleagues from 1959. At the time it was thought that ligation of the distal internal mammary arteries might increase collateral blood flow to the ischemic heart. All patients underwent dissection and encircling of the internal mammary arteries, but then subjects were randomized into trial and control groups, and only half had their arteries ligated. The postoperative angina and performance metrics of both groups improved equally. This, of course, led to the conclusion that bilateral internal mammary artery ligation was no better than a sham procedure. But more interesting, in many ways, was the question generated: what was going on with the sham group that they were able to improve at all?
In all, there appear to have been about 15 sham-controlled studies of surgical interventions in the recent literature. Much depends upon how one defines “surgical intervention,” of course. I have chosen to include vertebroplasty, for example, but to exclude an excellent and illustrative study of acupuncture. There have been sham-controlled studies of arthroscopy for osteoarthritis, implantation of dopaminergic neural tissue for Parkinson’s disease, and transmyocardial laser revascularization for refractory angina. One of particular interest for the general surgery community involved implantation of a gastric stimulator for treatment of obesity. In the SHAPE trial, 190 patients underwent laparoscopic placement of a device designed to alter normal gastric function; in half the group, the stimulator was turned on, and in the other half it was left off. Patients and evaluators were blinded. At 12 months the control group had lost 11.7 percent of excess weight, while the treatment group had lost 11.8 percent: no difference between the two groups. Similarly, a sham-controlled study of laparoscopic lysis of adhesions in treatment of pelvic pain showed that both groups improved equally.
These studies might be dismissed as just a collection of oddball case types, except for one thing: in all reported sham-controlled studies to date, evidence for benefit of surgery over sham has been lacking. The score is sham 15, intervention 0. Of course this is due in large part to some of the barriers to study described above. But as surgery moves from its historical role—open, ablative procedures for the saving of lives—to its contemporary role, which includes a remarkable percentage of minimally invasive techniques, with reworking of the native anatomy for the reduction of pain or improvement in quality of life—surely an expanded role for sham-controlled trials is indicated.
And when true sham-controlled studies of surgery can’t be performed, we must learn to be creative in seeking the next best thing. For example, Waki and colleagues studied the putative survival benefit of pancreas transplantation in an ingenious but straightforward way. They queried a large transplant database for deceased organ donors who had donated one of their kidneys to a diabetic recipient as part of a simultaneous pancreas-kidney transplant (SPK), and the other kidney to a diabetic recipient as a kidney graft alone: in effect, a “sham” (absent pancreas) SPK. The result: patient survival through 10 years was equivalent, indicating that in these patients there is no survival benefit to a pancreas transplant over standard insulin injections (and thereby relegating the potential benefit of pancreas transplantation to quality of life).
Or consider the study comparing open colectomy to laparoscopic colectomy performed by Basse et al. They randomly assigned 60 patients to one of these modes of surgery and, at the conclusion of the case, went to the considerable effort of covering the entire abdomen with a single large bandage. Patients and evaluators were blinded as to the kind of surgery performed. Time until discharge from hospital, the primary endpoint, was the same for both groups.
In short, the sham effect is anything but a wifty notion: it is real, and it is alive and well in surgery today. More surgical procedures should be compared to sham controls, or the closest thing we can devise. The insights gained will help us understand exactly what it is that we accomplish with our procedures, and what it is that the patient actually experiences with surgery. And they will allow us to expeditiously identify procedures of little or no intrinsic merit.
Richard J. Rohrer, MD, is a professor and vice chairman of Tufts University School of Medicine’s Department of Surgery and the chief of the Division of Transplant Surgery at Tufts Medical Center. He has performed liver and kidney transplants in Boston for more than 25 years and has had numerous roles with the New England Organ Bank and the United Network for Organ Sharing.
The viewpoints expressed on this site are those of the authors and do not necessarily reflect the views and policies of the AMA.
© 2012 American Medical Association. All Rights Reserved.