The Wicked Problem of Surgical Failure
- Lee Zhao
Or: Why the Best Surgeons Cannot Escape Variance, and What That Teaches Us About Learning in Complex Systems

1.   The Ether Dome and the Statistician
On October 16, 1846, William Morton administered ether to a patient in the amphitheater of Massachusetts General Hospital, and surgeon John Collins Warren removed a tumor from the patient's neck while the man slept. Warren turned to the assembled physicians and said, "Gentlemen, this is no humbug." The Ether Dome, as it came to be called, became a monument to surgical progress.
What gets less attention is that Morton was a dentist with no medical degree, that he had stolen the idea from a colleague, that he spent the rest of his life in patent disputes, and that he died destitute. Also: the patient, Edward Gilbert Abbott, had a recurrence of his tumor within months. The operation that inaugurated modern anesthesia was, by contemporary oncologic standards, a failure.
I bring this up not to be contrarian about a genuine milestone, but because it illustrates something uncomfortable about how we narrate surgical history. We remember the operations that seemed to work at the time. We build amphitheaters around them. We do not erect monuments to the statistically inevitable complications that followed the celebrated cases, nor to the operations that went perfectly but whose patients died anyway.
Surgery has a storytelling problem. The stories we tell ourselves about success and failure do not match the underlying statistics of what actually happens when you cut into human beings.
This essay is about what happens when a complication occurs after a technically sound operation, on an appropriately selected patient, performed with experience and care—and still ends badly.
About the question that arrives immediately and stays far longer than the complication itself: What did I do wrong?
And about the possibility that this question, as natural and even necessary as it feels, is often the wrong question entirely.
2.    Surgery as Prediction Under Uncertainty (Or: The Map Is Not the Territory)
There is a genre of writing about surgery that treats it as a craft—as if the surgeon were a watchmaker or a carpenter, working with inert materials according to stable principles. The Hands, we call them. Atul Gawande writes beautifully about the "learning curve" as if it were a smooth function asymptoting toward perfection.
This framing is not wrong, exactly. It is incomplete.
Surgery is better understood as a series of nested predictions, each made under uncertainty:
Preoperatively, we predict that an intervention will improve a patient's long-term trajectory—that the expected value of operating exceeds the expected value of not operating. This requires estimating probabilities of success, failure, and complication across a distribution of possible futures, then weighting those by quality-of-life outcomes we cannot directly measure.
Intraoperatively, we predict that this tissue plane is the correct one, that this structure is the ureter and not the gonadal vessel, that blood supply to the anastomosis is adequate, that the closure will hold against physiologic stress.
Postoperatively, we predict that the patient's physiology will recover along a familiar arc—that fever on day one is inflammatory and not infectious, that ileus will resolve, that the drain output means healing rather than leak.
None of these predictions is certain. They are probabilistic judgments made in a complex system with incomplete information.
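To see the structure of the preoperative prediction laid bare, here is a minimal sketch of the expected-value comparison it implies. Every number in it is invented for illustration; the point is the shape of the calculation, not the values.

```python
# Toy expected-value comparison for "operate" versus "observe".
# Every probability and quality-of-life weight here is invented for illustration.

operate = [          # (probability, quality-of-life score on an arbitrary 0-100 scale)
    (0.80, 85),      # durable success
    (0.15, 60),      # complication with partial recovery
    (0.05, 30),      # major complication
]
observe = [
    (0.50, 70),      # disease stays quiet
    (0.50, 40),      # disease progresses
]

def expected_value(outcomes):
    """Probability-weighted average of the quality-of-life scores."""
    return sum(p * value for p, value in outcomes)

print("operate:", expected_value(operate))   # 78.5
print("observe:", expected_value(observe))   # 55.0
```

The uncomfortable part is that none of these inputs is actually known. The sketch shows what the decision assumes; it says nothing about how to obtain the probabilities or the weights.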
This is where the concept of map versus territory becomes useful. The preoperative CT scan is a map. The intraoperative anatomy is the territory. The map is useful—sometimes essential—but it is always a simplification. Tumors do not respect radiologic margins. Tissue planes that look pristine on imaging are obliterated by inflammation. The patient's physiology on the day of surgery is not the physiology captured in labs drawn a week earlier.
Surgeons develop intuitions that function as predictive models. Good surgical judgment is, in part, well-calibrated uncertainty—knowing when your map is reliable and when you are operating in unmapped territory. The problem is that calibration is extraordinarily difficult to develop, and nearly impossible to verify.
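Calibration has a precise meaning that is easy to state and brutally hard to verify: when you say 20 percent, the event should happen about 20 percent of the time. A minimal check, on entirely made-up numbers, looks like this—and the tiny sample sizes are the point.

```python
# Minimal calibration check on invented data: predicted complication risk
# versus whether a complication actually occurred. A well-calibrated 20%
# call should come true roughly 20% of the time.

predictions = [      # (predicted risk, complication occurred?)
    (0.05, False), (0.05, False), (0.05, True),  (0.05, False),
    (0.20, True),  (0.20, False), (0.20, False), (0.20, False),
    (0.50, True),  (0.50, False), (0.50, True),  (0.50, False),
]

by_bucket = {}
for risk, happened in predictions:
    by_bucket.setdefault(risk, []).append(happened)

for risk, outcomes in sorted(by_bucket.items()):
    observed = sum(outcomes) / len(outcomes)
    print(f"predicted {risk:.0%} -> observed {observed:.0%} over {len(outcomes)} cases")
```

With four cases per risk bucket, the observed frequencies are nearly meaningless. Most surgeons never accumulate enough labeled predictions in any single bucket to know whether their judgment is calibrated at all.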
A thought experiment: Imagine two surgeons, each operating on 100 patients, each with a "true" complication rate of 5%. By chance alone, either of them might have anywhere from 1 to 10 complications in that sample. Suppose one has 2 and the other has 9. The first feels like a genius. The second feels like a failure. Neither feeling is justified. Both are drawing from the same distribution.
Now imagine those surgeons trying to learn from their outcomes. The surgeon with 2 complications will anchor on whatever they did and try to replicate it. The surgeon with 9 will search their memory for errors, find some (because errors are always findable in retrospect), and "correct" them—possibly introducing new errors in the process.
This is outcome bias dressed in surgical scrubs. And it is nearly universal.
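For the reader who would rather see the variance than take it on faith, the thought experiment above fits in a few lines of simulation. This is a toy, not a model of any real practice: it draws repeated careers of 100 cases at a fixed 5 percent complication rate and reports how widely the counts scatter.

```python
import random

TRUE_RATE = 0.05   # the "true" per-case complication rate
CASES = 100        # cases per simulated career
SURGEONS = 10_000  # simulated careers, all drawing from the same distribution

random.seed(42)

counts = []
for _ in range(SURGEONS):
    # Each surgeon's complication count over 100 identical cases
    complications = sum(1 for _ in range(CASES) if random.random() < TRUE_RATE)
    counts.append(complications)

print("fewest complications in a career:", min(counts))
print("most complications in a career:  ", max(counts))
print("share with 2 or fewer ('geniuses'):",
      sum(c <= 2 for c in counts) / SURGEONS)
print("share with 9 or more ('failures'): ",
      sum(c >= 9 for c in counts) / SURGEONS)
```

Run it and the per-career counts routinely land anywhere between 1 and 10, with occasional careers outside even that range—even though every simulated surgeon is, by construction, identical.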
3.   The Wicked Learning Environment (Or: Why Chess Players Improve Faster Than Surgeons)
The psychologist Robin Hogarth distinguished between "kind" and "wicked" learning environments. The distinction is crucial and underappreciated.
In a kind learning environment:
Feedback is immediate and accurate
The rules are stable and known
The relationship between action and outcome is clear
Repetition reliably produces improvement
Chess is the canonical example. You make a move; you see the result; the rules never change; the best players are unambiguously identifiable. Most gates we cross to become surgeons are kind learning environments. The SAT, MCAT, USMLE, medical board exams—the questions have right answers, and you find out your score.
In a wicked learning environment:
Feedback is delayed, noisy, and often biased
The rules change as technology, patients, and expectations evolve
Outcomes correlate imperfectly with the quality of decisions
The same action can produce different outcomes depending on unobservable variables
Experience can actually reduce accuracy if it teaches the wrong lessons
Surgery is a wicked environment. Perhaps the wickedest in medicine.
Consider the feedback loops. A surgeon performs a urethral reconstruction. The patient goes home. The "outcome" that matters—long-term patency, freedom from stricture recurrence—will not be known for months or years. By the time the feedback arrives, the surgeon has performed dozens more cases, and their memory of the index operation has been overwritten by subsequent experience. The feedback is not only delayed; it is degraded.
Worse, the feedback is biased. Patients who do well often disappear from follow-up. Patients who do poorly may seek care elsewhere. The surgeon's personal database is a censored sample. The complications they remember are not a random draw from their actual complication rate; they are the complications that happened to be captured by an imperfect surveillance system.
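A toy model makes the censoring concrete. Suppose—numbers invented purely for illustration—that patients whose reconstructions fail usually declare themselves at another institution, while patients doing well return only sometimes; the rate the surgeon observes is then nowhere near the rate that exists.

```python
import random

random.seed(7)

TRUE_FAILURE_RATE = 0.15   # true long-term recurrence rate (invented)
FOLLOWUP_IF_WELL = 0.60    # chance a patient doing well returns for follow-up (invented)
FOLLOWUP_IF_FAILED = 0.30  # failures often declare themselves elsewhere (invented)
PATIENTS = 100_000

seen = 0
seen_failures = 0
for _ in range(PATIENTS):
    failed = random.random() < TRUE_FAILURE_RATE
    returns = random.random() < (FOLLOWUP_IF_FAILED if failed else FOLLOWUP_IF_WELL)
    if returns:
        seen += 1
        seen_failures += failed

print(f"true failure rate:     {TRUE_FAILURE_RATE:.1%}")
print(f"observed failure rate: {seen_failures / seen:.1%}")
```

Swap the assumptions and the bias reverses direction. The point is not which way it runs, but that the personal sample quietly answers a different question than the one the surgeon believes they are asking.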
And the rules change. New technologies emerge. Patient expectations shift. The definition of "complication" evolves. A 10% leak rate that was acceptable in 1996 is malpractice in 2026. A surgeon who learned the "right" lesson from their training may be learning the wrong lesson for the current environment.
In wicked environments, confidence is easy to fake and insight is hard-earned. The surgeon who claims to have no complications is not announcing mastery. They are announcing a failure of measurement, a failure of follow-up, or a failure of honesty.
The chess grandmaster improves because the game teaches true lessons. The surgeon may or may not improve, because the game teaches lessons that are sometimes true, sometimes false, and rarely labeled.
4.   Signaling Competence vs. Achieving Competence
There is a related pathology worth naming: the substitution of signaling for substance.
Signaling theory, imported from economics and evolutionary biology, observes that organisms often take costly actions not because the actions are intrinsically useful, but because they communicate something to observers. The peacock's tail does not help it fly. It signals genetic fitness to peahens.
In surgery, signaling is rampant.
Some examples:
Case volume as proxy for skill. "I've done a thousand of these" is a signal. It communicates experience. But volume without reflection is just repetition. A surgeon who has done 1,000 cases and never examined their outcomes has not had 1,000 learning opportunities; they have had one learning opportunity, repeated 1,000 times.
Technology acquisition as proxy for quality. Hospitals buy robotic systems not because the evidence for superiority is overwhelming, but because the robot signals cutting-edge capability. This is the Red Queen Effect in action: every hospital must run faster just to stay in place. (I acknowledge that I have been a full beneficiary of this effect.)
Confidence as proxy for competence. The surgeon who expresses uncertainty is often perceived as less skilled than the surgeon who projects certainty—even if the uncertain surgeon is better calibrated to reality. Patients want reassurance. Institutions want volume that meets staffing ratios.
Minimizing complications as proxy for having none. "Only a 2% leak rate" sounds better than "I have leaks, and here is what I learned from each one." The former is a signal; the latter is substance.
The problem with signaling is that it provokes counter-signaling. The surgeon who has genuinely mastered a procedure can afford to be humble; the surgeon who is insecure must project confidence. But observers cannot always distinguish true humility from incompetence, or true confidence from bluster. So the safest strategy is to signal confidence regardless of actual state.
And again, we are in a coordination failure. Everyone signals. No one learns.
5.    What Actually Helps (Or: Building Your Own Feedback Loops)
If the institutional environment does not provide reliable feedback, and the social environment punishes honest disclosure, what is a surgeon to do?
The answer, I think, is to build your own learning loops. Not because the institution has failed you, but because the complexity of surgery demands it.
Some concrete practices:
1. Pre-mortems. Before the operation, ask: "How might this fail?" List the specific failure modes. This accomplishes two things. First, it surfaces risks that might otherwise be ignored. Second, it preregisters your concerns, so that if a complication occurs, you can assess whether it was foreseeable or genuinely surprising. (A complication you predicted is still a complication, but it is not evidence of blind spots.)
2. Video review. Memory is a bad historian. It smooths over details, reconstructs intention as execution, and edits for narrative coherence. Video does not. Watching your own operations—especially the ones that went well—reveals gaps between what you thought you did and what you actually did. It is uncomfortable. It is also irreplaceable.
3. Personal outcome tracking. Keep a database. Track your patients. Record complications, including the ones that "don't count" because they resolved without reoperation, or because the patient was lost to follow-up before they declared themselves. The institution's quality metrics are lagging indicators; your personal database is a leading one. (A minimal sketch of what one record in such a database might look like follows this list.)
4. Structured debriefs. After every operation—not just the complicated ones—take 60 seconds to ask: What went well? What would I do differently? This tiny habit compounds over thousands of cases. It converts repetition into deliberate practice.
5. Seek disconfirming feedback. Find a colleague who will tell you the truth, not the comfortable version. Ask them to watch your video, review your outcomes, challenge your decisions. This is hard, because it requires vulnerability. It is also the only way to escape the echo chamber of self-assessment.
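What the personal database from the third practice contains matters less than that it exists at all. Here is a minimal sketch of one record—fields invented, not prescribed—showing how little structure is required to turn follow-up into something you can actually query.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class CaseRecord:
    """One row in a personal outcome log. Fields are illustrative, not prescriptive."""
    case_id: str
    operation: str
    op_date: date
    premortem_risks: List[str] = field(default_factory=list)  # failure modes predicted beforehand
    complications: List[str] = field(default_factory=list)    # including the ones that "don't count"
    last_followup: Optional[date] = None                      # None = lost to follow-up
    notes: str = ""

def summarize(records):
    """Observed complication rate plus the number of cases never seen again."""
    with_complication = sum(1 for r in records if r.complications)
    never_seen_again = sum(1 for r in records if r.last_followup is None)
    return with_complication / len(records), never_seen_again

log = [
    CaseRecord("001", "urethroplasty", date(2025, 3, 4),
               premortem_risks=["recurrence", "fistula"],
               last_followup=date(2025, 9, 1)),
    CaseRecord("002", "urethroplasty", date(2025, 4, 18),
               complications=["wound infection"]),
]
rate, lost = summarize(log)
print(f"observed complication rate: {rate:.0%}; cases with no recorded follow-up: {lost}")
```

Two fields do most of the work: the complications that "don't count," and the explicit marker for patients never seen again.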
None of these practices is revolutionary. All of them are rare.
Why? Because they cost time, and time is the scarcest resource in surgery. Because they require humility, and humility is not rewarded. Because they produce uncomfortable realizations, and discomfort is unpleasant.
In other words: the practices that help are precisely the ones that are painful.
6.    The Guilt Trap and the Pain Signal
I want to return to the question that opens this essay: What did I do wrong?
This question has a seductive quality. It feels morally serious. It feels like accountability. It is what a conscientious surgeon should ask.
And sometimes it is the right question. Some complications do follow from error—errors of selection, planning, execution, or postoperative management. A surgeon who never asks "What did I do wrong?" is not humble; they are unteachable.
But the question becomes a trap when it forecloses a second possibility: that you did nothing wrong, and the complication happened anyway.
Variance exists. Complex systems produce unpredictable outcomes. A 5% complication rate means that, over enough patients, about 5 in every 100 will have a complication—including patients who received perfect care.
The failure to accept this is a form of high-modernist thinking: the belief that with sufficient skill and planning, all uncertainty can be eliminated. This belief is comforting, because it implies that complications are controllable. It is also false, and the falseness has consequences.
When surgeons cannot distinguish moral failure from stochastic failure, they experience guilt when they should experience only pain.
Pain is a signal. It says: pay attention to this; something has gone wrong; update your models.
Guilt is a conclusion. It says: you are responsible for this; you failed; you are less than you should be.
Pain motivates learning. Guilt often shuts it down. Shame narrows attention, triggers defensiveness, and impairs the exact kind of open-minded analysis that complications require.
The surgeon who feels appropriate pain after a complication will ask: What can I learn from this? The surgeon who feels inappropriate guilt will ask: How can I protect myself from feeling this again? The first question leads to process improvement. The second leads to psychological defense mechanisms—denial, rationalization, blame-shifting, or, at the extreme, attrition from the field.
The hidden curriculum of unkindness trains surgeons to feel guilt when they should feel only pain. And then it provides no tools for processing either.
7.    Responsibility and Mastery
Here is the synthesis I am reaching toward.
Performing surgery is an extraordinary act. We render patients unconscious, cut into them, rearrange anatomy irreversibly, and do so under the conviction that we are improving their lives. That conviction must be strong enough to act upon—or else we could never operate. But it must also be humble enough to be revised—or else we could never learn.
Complications are not evidence that surgery is broken. They are evidence that surgery is hard. They are evidence that we are operating at the edge of what is knowable, with maps that do not perfectly match the territory, in a learning environment that often teaches the wrong lessons, within a social system that punishes honest disclosure.
The task is not to eliminate pain, uncertainty, or doubt. It is to use them correctly.
This means:
Treating complications as data, not indictments
Distinguishing process errors from outcome variance
Building personal feedback systems because institutional ones are insufficient
Resisting the social pressure to signal competence at the cost of actual learning
Feeling pain without collapsing into guilt
Owning what you can control while accepting what you cannot
Mastery in surgery is not the absence of complications. (Any surgeon who claims zero complications is either lying, operating on no one, or not following their patients.) Mastery is the ability to learn the right lessons from complications—quietly, rigorously, and over a lifetime.
8. Coda: The Wicked Problem Generalized
I have written about surgery because that is what I know. But the wicked learning environment is not unique to the operating room.
Consider any domain where:
Feedback is delayed and noisy
Outcomes depend on unobservable variables
The rules change faster than expertise accumulates
Social incentives favor signaling over substance
Guilt is confused with appropriate pain
This describes venture capital. It describes public policy. It describes parenting. It describes, increasingly, any form of knowledge work in complex systems.
The question of how to learn in wicked environments is not a niche concern for surgeons. It is one of the defining challenges of our era.
We are building AI systems to assist in surgery—systems trained on outcome data, optimizing for metrics, promising to reduce variance. I am cautiously optimistic about this technology. But I notice that we are training these systems in the same wicked environment that makes human learning difficult. If the labels are noisy, the AI learns noise. If the feedback is biased, the AI learns bias.
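The same arithmetic that distorts a surgeon's personal database distorts a training set. A toy example—invented rates, no model at all—shows how label noise alone shifts what a system learns the complication rate to be.

```python
# Toy illustration of label noise: complications that never make it into the
# record (missed follow-up, miscoded charts) pull the learned rate downward.
# All numbers are invented.

true_rate = 0.08          # true complication rate
miss_rate = 0.40          # fraction of complications never labeled as such
false_positive = 0.01     # fraction of clean cases mislabeled as complications

observed_rate = true_rate * (1 - miss_rate) + (1 - true_rate) * false_positive
print(f"true rate:    {true_rate:.1%}")
print(f"labeled rate: {observed_rate:.1%}")   # what a model trained on the labels sees
```

A system trained on those labels is not wrong about its data. It is faithfully wrong about the world.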
The map is not the territory. The simulation is not the patient. The CPT code is not what you actually did.
These are not problems that technology automatically solves. They are problems that technology inherits, and sometimes amplifies.
So: yes, build the robots. Train the models. But also: build the feedback loops. Distinguish pain from guilt. Escape the coordination failures. Create the conditions for honest disclosure.
Surgery teaches, to those who are paying attention, that confidence and competence are not the same thing. That outcomes do not reliably indicate quality. That the hardest learning happens not when things go right, but when they go wrong—and we have the humility to ask what we can learn rather than who we can blame.
This is the wicked problem. We are all still learning how to solve it.