What does a doctor do in the age of AI?
The role of the physician is not to diagnose or prescribe but to determine what information to collect next and how best to proceed at this particular moment.
AI models can now solve clinicopathologic conferences (CPCs), which are some of the most challenging cases in medicine to diagnose, presented as carefully curated vignettes.
If AI can diagnose the hardest cases, then what is the role of the doctor? Specifically, what is the role of the mostly non-procedural physician (e.g., medical oncologist, internist)?
Please read Dhruv’s article in the New Yorker; this post builds on the ideas he lays out there.
Bayesian update step
What question do you ask next? Which labs or imaging do you order? The patient sitting in front of you is not a CPC of neatly organized data points; they are relatively undifferentiated, and you have to decide what data to collect.
Each decision has an opportunity cost. It takes time to send a patient for an MRI, for example, and their disease process may change in the interim. There is a cost to the health system, the insurer, and the patient too. Finally, other patients may also need your time or access to a CT scanner; if you sit with one patient for three hours and order every test, others may not get timely access to what they need.
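To make the update step concrete, here is a minimal sketch in Python (my own illustration, with made-up numbers for a hypothetical test, not figures from any study) of how a single new data point moves a pre-test probability to a post-test probability. Deciding what to collect next is, loosely, asking which result would move this number enough to change management, net of the costs above.

```python
def posterior_probability(prior, sensitivity, specificity, test_positive=True):
    """Bayes' rule for a single binary test result.

    prior: pre-test probability of disease (0..1)
    sensitivity: P(test positive | disease)
    specificity: P(test negative | no disease)
    """
    if test_positive:
        likelihood_ratio = sensitivity / (1 - specificity)   # LR+
    else:
        likelihood_ratio = (1 - sensitivity) / specificity   # LR-

    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Illustrative numbers only: a 10% pre-test probability and a test with
# 90% sensitivity / 80% specificity. A positive result moves the estimate
# from 0.10 to ~0.33 -- informative, but far from diagnostic on its own.
print(posterior_probability(prior=0.10, sensitivity=0.90, specificity=0.80))
```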
Synthesis
In a CPC, you do not need to sort through hundreds of medical notes, dozens of imaging studies, and thousands of lab values and trends. In real life, you do. The clinician’s role is to create the clinical summary of the patient, just as the staff of the NEJM do for CPCs. The art of figuring out what to include in the summary and how to describe it is critical. “Real patients do not present as carefully curated case studies.”
The natural counterargument is, “just throw most, if not all, of the information at the AI and have it summarize and sort.” Perhaps one day, but not today. In my experience, how you prompt the model makes a big difference. Here are two examples from Dhruv:
Dhruv gave the same model that solved the CPC the broad strokes of what a real patient had experienced. The model made up lab values and exam findings, misinterpreted imaging, and fabricated a CT scan read, and it ended up with the wrong diagnosis. But when Dhruv gave it a formal summary of his visit in medical-ese, CaBot latched on to the pertinent data and did not make anything up. It reached an almost correct diagnosis, and importantly, the next steps it suggested were identical to what should have been done under the exact right diagnosis; the underlying pathology would have been addressed properly.
Another example:
These examples are representative of my experience with AI.
Evaluating the output of medical AI
A clinician has to know medicine to evaluate the outputs of the AI. If the clinician does not grok medicine, or at least the specialty they practice, they will be forced to use AI, books, or search to evaluate each statement the AI makes, which defeats the purpose.
AI offers the clinician leverage
With good AI, a clinician can scale their efforts and do what they could not before. They can see more patients (and certainly PE firms will use this to squeeze productivity out of physicians), but they can also dedicate more effort to each patient and think more deeply about medicine to contribute to meaningful clinical and research questions. AI frees the clinician to spend more time on hard problems.
Considering the patient’s preferences, values, and goals
Sometimes the ninth line of therapy for an 83-year-old patient with metastatic colorectal cancer is not the right decision. It may be medically indicated, but what does the patient want? What are their goals and priorities? What is their quality of life?
There is often no right answer
Reasoning under uncertainty is the name of the game, especially when evidence is scant. The ICU is a good example: we don’t have RCTs delineating the right treatment decision for every scenario, so clinicians have to reason under uncertainty.
A related point is that the same uncertainty makes it hard to train LLMs in medicine. We can train very good math models because most math problems have a right answer to score model outputs against on the way to an optimal set of weights. How do you develop a scoring function in medicine, where nobody knows the right answer?
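As a rough illustration of that contrast, here is a toy sketch (my own, not a description of any real training pipeline): the math reward is a one-line verifiable check, while the clinical reward has no ground truth to check against.

```python
def score_math(model_answer: str, reference_answer: str) -> float:
    """Verifiable reward: the answer either matches the known solution or it doesn't."""
    return 1.0 if model_answer.strip() == reference_answer.strip() else 0.0

def score_clinical_plan(model_plan: str) -> float:
    """For many real cases there is no reference answer to compare against:
    reasonable clinicians disagree, outcomes are delayed and confounded,
    and the 'right' plan depends on the patient's goals. Any number returned
    here would encode someone's judgment, not a ground truth."""
    raise NotImplementedError("No agreed-upon ground truth to score against")

print(score_math("42", "42"))  # 1.0 -- easy to optimize against
```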
Underwriting risk
This is a logistical question that will get figured out with time: new insurance models and ways of underwriting risk will be introduced as AI gets better. I don’t find it a particularly strong argument for the physician’s continued role.
There is no role for the non-procedural physician if we de-skill ourselves by relying too heavily on AI, especially during training.
Please comment or reach out with thoughts or disagreements!


