
An Expanded Framework for Governing Autonomous Clinical Systems
Artificial intelligence in medicine has advanced from narrow diagnostic classifiers to agentic systems capable of longitudinal clinical reasoning, autonomous planning, and sustained patient interaction. As these systems approach and, on certain benchmarks, exceed physician-level performance, the dominant evaluation paradigm remains rooted in examination-based credentialing: structured tests, simulated encounters, and quantitative performance metrics. This paper argues that examination-centric evaluation, however sophisticated, is fundamentally insufficient for governing the deployment of autonomous AI in clinical settings. Drawing on the philosophy of moral agency, the ethics of accountability, and the institutional architecture of medical trust, we propose a comprehensive framework of ethical redlines: domains of clinical decision-making that must remain under human authority regardless of demonstrated AI capability. We further develop a theory of bounded autonomy that integrates technical evaluation with moral, legal, and institutional accountability structures. The framework addresses not only what AI systems can do, but what they should be permitted to do, and who bears responsibility when harm occurs. We contend that the integrity of medicine as a moral profession depends on establishing these boundaries before, not after, the widespread deployment of increasingly autonomous clinical AI.
The integration of artificial intelligence into clinical medicine represents one of the most consequential shifts in the history of healthcare delivery. What began as a set of narrowly scoped tools (image classifiers for dermatological lesions, risk calculators for cardiovascular events, natural language processors for clinical documentation) has evolved into a class of systems that approximate, and in some measurable respects surpass, the cognitive performance of trained physicians. Large language models now demonstrate the capacity to synthesize longitudinal patient histories, generate differential diagnoses across organ systems, propose evidence-based treatment plans, and engage in naturalistic conversation with patients and clinicians alike. Some emerging architectures exhibit what researchers term agentic characteristics: multi-step reasoning, goal-directed behavior, adaptive learning from environmental feedback, and the capacity to initiate and sequence clinical actions without continuous human direction.
These developments have provoked considerable optimism. Proponents argue that AI-augmented medicine can reduce diagnostic error, mitigate physician burnout, extend specialist-level care to underserved populations, and accelerate the translation of evidence into practice. There is merit in each of these claims, and the empirical trajectory suggests that AI will indeed transform clinical workflows in profound and, in many cases, beneficial ways.
Yet the very velocity of this transformation has exposed a fundamental governance gap. The dominant approach to evaluating clinical AI readiness remains anchored in examination-based paradigms borrowed, with varying degrees of adaptation, from the credentialing architecture of human medical education. Systems are tested against structured benchmarks, subjected to simulated patient encounters, interrogated in oral-style formats, and monitored through continuous performance dashboards. Each of these methods offers valuable information about technical capability. None, individually or in combination, addresses the deeper question that must precede clinical deployment: not whether an AI system can perform like a physician, but whether it can be held accountable like one.
This paper builds upon and substantially extends the foundational arguments presented in the initial redlines framework. Our purpose is threefold. First, we provide a rigorous theoretical account of why examination-based evaluation, however sophisticated, is categorically insufficient for governing autonomous clinical AI. Second, we develop a comprehensive taxonomy of ethical redlines: domains of clinical decision-making in which human moral agency is not merely preferable but constitutive of the act itself. Third, we propose an operational framework for bounded autonomy that integrates technical evaluation, institutional governance, legal accountability, and moral philosophy into a unified approach to clinical AI deployment. Throughout, we write for an audience of health system leaders, clinicians, policymakers, and technologists who must make consequential decisions about these systems in the near term.
To understand why examination-based evaluation cannot, by itself, justify the deployment of autonomous clinical AI, it is necessary to examine the epistemological foundations of medical trust. Trust in medicine is not a single construct. It is a layered phenomenon that emerges from the intersection of demonstrated competence, institutional accountability, relational continuity, and moral commitment. Each of these layers contributes something that the others cannot supply, and the absence of any one creates a form of trust that is, at best, incomplete and, at worst, dangerous.
Medical examinations from the United States Medical Licensing Examination to specialty board certifications were designed to assess a specific dimension of readiness: cognitive competence under controlled conditions. They test whether a candidate possesses the knowledge base, pattern recognition skills, and procedural understanding necessary for clinical practice. This is valuable and, for human physicians, serves as a necessary gatekeeping function. No one should practice medicine without demonstrating a threshold level of medical knowledge.
However, the history of medical education itself demonstrates that competence assessment is only the beginning of trustworthiness. A medical student who achieves a perfect score on every licensing examination is not, on that basis alone, permitted to practice independently. The profession requires years of supervised residency training precisely because knowledge and judgment are not the same thing. Judgment develops through repeated exposure to clinical uncertainty, through the experience of making decisions under conditions of incomplete information, through witnessing the consequences of one’s choices on real human beings, and through the gradual internalization of professional norms that govern not only what one does but how one reflects on what one has done.
This distinction is not a relic of tradition. It reflects a deep epistemological insight: that clinical wisdom is partly constituted by the process through which it is acquired. The physician who has sat with a dying patient, who has delivered catastrophic news, who has second-guessed a decision at three in the morning and then lived with the consequences, possesses something that cannot be captured by any test, however comprehensive. This is not mysticism. It is the recognition that moral knowledge, like other forms of practical knowledge, is partly experiential and partly embodied.
The second layer of medical trust is institutional. Physicians do not practice in isolation. They operate within a dense web of institutional structures (hospitals, licensing boards, malpractice systems, peer review processes, professional societies) that collectively enforce standards of conduct and create mechanisms of recourse when those standards are violated. A patient who is harmed by a physician has access to a structured system of accountability: malpractice litigation, professional disciplinary proceedings, institutional credentialing reviews, and, in extreme cases, criminal prosecution.
These structures serve multiple functions simultaneously. They provide ex post remediation for patients who have been harmed. They create ex ante incentives for physicians to practice carefully. They generate an evidentiary record that informs future regulation and education. And they perform an expressive function, communicating to both the profession and the public that medicine takes its obligations seriously and that failures will be addressed. The entire architecture of medical trust presupposes the existence of identifiable, accountable agents who can be subjected to these mechanisms. When we trust a physician, we trust not only their competence but the system that holds them accountable.
The third layer is relational. The therapeutic relationship between physician and patient is not merely instrumental (a means to the end of accurate diagnosis and effective treatment). It is, in many clinical contexts, partly constitutive of the care itself. When a patient discloses a history of trauma in the course of a clinical encounter, the quality of that disclosure, and therefore its clinical utility, is shaped by the patient’s perception that they are speaking to someone who can be trusted, who has a stake in the outcome, and who will bear some measure of responsibility for what happens next. The therapeutic alliance is not a luxury that can be replaced by superior information processing. It is a functional component of many forms of clinical care, particularly in mental health, palliative care, chronic disease management, and surgical decision-making.
When applied to AI systems, examination-based evaluation addresses only the first of these layers and does so incompletely. An AI system that achieves high marks on medical benchmarks has demonstrated a form of competence, but it has not undergone the experiential development that transforms competence into judgment. It does not operate within institutional accountability structures designed for identifiable moral agents. It does not form therapeutic relationships grounded in mutual vulnerability and shared moral commitment. Strong performance metrics may therefore create what we term credential washing: the appearance of clinical readiness that masks fundamental gaps in accountability, moral agency, and institutional integration. This is not a critique of the systems themselves, which may indeed be technically impressive. It is a critique of the evaluative framework that conflates performance with trustworthiness.
The argument for ethical redlines rests on a philosophical claim about the nature of certain clinical acts. Specifically, we argue that some medical decisions are not purely technical. They are constitutively moral, meaning that the moral agency of the decision-maker is not incidental to the act but partly constitutive of it. Removing the moral agent does not merely change who performs the act; it changes the nature of the act itself.
Moral agency, as we use the term, requires at minimum three capacities. The first is the capacity for normative reasoning: the ability to recognize that one’s actions have moral significance, to weigh competing values, and to choose in light of those values. The second is the capacity for moral phenomenology: the ability to experience states such as empathy, remorse, guilt, and moral distress, affective states that both inform moral judgment and serve as feedback mechanisms shaping future behavior. The third is the capacity for moral accountability: the ability to be held responsible for one’s actions, to accept the consequences of error, and to undergo moral transformation because of one’s experiences.
Current AI systems, including the most advanced large language models, possess none of these capacities in the relevant sense. They can simulate normative reasoning by generating outputs that resemble moral deliberation, but they do not engage in deliberation. They can produce language that mimics empathy, but they do not experience empathic states. They cannot feel remorse for errors because they do not experience anything at all. And they cannot be held accountable in the robust sense that accountability requires, because accountability presupposes a subject who can understand, internalize, and be changed by the consequences of their actions.
Consider the decision to withdraw life-sustaining treatment. This is not merely a technical determination that a patient’s condition is unlikely to improve. It is a moral act in which a physician, in consultation with the patient or their surrogate, accepts responsibility for the cessation of life-prolonging measures. The gravity of this decision is not merely a function of its consequences (though the consequences are indeed grave) but of the moral relationship between the decision-maker and the patient. When a physician authorizes the withdrawal of a ventilator, that physician assumes a burden that persists beyond the moment of the decision. They carry the weight of that choice, revisit it, and are shaped by it. This is not inefficiency. It is the moral substance of medicine.
An AI system that generates a recommendation to withdraw life-sustaining treatment based on prognostic data has performed a technical function. But the act of authorizing withdrawal is not reducible to prognosis. It requires someone who can stand behind the decision, who can look the family in the eye, who can live with the consequences, and who can be held responsible if the decision was wrong. To delegate this authority to a system that lacks these capacities is not merely to risk error; it is to transform a moral act into a mechanical one, and in doing so, to diminish the moral architecture of the profession.
Similar analyses apply across a range of clinical domains. The disclosure of a terminal diagnosis is not merely the transmission of information; it is a moral encounter in which the physician accepts responsibility for the manner of telling, for the support that follows, and for the ongoing relationship with a patient whose life has been fundamentally altered. The decision to amputate a limb in a case of uncertain benefit is not merely a probabilistic calculation; it is an act that permanently alters a human body and that requires a decision-maker who will bear the moral weight of that alteration. These are not edge cases. They represent the core of what it means to practice medicine as a moral profession.
Building on the theoretical foundations established above, we propose a taxonomy of clinical domains in which human moral agency must be preserved regardless of AI capability. These redlines are not arbitrary restrictions. They are derived from the analysis of constitutively moral clinical acts and reflect the minimum conditions for maintaining the ethical integrity of medical practice.
The first and most fundamental redline concerns decisions that directly determine whether a patient lives or dies. This category includes the withdrawal or withholding of life-sustaining treatment, the initiation of terminal or palliative sedation, the determination of surgical candidacy in cases where the risk of death is substantial and the benefit uncertain, and any decision to pursue or forego aggressive resuscitation. In each of these contexts, the decision is not merely consequential; it is existential. It involves the exercise of authority over the continuation of a human life, and this authority must be vested in a person who can be morally and legally accountable for its exercise.
AI systems may appropriately contribute prognostic data, risk assessments, and evidence summaries to support these decisions. They should not make them. The distinction between informational support and decisional authority is critical and must be maintained in both system design and institutional policy.
The communication of devastating clinical information (terminal diagnoses, unexpected surgical deaths, severe fetal anomalies, confirmed malignancies) is a moral act that cannot be reduced to information transfer. Research on the experience of receiving catastrophic medical news demonstrates that the manner of disclosure profoundly shapes patient outcomes, including psychological adjustment, treatment adherence, trust in the medical system, and even physiological stress responses. Patients consistently report that the presence of a compassionate, accountable human being during disclosure is not merely preferable but essential to their capacity to process and respond to the information.
This finding is not merely empirical; it has normative force. A patient who receives a terminal diagnosis from an AI system, even one that has been optimized for empathic communication, has been denied something that is owed to them: the moral presence of a fellow human being who accepts responsibility for the moment and its aftermath. This is not a claim about the relative quality of AI-generated empathy. It is a claim about what patients are entitled to, as a matter of medical ethics, in moments of existential vulnerability.
Decisions involving permanent and irreversible alterations to the human body constitute a third domain requiring human authority. Amputation, sterilization, organ removal in cases of uncertain benefit, and enrollment in high-risk experimental protocols all share a common feature: the consequences are borne entirely and permanently by the patient, and the decision to proceed requires a form of moral responsibility that cannot be distributed to an algorithm. The informed consent process for these interventions is not merely a legal formality. It is a moral compact between physician and patient in which the physician affirms that they have exercised their best judgment, that they accept responsibility for the recommendation, and that they will be available to the patient in the aftermath of the decision, whatever the outcome.
The fourth redline concerns the transparency and authenticity of the therapeutic relationship itself. Patients must always know when they are interacting with an AI system rather than a human being, and they must retain meaningful access to human care at critical decision points. This is not merely a matter of informed consent, though it is that. It is a condition of maintaining the moral integrity of the clinical encounter. A patient who believes they are speaking with a physician when they are in fact interacting with a language model has been deceived, and this deception undermines the foundation of trust upon which the entire clinical relationship depends.
Moreover, the right to human care at critical junctures is not merely a preference that patients may waive. It is a structural requirement of a medical system that takes moral agency seriously. Just as patients cannot consent to malpractice, they should not be placed in a position where the only available pathway for high-stakes medical decisions routes exclusively through automated systems.
When an AI system contributes to patient harm, the question of responsibility becomes structurally complex in ways that existing legal and professional frameworks are ill-equipped to handle. In conventional medical practice, liability follows a relatively clear chain: the treating physician bears primary clinical responsibility, the institution bears vicarious and institutional liability, and the regulatory system provides oversight and enforcement. The introduction of autonomous AI systems fragments this chain and distributes responsibility across a much larger and more diffuse set of actors.
Consider a scenario in which an AI system generates a treatment recommendation that a clinician follows, and the patient suffers harm. Who is responsible? The developer who designed the algorithm? The data scientists who trained it? The institution that deployed it? The clinician who relied on it? The regulator who approved it? Under current legal frameworks, the answer is uncertain, and this uncertainty has real consequences. Diffuse responsibility tends to degrade into no responsibility, as each actor in the chain points to the others. The patient, meanwhile, is left without a clear path to redress.
This problem is not merely logistical. It reflects a deeper philosophical challenge. Accountability, in the robust moral sense, requires not only that someone can be identified as causally responsible but that they can be held normatively responsible, that is, that they are the sort of agent who can understand, accept, and be changed by the ascription of responsibility. AI systems cannot fulfill this role. They are causal contributors to outcomes, but they are not moral agents who can bear responsibility. This means that every deployment of an autonomous clinical AI system requires a prior determination of how responsibility will be allocated among the human actors in the chain, with clear, enforceable, and transparent lines of accountability.
The preceding analysis suggests that the governance of clinical AI requires a framework that goes substantially beyond technical evaluation. We propose a model of bounded autonomy consisting of four interlocking components: defined autonomy limits, transparent accountability structures, robust contestability mechanisms, and continuous post-deployment monitoring. Each component addresses a distinct dimension of the governance challenge, and all four are necessary for safe and ethically defensible deployment.
Every clinical AI system should operate within clearly specified autonomy boundaries that define the scope of independent action permitted without human authorization. These boundaries should be calibrated to clinical risk and should reflect the ethical redlines articulated above. In practice, this means that systems operating in low-risk, well-characterized clinical domains (routine medication dosing adjustments, scheduling optimization, preliminary imaging triage) may be permitted a wider scope of autonomous operation than systems involved in high-stakes, morally complex decisions. The key principle is that the degree of autonomy must be inversely proportional to the severity and irreversibility of potential consequences, and that the redline domains identified above represent absolute constraints that no level of demonstrated capability should override.
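The inverse-proportionality principle can be made concrete as a simple authorization gate. The sketch below is illustrative only: the tier names, policy labels, and `may_execute` function are our hypothetical constructions, not a proposed standard, but they show how a redline tier can be encoded as an absolute constraint rather than a threshold the system can clear.

```python
from enum import Enum, auto

class RiskTier(Enum):
    LOW = auto()       # routine, reversible actions (e.g., scheduling optimization)
    MODERATE = auto()  # consequential but supervised (e.g., preliminary imaging triage)
    REDLINE = auto()   # constitutively moral domains: never system-executed

# Hypothetical policy table pairing each tier with its authorization rule.
AUTONOMY_POLICY = {
    RiskTier.LOW: "autonomous_with_audit",
    RiskTier.MODERATE: "propose_then_confirm",
    RiskTier.REDLINE: "human_decision_only",
}

def may_execute(tier: RiskTier, human_authorized: bool) -> bool:
    """Return True if the system itself may carry out an action in this tier.

    The redline tier short-circuits to False regardless of authorization:
    the system may inform the decision, but only a human can make and
    execute it. Lower tiers scale autonomy inversely with risk.
    """
    policy = AUTONOMY_POLICY[tier]
    if policy == "autonomous_with_audit":
        return True
    if policy == "propose_then_confirm":
        return human_authorized
    return False  # human_decision_only
```

Note the design choice: `may_execute(RiskTier.REDLINE, True)` still returns `False`, reflecting the paper's claim that in redline domains the system contributes information but never holds decisional authority.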
For each deployed system, institutions must establish explicit accountability maps that identify who bears responsibility for system performance, clinical outcomes, adverse events, and patient recourse. These maps should be publicly available and should clearly delineate the roles of developers, institutional administrators, clinical users, and regulatory bodies. Where multiple actors share responsibility, the nature and limits of each party’s accountability must be specified in advance. The goal is to ensure that at no point in the clinical workflow does responsibility become so diffuse as to be effectively nonexistent.
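An accountability map of this kind can be represented as a small, auditable data structure. The actors, scope descriptions, and recourse pathways below are hypothetical placeholders; a real map would be institution-specific, negotiated in advance, and publicly posted, but the shape illustrates the requirement that every category of harm resolve to at least one named actor.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccountabilityEntry:
    actor: str     # e.g., "developer", "health system", "clinical user"
    scope: str     # what this actor is answerable for
    recourse: str  # the pathway through which a harmed patient reaches this actor

# Hypothetical map for one deployed system.
ACCOUNTABILITY_MAP = (
    AccountabilityEntry("developer",
                        "model defects and undisclosed failure modes",
                        "product liability claim"),
    AccountabilityEntry("health system",
                        "deployment scope and monitoring lapses",
                        "institutional grievance and credentialing review"),
    AccountabilityEntry("clinical user",
                        "acceptance or override of recommendations",
                        "professional accountability and peer review"),
)

def responsible_actors(event_keyword: str) -> list[str]:
    """Toy lookup: which actors' declared scope covers this event?"""
    return [e.actor for e in ACCOUNTABILITY_MAP if event_keyword in e.scope]
```

The point of the structure, not the toy lookup, is what matters: if `responsible_actors` can return an empty list for a foreseeable harm, responsibility has become diffuse in exactly the way the text warns against, and the map is incomplete.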
Both patients and clinicians must retain the capacity to challenge, override, or refuse AI-generated recommendations without penalty or undue friction. Contestability is not merely a procedural safeguard; it is a moral requirement. A system that cannot be effectively challenged by the people affected by its outputs is a system that has been granted a form of authority that is incompatible with the values of medical practice. Operationally, this requires that AI recommendations be presented as recommendations rather than directives, that override mechanisms be integrated into clinical workflows, and that clinicians who exercise their judgment to deviate from AI recommendations are protected from institutional or legal reprisal.
Pre-deployment evaluation, however rigorous, cannot anticipate the full range of conditions that a system will encounter in real-world clinical practice. Continuous monitoring of real-world performance is therefore essential, including mandatory reporting of adverse events, systematic tracking of outcome disparities across patient populations, and periodic re-evaluation of system performance against evolving clinical standards. Monitoring systems should be independent of the developers whose products they evaluate, and monitoring data should be accessible to regulators, institutional review boards, and the public.
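The disparity-tracking component of such monitoring can be sketched minimally as follows. The subgroup labels, the single alert threshold, and the class interface are all illustrative assumptions; a production monitor would need statistical adjustment for subgroup size and case mix, which this sketch deliberately omits.

```python
from collections import defaultdict

class OutcomeMonitor:
    """Minimal post-deployment monitor: tracks adverse-event rates per
    patient subgroup so outcome disparities surface in routine review.

    Threshold and subgroup semantics are illustrative assumptions.
    """

    def __init__(self, alert_threshold: float = 0.05):
        self.alert_threshold = alert_threshold
        self._tallies = defaultdict(lambda: [0, 0])  # subgroup -> [adverse, total]

    def record(self, subgroup: str, adverse: bool) -> None:
        """Log one patient encounter and whether it produced an adverse event."""
        tally = self._tallies[subgroup]
        tally[0] += int(adverse)
        tally[1] += 1

    def flagged_subgroups(self) -> list[str]:
        """Subgroups whose adverse-event rate exceeds the alert threshold."""
        return sorted(g for g, (a, n) in self._tallies.items()
                      if n > 0 and a / n > self.alert_threshold)
```

Consistent with the independence requirement above, an object like this would be operated by a monitoring body separate from the developer, with its flagged output reportable to regulators and review boards rather than filtered through the vendor.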
Health systems occupy a unique position in the governance of clinical AI. They are the locus of deployment, the context in which AI outputs are translated into clinical actions, and the primary point of contact between automated systems and patients. This position carries significant governance obligations that cannot be delegated to developers, regulators, or individual clinicians.
Before deploying advanced AI tools, hospitals and health systems should undertake a systematic risk assessment that identifies the clinical domains in which the system will operate, classifies those domains according to their ethical risk profile, and establishes corresponding governance protocols. Institutional AI governance committees should include not only clinical and technical expertise but also bioethics representation, patient advocacy, and legal counsel. These committees should have the authority to restrict or withdraw AI systems that fail to meet performance or safety standards, and their deliberations should be documented and subject to external review.
Disclosure requirements represent another critical institutional obligation. Patients must be informed when AI systems have contributed to their care, including the nature and extent of that contribution. This disclosure should occur prospectively, at the point of care, rather than retrospectively or only upon request. The rationale for prospective disclosure is both ethical and practical: patients who understand the role of AI in their care are better positioned to exercise their rights, including the right to request human review and the right to contest AI-generated recommendations.
Finally, institutions must invest in the ethical education of clinicians who work alongside AI systems. The clinical skills required for effective human-AI collaboration are not the same as those required for unaided practice. Clinicians must develop the capacity to critically evaluate AI-generated recommendations, to recognize situations in which AI outputs may be unreliable or inappropriate, and to maintain their own clinical judgment in the face of algorithmically generated confidence. This is not a one-time training requirement, but an ongoing professional development need that should be integrated into continuing medical education.
The regulatory frameworks currently governing medical devices were not designed for systems that learn, adapt, and exercise a degree of autonomous judgment. Extending these frameworks to cover clinical AI will require significant conceptual and structural innovation. We suggest several principles that should guide this effort.
First, regulatory classification should reflect the degree of system autonomy and the clinical risk of the domain in which the system operates. A system that provides decision support to a human clinician poses different governance challenges than a system that autonomously initiates clinical actions, and the regulatory framework should reflect this distinction. Second, high-risk and high-autonomy systems should be subject to explicit requirements for human oversight, including mandatory human authorization for actions within redline domains. Third, liability frameworks must be adapted to address the distributed nature of responsibility in AI-augmented clinical workflows. This may require new legal instruments, such as strict liability for certain classes of AI-related harm or mandatory insurance requirements for institutions deploying high-autonomy systems. Fourth, regulators should require transparency not only in system performance metrics but also in system limitations, known failure modes, and the circumstances under which the system is expected to underperform.
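The two-axis classification proposed in the first principle, with the redline constraint layered on top, can be sketched as a lookup grid. Every label here (the autonomy levels, risk levels, and oversight classes) is a hypothetical illustration rather than a reference to any existing regulatory category.

```python
# Hypothetical classification grid: oversight scales with both the system's
# autonomy and the clinical risk of its operating domain.
OVERSIGHT_GRID = {
    ("decision_support", "low"):  "standard device review",
    ("decision_support", "high"): "enhanced review with clinician-in-the-loop audit",
    ("autonomous", "low"):        "real-world evidence reporting",
    ("autonomous", "high"):       "continuous engagement and mandatory re-certification",
}

def oversight_class(autonomy: str, clinical_risk: str,
                    redline_domain: bool = False) -> str:
    """Map a system's autonomy level and domain risk to an oversight class.

    Redline domains short-circuit the grid: mandatory human authorization
    applies regardless of how capable or low-risk the system appears.
    """
    if redline_domain:
        return "mandatory human authorization"
    return OVERSIGHT_GRID[(autonomy, clinical_risk)]
```

The short-circuit for `redline_domain` mirrors the second principle above: human-authorization requirements attach to the domain, not to any measured property of the system.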
Perhaps most importantly, regulatory approval should not be conceived as a single event but as an ongoing relationship between the regulator and the regulated entity. The adaptive nature of many AI systems means that the system approved at time of deployment may differ materially from the system operating six months later. Continuous regulatory engagement, including real-world evidence requirements and mandatory re-certification processes, is essential to maintaining meaningful oversight.
The arguments presented in this paper ultimately rest on a view of medicine as fundamentally a moral profession, a profession in which the exercise of technical skill is inseparable from the exercise of moral responsibility. This view is not universally shared. Some commentators argue that medicine is, or should be, understood primarily as an applied science in which the quality of outcomes is the paramount concern, and the identity of the decision-maker is secondary. On this view, if an AI system consistently produces better clinical outcomes than a human physician, there is a moral obligation to prefer the AI, and ethical redlines of the sort we propose are obstacles to patient welfare.
We take this objection seriously but ultimately reject it. The reduction of medicine to outcome optimization fails to account for the relational, fiduciary, and existential dimensions of clinical care. Patients are not merely the passive recipients of clinical interventions; they are moral agents, embedded in networks of relationships, values, and commitments that shape the meaning of their clinical experiences. A healthcare system that optimizes outcomes while eroding the moral fabric of the clinical encounter has gained efficiency at the cost of something that many patients and many physicians regard as essential.
Moreover, the outcome-optimization argument assumes a degree of predictive certainty that clinical medicine does not provide. In conditions of genuine uncertainty, where outcomes are unpredictable, where values are contested, and where the costs of error are borne disproportionately by the patient, the question of who makes the decision is not a mere procedural detail. It is a matter of justice. Patients are entitled to know that the person making life-altering decisions on their behalf is someone who can be held accountable, who has a stake in the outcome, and who will bear the moral consequences of error. This entitlement is not contingent on whether a machine could, in the aggregate, produce marginally better outcomes.
This paper has argued for a set of ethical boundaries and accountability structures that we believe are necessary for the responsible integration of AI into clinical medicine. Several important questions remain open and warrant further research and deliberation.
The first concerns the status of AI moral agency itself. Our analysis assumes that current AI systems lack genuine moral agency, and we believe this assumption is well-founded given the present state of the technology. However, the question of machine moral agency is not settled, and if future systems were to develop capacities that plausibly constitute moral agency (genuine understanding, affective experience, normative reasoning), the framework we propose would need to be revisited. We do not regard this prospect as imminent, but neither do we regard it as inconceivable, and the governance frameworks we build today should be designed to accommodate conceptual revision.
The second open question concerns the interaction between cultural context and ethical redlines. Medicine is practiced across diverse cultural settings with varying norms around autonomy, disclosure, and the role of technology. While we believe the core argument about moral agency is cross-culturally applicable, the specific operational implications of our framework may require contextual adaptation. Comparative research on patient and clinician attitudes toward AI authority across cultural settings would substantially inform the implementation of bounded autonomy frameworks.
Third, the economic dynamics of clinical AI deployment present challenges that this paper has not fully addressed. The financial incentives driving AI adoption in healthcare are substantial, and there is a real risk that economic considerations will outpace ethical governance. Health systems, regulators, and payers will need to develop mechanisms that ensure the pace of deployment is governed by ethical readiness rather than market pressure.
The integration of artificial intelligence into clinical medicine will reshape the practice of healthcare in ways that are both profound and, in many respects, beneficial. AI systems will enhance diagnostic accuracy, extend the reach of specialized care, reduce administrative burden, and support clinical decision-making with a breadth and speed of information synthesis that no individual clinician can match. These are genuine goods, and the medical profession should embrace them.
But trust in medicine has never been a function of capability alone. It is the product of accountability, moral agency, and the willingness of human beings to bear responsibility for the consequences of their actions on other human beings. No examination score, no benchmark performance, no simulated encounter can substitute for this. The question before us is not whether AI should be integrated into medicine (it will be) but on what terms. The answer, we have argued, requires explicit ethical redlines that preserve human moral authority over the decisions that matter most, clear accountability structures that ensure responsibility cannot dissolve into algorithmic opacity, and governance frameworks that treat the moral architecture of medicine as a constraint on deployment rather than an impediment to progress.
The history of medicine is, in many ways, a history of negotiating the relationship between technological power and moral responsibility. Each new capability (anesthesia, organ transplantation, genetic testing, life support) has required the profession to ask not only what it can do but what it should do, and who should bear the consequences when things go wrong. Artificial intelligence is the latest, and perhaps the most consequential, instance of this recurring challenge. Meeting it will require the same combination of intellectual rigor, moral seriousness, and institutional courage that has defined the best of the medical profession throughout its history. The time to begin is now.

