Research · Wednesday, April 22, 2026 · 8 min read

Handling and Interpreting Missing Modalities in Patient Clinical Trajectories via Autoregressive Sequence Modeling

AI Agents Daily
Curated by AI Agents Daily team · Source: ArXiv CS.LG

Andrew Wang, Ellie Pavlick, and Ritambhara Singh submitted a paper to arXiv's machine learning section (cs.LG) on April 20, 2026, arguing that the healthcare AI field has been solving the wrong problem. Instead of patching over missing data with imputation tricks or discarding incomplete records, their framework reframes the entire diagnostic process as an autoregressive sequence modeling task, borrowing the same causal decoder architecture that powers modern large language models. The result is a system that treats incomplete patient data not as an exception, but as an expected and manageable condition.

Why This Matters

Missing data is not an edge case in hospitals. It is the default state. A patient arrives in the emergency department, some labs are back, imaging is still pending, vitals are recorded but clinical notes are incomplete, and clinicians need answers right now. Every AI diagnostic tool that requires a complete multimodal input is, in practice, useless in that moment. The MIMIC-IV and eICU benchmarks that Wang and colleagues used represent tens of thousands of real ICU patient stays, making their performance results meaningful rather than theoretical. The broader healthcare AI market is projected to exceed $45 billion by 2026 according to multiple industry reports, and the single biggest deployment blocker is exactly this problem.


The Full Story

The core insight from Wang, Pavlick, and Singh is deceptively simple: stop treating clinical records as a static table of features and start treating them as a story told over time, with some chapters missing. Autoregressive models, the same class of models behind GPT-style systems, predict the next token based on everything that came before. Applied to patient data, this means the model learns to make predictions based on whatever information happens to be available at any given moment, without requiring the full picture.
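To make the "story told over time" framing concrete, here is a minimal sketch of the idea, not the authors' code: a patient trajectory becomes an ordered token sequence that a model consumes left to right, so each prediction uses only what has been observed so far. The modality names, token format, and the trivial stand-in "model" are all illustrative assumptions.

```python
# Toy illustration of autoregressive clinical sequence modeling.
# Modalities that were never recorded simply never emit a token --
# no imputation step is needed.

def trajectory_to_tokens(events):
    """Flatten timestamped clinical events into an ordered token sequence."""
    tokens = []
    for t, modality, value in sorted(events, key=lambda e: e[0]):
        tokens.append(f"{modality}={value}@t{t}")
    return tokens

def predict_stepwise(tokens, predict):
    """Autoregressive use: at each step the model sees only the prefix."""
    preds = []
    for i in range(1, len(tokens) + 1):
        preds.append(predict(tokens[:i]))  # prediction from available data only
    return preds

# Example patient: heart rate and one lab are present; imaging is
# missing entirely, and the sequence simply does not mention it.
events = [(0, "hr", 92), (1, "lactate", 2.4), (3, "hr", 110)]
tokens = trajectory_to_tokens(events)

# Hypothetical stand-in "model": risk score from observed heart-rate tokens.
risk = lambda prefix: sum(1 for tok in prefix if tok.startswith("hr")) / len(prefix)
preds = predict_stepwise(tokens, risk)
```

The point of the sketch is structural: because predictions are conditioned on a prefix of whatever tokens exist, a missing modality changes the sequence length, not the validity of the input.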

To make this work across different types of clinical data, such as lab results, vital signs, imaging, and notes, the team developed what they call a missingness-aware contrastive pre-training objective. Contrastive learning, in general, trains a model to pull similar things together in a shared mathematical space while pushing dissimilar things apart. Their twist is building that shared space with missingness explicitly in mind, so the model learns robust representations even when entire modalities drop out. This pre-training phase happens before the model ever sees the specific prediction tasks on MIMIC-IV and eICU.
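A rough sketch of what a missingness-aware contrastive objective could look like, under stated assumptions: two "views" of the same patient are created by randomly dropping whole modalities, and a standard InfoNCE loss pulls the two views together while pushing other patients away. The dropout scheme, pooling, and loss details below are illustrative guesses in the spirit of the paper, not the authors' exact method.

```python
import numpy as np

def drop_modalities(embs, rng, p=0.5):
    """Simulate missingness: zero out whole modality embeddings at random,
    always keeping at least one modality."""
    keep = rng.random(len(embs)) > p
    if not keep.any():
        keep[rng.integers(len(embs))] = True
    return embs * keep[:, None]

def patient_repr(embs):
    """Mean-pool the available (non-zero) modality embeddings, L2-normalized."""
    n_present = max(int((np.abs(embs).sum(axis=1) > 0).sum()), 1)
    v = embs.sum(axis=0) / n_present
    return v / (np.linalg.norm(v) + 1e-8)

def info_nce(anchors, positives, temp=0.1):
    """Standard InfoNCE: matched views on the diagonal are positives,
    every other patient in the batch is a negative."""
    logits = anchors @ positives.T / temp
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 3, 8))  # 4 patients, 3 modalities, embedding dim 8
view_a = np.stack([patient_repr(drop_modalities(p, rng)) for p in batch])
view_b = np.stack([patient_repr(drop_modalities(p, rng)) for p in batch])
loss = info_nce(view_a, view_b)
```

Because the two views of each patient differ precisely in which modalities survived the dropout, minimizing this loss forces representations to stay stable under missing modalities, which is the property the paper's pre-training is after.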

The architecture choice is significant. Transformer-based causal decoders are not the obvious pick for structured clinical tabular data. Most prior work in clinical ML reached for encoders, attention mechanisms applied to full sequences, or hybrid architectures. Wang and colleagues chose the decoder-only approach deliberately, because it naturally handles variable-length sequences and temporal ordering, exactly the properties that patient trajectories require. Their benchmarks on MIMIC-IV and eICU show this outperforms baseline approaches, though the specific numerical margins are detailed in the full paper available at arXiv:2604.18753.
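The decoder-only property the authors rely on comes down to one lower-triangular attention mask: position i can attend to positions 0..i and never to the future, so sequences of any length share the same machinery. A minimal single-head sketch (plain NumPy, not the paper's implementation):

```python
import numpy as np

def causal_mask(seq_len):
    """Position i may attend to positions 0..i -- never the future."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def masked_attention(q, k, v, mask):
    """Single-head scaled dot-product attention with a causal mask."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)          # block future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))   # 5 observed clinical events, embedding dim 8
out = masked_attention(x, x, x, causal_mask(5))
```

By construction the first event's output depends on that event alone, and a patient with 3 recorded events and one with 30 pass through identical code, which is exactly the variable-length, temporally ordered behavior the paragraph above describes.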

What makes this paper stand out beyond the performance numbers is the interpretability work. The team used interpretability techniques to examine what actually happens when modalities go missing, and they found that without their contrastive pre-training, removing a modality causes wildly divergent model behavior across different patients. Two patients with similar clinical profiles but slightly different available data could receive dramatically different predictions. The contrastive pre-training stabilizes this behavior, which is exactly what regulators and clinicians need to see before trusting any AI system with diagnostic support.
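The ablation probe described above can be sketched in a few lines: drop one modality, re-run the model, and measure how far predictions move across patients. The linear stand-in scorer and modality weights here are hypothetical; the paper's models and interpretability tooling are far richer.

```python
import numpy as np

def ablation_shift(model, patients, modality):
    """Mean and spread of the prediction change when one modality is dropped."""
    shifts = []
    for p in patients:
        full = model(p)
        ablated = model({m: v for m, v in p.items() if m != modality})
        shifts.append(abs(full - ablated))
    return float(np.mean(shifts)), float(np.std(shifts))

# Hypothetical stand-in scorer: weighted sum over whichever modalities exist.
weights = {"labs": 0.5, "vitals": 0.3, "imaging": 0.2}
model = lambda p: sum(weights[m] * v for m, v in p.items())

patients = [
    {"labs": 0.8, "vitals": 0.6, "imaging": 0.4},
    {"labs": 0.7, "vitals": 0.9},        # imaging already missing
]
mean_shift, spread = ablation_shift(model, patients, "imaging")
```

A large spread across clinically similar patients is the instability the paper reports for models without contrastive pre-training; a small, consistent shift is the stabilized behavior regulators want to see.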

The framing around "safe, transparent clinical AI" is not marketing language here. It directly addresses the explainability requirements that the FDA and European health authorities have been tightening around AI-based clinical decision support. A model that behaves erratically when data is missing cannot pass regulatory scrutiny, no matter how impressive its average-case accuracy.

Key Details

  • Paper submitted to ArXiv on April 20, 2026, by Andrew Wang, Ellie Pavlick, and Ritambhara Singh.
  • The framework was evaluated on two established clinical benchmarks: MIMIC-IV and eICU.
  • Pre-training uses a missingness-aware contrastive objective designed specifically for sparse multimodal datasets.
  • The architecture borrows causal decoder designs from large language models, not the encoder-based models typical in clinical ML.
  • The paper is available under a Creative Commons Zero 1.0 license, meaning fully open access.
  • The submission file is 24,230 KB, suggesting substantial supplementary material and experiments.

What's Next

The natural next step for this research is prospective clinical validation, where the model runs alongside real clinician decisions in a live hospital environment rather than a historical benchmark. Given the regulatory requirements for clinical AI tools in the United States and Europe, researchers will need to demonstrate performance stability across patient demographics that extend beyond MIMIC-IV's largely American ICU population. Watch for follow-up work from this group on uncertainty quantification, since the autoregressive framing opens a direct path toward expressing prediction confidence when specific modalities are absent.

How This Compares

The closest published comparison is work from Lena Stempfle at Chalmers University and colleagues at INRIA-Inserm, which surveyed 55 clinicians across 29 French trauma centers in early 2025 to understand how missing features affect interpretable clinical ML in practice. That work documented the problem from the clinician side. Wang and colleagues are solving it from the model architecture side. Together they form a compelling case that the field is converging on this problem from multiple directions simultaneously, which usually signals that a real solution is close.

ClinicalGAN, detailed in a Nature Scientific Reports article in 2024, took a generative adversarial approach to the same general problem of modeling patient trajectories with incomplete data. GANs are notoriously difficult to train and even harder to interpret, which limits their clinical adoption. The autoregressive approach from Wang et al. sidesteps those instability issues while also providing better native interpretability through attention visualization and token-level attribution.

The broader LLM-for-healthcare wave has mostly focused on text, specifically clinical notes, discharge summaries, and medical question answering. What this paper does is extend the sequence modeling intuition to non-text modalities, which is a harder and more clinically relevant problem. Projects like Google's Med-PaLM 2, announced in 2023, focused on medical question answering over text and did not tackle the specific challenge of temporally sparse multimodal ICU data with systematic missingness. This paper fills a gap that the large-lab efforts have mostly ignored.

FAQ

Q: What does autoregressive modeling mean in plain language? A: Autoregressive modeling means predicting what comes next based on everything that came before, one step at a time. Think of it like autocomplete on your phone, but instead of predicting words, this system predicts a patient's health status based on whatever clinical measurements have been recorded so far, even if some measurements are missing.

Q: Why does missing data cause such big problems for clinical AI? A: Most clinical AI models were trained and tested on datasets where all relevant inputs are present. When deployed in real hospitals, where some labs are pending or imaging was never ordered, the model receives inputs it was never prepared for. This causes unpredictable behavior, which is dangerous in a medical setting and why many promising research models never reach actual clinical use.

Q: What are MIMIC-IV and eICU, and why do they matter? A: MIMIC-IV and eICU are two large, publicly available databases of real ICU patient records collected from American hospitals. They are the standard benchmarks for clinical machine learning research because they contain realistic, messy, incomplete data from real patients. Strong performance on both databases gives the research community confidence that a method generalizes beyond a single institution or patient population.

This paper represents the kind of foundational work that rarely gets mainstream coverage but has an outsized effect on what clinical AI actually looks like in five years. The combination of LLM-style architecture, explicit missingness handling, and interpretability analysis is a template other research groups will build on. For more coverage of AI tools tackling real-world deployment challenges, subscribe to the AI Agents Daily newsletter for daily updates on AI agents, tools, and automation.

Our Take

This story matters because it shows LLM-style architectural ideas crossing over into clinical AI and directly attacking the field's biggest deployment blocker: systematically missing data. If the stability results hold up in prospective validation, the decoder-plus-contrastive-pre-training recipe could become a template for the next generation of multimodal clinical models.
