AdaptFM

Resource-Adaptive Foundation Model Inference (AdaptFM)


Foundation models (FMs) have achieved remarkable capabilities across language, vision, and multimodal tasks. However, their inference typically follows a rigid, one-size-fits-all paradigm where every input, regardless of complexity, passes through the same fixed architecture with identical computational cost. This inflexibility creates a fundamental mismatch between the diverse resource budgets encountered in real-world deployments and the static nature of model inference.

Adaptation can take many forms: compressing models to meet deployment budgets, designing flexible architectures that support multiple configurations from a single trained model, or making dynamic runtime decisions based on input complexity or resource availability. The central question we explore is: How can foundation model inference flexibly adapt to any resource budget, whether constrained by memory, compute, latency, energy, or cost, while maximizing output quality?

This challenge spans algorithms, architectures, and systems. We aim to bring together researchers from the ML, systems, and hardware communities to advance techniques that move beyond rigid inference toward flexible, resource-aware foundation models, and we welcome contributions across these areas.

We hope that AdaptFM will serve as a forum for researchers across disciplines to raise and discuss challenging topics, share new ideas, and exchange experience in building flexible, resource-aware foundation models, from both theoretical and experimental perspectives.

The workshop is co-located with ICML’26, held in Seoul, South Korea.

Submission portal: openreview.net

Submission deadline: May 1, 2026, AoE
