We present MixANT, a novel architecture for stochastic long-term dense anticipation of human activities. While recent State Space Models (SSMs) like Mamba have shown promise through input-dependent selectivity on three key parameters, the critical forget-gate (A matrix) controlling temporal memory remains static. We address this limitation by introducing a mixture of experts approach that dynamically selects contextually relevant A matrices based on input features, enhancing representational capacity without sacrificing computational efficiency. Extensive experiments on the 50Salads, Breakfast, and Assembly101 datasets demonstrate that MixANT consistently outperforms state-of-the-art methods across all evaluation settings. Our results highlight the importance of input-dependent forget-gate mechanisms for reliable prediction of human behavior in diverse real-world scenarios.
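To make the idea of an input-dependent forget gate concrete, the following is a minimal NumPy sketch of a diagonal selective SSM recurrence in which the A matrix at each step is a soft mixture over a small bank of expert matrices. It is an illustration under stated assumptions, not the authors' implementation: the function name `mixture_a_ssm`, the per-step softmax gating, the weight names (`W_gate`, `W_delta`, `W_B`, `W_C`), and all shapes are hypothetical choices made here, and the actual MixANT routing may differ.

```python
import numpy as np


def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the step size positive.
    return np.logaddexp(0.0, z)


def softmax(z):
    # Softmax over a 1-D vector of gate logits.
    e = np.exp(z - z.max())
    return e / e.sum()


def mixture_a_ssm(x, A_experts, W_gate, W_delta, W_B, W_C):
    """Sequential scan of a diagonal SSM with a mixture over A matrices.

    Assumed (hypothetical) shapes:
      x:         (T, D)    input features per time step
      A_experts: (E, D, N) bank of expert A matrices, entries negative
      W_gate:    (D, E)    projects the input to expert logits
      W_delta:   (D, D)    projects the input to per-channel step sizes
      W_B, W_C:  (D, N)    project the input to the B and C parameters
    """
    T, D = x.shape
    N = A_experts.shape[-1]
    h = np.zeros((D, N))  # hidden state
    y = np.zeros((T, D))
    for t in range(T):
        g = softmax(x[t] @ W_gate)                # (E,) expert weights
        A_t = np.tensordot(g, A_experts, axes=1)  # (D, N) input-dependent A
        delta = softplus(x[t] @ W_delta)          # (D,) positive step size
        B_t = x[t] @ W_B                          # (N,)
        C_t = x[t] @ W_C                          # (N,)
        # Discretized forget gate: values in (0, 1) because A_t < 0, delta > 0.
        A_bar = np.exp(delta[:, None] * A_t)
        h = A_bar * h + (delta[:, None] * B_t[None, :]) * x[t][:, None]
        y[t] = h @ C_t
    return y


# Example call with random, purely illustrative parameters.
rng = np.random.default_rng(0)
T, D, N, E = 6, 4, 3, 2
x = rng.standard_normal((T, D))
A_experts = -np.exp(rng.standard_normal((E, D, N)))  # negative => decaying memory
y = mixture_a_ssm(
    x,
    A_experts,
    rng.standard_normal((D, E)),
    rng.standard_normal((D, D)),
    rng.standard_normal((D, N)),
    rng.standard_normal((D, N)),
)
```

With a single expert the gate weight is always 1, so the sketch reduces to a static A matrix as in a standard selective SSM; adding experts lets the effective decay rate vary with the observed input.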
Citation:
Wasim, S. T., Suleman, H., Zatsarynna, O., Naseer, M., & Gall, J. (2025). MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation. arXiv [cs.CV]. Retrieved from http://arxiv.org/abs/2509.11394