To operate safely, autonomous vehicles (AVs) need to detect and handle unexpected objects or anomalies on the road. While significant research exists for anomaly detection and segmentation in 2D, research progress in 3D is underexplored. Existing datasets lack...
Autonomous driving has the potential to significantly enhance productivity and provide numerous societal benefits. Ensuring robustness in these safety-critical systems is essential, particularly when vehicles must navigate adverse weather conditions and sensor...
Sa2VA is a recent model for language-guided dense grounding in images and video that achieves state-of-the-art results on multiple segmentation benchmarks and that has become widely popular. However, we found that Sa2VA does not perform according to its full potential...
The detection and classification of exfoliated two-dimensional (2D) material flakes from optical microscope images can be automated using computer vision algorithms. This has the potential to increase the accuracy and objectivity of classification and the efficiency...
We present MixANT, a novel architecture for stochastic long-term dense anticipation of human activities. While recent State Space Models (SSMs) like Mamba have shown promise through input-dependent selectivity on three key parameters, the critical forget-gate ( ...
Real-time understanding of dynamic human presence is crucial for immersive Augmented Reality (AR), yet challenging on resource-constrained Head-Mounted Displays (HMDs). This paper introduces Rhino-AR, a pipeline for ondevice 3D human pose estimation and dynamic scene...