Publications - WestAI

Composing and Validating Large-Scale Datasets for Training Open Foundation Models for Audio

Obtaining strong, reproducible foundation language-audio models requires open datasets of sufficient scale and quality. To pre-train a contrastive language-audio model, we compose a large-scale sound effects dataset with detailed text descriptions for each sample. Generating...

Towards FAIR Data in Distributed Machine Learning Systems

In the era of big data and artificial intelligence, distributed machine learning has emerged as a promising solution to address privacy and security concerns while fostering collaboration between multiple parties. However, with the data increased in terms of...

Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis

We present a method that simultaneously addresses the tasks of dynamic scene novel-view synthesis and six degree-of-freedom (6-DOF) tracking of all dense scene elements. We follow an analysis-by-synthesis framework, inspired by recent work that models scenes as a...

Mask4D: Mask Transformer for 4D Panoptic Segmentation

Accurately perceiving and tracking instances over time is essential for the decision-making processes of autonomous agents interacting safely in dynamic environments. With this intention, we propose Mask4Former for the challenging task of 4D panoptic segmentation of...

Graph Pooling Provably Improves Expressivity

In the domain of graph neural networks (GNNs), pooling operators are fundamental for reducing the size of the graph by simplifying graph structures and vertex features. Recent advances have shown that well-designed pooling operators, coupled with message-passing...

Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks

Models with similar performances exhibit significant disagreement in the predictions of individual samples, referred to as prediction churn. Our work explores this phenomenon in graph neural networks by investigating differences between models differing only in their...