Many materials, such as sand, rice, wheat, or other kinds of seeds, consist of numerous individual grains that determine the visual appearance of these materials. When generating images of these mixtures, the primary challenge is to simulate the interaction of light...
Word Sense Disambiguation (WSD) is one of the hardest tasks in natural language understanding and knowledge engineering. The glass ceiling of an 80% F1 score was recently reached through supervised deep learning, enriched by a variety of knowledge graphs. Here, we...
Neural networks promise automated prostate segmentation for the development of precise and quantifiable image-based biomarkers in modern personalized oncology. Before clinical translation, however, their stability must be ensured. In this study, we train...
Fourier PlenOctrees have been shown to be an efficient representation for real-time rendering of dynamic Neural Radiance Fields (NeRF). Despite its many advantages, this method suffers from artifacts introduced by the involved compression when combining it with recent...
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the machine...
Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on the expensive, accurate labels used in standard unimodal supervised learning for vision. The resulting models showed...