Obtaining strong, reproducible foundation language-audio models requires open datasets of sufficient scale and quality. To pre-train a contrastive language-audio model, we compose a large-scale sound-effects dataset with detailed text descriptions for each sample. Generating...
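As a minimal sketch of the kind of symmetric contrastive objective typically used for language-audio pre-training (the abstract does not specify the exact formulation), the following pairs each audio embedding with its own caption embedding as the positive and treats all other captions in the batch as negatives; the encoder outputs and temperature value here are illustrative.

import torch
import torch.nn.functional as F

def contrastive_language_audio_loss(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired audio/text embeddings.
    Matching pairs lie on the diagonal of the similarity matrix; all other
    entries act as in-batch negatives."""
    audio_emb = F.normalize(audio_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = audio_emb @ text_emb.t() / temperature            # (B, B) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_a2t = F.cross_entropy(logits, targets)                # audio -> text
    loss_t2a = F.cross_entropy(logits.t(), targets)            # text -> audio
    return 0.5 * (loss_a2t + loss_t2a)

# Illustrative usage with random tensors standing in for encoder outputs.
audio_emb = torch.randn(8, 512)
text_emb = torch.randn(8, 512)
print(contrastive_language_audio_loss(audio_emb, text_emb))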
We present sustain.AI, an intelligent, context-aware recommender system that assists auditors, financial investors, and the general public in efficiently analyzing companies' sustainability reports. The tool leverages an end-to-end trainable architecture that...
We introduce a linguistically enhanced combination of pre-training methods for transformers. The pre-training objectives include POS-tagging, synset prediction based on semantic knowledge graphs, and parent prediction based on dependency parse trees. Our approach...
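To make the combination of objectives concrete, the sketch below adds toy task heads for POS tagging, synset prediction, and dependency-parent prediction on top of a shared encoder and sums their losses; the label counts, hidden size, and equal loss weighting are placeholders rather than the paper's actual configuration.

import torch
import torch.nn as nn

class LinguisticPretrainingHeads(nn.Module):
    """Toy multi-task heads over shared transformer token states:
    POS tagging, synset prediction, and dependency-parent prediction."""

    def __init__(self, hidden=256, n_pos=17, n_synsets=1000, max_len=128):
        super().__init__()
        self.pos_head = nn.Linear(hidden, n_pos)          # one POS tag per token
        self.synset_head = nn.Linear(hidden, n_synsets)   # synset id per token
        self.parent_head = nn.Linear(hidden, max_len)     # position of the dependency head

    def forward(self, token_states, pos_labels, synset_labels, parent_index):
        ce = nn.CrossEntropyLoss()
        pos_loss = ce(self.pos_head(token_states).transpose(1, 2), pos_labels)
        syn_loss = ce(self.synset_head(token_states).transpose(1, 2), synset_labels)
        par_loss = ce(self.parent_head(token_states).transpose(1, 2), parent_index)
        return pos_loss + syn_loss + par_loss  # equal weighting, purely illustrative

# Illustrative usage with random encoder outputs and labels.
heads = LinguisticPretrainingHeads()
states = torch.randn(2, 16, 256)
loss = heads(states,
             torch.randint(0, 17, (2, 16)),
             torch.randint(0, 1000, (2, 16)),
             torch.randint(0, 128, (2, 16)))
print(loss)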
This paper addresses the classification of Arabic text data in the field of Natural Language Processing (NLP), with a particular focus on Natural Language Inference (NLI) and Contradiction Detection (CD). Arabic is considered a resource-poor language, meaning that...
We introduce a novel data generation method for contradiction detection, which leverages the generative power of large language models as well as linguistic rules. Our vision is to provide a condensed corpus of prototypical contradictions, allowing for in-depth...
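A possible reading of such a hybrid pipeline, sketched under assumed details the abstract does not spell out, is to produce contradiction candidates with simple linguistic rules (antonym substitution, negation insertion) and to hand harder cases to a large language model via a prompt; the antonym lexicon, fallback rule, and prompt wording below are all hypothetical.

import re

# Tiny antonym lexicon for illustration; a real system would draw on
# broader linguistic resources.
ANTONYMS = {"open": "closed", "increases": "decreases", "always": "never"}

def rule_based_contradiction(sentence: str) -> str:
    """Generate a candidate contradiction by antonym substitution,
    falling back to simple negation of a copular verb."""
    for word, antonym in ANTONYMS.items():
        pattern = rf"\b{word}\b"
        if re.search(pattern, sentence):
            return re.sub(pattern, antonym, sentence, count=1)
    return re.sub(r"\bis\b", "is not", sentence, count=1)

# Hypothetical prompt template for the LLM-driven part of the generation.
LLM_PROMPT = (
    "Rewrite the following sentence so that it directly contradicts the "
    "original, changing as few words as possible:\n\n{premise}"
)

premise = "The store is open and demand always increases in December."
print(rule_based_contradiction(premise))
print(LLM_PROMPT.format(premise=premise))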
Deep Learning can inherently learn, in the form of disentangled latent spaces, complex properties of image datasets that are difficult to model with traditional machine vision algorithms. Using the latent spaces of Generative AI models, a feature extraction method to...