Our Research
With the WestAI Service Center, a consortium of leading scientific institutions in North Rhine-Westphalia has been formed under the leadership of the University of Bonn. The center brings together complementary expertise from various fields of AI application to develop powerful, efficient new multimodal AI models for Germany.
Our research focuses on transfer learning and multimodal AI.
At WestAI, we conduct research in artificial intelligence and machine learning, with a focus on deep learning methods. Deep learning uses deep neural networks and large amounts of data to solve problems. More specifically, we work on the sub-area of transfer learning. This method reuses the knowledge captured in an existing AI model to solve a new problem more efficiently, saving training time and computational resources.
Our research focuses on the transfer learning of multimodal AI models – AI models that can process, for example, sensor, audio, or video data in addition to text. The aim is to advance the state of the art in multimodal AI by investigating the scalability, efficiency, transferability, and continuous improvement of learning algorithms.
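To make the idea concrete, here is a minimal transfer-learning sketch in PyTorch and torchvision (an illustrative example, not code from our projects): a vision model pre-trained on ImageNet is reused for a new, hypothetical ten-class task by freezing its backbone and training only a new classification head.

```python
# A minimal transfer-learning sketch (illustrative; the ten-class
# target task is a hypothetical placeholder).
import torch
import torch.nn as nn
from torchvision import models

# Start from a model pre-trained on ImageNet.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze the backbone: its general visual knowledge is reused as-is.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for the new task.
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new head is trained, which saves time and compute.
optimizer = torch.optim.AdamW(model.fc.parameters(), lr=1e-3)
```

Because only the small head is optimized, training needs far less data and compute than learning the whole model from scratch.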
Our research framework covers the following areas:
Scalable data management and integration
Research into techniques for automatic data pre-processing and generation, especially for heterogeneous and multimodal data types. Data protection and data sovereignty in real-world applications play a central role here.
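As a simple illustration of what such pre-processing can look like, the following sketch (all field names and the license whitelist are hypothetical) normalizes heterogeneous raw records into uniform training samples and drops records whose license does not permit training, keeping provenance alongside the data:

```python
# A hedged sketch of normalizing heterogeneous records into uniform
# samples (field names and license policy are hypothetical).
from dataclasses import dataclass
from typing import Optional

@dataclass
class Sample:
    text: Optional[str]        # caption, transcript, or document snippet
    image_path: Optional[str]  # image file, if the record has one
    audio_path: Optional[str]  # audio clip, if the record has one
    license: str               # provenance travels with the sample

ALLOWED_LICENSES = {"cc0", "cc-by"}  # placeholder data-protection policy

def normalize(record: dict) -> Optional[Sample]:
    # Enforce the usage policy before data can enter any training set.
    if record.get("license") not in ALLOWED_LICENSES:
        return None
    return Sample(
        text=record.get("caption"),
        image_path=record.get("image"),
        audio_path=record.get("audio"),
        license=record["license"],
    )
```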
Large-scale model training
Developing efficient learning methods for training across multiple compute nodes, and exploring model architectures and training methods for the next generation of accelerated computing.
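For illustration, the sketch below shows one standard building block of such training: data-parallel training with PyTorch DistributedDataParallel, where each process drives one GPU and gradients are synchronized automatically (the model, data, and loss are placeholders, not our training code):

```python
# A minimal data-parallel training sketch with PyTorch DDP
# (illustrative; model, batch, and loss are placeholders).
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")        # one process per GPU
local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(512, 512).cuda(local_rank)  # placeholder model
model = DDP(model, device_ids=[local_rank])         # wraps gradient syncing

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
x = torch.randn(8, 512, device=f"cuda:{local_rank}")  # placeholder batch
loss = model(x).pow(2).mean()                         # placeholder loss
loss.backward()                                       # gradients all-reduced here
optimizer.step()

dist.destroy_process_group()
```

Launched with, for example, `torchrun --nproc_per_node=4 train.py`, the same script scales from a single GPU to many nodes.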
Efficient transfer learning
Systematic investigation of the conditions under which transfer learning from pre-trained models succeeds in different domains and application areas, taking into account factors such as model and data scale, architecture, and transfer efficiency.
Model compression for low-resource environments
Investigation of methods for compressing pre-trained and ‘transferred’ models (after transfer learning) while maintaining model performance. We also research running these models on specialized hardware for energy-efficient execution.
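As a minimal example of one common compression technique (not necessarily the methods under investigation here), the following PyTorch sketch applies post-training dynamic quantization, storing the linear layers of a placeholder model in int8 to shrink it for CPU inference:

```python
# A minimal post-training compression sketch: dynamic int8 quantization
# in PyTorch (the model stands in for a fine-tuned network).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
model.eval()

# Weights of all Linear layers are stored in int8; activations are
# quantized on the fly, reducing size and speeding up CPU inference.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(quantized(x).shape)  # same interface as the original model
```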
Continuous learning
Investigating the ongoing development of large AI models through continuous learning from transferred and compressed models. We also research the scalability of this approach and its possible applications in federated learning environments.
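One simple mechanism used in continuous (continual) learning is rehearsal: a bounded buffer of past examples is mixed into the batches of each new task so earlier knowledge is not overwritten. The sketch below (buffer size and mixing ratio are illustrative placeholders) maintains such a buffer with reservoir sampling:

```python
# A minimal rehearsal-buffer sketch for continual learning
# (illustrative only; sizes and ratios are placeholders).
import random

BUFFER_SIZE = 1000
replay_buffer = []  # (input, label) pairs from earlier tasks
seen = 0            # total number of examples observed so far

def update_buffer(example):
    """Reservoir sampling: every example ever seen has equal probability
    of being in the buffer, so the memory stays small but diverse."""
    global seen
    seen += 1
    if len(replay_buffer) < BUFFER_SIZE:
        replay_buffer.append(example)
    else:
        j = random.randrange(seen)
        if j < BUFFER_SIZE:
            replay_buffer[j] = example

def make_training_batch(new_batch, replay_fraction=0.5):
    """Mix fresh examples with replayed ones to mitigate forgetting."""
    k = min(int(len(new_batch) * replay_fraction), len(replay_buffer))
    return list(new_batch) + random.sample(replay_buffer, k)
```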
Our publications
PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices
Radu Alexandru Rosu and Sven Behnke. PermutoSDF: Fast Multi-View Reconstruction with Implicit Surfaces using Permutohedral Lattices. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, Canada, June 2023.
Reproducible scaling laws for contrastive language-image learning
Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, and Jenia Jitsev. Reproducible scaling laws for contrastive language-image learning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
External Camera-based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge Sensors
Simon Bultmann, Raphael Memmesheimer, and Sven Behnke. External Camera-based Mobile Robot Pose Estimation for Collaborative Perception with Smart Edge Sensors. In IEEE International Conference on Robotics and Automation (ICRA), London, UK, June 2023.
Unified shape and appearance reconstruction with joint camera parameter refinement
Julian Kaltheuner, Patrick Stotko, and Reinhard Klein. Unified shape and appearance reconstruction with joint camera parameter refinement. Graphical Models, Volume 129, 2023.
Social Diffusion: Long-term Multiple Human Motion Anticipation
Julian Tanke, Linguang Zhang, Amy Zhao, Chengcheng Tang, Yujun Cai, Lezi Wang, Po-Chen Wu, Juergen Gall, and Cem Keskin. Social Diffusion: Long-term Multiple Human Motion Anticipation.
Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks
Andreas Roth and Thomas Liebig. Rank Collapse Causes Over-Smoothing and Over-Correlation in Graph Neural Networks. arXiv preprint arXiv:2308.16800.
Curvature-based Pooling within Graph Neural Networks
Cedric Sanders, Andreas Roth, and Thomas Liebig. Curvature-based Pooling within Graph Neural Networks. arXiv preprint arXiv:2308.16516.
LAION-5B: An open large-scale dataset for training next generation image-text models
Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev. LAION-5B: An open large-scale dataset for training next generation image-text models. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 25278-25294.
DataComp: In search of the next generation of multimodal datasets
Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, and Ludwig Schmidt. DataComp: In search of the next generation of multimodal datasets. NeurIPS 2023.
FPO++: Efficient Encoding and Rendering of Dynamic Neural Radiance Fields by Analyzing and Enhancing Fourier PlenOctrees
Saskia Rabich, Patrick Stotko, and Reinhard Klein. FPO++: Efficient Encoding and Rendering of Dynamic Neural Radiance Fields by Analyzing and Enhancing Fourier PlenOctrees. arXiv preprint arXiv:2310.20710, 2023.
On the Stability of Neural Segmentation in Radiology
Moritz Wolter, Lokesh Veeramacheneni, Bettina Baeßler, Ulrike Attenberger, and Barbara Wichtmann. On the Stability of Neural Segmentation in Radiology. In European Symposium on Artificial Neural Networks (ESANN), 2024.
Word Sense Disambiguation as a Game of Neurosymbolic Darts
Tiansi Dong and Rafet Sifa. Word Sense Disambiguation as a Game of Neurosymbolic Darts. arXiv preprint arXiv:2307.16663, 2023.
Learning subsurface scattering solutions of tightly-packed granular media using optimal transport
Domenic Zingsheim and Reinhard Klein. Learning subsurface scattering solutions of tightly-packed granular media using optimal transport. Computers & Graphics, Volume 119, 2024.
Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models
David Stotko, Nils Wandel, and Reinhard Klein. Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models. arXiv preprint arXiv:2311.12796, 2023.
TraM-NeRF: Tracing Mirror and Near-Perfect Specular Reflections through Neural Radiance Fields
Leif Van Holland, Ruben Bliersbach, Jan U. Müller, Patrick Stotko, and Reinhard Klein. TraM-NeRF: Tracing Mirror and Near-Perfect Specular Reflections through Neural Radiance Fields. arXiv preprint arXiv:2310.10650, 2023.
SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net
Helin Cao and Sven Behnke. SLCF-Net: Sequential LiDAR-Camera Fusion for Semantic Scene Completion using a 3D Recurrent U-Net. arXiv preprint arXiv:2403.08885, 2024.
Learning from SAM: Harnessing a Segmentation Foundation Model for Sim2Real Domain Adaptation through Regularization
Mayara E. Bonani, Max Schwarz, and Sven Behnke. Learning from SAM: Harnessing a Segmentation Foundation Model for Sim2Real Domain Adaptation through Regularization. arXiv preprint arXiv:2309.15562, 2023.
Post-Processing Independent Evaluation of Sound Event Detection Systems
Janek Ebbers, Reinhold Haeb-Umbach, and Romain Serizel. Post-Processing Independent Evaluation of Sound Event Detection Systems. In Detection and Classification of Acoustic Scenes and Events Workshop (DCASE), 2023. https://dcase.community/documents/workshop2023/proceedings/DCASE2023Workshop_Ebbers_62.pdf
pFedV: Mitigating Feature Distribution Skewness via Personalized Federated Learning with Variational Distribution Constraints
Yongli Mou, Jiahui Geng, Feng Zhou, Oya Beyan, Chunming Rong, and Stefan Decker. pFedV: Mitigating Feature Distribution Skewness via Personalized Federated Learning with Variational Distribution Constraints. In Lecture Notes in Computer Science, pp. 283–294, 2023. doi: 10.1007/978-3-031-33377-4_22.
Graph Pooling Provably Improves Expressivity
Veronica Lachi, Alice Moallemy-Oureh, Andreas Roth, and Pascal Welke. Graph Pooling Provably Improves Expressivity. OpenReview. https://openreview.net/forum?id=lR5NYB9zrv
Mask4D: Mask Transformer for 4D Panoptic Segmentation
Kadir Yilmaz, Jonas Schult, Alexey Nekrasov, and Bastian Leibe. Mask4D: Mask Transformer for 4D Panoptic Segmentation. arXiv preprint arXiv:2309.16133, 2023.
Towards FAIR Data in Distributed Machine Learning Systems
Yongli Mou et al. Towards FAIR Data in Distributed Machine Learning Systems. In IEEE Global Communications Conference (GLOBECOM), December 2023. doi: 10.1109/globecom54140.2023.10437414.
Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks
Andreas Roth and Thomas Liebig. Distilling Influences to Mitigate Prediction Churn in Graph Neural Networks. arXiv preprint arXiv:2310.00946, 2023.
Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis
Jonathon Luiten, Georgios Kopanas, Bastian Leibe, and Deva Ramanan. Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis. arXiv preprint arXiv:2308.09713, 2023.
Composing and Validating Large-Scale Datasets for Training Open Foundation Models for Audio
Marianna Nezhurina et al. Composing and Validating Large-Scale Datasets for Training Open Foundation Models for Audio. Machine Learning for Audio Workshop. https://mlforaudioworkshop.com/CompValDataFoundationModels.pdf
Sustain.AI: a Recommender System to analyze Sustainability Reports
Lars Hillebrand et al. Sustain.AI: a Recommender System to analyze Sustainability Reports. ACM, June 2023. doi: 10.1145/3594536.3595131.
Towards Linguistically Informed Multi-objective Transformer Pre-training for Natural Language Inference
Maren Pielka, Svetlana Schmidt, Lisa Pucknat, and Rafet Sifa. Towards Linguistically Informed Multi-objective Transformer Pre-training for Natural Language Inference. In Lecture Notes in Computer Science, pp. 553–561, 2023. doi: 10.1007/978-3-031-28238-6_46.
Improving Natural Language Inference in Arabic Using Transformer Models and Linguistically Informed Pre-Training
Maren Pielka, Jörn Hees, Bouthaina Soulef Abdou, Rafet Sifa, and Mohammad Majd Saad Al Deen. Improving Natural Language Inference in Arabic Using Transformer Models and Linguistically Informed Pre-Training. IEEE, December 2023. https://ieeexplore.ieee.org/document/10371891
Generating Prototypes for Contradiction Detection Using Large Language Models and Linguistic Rules
Maren Pielka, Svetlana Schmidt, and Rafet Sifa. Generating Prototypes for Contradiction Detection Using Large Language Models and Linguistic Rules. In IEEE International Conference on Big Data (BigData), December 2023. doi: 10.1109/bigdata59044.2023.10386499.
Measurability of quality characteristics identified in latent spaces of Generative AI Models
Robert H. Schmitt, Dominik Wolfschläger, Jan-Henrik Woltersmann, and Lennart Stohrer. Measurability of quality characteristics identified in latent spaces of Generative AI Models. CIRP Annals, 2024. doi: 10.1016/j.cirp.2024.04.073.
Language models scale reliably with over-training and on downstream tasks
Samir Yitzhak Gadre et al. Language models scale reliably with over-training and on downstream tasks. arXiv preprint arXiv:2403.08540, 2024.
Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models
Lars Hillebrand, Armin Berger, Tobias Deußer, Tim Dilmaghani, Mohamed Khaled, Bernd Kliem, Rüdiger Loitz, Maren Pielka, David Leonhard, Christian Bauckhage, and Rafet Sifa. Improving Zero-Shot Text Matching for Financial Auditing with Large Language Models. ACM, August 2023. doi: 10.1145/3573128.3609344.
OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models
Anas Awadalla et al. OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models. arXiv preprint arXiv:2308.01390, 2023.
HyenaPixel: Global Image Context with Convolutions
Julian Spravil, Sebastian Houben, and Sven Behnke. HyenaPixel: Global Image Context with Convolutions. arXiv preprint arXiv:2402.19305, 2024.
Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation
Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha, Gianpiero Francesca, and Juergen Gall. Gated Temporal Diffusion for Stochastic Long-Term Dense Anticipation. arXiv preprint arXiv:2407.11954, 2024.
How Much Temporal Long-Term Context is Needed for Action Segmentation?
Emad Bahrami, Gianpiero Francesca, and Juergen Gall. How Much Temporal Long-Term Context is Needed for Action Segmentation? arXiv preprint arXiv:2308.11358, 2023.
Chronicling Germany: An Annotated Historical Newspaper Dataset
Christian Schultze, Niklas Kerkfeld, Kara Kuebart, Princilia Weber, Moritz Wolter, and Felix Selgert. Chronicling Germany: An Annotated Historical Newspaper Dataset. arXiv preprint arXiv:2401.16845, 2024.
Get in touch with us!
Do you have any questions or are you interested in working with us?
Feel free to email us, and we’ll be happy to help and advise you.
Follow us on LinkedIn
Don’t want to miss WestAI updates, news, and events or want to share them faster with your network?
Follow us on LinkedIn!