Julian Spravil
I’m a PhD student in the Autonomous Intelligent Systems Group at the University of Bonn and a research engineer at Fraunhofer Institute for Intelligent Analysis and Information Systems. My research interests include multimodal and multilingual machine learning as well as human-centered applications.
Research
My research in multimodal machine learning integrates vision, language, and audio to build assistive systems. Representative papers are highlighted.
-
Scaling Laws for Conditional Emergence of Multilingual Image Captioning via Generalization from TranslationJulian Spravil, Sebastian Houben, Sven Behnke40th AAAI Conference on Artificial Intelligence (AAAI) 2026, SingaporeScaling model size, training samples, and multilinguality enables image captioning in unseen languages via translation as auxiliary task, however, fine-tuning with full task-language coverage remains essential.
-
HyenaPixel: Global Image Context with ConvolutionsJulian Spravil, Sebastian Houben, Sven Behnke27th European Conference on Artificial Intelligence (ECAI) 2024, Santiago de Compostela, SpainHyenaPixel builds on the Hyena operator, extending it to bidirectional and 2D processing to capture global image context with large convolutions, enabling transformer-level accuracy without attention.