Liu, Y., Lerch, L., Palmieri, L., Rudenko, A., Koch, S., Ropinski, T., Aiello, M. (2025). Context-Aware Human Behavior Prediction Using Multimodal Large Language Models: Challenges and Insights. IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) 2025.
Project Page: CAP-MLLM - Code: boschresearch/cap-mllm - Paper: arXiv:2504.00839.
Generating realistic tree images with Stable Diffusion and ControlNet
- Description: Trained a custom ControlNet that guides tree image generation using a hand-drawn tree skeleton as the conditioning input
- Team: Solo
- Technologies: Generative AI, Diffusion Models, Python (PyTorch), HuggingFace
Feedback Networks for Robust Object Detection
- Description: Added feedback connections in Faster R-CNN's ResNet backbone to improve robustness against image corruptions
- Team: Solo
- Technologies: Computer Vision, Convolutional Neural Networks, Python (PyTorch)
Learning to Query Social Media via Interpretable ML
- Description: Combined natural language processing with optimized sparse decision trees to improve social media API queries
- Team: 4
- Technologies: Explainable AI, Decision Trees, Python (scikit-learn)
Monte Carlo Tree Search (MCTS) for Tic-Tac-Toe
- Description: Python implementation of Monte Carlo Tree Search (MCTS) with Tic-Tac-Toe and Gridworld Gymnasium environments
- Team: 2
- Technologies: Reinforcement Learning, Python
Real-time ML Weather Predictions with Apache Kafka on AWS
- Description: ML predictions on real-time weather data using Apache Kafka on AWS, investigating the impact of resource constraints on the performance of the streaming application pipeline.
- Team: 4
- Technologies: Data Science, MLOps, Kafka, AWS
- Description: Building a web platform on which businesses can transparently communicate the order status to their customers
- Team: 3
- Technologies: Web Development, Frontend (Vue.js), Backend (Laravel), Product Development
TUM.ai x Aleph Alpha: BenchPress Makeathon
- Description: Built a Llama 3.1-based coding agent to solve coding problems, evaluated on the APPS dataset
- Team: 4
- Won 1st place in the hackathon
TUM.ai Makeathon 2025: OpenAI Open Track Challenge
- Description: ReplyGremlin - Email handling voice assistant that helps reply to and organise emails, all in a real-time conversational experience.
- Team: 4
EDTH MDS Hack: SE3 Labs Challenge
- Description: Visual drone-map localization pipeline using SuperPoint for keypoint extraction and LightGlue for matching.
- Team: 3
TUM.ai x BKW Engineering Hackathon
- Description: FachPlanner - AI-powered building engineering planning and cost estimation in the construction industry.
- Team: 4