OWL: GEOMETRY-AWARE SPATIAL REASONING FOR AUDIO LARGE LANGUAGE MODELS

Accepted at International Conference on Learning Representations (ICLR), 2026

Project Page: https://bashlab.github.io/owl_project/

🚀 Main Results

Comparison of OWL with closed- and open-source baselines on BiDepth across four task types: Type I (event detection), Type II (direction estimation), Type III (spatial reasoning), and Type IV (CoT reasoning). OWL consistently surpasses prior open-source models, with further gains from CoT supervision. Best results are in bold.

Zero-shot performance of OWL on SpatialSoundQA across perception and reasoning tasks. OWL consistently outperforms the baselines, with larger gains on spatial reasoning tasks, demonstrating the benefit of SAGE and CoT instruction tuning. Best results are in bold.

Installation

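# Clone the repository and create a virtual environment
git clone https://github.com/BASHLab/OWL.git
cd OWL

python -m venv venv
source venv/bin/activate

# Install transformers from source at the pinned tag
git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout tags/v4.35.2
pip install -e .

# Install peft from source at the pinned tag
cd ..
git clone https://github.com/huggingface/peft.git
cd peft
git checkout tags/v0.6.0
pip install -e .

# Return to the OWL root before installing the OWL requirements
cd ..
cd seld_cot/owl
pip install -r requirements.txt
cd ../../

# Install the OWL package itself in editable mode
pip install -e .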
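
After the steps above, it can be worth confirming that the environment actually picked up the pinned versions. The following is a minimal sketch, not part of the OWL codebase: `check_pins` and `EXPECTED` are hypothetical names, and it assumes the editable installs report the same version strings as the checked-out tags (`4.35.2` and `0.6.0`).

```python
from importlib import metadata

# Pins taken from the `git checkout tags/...` lines in the installation steps.
# Assumption: the installed distributions report these exact version strings.
EXPECTED = {"transformers": "4.35.2", "peft": "0.6.0"}


def check_pins(expected=EXPECTED, get_version=metadata.version):
    """Return {package: (expected, found)} for each missing or mismatched package."""
    problems = {}
    for name, want in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != want:
            problems[name] = (want, found)
    return problems


if __name__ == "__main__":
    issues = check_pins()
    for name, (want, found) in issues.items():
        print(f"{name}: expected {want}, found {found or 'not installed'}")
    if not issues:
        print("transformers and peft match the pinned versions")
```

Run it inside the activated `venv`. An empty result means both packages match; otherwise, repeating the corresponding `git checkout` and `pip install -e .` steps is the likely fix.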

📁 BiDepth Dataset

BiDepth

🍀 Model Zoo

Model Name
SAGE
OWL-LLaMA2-7B
OWL-LLaMA3.2-3B
OWL-Qwen2.5-7B-Instruct
OWL-LLaMA2.5-3B

👍 Acknowledgement

The OWL codebase is adapted from SLAM-LLM. We are grateful to its authors for their contribution.
