CELLO: Causal evaluation of large vision-language models

M Chen, B Peng, Y Zhang, C Lu - … Methods in Natural Language …, 2024 - aclanthology.org
… defined causal graphs required for formal causal reasoning. To over… and unified definition
of causality involving interactions … of 14,094 causal questions across all four levels of causality: …

Causality for large language models

A Wu, K Kuang, M Zhu, Y Wang, Y Zheng… - arXiv preprint arXiv …, 2024 - arxiv.org
… This paper explores the causal reasoning capabilities of Large Language Models (LLMs) …
focusing on tasks like causal DAG generation, counterfactual reasoning, and token causality. It …

CausalBench: A comprehensive benchmark for evaluating causal reasoning capabilities of large language models

Z Wang - … of the 10th SIGHAN Workshop on Chinese Language …, 2024 - aclanthology.org
… ), it addresses the limitations of existing causal datasets and offers a … causal reasoning
abilities of language models. The findings in this paper suggest that models with stronger causal

Large language models and causal inference in collaboration: A comprehensive survey

X Liu, P Xu, J Wu, J Yuan, Y Yang, Y Zhou… - Findings of the …, 2025 - aclanthology.org
… 2 Causal Inference and Large Language Models … to both large language models (LLMs)
and causal inference, laying … natural language processing by enabling sophisticated language

CLadder: Assessing causal reasoning in language models

Z Jin, Y Chen, F Leeb, L Gresele… - Advances in …, 2023 - proceedings.neurips.cc
… perform causal reasoning is widely considered a core feature of intelligence. In this work,
we investigate whether large language models (LLMs) can coherently reason about causality. …

Causal reasoning and large language models: Opening a new frontier for causality

E Kiciman, R Ness, A Sharma, C Tan - Transactions on Machine …, 2023 - openreview.net
language models for causal DAG generation, counterfactual reasoning, and token causality
demonstrates that they bring significant new capabilities across a wide range of causal tasks…

Causal inference with large language model: A survey

J Ma - Findings of the Association for Computational …, 2025 - aclanthology.org
… 4 Evaluations of LLMs in Causal Tasks This section summarizes recent evaluation results
of LLMs in causal tasks. We mainly focus on causal discovery and causal effect estimation, …

CausalEval: Towards better causal reasoning in language models

L Yu, D Chen, S Xiong, Q Wu, D Li… - … : Human Language …, 2025 - aclanthology.org
… 3 Towards Causal Reasoning in Large Language Models We separate the roles of language
models in CR into two categories. First, LLMs can serve as causal reasoning engines, …

Cause and effect: Can large language models truly understand causality?

S Ashwani, K Hegde, NR Mannuru… - Proceedings of the …, 2024 - ojs.aaai.org
… We present the CARE-CA framework as a significant advancement in enhancing the causal
reasoning capabilities of large language models (LLMs). By integrating explicit knowledge …

Evaluating causal reasoning capabilities of large language models: A systematic analysis across three scenarios

L Wang, Y Shen - Electronics, 2024 - mdpi.com
… ’ causal capabilities, we introduce a novel evaluation method for assessing their causal …
This section presents the evaluation results of the selected LLMs on three causal reasoning …