Sum-Product-Attention Networks: Leveraging Self-Attention in Probabilistic Circuits
arXiv preprint arXiv:2109.06587, 2021
Probabilistic circuits (PCs) have become the de-facto standard for learning and inference in probabilistic modeling. We introduce Sum-Product-Attention Networks (SPAN), a new generative model that integrates probabilistic circuits with Transformers. SPAN uses self-attention to select the most relevant parts of a probabilistic circuit, here sum-product networks, to improve the modeling capability of the underlying sum-product network. We show that, while modeling, SPAN focuses on a specific set of independence assumptions in every product layer of the sum-product network. Our empirical evaluations show that SPAN outperforms state-of-the-art probabilistic generative models on various benchmark data sets and is an efficient generative image model.
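The core mechanism the abstract describes, using self-attention to select which parts of a sum-product network (i.e., which independence assumptions) are most relevant for a given input, can be illustrated with a minimal sketch. The code below is an assumption-laden toy, not the authors' implementation: `AttentiveSPN`, the fully factorized components, and all layer sizes are hypothetical, and a Transformer encoder simply produces data-dependent mixture weights for a single sum node.

```python
# Minimal illustrative sketch (NOT the SPAN authors' implementation):
# a tiny sum-product network over D variables whose sum-node mixture
# weights are produced by a Transformer encoder acting on the input.
import torch
import torch.nn as nn

class AttentiveSPN(nn.Module):
    def __init__(self, num_vars: int, num_components: int, d_model: int = 32):
        super().__init__()
        # Gaussian leaves: one (mean, log-std) per component and variable.
        self.means = nn.Parameter(torch.randn(num_components, num_vars))
        self.log_stds = nn.Parameter(torch.zeros(num_components, num_vars))
        # Self-attention module that scores which mixture component
        # (i.e., which set of independence assumptions) is most relevant.
        self.embed = nn.Linear(1, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=1)
        self.to_weights = nn.Linear(d_model, num_components)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_vars); returns per-example log-likelihood.
        # Leaf log-densities per component: (batch, K, D).
        dist = torch.distributions.Normal(self.means, self.log_stds.exp())
        leaf_ll = dist.log_prob(x.unsqueeze(1))
        # Product layer: each component is fully factorized here, so the
        # product is a sum of leaf log-densities over the variables.
        comp_ll = leaf_ll.sum(dim=-1)  # (batch, K)
        # Self-attention over the variables yields data-dependent
        # sum-node (mixture) weights instead of static ones.
        tokens = self.embed(x.unsqueeze(-1))       # (batch, D, d_model)
        pooled = self.encoder(tokens).mean(dim=1)  # (batch, d_model)
        log_w = torch.log_softmax(self.to_weights(pooled), dim=-1)
        # Sum node: log of the attention-weighted mixture.
        return torch.logsumexp(log_w + comp_ll, dim=-1)

model = AttentiveSPN(num_vars=8, num_components=4)
x = torch.randn(16, 8)
nll = -model(x).mean()  # train by minimizing negative log-likelihood
```

Each mixture component stands in for one product-layer factorization; letting attention choose among them is one plausible reading of how SPAN "selects the most relevant parts" of the circuit, whereas the actual model applies this inside a full SPN structure.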