{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T17:15:43Z","timestamp":1777655743968,"version":"3.51.4"},"reference-count":76,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2024,8,11]],"date-time":"2024-08-11T00:00:00Z","timestamp":1723334400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["62206248"],"award-info":[{"award-number":["62206248"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Entropy"],"abstract":"<jats:p>Particle-based Variational Inference (ParVI) methods have been widely adopted in deep Bayesian inference tasks such as Bayesian neural networks or Gaussian Processes, owing to their efficiency in generating high-quality samples given the score of the target distribution. Typically, ParVI methods evolve a weighted-particle system by approximating the first-order Wasserstein gradient flow to reduce the dissimilarity between the particle system\u2019s empirical distribution and the target distribution. Recent advancements in ParVI have explored sophisticated gradient flows to obtain refined particle systems with either accelerated position updates or dynamic weight adjustments. In this paper, we introduce the semi-Hamiltonian gradient flow on a novel Information\u2013Fisher\u2013Rao space, known as the SHIFR flow, and propose the first ParVI framework that possesses both accelerated position update and dynamical weight adjustment simultaneously, named the General Accelerated Dynamic-Weight Particle-based Variational Inference (GAD-PVI) framework. GAD-PVI is compatible with different dissimilarities between the empirical distribution and the target distribution, as well as different approximation approaches to gradient flow. Moreover, when the appropriate dissimilarity is selected, GAD-PVI is also suitable for obtaining high-quality samples even when analytical scores cannot be obtained. 
Experiments conducted under both the score-based tasks and sample-based tasks demonstrate the faster convergence and reduced approximation error of GAD-PVI methods over the state-of-the-art.<\/jats:p>","DOI":"10.3390\/e26080679","type":"journal-article","created":{"date-parts":[[2024,8,12]],"date-time":"2024-08-12T11:23:46Z","timestamp":1723461826000},"page":"679","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":1,"title":["GAD-PVI: A General Accelerated Dynamic-Weight Particle-Based Variational Inference Framework"],"prefix":"10.3390","volume":"26","author":[{"given":"Fangyikang","family":"Wang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China"}]},{"given":"Huminhao","family":"Zhu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China"}]},{"given":"Chao","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China"}]},{"given":"Hanbin","family":"Zhao","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China"}]},{"given":"Hui","family":"Qian","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China"}]}],"member":"1968","published-online":{"date-parts":[[2024,8,11]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Akbayrak, S., Bocharov, I., and de Vries, B. (2021). Extended variational message passing for automated approximate Bayesian inference. Entropy, 23.","DOI":"10.3390\/e23070815"},{"key":"ref_2","unstructured":"Sharif-Razavian, N., and Zollmann, A. (2008). An overview of nonparametric bayesian models and applications to natural language processing. Science, 71\u201393. Available online: https:\/\/www.cs.cmu.edu\/~zollmann\/publications\/nonparametric.pdf."},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Siddhant, A., and Lipton, Z.C. (2018). Deep bayesian active learning for natural language processing: Results of a large-scale empirical study. arXiv.","DOI":"10.18653\/v1\/D18-1318"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"5330","DOI":"10.1109\/TNNLS.2018.2797539","article-title":"Nonparametric Bayesian Correlated Group Regression With Applications to Image Classification","volume":"29","author":"Luo","year":"2018","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"2310","DOI":"10.1109\/TNNLS.2018.2882456","article-title":"Reconstructing Perceived Images From Human Brain Activities With Bayesian Deep Multiview Learning","volume":"30","author":"Du","year":"2019","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Frank, P., Leike, R., and En\u00dflin, T.A. (2021). Geometric variational inference. Entropy, 23.","DOI":"10.3390\/e23070853"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"3989","DOI":"10.3390\/e17063989","article-title":"Entropy, information theory, information geometry and Bayesian inference in data, signal and image processing and inverse problems","volume":"17","year":"2015","journal-title":"Entropy"},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Jewson, J., Smith, J.Q., and Holmes, C. (2018). 
Principles of Bayesian inference using general divergence criteria. Entropy, 20.","DOI":"10.3390\/e20060442"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"2176","DOI":"10.1109\/TNNLS.2014.2362012","article-title":"Variational Bayesian Inference Algorithms for Infinite Relational Model of Network Data","volume":"26","author":"Konishi","year":"2015","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_10","first-page":"730","article-title":"Variational Inference Over Graph: Knowledge Representation for Deep Process Data Analytics","volume":"36","author":"Chen","year":"2023","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_11","unstructured":"Wang, H., Fan, J., Chen, Z., Li, H., Liu, W., Liu, T., Dai, Q., Wang, Y., Dong, Z., and Tang, R. (2023, January 10\u201316). Optimal Transport for Treatment Effect Estimation. Proceedings of the Thirty-Seventh Conference on Neural Information Processing Systems, New Orleans, LO, USA."},{"key":"ref_12","first-page":"473","article-title":"Practical markov chain monte carlo","volume":"7","author":"Geyer","year":"1992","journal-title":"Stat. Sci."},{"key":"ref_13","first-page":"3","article-title":"Markov chain monte carlo and gibbs sampling","volume":"581","author":"Carlo","year":"2004","journal-title":"Lect. Notes EEB"},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Neal, R.M. (2012). MCMC using Hamiltonian dynamics. arXiv.","DOI":"10.1201\/b10905-6"},{"key":"ref_15","unstructured":"Chen, T., Fox, E., and Guestrin, C. (2014, January 21\u201326). Stochastic gradient hamiltonian monte carlo. Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Betancourt, M. (2017). A conceptual introduction to Hamiltonian Monte Carlo. arXiv.","DOI":"10.3150\/16-BEJ810"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Doucet, A., De Freitas, N., and Gordon, N.J. (2001). Sequential Monte Carlo Methods in Practice, Springer.","DOI":"10.1007\/978-1-4757-3437-9"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"411","DOI":"10.1111\/j.1467-9868.2006.00553.x","article-title":"Sequential monte carlo samplers","volume":"68","author":"Doucet","year":"2006","journal-title":"J. R. Stat. Soc. Ser. B Stat. Methodol."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"312","DOI":"10.1109\/JSTSP.2015.2497211","article-title":"Langevin and Hamiltonian based sequential MCMC for efficient Bayesian filtering in high-dimensional spaces","volume":"10","author":"Septier","year":"2015","journal-title":"IEEE J. Sel. Top. Signal Process."},{"key":"ref_20","unstructured":"Liu, Q., and Wang, D. (2016). Stein variational gradient descent: A general purpose bayesian inference algorithm. arXiv."},{"key":"ref_21","unstructured":"Zhu, M., Liu, C., and Zhu, J. (2020, January 13\u201318). Variance Reduction and Quasi-Newton for Particle-Based Variational Inference. Proceedings of the ICML, Virtual."},{"key":"ref_22","first-page":"17507","article-title":"De-randomizing MCMC dynamics with the diffusion Stein operator","volume":"Volume 34","author":"Ranzato","year":"2021","journal-title":"Proceedings of the Advances in Neural Information Processing Systems"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Zhang, C., Li, Z., Du, X., and Qian, H. (2022, January 23\u201329). DPVI: A Dynamic-Weight Particle-Based Variational Inference Framework. 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Vienna, Austria.","DOI":"10.24963\/ijcai.2022\/679"},{"key":"ref_24","unstructured":"Li, L., Liu, Q., Korba, A., Yurochkin, M., and Solomon, J. (2023, January 7\u201311). Sampling with Mollified Interaction Energy Descent. Proceedings of the The Eleventh International Conference on Learning Representations, Vienna, Austria."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Galy-Fajou, T., Perrone, V., and Opper, M. (2021). Flexible and Efficient Inference with Particles for the Variational Gaussian Approximation. Entropy, 23.","DOI":"10.3390\/e23080990"},{"key":"ref_26","unstructured":"Chen, C., Zhang, R., Wang, W., Li, B., and Chen, L. (2018). A unified particle-optimization framework for scalable Bayesian sampling. arXiv."},{"key":"ref_27","unstructured":"Liu, C., Zhuo, J., Cheng, P., Zhang, R., and Zhu, J. (2019, January 9\u201315). Understanding and accelerating particle-based variational inference. Proceedings of the ICML, Long Beach, CA, USA."},{"key":"ref_28","unstructured":"Korba, A., Aubin-Frankowski, P.C., Majewski, S., and Ablin, P. (2021, January 18\u201324). Kernel Stein Discrepancy Descent. Proceedings of the 38th International Conference on Machine Learning, Virtual."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1681","DOI":"10.1090\/mcom3033","article-title":"A blob method for the aggregation equation","volume":"85","author":"Craig","year":"2016","journal-title":"Math. Comput."},{"key":"ref_30","unstructured":"Arbel, M., Korba, A., Salim, A., and Gretton, A. (2019). Maximum mean discrepancy gradient flow. arXiv."},{"key":"ref_31","unstructured":"Zhu, H., Wang, F., Zhang, C., Zhao, H., and Qian, H. (2024). Neural Sinkhorn Gradient Flow. arXiv."},{"key":"ref_32","unstructured":"Taghvaei, A., and Mehta, P. (2019, January 9\u201315). Accelerated flow for probability distributions. Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"11","DOI":"10.1007\/s10915-021-01709-3","article-title":"Accelerated Information Gradient Flow","volume":"90","author":"Wang","year":"2022","journal-title":"J. Sci. Comput."},{"key":"ref_34","unstructured":"Liu, Y., Shang, F., Cheng, J., Cheng, H., and Jiao, L. (2017). Accelerated first-order methods for geodesically convex optimization on Riemannian manifolds. Adv. Neural Inf. Process. Syst., 30, Available online: https:\/\/proceedings.neurips.cc\/paper\/2017\/hash\/6ef80bb237adf4b6f77d0700e1255907-Abstract.html."},{"key":"ref_35","unstructured":"Zhang, H., and Sra, S. (2018, January 6\u20139). An estimate sequence for geodesically convex optimization. Proceedings of the Conference on Learning Theory, Stockholm, Sweden."},{"key":"ref_36","doi-asserted-by":"crossref","first-page":"E7351","DOI":"10.1073\/pnas.1614734113","article-title":"A variational perspective on accelerated methods in optimization","volume":"113","author":"Wibisono","year":"2016","journal-title":"Proc. Natl. Acad. Sci. USA"},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1090\/S0002-9947-1988-0924776-9","article-title":"The Density Manifold and Configuration Space Quantization","volume":"305","author":"Lafferty","year":"1988","journal-title":"Trans. Am. Math. Soc."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Nesterov, Y. (2018). 
Lectures on Convex Optimization, Springer.","DOI":"10.1007\/978-3-319-91578-4"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"329","DOI":"10.1007\/s00220-018-3276-8","article-title":"Convergence to equilibrium in Wasserstein distance for damped Euler equations with interaction forces","volume":"365","author":"Carrillo","year":"2019","journal-title":"Commun. Math. Phys."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"1100","DOI":"10.1137\/16M106666X","article-title":"A JKO Splitting Scheme for Kantorovich\u2013Fisher\u2013Rao Gradient Flows","volume":"49","author":"Monsaingeon","year":"2017","journal-title":"SIAM J. Math. Anal."},{"key":"ref_41","unstructured":"Rotskoff, G., Jelassi, S., Bruna, J., and Vanden-Eijnden, E. (2019). Global convergence of neuron birth-death dynamics. arXiv."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1007\/s10208-016-9331-y","article-title":"An interpolating distance between optimal transport and Fisher\u2013Rao metrics","volume":"18","author":"Chizat","year":"2018","journal-title":"Found. Comput. Math."},{"key":"ref_43","first-page":"1117","article-title":"A new optimal transport distance on the space of finite Radon measures","volume":"21","author":"Kondratyev","year":"2016","journal-title":"Adv. Differ. Equ."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"859","DOI":"10.1080\/01621459.2017.1285773","article-title":"Variational inference: A review for statisticians","volume":"112","author":"Blei","year":"2017","journal-title":"J. Am. Stat. Assoc."},{"key":"ref_45","unstructured":"Liu, Q. (2017). Stein variational gradient descent as gradient flow. arXiv."},{"key":"ref_46","unstructured":"Wang, D., and Liu, Q. (2016). Learning to draw samples: With application to amortized mle for generative adversarial learning. arXiv."},{"key":"ref_47","unstructured":"Pu, Y., Gan, Z., Henao, R., Li, C., Han, S., and Carin, L. (2017, January 4\u20139). VAE Learning via Stein Variational Gradient Descent. Proceedings of the NIPS, Long Beach, CA, USA."},{"key":"ref_48","unstructured":"Liu, Y., Ramachandran, P., Liu, Q., and Peng, J. (2017). Stein variational policy gradient. arXiv."},{"key":"ref_49","unstructured":"Haarnoja, T., Tang, H., Abbeel, P., and Levine, S. (2017, January 6\u201311). Reinforcement learning with deep energy-based policies. Proceedings of the International Conference on Machine Learning, Sydney, NSW, Australia."},{"key":"ref_50","doi-asserted-by":"crossref","first-page":"11216","DOI":"10.1109\/TKDE.2022.3233789","article-title":"Contrastive Proxy Kernel Stein Path Alignment for Cross-Domain Cold-Start Recommendation","volume":"35","author":"Liu","year":"2023","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_51","unstructured":"Lu, Y., Lu, J., and Nolen, J. (2019). Accelerating langevin sampling with birth-death. arXiv."},{"key":"ref_52","first-page":"986","article-title":"Sinkhorn barycenter via functional gradient descent","volume":"33","author":"Shen","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_53","first-page":"139","article-title":"Generative adversarial nets","volume":"27","author":"Goodfellow","year":"2014","journal-title":"NIPS"},{"key":"ref_54","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-encoding variational bayes. arXiv."},{"key":"ref_55","unstructured":"Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., and Poole, B. (2020). Score-based generative modeling through stochastic differential equations. 
arXiv."},{"key":"ref_56","first-page":"6840","article-title":"Denoising diffusion probabilistic models","volume":"33","author":"Ho","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_57","unstructured":"Choi, J., Choi, J., and Kang, M. (2024). Scalable Wasserstein Gradient Flow for Generative Modeling through Unbalanced Optimal Transport. arXiv."},{"key":"ref_58","unstructured":"Ranganath, R., Gerrish, S., and Blei, D. (2014, January 22\u201325). Black box variational inference. Proceedings of the Artificial Intelligence and Statistics, Reykjavik, Iceland."},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Ambrosio, L., Gigli, N., and Savar\u00e9, G. (2008). Gradient Flows: In Metric Spaces and in the Space of Probability Measures, Springer Science & Business Media.","DOI":"10.1016\/S1874-5717(07)80004-1"},{"key":"ref_60","unstructured":"Peyr\u00e9, G., and Cuturi, M. (2017). Computational optimal transport. Cent. Res. Econ. Stat. Work. Pap., Available online: https:\/\/ideas.repec.org\/p\/crs\/wpaper\/2017-86.html."},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Platen, E., and Bruti-Liberati, N. (2010). Numerical Solution of Stochastic Differential Equations with Jumps in Finance, Springer Science & Business Media.","DOI":"10.1007\/978-3-642-13694-8"},{"key":"ref_62","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1090\/S0025-5718-1964-0159424-9","article-title":"Implicit runge-kutta processes","volume":"18","author":"Butcher","year":"1964","journal-title":"Math. Comput."},{"key":"ref_63","doi-asserted-by":"crossref","unstructured":"S\u00fcli, E., and Mayers, D.F. (2003). An Introduction to Numerical Analysis, Cambridge University Press.","DOI":"10.1017\/CBO9780511801181"},{"key":"ref_64","first-page":"4672","article-title":"A non-asymptotic analysis for Stein variational gradient descent","volume":"33","author":"Korba","year":"2020","journal-title":"NeurIPS"},{"key":"ref_65","unstructured":"Rasmussen, C.E. (2003, January 4\u201316). Gaussian processes in machine learning. Proceedings of the Summer School on Machine Learning, Tubingen, Germany."},{"key":"ref_66","unstructured":"Chen, W.Y., Mackey, L., Gorham, J., Briol, F.X., and Oates, C. (2018, January 10\u201315). Stein points. Proceedings of the ICML, Stockholm, Sweden."},{"key":"ref_67","doi-asserted-by":"crossref","unstructured":"Brooks, S., Gelman, A., Jones, G., and Meng, X.L. (2011). Handbook of Markov Chain Monte Carlo, CRC Press.","DOI":"10.1201\/b10905"},{"key":"ref_68","unstructured":"Cuturi, M., and Doucet, A. (2014, January 22\u201324). Fast computation of Wasserstein barycenters. Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_69","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/2766963","article-title":"Convolutional wasserstein distances: Efficient optimal transportation on geometric domains","volume":"34","author":"Solomon","year":"2015","journal-title":"ACM Trans. Graph. ToG"},{"key":"ref_70","unstructured":"Mroueh, Y., Sercu, T., and Raj, A. (2019, January 16\u201318). Sobolev descent. Proceedings of the Artificial Intelligence and Statistics, Okinawa, Japan."},{"key":"ref_71","doi-asserted-by":"crossref","first-page":"87","DOI":"10.1007\/s13373-017-0101-1","article-title":"{Euclidean, metric, and Wasserstein} gradient flows: An overview","volume":"7","author":"Santambrogio","year":"2017","journal-title":"Bull. Math. 
Sci."},{"key":"ref_72","unstructured":"Von Mises, R., Geiringer, H., and Ludford, G.S.S. (2004). Mathematical Theory of Compressible Fluid Flow, Courier Corporation."},{"key":"ref_73","first-page":"17034","article-title":"Unbalanced Sobolev Descent","volume":"33","author":"Mroueh","year":"2020","journal-title":"NeurIPS"},{"key":"ref_74","doi-asserted-by":"crossref","first-page":"412","DOI":"10.1137\/19M1251655","article-title":"Interacting Langevin diffusions: Gradient structure and ensemble Kalman sampler","volume":"19","author":"Hoffmann","year":"2020","journal-title":"SIAM J. Appl. Dyn. Syst."},{"key":"ref_75","unstructured":"N\u00fcsken, N., and Renger, D. (2021). Stein Variational Gradient Descent: Many-particle and long-time asymptotics. arXiv."},{"key":"ref_76","unstructured":"Zhang, J., Zhang, R., Carin, L., and Chen, C. (2020, January 26\u201328). Stochastic particle-optimization sampling and the non-asymptotic convergence theory. Proceedings of the Artificial Intelligence and Statistics, Online."}],"container-title":["Entropy"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1099-4300\/26\/8\/679\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T15:34:57Z","timestamp":1760110497000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1099-4300\/26\/8\/679"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,8,11]]},"references-count":76,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2024,8]]}},"alternative-id":["e26080679"],"URL":"https:\/\/doi.org\/10.3390\/e26080679","relation":{},"ISSN":["1099-4300"],"issn-type":[{"value":"1099-4300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,8,11]]}}}