He Du, Bowen Li, Chengxing Xie, Chang Gao, Kai Chen, Dacheng Tao: Confidence as a Reward: Transforming LLMs into Reward Models. CoRR abs/2510.13501 (2025)