Stars
PyTorch code for hierarchical k-means -- a data curation method for self-supervised learning
Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
AutoThink is a reinforcement learning framework designed to equip R1-style language models with adaptive reasoning capabilities. Instead of always thinking or never thinking, the model learns when …
Train your Agent model via our easy and efficient framework
[AAMAS'25] Code for "Online Preference-based Reinforcement Learning with Self-augmented Feedback from Large Language Model"
[AAAI'25] Code for "In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning"
Improving math reasoning through Direct Preference Optimization with Verifiable Pairs
This repository implements AutoThink from our paper "Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL"
DeepLiterature: A fully open-source intelligent research assistant that integrates search, code execution, link resolution, and information expansion, with multiple tools working together to facili…
An open-source solution for full-parameter fine-tuning of DeepSeek-V3/R1 671B, including complete code and scripts from training to inference, as well as practical experience and conclusions.…
Genome modeling and design across all domains of life