- University of Maryland
- College Park
- lichang-chen.github.io
- Lichang-Chen.github.io Public
The GitHub personal webpage for Lichang Chen.
HTML Updated Oct 26, 2025
- ADRS Public
Forked from UCB-ADRS/ADRS. AI-Driven Research For Systems (ADRS)
Jupyter Notebook Updated Oct 16, 2025
- CS234-Reinforcement-Learning Public
Forked from Rhyme0730/CS234-Reinforcement-Learning. This repo mainly contains the coding problems from the CS234 assignments.
Python Updated Feb 4, 2025
- RLHF-Reward-Modeling Public
Forked from RLHFlow/RLHF-Reward-Modeling. Recipes to train reward models for RLHF.
Python Apache License 2.0 Updated Dec 9, 2024
- Reflection_Tuning Public
Forked from tianyi-lab/Reflection_Tuning. [ACL'24] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Python Updated Sep 6, 2024
- ODIN Public
ODIN: Disentangled Reward Mitigates Hacking in RLHF (ICML 2024)
- AlpaGasus Public
A better Alpaca Model Trained with Less Data (only 9k instructions of the original set)
- InstructZero Public
Official implementation of InstructZero, the first framework to optimize bad prompts for ChatGPT (API LLMs) and obtain good prompts!
- LLaVA Public
Forked from haotian-liu/LLaVA. [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Python Apache License 2.0 Updated Mar 24, 2024
- HallusionBench Public
Forked from tianyi-lab/HallusionBench. [CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Python BSD 3-Clause "New" or "Revised" License Updated Mar 17, 2024
- LLaVA-RLHF Public
Forked from llava-rlhf/LLaVA-RLHF. Aligning LMMs with Factually Augmented RLHF
Python GNU General Public License v3.0 Updated Nov 1, 2023
- claude2-alpaca Public
First instruction-tuning dataset distilled from Claude2 (52k Alpaca prompts)!
- stanford_alpaca Public
Forked from tatsu-lab/stanford_alpaca. Code and documentation to train Stanford's Alpaca models, and generate the data.
Python Apache License 2.0 Updated Jun 7, 2023
- reward-trl Public
Forked from huggingface/trl. Train transformer language models with reinforcement learning.
- Chain-of-ThoughtsPapers Public
Forked from Timothyxxx/Chain-of-ThoughtsPapers. A trend starts from "Chain of Thought Prompting Elicits Reasoning in Large Language Models".
Updated Nov 16, 2022
- zero_shot_cot Public
Forked from kojima-takeshi188/zero_shot_cot. Prod Env
- system-design-primer Public
Forked from donnemartin/system-design-primer. Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Python Other Updated Apr 28, 2022
- minmax-opt-smooth-adversary Public
Forked from fiezt/minmax-opt-smooth-adversary
Jupyter Notebook Updated Jun 2, 2021
- zjuthesis Public template
Forked from TheNetAdmin/zjuthesis. Zhejiang University Graduation Thesis/Design LaTeX template.
TeX MIT License Updated Nov 4, 2019
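- DeepLearning-500-questions Public
Forked from CrownX/DeepLearning-500-questions. Deep Learning 500 Questions: uses a question-and-answer format to explain commonly used topics in probability, linear algebra, machine learning, deep learning, computer vision, and other popular areas, to help the author and any readers who need it. The book has 18 chapters and nearly 300,000 characters. Given the author's limited expertise, readers are kindly asked to point out and correct any shortcomings. To be continued... For collaboration inquiries, contact scutjy2015@163.com. All rights reserved; infringement will be pursued. Tan 2018.06
TeX GNU General Public License v3.0 Updated Mar 7, 2019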