rlhf

Here are 3 public repositories matching this topic...

infiniV / ultra-ml-intern

Ultra-instinct ML engineering intern for Claude Code. Reads papers, audits datasets, and ships SFT/DPO/LoRA runs to Hugging Face.

Updated Jun 9, 2026
Shell

HyperWorX / NoCap

Behavioural protocol package for Claude Code enforcing quality and transparency

cli skill protocol developer-tools transparency workflow-automation ai-agents claude agent-framework nocap llm prompt-engineering rlhf anthropic claude-code claude-skill anti-sycophancy behavioural-protocol

Updated Apr 30, 2026
Shell

haolpku / Awesome-LLM-Data-Preparation

Data Preparation for Large Language Models — a curated companion to our JCST 2026 survey. Covers Pre-training, Continual Pre-training, and Post-training (SFT/RLHF/RLAIF) across collection, filtering, dedup, generation, evaluation.

nlp awesome deep-learning survey awesome-list data-preparation sft pretraining data-centric-ai large-language-models llm rlhf instruction-tuning rlaif continual-pretraining

Updated Apr 28, 2026
Shell
