Stars
Deep-learning-based content moderation for text, audio, video, and image inputs.
Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
[NeurIPS23 (Spotlight)] "Model Sparsity Can Simplify Machine Unlearning" by Jinghan Jia*, Jiancheng Liu*, Parikshit Ram, Yuguang Yao, Gaowen Liu, Yang Liu, Pranay Sharma, Sijia Liu
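"A Guide to Consuming Open-Source Large Models" (《开源大模型食用指南》): a tutorial tailored for Chinese beginners on quickly fine-tuning (full-parameter/LoRA) and deploying Chinese and international open-source large language models (LLMs) and multimodal large models (MLLMs) in a Linux environment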
DeepSeek LLM: Let there be answers
Universal and Transferable Attacks on Aligned Language Models
Awesome Incremental Learning
Camouflage poisoning via machine unlearning
Existing Literature about Machine Unlearning
Awesome Machine Unlearning (A Survey of Machine Unlearning)
Provable adversarial robustness at ImageNet scale
[CVPR 2023] The official implementation of our CVPR 2023 paper "Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency".
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
Prefix-Tuning: Optimizing Continuous Prompts for Generation
A plug-and-play library for parameter-efficient-tuning (Delta Tuning)
Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.
An open-source tool-augmented conversational language model from Fudan University
Awesome coreset/core-set/subset/sample selection works.
Official implementation of "Towards Robust Model Watermark via Reducing Parametric Vulnerability"
[NeurIPS'22] Official Repository for Characterizing Datapoints via Second-Split Forgetting
A Python 3 package for learning Bayesian Networks (DAGs) from data. Official implementation of the paper "DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization"
[ICML 2023] Learning for Edge-Weighted Online Bipartite Matching with Robustness Guarantees