NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.
Open-source vulnerability disclosure and bug bounty program database
PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and RL
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
List of resources about programming practices for writing safety-critical software.
Safe reinforcement learning with stability guarantees
Safe Bayesian Optimization
Deploy once. Continuously improve your AI agents in production.
Decrypted Generative Model safety files for Apple Intelligence containing filters
Official JAX implementation of the T-RO paper: Songyuan Zhang*, Oswin So*, Kunal Garg, Chuchu Fan, "GCBF+: A Neural Graph Control Barrier Function Framework for Distributed Safe Multi-Agent Control".
A collaborative collection of open-source safe GPT-3 prompts that work well
Safe Exploration with MPC and Gaussian process models
Research on evaluating and aligning the values of Chinese large language models
[arXiv:2311.03191] "DeepInception: Hypnotize Large Language Model to Be Jailbreaker"
[ICML 2024] Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Official datasets and pytorch implementation repository of SQuARe and KoSBi (ACL 2023)
Safety Verification of Deep Neural Networks