This repo contains code for a technique that attempts to remove sleeper agent behavior. There's an accompanying blog post.
For dependencies and setup, there are a couple of options:
- There's a `pyproject.toml` that contains Poetry declarations, but I never use `poetry` directly, so I can't guarantee its correctness.
- I use Poetry indirectly via `poetry2nix`. If you're a Nix user, invoking `nix develop` should get you a CPU-only setup.