AlphaPav
🏠
Working from home

Highlights

  • Pro

Organizations

@AI-secure


AI enabled pair programmer for Claude, GPT, O Series, Grok, Deepseek, Gemini and 300+ models

Rust 7,409 1,448 Updated Jun 15, 2026

CyberGym is a large-scale, high-quality cybersecurity evaluation framework designed to rigorously assess the capabilities of AI agents on real-world vulnerability analysis tasks.

Python 413 59 Updated May 18, 2026

Lightweight coding agent that runs in your terminal

Rust 91,027 13,443 Updated Jun 15, 2026

🌎💪 BrowserGym, a Gym environment for web task automation

Python 1,250 177 Updated Mar 17, 2026

🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman

JavaScript 72,519 4,089 Updated Jun 12, 2026

Robust Speech Recognition via Large-Scale Weak Supervision

Python 102,727 12,535 Updated Apr 15, 2026

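The next employee you want to distill doesn't have to be a coworker. Distill anyone's way of thinking: mental models, decision heuristics, expressive DNA. Distill how anyone thinks.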

Python 24,301 3,562 Updated Jun 14, 2026

OSS-Fuzz - continuous fuzzing for open source software.

Shell 12,345 2,788 Updated Jun 14, 2026

AI agents running research on single-GPU nanochat training automatically

Python 86,744 12,565 Updated Mar 26, 2026

OpenAI Frontier Evals

Python 1,220 162 Updated Apr 21, 2026

The open source coding agent.

TypeScript 174,416 21,089 Updated Jun 15, 2026

A platform for building reliable AI agents

Python 101 6 Updated Apr 3, 2026

image scaling attacks for multi-modal prompt injection

Python 1,060 93 Updated May 19, 2026

AndroidWorld is an environment and benchmark for autonomous agents

Python 794 155 Updated Jun 12, 2026

An Illusion of Progress? Assessing the Current State of Web Agents

Python 180 12 Updated May 28, 2026

Code for "WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models"

Python 1,096 119 Updated Mar 4, 2024

[NeurIPS 2025 Spotlight] Reasoning Environments for Reinforcement Learning with Verifiable Rewards

Python 1,444 120 Updated Apr 17, 2026

Repo for the paper "Meta SecAlign: A Secure Foundation LLM Against Prompt Injection Attacks".

Python 68 18 Updated Jun 11, 2026
Python 144 26 Updated Jul 2, 2024

Official release of code for the paper "RL is a hammer and LLMs are nails: A simple RL approach to stronger prompt injection attacks"

Python 52 6 Updated May 6, 2026
Python 22 2 Updated Jun 18, 2025

Open-source implementation of AlphaEvolve

Python 6,544 1,046 Updated Mar 18, 2026

Get your documents ready for gen AI

Python 61,560 4,304 Updated Jun 14, 2026

[NeurIPS 2025] Latent Zoning Networks

Python 61 3 Updated Jun 5, 2026

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 5,160 710 Updated Jun 13, 2026

An open-source AI agent that brings the power of Gemini directly into your terminal.

TypeScript 105,277 14,067 Updated Jun 15, 2026

🔮Reasoning for Safer Code Generation; 🥇Winner Solution of Amazon Nova AI Challenge 2025

Python 39 3 Updated Aug 24, 2025

An open-source AI coding agent that lives in your terminal.

TypeScript 25,213 2,505 Updated Jun 14, 2026

👩‍⚖️ Agent-as-a-Judge: The Magic for Open-Endedness

HTML 781 105 Updated Mar 28, 2026

MCPMark is a comprehensive, stress-testing MCP benchmark designed to evaluate model and agent capabilities in real-world MCP use.

Python 428 37 Updated Jun 12, 2026