Grey Newell greynewell

🚀

Building evaluation infrastructure for AI systems. Creator of the MIST stack. MS CS (ML) at Georgia Tech. Ex-AWS.

90 followers · 41 following

Achievements

x4 x3

cli Public
Forked from supermodeltools/cli

Go MIT License Updated Apr 2, 2026
awesome-hermes-agent Public
Forked from 0xNyk/awesome-hermes-agent

A curated list of awesome skills, tools, integrations, and resources for Hermes Agent by Nous Research

Other Updated Mar 23, 2026
awesome-mcp-serverss Public
Forked from fastmcp-me/awesome-mcp-serverss

Awesome MCP Servers - A curated list of Model Context Protocol servers

Updated Mar 23, 2026
bigiron Public
Forked from supermodeltools/bigiron

Big Iron — AI-Native SDLC. Hermes Agent + Supermodel code graph, graph-gated at every phase.

Shell Updated Mar 19, 2026
evals.biz Public

AI evaluation strategy reference library for technical leaders

Nunjucks Updated Mar 16, 2026
mist-go Public

Shared core for the MIST stack. Zero external deps.

testing golang middleware distributed-systems microservices metrics protocol

Go 3 MIT License 5 issues need help Updated Mar 16, 2026
greynewell Public

My personal README!

nodejs blog aws typescript aws-lambda reactjs developer

HTML 1 Updated Mar 15, 2026
swe-bench-fast Public

One-command SWE-bench eval harness in Go. Native ARM64 containers with 6.3x test runner speedup on Apple Silicon and AWS Graviton. Pre-built images on Docker Hub.

docker golang benchmark software-engineering arm64 aarch64 apple-silicon

Go 1 Updated Mar 6, 2026
docs Public
Forked from railwayapp/docs

Railway documentation

TypeScript MIT License Updated Mar 3, 2026
claude-software-factory Public template

Open an issue. Get a pull request. 6 GitHub Actions workflows that turn any repo into a self-running software factory powered by Claude Code.

template devops automation ai software-factory claude github-actions

3 MIT License Updated Mar 3, 2026
llm-router-env Public

Gymnasium RL environment for LLM inference routing optimization — cut costs 15-25% vs static strategies

Python MIT License Updated Mar 2, 2026
swe-bench-pro-action Public

GitHub Action for SWE-bench Pro evaluation powered by mcpbr

python benchmarking mcp ai-agents github-actions ml-evaluation llm-evaluation

Shell MIT License Updated Feb 26, 2026
evaldriven.org Public

Ship evals before you ship features.

testing devops benchmarking machine-learning automation best-practices evaluation

Nunjucks 21 5 Creative Commons Zero v1.0 Universal Updated Feb 25, 2026
schemaflux Public

Structured data compiler. Pass pipeline, pluggable backends.

markdown golang static-site-generator yaml sitemap schema compiler

Go 12 1 Updated Feb 17, 2026
mcpbr Public
Forked from supermodeltools/mcpbr

Model Context Protocol Benchmark Runner

Python MIT License Updated Feb 17, 2026
matchspec Public

Eval framework. Define correct, test against it, get results.

golang benchmark machine-learning evaluation quality-assurance automated-testing benchmark-suite

Go 22 8 MIT License 1 issue needs help Updated Feb 17, 2026
infermux Public

Route inference across LLM providers. Track cost per request.

golang api-gateway inference openai multi-model observability load-balancing

Go 90 7 MIT License Updated Feb 17, 2026
tokentrace Public

Where did your tokens go? Spans, latency percentiles, alerts.

golang performance real-time monitoring metrics latency alerting

Go 5 MIT License Updated Feb 17, 2026
SWE-bench Public
Forked from SWE-bench/SWE-bench

SWE-bench: Can Language Models Resolve Real-world Github Issues?

Python MIT License Updated Feb 17, 2026
agentic-template Public archive

Starter template for AI-first development. Scaffolds AGENTS.md, CLAUDE.md, CHANGELOG, and README so coding agents like Claude Code, Cursor, and Copilot have the right context from day one.

mcp starter-template developer-tools cursor copilot ai-agents windsurf

HTML 1 Updated Feb 15, 2026
mcp-serialization-repro Public archive

Do MCP tools serialize in Claude Code? Empirical study: readOnlyHint controls parallelism, IPC overhead is ~5ms/call. Reproduces #14353.

mcp llm-agents model-context-protocol claude-code swe-bench tool-parallelism readonlyhint

Python 4 Updated Feb 15, 2026
arch-docs Public
Forked from supermodeltools/arch-docs

GitHub Action to generate architecture documentation for any repository using Supermodel

JavaScript Updated Feb 14, 2026
supermodeltools.github.io Public
Forked from GraphTechnologyDevelopers/graphtechnologydevelopers.github.io

GitHub Pages for supermodeltools

Go Updated Feb 14, 2026
mcp Public
Forked from supermodeltools/mcp

Supermodel Model Context Protocol server. Generate code graphs in Cursor, Codex or Claude Code!

TypeScript Updated Feb 13, 2026
openapi-spec Public
Forked from supermodeltools/openapi-spec

Spec for Supermodel public API in OpenAPI YAML. Use as a reference or generate your own clients.

Updated Feb 13, 2026
typescript-sdk Public
Forked from supermodeltools/sdk

Generate useful graphs of your codebase with our TypeScript SDK!

TypeScript Updated Feb 13, 2026
dead-code-hunter Public
Forked from supermodeltools/audit

GitHub Action to find unreachable functions using Supermodel call graphs

TypeScript MIT License Updated Feb 11, 2026
