A fork of huggingface/skills — a collection of AI skills, agents, and evaluations.
This repository contains:
- Skills: Modular AI capabilities that can be composed into agents
- Agents: Autonomous AI systems built from skills
- Evals: Benchmarks and leaderboards for measuring skill performance
- Marketplace: Discoverable plugins for Claude and Cursor
```
.
├── .claude-plugin/                        # Claude AI plugin configuration
│   ├── plugin.json                        # Plugin metadata and entry points
│   └── marketplace.json                   # Marketplace listing
├── .cursor-plugin/                        # Cursor IDE plugin configuration
│   ├── plugin.json                        # Plugin metadata
│   └── marketplace.json                   # Marketplace listing
├── .github/
│   └── workflows/
│       ├── generate-agents.yml            # CI: auto-generate agent configs
│       ├── push-evals-leaderboard.yml     # CI: update evals leaderboard
│       └── push-hackers-leaderboard.yml   # CI: update hackers leaderboard
└── skills/                                # Core skill implementations
```
- Python 3.10+
- `pip` or `uv` for package management
To install from source:

```bash
git clone https://github.com/your-org/skills.git
cd skills
pip install -e .
```

Run evals with:

```bash
python -m skills.evals run --skill <skill-name>
```

Install via the Claude marketplace or load `.claude-plugin/plugin.json` directly in your Claude environment.

Install via the Cursor marketplace or load `.cursor-plugin/plugin.json` directly in your Cursor IDE.
- Fork the repository
- Create a feature branch (`git checkout -b feat/my-skill`)
- Add your skill under `skills/`
- Add corresponding evals under `evals/`
- Open a pull request
See CONTRIBUTING.md for detailed guidelines.
See .github/workflows/SECURITY.md for our security policy.
Apache 2.0 — see LICENSE for details.
Personal fork notes: I'm using this repo to experiment with building custom skills for my own workflows. Main areas of interest: text summarization and code review skills. Not intended for production use.
TODO:
- Build a summarization skill that handles long documents (>10k tokens) by chunking
- Experiment with a code review skill focused on Python style/type hints
- Compare eval results against upstream once I have a baseline
- Look into adding a `--dry-run` flag to the evals runner so I can test without writing results (see the argparse sketch after this list)
- Try running evals against `gpt-4o-mini` as a cheaper baseline before committing to full runs
- Chunking strategy: overlap adjacent chunks so context isn't lost at boundaries (see the chunking sketch below). Tested this and it works well; settled on 15% overlap since ~10% occasionally dropped a sentence at a boundary
- Set up local dev environment with `uv` instead of `pip`; noticeably faster for resolving deps
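
A minimal sketch of the chunking-with-overlap idea from the TODO above. The function names and the word-count heuristic are my own scaffolding, not part of the upstream skills API; a real skill would count tokens with the model's tokenizer instead of splitting on whitespace.

```python
def chunk_text(text: str, chunk_size: int = 2000, overlap: float = 0.15) -> list[str]:
    """Split text into word-based chunks with fractional overlap between neighbours.

    chunk_size is measured in whitespace-separated words as a rough stand-in for
    tokens; overlap=0.15 means each chunk repeats the trailing ~15% of the previous
    one so sentences at chunk boundaries are not lost.
    """
    words = text.split()
    if len(words) <= chunk_size:
        return [text]

    step = max(1, int(chunk_size * (1 - overlap)))  # advance by ~85% of a chunk
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks


def summarize_long_document(text: str, summarize) -> str:
    """Map-reduce style summarization: summarize each chunk, then the joined summaries.

    `summarize` is a placeholder callable (e.g. a model call) supplied by the caller.
    """
    partial = [summarize(chunk) for chunk in chunk_text(text)]
    return summarize("\n\n".join(partial)) if len(partial) > 1 else partial[0]
```

With the defaults above, a ~10k-word document splits into about six overlapping chunks, which lines up with the ">10k tokens" threshold in the first TODO item.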
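
And a rough sketch of the `--dry-run` idea, assuming the evals runner exposes an argparse-based CLI. The function names, result shape, and output path here are hypothetical stand-ins, not the actual `skills.evals` internals.

```python
import argparse
import json


def run_eval(skill: str) -> dict:
    """Placeholder for the real eval run; returns a fake result dict for illustration."""
    return {"skill": skill, "score": 0.0}


def main() -> None:
    parser = argparse.ArgumentParser(prog="skills.evals")
    parser.add_argument("--skill", required=True, help="name of the skill to evaluate")
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="run the eval but do not write results to disk",
    )
    args = parser.parse_args()

    results = run_eval(args.skill)
    if args.dry_run:
        print(json.dumps(results, indent=2))  # inspect locally, persist nothing
    else:
        with open(f"results-{args.skill}.json", "w") as f:  # hypothetical output path
            json.dump(results, f, indent=2)


if __name__ == "__main__":
    main()
```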