# Advanced Code Generation Verifiers - Data Collection Project
## Quick Start - Create a New Problem
The fastest way to get started is using our CLI tool. First, set up your environment:
```bash
uv sync
uv run python setup_precommit.py
```
Then create a new problem with the interactive CLI:
```bash
uv run python -m scripts.create_problem
```
The CLI will:
- Prompt you to select a programming language
- Ask for your problem name
- Get your git username automatically
- Create all required files with proper templates
- Generate the correct directory structure with today's date
- Set up branch naming as `<language>/<name>` (e.g., `python/graph-algorithms`)
**That's it!** Your problem structure is ready - just fill in the templates.
## Branch Naming
Branch names must follow the format: `<language>/<name>`
**Examples:**
- `python/graph-algorithms` - Topic-based grouping
- `javascript/add-sorting-problems` - Feature description
- `cpp/batch-1` - Batch identifier
- `java/advanced-data-structures` - Subject area
This allows you to group multiple related problems in a single PR.
## Project Background
This project collects challenging human-reviewed prompts and test implementations for
advanced coding subjects. The goal is to create graduate-level problems that challenge advanced
AI agents across 11 programming languages.
For detailed methodology, problem creation guidelines, and difficulty assessment criteria, see
**[INSTRUCTIONS.md](INSTRUCTIONS.md)**.
## Required Directory Structure
Each submission must follow this exact format:
```
languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
├── metadata.json # Problem metadata and configuration
├── problem.txt # Main problem statement
├── background.txt # Domain-specific context (if needed)
├── src/
│ └── main.py # Reference solution (entrypoint)
├── test.py # Comprehensive test suite
└── README.md # Problem-specific documentation
```
## Required Files Overview
| File | Purpose | Requirements |
|------|---------|-------------|
| `metadata.json` | Problem metadata | Difficulty, labels, test coverage, dependencies |
| `problem.txt` | Problem statement | Clear description, examples, constraints |
| `background.txt` | Domain context | Definitions needed (can be empty) |
| `src/main.*` | Reference solution | Fully executable, documented, complete |
| `test.*` | Test suite | 10-20+ test cases, uses standard frameworks |
| `README.md` | Instructions | How to run, test commands, dependencies |
### metadata.json Example
```json
"name": "problem-name",
"difficulty": "Medium|Hard|Expert",
"subtask": "Code Genera on|Code Edi ng",
"subject_labels": ["Array", "Hash Table", "Algorithm"],
"test_coverage": {
"line_coverage": "X%",
"branch_coverage": "Y%"
},
"test_filename": "test.py", // (op onal), will default to test.[ext]
"entrypoint": "main.py", // (op onal), will default to main.[ext]
"setup_command": "pip install --upgrade pip", // (op onal) this can be any command that
needs to happen before execu on of the main code happens, note: we will install dependencies
"run_command": "python src/main.py", // (op onal), this is the priority of tes ng:
test_coverage_command -> test_command -> run_command
"test_coverage_command": "python -m pytest --cov=src.main --cov-branch --cov-report=term-
missing test.py",
"dependencies": {
"numpy": ">=1.26.0"
},
"test_dependencies": {
"pytest": ">=8.4.0"
},
"references": {
"problem_source_repo": "h ps://github.com/...",
"problem_source_file": "h ps://github.com/..."
```
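The optional-command priority above can be read as a simple fallback chain. Here is a minimal illustrative sketch of that logic (the `resolve_test_command` helper is hypothetical, not part of the repository tooling):

```python
def resolve_test_command(metadata: dict) -> str | None:
    """Return the first configured command, in the documented priority order:
    test_coverage_command -> test_command -> run_command."""
    for key in ("test_coverage_command", "test_command", "run_command"):
        if metadata.get(key):
            return metadata[key]
    return None
```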
### Problem-specific README.md Template
Each problem needs its own README with:
```markdown
# Problem Name
## How to Run
uv run python src/main.py
## How to Run Tests with Coverage
uv run python -m pytest --cov=src.main --cov-branch --cov-report=term-missing test.py
```
## Problem Requirements
### Must Include
- **Graduate-level difficulty** that challenges advanced AI agents
- **Clear specifications** with unambiguous input/output
- **Multi-step reasoning** requiring problem decomposition
- **Comprehensive testing** with 10-20+ test cases covering edge cases
- **Complete working solution** with proper documentation
### Must Not Include
- Multiple-choice questions
- Proof problems
- Surface-level problems easily solved by AI
- Ambiguous problem statements
- Problems from prohibited sources (see below)
## Difficulty Levels
Test your problem against AI agents:
- **Medium**: Required agent (Nova Premier) fails, additional agent (GPT/Claude) succeeds
- **Hard**: Both required and additional agents fail with mild issues
- **Expert**: Both agents fail with severe issues
## Valida on & Tes ng
### Automated Valida on
```bash
# Validate only your changes (recommended during development)
uv run python -m scripts.validate_submission --changed
# Validate all submissions
uv run python -m scripts.validate_submission
# Validate specific problem
uv run python -m scripts.validate_submission --problem two-sum
```
### Manual Testing
```bash
# Navigate to your problem directory
cd languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
# Test your solution
uv run python -m pytest --cov=src.main --cov-branch test.py
# Verify solution runs
uv run python src/main.py
```
### Pre-commit Hooks
The setup script automatically configures pre-commit hooks that will:
- Validate branch names (must be `<language>/<name>`; see the sketch after this list)
- Check submission structure
- Format Python code
- Run validation before each commit
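As an illustration, the branch-name rule can be expressed as a single pattern. A minimal sketch, assuming lowercase language names and hyphenated problem names (the hook's actual pattern may differ):

```python
import re
import subprocess

# Assumed format: lowercase language, a slash, then a lowercase hyphenated name.
BRANCH_PATTERN = re.compile(r"^[a-z]+/[a-z0-9][a-z0-9-]*$")

def current_branch_is_valid() -> bool:
    """Check the current git branch against the <language>/<name> format."""
    branch = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return bool(BRANCH_PATTERN.match(branch))
```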
## Automated Evaluation Pipeline
When you submit a pull request, your problems undergo comprehensive automated evaluation across three validation systems:
### **Requirements Validation**
Verifies that your problem meets all quality standards:
- **Metadata Completeness**: All required fields, proper file structure, command specifications
- **Content Quality**: Clear I/O specifications, comprehensive testing, proper dependencies
- **Problem Characteristics**: Graduate-level complexity, multi-step reasoning, unambiguous language
- **Test Standards**: Framework compliance, adequate coverage, comprehensive cases
- **Format Compliance**: Natural language, proper entry points, no prohibited formats
### **Model Difficulty Validation**
Tests your problem against multiple AI models to ensure appropriate difficulty:
- **Challenge Level**: Validates that advanced AI agents struggle with your problem
- **Difficulty Correlation**: Confirms stated difficulty matches actual agent performance
- **Multi-Agent Testing**: Requires at least 2 models to fail for proper challenge level
- **Solution Verification**: Tests that valid solutions can be generated and executed
### **Execution Validation**
Ensures your code and tests work correctly in a clean environment:
- **Code Execution**: Runs your solution against provided test cases
- **Test Suite Validation**: Verifies all tests pass with your reference solution
- **Coverage Analysis**: Confirms test coverage meets minimum requirements
- **Environment Setup**: Tests dependency installation and setup commands
- **Cross-Platform Compatibility**: Validates execution across different environments
### Evaluation Results
After evaluation completes, you'll receive detailed feedback on:
- **PASSED**: All validations successful - problem ready for review
- **WARNINGS**: Minor issues that don't block acceptance
- **FAILED**: Critical issues requiring fixes before approval
Results are automatically posted as PR comments with specific guidance for any required improvements.
## Execution Environments
All submissions are tested in clean, isolated Docker containers with the following specifications:
### **System Resources**
- **Timeout**: 120 seconds per execution
- **Memory Limit**: 512MB
- **CPU Limit**: 1.0 core
### **Supported Languages & Environments**
| Language | Docker Image | Package Manager | Test Framework | Coverage Tool |
|----------|--------------|-----------------|----------------|---------------|
| **Python** | `python:3.11-slim` | pip | pytest | pytest-cov |
| **Java** | `amazoncorretto:17-alpine-jdk` | Maven | JUnit 5 | JaCoCo |
| **JavaScript** | `node:18-alpine` | npm | Jest | Jest Coverage |
| **TypeScript** | `node:18-alpine` | npm | Jest | Jest Coverage |
| **Rust** | `rust:1.75` | Cargo | Built-in | Built-in |
| **Go** | `golang:1.21-alpine` | go mod | Built-in | Built-in |
| **C** | `gcc:latest` | apk | Built-in | gcov |
| **C++** | `gcc:latest` | apk | Built-in | gcov |
| **Ruby** | `ruby:3.2-alpine` | gem | RSpec/Minitest | SimpleCov |
| **PHP** | `php:8.2-cli-alpine` | Composer | PHPUnit | PHPUnit Coverage |
| **COBOL** | `esolang/cobol` | Built-in | Built-in | Manual |
### **Command Execution Order**
1. **Setup Commands**: Dependency installation, environment configuration
2. **Test Execution**: Run your test suite with coverage (if available)
3. **Result Parsing**: Extract test results, coverage metrics, and execution logs (a simplified sketch follows)
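Conceptually, the stages run in sequence inside the container. A highly simplified sketch (the `run_stage` helper and the commands shown are illustrative, not the executor's real code):

```python
import subprocess

def run_stage(name: str, command: str, timeout: int = 120) -> subprocess.CompletedProcess:
    """Run one pipeline stage, honoring the documented 120-second timeout."""
    print(f"[{name}] {command}")
    return subprocess.run(command, shell=True, capture_output=True, text=True, timeout=timeout)

setup = run_stage("setup", "pip install -r requirements.txt")
tests = run_stage("test", "python -m pytest --cov=src.main --cov-branch test.py")
# Result parsing would then extract pass/fail counts and coverage from tests.stdout.
```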
### **Language-Specific Considerations**
The executor copies all files from your problem directory to maintain complete context.
However, some languages have specific requirements:
**Java**:
- Use `entrypoint` and `test_filename` in metadata.json to specify correct class names
- Example: `"entrypoint": "StringUtils.java"`, `"test_filename": "TestStringUtils.java"`
**Go**:
- All `.go` files are automatically placed in the same package (root directory) during execution
- This allows test files to access functions from main files regardless of the original directory structure
- Your tests and main code will be in the same package during execution
**General**: All auxiliary files (like `test_with_coverage.rb`, configuration files, etc.) are automatically available during execution.
### **Dependency Management**
Each language automatically handles dependencies specified in `metadata.json`:
```json
{
  "dependencies": {
    "package-name": ">=1.0.0"
  },
  "test_dependencies": {
    "testing-framework": ">=2.0.0"
  }
}
```
**Auto-generated dependency files:**
- **Python**: `requirements.txt` (see the sketch after this list)
- **Java**: `pom.xml` with Maven configuration
- **JavaScript/TypeScript**: `package.json` with npm dependencies
- **Rust**: `Cargo.toml` with workspace setup
- **Go**: `go.mod` with module dependencies
- **Ruby**: `Gemfile` with bundler
- **PHP**: `composer.json` with Composer
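To make the mapping concrete for Python, here is a minimal sketch of how a `requirements.txt` could be derived from the `dependencies` block (the `write_requirements` helper is illustrative, not the actual generator):

```python
def write_requirements(dependencies: dict[str, str], path: str = "requirements.txt") -> None:
    """Render a metadata.json dependencies mapping as pip requirement lines."""
    lines = [f"{package}{constraint}" for package, constraint in dependencies.items()]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# Example: {"numpy": ">=1.26.0"} becomes a file containing the line "numpy>=1.26.0".
write_requirements({"numpy": ">=1.26.0"})
```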
### **Custom Commands**
You can override default behavior using metadata fields:
```json
{
  "setup_command": "custom setup script",
  "run_command": "custom execution command",
  "test_command": "custom test command",
  "test_coverage_command": "custom coverage command"
}
```
### **Environment Variables**
Language-specific environment variables are automatically set:
- **Python**: `PYTHONPATH`, `PYTHONUNBUFFERED`, `PYTHONDONTWRITEBYTECODE`
- **Java**: `JAVA_HOME`, `MAVEN_OPTS`, `MAVEN_CONFIG`
- **Node.js**: `NODE_ENV`, `NPM_CONFIG_CACHE`, `NPM_CONFIG_PREFIX`
- **Rust**: `CARGO_HOME`, `CARGO_TARGET_DIR`, `RUSTUP_HOME`
- **Go**: `GOPATH`, `GOPROXY`, `GOMOD`
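When debugging locally, you can mirror the documented Python variables. The values below are conventional assumptions, not necessarily what the container sets:

```python
import os

# Assumed values; the container's actual settings may differ.
os.environ.setdefault("PYTHONPATH", ".")
os.environ.setdefault("PYTHONUNBUFFERED", "1")
os.environ.setdefault("PYTHONDONTWRITEBYTECODE", "1")
```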
### **Testing Guidelines**
- Use standard testing frameworks for your language (see the pytest sketch after this list)
- Include 10-20+ comprehensive test cases
- Aim for >80% code coverage where possible
- Test edge cases and error conditions
- Ensure tests pass with your reference solution
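Parametrized tests are an easy way to reach the 10-20+ case target while covering edge cases. A minimal pytest sketch (the `solve` function and its cases are placeholders for your actual entrypoint):

```python
import pytest

from src.main import solve  # placeholder import; use your actual entrypoint

# One tuple per case; extend this table toward the 10-20+ case target.
CASES = [
    ([1, 2, 3], 6),    # typical input
    ([], 0),           # empty input (edge case)
    ([-1, 1], 0),      # cancellation
    ([10**9], 10**9),  # large-value boundary
]

@pytest.mark.parametrize("data, expected", CASES)
def test_solve(data, expected):
    assert solve(data) == expected

def test_error_conditions():
    """Invalid input should raise (placeholder error-condition check)."""
    with pytest.raises(TypeError):
        solve(None)
```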
## Prohibited Sources
Do not use problems from:
- ML Engineering: huggingface/transformers.js, huggingface/transformers, pytorch/extension-cpp, jax-ml/jax
- Scientific Coding: certik/theoretical-physics, gszauer/GamePhysicsCookbook, sxs-collaboration/spectre
## Submission Checklist
Before submitting, verify:
- [ ] Used CLI tool or followed exact directory structure
- [ ] Branch name follows `<language>/<name>` format
- [ ] All required files present and complete
- [ ] Problem tested against AI agents for appropriate difficulty
- [ ] Solution is executable and passes all tests
- [ ] Test suite has 10-20+ comprehensive test cases
- [ ] Problem-specific README includes run/test commands
- [ ] Metadata.json properly filled out
- [ ] No prohibited sources used
## Contributing
For detailed contribution guidelines, see [CONTRIBUTING.md](CONTRIBUTING.md).
For comprehensive project methodology and problem creation guidelines, see
[INSTRUCTIONS.md](INSTRUCTIONS.md).
## Support
Questions? Check the sample problem at `languages/Python/2025-06-09-stytarenko-two-sum/`
for the definitive format example.
`THIS IS ONLY AN EXAMPLE OF THE CORRECT STRUCTURE, NOT A REFERENCE FOR COMPLEXITY!`
# FAQ
- How do I check what's wrong with my PR?
- [Loom](https://www.loom.com/share/0c35b0374cbd41c29aa875aa263cd835?sid=debfcf75-b45e-44f3-9144-7dc5626376a8)
# Contributing to Rainforest Coding Verifiers
## Quick Start - Recommended Approach
The fastest and most reliable way to contribute is using our CLI tool:
```bash
# Set up your environment
uv sync
uv run python setup_precommit.py
# Create a new problem interactively
uv run python -m scripts.create_problem
```
The CLI will handle all the setup for you, including proper file structure and branch naming.
## Branch Naming Requirements
Branch names **must** follow the format: `<language>/<name>`
**Examples:**
- `python/graph-algorithms` - Topic-based grouping
- `javascript/add-sorting-problems` - Feature description
- `cpp/batch-1` - Batch identifier
- `java/advanced-data-structures` - Subject area
This allows you to group multiple related problems in a single PR.
## Required Directory Structure
Each submission must follow this **exact** format:
```
languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
├── metadata.json # Problem metadata and configuration
├── problem.txt # Main problem statement
├── background.txt # Domain-specific context (if needed)
├── src/
│ └── main.py # Reference solution (entrypoint)
├── test.py # Comprehensive test suite
└── README.md # Problem-specific documentation
```
### Directory Naming Format
```
YYYY-MM-DD-username-problem-name
```
**Examples:**
- `2024-01-15-john-two-sum`
- `2024-01-15-sarah-binary-search`
- `2024-01-16-mike-reverse-linked-list`
- `2024-01-15-john-fibonacci-recursive` (same day, different problem)
**Components:**
- **Date**: Use the date when you start working on the problem (YYYY-MM-DD format)
- **Username**: Your GitHub username (lowercase, replace spaces/special chars with hyphens)
- **Name**: Brief, descriptive name using hyphens instead of spaces (a naming sketch follows)
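A quick way to sanity-check a directory name is to build it programmatically. A minimal sketch (the helpers are illustrative, not repository tooling):

```python
import re
from datetime import date

def _slug(text: str) -> str:
    """Lowercase and replace runs of non-alphanumerics with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def problem_dir_name(username: str, problem_name: str, start: date) -> str:
    """Build a YYYY-MM-DD-username-problem-name directory name."""
    return f"{start.isoformat()}-{_slug(username)}-{_slug(problem_name)}"

# problem_dir_name("John", "Two Sum", date(2024, 1, 15)) -> "2024-01-15-john-two-sum"
```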
## Required Files
| File | Purpose | Requirements |
|------|---------|-------------|
| `metadata.json` | Problem metadata | Difficulty, labels, test coverage, dependencies |
| `problem.txt` | Problem statement | Clear description, examples, constraints |
| `background.txt` | Domain context | Definitions needed (can be empty) |
| `src/main.*` | Reference solution | Fully executable, documented, complete |
| `test.*` | Test suite | 10-20+ test cases, uses standard frameworks |
| `README.md` | Instructions | How to run, test commands, dependencies |
## Problem Requirements
### Must Include
- **Graduate-level difficulty** that challenges advanced AI agents
- **Clear specifications** with unambiguous input/output
- **Multi-step reasoning** requiring problem decomposition
- **Comprehensive testing** with 10-20+ test cases covering edge cases
- **Complete working solution** with proper documentation
### Must Not Include
- Multiple-choice questions
- Proof problems
- Surface-level problems easily solved by AI
- Ambiguous problem statements
- Problems from prohibited sources
## Valida on & Tes ng
### Before Submi ng
Run valida on on your changes:
```bash
# Validate only your changes (recommended during development)
uv run python -m scripts.validate_submission --changed
# Test your solution
cd languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
uv run python -m pytest --cov=src.main --cov-branch test.py
uv run python src/main.py
```
### Pre-commit Hooks
The setup script automatically configures hooks that will:
- Validate branch names (must be `<language>/<name>`)
- Check submission structure
- Format code
- Run validation before each commit
## Submission Workflow
### 1. Setup (One-time)
```bash
uv sync
uv run python setup_precommit.py
```
### 2. Create Problem (Recommended)
```bash
uv run python -m scripts.create_problem
```
### 3. Alternative: Manual Creation
If not using the CLI tool, ensure you:
1. Create branch with `<language>/<name>` format
2. Follow the exact directory structure above
3. Include all required files
4. Run validation before committing
### 4. Submit Pull Request
- Title: `Add [Language]: [Problem Name]`
- Include brief description of the problem and approach
- Ensure all validation passes
## Final Checklist
Before submitting, verify:
- [ ] Used CLI tool or followed exact directory structure
- [ ] Branch name follows `<language>/<name>` format
- [ ] All required files present and complete
- [ ] Problem tested against AI agents for appropriate difficulty
- [ ] Solution is executable and passes all tests
- [ ] Test suite has 10-20+ comprehensive test cases
- [ ] Problem-specific README includes run/test commands
- [ ] Metadata.json properly filled out
- [ ] No prohibited sources used
- [ ] Validation passes: `uv run python -m scripts.validate_submission --changed`
## Questions?
Check the sample problem at `languages/Python/2025-06-09-stytarenko-two-sum/` for the
definitive format example.
# AGENTS.md - AI Agent Guide for Problem Submission
## Quick Start for AI Agents
This guide provides step-by-step instructions for AI agents to successfully submit coding
problems to this repository.
## Repository Structure Overview
```
rainforest-coding-verifiers/
├── README.md # Main documentation
├── INSTRUCTIONS.md # Detailed problem creation guidelines
├── AGENTS.md # This file - agent-specific instructions
├── scripts/
│ ├── create_problem.py # CLI tool for problem creation
│ └── validate_submission.py # Validation tool
├── languages/ # All problems organized by language
│ ├── Python/
│ ├── JavaScript/
│ ├── Java/
│ └── [other languages]/
└── evaluation/ # Validation systems (read-only)
```
## Step-by-Step Submission Workflow
### 1. Environment Setup
```bash
# Clone and setup
git clone <repository-url>
cd rainforest-coding-verifiers
uv sync
uv run python setup_precommit.py
```
### 2. Branch Creation
Create a branch following the **exact format**: `<language>/<name>`
**Valid Examples:**
- `python/graph-algorithms`
- `javascript/sorting-optimization`
- `java/concurrent-data-structures`
- `cpp/memory-management`
**Commands:**
```bash
git checkout -b python/your-problem-name
```
### 3. Problem Creation Using CLI (Recommended)
```bash
uv run python -m scripts.create_problem
```
The CLI will:
- Prompt for language selection
- Ask for problem name
- Auto-generate directory with date prefix
- Create all required template files
- Set up proper structure
### 4. Manual Problem Creation (Alternative)
If CLI is unavailable, create this **exact structure**:
```
languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
├── metadata.json
├── problem.txt
├── background.txt
├── src/
│ └── main.{ext}
├── test.{ext}
└── README.md
```
## Required Files and Templates
### metadata.json Template
```json
"name": "problem-name",
"difficulty": "Medium|Hard|Expert",
"subtask": "Code Genera on|Code Edi ng",
"subject_labels": ["Array", "Hash Table", "Algorithm"],
"test_coverage": {
"line_coverage": "85.5%",
"branch_coverage": "90.2%"
},
"setup_command": "pip install --upgrade pip",
"run_command": "python src/main.py",
"test_command": "python -m pytest test.py",
"test_coverage_command": "python -m pytest --cov=src.main --cov-branch --cov-report=term-
missing test.py",
"dependencies": {
"numpy": ">=1.26.0"
},
"test_dependencies": {
"pytest": ">=8.4.0",
"pytest-cov": ">=4.0.0"
},
"references": {
"problem_source_repo": "h ps://github.com/...",
"problem_source_file": "h ps://github.com/..."
```
### problem.txt Template
```
[Clear problem statement with:]
- Problem description
- Input format specification
- Output format specification
- Constraints
- Examples with expected outputs
- Edge cases to consider
```
### background.txt Template
```
[Domain-specific context, definitions, or mathematical background]
[Can be empty if no special context needed]
```
### src/main.py Template (Python example)
```python
#!/usr/bin/env python3
"""
Problem: [Problem Name]
Author: [Your name]
Date: [Date]
[Brief description of solution approach]
"""

def main_function(input_param):
    """
    Main solution function.

    Args:
        input_param: [Description]

    Returns:
        [Description of return value]
    """
    # Implementation here
    pass

if __name__ == "__main__":
    # Example usage or input handling
    pass
```
### test.py Template (Python example)
```python
import pytest
from src.main import main_function

class TestMainFunction:
    """Comprehensive test suite with 10-20+ test cases."""

    def test_basic_cases(self):
        """Test basic functionality."""
        assert main_function(input1) == expected1
        assert main_function(input2) == expected2

    def test_edge_cases(self):
        """Test edge cases."""
        # Empty inputs, boundary values, etc.
        pass

    def test_error_conditions(self):
        """Test error handling."""
        # Invalid inputs, exceptions, etc.
        pass
```
### Problem-specific README.md Template
```markdown
# [Problem Name]
## How to Run
uv run python src/main.py
## How to Run Tests
uv run python -m pytest test.py
## How to Run Tests with Coverage
uv run python -m pytest --cov=src.main --cov-branch --cov-report=term-missing test.py
## Dependencies
[List any special dependencies or setup requirements]
```
## Quality Requirements Checklist
Your problem MUST meet these requirements:
### Problem Characteristics
- [ ] **Graduate-level difficulty** that challenges advanced AI agents
- [ ] **Multi-step reasoning** requiring problem decomposition
- [ ] **Clear specifications** with unambiguous input/output
- [ ] **NOT multiple-choice** format
- [ ] **NOT a proof problem**
- [ ] **NOT surface-level** or trivial
### Technical Requirements
- [ ] **Complete working solution** in src/ directory
- [ ] **10-20+ comprehensive test cases** covering edge cases
- [ ] **Test coverage ≥70% line, ≥80% branch** (reported in metadata.json)
- [ ] **All tests pass** with your reference solution
- [ ] **Proper dependencies** specified in metadata.json
- [ ] **Working run/test commands** specified
### File Requirements
- [ ] All required files present and non-empty
- [ ] metadata.json properly formatted with all fields
- [ ] Problem statement clear and complete
- [ ] Solution code documented and executable
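To pre-check the numbers you record in metadata.json against the coverage thresholds above, here is a small sketch (the `coverage_ok` helper is illustrative, not repository tooling):

```python
def coverage_ok(metadata: dict, min_line: float = 70.0, min_branch: float = 80.0) -> bool:
    """Compare metadata.json test_coverage fields against the checklist thresholds."""
    cov = metadata["test_coverage"]
    line = float(cov["line_coverage"].rstrip("%"))
    branch = float(cov["branch_coverage"].rstrip("%"))
    return line >= min_line and branch >= min_branch

assert coverage_ok({"test_coverage": {"line_coverage": "85.5%", "branch_coverage": "90.2%"}})
```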
## Validation Commands
Before submitting, run these validation checks:
```bash
# Validate your changes (run from repo root)
uv run python -m scripts.validate_submission --changed
# Test your specific problem
cd languages/{LANGUAGE}/{YYYY-MM-DD-username-problem-name}/
uv run python -m pytest --cov=src.main --cov-branch test.py
uv run python src/main.py
# Check coverage meets requirements
uv run python -m pytest --cov=src.main --cov-branch --cov-report=term-missing test.py
```
## PR Submission Process
### 1. Commit Your Changes
```bash
git add .
git commit -m "Add [language]: [problem-name] - [brief descrip on]"
git push origin <language>/<name>
```
### 2. Create Pull Request
- **Title**: `Add [Language]: [Problem Name] - [Brief Description]`
- **Description**: Include problem difficulty, subject areas, and brief overview
- **Base branch**: `main`
### 3. PR Description Template
```markdown
## Problem Summary
- **Language**: [Programming Language]
- **Difficulty**: [Medium/Hard/Expert]
- **Subject Areas**: [List key topics]
- **Problem Type**: [Code Generation/Code Editing]
## Problem Description
[Brief 2-3 sentence description of what the problem asks]
## Key Challenges
- [Challenge 1]
- [Challenge 2]
- [Challenge 3]
## Validation Checklist
- [ ] All required files present
- [ ] Solution passes all tests
- [ ] Test coverage ≥70% line, ≥80% branch
- [ ] Problem tested against AI agents for appropriate difficulty
- [ ] Metadata.json complete and valid
```
## Common Pitfalls to Avoid
### Directory Structure
- Missing date prefix in directory name
- Incorrect file extensions for language
- Missing required files
- Correct format: `languages/Python/2025-01-15-agent-problem-name/`
### Problem Quality
- Too easy (solvable in 1-2 steps)
- Multiple choice format
- Vague or ambiguous requirements
- Instead, aim for: multi-step reasoning and careful implementation
### Testing
- Only 2-3 basic test cases
- No edge case coverage
- Tests don't actually validate the solu on
- Instead, aim for: 10-20+ comprehensive tests with edge cases
## Automated Evaluation
After PR submission, your problem will be automatically evaluated across three systems:
1. **Requirements Validation**: Checks problem quality, structure, and completeness
2. **Model Difficulty Validation**: Tests against AI models to verify appropriate difficulty
3. **Execution Validation**: Ensures code runs correctly in a clean environment
Results will be posted as PR comments with specific feedback for any required improvements.
## Troubleshooting
### Common Issues and Solutions
**Issue**: Branch name validation fails
- **Solution**: Ensure branch follows exact format `<language>/<name>`
**Issue**: Coverage validation fails
- **Solution**: Add more comprehensive tests, especially edge cases
**Issue**: Problem too easy (models pass)
- **Solution**: Increase complexity, add more constraints, require deeper reasoning
**Issue**: Tests fail in clean environment
- **Solution**: Check dependencies in metadata.json, verify setup commands
**Issue**: Metadata validation fails
- **Solution**: Ensure all required fields present, proper JSON formatting
## Reference Example
See the reference implementation at:
`languages/Python/2025-06-09-stytarenko-two-sum/`
This shows the exact structure and format expected (though complexity should be higher for
your submissions).
## Success Criteria
Your PR will be approved when:
- All automated evaluations pass
- Problem demonstrates appropriate difficulty for AI agents
- Code quality and documentation meet standards
- Test coverage and comprehensiveness adequate
- Problem fits project goals and requirements
Follow this guide systematically, and your submission should pass all validation checks successfully.