15 releases (5 breaking)

0.9.5 Feb 1, 2026
0.9.4 Feb 1, 2026
0.6.0 Oct 25, 2025
0.5.0 Oct 21, 2025
0.2.4 Oct 16, 2025

#258 in Machine learning

MIT license

130KB
2K SLoC

Skim

Smart code reader - streaming code transformation for AI agents.

Overview

Skim transforms source code by stripping implementation details while preserving structure, signatures, and types. The result is a much smaller view of the code that fits more easily into an LLM context window.

Think of it like cat, but smart about what code to show.

Installation

Try it (no install required)

npx rskim file.ts

# Via npm
npm install -g rskim

# Via Cargo
cargo install rskim

Note: Use npx for trying it out. For regular use, install globally to avoid npx overhead (~100-500ms per invocation).

Quick Start

# Try it with npx (no install)
npx rskim file.ts

# Or install globally for better performance
npm install -g rskim

# Read TypeScript with structure mode
skim file.ts

# Process multiple files with glob patterns
skim 'src/**/*.ts'

# Show token reduction statistics
skim file.ts --show-stats

# Extract Python function signatures
skim file.py --mode signatures

# Parallel processing with custom job count
skim '*.{js,ts}' --jobs 4

# Pipe to syntax highlighter
skim file.rs | bat -l rust

# Read from stdin
cat code.ts | skim - --language=typescript

# Clear cache
skim --clear-cache

Features

  • 6 Languages: TypeScript, JavaScript, Python, Rust, Go, Java
  • 4 Transformation Modes: Structure, Signatures, Types, Full
  • Fast: 14.6ms for 3000-line files (3x faster than target)
  • Cached: 40-50x speedup on repeated processing (enabled by default)
  • Multi-file: Glob patterns with parallel processing (skim 'src/**/*.ts')
  • Token Stats: Show reduction statistics with --show-stats
  • Streaming: Outputs to stdout for pipe workflows
  • Safe: Built-in DoS protections

Usage

Basic Usage

skim <FILE>

Options

Options:
  -m, --mode <MODE>         Transformation mode [default: structure]
                            [possible values: structure, signatures, types, full]
  -l, --language <LANGUAGE> Override language detection
                            [possible values: typescript, javascript, python, rust, go, java]
  -j, --jobs <JOBS>         Number of parallel jobs [default: number of CPUs]
      --no-header           Don't print file path headers for multi-file output
      --no-cache            Disable caching (caching is enabled by default)
      --clear-cache         Clear all cached files and exit
      --show-stats          Show token reduction statistics
  -h, --help                Print help
  -V, --version             Print version
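
For scripted use, the options above can be assembled programmatically. The following is a minimal Python sketch and not part of Skim: `build_skim_command` is a hypothetical helper that uses only flags listed in the help output above, and it assumes the `skim` binary is on PATH when the command is eventually run.

```python
import shlex

# Mode names copied from the help output above.
VALID_MODES = {"structure", "signatures", "types", "full"}


def build_skim_command(target, mode="structure", jobs=None,
                       show_stats=False, no_header=False):
    """Return the argv list for one skim invocation (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode}")
    argv = ["skim", target, "--mode", mode]
    if jobs is not None:
        argv += ["--jobs", str(jobs)]
    if show_stats:
        argv.append("--show-stats")
    if no_header:
        argv.append("--no-header")
    return argv


cmd = build_skim_command("src/**/*.ts", mode="signatures", jobs=4, show_stats=True)
# shlex.join quotes the glob so it can be pasted into a shell safely.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run` without a shell hands the glob to Skim unexpanded, which matches the quoted patterns used in the examples in this README.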

Transformation Modes

Structure Mode (Default)

Removes function bodies while preserving signatures (70-80% reduction).

skim file.ts

Input:

function add(a: number, b: number): number {
    const result = a + b;
    console.log(`Adding ${a} + ${b} = ${result}`);
    return result;
}

Output:

function add(a: number, b: number): number { /* ... */ }
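
To make the idea concrete, here is a rough Python analogue of the same transformation. It is only an illustration built on the standard-library `ast` module; this README does not describe how Skim implements structure mode, and `BodyStripper` and `strip_bodies` are hypothetical names. Unlike Skim's output above, this sketch also discards comments, because `ast` does not retain them.

```python
import ast


class BodyStripper(ast.NodeTransformer):
    """Replace every function body with a single `...` placeholder."""

    def _strip(self, node):
        # Keep the signature (name, arguments, return annotation), drop the body.
        node.body = [ast.Expr(value=ast.Constant(value=...))]
        return node

    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip


def strip_bodies(source: str) -> str:
    """Return `source` with all function bodies elided (Python 3.9+)."""
    tree = BodyStripper().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


SAMPLE = '''
class Cart:
    def total(self, tax_rate: float) -> float:
        subtotal = sum(self.prices)
        return subtotal * (1 + tax_rate)
'''

stripped = strip_bodies(SAMPLE)
print(stripped)
```

The class and the method signature survive, while the two statements in the body are replaced by a placeholder.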

Signatures Mode

Extracts only function and method signatures (85-92% reduction).

skim file.py --mode signatures

Input:

def calculate_total(items: list[Item], tax_rate: float) -> Decimal:
    subtotal = sum(item.price for item in items)
    tax = subtotal * tax_rate
    return subtotal + tax

Output:

def calculate_total(items: list[Item], tax_rate: float) -> Decimal:
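
The same input and output can be reproduced with a short Python sketch. This is an illustration using the standard-library `ast` module, not Skim's implementation, which this README does not describe; `extract_signatures` is a hypothetical name.

```python
import ast


def extract_signatures(source: str) -> list[str]:
    """Collect one signature line per function definition (Python 3.9+)."""
    signatures = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            returns = f" -> {ast.unparse(node.returns)}" if node.returns else ""
            signatures.append(
                f"{prefix} {node.name}({ast.unparse(node.args)}){returns}:"
            )
    return signatures


SAMPLE = '''
def calculate_total(items: list[Item], tax_rate: float) -> Decimal:
    subtotal = sum(item.price for item in items)
    tax = subtotal * tax_rate
    return subtotal + tax
'''

for line in extract_signatures(SAMPLE):
    print(line)
```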

Types Mode

Extracts only type definitions (90-95% reduction).

skim file.ts --mode types

Input:

interface User {
    id: number;
    name: string;
}

function getUser(id: number): User {
    return db.users.find(id);
}

Output:

interface User {
    id: number;
    name: string;
}

Full Mode

Returns original code unchanged (0% reduction).

skim file.rs --mode full

Examples

Explore a codebase

# Get an overview of all TypeScript files
skim 'src/**/*.ts' --no-header

# Extract all Python function signatures with stats
skim 'lib/**/*.py' --mode signatures --show-stats > api.txt

# Review Rust types
skim lib.rs --mode types | less

# Parallel processing for faster multi-file operations
skim 'src/**/*.ts' --jobs 8

Prepare code for LLMs

# Reduce size before sending to GPT (wc -w counts words, a rough proxy for tokens)
skim large_file.ts | wc -w
# Output: 150 (was 600)

# Get just the API surface
skim server.py --mode signatures | pbcopy
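
The word counts in the first example above (600 before, 150 after) correspond to a 75% reduction, which falls inside the 70-80% range quoted for structure mode. A small Python sketch of the arithmetic, with `reduction_percent` as a hypothetical helper name:

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage by which `after` is smaller than `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (1 - after / before) * 100


# Figures taken from the example above: 600 words before, 150 after.
print(f"{reduction_percent(600, 150):.0f}% reduction")
```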

Pipe workflows

# Skim and highlight
skim file.rs | bat -l rust

# Skim and search
skim file.ts | grep "interface"

# Skim several files concatenated on stdin
cat *.py | skim - --language=python

Supported Languages

Language      Extensions (auto-detected)
TypeScript    .ts, .tsx
JavaScript    .js, .jsx, .mjs
Python        .py
Rust          .rs
Go            .go
Java          .java
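
Detection by extension amounts to a lookup. The Python sketch below encodes the extension table above; it is an illustration rather than Skim's code, and `detect_language` is a hypothetical name. Returning `None` for unknown extensions is an assumption; the README only says that `--language` overrides detection.

```python
from pathlib import Path

# Extension table from this README, mapped to the values accepted by --language.
EXTENSION_TO_LANGUAGE = {
    ".ts": "typescript",
    ".tsx": "typescript",
    ".js": "javascript",
    ".jsx": "javascript",
    ".mjs": "javascript",
    ".py": "python",
    ".rs": "rust",
    ".go": "go",
    ".java": "java",
}


def detect_language(path: str):
    """Return the language for `path`, or None if the extension is not listed."""
    return EXTENSION_TO_LANGUAGE.get(Path(path).suffix)


print(detect_language("src/app.tsx"))
print(detect_language("README.md"))
```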

Performance

  • Parse + Transform: 14.6ms for 3000-line files (verified)
  • Cached: 5ms on repeated processing (40-50x speedup)
  • Token Reduction: 70-95% depending on mode (0% in full mode)
  • Streaming: Zero intermediate files
  • Parallel: Scales with CPU cores for multi-file processing

Security

Built-in protections against:

  • Stack overflow attacks (max depth: 500)
  • Memory exhaustion (max input: 50MB)
  • UTF-8 boundary violations
  • Path traversal attacks
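
The first two limits can be pictured as simple guards. The Python sketch below is a hedged illustration, not Skim's implementation: the function names are hypothetical, nested lists stand in for a syntax tree, and reading 50MB as 50 * 1024 * 1024 bytes is an assumption.

```python
MAX_INPUT_BYTES = 50 * 1024 * 1024  # assumed reading of the 50MB limit above
MAX_DEPTH = 500                     # depth limit from the list above


def check_input_size(data: bytes, limit: int = MAX_INPUT_BYTES) -> None:
    """Reject inputs larger than `limit` bytes before any parsing happens."""
    if len(data) > limit:
        raise ValueError(f"input is {len(data)} bytes, limit is {limit}")


def check_depth(tree, limit: int = MAX_DEPTH) -> None:
    """Reject trees nested deeper than `limit`.

    The walk uses an explicit stack instead of recursion, so the check
    itself cannot overflow the call stack on hostile input.
    """
    stack = [(tree, 1)]
    while stack:
        node, depth = stack.pop()
        if depth > limit:
            raise ValueError(f"nesting depth exceeds {limit}")
        stack.extend((child, depth + 1) for child in node)


# A shallow tree passes; a 601-level tree is rejected.
check_depth([[], [[]]])
deep = []
for _ in range(600):
    deep = [deep]
try:
    check_depth(deep)
except ValueError as err:
    print(err)
```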

Library

For programmatic usage, see the rskim-core library crate.

License

MIT
