Skip to content

Conversation

@derduher
Copy link
Collaborator

Summary

This PR addresses security vulnerabilities and code quality issues identified during a comprehensive code review of lib/types.ts, lib/utils.ts, lib/validation.ts, lib/xmllint.ts, and lib/errors.ts.

Security Fixes (Critical)

🔒 Command Injection Prevention

  • File: lib/xmllint.ts
  • Issue: Function accepted file paths that could be exploited for command injection
  • Fix: Always pipe XML content via stdin; never accept file paths as arguments
  • Impact: Prevents arbitrary command execution via malicious file path arguments

🔒 Enhanced Path Traversal Protection

  • File: lib/validation.ts
  • Issue: Incomplete detection of .. path traversal sequences
  • Fix: Detect .. in all positions (beginning, middle, end, standalone)
  • Details:
    • Check before and after path normalization
    • Validate individual path components
    • Handle both Unix (/) and Windows (\) separators
  • Impact: Prevents accessing files outside intended directories

🔒 XSL URL Security Hardening

  • File: lib/validation.ts
  • Issue: Case-sensitive validation missed obfuscated attacks
  • Fix: Case-insensitive validation for malicious patterns
  • Blocks:
    • Script tags: <script, <ScRiPt, <SCRIPT>, etc.
    • Dangerous protocols: javascript:, data:, vbscript:, file:, about:
    • URL-encoded attacks: %3cscript, javascript%3a, etc.
  • Impact: Comprehensive XSS and protocol injection protection

Quality Improvements

✅ Number Validation

  • File: lib/utils.ts
  • Issue: parseFloat() and parseInt() could produce NaN that propagated to XML
  • Fix: Validate parsed values and throw descriptive errors
  • Validates:
    • Video ratings are valid numbers (not NaN)
    • View counts are valid non-negative integers

✅ Date Validation

  • File: lib/utils.ts
  • Issue: Invalid date strings created Invalid Date objects
  • Fix: Check Number.isNaN(date.getTime()) to detect invalid dates
  • Impact: Prevents Invalid Date objects in XML output

✅ Language Regex Fix

  • File: lib/types.ts
  • Issue: Regex /^zh-cn|zh-tw|([a-z]{2,3})$/ had incorrect grouping
  • Fix: Changed to /^(zh-cn|zh-tw|[a-z]{2,3})$/ for proper validation
  • Impact: Correctly validates ISO 639 language codes

✅ Path Resolution Robustness

  • File: lib/xmllint.ts
  • Issue: Fragile relative path resolution depended on execution context
  • Fix: Use process.cwd() based search with fallback paths
  • Impact: Works reliably in source, ESM build, CJS build, and test environments

📝 Type Documentation

  • File: lib/types.ts
  • Added: JSDoc comments for PriceType and Resolution types
  • Documents: Google Video Sitemap spec allows case variations

Documentation

📖 Security Documentation

  • File: api.md
  • Added: Comprehensive security section to simpleSitemapAndIndex
  • Documents:
    • URL validation rules and limits
    • Path traversal protection mechanisms
    • XSL stylesheet security measures
    • Resource limits and data validation
  • Added: Security note to xmlLint documentation about stdin piping
  • Updated: Examples to show secure usage patterns

📖 JSDoc Security Comments

  • Added security-focused comments to all validation functions
  • Explain what attacks each validation prevents
  • Document security rationale and best practices

Testing

✅ Test Updates

  • Updated tests/xmllint.test.ts to pass XML content instead of file paths
  • Added 7 new XSL URL security tests covering:
    • Mixed-case script tag detection
    • URL-encoded malicious content
    • Multiple dangerous protocol checks
  • All 291 tests passing with coverage exceeding requirements:
    • Statements: 90.23% (required: 90%)
    • Branches: 84.03% (required: 80%)
    • Functions: 94.08% (required: 90%)
    • Lines: 90.34% (required: 90%)

Breaking Changes

⚠️ Note: This PR includes breaking changes for improved security:

  1. Stricter path traversal detection - May reject paths containing .. that were previously accepted
  2. Enhanced XSL URL validation - Blocks more dangerous patterns (case-insensitive checks)
  3. Number/date parsing - Throws errors on invalid input instead of silently creating NaN/Invalid Date
  4. Language regex - May reject invalid language codes that previously passed
  5. xmlLint API change - No longer accepts file paths, only string or stream content

All breaking changes are documented in the updated api.md.

Checklist

  • Security vulnerabilities fixed
  • Code quality improved
  • Tests updated and passing (291/291)
  • Documentation updated with security notes
  • Coverage exceeds requirements (>90%)
  • Pre-commit hooks passed (eslint, prettier)
  • Breaking changes documented

🤖 Generated with Claude Code

## Security Fixes (Critical)

### Command Injection Prevention (lib/xmllint.ts)
- Fix command injection vulnerability in xmlLint function
- Always pipe XML content via stdin instead of accepting file paths
- Prevents arbitrary command execution via malicious file path arguments
- Update tests to pass XML content as strings or streams

### Path Traversal Protection (lib/validation.ts)
- Enhance path traversal detection in validatePath() and validatePublicBasePath()
- Detect '..' in all positions (beginning, middle, end, standalone)
- Validate before and after path normalization
- Check individual path components to catch edge cases
- Handle both Unix (/) and Windows (\) style path separators

### XSL URL Security (lib/validation.ts)
- Implement case-insensitive validation for XSL URLs
- Block script tags in all case variations: <script, <ScRiPt, <SCRIPT>
- Add checks for dangerous protocols: data:, vbscript:, file:, about:
- Detect URL-encoded attacks: %3cscript, javascript%3a, etc.
- Comprehensive protection against XSS and protocol injection

## Quality Improvements

### Number Validation (lib/utils.ts)
- Validate parseFloat(video.rating) returns valid number, not NaN
- Validate parseInt(video.view_count) returns valid non-negative integer
- Throw descriptive errors instead of silently propagating NaN values

### Date Validation (lib/utils.ts)
- Validate Date objects for lastmod, lastmodISO, lastmodfile
- Check for Invalid Date objects using Number.isNaN(date.getTime())
- Throw descriptive errors for unparseable date strings

### Language Regex Fix (lib/types.ts)
- Fix language validator regex from /^zh-cn|zh-tw|([a-z]{2,3})$/ to /^(zh-cn|zh-tw|[a-z]{2,3})$/
- Proper grouping ensures correct ISO 639 language code validation

### Path Resolution (lib/xmllint.ts)
- Improve schema directory resolution using process.cwd()
- Works reliably in source, ESM build, CJS build, and test environments
- Graceful fallback when import.meta is unavailable (Jest)

### Type Documentation (lib/types.ts)
- Add JSDoc comments for PriceType and Resolution types
- Document Google Video Sitemap spec allows case variations

## Documentation

### Security Documentation
- Add comprehensive security section to api.md
- Document URL validation, path traversal protection, XSL security
- Add security notes to xmlLint documentation
- Update examples to show secure usage patterns

### JSDoc Comments
- Add security-focused JSDoc to all validation functions
- Explain what attacks each validation prevents
- Document security best practices

## Testing

### Updated Tests
- Update xmllint tests to pass content instead of file paths
- Add 7 new XSL URL security tests (case-insensitive, encoded attacks)
- All 291 tests passing with >90% coverage

## Breaking Changes

- Stricter path traversal detection may reject paths with '..'
- XSL URL validation blocks more patterns (case-insensitive)
- Number/date parsing throws errors on invalid input
- Language regex fix may reject invalid codes
- xmlLint() no longer accepts file paths, only string/stream content

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@derduher derduher merged commit 4f487a0 into master Oct 13, 2025
6 checks passed
@derduher derduher deleted the security/code-review-fixes branch October 13, 2025 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants