Xebec19/seo-crawler

SEO Crawler

A command-line SEO analysis tool written in Go. It analyzes a single webpage and generates an HTML report with technical SEO findings, scores, and suggestions.

Features

Meta Tags & On-Page SEO

  • Title tag analysis (presence, length optimization)
  • Meta description analysis (presence, length optimization)
  • Heading structure analysis (H1-H6 hierarchy)
  • Open Graph tags detection
  • Twitter Card tags detection
  • Charset and viewport meta tags
  • Language declaration
  • Canonical URL detection
  • Robots meta tag analysis
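
As an illustration of the heading structure check, the sketch below flags skipped levels and a missing or repeated H1. The function name `headingIssues` and the exact rules are assumptions made for this example, not the tool's actual implementation.

```go
package main

import "fmt"

// headingIssues inspects heading levels in document order (1 for H1 ... 6 for H6)
// and returns human-readable findings. The rules are illustrative assumptions.
func headingIssues(levels []int) []string {
	var issues []string
	h1Count := 0
	prev := 0
	for i, lvl := range levels {
		if lvl == 1 {
			h1Count++
		}
		// A jump of more than one level (for example H2 -> H4) skips part of the hierarchy.
		if prev != 0 && lvl > prev+1 {
			issues = append(issues, fmt.Sprintf("heading %d jumps from H%d to H%d", i+1, prev, lvl))
		}
		prev = lvl
	}
	switch {
	case h1Count == 0:
		issues = append(issues, "no H1 found")
	case h1Count > 1:
		issues = append(issues, fmt.Sprintf("%d H1 headings found, expected 1", h1Count))
	}
	return issues
}

func main() {
	for _, issue := range headingIssues([]int{1, 2, 4, 2, 3}) {
		fmt.Println(issue) // prints "heading 3 jumps from H2 to H4"
	}
}
```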

Images & Media

  • Alt text validation
  • Broken image detection
  • Large image identification (>500KB)
  • Modern format usage (WebP, AVIF)
  • Responsive image detection (srcset)
  • Total image count and optimization suggestions
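
The image checks above could be expressed roughly as follows. This is a minimal sketch: the `image` struct, the function names, and the reading of "500KB" as 500 KiB are assumptions for illustration, and an empty alt attribute is treated the same as a missing one.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// largeImageBytes is the size threshold; reading "500KB" as 500 KiB is an assumption.
const largeImageBytes = 500 * 1024

// image is a hypothetical record of what was collected for one <img> element.
type image struct {
	Src       string
	Alt       string
	HasSrcset bool
	Bytes     int64
}

// isModernFormat reports whether the image URL path ends in .webp or .avif.
func isModernFormat(src string) bool {
	u, err := url.Parse(src)
	if err != nil {
		return false
	}
	switch strings.ToLower(path.Ext(u.Path)) {
	case ".webp", ".avif":
		return true
	}
	return false
}

// imageFindings lists the checks that one image fails.
func imageFindings(img image) []string {
	var findings []string
	if strings.TrimSpace(img.Alt) == "" {
		findings = append(findings, "missing or empty alt text")
	}
	if img.Bytes > largeImageBytes {
		findings = append(findings, fmt.Sprintf("large image (%d KB)", img.Bytes/1024))
	}
	if !isModernFormat(img.Src) {
		findings = append(findings, "not served as WebP or AVIF")
	}
	if !img.HasSrcset {
		findings = append(findings, "no srcset for responsive loading")
	}
	return findings
}

func main() {
	img := image{Src: "/img/hero.png", Bytes: 600 * 1024}
	for _, f := range imageFindings(img) {
		fmt.Println(f)
	}
}
```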

Links Analysis

  • Internal vs external link counts
  • Broken link detection (sampled)
  • Anchor text validation
  • NoFollow link detection
  • Security check for target="_blank" links
  • Relative vs absolute URL analysis
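
Two of the link checks lend themselves to a short sketch using only the standard library: classifying a link as internal or external, and flagging `target="_blank"` links that lack `rel="noopener"` or `rel="noreferrer"`. The function names are hypothetical, and treating subdomains as external is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isInternal resolves href against the page URL and reports whether the
// result points at the same host. Subdomains count as external here.
func isInternal(pageURL, href string) (bool, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return false, err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return false, err
	}
	target := base.ResolveReference(ref)
	return strings.EqualFold(target.Hostname(), base.Hostname()), nil
}

// unsafeBlankTarget reports whether a link opens a new tab without
// noopener or noreferrer in its rel attribute.
func unsafeBlankTarget(target, rel string) bool {
	if !strings.EqualFold(target, "_blank") {
		return false
	}
	for _, token := range strings.Fields(strings.ToLower(rel)) {
		if token == "noopener" || token == "noreferrer" {
			return false
		}
	}
	return true
}

func main() {
	internal, _ := isInternal("https://example.com/blog/", "../about")
	fmt.Println(internal)                                  // true
	fmt.Println(unsafeBlankTarget("_blank", "nofollow"))   // true
	fmt.Println(unsafeBlankTarget("_blank", "noopener"))   // false
}
```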

Performance & Speed

  • Page load time measurement
  • Page size analysis
  • HTTPS usage verification
  • Compression detection (gzip/brotli)
  • Caching headers analysis
  • Resource count
  • Minification checks for CSS/JS
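
The compression and caching checks come down to reading response headers. The sketch below shows one way to interpret `Content-Encoding` and `Cache-Control`; the function names and the three caching labels are assumptions for illustration. Note that Go's default HTTP transport decompresses gzip transparently and removes the `Content-Encoding` header when it added `Accept-Encoding` itself, so a checker would likely need to set `Accept-Encoding` explicitly or inspect `Response.Uncompressed`.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// compressionOf maps a Content-Encoding value to "gzip", "brotli", or ""
// when no recognized compression is present.
func compressionOf(contentEncoding string) string {
	for _, enc := range strings.Split(contentEncoding, ",") {
		switch strings.ToLower(strings.TrimSpace(enc)) {
		case "gzip", "x-gzip":
			return "gzip"
		case "br":
			return "brotli"
		}
	}
	return ""
}

// cachingPolicy summarizes a Cache-Control value as "missing", "disabled"
// (no-store), or "enabled". The labels are illustrative.
func cachingPolicy(cacheControl string) string {
	if strings.TrimSpace(cacheControl) == "" {
		return "missing"
	}
	for _, d := range strings.Split(cacheControl, ",") {
		if strings.ToLower(strings.TrimSpace(d)) == "no-store" {
			return "disabled"
		}
	}
	return "enabled"
}

func main() {
	// Headers are built by hand here so the example runs without network access.
	h := http.Header{}
	h.Set("Content-Encoding", "br")
	h.Set("Cache-Control", "public, max-age=3600")
	fmt.Println(compressionOf(h.Get("Content-Encoding")), cachingPolicy(h.Get("Cache-Control")))
	// prints "brotli enabled"
}
```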

Installation

Prerequisites

  • Go 1.21 or higher

Build from Source

# Clone the repository
git clone <repository-url>
cd seo-crawler

# Install dependencies
go mod download

# Build the binary
go build -o seo-crawler cmd/crawler/main.go

Usage

Basic Usage

# Run with HTTPS URL
./seo-crawler https://example.com

# Run with domain (HTTPS assumed)
./seo-crawler example.com

# Using go run
go run cmd/crawler/main.go https://example.com
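
The "HTTPS assumed" behavior for bare domains can be sketched as below. `normalizeTarget` is a hypothetical name, and the real argument handling may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// normalizeTarget prepends https:// when the argument has no scheme
// and rejects input without a host.
func normalizeTarget(arg string) (string, error) {
	arg = strings.TrimSpace(arg)
	if arg == "" {
		return "", errors.New("empty URL")
	}
	if !strings.Contains(arg, "://") {
		arg = "https://" + arg
	}
	u, err := url.Parse(arg)
	if err != nil {
		return "", err
	}
	if u.Host == "" {
		return "", fmt.Errorf("no host in %q", arg)
	}
	return u.String(), nil
}

func main() {
	for _, arg := range []string{"example.com", "https://example.com"} {
		target, err := normalizeTarget(arg)
		fmt.Println(target, err) // both print "https://example.com <nil>"
	}
}
```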

Example Output

Analyzing https://example.com...

⏳ Fetching page...
✓ Page fetched (0.45s, 12.3 KB)

⏳ Analyzing meta tags...
✓ Meta analysis complete (Score: 22/25)

⏳ Analyzing images...
✓ Image analysis complete (Score: 18/25)

⏳ Analyzing links...
✓ Link analysis complete (Score: 25/25)

⏳ Analyzing performance...
✓ Performance analysis complete (Score: 20/25)

⏳ Generating HTML report...
✓ Report generated: seo-report-2026-05-12-180500.html

═══════════════════════════════════════
  Overall Score: 85/100
═══════════════════════════════════════
  Meta & On-Page: 22/25
  Images & Media: 18/25
  Links:          25/25
  Performance:    20/25
═══════════════════════════════════════

  Issues: 3 errors, 6 warnings, 3 info

Open seo-report-2026-05-12-180500.html in your browser to view the full report.
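
The report file name in the example output appears to follow a `seo-report-YYYY-MM-DD-HHMMSS.html` pattern. A sketch of producing that name with Go's reference-time layout follows; `reportName` and `sampleTime` are hypothetical, and the use of UTC is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// sampleTime mirrors the timestamp shown in the example output.
var sampleTime = time.Date(2026, time.May, 12, 18, 5, 0, 0, time.UTC)

// reportName builds a timestamped file name. The layout string uses Go's
// reference time: 2006 (year), 01 (month), 02 (day), 15 04 05 (h, m, s).
func reportName(t time.Time) string {
	return "seo-report-" + t.Format("2006-01-02-150405") + ".html"
}

func main() {
	fmt.Println(reportName(sampleTime)) // seo-report-2026-05-12-180500.html
}
```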

Report Features

The generated HTML report includes:

  • Responsive Design: Works on desktop, tablet, and mobile
  • Overall Score: Out of 100 points across all categories
  • Category Breakdown: Each category scored out of 25 points
  • Color-Coded Issues:
    • Red = Errors (critical issues)
    • Yellow = Warnings (important improvements)
    • Blue = Info (suggestions)
  • Detailed Statistics: Key metrics for each category
  • Actionable Suggestions: Specific recommendations for each issue
  • Print-Friendly: Optimized for printing and PDF export

Scoring System

  • Total Score: 100 points (25 per category)
  • Error: -3 points each
  • Warning: -1 point each
  • Info: No deduction (informational only)
  • Minimum: 0 points per category
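
The rules above translate into a small calculation per category. The sketch below follows them directly; the function names are hypothetical and the issue counts in `main` are made-up inputs.

```go
package main

import "fmt"

const maxCategoryScore = 25

// categoryScore starts at 25, subtracts 3 per error and 1 per warning,
// and never goes below 0. Info items do not affect the score.
func categoryScore(errors, warnings int) int {
	score := maxCategoryScore - 3*errors - warnings
	if score < 0 {
		return 0
	}
	return score
}

// overallScore sums the category scores (four categories give a 100-point scale).
func overallScore(categories ...int) int {
	total := 0
	for _, c := range categories {
		total += c
	}
	return total
}

func main() {
	meta := categoryScore(1, 0)   // 22
	images := categoryScore(2, 1) // 18
	links := categoryScore(0, 0)  // 25
	perf := categoryScore(1, 2)   // 20
	fmt.Println(overallScore(meta, images, links, perf)) // 85
}
```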

Score Interpretation

  • 90-100: Excellent
  • 70-89: Good
  • 50-69: Fair
  • 0-49: Poor
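
The bands above map onto a simple lookup. `rating` is a hypothetical helper name used for illustration.

```go
package main

import "fmt"

// rating converts an overall score (0-100) into its label.
func rating(score int) string {
	switch {
	case score >= 90:
		return "Excellent"
	case score >= 70:
		return "Good"
	case score >= 50:
		return "Fair"
	default:
		return "Poor"
	}
}

func main() {
	fmt.Println(rating(85)) // Good
}
```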

Project Structure

seo-crawler/
├── cmd/
│   └── crawler/
│       └── main.go              # CLI entry point
├── internal/
│   ├── crawler/
│   │   └── crawler.go           # Page fetching logic
│   ├── analyzer/
│   │   ├── meta.go              # Meta tags analysis
│   │   ├── images.go            # Images analysis
│   │   ├── links.go             # Links analysis
│   │   └── performance.go       # Performance analysis
│   ├── models/
│   │   └── report.go            # Data structures
│   └── reporter/
│       ├── reporter.go          # Report generation
│       └── templates/
│           └── report.html      # HTML template
├── go.mod
├── go.sum
└── README.md

Dependencies

  • colly - Web scraping framework
  • goquery - HTML parsing and manipulation

Limitations

  • Single-page analysis only (does not crawl an entire site)
  • Image and link checks are sampled (first 10) for performance
  • Requires internet connection to analyze remote pages
  • Some checks may time out on slow servers
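
The "first 10" sampling can be pictured as a simple slice cap. The generic helper below is a sketch with a hypothetical name; it only works on Go 1.18 or later, which the stated Go 1.21 prerequisite satisfies.

```go
package main

import "fmt"

// sample returns at most the first n items, leaving shorter slices untouched.
func sample[T any](items []T, n int) []T {
	if n < 0 {
		n = 0
	}
	if len(items) <= n {
		return items
	}
	return items[:n]
}

func main() {
	links := make([]string, 25)
	fmt.Println(len(sample(links, 10))) // 10
}
```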

Future Enhancements

  • Full site crawl capability
  • JSON export option
  • Comparison between multiple URLs
  • Historical tracking and trend analysis
  • API mode for integration
  • Configurable timeout and sample sizes
  • Parallel page analysis

License

MIT License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues, questions, or suggestions, please open an issue on the repository.
