Hercules

Disclaimer: This project originates from src-d/hercules, but follows a completely different concept and architecture. It is not a drop-in replacement and is maintained independently.

Hercules is a cloud-native platform for analyzing Git repository history. It runs analyses through a CLI or REST/gRPC APIs and supports large deployments on Kubernetes with S3-compatible caching.


🚀 Features

  • S3 & Multi-Backend Cache: Pluggable cache (S3, MinIO, local, memory) for horizontal scaling and cost efficiency
  • REST & gRPC APIs: Run analyses via HTTP or gRPC
  • Modern CI/CD: GitHub Actions, multi-arch Docker, security scanning, release automation
  • Cloud-Native: Distroless Docker, Kubernetes, Helm, autoscaling, health checks
  • Extensible: Add new analyses, plug in custom cache backends
  • Production Ready: IAM/secret support, resource limits, observability, best practices

🏗️ Quick Start

Docker (Distroless)

docker build -t hercules:latest .
docker run -p 8080:8080 hercules:latest

Docker Compose (with MinIO S3 cache)

docker-compose up -d
# Hercules: http://localhost:8080
# MinIO Console: http://localhost:9001 (minioadmin/minioadmin)

Kubernetes

kubectl apply -f k8s/deployment.yaml
kubectl get pods -l app=hercules

Helm

helm repo add hercules https://dmytrogajewski.github.io/hercules
helm install hercules hercules/hercules

🖥️ Command-Line Usage

Hercules can also be used as a CLI tool for direct repository analysis, automation, and scripting.

Basic Analysis

# Analyze a repository for burndown statistics
hercules --burndown https://github.com/dmytrogajewski/hercules.git

# Multiple analyses in one run
hercules --burndown --couples --devs https://github.com/dmytrogajewski/hercules.git

Custom Options

# Custom tick size, granularity, and sampling
hercules --burndown --tick-size 12 --granularity 15 --sampling 10 https://github.com/dmytrogajewski/hercules.git

Output Formats

# Output as JSON
hercules --burndown --json https://github.com/dmytrogajewski/hercules.git > result.json

# Output as YAML
hercules --burndown --yaml https://github.com/dmytrogajewski/hercules.git > result.yaml

# Output as Protobuf
hercules --burndown --pb https://github.com/dmytrogajewski/hercules.git > result.pb

Caching (Local/S3)

# Use local cache directory
hercules --burndown --cache /tmp/hercules-cache https://github.com/dmytrogajewski/hercules.git

# Use S3 cache (set via config or env)
HERCULES_CACHE_BACKEND=s3 HERCULES_CACHE_S3_BUCKET=my-bucket hercules --burndown https://github.com/dmytrogajewski/hercules.git

UAST Development Server

The UAST binary includes a development server for interactive UAST mapping development:

# Start the development server with web UI
uast server --static web --port 8080

# Start server without static files (API only)
uast server --port 8080

# Start on a custom port, serving static files from a custom path
uast server --port 9000 --static /path/to/web/files

The server provides:

  • API endpoints: /api/parse and /api/query for UAST operations
  • Web UI: Interactive development environment (when using --static)
  • Real-time parsing: Parse code in multiple languages
  • Query interface: Execute UAST queries with DSL

Automation Example

# Analyze all repos in a directory
for repo in ~/code/*/.git; do
  hercules --burndown --devs --json "$(dirname "$repo")" > "$(basename "$(dirname "$repo")").json"
done
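
When per-repository error handling or custom output naming is needed, the same loop can be driven from a script. The following is a minimal Python sketch, assuming `hercules` is on `PATH` and using only flags shown in this README (`--burndown`, `--devs`, `--json`). The helper names `find_repos`, `build_command`, and `analyze_all` are hypothetical and not part of Hercules.

```python
import subprocess
from pathlib import Path


def find_repos(root: Path) -> list[Path]:
    """Return directories directly under root that contain a .git entry."""
    return sorted(p.parent for p in root.glob("*/.git"))


def build_command(repo: Path, analyses: list[str]) -> list[str]:
    """Build the hercules argument list for one repository (JSON output)."""
    flags = [f"--{name}" for name in analyses]
    return ["hercules", *flags, "--json", str(repo)]


def analyze_all(root: Path, out_dir: Path, analyses: list[str]) -> None:
    """Run hercules on every repository under root, one JSON file per repo."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for repo in find_repos(root):
        out_file = out_dir / f"{repo.name}.json"
        with out_file.open("w") as fh:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(build_command(repo, analyses), stdout=fh, check=True)
```

A call such as `analyze_all(Path.home() / "code", Path("results"), ["burndown", "devs"])` mirrors the shell loop above, but stops at the first repository whose analysis fails instead of silently writing an empty file.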

Using Config Files, Env Vars, and Flags

  • Config file: hercules --config config.yaml --burndown ...
  • Environment: HERCULES_ANALYSIS_TIMEOUT=60m hercules --burndown ...
  • Flags: hercules --burndown --tick-size 12 ...
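
The environment variables that appear in this README (`HERCULES_CACHE_BACKEND`, `HERCULES_CACHE_S3_BUCKET`) line up with the config keys `cache.backend` and `cache.s3_bucket`. That suggests a naming rule: upper-case the dotted key path, replace dots with underscores, and prefix it with `HERCULES_`. This rule is inferred from those two examples and is not documented here, so treat the sketch below as an assumption. The helper names `env_name` and `flatten` are hypothetical.

```python
def env_name(config_key: str, prefix: str = "HERCULES") -> str:
    """Map a dotted config key such as 'cache.s3_bucket' to an env var name."""
    return f"{prefix}_{config_key.replace('.', '_').upper()}"


def flatten(config: dict, parent: str = "") -> dict[str, str]:
    """Flatten a nested config dict into {ENV_NAME: value} pairs."""
    out: dict[str, str] = {}
    for key, value in config.items():
        path = f"{parent}.{key}" if parent else key
        if isinstance(value, dict):
            out.update(flatten(value, path))
        elif isinstance(value, bool):
            # Render booleans in lowercase, as they appear in YAML.
            out[env_name(path)] = str(value).lower()
        else:
            out[env_name(path)] = str(value)
    return out
```

For example, `flatten({"cache": {"backend": "s3", "s3_bucket": "my-bucket"}})` yields the two variables used in the S3 cache command above.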

Supported Analyses & Options

Flag               Description                  Options (flags/env/config)
--burndown         Line burndown statistics     --tick-size, --granularity, --sampling
--couples          File/developer coupling      --tick-size
--devs             Developer activity           --tick-size
--commits-stat     Commit statistics            -
--file-history     File history analysis        -
--imports-per-dev  Import usage per developer   -
--shotness         Structural hotness           -

CLI Help

hercules --help

⚡️ Example API Usage

curl -X POST http://localhost:8080/api/v1/analyze \
  -H "Content-Type: application/json" \
  -d '{
    "repository": "https://github.com/dmytrogajewski/hercules",
    "analyses": ["burndown"],
    "options": {}
  }'
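
The same request can be issued from a script. The sketch below uses only the Python standard library and reproduces the endpoint, header, and JSON fields of the curl example. The function names `build_request` and `analyze` are hypothetical. Because the response schema is not described in this README, the body is returned as raw bytes.

```python
import json
import urllib.request
from typing import Optional

API_URL = "http://localhost:8080/api/v1/analyze"  # endpoint from the curl example


def build_request(
    repository: str, analyses: list[str], options: Optional[dict] = None
) -> urllib.request.Request:
    """Build the POST request; nothing is sent until urlopen is called."""
    payload = {
        "repository": repository,
        "analyses": analyses,
        "options": options or {},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(repository: str, analyses: list[str], timeout: float = 60.0) -> bytes:
    """Send the request and return the raw, undecoded response body."""
    request = build_request(repository, analyses)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before any network call is made.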

🔒 S3/MinIO/Cloud Cache

  • Any S3-compatible backend: AWS S3, MinIO, DigitalOcean Spaces, Wasabi, Backblaze, etc.
  • Configurable via YAML, env, or CLI
  • IAM role or secret support

Example config:

cache:
  enabled: true
  backend: "s3"
  s3_bucket: "hercules-cache"
  s3_region: "us-east-1"
  s3_endpoint: "http://minio:9000" # for MinIO
  s3_prefix: "hercules"
  ttl: "168h"
  aws_access_key_id: "minioadmin"
  aws_secret_access_key: "minioadmin"
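
The `ttl` value of "168h" corresponds to 7 days (168 / 24 = 7). It is written in hours presumably because Go's duration syntax has no day unit. That `ttl` is parsed as a Go duration is an assumption based on Hercules being a Go application and is not confirmed here. The sketch below checks the arithmetic with a simplified parser that handles only the `h`, `m`, `s`, and `ms` units. `parse_duration` is a hypothetical helper, not part of Hercules.

```python
import re
from datetime import timedelta

# Seconds per unit for the subset of Go duration units handled by this sketch.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3}
# "ms" is listed before "m" so that "500ms" is not read as minutes.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '168h' or '1h30m' into a timedelta."""
    pos, seconds = 0, 0.0
    for match in _TOKEN.finditer(text):
        if match.start() != pos:
            # Unrecognized characters between tokens.
            raise ValueError(f"invalid duration: {text!r}")
        seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if not text or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(seconds=seconds)
```

With this helper, `parse_duration("168h") == timedelta(days=7)` holds, while a value such as "7d" raises `ValueError`.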

🛠️ Architecture

  • Stateless server: All state in cache (S3/local/memory)
  • Async job queue: Scalable analysis jobs
  • Pluggable pipeline: Add new analyses easily
  • Modern Go: Go 1.24+, AWS SDK v2, Gorilla Mux, Viper
  • Distroless container: Minimal, secure, non-root

📁 Project Structure

This project follows the Standard Go Project Layout:

hercules/
├── api/                    # Protocol definitions and schemas
│   └── proto/             # Protocol buffer files
├── build/                 # Build and CI/CD artifacts
│   ├── bin/              # Compiled binaries
│   ├── scripts/          # Build and maintenance scripts
│   └── tools/            # Build tools
├── cmd/                   # Main applications
│   ├── herr/             # Hercules analyzer binary
│   └── uast/             # UAST parser binary
├── configs/               # Configuration templates
├── deployments/           # Deployment configurations
│   ├── docker/           # Docker configurations
│   ├── k8s/              # Kubernetes manifests
│   └── helm/             # Helm charts
├── docs/                  # Documentation
├── examples/              # Examples and plugins
├── internal/              # Private application code
│   ├── app/              # Application-specific code
│   ├── pkg/              # Private libraries
│   └── server/           # Server implementations
├── pkg/                   # Public libraries
│   ├── analyzers/        # Code analysis tools
│   └── uast/             # UAST parsing and manipulation
├── test/                  # Test data and benchmarks
│   ├── data/             # Test data
│   ├── fixtures/         # Test fixtures
│   └── benchmarks/       # Benchmark results
└── third_party/           # Third-party dependencies
    ├── grammars/         # Language grammars
    └── go-sitter-forest/ # Tree-sitter grammars

Key packages:

  • pkg/analyzers/: Public code analysis APIs
  • pkg/uast/: Universal Abstract Syntax Tree parsing and manipulation
  • internal/app/core/: Core application logic and pipeline
  • internal/pkg/: Private utility libraries
  • cmd/herr/: Main Hercules analyzer binary
  • cmd/uast/: UAST parser binary

🧪 CI/CD

  • GitHub Actions:
    • Lint, test, coverage, security scan
    • Integration test with MinIO
    • Multi-arch Docker build & push
    • Release binaries for all major OS/arch on tag

🏛️ Project History

This project was originally forked from src-d/hercules, but has since been completely re-architected for cloud-native, high-scale use cases. It is neither API-compatible nor feature-compatible with the original src-d/hercules.


📝 License

Apache 2.0

Embedded UAST Provider

Hercules supports an Embedded UAST Provider as a drop-in alternative to Babelfish for UAST-based analyses. This allows you to run structural code analyses offline, in CI, or in restricted environments without a running Babelfish server.

Key points:

  • The embedded provider uses built-in parsers (currently Go's standard library) to generate UASTs for supported languages.
  • Enable it with the CLI flag:
    ./hercules --shotness --uast-provider=embedded <repo>
  • If a file's language is unsupported, Hercules will skip it or warn, but will not fail the analysis.
  • The default provider is still Babelfish. You can switch back at any time with --uast-provider=babelfish.

Supported languages:

  • Go (via GoEmbeddedProvider)

Planned (via Tree-sitter):

  • Java, Kotlin, Swift, JavaScript/TypeScript/React/Angular, Rust, PHP, Python

Roadmap:

  • See docs/UAST_PROVIDER_ROADMAP.md for progress and planned language support.

If you want to contribute support for more languages, see the roadmap and open a PR!
