Automated content curation, LLM-powered analysis, and blog generation for modern growth teams
Workflows • Features • Quick Start • Deployment • Development
Workflow A: GitHub Issues Sync

Status: ✅ Active | Purpose: Sync GitHub issues to local storage
# Manual execution
uv run python scripts/sync_github_issues.py

Features:
- GitHub CLI wrapper (`gh issue list`)
- Timestamp-based upsert logic
- Issue state tracking (open/closed)
- Atomic file operations
Output: `data/github/issues.jsonl`
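The actual logic lives in `scripts/sync_github_issues.py`; as a rough sketch of the upsert idea only (field names such as `number` and `updatedAt` follow the `gh` JSON output and are illustrative, not copied from the script):

```python
import json
import os
import tempfile
from pathlib import Path

ISSUES_PATH = Path("data/github/issues.jsonl")

def upsert_issues(fetched: list[dict]) -> None:
    """Merge freshly fetched issues into the local cache, keyed by issue number."""
    existing: dict[int, dict] = {}
    if ISSUES_PATH.exists():
        with ISSUES_PATH.open() as f:
            for line in f:
                issue = json.loads(line)
                existing[issue["number"]] = issue

    for issue in fetched:
        old = existing.get(issue["number"])
        # Timestamp-based upsert: keep whichever copy was updated more recently.
        if old is None or issue["updatedAt"] > old["updatedAt"]:
            existing[issue["number"]] = issue

    # Atomic write: dump everything to a temp file, then rename over the original.
    fd, tmp = tempfile.mkstemp(dir=ISSUES_PATH.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        for issue in existing.values():
            f.write(json.dumps(issue, ensure_ascii=False) + "\n")
    os.replace(tmp, ISSUES_PATH)
```

Because the whole cache is rewritten to a temp file and renamed into place, an interrupted run never leaves `issues.jsonl` half-written.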
Workflow B: Content Intelligence & Blog Generation

Status: ✅ Active | Purpose: Ingest, curate, and generate content
# Manual execution
uv run python -m growth_agent.main run workflow-b

Three-Stage Pipeline:
1. Ingestion Stage
   - Fetch from X/Twitter creators (20 tweets per creator)
   - Fetch from RSS feeds (20 articles per feed)
   - Store in `data/inbox/items.jsonl`
   - Index in LanceDB for semantic search
2. Curation Stage
   - LLM evaluates each item (score 0-100)
   - Filter by minimum score (default: 60)
   - Select top-K items (default: 10)
   - Store in `data/curated/{date}_ranked.jsonl`
3. Generation Stage
   - LLM generates blog post from curated items
   - YAML frontmatter with metadata
   - Save as `data/blogs/{ID}_{slug}.md`
Output:
- `data/inbox/items.jsonl`
- `data/curated/{YYYY-MM-DD}_ranked.jsonl`
- `data/blogs/*.md`
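For orientation, a generated post in `data/blogs/` starts with YAML frontmatter roughly like the following (the values are made up; the fields mirror the blog post schema documented later in this README):

```markdown
---
id: "3fa85f64"
slug: "example-post-slug"
title: "Example Post Title"
date: "2026-02-05"
summary: "A 50-300 character summary of what the post covers."
tags: ["growth", "ai"]
author: "growth-agent"
---

Post body in GitHub-flavored markdown...
```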
Workflow C: Metrics Tracking

Status: ✅ Active | Purpose: Track engagement metrics across multiple platforms
# Manual execution - X/Twitter metrics
uv run python scripts/sync_metrics.py --source x
# Google Search Console metrics
uv run python scripts/sync_metrics.py --source gsc --days 7
# PostHog product analytics
uv run python scripts/sync_metrics.py --source posthog --days 1
# Sync all data sources
uv run python scripts/sync_metrics.py --source all

Features:
- X/Twitter: Fetch latest tweets and engagement metrics (likes, retweets, replies)
- Google Search Console: Search analytics, CTR, ranking positions, Core Web Vitals
- PostHog: User behavior events, insights, funnels, feature flags
- Separate JSONL files per platform (`stats.jsonl`, `gsc_stats.jsonl`, `posthog_stats.jsonl`)
- Overwrite mode (keeps latest data only)
Output:
- `data/metrics/stats.jsonl` - X/Twitter metrics
- `data/metrics/gsc_stats.jsonl` - Google Search Console data
- `data/metrics/posthog_stats.jsonl` - PostHog analytics data
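Overwrite mode means each platform file is a snapshot, not a log: every sync rewrites the file with only the latest data. A minimal sketch of that storage behavior (the record fields shown are hypothetical, not the real schema):

```python
import json
from pathlib import Path

METRICS_DIR = Path("data/metrics")

def write_platform_stats(filename: str, records: list[dict]) -> None:
    """Overwrite-mode storage: each platform file holds only the most recent sync."""
    METRICS_DIR.mkdir(parents=True, exist_ok=True)
    with (METRICS_DIR / filename).open("w") as f:  # "w" truncates the old snapshot
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")

# Hypothetical record shape, one file per platform as listed above.
write_platform_stats("stats.jsonl", [{"tweet_id": "1", "likes": 42, "retweets": 3}])
```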
Workflow D: PuppyOne Social Listener

Status: ✅ Integrated | Purpose: Discover daily social opportunities and blog ideas, optionally render images, and post to Discord
# Initialize the default social listener configs
python -m growth_agent.main init
# Run the social listener manually
python -m growth_agent.main run workflow-d
# Handle x1 / b1 style image regeneration commands
python -m growth_agent.main social-reply x1
python -m growth_agent.main social-reply b1 --force

What it does:
- Fetches RSS / X-RSS sources from `data/social_listener/config/sources.json`
- Fetches blog-material sources from `data/social_listener/config/blog_sources.json`
- Scores social post opportunities and SEO blog ideas with PuppyOne-specific prompts
- Saves JSON / Markdown / text reports to `data/social_listener/reports/`
- Optionally renders top images via qwen-image-2.0
- Optionally sends a daily digest and top items to Discord via webhook
- Multi-Source Ingestion
  - X/Twitter creators via RapidAPI
  - RSS feed subscriptions
  - LanceDB vector indexing for semantic search
- AI-Powered Curation
  - LLM-based content evaluation and scoring
  - Quality filtering (configurable thresholds)
  - Top-K selection for high-value content
- Automated Blog Generation
  - YAML frontmatter with metadata
  - GitHub-flavored markdown output
  - Daily scheduled execution (8 AM Beijing)
- GitHub CLI integration
- Automatic issue synchronization
- Timestamp-based upsert logic
- Local caching with JSONL storage
- X/Twitter: Engagement metrics (likes, retweets, replies, impressions)
- Google Search Console: SEO performance, search analytics, Core Web Vitals
- PostHog: Product analytics, user events, funnels, insights
- Separate storage per platform for efficient querying
- OAuth 2.0 and API Key authentication support
- Configuration: Pydantic-settings with environment variables (see the sketch after this list)
- Storage: File-system database with JSONL format (separate per platform)
- Scheduling: Linux cron jobs for production deployments
- Logging: Structured logging to files and console
- Security: Atomic file operations, OAuth 2.0, API Key authentication
- Multi-Platform: X/Twitter, GitHub, Google Search Console, PostHog integration
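Configuration is loaded with pydantic-settings from the environment and `.env` file; a minimal sketch of how the documented variables could map onto a settings object (the real `config.py` will differ in detail):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    """Illustrative settings class; field names mirror the .env keys documented below."""
    model_config = SettingsConfigDict(env_file=".env", extra="ignore")

    x_rapidapi_key: str
    openrouter_api_key: str
    github_token: str | None = None
    llm_model: str = "anthropic/claude-3.5-sonnet"
    llm_temperature: float = 0.3
    llm_max_tokens: int = 2000

settings = Settings()  # values are read from the environment and .env
```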
- Python 3.10 or higher
- uv (recommended) or pip
- API Keys:
- X/Twitter RapidAPI Key
- OpenRouter API Key
- GitHub Token (optional, for Workflow A)
# Clone the repository
git clone https://github.com/HYPERVAPOR/growth-agent.git
cd growth-agent
# Install dependencies with uv (recommended)
uv sync
# Or with pip
pip install -e .

# Copy environment template
cp .env.example .env
# Edit configuration
vim .env

Required environment variables:
# API Keys
X_RAPIDAPI_KEY=your_x_api_key_here
OPENROUTER_API_KEY=your_openrouter_key_here
# Optional - Workflow A (GitHub)
GITHUB_TOKEN=your_github_token_here
REPO_PATH=puppyone-ai/puppyone
# Optional - Workflow C (Google Search Console)
GSC_ENABLED=true
GSC_SITE_URL=https://example.com
# Option 1: Use service account file
GSC_SERVICE_ACCOUNT_PATH=path/to/service-account.json
# Option 2: Use environment variables (recommended for deployments)
GSC_CLIENT_EMAIL=your-service-account@project-id.iam.gserviceaccount.com
GSC_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
# Optional - Workflow C (PostHog)
POSTHOG_ENABLED=true
POSTHOG_API_KEY=phx_your_project_api_key_here # Use Project API Key, not Personal
POSTHOG_HOST=app.posthog.com
POSTHOG_PROJECT_ID=your_project_id
# Optional - Workflow D (PuppyOne Social Listener)
SOCIAL_LISTENER_ENABLED=true
SOCIAL_LISTENER_DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
SOCIAL_LISTENER_RENDER_IMAGES=false
SOCIAL_LISTENER_IMAGE_COUNT=1
# Optional - qwen-image-2.0 rendering
DASHSCOPE_API_KEY=your_dashscope_api_key_here
DASHSCOPE_BASE_URL=https://dashscope.aliyuncs.com/api/v1
# LLM Configuration
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=2000

X/Twitter RapidAPI:
- Visit RapidAPI
- Subscribe to Twitter API v2
- Copy your API key to `.env`
OpenRouter:
- Visit OpenRouter
- Create an account and get API key
- Add to `.env`
Google Search Console:
- Create Google Cloud Project
- Enable Search Console API
- Create service account with JSON key
- Add service account email to GSC property permissions
- Configure in `.env` (see above)
PostHog:
- Login to PostHog
- Navigate to Settings → Project → API Keys
- Copy Project API Key (not Personal API Key)
- Add to `.env`
# Initialize data directory
uv run python -m growth_agent.main init
# Add subscriptions
vim data/subscriptions/x_creators.jsonl
vim data/subscriptions/rss_feeds.jsonl
# Run Workflow B immediately
uv run python -m growth_agent.main run workflow-b
# Start scheduler daemon (Ctrl+C to stop)
uv run python -m growth_agent.main schedule
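The `schedule` command runs an APScheduler-based daemon (`core/scheduler.py`). A minimal sketch of the documented daily 8 AM Beijing run of Workflow B, with the job wiring simplified and the import path assumed rather than taken from the source:

```python
from apscheduler.schedulers.blocking import BlockingScheduler

from growth_agent.workflows.workflow_b import WorkflowB  # import path assumed

def run_workflow_b() -> None:
    WorkflowB().run()  # hypothetical entry point; the real call may differ

scheduler = BlockingScheduler(timezone="Asia/Shanghai")
# Daily execution at 8 AM Beijing time, matching the documented schedule.
scheduler.add_job(run_workflow_b, "cron", hour=8, minute=0)
scheduler.start()  # blocks until interrupted (Ctrl+C)
```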
growth-agent/
├── src/growth_agent/
│   ├── core/                      # Core infrastructure
│   │   ├── schema.py              # Pydantic data models
│   │   ├── storage.py             # File-system database
│   │   ├── llm.py                 # LLM client (OpenRouter)
│   │   ├── vector_store.py        # LanceDB integration
│   │   ├── logging.py             # Logging configuration
│   │   └── scheduler.py           # APScheduler setup
│   ├── workflows/                 # Workflow orchestration
│   │   ├── base.py                # Abstract workflow base
│   │   ├── workflow_a.py          # GitHub sync
│   │   ├── workflow_b.py          # Content intelligence
│   │   └── workflow_c.py          # Metrics tracking
│   ├── ingestors/                 # Data ingestion
│   │   ├── x_twitter.py           # X/Twitter API client
│   │   ├── rss_feed.py            # RSS feed parser
│   │   ├── github.py              # GitHub CLI wrapper
│   │   ├── metrics.py             # Metrics collector (X/Twitter)
│   │   ├── gsc_search_console.py  # Google Search Console API
│   │   └── posthog.py             # PostHog analytics API
│   ├── processors/                # Data processing
│   │   ├── curator.py             # LLM content evaluator
│   │   ├── ranker.py              # Content ranking
│   │   └── blog_generator.py      # Blog post generator
│   ├── config.py                  # Configuration management
│   └── main.py                    # CLI entry point
├── data/                          # File-system database
│   ├── subscriptions/             # X/RSS subscriptions
│   ├── inbox/                     # Raw ingested items
│   ├── curated/                   # LLM-evaluated content
│   ├── blogs/                     # Generated blog posts
│   ├── github/                    # GitHub issues cache
│   ├── metrics/                   # Social media metrics
│   ├── logs/                      # Execution logs
│   └── index/                     # LanceDB vector store
├── scripts/                       # Utility scripts
│   ├── sync_github_issues.py      # Manual Workflow A trigger
│   ├── sync_metrics.py            # Manual Workflow C trigger
│   └── test_posthog.py            # PostHog API validation
├── tests/                         # Test suite
├── pyproject.toml                 # Project configuration
└── .env.example                   # Environment template
1. Clone & Install
# Clone repository
git clone https://github.com/HYPERVAPOR/growth-agent.git
cd growth-agent
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync
# Initialize data directory
uv run python -m growth_agent.main init

2. Configure Environment
# Copy environment template
cp .env.example .env
# Edit configuration (add API keys)
vim .env

Required environment variables:
# API Keys
X_RAPIDAPI_KEY=your_x_api_key_here
OPENROUTER_API_KEY=your_openrouter_key_here
# Optional - Workflow A (GitHub)
GITHUB_TOKEN=your_github_token_here
REPO_PATH=puppyone-ai/puppyone
# Optional - Workflow C (GSC & PostHog)
GSC_ENABLED=true
GSC_SITE_URL=https://example.com
GSC_CLIENT_EMAIL=your-service-account@project-id.iam.gserviceaccount.com
GSC_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
POSTHOG_ENABLED=true
POSTHOG_API_KEY=phx_your_project_api_key_here
POSTHOG_HOST=app.posthog.com
POSTHOG_PROJECT_ID=your_project_id
# LLM Configuration
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=2000

3. Setup Cron Jobs
# Edit crontab
crontab -e

Add the following cron jobs:
# Workflow A: GitHub Issues Sync (every 2 hours)
0 */2 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_github_issues.py >> data/logs/cron_workflow_a.log 2>&1
# Workflow B: Content Intelligence & Blog Generation (daily at 8 AM)
0 8 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python -m growth_agent.main run workflow-b >> data/logs/cron_workflow_b.log 2>&1
# Workflow C: X/Twitter Metrics (every 6 hours)
0 */6 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source x >> data/logs/cron_workflow_c.log 2>&1
# Workflow C: Google Search Console (daily at 9 AM)
0 9 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source gsc --days 7 >> data/logs/cron_workflow_c.log 2>&1
# Workflow C: PostHog Analytics (every 6 hours)
0 */6 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source posthog --days 1 >> data/logs/cron_workflow_c.log 2>&1

Important:
- Replace `/path/to/growth-agent` with your actual project path
- Replace `/usr/local/bin/uv` with your uv executable path (find with `which uv`)
- Adjust schedule times based on your timezone and needs
- Logs are written to `data/logs/cron_workflow_*.log`
4. Verify Cron Jobs
# List current cron jobs
crontab -l
# Check cron service status
sudo systemctl status cron
# View cron logs (Ubuntu/Debian)
sudo grep CRON /var/log/syslog
# View application logs
tail -f data/logs/cron_workflow_b.log

5. Monitor Execution
# View workflow logs
tail -f data/logs/$(date +%Y-%m-%d).log
# View specific cron job logs
tail -f data/logs/cron_workflow_a.log # GitHub sync
tail -f data/logs/cron_workflow_b.log # Content intelligence
tail -f data/logs/cron_workflow_c.log # Metrics tracking
# Check last execution time
ls -lh data/blogs/ # Workflow B output
ls -lh data/metrics/ # Workflow C output
ls -lh data/github/ # Workflow A output

To update an existing deployment:

# Pull latest code
git pull origin main
# Reinstall dependencies (if needed)
uv sync
# Test workflows manually
uv run python -m growth_agent.main run workflow-b
uv run python scripts/sync_metrics.py --source all

If you prefer Docker over cron jobs:
# Build image
docker build -t growth-agent .
# Run with environment file
docker run -d \
--env-file .env \
-v $(pwd)/data:/app/data \
--name growth-agent \
growth-agent

# Install development dependencies
uv sync --all-extras
# Run tests
pytest
# Run with coverage
pytest --cov=src/growth_agent --cov-report=html
# View coverage report
open htmlcov/index.html

# Format code
black src/ tests/
# Check linting
ruff check src/ tests/
# Type checking
mypy src/

# Enable verbose logging
export LOG_LEVEL=DEBUG
# Run with verbose output
uv run python -m growth_agent.main run workflow-b --verbose

InboxItem: Base schema for all ingested content.
Fields:
- `id`: Unique identifier
- `source`: "x" or "rss"
- `content_type`: "post" or "article"
- `url`: Original URL
- `content`: Text content
- `author_name`: Author display name
- `title`: Content title
- `published_at`: ISO 8601 timestamp
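These map onto Pydantic models in `core/schema.py`; a rough reconstruction of the InboxItem shape from the fields above (types and defaults are assumptions, not the actual class):

```python
from datetime import datetime
from typing import Literal, Optional

from pydantic import BaseModel

class InboxItem(BaseModel):
    """Illustrative reconstruction of the ingested-content schema described above."""
    id: str
    source: Literal["x", "rss"]
    content_type: Literal["post", "article"]
    url: str
    content: str
    author_name: str
    title: Optional[str] = None   # assumed optional; tweets may not carry a title
    published_at: datetime        # ISO 8601 timestamp
```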
LLM-evaluated content with quality scores.
Fields:
- All InboxItem fields
- `score`: Quality rating (0-100)
- `summary`: AI-generated summary
- `comment`: AI evaluation comment
- `rank`: Position in ranked list
Generated blog post with YAML frontmatter.
Fields:
- `id`: Unique blog ID (UUID first 8 chars)
- `slug`: URL-friendly slug
- `title`: Blog title
- `date`: Publication date
- `summary`: Brief summary (50-300 chars)
- `tags`: List of tags
- `author`: Author name
- `content`: Markdown content
See data/schemas/ for detailed documentation.
JSONL (JSON Lines) provides:
- ✅ Simple version control with git
- ✅ Human-readable format
- ✅ Easy debugging and manual inspection
- ✅ No database dependencies
- ✅ Atomic writes prevent corruption
- ✅ AI-friendly structure for LLM analysis
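Because everything is plain JSONL, manual inspection needs nothing beyond the standard library, e.g. to eyeball a day's curated scores (the path and date are illustrative):

```python
import json

# Quick manual inspection: print score and title of a day's curated items.
with open("data/curated/2026-02-05_ranked.jsonl") as f:  # date is illustrative
    for line in f:
        item = json.loads(line)
        print(item["score"], item["title"])
```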
Edit your crontab:
crontab -e

Modify the cron schedule format: `minute hour day month weekday`
Examples:
- `0 8 * * *` - Daily at 8 AM
- `0 */6 * * *` - Every 6 hours
- `0 9 * * 1` - 9 AM every Monday
- Create Google Cloud Project
- Enable Search Console API
- Create service account & download JSON key
- Add service account email to GSC property permissions
- Configure in `.env` (use environment variables for security)
Helper script available:
# Create GSC credentials JSON from environment variables
uv run python scripts/create_gsc_creds.py
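In essence the helper assembles a service-account style JSON from the `GSC_*` variables; a simplified sketch (output file name and exact fields may differ from the real script):

```python
import json
import os

# Assemble a service-account style credentials file from the GSC_* variables,
# so the Search Console client can load it like a downloaded key file.
creds = {
    "type": "service_account",
    "client_email": os.environ["GSC_CLIENT_EMAIL"],
    "private_key": os.environ["GSC_PRIVATE_KEY"].replace("\\n", "\n"),
    "token_uri": "https://oauth2.googleapis.com/token",
}

with open("service-account.json", "w") as f:
    json.dump(creds, f, indent=2)
```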
If PostHog sync is not working, you're likely using a Personal API Key instead of a Project API Key.

Fix:
- Login to PostHog
- Go to Settings → Project → API Keys
- Copy the Project API Key (starts with `phx_`)
- Update `.env`: `POSTHOG_API_KEY=phx_...`
Verify:
uv run python scripts/test_posthog.py

Deduplication behavior per workflow:
- Workflow A (GitHub): Issue number as unique key, upsert based on `updated_at`
- Workflow B (Content): No deduplication (daily snapshots with timestamps)
- Workflow C (Metrics): Overwrite mode per platform (always latest data)
Yes! Add them to `data/subscriptions/x_creators.jsonl`:
{"id": "123456", "username": "elonmusk", "followers_count": 1000000, "subscribed_at": "2026-02-05T10:00:00Z", "last_fetched_at": null}
{"id": "789012", "username": "puppyone_ai", "followers_count": 1000, "subscribed_at": "2026-02-05T10:00:00Z", "last_fetched_at": null}

Yes! Manual execution:
# Workflow A
uv run python scripts/sync_github_issues.py
# Workflow B
uv run python -m growth_agent.main run workflow-b
# Workflow C (all platforms)
uv run python scripts/sync_metrics.py --source all
# Workflow C (specific platform)
uv run python scripts/sync_metrics.py --source gsc --days 7

MIT License - see LICENSE for details.
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- Email: support@hypervapor.com
- Issues: GitHub Issues
- Documentation: `data/schemas/`
Built with ❤️ by HYPERVAPOR