Growth Agent: AI-Powered Content Intelligence & Automated Blog Generation

Growth Agent

Python Version Code style: black License: MIT Powered by Claude AI Agent OpenRouter LanceDB

Automated content curation, LLM-powered analysis, and blog generation for modern growth teams

Workflows • Features • Quick Start • Deployment • Development


🔄 Workflows

Workflows Explained

📦 Workflow A: GitHub Quality Management

Status: ✅ Active | Purpose: Sync GitHub issues to local storage

# Manual execution
uv run python scripts/sync_github_issues.py

Features:

  • 🐙 GitHub CLI wrapper (gh issue list)
  • ⏰ Timestamp-based upsert logic
  • 📊 Issue state tracking (open/closed)
  • 🔒 Atomic file operations

Output: data/github/issues.jsonl
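The timestamp-based upsert keeps one record per issue number and only replaces a stored issue when GitHub reports a newer update. A minimal sketch of the idea (the function and the `updatedAt` field name are assumptions for illustration, not the repo's actual code):

```python
def upsert_issues(existing: list[dict], fetched: list[dict]) -> list[dict]:
    """Merge fetched issues into existing ones, keyed by issue number.

    A fetched issue replaces a stored one only if its updatedAt is newer.
    ISO 8601 UTC timestamps in the same format compare correctly as strings.
    """
    by_number = {issue["number"]: issue for issue in existing}
    for issue in fetched:
        current = by_number.get(issue["number"])
        if current is None or issue["updatedAt"] > current["updatedAt"]:
            by_number[issue["number"]] = issue
    return sorted(by_number.values(), key=lambda i: i["number"])
```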


🧠 Workflow B: Content Intelligence & Blog Creation

Status: ✅ Active | Purpose: Ingest, curate, and generate content

# Manual execution
uv run python -m growth_agent.main run workflow-b

Three-Stage Pipeline:

  1. 📥 Ingestion Stage

    • Fetch from X/Twitter creators (20 tweets per creator)
    • Fetch from RSS feeds (20 articles per feed)
    • Store in data/inbox/items.jsonl
    • Index in LanceDB for semantic search
  2. 🎯 Curation Stage

    • LLM evaluates each item (score 0-100)
    • Filter by minimum score (default: 60)
    • Select top-K items (default: 10)
    • Store in data/curated/{date}_ranked.jsonl
  3. ✍️ Generation Stage

    • LLM generates blog post from curated items
    • YAML frontmatter with metadata
    • Save as data/blogs/{ID}_{slug}.md

Output:

  • 📥 data/inbox/items.jsonl
  • 🎯 data/curated/{YYYY-MM-DD}_ranked.jsonl
  • ✍️ data/blogs/*.md
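The curation stage's filter-then-rank step amounts to a score threshold plus top-K selection. A sketch under the defaults above (min score 60, K = 10); the function and field names are hypothetical:

```python
def select_top(items: list[dict], min_score: int = 60, top_k: int = 10) -> list[dict]:
    """Drop items below the score threshold, then keep the K highest-scoring."""
    passing = [item for item in items if item.get("score", 0) >= min_score]
    ranked = sorted(passing, key=lambda item: item["score"], reverse=True)
    selected = ranked[:top_k]
    # Record each item's final position (1-based), like the CuratedItem rank field.
    for rank, item in enumerate(selected, start=1):
        item["rank"] = rank
    return selected
```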

📊 Workflow C: Social Media & Product Analytics Tracking

Status: ✅ Active | Purpose: Track engagement metrics across multiple platforms

# Manual execution - X/Twitter metrics
uv run python scripts/sync_metrics.py --source x

# Google Search Console metrics
uv run python scripts/sync_metrics.py --source gsc --days 7

# PostHog product analytics
uv run python scripts/sync_metrics.py --source posthog --days 1

# Sync all data sources
uv run python scripts/sync_metrics.py --source all

Features:

  • 🐦 X/Twitter: Fetch latest tweets and engagement metrics (likes, retweets, replies)
  • 🔍 Google Search Console: Search analytics, CTR, ranking positions, Core Web Vitals
  • 📊 PostHog: User behavior events, insights, funnels, feature flags
  • 💾 Separate JSONL files per platform (stats.jsonl, gsc_stats.jsonl, posthog_stats.jsonl)
  • 🔄 Overwrite mode (keeps latest data only)

Output:

  • data/metrics/stats.jsonl - X/Twitter metrics
  • data/metrics/gsc_stats.jsonl - Google Search Console data
  • data/metrics/posthog_stats.jsonl - PostHog analytics data
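Overwrite mode combined with atomic file operations is typically done by writing to a temporary file and renaming it into place, so readers never see a half-written stats file. A stdlib-only sketch of that pattern (the helper name is an assumption, not the repo's actual code):

```python
import json
import os
import tempfile
from pathlib import Path


def overwrite_jsonl(path: Path, records: list[dict]) -> None:
    """Atomically replace a JSONL file with the latest records."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    # Atomic on POSIX: readers see the old file or the new one, never a partial write.
    os.replace(tmp, path)
```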

📣 Workflow D: PuppyOne Social Listener

Status: ✅ Integrated | Purpose: Discover daily social opportunities and blog ideas, optionally render images, and post to Discord

# Initialize the default social listener configs
python -m growth_agent.main init

# Run the social listener manually
python -m growth_agent.main run workflow-d

# Handle x1 / b1 style image regeneration commands
python -m growth_agent.main social-reply x1
python -m growth_agent.main social-reply b1 --force

What it does:

  • Fetches RSS / X-RSS sources from data/social_listener/config/sources.json
  • Fetches blog-material sources from data/social_listener/config/blog_sources.json
  • Scores social post opportunities and SEO blog ideas with PuppyOne-specific prompts
  • Saves JSON / Markdown / text reports to data/social_listener/reports/
  • Optionally renders top images via qwen-image-2.0
  • Optionally sends a daily digest and top items to Discord via webhook

✨ Features

🧠 Workflow B - Content Intelligence & Blog Creation

  • 📥 Multi-Source Ingestion

    • 🔗 X/Twitter creators via RapidAPI
    • 📰 RSS feed subscriptions
    • 📊 LanceDB vector indexing for semantic search
  • 🎯 AI-Powered Curation

    • 🤖 LLM-based content evaluation and scoring
    • 📈 Quality filtering (configurable thresholds)
    • 🏆 Top-K selection for high-value content
  • ✍️ Automated Blog Generation

    • 📝 YAML frontmatter with metadata
    • 🎨 GitHub-flavored markdown output
    • 📅 Daily scheduled execution (8 AM Beijing)

🔧 Workflow A - GitHub Quality Management

  • 🐙 GitHub CLI integration
  • 🔄 Automatic issue synchronization
  • ⏱️ Timestamp-based upsert logic
  • 📂 Local caching with JSONL storage

📊 Workflow C - Multi-Platform Analytics Tracking

  • 🐦 X/Twitter: Engagement metrics (likes, retweets, replies, impressions)
  • 🔍 Google Search Console: SEO performance, search analytics, Core Web Vitals
  • 📊 PostHog: Product analytics, user events, funnels, insights
  • 🔄 Separate storage per platform for efficient querying
  • 🎯 OAuth 2.0 and API Key authentication support

πŸ—οΈ Infrastructure

  • βš™οΈ Configuration: Pydantic-settings with environment variables
  • πŸ’Ύ Storage: File-system database with JSONL format (separate per platform)
  • πŸ“… Scheduling: Linux cron jobs for production deployments
  • πŸ“ Logging: Structured logging to files and console
  • πŸ”’ Security: Atomic file operations, OAuth 2.0, API Key authentication
  • 🌐 Multi-Platform: X/Twitter, GitHub, Google Search Console, PostHog integration
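The configuration layer reads the environment variables listed later in this README via pydantic-settings; a stdlib-only sketch of the same pattern, with hypothetical field names (the real models live in src/growth_agent/config.py):

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Environment-backed settings, mirroring the pydantic-settings pattern."""
    openrouter_api_key: str
    llm_model: str = "anthropic/claude-3.5-sonnet"
    llm_temperature: float = 0.3

    @classmethod
    def from_env(cls) -> "Settings":
        # Required keys raise KeyError when missing; optional ones fall back to defaults.
        return cls(
            openrouter_api_key=os.environ["OPENROUTER_API_KEY"],
            llm_model=os.environ.get("LLM_MODEL", cls.llm_model),
            llm_temperature=float(os.environ.get("LLM_TEMPERATURE", cls.llm_temperature)),
        )
```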

🚀 Quick Start

📋 Prerequisites

🔧 Installation

# Clone the repository
git clone https://github.com/HYPERVAPOR/growth-agent.git
cd growth-agent

# Install dependencies with uv (recommended)
uv sync

# Or with pip
pip install -e .

βš™οΈ Configuration

# Copy environment template
cp .env.example .env

# Edit configuration
vim .env

Required environment variables:

# API Keys
X_RAPIDAPI_KEY=your_x_api_key_here
OPENROUTER_API_KEY=your_openrouter_key_here

# Optional - Workflow A (GitHub)
GITHUB_TOKEN=your_github_token_here
REPO_PATH=puppyone-ai/puppyone

# Optional - Workflow C (Google Search Console)
GSC_ENABLED=true
GSC_SITE_URL=https://example.com
# Option 1: Use service account file
GSC_SERVICE_ACCOUNT_PATH=path/to/service-account.json
# Option 2: Use environment variables (recommended for deployments)
GSC_CLIENT_EMAIL=your-service-account@project-id.iam.gserviceaccount.com
GSC_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"

# Optional - Workflow C (PostHog)
POSTHOG_ENABLED=true
POSTHOG_API_KEY=phx_your_project_api_key_here  # Use Project API Key, not Personal
POSTHOG_HOST=app.posthog.com
POSTHOG_PROJECT_ID=your_project_id

# Optional - Workflow D (PuppyOne Social Listener)
SOCIAL_LISTENER_ENABLED=true
SOCIAL_LISTENER_DISCORD_WEBHOOK_URL=https://discord.com/api/webhooks/...
SOCIAL_LISTENER_RENDER_IMAGES=false
SOCIAL_LISTENER_IMAGE_COUNT=1

# Optional - qwen-image-2.0 rendering
DASHSCOPE_API_KEY=your_dashscope_api_key_here
DASHSCOPE_BASE_URL=https://dashscope.aliyuncs.com/api/v1

# LLM Configuration
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=2000

🔑 Setting up API Keys

X/Twitter RapidAPI:

  1. Visit RapidAPI
  2. Subscribe to Twitter API v2
  3. Copy your API key to .env

OpenRouter:

  1. Visit OpenRouter
  2. Create an account and get API key
  3. Add to .env

Google Search Console:

  1. Create Google Cloud Project
  2. Enable Search Console API
  3. Create service account with JSON key
  4. Add service account email to GSC property permissions
  5. Configure in .env (see above)

PostHog:

  1. Login to PostHog
  2. Navigate to Settings → Project → API Keys
  3. Copy Project API Key (not Personal API Key)
  4. Add to .env

🎯 Usage

# Initialize data directory
uv run python -m growth_agent.main init

# Add subscriptions
vim data/subscriptions/x_creators.jsonl
vim data/subscriptions/rss_feeds.jsonl

# Run Workflow B immediately
uv run python -m growth_agent.main run workflow-b

# Start scheduler daemon (Ctrl+C to stop)
uv run python -m growth_agent.main schedule

📦 Project Structure

growth-agent/
├── 📂 src/growth_agent/
│   ├── 📂 core/                  # Core infrastructure
│   │   ├── schema.py            # Pydantic data models
│   │   ├── storage.py           # File-system database
│   │   ├── llm.py               # LLM client (OpenRouter)
│   │   ├── vector_store.py      # LanceDB integration
│   │   ├── logging.py           # Logging configuration
│   │   └── scheduler.py         # APScheduler setup
│   ├── 📂 workflows/             # Workflow orchestration
│   │   ├── base.py              # Abstract workflow base
│   │   ├── workflow_a.py        # GitHub sync
│   │   ├── workflow_b.py        # Content intelligence
│   │   └── workflow_c.py        # Metrics tracking
│   ├── 📂 ingestors/             # Data ingestion
│   │   ├── x_twitter.py         # X/Twitter API client
│   │   ├── rss_feed.py          # RSS feed parser
│   │   ├── github.py            # GitHub CLI wrapper
│   │   ├── metrics.py           # Metrics collector (X/Twitter)
│   │   ├── gsc_search_console.py # Google Search Console API
│   │   └── posthog.py           # PostHog analytics API
│   ├── 📂 processors/            # Data processing
│   │   ├── curator.py           # LLM content evaluator
│   │   ├── ranker.py            # Content ranking
│   │   └── blog_generator.py    # Blog post generator
│   ├── config.py                # Configuration management
│   └── main.py                  # CLI entry point
├── 📂 data/                      # File-system database
│   ├── subscriptions/           # X/RSS subscriptions
│   ├── inbox/                   # Raw ingested items
│   ├── curated/                 # LLM-evaluated content
│   ├── blogs/                   # Generated blog posts
│   ├── github/                  # GitHub issues cache
│   ├── metrics/                 # Social media metrics
│   ├── logs/                    # Execution logs
│   └── index/                   # LanceDB vector store
├── 📂 scripts/                   # Utility scripts
│   ├── sync_github_issues.py   # Manual Workflow A trigger
│   ├── sync_metrics.py         # Manual Workflow C trigger
│   └── test_posthog.py         # PostHog API validation
├── 📂 tests/                     # Test suite
├── pyproject.toml              # Project configuration
└── .env.example                # Environment template

🚢 Deployment

🖥️ Server Deployment with Cron Jobs

1. Clone & Install

# Clone repository
git clone https://github.com/HYPERVAPOR/growth-agent.git
cd growth-agent

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Initialize data directory
uv run python -m growth_agent.main init

2. Configure Environment

# Copy environment template
cp .env.example .env

# Edit configuration (add API keys)
vim .env

Required environment variables:

# API Keys
X_RAPIDAPI_KEY=your_x_api_key_here
OPENROUTER_API_KEY=your_openrouter_key_here

# Optional - Workflow A (GitHub)
GITHUB_TOKEN=your_github_token_here
REPO_PATH=puppyone-ai/puppyone

# Optional - Workflow C (GSC & PostHog)
GSC_ENABLED=true
GSC_SITE_URL=https://example.com
GSC_CLIENT_EMAIL=your-service-account@project-id.iam.gserviceaccount.com
GSC_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"

POSTHOG_ENABLED=true
POSTHOG_API_KEY=phx_your_project_api_key_here
POSTHOG_HOST=app.posthog.com
POSTHOG_PROJECT_ID=your_project_id

# LLM Configuration
LLM_MODEL=anthropic/claude-3.5-sonnet
LLM_TEMPERATURE=0.3
LLM_MAX_TOKENS=2000

3. Setup Cron Jobs

# Edit crontab
crontab -e

Add the following cron jobs:

# Workflow A: GitHub Issues Sync (every 2 hours)
0 */2 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_github_issues.py >> data/logs/cron_workflow_a.log 2>&1

# Workflow B: Content Intelligence & Blog Generation (daily at 8 AM)
0 8 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python -m growth_agent.main run workflow-b >> data/logs/cron_workflow_b.log 2>&1

# Workflow C: X/Twitter Metrics (every 6 hours)
0 */6 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source x >> data/logs/cron_workflow_c.log 2>&1

# Workflow C: Google Search Console (daily at 9 AM)
0 9 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source gsc --days 7 >> data/logs/cron_workflow_c.log 2>&1

# Workflow C: PostHog Analytics (every 6 hours)
0 */6 * * * cd /path/to/growth-agent && /usr/local/bin/uv run python scripts/sync_metrics.py --source posthog --days 1 >> data/logs/cron_workflow_c.log 2>&1

Important:

  • Replace /path/to/growth-agent with your actual project path
  • Replace /usr/local/bin/uv with your uv executable path (find with which uv)
  • Adjust schedule times based on your timezone and needs
  • Logs are written to data/logs/cron_workflow_*.log

4. Verify Cron Jobs

# List current cron jobs
crontab -l

# Check cron service status
sudo systemctl status cron

# View cron logs (Ubuntu/Debian)
sudo grep CRON /var/log/syslog

# View application logs
tail -f data/logs/cron_workflow_b.log

5. Monitor Execution

# View workflow logs
tail -f data/logs/$(date +%Y-%m-%d).log

# View specific cron job logs
tail -f data/logs/cron_workflow_a.log  # GitHub sync
tail -f data/logs/cron_workflow_b.log  # Content intelligence
tail -f data/logs/cron_workflow_c.log  # Metrics tracking

# Check last execution time
ls -lh data/blogs/  # Workflow B output
ls -lh data/metrics/  # Workflow C output
ls -lh data/github/  # Workflow A output

🔄 Updates

# Pull latest code
git pull origin main

# Reinstall dependencies (if needed)
uv sync

# Test workflows manually
uv run python -m growth_agent.main run workflow-b
uv run python scripts/sync_metrics.py --source all

🐳 Docker Deployment (Optional)

If you prefer Docker over cron jobs:

# Build image
docker build -t growth-agent .

# Run with environment file
docker run -d \
  --env-file .env \
  -v $(pwd)/data:/app/data \
  --name growth-agent \
  growth-agent

🧪 Development

🏃 Running Tests

# Install development dependencies
uv sync --all-extras

# Run tests
pytest

# Run with coverage
pytest --cov=src/growth_agent --cov-report=html

# View coverage report
open htmlcov/index.html

πŸ“ Code Style

# Format code
black src/ tests/

# Check linting
ruff check src/ tests/

# Type checking
mypy src/

πŸ” Debugging

# Enable verbose logging
export LOG_LEVEL=DEBUG

# Run with verbose output
uv run python -m growth_agent.main run workflow-b --verbose

📊 Data Schemas

📥 InboxItem

Base schema for all ingested content.

Fields:

  • id: Unique identifier
  • source: "x" or "rss"
  • content_type: "post" or "article"
  • url: Original URL
  • content: Text content
  • author_name: Author display name
  • title: Content title
  • published_at: ISO 8601 timestamp
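The repo defines this as a Pydantic model in src/growth_agent/core/schema.py; a dataclass sketch of the same shape, just to make the fields concrete (types are assumptions inferred from the descriptions above):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InboxItem:
    id: str
    source: str            # "x" or "rss"
    content_type: str      # "post" or "article"
    url: str
    content: str
    author_name: str
    title: Optional[str]   # RSS articles have titles; tweets may not
    published_at: str      # ISO 8601 timestamp
```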

🎯 CuratedItem

LLM-evaluated content with quality scores.

Fields:

  • All InboxItem fields
  • score: Quality rating (0-100)
  • summary: AI-generated summary
  • comment: AI evaluation comment
  • rank: Position in ranked list

✍️ BlogPost

Generated blog post with YAML frontmatter.

Fields:

  • id: Unique blog ID (UUID first 8 chars)
  • slug: URL-friendly slug
  • title: Blog title
  • date: Publication date
  • summary: Brief summary (50-300 chars)
  • tags: List of tags
  • author: Author name
  • content: Markdown content

See data/schemas/ for detailed documentation.
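Putting the BlogPost schema together, the generation stage's output file pairs YAML frontmatter with the markdown body. A stdlib-only sketch of that serialization (the function is hypothetical; the real generator lives in processors/blog_generator.py, and field names follow the schema above):

```python
def render_blog_post(post: dict) -> str:
    """Render a BlogPost dict as YAML frontmatter followed by markdown content."""
    tags = ", ".join(post["tags"])
    frontmatter = "\n".join([
        "---",
        f'title: "{post["title"]}"',
        f"date: {post['date']}",
        f'summary: "{post["summary"]}"',
        f"tags: [{tags}]",
        f"author: {post['author']}",
        "---",
    ])
    return frontmatter + "\n\n" + post["content"]
```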


❓ FAQ

🤔 Why JSONL instead of a database?

JSONL (JSON Lines) provides:

  • ✅ Simple version control with git
  • ✅ Human-readable format
  • ✅ Easy debugging and manual inspection
  • ✅ No database dependencies
  • ✅ Atomic writes prevent corruption
  • ✅ AI-friendly structure for LLM analysis

⏰ How do I change cron job schedules?

Edit your crontab:

crontab -e

Modify the cron schedule format: minute hour day month weekday

Examples:

  • 0 8 * * * - Daily at 8 AM
  • 0 */6 * * * - Every 6 hours
  • 0 9 * * 1 - 9 AM every Monday

πŸ” How do I get Google Search Console credentials?

  1. Create Google Cloud Project
  2. Enable Search Console API
  3. Create service account & download JSON key
  4. Add service account email to GSC property permissions
  5. Configure in .env (use environment variables for security)

Helper script available:

# Create GSC credentials JSON from environment variables
uv run python scripts/create_gsc_creds.py

📊 Why does PostHog return 401 Unauthorized?

You're likely using a Personal API Key instead of a Project API Key.

Fix:

  1. Login to PostHog
  2. Go to Settings → Project → API Keys
  3. Copy the Project API Key (starts with phx_)
  4. Update .env: POSTHOG_API_KEY=phx_...

Verify:

uv run python scripts/test_posthog.py

🔄 How does deduplication work?

  • Workflow A (GitHub): Issue number as unique key, upsert based on updated_at
  • Workflow B (Content): No deduplication (daily snapshots with timestamps)
  • Workflow C (Metrics): Overwrite mode per platform (always latest data)

📈 Can I track multiple X accounts?

Yes! Add them to data/subscriptions/x_creators.jsonl:

{"id": "123456", "username": "elonmusk", "followers_count": 1000000, "subscribed_at": "2026-02-05T10:00:00Z", "last_fetched_at": null}
{"id": "789012", "username": "puppyone_ai", "followers_count": 1000, "subscribed_at": "2026-02-05T10:00:00Z", "last_fetched_at": null}

🚀 Can I run workflows without cron jobs?

Yes! Manual execution:

# Workflow A
uv run python scripts/sync_github_issues.py

# Workflow B
uv run python -m growth_agent.main run workflow-b

# Workflow C (all platforms)
uv run python scripts/sync_metrics.py --source all

# Workflow C (specific platform)
uv run python scripts/sync_metrics.py --source gsc --days 7

📜 License

MIT License - see LICENSE for details.


🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📞 Support


Built with ❤️ by HYPERVAPOR

About

AI Agent driven growth and operations system for content intelligence and blog creation.
