Skip to content

Aawegg/Review-Portal

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

2 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

TB Review Portal - LLM Review with Google Sheets Support

A Streamlit-based portal for reviewing Terminal Bench tasks with integrated validator and LLM review comparison.

πŸš€ Quick Start

./start_portal.sh

✨ Features

1. Task Upload & Validation

  • Upload task ZIP files or provide Google Drive links
  • Run validator script on tasks
  • View validation results with detailed feedback

2. Multi-Step Review Workflow

  • Step 1: Upload Task
  • Step 2: Run Validator
  • Step 3: Review task.yaml (with LLM comparison)
  • Step 4: Review solution.sh (with LLM comparison)
  • Step 5: Review Dockerfile (with LLM comparison)
  • Step 6: Compare Model Test Results
  • Step 7: Export Final Review Report

3. LLM Review Integration

βœ… Google Sheets Support (NEW!)

You can now directly paste Google Sheets URLs to load LLM reviews!

Example:

https://docs.google.com/spreadsheets/d/1VW0CrLLgjPRGs7fWlYwW78OEZ5S5MMuURwh7jCQC_SQ/edit?usp=sharing

How to Format Your Sheet:

  • Column A: Section name (e.g., task.yaml, solution.sh, Dockerfile)
  • Column B: LLM review content

Or use any format - the parser will try to intelligently extract sections!

Other Supported Formats:

  • Upload Files: .txt, .md, .json, .docx, .csv
  • Google Drive Links: Direct file links
  • Direct URLs: Any accessible document URL

4. Side-by-Side Comparison

  • View task files on the left
  • View LLM reviews on the right
  • Edit LLM reviews inline
  • Add manual review notes

5. Export Reports

  • Combined review with your notes and LLM feedback
  • JSON format for easy processing
  • Includes validation results

πŸ“‹ Usage

Upload a Task

  1. Go to "Upload" tab
  2. Upload ZIP file or paste Google Drive link
  3. Wait for extraction

Add LLM Review

  1. In sidebar, find "πŸ€– LLM Review Document"
  2. Option A: Paste Google Sheets URL
  3. Option B: Upload file (.txt, .md, .json, .docx, .csv)
  4. Option C: Paste Google Drive file link
  5. Click "πŸ“₯ Load from URL" or file will auto-upload

Review Files

  1. Navigate through steps using buttons or sidebar
  2. View file content with syntax highlighting
  3. View LLM review side-by-side
  4. Add your review notes
  5. Mark steps as complete

Export Report

  1. Go to "Final Review" step
  2. Click "πŸ“₯ Export Review Report"
  3. Download JSON report with all reviews

πŸ”§ Setup

Install Dependencies

pip install -r requirements.txt

Dependencies

  • streamlit>=1.28.0 - Web framework
  • gdown>=4.7.1 - Google Drive downloads
  • PyYAML>=6.0 - YAML parsing
  • python-docx>=0.8.11 - DOCX support
  • API clients (optional, for future LLM features)

Configure Validator

  • Default: Uses tb_validator_v6.py in the same directory
  • Upload different validator in sidebar if needed

πŸ“ Google Sheets Format Examples

Format 1: Terminal Bench Review Format

The parser automatically detects Terminal Bench review documents with sections like:

A) PROMPT
- Analysis of task.yaml...
- PASS/FAIL indicators...

B) TESTS
- Test coverage analysis...

E) DOCKERFILE
- Dockerfile review...

G) SOLUTION.SH
- Solution implementation review...

These sections are automatically mapped to:

  • A) PROMPT β†’ task.yaml review
  • B) TESTS β†’ tests review
  • E) DOCKERFILE β†’ dockerfile review
  • G) SOLUTION β†’ solution.sh review
  • H) DOCKER-COMPOSE β†’ docker_compose review

Format 2: Simple Headers

task.yaml Review
----------------
Good structure, clear description...

solution.sh Review
------------------
Correct implementation...

Format 3: CSV Export from Sheets

Just export your Sheet as CSV and upload, or paste the Sheets URL directly!

🎯 Tips

  1. For Google Sheets: Make sure the sheet is shared (at least "Anyone with link can view")
  2. Section Names: Use filenames like task.yaml, solution.sh, dockerfile for automatic matching
  3. Rich Formatting: Markdown is supported in LLM reviews (headers, lists, bold, etc.)
  4. Multiple Reviews: You can load multiple review documents - they will merge

πŸ› Troubleshooting

Google Sheets not loading?

  • Check the sheet is shared publicly or with link
  • Verify the URL format is correct
  • Try exporting as CSV and uploading instead

LLM sections not appearing?

  • Check the success message shows parsed sections
  • Verify your section names match file names
  • Try a simpler format (two columns: Section, Review)

Validator errors?

  • Ensure task ZIP structure is correct
  • Check validator script is present
  • Look at detailed error message in output

πŸ“‚ Project Structure

Review Portal/
β”œβ”€β”€ review_portal_v2_llm.py      # Main portal application
β”œβ”€β”€ llm_review_parser.py         # Review document parser
β”œβ”€β”€ tb_validator_v6.py           # Task validator script
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ start_portal.sh              # Launch script
└── README.md                    # This file

πŸ”„ Version

Current: v2.1 with Google Sheets Support

What's New:

  • βœ… Direct Google Sheets URL support
  • βœ… CSV parsing for exported Sheets
  • βœ… Improved file upload (added CSV)
  • βœ… Better error messages
  • βœ… Enhanced download handling

πŸ’‘ Future Features

  • Direct LLM API integration (GPT-4, Claude, Gemini)
  • Batch review processing
  • Review templates
  • Automated scoring

Need Help? Check the error messages in the portal - they're designed to be helpful! πŸš€

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors