This repository contains a complete MLOps pipeline for training and serving a PDF-to-LaTeX model on Google Cloud Platform (GCP) using Vertex AI.
conda create -n pdf2latex python=3.11 -y
conda activate pdf2latex
pip install notebook tqdm wandb
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu129
pip install transformers datasets accelerate peft flash-attn --no-build-isolation
uv sync
Note: flash-attn is not available for MPS.
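After installing, a quick sanity check that the CUDA build of PyTorch actually sees a GPU (this is a minimal sketch; it degrades to False on CPU-only or MPS machines where flash-attn is unavailable anyway):

```python
def cuda_available() -> bool:
    """Return True only if torch is installed and can see a CUDA GPU."""
    try:
        import torch  # imported lazily so the check works even without torch
        return torch.cuda.is_available()
    except ImportError:
        return False

if __name__ == "__main__":
    print(f"CUDA available: {cuda_available()}")
```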
ssh -L 8001:gpunode24:8000 cs.edu
- GCP Project: A Google Cloud Project with billing enabled.
- Tools: Install Terraform, Google Cloud SDK, and uv.
- Authentication:
gcloud auth login
gcloud auth application-default login
Provision all necessary resources (GCS Bucket, Artifact Registry, APIs) automatically.
cd terraform
# Update terraform.tfvars with your project_id
terraform init
terraform apply
Note down the Bucket Name output by Terraform.
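If you prefer to pull the bucket name programmatically instead of copying it by hand, `terraform output -json` prints all outputs as JSON. A small sketch (the output name `bucket_name` is an assumption here; match it to the output defined in your Terraform config):

```python
import json
import subprocess

def parse_tf_outputs(raw: str) -> dict:
    """Flatten Terraform's {name: {"value": ...}} output JSON to {name: value}."""
    return {k: v["value"] for k, v in json.loads(raw).items()}

def read_tf_output(name: str) -> str:
    """Read one output value; run this from inside the terraform/ directory."""
    raw = subprocess.check_output(["terraform", "output", "-json"], text=True)
    return parse_tf_outputs(raw)[name]

if __name__ == "__main__":
    print(read_tf_output("bucket_name"))  # hypothetical output name
```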
Generate the dataset and stage the model artifacts to GCS.
Generate Dataset:
uv run python pdf2latex/data_process.py
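Parquet files start and end with the 4-byte magic `PAR1`, so a cheap local check before uploading can catch a truncated or interrupted write:

```python
def looks_like_parquet(path: str) -> bool:
    """Check the PAR1 magic bytes at both ends of the file."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-4, 2)  # 2 == os.SEEK_END: last 4 bytes
        tail = f.read(4)
    return head == b"PAR1" and tail == b"PAR1"
```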
# Upload to GCS
gcloud storage cp datasets/latex80m_en_1m.parquet gs://YOUR_BUCKET/datasets/
Stage Model (Hugging Face -> GCS): Download the model and upload it to your bucket for controlled serving.
uv run python scripts/stage_model.py \
--repo_id scottcfy/Qwen2-VL-2B-Instruct-pdf2latex \
--gcs_uri gs://YOUR_BUCKET/models/pdf2latex-v1 \
--project_id YOUR_PROJECT_ID
Build the training and serving containers and push them to Artifact Registry.
# Usage: ./scripts/gcp_build_and_push.sh <PROJECT_ID> <REGION> <REPO_NAME>
./scripts/gcp_build_and_push.sh YOUR_PROJECT_ID us-central1 pdf2latex-repo
Submit a custom training job to Vertex AI.
uv run python scripts/gcp_submit_train.py \
--project_id YOUR_PROJECT_ID \
--location us-central1 \
--staging_bucket gs://YOUR_BUCKET \
--display_name pdf2latex-train \
--container_uri us-central1-docker.pkg.dev/YOUR_PROJECT_ID/pdf2latex-repo/pdf2latex-train:latest \
--dataset_path gs://YOUR_BUCKET/datasets/latex80m_en_1m.parquet \
--output_dir gs://YOUR_BUCKET/outputs/run1 \
--use_spot  # Use Spot instances for cost savings
Deploy the model to a Vertex AI Endpoint. The serving container supports loading from GCS or Hugging Face.
Deploy from GCS (Recommended):
uv run python scripts/gcp_deploy_serve.py \
--project_id YOUR_PROJECT_ID \
--location us-central1 \
--display_name pdf2latex-serve \
--serving_container_uri us-central1-docker.pkg.dev/YOUR_PROJECT_ID/pdf2latex-repo/pdf2latex-serve:latest \
--model_artifact_uri gs://YOUR_BUCKET/models/pdf2latex-v1
Deploy from Hugging Face directly:
uv run python scripts/gcp_deploy_serve.py \
...
--hf_model_id scottcfy/Qwen2-VL-2B-Instruct-pdf2latex
Verify the deployed endpoint by sending a sample image.
uv run python scripts/test_endpoint.py \
--endpoint_id YOUR_ENDPOINT_ID \
--image_path test_image.png
To call the deployed model from another service (e.g., a backend API or microservice), use the Google Cloud Vertex AI SDK or the standard REST API.
Ensure your service has a Service Account with the Vertex AI User role.
- Local Dev:
gcloud auth application-default login
- Production: Attach the Service Account to your VM/Pod.
import base64
import json

from google.cloud import aiplatform

def predict_latex(project_id, location, endpoint_id, image_path):
    # Initialize the Vertex AI SDK
    aiplatform.init(project=project_id, location=location)
    endpoint = aiplatform.Endpoint(endpoint_id)

    # Encode the image as base64
    with open(image_path, "rb") as f:
        encoded_image = base64.b64encode(f.read()).decode("utf-8")

    # Construct the payload (OpenAI chat format)
    payload = {
        "model": "/model-artifacts",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Convert this to LaTeX."},
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encoded_image}"}},
                ],
            }
        ],
        "max_tokens": 512,
        "temperature": 0.2,
    }

    # Send the request
    response = endpoint.raw_predict(
        body=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return response.content.decode("utf-8")
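`raw_predict` returns the serving container's response body verbatim. Assuming the container answers in the OpenAI chat-completion format (as the request payload above suggests), the generated LaTeX can be pulled out like this:

```python
import json

def extract_latex(raw: str) -> str:
    """Pull the generated text out of an OpenAI-style chat completion body."""
    body = json.loads(raw)
    return body["choices"][0]["message"]["content"]
```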
ssh -L 8001:gpunode3:8000 cs.edu
vllm serve Qwen/Qwen2-VL-2B-Instruct \
--port 8000 \
--gpu-memory-utilization 0.9
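With the tunnel up, the local vLLM server is reachable on localhost:8001 via its OpenAI-compatible REST API. A stdlib-only client sketch (the `/v1/chat/completions` path is vLLM's standard OpenAI-compatible route; adjust host/port to your tunnel):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }

def call_local_vllm(payload: dict, base: str = "http://localhost:8001") -> str:
    """POST the payload to the tunneled vLLM server and return the reply text."""
    req = urllib.request.Request(
        f"{base}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    payload = build_chat_request("Qwen/Qwen2-VL-2B-Instruct", "Say hello in LaTeX.")
    print(call_local_vllm(payload))
```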