This repository contains the official code for our ICML 2025 paper:
FedOne: Query-Efficient Federated Learning for Black-box Discrete Prompt Learning
This work proposes a novel federated framework designed to minimize query costs to cloud-based LLMs in black-box discrete prompt learning scenarios.
The implementation builds upon and extends the codebase from Black-Box-Prompt-Learning, adapting it to the federated learning setting with additional components for client coordination and efficient prompt optimization.
- RoBERTa-based Experiments:
  - `preprocess.py`: Performs data loading and preprocessing for RoBERTa tasks.
  - `run_glue_LLM_FL.py`: Implements the federated learning framework for RoBERTa-based prompt tuning.
  - `PromptTuningClient/*.py`: Contains client-side implementations of various prompt tuning methods, including BBT, BDPL, Gumbel-BDPL, Prefix-Tuning, and Prompt-Tuning.
- OpenAI API-based Experiments (GPT models):
  - `preprocess_GPT.py`: Handles preprocessing tailored to GPT-based experiments using the OpenAI API.
  - `run_glue_LLM_FL_GPT.py`: Implements the federated learning workflow for black-box prompt tuning with GPT models.
  - `PromptTuningClient_GPT/*.py`: Includes client-side implementations for black-box prompt learning methods such as BDPL, Gumbel-BDPL, and NoPrompt.
To set up the environment, follow these steps:
- Create a virtual environment, for example using Anaconda (we used Python 3.9.19):

  ```bash
  conda create -n bdpl python=3.9.19 -y
  conda activate bdpl
  ```
- Install the required packages:

  The key packages are: PyTorch (2.7.0 stable), transformers (4.40.2), datasets (2.19.0), accelerate (0.29.3), importlib-metadata (8.7.0), peft (0.10.0), scipy (1.13.0), scikit-learn (1.4.2), numpy (1.26.4), tqdm (4.66.2), cmaes (0.10.0), wandb (0.16.6), and openai (1.82.1).

  ```bash
  pip3 install torch torchvision torchaudio
  pip install transformers==4.40.2
  pip install datasets==2.19.0
  pip install accelerate==0.29.3
  pip install importlib-metadata==8.7.0
  pip install peft==0.10.0
  pip install scipy==1.13.0
  pip install scikit-learn==1.4.2
  pip install numpy==1.26.4
  pip install tqdm==4.66.2
  pip install cmaes==0.10.0
  pip install wandb==0.16.6
  pip install openai==1.82.1
  ```

  You can also use `requirements.txt` as a reference:

  ```bash
  pip install -r requirements.txt
  ```
- For a simple test, run `run_Debug.sh`:

  ```bash
  bash run_Debug.sh
  ```
- For RoBERTa-large experiments, run the scripts via:

  ```bash
  bash run_Experiment.sh
  ```
- Run GPT-3.5-turbo experiments:

  To run GPT-3.5-turbo experiments, execute the following script:

  ```bash
  bash run_GPT.sh
  ```

  Make sure to obtain your OpenAI API Project Key and add it to a `.env` file in your project directory (a minimal loading sketch follows below). The content of the `.env` file should look like this:

  ```
  OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxxxx_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  ```
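For reference, the snippet below is a minimal sketch of how the key in `.env` can be picked up at runtime, assuming the `python-dotenv` package is available; the repository's own loading logic may differ. The model name `gpt-3.5-turbo` matches the experiments above.

```python
# Minimal sketch: load OPENAI_API_KEY from .env and issue a test query.
# Assumes python-dotenv is installed; the repository's loading code may differ.
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # reads the .env file in the current working directory
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello."}],
    max_tokens=5,
)
print(response.choices[0].message.content)
```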
- `--task_name`: Specifies the name of the GLUE task. Options include: `[mnli, qqp, sst2, mrpc, cola, qnli, rte]`.
- `--file_name`: Indicates the name of the domain-specific dataset. Options include: `[CI, SE, RCT, HP]`.
- `--low_resource`: Enables low-resource training mode.
- `--ce_loss`: Specifies whether to use cross-entropy loss. If set to `False`, hinge loss will be used. Default is `True`.
- `--sample_size`: Defines the number of samples per batch. This parameter is critical for controlling resource usage. Default is `20`.
- `--prompt_length`: Sets the length of the prompt tokens. Default is `6`.
- `--prompt_learning_rate`: Learning rate used for prompt tuning. Default is `5e-5`.
- `--prompt_search_space`: The size of the search space for prompt optimization. Default is `20`.
- `--num_train_epochs`: Total number of training epochs to perform. Default is `30`.
- `--ckpt_path`: Path for saving model checkpoints. Default is `./ckpts`.
- `--margin`: Margin used in the loss function. Default is `1.0`.
- `--trial`: If enabled, denotes a trial run for debugging or exploratory experiments.
- `--use_wandb`: Specifies whether to use Weights & Biases for experiment tracking. Default is `False`.
- `--cuda`: The ID of the CUDA device to use. Default is `0`.
- `--max_length`: The maximum length of input sequences after tokenization. Longer sequences are truncated. Default is `450`.
- `--pad_to_max_length`: If enabled, all sequences are padded to `max_length`. Otherwise, dynamic padding is used.
- `--per_device_train_batch_size`: Batch size per device during training. Default is `128`.
- `--per_device_eval_batch_size`: Batch size per device during evaluation. Default is `32`.
- `--model_name_or_path`: Path to a pretrained model or its identifier from Hugging Face. Default is `roberta-large`.
- `--use_slow_tokenizer`: If enabled, uses the slower tokenizer implementation not backed by the Hugging Face Tokenizers library.
- `--weight_decay`: Weight decay coefficient for regularization. Default is `0.1`.
- `--max_train_steps`: If specified, overrides the number of training epochs.
- `--gradient_accumulation_steps`: Number of steps to accumulate gradients before performing a backward pass. Default is `1`.
- `--lr_scheduler_type`: Specifies the learning rate scheduler type. Options include: `linear`, `cosine`, `cosine_with_restarts`, `polynomial`, `constant`, and `constant_with_warmup`. Default is `linear`.
- `--num_warmup_steps`: Number of warm-up steps for the learning rate scheduler. Default is `100`.
- `--output_dir`: Directory for saving the final trained model.
- `--seed`: Random seed for reproducibility. Default is `42`.
- `--k_shot`: Number of examples per class for few-shot learning. A value of `-1` denotes full supervision. Default is `-1`.
- `--use_ngram`: Indicates whether to use n-gram features. Default is `True`.
- `--api_limit`: Maximum number of API requests allowed. Default is `8000`.
- `--FL_framework`: Specifies the federated learning framework. Currently supported: `FedAvg`.
- `--num_clients`: Total number of clients in the federated learning setup. Default is `10`.
- `--num_activated_clients`: Number of clients activated in each training round. Default is `10`.
- `--num_client_local_step`: Number of local update steps performed by each client. Default is `1000`.
- `--max_client_train_steps`: Maximum number of training steps a client can perform during one activation. Default is `8000`.
- `--dirichlet_alpha`: Dirichlet concentration parameter for non-IID data partitioning. A value of `-1` indicates IID partitioning; any other value uses Dirichlet partitioning (see the sketch after this list). Default is `-1.0`.
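For intuition, the snippet below is a minimal, self-contained sketch of the Dirichlet-based label partitioning that `--dirichlet_alpha` controls; the function name `dirichlet_partition` and its arguments are illustrative, and the repository's actual partitioning code may differ.

```python
# Minimal sketch of Dirichlet (non-IID) label partitioning across clients.
# Illustrative only; the repository's partitioning implementation may differ.
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=42):
    """Split sample indices across clients; smaller alpha -> more skewed clients."""
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        rng.shuffle(idx)
        # Proportion of class-c samples assigned to each client.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        splits = (np.cumsum(proportions) * len(idx)).astype(int)[:-1]
        for client_id, part in enumerate(np.split(idx, splits)):
            client_indices[client_id].extend(part.tolist())
    return client_indices

# Example: 10 clients, mildly non-IID split of 1000 binary labels.
labels = np.random.randint(0, 2, size=1000)
parts = dirichlet_partition(labels, num_clients=10, alpha=0.5)
print([len(p) for p in parts])
```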
- `--prompt_tuning_method`: Specifies the prompt tuning strategy. Supported options include: `BBT`, `BDPL`, `GumbelBDPL`, `prefix-tuning`, and `prompt-tuning`. Default is `BDPL`.
- `--bbt_d`: Dimensionality parameter for BBT. Default is `500`.
- `--bbt_sigma`: Standard deviation parameter for the CMA-ES optimizer in BBT. Default is `1.0`.
- `--bbt_population_size`: Population size used by the CMA-ES optimizer. Default is `200`. (A minimal CMA-ES sketch follows this list.)
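As a point of reference, the snippet below sketches how these three flags typically map onto the `cmaes` package pinned above: a search over a `bbt_d`-dimensional vector with step size `bbt_sigma` and the given population size, here minimizing a toy quadratic. The repository's BBT client wires the objective to prompt-conditioned LLM queries instead, so treat this only as an illustration of the flags.

```python
# Minimal sketch of how --bbt_d, --bbt_sigma, and --bbt_population_size map onto
# the CMA-ES optimizer from the `cmaes` package. The objective here is a toy
# quadratic; BBT would instead evaluate prompt candidates via LLM queries.
import numpy as np
from cmaes import CMA

bbt_d, bbt_sigma, bbt_population_size = 500, 1.0, 200

optimizer = CMA(
    mean=np.zeros(bbt_d),
    sigma=bbt_sigma,
    population_size=bbt_population_size,
)

for generation in range(10):
    solutions = []
    for _ in range(optimizer.population_size):
        z = optimizer.ask()              # candidate point in the d-dimensional subspace
        loss = float(np.sum(z ** 2))     # toy objective; BBT uses the task loss instead
        solutions.append((z, loss))
    optimizer.tell(solutions)            # update the CMA-ES search distribution
```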
- `--tau`: Temperature parameter for Gumbel-Softmax. Default is `0.1`. (A short illustration follows this list.)
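To give intuition for what `--tau` controls, the snippet below uses PyTorch's built-in `torch.nn.functional.gumbel_softmax`: lower temperatures push the relaxed samples closer to one-hot prompt-token choices. This is only an illustration of the temperature's effect, not the repository's Gumbel-BDPL sampling code.

```python
# Illustration of the Gumbel-Softmax temperature (tau): lower tau yields
# samples closer to one-hot vectors. Not the repository's exact code.
import torch
import torch.nn.functional as F

logits = torch.log(torch.tensor([0.2, 0.3, 0.5]))  # categorical over 3 candidate prompt tokens

for tau in (1.0, 0.1):
    sample = F.gumbel_softmax(logits, tau=tau, hard=False)
    print(f"tau={tau}: {sample.numpy().round(3)}")
```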
- `--early_stop`: Training will stop once the validation metric meets or exceeds this value. If set to a value less than 0, early stopping is disabled and training will proceed for the full number of epochs. Default is `-1.0`.
- `--log_file_name`: Specifies the file path for saving training logs. The default value is `TempResult`. When the path starts with `TempResult`, the log file can be overwritten in subsequent runs. For all other values, the system will prevent overwriting an existing log file to avoid accidental loss of results. Upon completion of training, the final row of the log file records the test result. When conducting experiments, it is recommended to create a dedicated folder for each run to organize logs.
GLUE benchmark: MNLI, QQP, SST-2, MRPC, CoLA, QNLI, RTE
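The repository's `preprocess.py` handles loading and preprocessing of these tasks. For orientation only, the snippet below shows the standard way the pinned `datasets` library exposes a GLUE task; the actual preprocessing pipeline in the repo (tokenization, k-shot sampling, client partitioning) goes further than this.

```python
# Minimal sketch: load a GLUE task with the pinned `datasets` library.
# The repository's preprocess.py applies its own tokenization and k-shot sampling.
from datasets import load_dataset

raw = load_dataset("glue", "sst2")  # splits: train / validation / test
print(raw)
print(raw["train"][0])              # e.g. {'sentence': ..., 'label': ..., 'idx': ...}
```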