fix gpu tests #2471
Conversation
Hey, I don't think these changes are correct. As the comment suggests, the purpose of this test is:
By setting …
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
That's weird. I got a totally different result on the A100; I will check it again.
Hi @BenjaminBossan. Could you please run the following code on a single A100 card and paste your outputs?

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, set_seed
from peft import LoraConfig, get_peft_model

set_seed(0)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float32,
).eval()
torch.manual_seed(0)
config_lora = LoraConfig(r=8, init_lora_weights=False, use_dora=False)
model = get_peft_model(model, config_lora).eval()
random_input = torch.LongTensor([[1, 0, 1, 0, 1, 0]]).to(model.device)
logits_lora = model(random_input).logits

dora_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float32,
)
torch.manual_seed(0)
config_dora = LoraConfig(r=8, init_lora_weights=False, use_dora=True)
dora_model = get_peft_model(dora_model, config_dora).eval()
logits_dora = dora_model(random_input).logits

# import pdb; pdb.set_trace()
print(logits_lora)
print(logits_dora)
```
I get:
(output omitted)
On a 4090, both with the latest main of PEFT and with v0.15.1. I tried transformers main and v4.50.3, bnb 0.45.3, and torch 2.6.0.
That's weird. I got the same results on the 4090 but different results on the A100. And I don't know why these two results can even be the same, because the computation is different here.
I see, it's because of the numerical loss here. There is some numerical loss on the A100, so the results are slightly different; I don't know why there is no numerical loss on the 4090.
Hi @BenjaminBossan. Do you have any idea how to deal with this case?
The DoRA part itself is a no-op without any updates to the DoRA params. The LoRA part is not a no-op, since init_lora_weights=False.
I'm also not sure. Did you check that the dtype is the same between the A100 and the 4090? Also, are the torch versions identical?
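To make the no-op argument concrete, here is a minimal sketch of the arithmetic, not PEFT's actual implementation. It assumes the DoRA magnitude is initialized to the norm of the combined weight (which is what makes the DoRA part a no-op at initialization) and uses a random perturbation as a stand-in for 8-bit dequantization error; shapes and the norm axis are illustrative.

```python
import torch

torch.manual_seed(0)
W0 = torch.randn(16, 16)   # frozen base weight
B = torch.randn(16, 8)     # LoRA B, non-zero because init_lora_weights=False
A = torch.randn(8, 16)     # LoRA A
x = torch.randn(4, 16)     # dummy input

# LoRA path: x @ (W0 + B @ A)^T
lora_out = x @ (W0 + B @ A).T

# DoRA path (sketch): the magnitude m is initialized to the per-row norm of the
# combined weight V, so the rescaling factor m / ||V|| equals 1 and DoRA reduces
# to the LoRA result. In this float32 toy the difference is exactly zero.
V = W0 + B @ A
m = V.norm(p=2, dim=1, keepdim=True)       # stored magnitude at initialization
dora_out = x @ ((m / V.norm(p=2, dim=1, keepdim=True)) * V).T
print((lora_out - dora_out).abs().max())   # 0 here

# With 8-bit quantization, V is dequantized on every forward pass, so the norm
# recomputed from the dequantized weight can deviate slightly from the stored m;
# the factor then drifts from 1 and the outputs diverge a little, hardware-dependently.
V_dequant = V + 1e-3 * torch.randn_like(V)  # stand-in for dequantization error
dora_out_q = x @ ((m / V_dequant.norm(p=2, dim=1, keepdim=True)) * V_dequant).T
print((lora_out - dora_out_q).abs().max())  # small but nonzero
```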
Both the 4090 and the A100 use float32 dtype. The torch version is 2.8.0.dev20250401+cu128 for both, with exactly the same script as I proposed here.
Could you try out how much you need to increase the tolerance for the script to pass on the A100?
atol=0.2, rtol=0.1 passes the assert. I also checked the max and mean values of abs(logits_lora - logits_dora).
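For reference, a hypothetical helper along these lines can report the gap and probe candidate tolerances; `report_diff` is not part of the test suite, just a sketch reusing the `logits_lora`/`logits_dora` tensors from the script above.

```python
import torch

def report_diff(logits_lora, logits_dora):
    # Summarize how far apart the two outputs are.
    diff = (logits_lora - logits_dora).abs()
    print("max abs diff :", diff.max().item())
    print("mean abs diff:", diff.mean().item())
    # Probe a few tolerance pairs, including the one that passes on the A100.
    for atol, rtol in [(1e-5, 1e-4), (1e-2, 1e-2), (0.2, 0.1)]:
        ok = torch.allclose(logits_lora, logits_dora, atol=atol, rtol=rtol)
        print(f"allclose(atol={atol}, rtol={rtol}): {ok}")
```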
Hmm, this is quite high. I don't think it makes sense to set these tolerances, as that would make the test almost meaningless. Instead, could we check the hardware and skip on the A100? Maybe it's the architecture (Ampere vs. Ada)?
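A minimal sketch of such a hardware check, assuming compute capability is an acceptable proxy for the architecture (an A100/Ampere card reports (8, 0), an RTX 4090/Ada card reports (8, 9)); the actual test could gate this differently.

```python
import torch

def running_on_ampere_a100() -> bool:
    # Compute capability 8.0 corresponds to A100-class Ampere GPUs.
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return (major, minor) == (8, 0)
```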
Hi @BenjaminBossan. Since I have only tested the A100 and the 4090, I don't know which kinds of GPU can pass the tests. But I know XPU cannot pass, so I disabled the check for XPU only; that is enough for me. Please review the new changes. Thanks!
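A minimal sketch of what an XPU-only skip could look like with pytest; the function and test names here are illustrative, not the exact change in this PR, and `torch.xpu` is only present in recent PyTorch builds.

```python
import pytest
import torch

def xpu_available() -> bool:
    # True only when PyTorch was built with Intel XPU support and a device is present.
    return hasattr(torch, "xpu") and torch.xpu.is_available()

@pytest.mark.skipif(xpu_available(), reason="Known numerical differences on XPU")
def test_8bit_dora_inference():
    # ... body of the existing test, with the logits comparison left unchanged ...
    ...
```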
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
BenjaminBossan left a comment
Thanks for working on this.
Avoid issue with numerical instability.
Hi @BenjaminBossan. This PR fixes the GPU tests. To reproduce it:

```
pytest tests/test_common_gpu.py::PeftGPUCommonTests::test_8bit_dora_inference
```

The LoRA 8-bit model's q_proj:
(screenshot of the printed module structure)
The DoRA 8-bit model's q_proj:
(screenshot of the printed module structure)
You can see that the DoRA model has an extra linear layer, lora.dora.DoraLinearLayer. The output cannot be the same unless the weights of lora.dora.DoraLinearLayer are all zero.
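To see that extra sub-module yourself, you can print the adapted q_proj layer, continuing from the `dora_model` in the script above; the exact module path below is an assumption for facebook/opt-125m wrapped by PEFT.

```python
# Inspect one attention projection of the DoRA-wrapped model.
layer = dora_model.base_model.model.model.decoder.layers[0].self_attn.q_proj
print(layer)
# The DoRA variant should show a lora_magnitude_vector entry holding a
# lora.dora.DoraLinearLayer; the plain LoRA model has no such entry.
```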