Hi,
I used the scripts under ./wanda/lora_ft/* to perform LoRA finetuning.
Specifically, I applied LoRA finetuning to a LLaMA2-7B model that had already been pruned to 40% sparsity.
However, after finetuning and merging the LoRA parameters back into the base model, the measured sparsity dropped from 40% to 32.7%.
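For reference, this is roughly how I measure sparsity after the merge (a minimal sketch; `linear_weight_sparsity` is just my own helper, not something from the repo, and I simply count exact zeros in all `nn.Linear` weights):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def linear_weight_sparsity(model: nn.Module) -> float:
    """Fraction of exactly-zero entries across all nn.Linear weights."""
    zeros, total = 0, 0
    for module in model.modules():
        if isinstance(module, nn.Linear):
            weight = module.weight.data
            zeros += (weight == 0).sum().item()
            total += weight.numel()
    return zeros / total

# Example usage (merged_model is the model after merging the LoRA weights):
# print(f"sparsity after merge: {linear_weight_sparsity(merged_model):.4f}")
```

Please let me know if this measurement itself is not the right way to evaluate sparsity.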
According to the description in Section 5 of the paper:
"For LLaMA2-7B, LoRA introduces only around 0.06% additional parameters, leaving the total sparsity level still around 50%."
Could you help explain why the sparsity ratio drops so much after LoRA finetuning and merging with your script?
Additionally, could you share how you maintain nearly the same sparsity after LoRA finetuning, as described in the paper?
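My current guess is that merging adds the dense low-rank update B·A to the sparse base weight, which fills in the pruned zeros, so the exact sparsity is lost at merge time. If that is right, is the intended fix something like recording the zero mask of the pruned weights before finetuning and re-applying it after the merge? A minimal sketch of what I have in mind (all names here are hypothetical helpers of my own, not from the repo):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def record_zero_masks(pruned_model: nn.Module) -> dict:
    """Store a boolean mask of the zero entries for every nn.Linear weight."""
    return {
        name: (module.weight.data == 0)
        for name, module in pruned_model.named_modules()
        if isinstance(module, nn.Linear)
    }

@torch.no_grad()
def reapply_zero_masks(merged_model: nn.Module, masks: dict) -> None:
    """Zero out the positions that were pruned before LoRA finetuning."""
    for name, module in merged_model.named_modules():
        if isinstance(module, nn.Linear) and name in masks:
            module.weight.data[masks[name].to(module.weight.device)] = 0.0
```

Is this re-masking step what the paper implies, or is sparsity reported on the base weights only, with the LoRA parameters kept separate (unmerged)?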
Please let me know if you need any additional details.
Thanks in advance!