Format of dataset when fine-tuning with LoRA #239
Unanswered
coffeeBeansz
asked this question in
Q&A
Replies: 0 comments
I am trying to use the `lora.py` script for fine-tuning. I have prepared a .json file with the following format:

    {
      "images": [<image_path_1>, ...],
      "messages": [
        [
          {
            "role": "system",
            "content": [{"text": <system prompt>, "type": "text"}]
          },
          {
            "role": "user",
            "content": [
              {"image": <image_path_1>, "type": "image"},
              {"text": <question>, "type": "text"}
            ]
          },
          {
            "role": "assistant",
            "content": [{"text": <desired response>, "type": "text"}]
          }
        ],
        ...
      ]
    }

I then imported `Dataset` from `datasets` and used `Dataset.from_dict()` to convert it to a Hugging Face dataset.

My questions:

- Is this the correct format for the data?
- Should I pass the image in "messages" as well as in "images", or only in the "images" field?
- Should I pass the local path or a PIL image?
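For reference, here is a minimal sketch of how I build the structure described above. It only illustrates the layout from my question; whether `lora.py` actually expects this layout is exactly what I am unsure about. The helper names (`make_conversation`, `build_columns`, `images_match_messages`, `to_hf_dataset`) and the file paths are hypothetical, made up for illustration. The only library call used is `Dataset.from_dict()`, and it is kept inside a function so the rest runs without `datasets` installed.

```python
import json


def make_conversation(image_path, system_prompt, question, answer):
    """Build one conversation (a list of messages) in the layout from the question."""
    return [
        {"role": "system", "content": [{"text": system_prompt, "type": "text"}]},
        {
            "role": "user",
            "content": [
                {"image": image_path, "type": "image"},
                {"text": question, "type": "text"},
            ],
        },
        {"role": "assistant", "content": [{"text": answer, "type": "text"}]},
    ]


def build_columns(rows):
    """Turn (image_path, system_prompt, question, answer) tuples into two
    parallel columns: one image path and one conversation per row."""
    columns = {"images": [], "messages": []}
    for image_path, system_prompt, question, answer in rows:
        columns["images"].append(image_path)
        columns["messages"].append(
            make_conversation(image_path, system_prompt, question, answer)
        )
    return columns


def images_match_messages(columns):
    """Return True when row i of "messages" references exactly the path stored
    in row i of "images" (a sanity check for the duplicated image field)."""
    if len(columns["images"]) != len(columns["messages"]):
        return False
    for image_path, conversation in zip(columns["images"], columns["messages"]):
        referenced = [
            part["image"]
            for message in conversation
            for part in message["content"]
            if part["type"] == "image"
        ]
        if referenced != [image_path]:
            return False
    return True


def to_hf_dataset(columns):
    """Convert the columns to a Hugging Face dataset (requires `datasets`)."""
    from datasets import Dataset

    return Dataset.from_dict(columns)


# Hypothetical rows; the paths and texts are placeholders, not real data.
rows = [
    ("images/example_1.jpg", "You are a helpful assistant.", "What is shown?", "A cat."),
    ("images/example_2.jpg", "You are a helpful assistant.", "What is shown?", "A dog."),
]
columns = build_columns(rows)
serialized = json.dumps(columns, indent=2)  # what would be written to the .json file
```

With `datasets` installed, `to_hf_dataset(columns)` gives one row per conversation. Here the image is stored as a local path string in both fields; swapping in PIL images instead would be a change to `build_columns` only.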