Conversation

@farzadab
Contributor

@farzadab farzadab commented Jul 9, 2024

This PR separates the validation data_sets from the training data_sets and adds separate evaluations, in both audio and text-only modes, for the following datasets: ["heysquad_human", "anyinstruct", "soda", "peoplespeech"]

Example perplexity/loss curves:
image
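
As a rough illustration of the setup described above (not the actual code in this PR), the sketch below pairs each validation dataset with each mode so that every pair gets its own evaluation and loss curve. Only the dataset names come from the description; `EvalSpec`, `build_eval_specs`, the mode labels, and the `eval_<dataset>_<mode>` metric naming are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Dataset names are taken from the PR description; everything else is hypothetical.
VAL_DATASETS = ["heysquad_human", "anyinstruct", "soda", "peoplespeech"]
MODES = ["audio", "text"]  # "text" stands for the text-only mode


@dataclass(frozen=True)
class EvalSpec:
    """One evaluation run: a single validation dataset in a single mode."""

    dataset: str
    mode: str

    @property
    def metric_prefix(self) -> str:
        # Hypothetical naming scheme so each dataset/mode pair logs its own curve.
        return f"eval_{self.dataset}_{self.mode}"


def build_eval_specs(datasets=VAL_DATASETS, modes=MODES):
    """Return one EvalSpec per (dataset, mode) pair, independent of training data."""
    return [EvalSpec(d, m) for d, m in product(datasets, modes)]


specs = build_eval_specs()
for spec in specs:
    print(spec.metric_prefix)
```

With four datasets and two modes this yields eight separate evaluations, which keeps a regression on one dataset or mode from being averaged away by the others.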

@farzadab farzadab marked this pull request as ready for review July 17, 2024 21:54
@farzadab farzadab requested a review from juberti July 17, 2024 21:57
@farzadab
Contributor Author

farzadab commented Jul 17, 2024

For some reason I had assumed that I had merged this PR but it was just sitting there for a week!
I will add ST evals separately later on.

@farzadab farzadab requested a review from zqhuang211 July 17, 2024 21:58
@farzadab
Contributor Author

PTAL @juberti. All comments were applied.

@farzadab farzadab merged commit c3c8dd1 into main Jul 23, 2024
@farzadab farzadab deleted the farzad-more-vals branch July 23, 2024 17:07
akshat0311 pushed a commit to jiviai/audio-llm that referenced this pull request Jan 30, 2025
* add heysquad and slue-sqa5 datasets

* multi-ds evaluations

* add spanish and chinese evals

* remove chinese and spanish val sets due to hang

* "Transcribe <|audio|>" to "Transcribe\n<|audio|>"

* _get_messages helper function

* moved context len check to _get_query_prompt
zqhuang211 added a commit that referenced this pull request Feb 12, 2025
- revert max_audio_duration_secs to the default 30s in eval_config_2k.yaml
- update poetry.lock as the previous version is outdated