
Unify API-based inference for weave.py #7

Open
ksadov wants to merge 4 commits into JD-P:main from ksadov:unify_api

Conversation


ksadov commented Jun 22, 2024

Currently, weave.py contains separate functions for inference with OpenAI and vLLM. But both of these APIs, along with a number of others, follow the OpenAI v1 completions API, so supporting arbitrary backends is just a matter of making the base URL and API key configurable.
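To illustrate the idea, here is a minimal sketch of a single request helper parameterized by base URL and API key. The function and argument names are hypothetical and don't correspond to weave.py's actual helpers; it just shows that one code path can serve any OpenAI-v1-compatible endpoint.

```python
import json
import urllib.request


def build_request(prompt, model_name, api_key=None, **params):
    """Build headers and payload for an OpenAI-v1-style completions call.

    Hypothetical sketch; weave.py's real inference functions may differ.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Hosted APIs (OpenAI, Together AI) expect a bearer token;
        # a local server (vLLM, llama.cpp) typically needs none.
        headers["Authorization"] = f"Bearer {api_key}"
    payload = {"model": model_name, "prompt": prompt, **params}
    return headers, payload


def completions_request(prompt, model_name, api_base, api_key=None, **params):
    """POST to any OpenAI-v1-compatible completions endpoint."""
    headers, payload = build_request(prompt, model_name, api_key, **params)
    req = urllib.request.Request(
        api_base,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Swapping backends then only changes `api_base` and `api_key`, exactly as the new `--gen-api-base`/`--eval-api-base` flags do.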

I made some changes to weave.py inference functions and args to support this. Examples to try:

  • Business as usual: python weave.py --gen-model-name mistralai/Mistral-7B-v0.1 --eval-model-name jdpressman/minihf_evaluator_mistral_7b_v0.1
  • Generating with a local model, evaluating with an OpenAI model: python weave.py --gen-model-name openai-community/gpt2 --eval-model-name davinci-002 --eval-api-base https://api.openai.com/v1/completions --eval-api-key $OPENAI_API_KEY
    • you'll need to comment out repetition_penalty and top_k in the evaluate_outputs_api payload to get it to work
  • Generating with a model hosted on Together AI, evaluating with a model running on a local llama.cpp server: python weave.py --gen-model-name mistralai/Mistral-7B-v0.1 --gen-api-base https://api.together.xyz/v1/completions --gen-api-key $TOGETHERAI_API_KEY --eval-model-name Meta-Llama-3-8B-Q4_5_M --eval-api-base http://localhost:5000/v1/completions
    • you'll need to comment out seed in the generate_outputs_api payload to get it to work

It's not ideal that getting certain APIs to work requires commenting out code, but I'd rather handle that via config files than by tacking on more command-line args, and that seems like a job for a separate PR.
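The config-file approach above could replace the hand-commenting with a per-backend drop-list. A rough sketch (the table below only encodes the two incompatibilities mentioned in this PR, `repetition_penalty`/`top_k` for OpenAI and `seed` for llama.cpp; the names and structure are assumptions, not part of this PR's code):

```python
# Sampler parameters each backend is known to reject, from the examples above.
# Hypothetical structure; in practice this would live in a config file.
UNSUPPORTED_PARAMS = {
    "openai": {"repetition_penalty", "top_k"},
    "llama.cpp": {"seed"},
}


def filter_payload(payload, backend):
    """Drop payload keys the target backend rejects, instead of
    commenting them out by hand."""
    drop = UNSUPPORTED_PARAMS.get(backend, set())
    return {k: v for k, v in payload.items() if k not in drop}
```

With something like this, `evaluate_outputs_api` and `generate_outputs_api` could build one full payload and filter it per endpoint.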

ksadov mentioned this pull request Jun 24, 2024