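For a ten-character input on a GPU (T4), a single request takes a little over 1 second. With two concurrent requests the average response time rises to 2 to 3 seconds, and with ten concurrent requests it exceeds 9 seconds, yet GPU utilization only reaches 40%. Is there any way to improve concurrency?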
Same question here.
Use vLLM (experimental) or run multiple processes.
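One possible reading of "run multiple processes" is to start several independent copies of the inference server, each on its own port, and spread incoming requests across them in round-robin order. The sketch below only illustrates that idea. The script name `serve.py`, the `--port` flag, and the port numbers are hypothetical placeholders, not this project's actual entry point or options, so substitute whatever launch command you really use.

```python
import itertools
import subprocess
import sys
from typing import Iterator, List


def build_worker_commands(server_script: str, base_port: int, num_workers: int) -> List[List[str]]:
    """Build one launch command per worker, each on its own port.

    ASSUMPTION: the server script accepts a `--port` flag. This flag is
    hypothetical and must be replaced with the project's real option.
    """
    return [
        [sys.executable, server_script, "--port", str(base_port + i)]
        for i in range(num_workers)
    ]


def start_workers(commands: List[List[str]]) -> List[subprocess.Popen]:
    """Start each worker as a separate OS process (each loads its own model copy)."""
    return [subprocess.Popen(cmd) for cmd in commands]


def round_robin_urls(base_port: int, num_workers: int, host: str = "127.0.0.1") -> Iterator[str]:
    """Return an endless iterator that cycles through the worker URLs in order."""
    urls = [f"http://{host}:{base_port + i}" for i in range(num_workers)]
    return itertools.cycle(urls)


if __name__ == "__main__":
    # Hypothetical usage: three workers on ports 9000-9002.
    commands = build_worker_commands("serve.py", 9000, 3)
    workers = start_workers(commands)
    targets = round_robin_urls(9000, 3)
    print(next(targets))  # send the first request here, the next one to the following URL
```

Each process holds its own copy of the model, so the number of workers that fit on one T4 is bounded by GPU memory. Whether this actually raises throughput depends on where the bottleneck is, so it is worth measuring with two workers before adding more.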