Issues: sgl-project/sglang
Issues list
[Feature] Support Qwen2_5...etc tools calling by OpenAI API
#1912
opened Nov 4, 2024 by
CedricHwong
1 of 2 tasks
Expose max_total_num_tokens for Token Limit Calculation in Request Handling
#1900
opened Nov 3, 2024 by
hahmad2008
1 of 5 tasks
[Bug] Offline engine performance is not better than local server when running batch
#1872
opened Nov 1, 2024 by
jischein
5 tasks done
Question: Does sglang support prefix cache for multimodal models?
#1870
opened Nov 1, 2024 by
htrekker
TP8 scheduling overhead is very high for small model, Llama 3 8B
#1857
opened Oct 31, 2024 by
hliuca
5 tasks done
Questions Regarding sglang vs vllm and Memory Management
#1828
opened Oct 28, 2024 by
hahmad2008
1 of 5 tasks
[Feature] Support QLoRA weights
Label: enhancement (New feature or request)
#1826
opened Oct 28, 2024 by
zzh-www
[Bug] Got error with awq_marlin quantization args.
#1792
opened Oct 25, 2024 by
liangzelang
5 tasks done
[Feature] Request to 8-bit Quantization of Attention with SageAttention
Label: good first issue (Good for newcomers)
#1763
opened Oct 23, 2024 by
Snowdar
2 tasks done
[Bug][minimal reproducible demo] High variability across batch inference runs
Label: bug (Something isn't working)
#1729
opened Oct 20, 2024 by
FredericOdermatt
5 tasks done
[Feature] Cascade attention kernels
Label: good first issue (Good for newcomers)
#1715
opened Oct 19, 2024 by
merrymercy
[Bug] crash about c10d::ProcessGroupNCCL::WorkNCCL::checkTimeout
#1693
opened Oct 17, 2024 by
zeng-zc
5 tasks
[Feature] When will a version of S-Lora be available?
#1668
opened Oct 14, 2024 by
kunkunzhang123
2 tasks done
[Feature] Using frontend APIs but passing a list of prompts in run rather than run_batch
#1624
opened Oct 10, 2024 by
pengye91
2 tasks done