sgl-project/sglang issues and pull requests

#1470 - Release v0.3.1.post2

Pull Request - State: closed - Opened by merrymercy about 14 hours ago

#1469 - Fix padding in the cuda graph

Pull Request - State: closed - Opened by merrymercy about 15 hours ago

#1467 - [Bug] illegal memory access encountered

Issue - State: open - Opened by wonderisland about 15 hours ago

#1466 - [Bug] enable-mixed-chunk may cause the regex request get wrong result and output_token_logprobs

Issue - State: open - Opened by liuteng about 15 hours ago

#1465 - Debug

Pull Request - State: open - Opened by hnyls2002 about 15 hours ago

#1464 - fix: creat new dict everytime for putting new frame

Pull Request - State: closed - Opened by Luodian about 17 hours ago

#1463 - [Bug] oom,torch.OutOfMemoryError: seems to only use one gpu on A800-80G,available 40g on each card

Issue - State: open - Opened by chuangzhidan about 18 hours ago - 3 comments

#1461 - Metrics

Pull Request - State: open - Opened by binarycrayon about 20 hours ago - 1 comment

#1460 - [Question]Why is the default value of max_prefill_tokens 16384?

Issue - State: closed - Opened by wjj19950828 about 20 hours ago

#1459 - [WIP] Support double sparsity

Pull Request - State: open - Opened by andy-yang-1 1 day ago - 1 comment

#1458 - [Event] Add public meeting invite to README

Pull Request - State: closed - Opened by Ying1123 1 day ago

#1457 - Fuse top_k and top_k in the sampler

Pull Request - State: closed - Opened by merrymercy 1 day ago

#1456 - Pr fix max workers

Pull Request - State: open - Opened by wellhowtosay 1 day ago

#1455 - [Bug] OOM when runing `bench_serving` with DeepSeekCoder-V2-Lite.

Issue - State: closed - Opened by zh-zheng 1 day ago - 2 comments

#1454 - Fix oom issues with fp8 for llama

Pull Request - State: closed - Opened by merrymercy 1 day ago

#1453 - [Bugfix] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1419)

Pull Request - State: closed - Opened by HaiShaw 1 day ago

#1452 - Add bench_server_latency.py

Pull Request - State: closed - Opened by merrymercy 1 day ago

#1451 - Fix schedule bug

Pull Request - State: closed - Opened by hnyls2002 1 day ago

#1450 - fix schedule bug

Pull Request - State: closed - Opened by hnyls2002 2 days ago

#1449 - Fixed n>1 causing list index out of range with VLM

Pull Request - State: closed - Opened by jasonyux 2 days ago - 2 comments

#1448 - Fix attention backend

Pull Request - State: closed - Opened by ispobock 2 days ago

#1447 - Enable MLA by default

Pull Request - State: closed - Opened by ispobock 2 days ago

#1446 - [Bug] Performance issue on MoE with torch.compile

Issue - State: open - Opened by ispobock 3 days ago
Labels: performance

#1445 - Release 0.3.1.post1

Pull Request - State: closed - Opened by merrymercy 3 days ago

#1444 - Add OLMoE model

Pull Request - State: closed - Opened by janimo 3 days ago

#1443 - [Bug] The latest Sglang docker image cannot start online services

Issue - State: closed - Opened by CedricHwong 3 days ago - 2 comments

#1442 - Fix torch compile for deepseek-v2

Pull Request - State: closed - Opened by ispobock 3 days ago - 6 comments

#1441 - Simplify sampler and its error handling

Pull Request - State: closed - Opened by merrymercy 3 days ago

#1440 - Clean up model loader

Pull Request - State: closed - Opened by merrymercy 3 days ago

#1439 - [Bug] Llama 405B FP8 causes OOM on 16xA40

Issue - State: open - Opened by sumukshashidhar 3 days ago

#1438 - Add constrained_json_whitespace_pattern to ServerArgs

Pull Request - State: closed - Opened by zifeitong 3 days ago

#1436 - [Feature] Add initial support for sequence parallelism

Pull Request - State: open - Opened by Ying1123 4 days ago

#1435 - [Feature] Expert parallelism support

Issue - State: open - Opened by chongli-uw 4 days ago
Labels: enhancement

#1434 - [Bug] Nonsense and slow output under high concurrency

Issue - State: open - Opened by tongyx361 4 days ago - 1 comment

#1433 - [Feature] Support LoRA path renaming and add LoRA serving benchmarks

Pull Request - State: closed - Opened by Ying1123 4 days ago

#1432 - Revert "[Minor] Raise exception for wrong import (#1409)"

Pull Request - State: closed - Opened by Ying1123 4 days ago

#1431 - Remove deprecated configs

Pull Request - State: closed - Opened by merrymercy 4 days ago

#1430 - Release v0.3.1

Pull Request - State: closed - Opened by merrymercy 4 days ago

#1429 - Update backend.md

Pull Request - State: closed - Opened by merrymercy 5 days ago

#1428 - [Fix] Fix logprob and normalized_logprob

Pull Request - State: closed - Opened by merrymercy 5 days ago

#1427 - Add libibverbs-dev to Dockerfile

Pull Request - State: closed - Opened by Aphoh 5 days ago

#1426 - fix: resolve nightly eval

Pull Request - State: closed - Opened by zhyncs 5 days ago

#1425 - Add pytorch sampling backend ut

Pull Request - State: closed - Opened by ispobock 5 days ago

#1424 - [Bug] missing max_workers param when initiate ProcessPoolExecutor

Issue - State: open - Opened by wellhowtosay 6 days ago - 2 comments

#1423 - [Bug] MLA models can't use enable-torch-compile. Can be fix by suppressing errors.

Issue - State: closed - Opened by Achazwl 6 days ago - 1 comment

#1422 - Enable torch.compile for triton backend

Pull Request - State: closed - Opened by merrymercy 6 days ago - 1 comment

#1421 - [Bug] deepseek-v2 fp8 cuda graph errror

Issue - State: closed - Opened by fengyang95 6 days ago - 5 comments

#1420 - [Feature, Hardware] Enable SGLang on AMD GPUs via PyTorch for ROCm

Pull Request - State: closed - Opened by HaiShaw 6 days ago

#1419 - [Feature] Support AMD GPU via PyTorch for ROCm

Issue - State: open - Opened by HaiShaw 6 days ago
Labels: enhancement

#1418 - Add torchao quant for mixtral and qwen_moe

Pull Request - State: closed - Opened by jerryzh168 6 days ago
Labels: quant

#1417 - fallback to round robin scheduler

Pull Request - State: open - Opened by qeternity 6 days ago - 5 comments

#1416 - [Bug] AttributeError: 'MiniCPM3ForCausalLM' object has no attribute 'get_module_name'

Issue - State: open - Opened by lixiangtiandashen 6 days ago - 2 comments

#1415 - [Bug] Issue with batch API

Issue - State: open - Opened by dmakhervaks 6 days ago - 3 comments

#1414 - ci: fix finish

Pull Request - State: closed - Opened by zhyncs 6 days ago - 1 comment

#1413 - [Bug] triton attention-backend bug

Issue - State: open - Opened by 81549361 6 days ago

#1412 - Update pr-test.yml

Pull Request - State: closed - Opened by merrymercy 7 days ago

#1411 - Balance test in CI

Pull Request - State: closed - Opened by merrymercy 7 days ago

#1410 - Update pr-test.yml

Pull Request - State: closed - Opened by merrymercy 7 days ago

#1409 - [Minor] Raise exception for wrong import

Pull Request - State: closed - Opened by Ying1123 7 days ago - 2 comments

#1408 - [CI] Include triton backend and online serving benchmark into CI

Pull Request - State: closed - Opened by merrymercy 7 days ago

#1407 - Make stop reason a dict instead of str

Pull Request - State: closed - Opened by merrymercy 7 days ago - 1 comment

#1406 - [Minor, CI] remove lora test from minimal suite

Pull Request - State: closed - Opened by Ying1123 7 days ago

#1405 - [Bug] RuntimeError: Failed to allocate memory for batch_prefill_tmp_v with size 458752000 and alignment 16 in AlignedAllocator

Issue - State: open - Opened by josephydu 7 days ago - 3 comments

#1404 - [Bug] ImportError : cannot import name 'gemma_fused_add_rmsnorm' from 'flashinfer.norm'

Issue - State: closed - Opened by luo647 8 days ago - 2 comments

#1404 - [Bug] ImportError : cannot import name 'gemma_fused_add_rmsnorm' from 'flashinfer.norm'

Issue - State: closed - Opened by luo647 8 days ago - 1 comment

#1403 - kernel: use tensor cores for flashinfer gqa kernels

Pull Request - State: closed - Opened by yzh119 8 days ago - 1 comment

#1402 - [Minor Fix] Fix llava modalities issue for single-image

Pull Request - State: closed - Opened by kcz358 8 days ago - 1 comment

#1401 - Support cuda graph in the triton attention backend

Pull Request - State: closed - Opened by merrymercy 8 days ago - 4 comments

#1401 - Support cuda graph in the triton attention backend

Pull Request - State: closed - Opened by merrymercy 8 days ago - 5 comments

#1400 - [Bug] LLaVA performance inconsistent with the result

Issue - State: closed - Opened by kcz358 8 days ago - 1 comment

#1399 - Fix README format

Pull Request - State: closed - Opened by Achazwl 8 days ago

#1398 - [Bug] This modeling file requires the following packages that were not found in your environment: datamodel_code_generator. Run `pip install datamodel_code_generator`

Issue - State: open - Opened by cicicji 8 days ago

#1397 - Add Support for XVERSE Models (Dense and MoE) to sglang

Pull Request - State: closed - Opened by hxer7963 8 days ago

#1396 - [Feature] support awq of deepseek-v2 or deepseek-v2.5

Issue - State: open - Opened by tutu329 8 days ago

#1395 - [Feature] need DeepSeek-v2 or deepseek-v2.5 awq support

Issue - State: closed - Opened by tutu329 8 days ago

#1394 - Remove synchronization in cuda graph replay

Pull Request - State: closed - Opened by hnyls2002 8 days ago

#1393 - Add no commit to main rule

Pull Request - State: closed - Opened by hnyls2002 8 days ago - 1 comment

#1392 - Optimize conflicts between CUDA graph and vocab mask tensors

Pull Request - State: closed - Opened by hnyls2002 8 days ago

#1391 - [Bug] 'LlamaTokenizerFast' object has no attribute 'tokenizer'

Issue - State: open - Opened by zwc163 8 days ago

#1391 - [Bug] 'LlamaTokenizerFast' object has no attribute 'tokenizer'

Issue - State: closed - Opened by zwc163 8 days ago

#1390 - Improve error reporting during server launch

Pull Request - State: closed - Opened by merrymercy 8 days ago

#1389 - [Fix] Fix --disable-flashinfer

Pull Request - State: closed - Opened by merrymercy 8 days ago - 1 comment

#1388 - [Feature] Support torch profiler

Issue - State: open - Opened by danielhua23 9 days ago - 1 comment
Labels: good first issue

#1387 - [Feature] Can centos7 use this project?

Issue - State: open - Opened by luo647 9 days ago - 1 comment

#1387 - [Feature] Can centos7 use this project?

Issue - State: closed - Opened by luo647 9 days ago - 3 comments

#1386 - [Bug] requests.exceptions.JSONDecodeError:

Issue - State: closed - Opened by eyuansu62 9 days ago - 6 comments

#1385 - remove assertion in triton attention and add an unit test

Pull Request - State: closed - Opened by ByronHsu 9 days ago

#1384 - [Feature] Support RM API

Issue - State: open - Opened by UbeCc 9 days ago - 1 comment

#1383 - Rewrite mixed chunked prefill

Pull Request - State: open - Opened by hnyls2002 9 days ago
