GitHub / vllm-project/llm-compressor issues and pull requests
#1648 - [Transform] QuIP Modifier
Pull Request -
State: open - Opened by kylesayrs 15 days ago
#1646 - [Bugfix] Untie word embeddings
Pull Request -
State: open - Opened by kylesayrs 15 days ago
#1645 - Alternate moe calib
Pull Request -
State: closed - Opened by dsikka 15 days ago
#1637 - [Transform] Norm fusing utilities
Pull Request -
State: closed - Opened by kylesayrs 21 days ago
- 1 comment
Labels: ready
#1456 - [deepseek-v2-lite-int8] RuntimeError: Unsupported FusedMoe scheme: num_bits=8 type='int'
Issue -
State: open - Opened by mZhenz 2 months ago
- 2 comments
Labels: bug
#1455 - Fix: Improve `SmoothQuant` Support for Mixture of Experts (MoE) Models
Pull Request -
State: open - Opened by rahul-tuli 2 months ago
- 2 comments
#1454 - Disable kernels during calibration (and tracing)
Pull Request -
State: open - Opened by kylesayrs 2 months ago
- 1 comment
Labels: ready
#1453 - [GPTQ] Fix actorder resolution, add sentinel
Pull Request -
State: open - Opened by kylesayrs 2 months ago
- 1 comment
Labels: ready
#1452 - [Tracing] Fix Traceable Imports
Pull Request -
State: closed - Opened by kylesayrs 2 months ago
- 1 comment
Labels: ready
#1451 - AWQ Apply Scales Bugfix when smooth layer output length doesn't match balance layer input length
Pull Request -
State: open - Opened by brian-dellabetta 2 months ago
- 1 comment
Labels: ready
#1450 - [Observer] Optimize mse observer
Pull Request -
State: open - Opened by shanjiaz 2 months ago
- 2 comments
Labels: ready
#1449 - [Tests] Use proper offloading utils in `test_compress_tensor_utils`
Pull Request -
State: closed - Opened by kylesayrs 2 months ago
- 1 comment
Labels: ready
#1448 - [Bugfix][Tracing] Fix qwen2_5_vl
Pull Request -
State: closed - Opened by kylesayrs 2 months ago
- 1 comment
Labels: ready
#1447 - bge-reranker-v2-m3 support
Issue -
State: open - Opened by bisunny 2 months ago
- 1 comment
Labels: enhancement
#1446 - Fix missing logs when calling oneshot
Pull Request -
State: open - Opened by kelkelcheng 2 months ago
- 3 comments
Labels: ready
#1445 - oneshot entrypoint update
Pull Request -
State: open - Opened by ved1beta 2 months ago
- 2 comments
Labels: ready
#1444 - AWQModifier fast resolve mappings, better logging
Pull Request -
State: open - Opened by brian-dellabetta 2 months ago
- 1 comment
#1443 - Update `oneshot` to use an explicit keyword argument instead of using `**kwargs`
Issue -
State: open - Opened by dsikka 2 months ago
Labels: enhancement, good first issue
#1442 - Add Additional Model Mappings for `AWQ` and `SmoothQuant`
Issue -
State: open - Opened by dsikka 2 months ago
- 1 comment
Labels: enhancement, good first issue
#1441 - Remove `sparse_logs` folder
Issue -
State: open - Opened by dsikka 2 months ago
Labels: bug, enhancement, good first issue
#1440 - AWQ Qwen and Phi mappings
Pull Request -
State: open - Opened by brian-dellabetta 2 months ago
- 1 comment
Labels: ready
#1439 - patch awq tests/readme after QuantizationMixin refactor
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: ready
#1438 - [Research] Llama4 AutoWrapper + Onloading
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 2 comments
#1437 - [NVFp4] Activation Support
Pull Request -
State: open - Opened by dsikka 3 months ago
#1436 - Initial implementation for the docs site and setup for LLM Compressor
Pull Request -
State: open - Opened by markurtz 3 months ago
- 1 comment
#1435 - [WIP][AWQ] Support accumulation for reduced memory usage
Pull Request -
State: open - Opened by kylesayrs 3 months ago
#1434 - Added more tests for Quantization24SparseW4A16
Pull Request -
State: closed - Opened by shanjiaz 3 months ago
- 1 comment
Labels: ready
#1433 - Add: deepseekv2 smoothquant mappings
Pull Request -
State: closed - Opened by rahul-tuli 3 months ago
- 1 comment
Labels: ready
#1432 - NotImplementedError: Cannot copy out of meta tensor; no data! when trying to run AWQ
Issue -
State: closed - Opened by shaibal13 3 months ago
- 3 comments
#1431 - [Logging] Support logging once
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1430 - How to quant a model which a layer module is only `nn.Parameter`?
Issue -
State: open - Opened by shuxiaobo 3 months ago
- 1 comment
#1429 - Remove RecipeArgs class & its references
Pull Request -
State: closed - Opened by shanjiaz 3 months ago
- 3 comments
Labels: ready
#1428 - [gemma3] Properly specifying which targets to ignore
Issue -
State: open - Opened by Foreist 3 months ago
- 1 comment
Labels: bug
#1427 - NotImplementedError: No compressed-tensors compatible scheme was found
Issue -
State: closed - Opened by BigFaceBoy 3 months ago
- 8 comments
Labels: bug
#1426 - AWQ QuantizationMixin + SequentialPipeline
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: ready
#1425 - [GPTQ] Change actorder default to "static"
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 3 comments
#1424 - [GPTQ] Add `actorder` option to modifier
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1423 - [Tracing] Reinstate ignore functionality
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1422 - Integrating HQQ (Half-Quadratic Quantization)?
Issue -
State: closed - Opened by learning-chip 3 months ago
- 1 comment
Labels: enhancement
#1421 - Question on unstable pruning result using SparseGPT method
Issue -
State: open - Opened by zjnyly 3 months ago
#1420 - [Typo] overriden
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1419 - Use model compression pathways
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1418 - Quant Qwen2.5-VL with llmcompressor=0.5.1 & Transformers 4.51.3, then infer with vLLM 0.8.1 got error
Issue -
State: open - Opened by YangYang-DLUT 3 months ago
- 2 comments
#1417 - Add `pull_request` trigger to base tests workflow
Pull Request -
State: closed - Opened by dbarbuzzi 3 months ago
- 1 comment
Labels: ready
#1416 - Rename SparsityModifierMixin to SparsityModifierBase
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 3 comments
Labels: ready
#1415 - [AWQ] Gemma 3: ValueError: too many values to unpack (expected 4)
Issue -
State: open - Opened by ignaceHelsen 3 months ago
- 2 comments
#1414 - removing RecipeMetadata and references
Pull Request -
State: closed - Opened by shanjiaz 3 months ago
- 2 comments
Labels: ready
#1413 - Adding a readthedocs docs build for llm-compressor
Pull Request -
State: open - Opened by aireilly 3 months ago
- 6 comments
Labels: ready
#1412 - [Examples] Standardize AWQ example
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 3 comments
Labels: ready
#1411 - [WIP][Tracing] Code AutoWrapper
Pull Request -
State: open - Opened by kylesayrs 3 months ago
#1410 - [Feature] Log/info/Save/Restore quantization steps
Issue -
State: open - Opened by mratsim 3 months ago
Labels: enhancement
#1409 - [AWQ] Insane memory requirement: over 900GB for 32B model
Issue -
State: closed - Opened by mratsim 3 months ago
- 1 comment
Labels: bug
#1408 - Add new-features section
Pull Request -
State: closed - Opened by rahul-tuli 3 months ago
- 2 comments
Labels: ready
#1407 - validation check added
Pull Request -
State: open - Opened by ved1beta 3 months ago
- 5 comments
#1406 - AWQ Qwen3-235B-A22B and Qwen3-30B-A3B
Issue -
State: open - Opened by ehartford 3 months ago
- 12 comments
Labels: bug
#1405 - AWQ sanitize_kwargs minor cleanup
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: ready
#1404 - use "qwen_2_5_vl_example.py" quant Qwen2.5-VL-7B-Instruct, got error with "SAMPLE GENERATION"
Issue -
State: closed - Opened by YangYang-DLUT 3 months ago
- 1 comment
#1403 - Error when computing device_map for Mistral-small-3.1-24B-Instruct-2503
Issue -
State: open - Opened by VAmblardPEReN 3 months ago
Labels: bug
#1402 - [VLM] Fix mllama targets
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1401 - Add warning for non-divisible group quantization
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1400 - [WIP][Testing] Add VL e2e tests
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
#1399 - [VLM] Add Gemma3 Example
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 2 comments
#1398 - Consolidate build config
Pull Request -
State: closed - Opened by dbarbuzzi 3 months ago
- 1 comment
Labels: ready
#1397 - Exclude images from package
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1396 - Drop `flash_attn` skip for quantizing_moe example tests
Pull Request -
State: closed - Opened by dbarbuzzi 3 months ago
- 1 comment
Labels: ready
#1395 - awq -- hotfix to missing kwargs
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: ready
#1394 - For multimodal models, such as QwenVL2.5, is the SmoothQuantModifier necessary when performing W8A8 quantization?
Issue -
State: open - Opened by weirdo2310 3 months ago
- 3 comments
#1393 - For FP8 Fused MoE layers, only per-tensor scales for weights and activations are supported?
Issue -
State: open - Opened by shuxiaobo 3 months ago
- 1 comment
Labels: bug
#1392 - [Tracing] Trace with eager attention
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
#1391 - [Lifecycle] Initialize only once, trigger on_start for each pipeline
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
#1390 - [Tracing] Autowrap methods by name
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
#1389 - [Tracing] Skip non-ancestors of sequential targets
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 2 comments
Labels: ready
#1388 - [Tracing] Raise `_is_compiling_flag` while tracing
Pull Request -
State: open - Opened by kylesayrs 3 months ago
- 1 comment
Labels: ready
#1387 - [WIP][Tracing] Mistral3ForConditionalGeneration
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 1 comment
#1386 - AWQ: Clean up forward passes with kwargs using inspect.bind
Pull Request -
State: closed - Opened by ved1beta 3 months ago
- 2 comments
#1385 - AWQ -- Clean up forward passes with kwargs using `inspect.bind`
Issue -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: enhancement, good first issue
#1384 - bugfix AWQ with Llama models and python 3.9
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 1 comment
Labels: ready
#1383 - Load the model to CPU but quantize using the GPU
Issue -
State: open - Opened by sgsdxzy 3 months ago
- 1 comment
Labels: enhancement
#1382 - Is there any way quant model on multi nodes?
Issue -
State: open - Opened by shuxiaobo 3 months ago
- 1 comment
Labels: enhancement
#1381 - Bump version; set ct version
Pull Request -
State: closed - Opened by dsikka 3 months ago
- 1 comment
#1380 - Update w4a16_actorder_weight.yaml lmeval config
Pull Request -
State: closed - Opened by dbarbuzzi 3 months ago
- 1 comment
Labels: ready
#1379 - [WIP] Recipe `model_dump` fixes
Pull Request -
State: open - Opened by rahul-tuli 3 months ago
#1378 - Revert "fix: Make Recipe.model_dump() output compatible ....
Pull Request -
State: closed - Opened by rahul-tuli 3 months ago
- 1 comment
Labels: ready
#1377 - Add: documentation for enhanced `save_pretrained` parameters
Pull Request -
State: closed - Opened by rahul-tuli 3 months ago
- 1 comment
#1376 - Enhance save_pretrained
Pull Request -
State: open - Opened by rahul-tuli 3 months ago
- 1 comment
#1375 - [Tests] Fix test case; update structure
Pull Request -
State: closed - Opened by dsikka 3 months ago
- 2 comments
Labels: ready
#1374 - [WIP] Add AWQ Asym e2e test case
Pull Request -
State: closed - Opened by dsikka 3 months ago
- 1 comment
Labels: ready
#1373 - [Tracing] Support tracing of Gemma3 [#1248]
Pull Request -
State: closed - Opened by kelkelcheng 3 months ago
- 7 comments
Labels: ready
#1372 - AWQ resolved mappings -- ensure shapes align
Pull Request -
State: closed - Opened by brian-dellabetta 3 months ago
- 11 comments
Labels: ready
#1371 - [Tests] Disable silently failing kv cache test
Pull Request -
State: closed - Opened by kylesayrs 3 months ago
- 3 comments
Labels: ready
#1369 - OOM (host) when running AWQ
Issue -
State: closed - Opened by zjnyly 3 months ago
- 2 comments
Labels: bug
#1368 - How to run AWQ-W4Afp8 quantization?
Issue -
State: open - Opened by wanzhenchn 3 months ago
- 2 comments
#1363 - Update: transformers support to latest
Pull Request -
State: closed - Opened by rahul-tuli 3 months ago
- 2 comments
#1359 - [Experimental] Mistral-format FP8 quantization
Pull Request -
State: open - Opened by mgoin 3 months ago
- 1 comment
#1358 - Running vllm after `oneshot` causes rerun of `oneshot`
Issue -
State: closed - Opened by brian-dellabetta 3 months ago
- 3 comments
Labels: bug
#1355 - How can i quant a model using fp8 blockwise quant just like deepseekv3
Issue -
State: closed - Opened by WhatGhost 4 months ago
- 13 comments
#1351 - Implement `QuantizationMixin`
Pull Request -
State: closed - Opened by kylesayrs 4 months ago
- 4 comments
Labels: ready
#1350 - qat question?
Issue -
State: closed - Opened by coolKeen 4 months ago
- 2 comments
#1349 - [Gemma3] The decoded token_ids are all [0,0,...,] after GPTQ quantization
Issue -
State: closed - Opened by Caleb66666 4 months ago
- 10 comments
#1348 - Add torch device to list of offloadable types
Pull Request -
State: closed - Opened by kylesayrs 4 months ago
- 1 comment
Labels: ready