Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / PanQiWei/AutoGPTQ issues and pull requests
#412 - [BUG] Wrong generations in batch mode with exllamav2
Issue -
State: closed - Opened by gingsi about 1 year ago
- 5 comments
Labels: bug
#411 - Fix Windows (no Triton) and CPU-only support
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
- 1 comment
#410 - Improve message about buffer size in exllama v1 backend
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#409 - Fix CPU inference
Pull Request -
State: closed - Opened by yangw1234 about 1 year ago
- 2 comments
#408 - Fix quantize method with None mask
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#407 - Fix Windows support
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#406 - Using Exllama backend requires all the modules to be on GPU - how?
Issue -
State: closed - Opened by tigerinus about 1 year ago
- 5 comments
Labels: bug
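The exllama kernels require every quantized module to sit on the GPU, so any CPU offload (for example from device_map="auto") triggers this error. A minimal sketch of the two usual ways around it, assuming a hypothetical local checkpoint path:

```python
from auto_gptq import AutoGPTQForCausalLM

# Option 1: load everything onto a single GPU so the exllama kernels can run.
model = AutoGPTQForCausalLM.from_quantized(
    "path/to/gptq-model",  # hypothetical checkpoint directory
    device="cuda:0",
    use_safetensors=True,
)

# Option 2: if layers must spill to CPU, disable the exllama backend instead.
model = AutoGPTQForCausalLM.from_quantized(
    "path/to/gptq-model",
    device_map="auto",
    disable_exllama=True,
)
```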
#405 - Allow specifying GPU used for quantisation, overriding hardcoded cuda:0
Pull Request -
State: open - Opened by TheBloke about 1 year ago
- 1 comment
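Until a GPU argument like this lands, the common workaround for the hardcoded cuda:0 is to remap the visible devices before CUDA initializes:

```python
# Workaround sketch: expose physical GPU 1 to the process as cuda:0.
# This must run before torch or auto_gptq touch CUDA.
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
```

The same effect is available from the shell, e.g. CUDA_VISIBLE_DEVICES=1 python quantize_script.py (script name hypothetical).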
#404 - [BUG] After upgrading to the latest version, a new error appeared
Issue -
State: closed - Opened by ParisNeo about 1 year ago
- 4 comments
Labels: bug
#403 - [BUG] ImportError: DLL load failed while importing exllama_kernels: The specified module could not be found.
Issue -
State: closed - Opened by Mradr about 1 year ago
- 3 comments
Labels: bug
#402 - Issue when loading autogptq - CUDA extension not installed and exllama_kernels not installed
Issue -
State: closed - Opened by ditchtech about 1 year ago
- 12 comments
Labels: bug
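Both warnings typically mean the compiled kernels were skipped at install time, usually because the torch build seen during installation lacked CUDA or its CUDA version mismatched the local toolkit. A quick diagnostic sketch:

```python
# Check that a CUDA-enabled torch is installed; the AutoGPTQ kernels only
# build (or the matching wheel only applies) when torch's CUDA version
# matches the local toolkit.
import torch

print("torch:", torch.__version__)
print("torch CUDA version:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())

import auto_gptq
print("auto-gptq:", auto_gptq.__version__)
```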
#401 - With transformers 4.35.0 Flash Attention 2 breaks quantization, with exception `AttributeError: 'NoneType' object has no attribute 'to'` on ` attention_masks.append(kwargs["attention_mask"].to(self.data_device))`
Issue -
State: closed - Opened by TheBloke about 1 year ago
- 3 comments
Labels: bug
#400 - [BUG] libcudart.so.12 issues with latest v0.5.0
Issue -
State: open - Opened by winglian about 1 year ago
- 3 comments
Labels: bug
#399 - Problems with cQIGen on Windows
Issue -
State: closed - Opened by Shroedinger about 1 year ago
- 16 comments
#398 - [BUG] Importing `AutoGPTQForCausalLM` on Colab causes `ImportError`
Issue -
State: closed - Opened by yumemio about 1 year ago
- 5 comments
Labels: bug
#397 - Update README and version following 0.5.0 release
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#385 - Add fix for CPU Inference
Pull Request -
State: open - Opened by vivekkhandelwal1 about 1 year ago
- 1 comment
#384 - Pin to accelerate>=0.22
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#383 - Allow using a model with basename `model`, use_safetensors defaults to True
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#382 - Improve ROCm support
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
- 2 comments
#381 - [BUG] setup.py fails if gekko, pandas, numpy are not installed
Issue -
State: closed - Opened by fxmarty about 1 year ago
- 1 comment
Labels: bug
#380 - [FEATURE] Support fuyu-8b quantization
Issue -
State: open - Opened by xunfeng1980 about 1 year ago
Labels: enhancement, chinese
#379 - Fix QiGen kernel generation
Pull Request -
State: closed - Opened by fxmarty about 1 year ago
#378 - The data used for quantization lacks an eos token at the end
Issue -
State: open - Opened by pipixia244 about 1 year ago
Labels: chinese
#377 - [BUG] After quantizing Qwen-Chat-14B to int4 with autogptq, temperatures <= 0.5 raise an error
Issue -
State: closed - Opened by xunfeng1980 about 1 year ago
- 3 comments
Labels: bug, chinese
#376 - [`core` / `QLinear`] Support CPU inference
Pull Request -
State: open - Opened by younesbelkada about 1 year ago
- 9 comments
#375 - auto_gptq.nn_modules.qlinear.qlinear_cuda:CUDA extension not installed.
Issue -
State: closed - Opened by ParisNeo about 1 year ago
- 17 comments
Labels: bug
#374 - Unrecognized tensor type ID: Autocast CUDA [BUG]
Issue -
State: closed - Opened by Andrew011002 about 1 year ago
- 14 comments
Labels: bug
#373 - [BUG] Missing source distribution in pypi for version 0.4.2
Issue -
State: closed - Opened by levkk about 1 year ago
- 4 comments
Labels: bug
#372 - Error quantizing baichuan2-13b
Issue -
State: open - Opened by yijinsheng about 1 year ago
- 1 comment
#371 - How to generate pytorch_model.bin.index.json or model.safetensors.index.json
Issue -
State: open - Opened by Fraudsterrrr about 1 year ago
Labels: chinese
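Those index files are not written by hand; they appear automatically when a model is saved in shards. A sketch using transformers' save_pretrained, with hypothetical paths: any save whose max_shard_size forces multiple shards also writes model.safetensors.index.json (or pytorch_model.bin.index.json without safetensors):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("path/to/model")  # hypothetical

# Splitting into shards also emits the index file that maps each tensor
# to the shard containing it.
model.save_pretrained(
    "path/to/output",
    safe_serialization=True,   # model-*.safetensors + model.safetensors.index.json
    max_shard_size="2GB",
)
```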
#370 - [BUG] RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Issue -
State: closed - Opened by dlutsniper about 1 year ago
- 15 comments
Labels: bug
#369 - Building for ROCm 5.7.1
Issue -
State: closed - Opened by letsdothis-roguethink about 1 year ago
- 4 comments
#368 - The QLinear output differs slightly on each inference
Issue -
State: closed - Opened by LightDXY about 1 year ago
- 1 comment
#367 - Modify qlinear_cuda for tracing the GPTQ model
Pull Request -
State: closed - Opened by vivekkhandelwal1 about 1 year ago
- 8 comments
#366 - [BUG] Mac support
Issue -
State: closed - Opened by Candouber about 1 year ago
- 2 comments
Labels: bug
#365 - [BUG]
Issue -
State: closed - Opened by MotoyaTakashi about 1 year ago
- 2 comments
Labels: bug
#364 - Save and Load sharded gptq checkpoint
Pull Request -
State: open - Opened by PanQiWei about 1 year ago
- 5 comments
#363 - Why does inference get slower at lower bit-widths (compared with ggml)?
Issue -
State: closed - Opened by Darshvino about 1 year ago
- 1 comment
Labels: bug
#362 - Add support for Mistral models.
Pull Request -
State: open - Opened by LaaZa about 1 year ago
#361 - PEFT initialization fix
Pull Request -
State: closed - Opened by alex4321 about 1 year ago
#360 - [FEATURE] GPTQ VectorQuantMatmul Kernel Documentation
Issue -
State: open - Opened by jeromeku about 1 year ago
- 1 comment
Labels: enhancement
#359 - [FEATURE] Mistral Support
Issue -
State: open - Opened by GTimothee about 1 year ago
- 3 comments
Labels: enhancement
#358 - ImportError: cannot import name 'PEFT_TYPE_TO_MODEL_MAPPING' from 'peft.peft_model'
Issue -
State: open - Opened by texasdave2 about 1 year ago
- 1 comment
#357 - [BUG] Type mismatch in exllamav2 QLinear activation
Issue -
State: closed - Opened by cyang49 about 1 year ago
- 12 comments
Labels: bug
#355 - import exllama QuantLinear instead of exllamav2's in `pack_model`
Pull Request -
State: closed - Opened by PanQiWei about 1 year ago
#354 - Revert "fix bug(breaking change) remove (zeros -= 1)"
Pull Request -
State: closed - Opened by PanQiWei about 1 year ago
- 3 comments
#353 - pack method missing in QuantLinear exllamav2
Issue -
State: closed - Opened by adiprasad about 1 year ago
- 2 comments
Labels: bug
#352 - Quant with larger context length
Issue -
State: open - Opened by adiprasad about 1 year ago
- 3 comments
Labels: enhancement, question
#351 - Question about MPT support
Issue -
State: closed - Opened by bonoshunki about 1 year ago
- 1 comment
#350 - Benchmark each GEMM/GEMV kernel independently
Issue -
State: open - Opened by stephen-youn about 1 year ago
- 2 comments
Labels: enhancement
#349 - exllamav2 integration
Pull Request -
State: closed - Opened by SunMarc about 1 year ago
#348 - The Path to v1.0.0
Issue -
State: open - Opened by PanQiWei about 1 year ago
Labels: enhancement
#347 - Use `adapter_name` for `get_gptq_peft_model` with `train_mode=True`
Pull Request -
State: closed - Opened by alex4321 about 1 year ago
#346 - [BUG] Issues with tensor types while finetuning the quantized model through LoRA
Issue -
State: closed - Opened by alex4321 about 1 year ago
- 11 comments
Labels: bug
#345 - How to quantize from a local checkpoint
Issue -
State: open - Opened by dionman about 1 year ago
- 6 comments
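from_pretrained accepts a local directory just as it accepts a Hub ID, so quantizing from a local checkpoint follows the usual flow. A minimal sketch (the paths and the single calibration sample are placeholders):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

ckpt = "path/to/local/checkpoint"  # hypothetical local directory
tokenizer = AutoTokenizer.from_pretrained(ckpt)

quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)
model = AutoGPTQForCausalLM.from_pretrained(ckpt, quantize_config)

# Calibration examples are tokenized dicts with input_ids / attention_mask.
examples = [tokenizer("auto-gptq quantizes models to 4 bits.", return_tensors="pt")]
model.quantize(examples)
model.save_quantized("path/to/quantized-output", use_safetensors=True)
```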
#344 - [BUG]
Issue -
State: closed - Opened by linqingfan about 1 year ago
- 2 comments
Labels: bug
#343 - [BUG] RuntimeError: FWD: Unsupported hidden_size or types: 4096BFloat16FloatFloatFloatFloat
Issue -
State: open - Opened by edisonwd about 1 year ago
- 1 comment
Labels: bug
#342 - fix max_input_len = max_input_len
Pull Request -
State: closed - Opened by IliaZenkov about 1 year ago
#341 - [BUG] Using an auto-gptq model as the teacher for distillation raises "Expected all tensors to be on the same device"
Issue -
State: open - Opened by HaoWuSR about 1 year ago
- 1 comment
Labels: bug, chinese
#340 - ROCM 5.6: no known conversion from 'const half *' [BUG]
Issue -
State: closed - Opened by Jipok about 1 year ago
- 7 comments
Labels: bug
#339 - Building cuda extension requires PyTorch(>=1.13.0) been installed, please install PyTorch first!
Issue -
State: open - Opened by msh01 about 1 year ago
- 11 comments
Labels: bug
#338 - Models quantized with auto-gptq run inference more slowly, and slower still with use_triton=True
Issue -
State: open - Opened by yzw-yzw about 1 year ago
- 1 comment
Labels: chinese
#337 - What do you consider a good dataset size/rows for quantization ?
Issue -
State: open - Opened by nadimintikrish about 1 year ago
- 1 comment
Labels: question
#336 - dataset='c4': how do I quantize a model with a custom dataset?
Issue -
State: closed - Opened by ChethanN01 about 1 year ago
- 4 comments
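The dataset='c4' shortcut belongs to wrappers such as transformers' GPTQConfig; when calling AutoGPTQ directly, the calibration set is just a list of tokenized samples, so any custom text can stand in. A sketch with hypothetical texts and model path:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("path/to/model")  # hypothetical

# Replace the 'c4' shortcut with in-domain text: tokenize each sample and
# hand the list to model.quantize(...) as in the standard flow.
custom_texts = [
    "First calibration document ...",
    "Second calibration document ...",
]
examples = [
    tokenizer(text, return_tensors="pt", truncation=True, max_length=2048)
    for text in custom_texts
]
# model.quantize(examples)  # model built as in the usual quantization flow
```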
#335 - Ignore unknown parameters in quantize_config.json
Pull Request -
State: closed - Opened by z80maniac about 1 year ago
- 1 comment
#334 - Support for mosaic MPT models
Issue -
State: open - Opened by imthebilliejoe about 1 year ago
- 2 comments
Labels: enhancement
#333 - [BUG] Error when building from source on Linux
Issue -
State: closed - Opened by RBNXI about 1 year ago
- 6 comments
Labels: bug
#332 - raise FileNotFoundError(f"Could not find model in {model_name_or_path}") FileNotFoundError: Could not find model in TheBloke/Llama-2-7b-Chat-GPTQ
Issue -
State: closed - Opened by Rahmat711 about 1 year ago
- 2 comments
Labels: bug
#331 - [BUG] Question about CUDA kernels, a potential bug
Issue -
State: open - Opened by ChenMnZ about 1 year ago
- 1 comment
Labels: bug, help wanted
#330 - [BUG] The kernel error cannot be ignored
Issue -
State: closed - Opened by ChenMnZ about 1 year ago
Labels: bug
#328 - Why are GeneralQuantLinear elements torch.int32 after 4-bit quantization?
Issue -
State: closed - Opened by jimmyforrest about 1 year ago
- 5 comments
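The int32 elements are expected: GPTQ stores quantized weights packed into 32-bit integers (eight 4-bit values per int32), so the qweight and qzeros tensors are torch.int32 whatever the nominal bit-width. An illustrative check, with a hypothetical module path:

```python
# For 4-bit quantization, 32 / 4 = 8 weights are packed into each int32,
# so qweight is torch.int32 with in_features // 8 rows.
layer = model.model.model.layers[0].self_attn.q_proj  # hypothetical path
print(layer.qweight.dtype)   # torch.int32
print(layer.qweight.shape)   # (in_features // 8, out_features) at 4 bits
```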
#327 - Quantizing a fine-tuned llama2-13b with quant_with_alpaca.py raises AttributeError: 'QuantLinear' object has no attribute 'q4'
Issue -
State: open - Opened by yzw-yzw about 1 year ago
- 3 comments
Labels: chinese
#326 - Add support for Falcon as part of Transformers 4.33.0, including new Falcon 180B
Pull Request -
State: closed - Opened by TheBloke about 1 year ago
#325 - fix bug(breaking change) remove (zeros -= 1)
Pull Request -
State: closed - Opened by qwopqwop200 about 1 year ago
- 4 comments
#324 - [BUG] CUDA extension not installed error when running AutoGPTQ in Docker
Issue -
State: closed - Opened by yachty66 about 1 year ago
- 10 comments
Labels: bug
#323 - [BUG] "The temp_state buffer is too small in the exllama backend" error, even after adding "model = exllama_set_max_input_length(model, 4096) "
Issue -
State: closed - Opened by Tamil-Arasan-31 about 1 year ago
- 21 comments
Labels: bug
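The exllama v1 backend allocates its temp_state buffer for a fixed maximum input length at load time, and the resize helper must be applied to the already-loaded model (and re-applied after any reload). A sketch of the intended call order, with a hypothetical checkpoint path:

```python
from auto_gptq import AutoGPTQForCausalLM, exllama_set_max_input_length

model = AutoGPTQForCausalLM.from_quantized(
    "path/to/gptq-model",  # hypothetical
    device="cuda:0",
)
# Reallocate the exllama buffers; batch_size * sequence_length must stay
# within max_input_length, so size it for the largest batch you will run.
model = exllama_set_max_input_length(model, max_input_length=8192)
```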
#322 - [BUG] RuntimeError: no device index
Issue -
State: closed - Opened by itechbear about 1 year ago
- 4 comments
Labels: bug
#321 - [BUG] Output is nonsense compared to llama.cpp
Issue -
State: open - Opened by YerongLi about 1 year ago
- 4 comments
Labels: bug
#320 - [BUG] Code breaks when 2 models are loaded in
Issue -
State: open - Opened by daniel-kukiela about 1 year ago
Labels: bug
#319 - Support sharded quantized model files in `from_quantized`
Issue -
State: open - Opened by shakealeg about 1 year ago
- 5 comments
Labels: enhancement, help wanted
#318 - [BUG] Another issue with the temp_state buffer, but only with a batch of 64
Issue -
State: open - Opened by daniel-kukiela about 1 year ago
- 2 comments
Labels: bug
#317 - [Question] AutoModelForCausalLM.from_pretrained failed for the mt5 model; how to quantize mt5 with AutoGPTQ
Issue -
State: closed - Opened by DuoduoLi about 1 year ago
Labels: bug
#316 - [Discussion] batch generation example
Issue -
State: open - Opened by YerongLi about 1 year ago
- 3 comments
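A common sketch for batched generation with decoder-only models: pad on the left so every prompt ends at the final position, otherwise shorter sequences continue from padding tokens (one plausible cause of the batch-mode issue in #412). Assumes model and tokenizer are already loaded:

```python
# Decoder-only batching: left padding keeps each prompt flush against the
# position where generation starts.
tokenizer.padding_side = "left"
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["Tell me a joke.", "Summarize GPTQ in one sentence."]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
outputs = model.generate(**batch, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```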
#315 - How to use exllama_set_max_input_length() with the HF models
Issue -
State: closed - Opened by daniel-kukiela about 1 year ago
- 3 comments
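The helper also works on GPTQ models loaded through transformers, since it walks the wrapped modules and resizes the exllama buffers in place. A sketch, using a model ID from the listing above:

```python
from transformers import AutoModelForCausalLM
from auto_gptq import exllama_set_max_input_length

model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7b-Chat-GPTQ",  # GPTQ checkpoint loaded via transformers
    device_map="auto",
)
# Same call as with AutoGPTQForCausalLM: returns the model with resized buffers.
model = exllama_set_max_input_length(model, max_input_length=4096)
```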
#314 - Non-FP16 Support
Issue -
State: closed - Opened by HanGuo97 about 1 year ago
- 1 comment
Labels: enhancement
#312 - Calibration process & calibration dataset used to perform GPTQ
Issue -
State: open - Opened by ht0rohit about 1 year ago
- 3 comments
Labels: documentation, question
#311 - fix typo in max_input_length
Pull Request -
State: closed - Opened by SunMarc about 1 year ago
#310 - fix model type changed after calling .to() method
Pull Request -
State: closed - Opened by PanQiWei about 1 year ago
#309 - Skip QiGen install on Windows
Pull Request -
State: closed - Opened by qwopqwop200 about 1 year ago
- 2 comments
#308 - [BUG] 'BaseQuantizeConfig' object has no attribute 'get' when deploying with OpenLLM
Issue -
State: open - Opened by jaotheboss about 1 year ago
- 2 comments
Labels: bug
#307 - Error with loading the saved quantized model
Issue -
State: closed - Opened by akkasi over 1 year ago
- 5 comments
Labels: bug
#306 - [BUG] nan average_loss when running quantize_with_alpaca.py
Issue -
State: open - Opened by jaysonph over 1 year ago
- 5 comments
Labels: bug
#305 - Fix g_idx in fused kernel
Pull Request -
State: open - Opened by chu-tianxiang over 1 year ago
#304 - [FEATURE] Support of Qwen-VL
Issue -
State: closed - Opened by JustinLin610 over 1 year ago
Labels: enhancement
#303 - Update qwen.py for Qwen-VL
Pull Request -
State: closed - Opened by JustinLin610 over 1 year ago
- 2 comments
#302 - Any suggestions on quantizing llama2-70b model?
Issue -
State: closed - Opened by franklyd over 1 year ago
- 6 comments
#301 - How to quantize a llama-based model?
Issue -
State: closed - Opened by vicwer over 1 year ago
- 4 comments
#300 - RuntimeError: x and w have incompatible shapes
Issue -
State: open - Opened by jr011 over 1 year ago
- 1 comment