qwopqwop200/GPTQ-for-LLaMa issues and pull requests

#26 - Windows build fails with unresolved symbols

Issue - State: closed - Opened by powderluv almost 2 years ago - 1 comment

#25 - running build_exit error

Issue - State: closed - Opened by BenjaminHei almost 2 years ago - 1 comment

#24 - More VRAM Efficient Attention

Issue - State: closed - Opened by MarkSchmidty almost 2 years ago - 2 comments

#23 - Add benchmark results for 3 and 4 bit 33B

Pull Request - State: closed - Opened by ItsLogic almost 2 years ago - 1 comment

#22 - Issue compiling in docker - No CUDA runtime is found

Issue - State: closed - Opened by TheTerrasque almost 2 years ago - 5 comments

#21 - NameError: name 'quant_cuda' is not defined

Issue - State: closed - Opened by CyberTimon almost 2 years ago - 2 comments

#20 - Add `.safetensors` support

Pull Request - State: closed - Opened by ghost almost 2 years ago - 4 comments

#19 - Is it possible to reuse the GPTQ implementation to quant oasst-sft-1-pythia-12b into 4b?

Issue - State: closed - Opened by npk48 almost 2 years ago - 2 comments

#18 - Tokenizer class LLaMATokenizer does not exist or is not currently imported.

Issue - State: closed - Opened by C0rn3j almost 2 years ago - 12 comments

#17 - FP8 Quantization?

Issue - State: closed - Opened by philipturner almost 2 years ago - 2 comments

#16 - Questions about group size

Issue - State: closed - Opened by DanielWe2 almost 2 years ago - 7 comments

#15 - Are these errors expected ?

Issue - State: closed - Opened by USBhost almost 2 years ago - 3 comments

#14 - Saving checkpoints?

Issue - State: closed - Opened by elephantpanda almost 2 years ago - 1 comment

#13 - Model Quantization Instructions

Issue - State: closed - Opened by MarkSchmidty almost 2 years ago - 3 comments

#12 - state_dict error on model load

Issue - State: closed - Opened by GamerUntouch almost 2 years ago - 3 comments

#11 - Multiple errors while compiling the kernel

Issue - State: closed - Opened by athu16 almost 2 years ago - 34 comments

#10 - Change ints to double in quant_cuda_kernel.cu?

Issue - State: closed - Opened by xiscoding almost 2 years ago - 6 comments

#9 - Supports more than a single token

Pull Request - State: closed - Opened by clcarwin almost 2 years ago - 1 comment

#8 - How to use for inference?

Issue - State: closed - Opened by DanielWe2 almost 2 years ago - 5 comments

#7 - Does not compile on CUDA 12.0

Issue - State: closed - Opened by jtang613 almost 2 years ago - 4 comments

#6 - CUDA kernel that supports more than a single token

Issue - State: closed - Opened by ahsima1 almost 2 years ago - 2 comments

#5 - AttributeError: 'LLaMAModel' object has no attribute 'decoder'

Issue - State: closed - Opened by Minami-su almost 2 years ago - 2 comments

#4 - Request: Optional non-CUDA version

Issue - State: closed - Opened by richardburleigh almost 2 years ago - 8 comments

#3 - Benchmark fails when using 4bit file

Issue - State: closed - Opened by ItsLogic almost 2 years ago - 7 comments

#2 - How to deal with the model from huggingface?

Issue - State: closed - Opened by Starlento almost 2 years ago - 3 comments

#1 - 3-bit quantization fails during the packing stage

Issue - State: closed - Opened by dustydecapod almost 2 years ago - 7 comments

