Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.
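As a rough sketch of how such a metadata service might be queried programmatically — note the endpoint path below is an assumption for illustration, not confirmed by this page:

```python
# Hypothetical sketch of querying the ecosyste.ms issues API for a
# repository's issue metadata. The endpoint path is an assumption.
import json
import urllib.parse
import urllib.request


def issues_url(host: str, owner: str, repo: str) -> str:
    """Build a plausible issues-API URL (path layout is assumed)."""
    # The full repo name is percent-encoded so the "/" survives as %2F.
    full_name = urllib.parse.quote(f"{owner}/{repo}", safe="")
    return (
        f"https://issues.ecosyste.ms/api/v1/hosts/{host}"
        f"/repositories/{full_name}/issues"
    )


def fetch_issues(url: str):
    """Fetch and decode the JSON issue list from the given URL."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


if __name__ == "__main__":
    url = issues_url("GitHub", "qwopqwop200", "GPTQ-for-LLaMa")
    print(url)
```

The fetch itself is kept separate from URL construction so the latter can be tested without network access.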

GitHub / qwopqwop200/GPTQ-for-LLaMa issues and pull requests

#101 - OPT Models Fail to Pack During Quantization

Issue - State: closed - Opened by official-elinas over 1 year ago - 3 comments

#99 - Port left-padding fix to Offload_LlamaModel

Pull Request - State: closed - Opened by arctic-marmoset over 1 year ago

#98 - AMD Installation broke about 1 week ago

Issue - State: closed - Opened by YellowRoseCx over 1 year ago - 12 comments

#97 - KeyError on conversion attempt.

Issue - State: closed - Opened by GamerUntouch over 1 year ago - 4 comments

#96 - Much slower speed with the latest updates

Issue - State: closed - Opened by oobabooga over 1 year ago - 13 comments

#95 - Head is currently broken

Issue - State: closed - Opened by Qubitium over 1 year ago - 5 comments

#94 - Use multiple gpu for quantization?

Issue - State: closed - Opened by sgsdxzy over 1 year ago - 1 comment

#93 - llama_inference_offload.py hardcoded to cuda:0

Issue - State: closed - Opened by generic-username0718 over 1 year ago - 1 comment

#92 - Name format mixup - LLaMA has a capital A

Issue - State: closed - Opened by mcmonkey4eva over 1 year ago - 1 comment

#90 - Make sure to save the quantized model before bench/eval

Pull Request - State: closed - Opened by Qubitium over 1 year ago - 1 comment

#89 - Support for corrected Llama

Issue - State: closed - Opened by gante over 1 year ago - 3 comments

#88 - Error while python setup_cuda.py install>

Issue - State: closed - Opened by mahxds over 1 year ago - 14 comments

#87 - Implement fallback to PyTorch matmul on large input sizes

Pull Request - State: closed - Opened by MasterTaffer over 1 year ago - 3 comments

#86 - high VRAM Memory Allocation while evaluation

Issue - State: closed - Opened by zaziki23 over 1 year ago - 3 comments

#84 - Is CUDA_VISIBLE_DEVICES necessary?

Issue - State: closed - Opened by neuhaus over 1 year ago - 1 comment

#83 - TypeError: expected string or bytes-like object

Issue - State: closed - Opened by ghost over 1 year ago - 1 comment

#82 - 4-bit is 10x slower compared to fp16 LLaMa

Issue - State: closed - Opened by fpgaminer over 1 year ago - 27 comments

#81 - Make installable with pip

Pull Request - State: closed - Opened by sterlind over 1 year ago - 1 comment

#80 - args is not defined after adding faster_kernel

Issue - State: closed - Opened by ye7iaserag over 1 year ago - 2 comments

#79 - Run with docker

Pull Request - State: closed - Opened by JamesDConley over 1 year ago - 1 comment

#78 - I can not reproduce 7b 6.09 Wiki2 PPL.

Issue - State: closed - Opened by USBhost over 1 year ago - 14 comments

#77 - Inference with 4bit is slow than fp32

Issue - State: closed - Opened by heya5 over 1 year ago - 2 comments

#76 - New: ~8% faster llama inference.

Pull Request - State: closed - Opened by aljungberg over 1 year ago - 1 comment

#75 - GPTQ Collaboration?

Issue - State: closed - Opened by dalistarh over 1 year ago - 4 comments

#74 - Installing cuda, cannot find ninja + cannot find file.

Issue - State: closed - Opened by jonplumb42 over 1 year ago - 1 comment

#73 - Inference using CPU

Issue - State: closed - Opened by lodorg over 1 year ago - 3 comments

#72 - error on amd gpu when starting setup_cuda

Issue - State: closed - Opened by maxime-fleury over 1 year ago - 1 comment

#71 - Error allocating RAM

Issue - State: closed - Opened by PeterDaGrape over 1 year ago - 3 comments

#70 - Running on CPU

Issue - State: closed - Opened by mayaeary over 1 year ago - 5 comments

#69 - TypeError: load_quant() missing 1 required positional argument: 'groupsize'

Issue - State: closed - Opened by matbee-eth over 1 year ago - 7 comments

#67 - adding ipynb fle for building on colab

Pull Request - State: closed - Opened by guccialex over 1 year ago - 3 comments

#66 - Is compute time expected to go up linearly with batch size?

Issue - State: closed - Opened by zphang over 1 year ago - 1 comment

#65 - Issue loading tokenizer when using local models

Issue - State: closed - Opened by iamlemec over 1 year ago - 1 comment

#64 - Having trouble using saved models

Issue - State: closed - Opened by dnhkng over 1 year ago - 6 comments

#63 - Extraneous data point

Issue - State: closed - Opened by philipturner over 1 year ago - 3 comments

#62 - opt.py python SyntaxError?

Issue - State: closed - Opened by alexl83 over 1 year ago - 3 comments

#61 - Issues with cuda setup

Issue - State: closed - Opened by IridiumMaster over 1 year ago - 1 comment

#60 - potential Mistakes in the test data selection for perplexity evaluation

Issue - State: closed - Opened by Green-Sky over 1 year ago - 2 comments

#59 - Error when installing cuda kernel

Issue - State: closed - Opened by plhosk over 1 year ago - 5 comments

#58 - Add support for devices with compute capability < 6.0

Pull Request - State: closed - Opened by tobbez over 1 year ago - 2 comments

#57 - How to fine-tune the 4-bit model?

Issue - State: closed - Opened by zsun227 over 1 year ago - 10 comments

#56 - Quantising on multiple GPU?

Issue - State: closed - Opened by dnhkng over 1 year ago - 1 comment

#55 - GPTQ+flexgen, is it possible?

Issue - State: closed - Opened by ye7iaserag over 1 year ago - 5 comments

#54 - Revert "Use the main transformers library, rename LLaMA to Llama"

Pull Request - State: closed - Opened by qwopqwop200 over 1 year ago - 4 comments

#53 - Revert "Use the main transformers library, rename LLaMA to Llama"

Pull Request - State: closed - Opened by qwopqwop200 over 1 year ago

#52 - Use the main transformers library, rename LLaMA to Llama

Pull Request - State: closed - Opened by oobabooga over 1 year ago

#51 - What would be required to quantize 65B model to 2-bit?

Issue - State: closed - Opened by Alcyon6 over 1 year ago - 2 comments

#50 - Will loras work with this?

Issue - State: closed - Opened by fblissjr over 1 year ago - 3 comments

#49 - Add alternative installation

Pull Request - State: closed - Opened by musabgultekin over 1 year ago - 1 comment

#48 - llama_inference RuntimeError: Internal: src/sentencepiece_processor.cc

Issue - State: closed - Opened by youkpan over 1 year ago - 1 comment

#47 - Problem with setup_cuda.py install

Issue - State: closed - Opened by farrael004 over 1 year ago - 14 comments

#46 - Quantizing GALACTICA?

Issue - State: closed - Opened by oobabooga over 1 year ago - 13 comments

#45 - [Request] Mixed Precission Quantization

Issue - State: closed - Opened by elephantpanda over 1 year ago - 7 comments

#44 - Nvcc fatal : Unsupported gpu architecture 'compute_86'

Issue - State: closed - Opened by DamonianoStudios over 1 year ago - 6 comments

#43 - RuntimeError: Tensors must have same number of dimensions: got 3 and 4

Issue - State: closed - Opened by enn-nafnlaus over 1 year ago - 4 comments

#42 - GPTQ C++ Implementation Question

Issue - State: closed - Opened by MarkSchmidty over 1 year ago - 1 comment

#41 - Bad results for WinoGrande - more testers searched

Issue - State: closed - Opened by DanielWe2 over 1 year ago - 1 comment

#40 - Script to execute Winogrande test

Pull Request - State: closed - Opened by DanielWe2 over 1 year ago - 2 comments

#39 - Bad performance of OPT models

Issue - State: closed - Opened by Zerogoki00 over 1 year ago - 2 comments

#38 - cuda extension problem

Issue - State: closed - Opened by WuNein over 1 year ago - 6 comments

#37 - Support other models?

Issue - State: closed - Opened by Ph0rk0z over 1 year ago - 2 comments

#36 - probability tensor contains either `inf`, `nan` or element < 0

Issue - State: closed - Opened by Minami-su over 1 year ago - 1 comment

#35 - How to convert the official ckp to fit your repo

Issue - State: closed - Opened by merlinarer over 1 year ago - 1 comment

#34 - Lonnnnnnnnng context load time before generation

Issue - State: closed - Opened by generic-username0718 over 1 year ago - 7 comments

#32 - converting local hf model with llama.py

Issue - State: closed - Opened by alexl83 over 1 year ago - 3 comments

#31 - PosixPath object has no attribute endswith Win11 WSL2

Issue - State: closed - Opened by rossbishop over 1 year ago - 5 comments

#30 - 4-bit llama gets progressively slower with each text generation

Issue - State: closed - Opened by 1aienthusiast over 1 year ago - 11 comments

#27 - Quantization produces non-deterministic weights

Issue - State: closed - Opened by MarkSchmidty over 1 year ago - 3 comments

#26 - Windows build fails with unresolved symbols

Issue - State: closed - Opened by powderluv over 1 year ago - 1 comment