Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / qwopqwop200/GPTQ-for-LLaMa issues and pull requests
#285 - Syntax changed in triton.testing.do_bench() causing error when running llama_inference.py
Issue - State: open - Opened by prasanna 11 months ago - 1 comment
#101 - OPT Models Fail to Pack During Quantization
Issue - State: closed - Opened by official-elinas over 1 year ago - 3 comments
#100 - TypeError: Offload_LlamaModel.forward() got an unexpected keyword argument 'position_ids' after today's commits
Issue - State: closed - Opened by ye7iaserag over 1 year ago - 3 comments
#99 - Port left-padding fix to Offload_LlamaModel
Pull Request - State: closed - Opened by arctic-marmoset over 1 year ago
#98 - AMD Installation broke about 1 week ago
Issue - State: closed - Opened by YellowRoseCx over 1 year ago - 12 comments
#97 - KeyError on conversion attempt.
Issue - State: closed - Opened by GamerUntouch over 1 year ago - 4 comments
#96 - Much slower speed with the latest updates
Issue - State: closed - Opened by oobabooga over 1 year ago - 13 comments
#95 - Head is currently broken
Issue - State: closed - Opened by Qubitium over 1 year ago - 5 comments
#94 - Use multiple gpu for quantization?
Issue - State: closed - Opened by sgsdxzy over 1 year ago - 1 comment
#93 - llama_inference_offload.py hardcoded to cuda:0
Issue - State: closed - Opened by generic-username0718 over 1 year ago - 1 comment
#92 - Name format mixup - LLaMA has a capital A
Issue - State: closed - Opened by mcmonkey4eva over 1 year ago - 1 comment
#91 - 8-bit quantization succeeds with great benchmarks, but inference produces garbage past 128 tokens
Issue - State: closed - Opened by QM60 over 1 year ago - 2 comments
#90 - Make sure to save the quantized model before bench/eval
Pull Request - State: closed - Opened by Qubitium over 1 year ago - 1 comment
#89 - Support for corrected Llama
Issue - State: closed - Opened by gante over 1 year ago - 3 comments
#88 - Error while running python setup_cuda.py install
Issue - State: closed - Opened by mahxds over 1 year ago - 14 comments
#87 - Implement fallback to PyTorch matmul on large input sizes
Pull Request - State: closed - Opened by MasterTaffer over 1 year ago - 3 comments
#86 - high VRAM Memory Allocation while evaluation
Issue - State: closed - Opened by zaziki23 over 1 year ago - 3 comments
#85 - Windows 11 / WSL2 / Ubuntu / cuda 11.7 / RTX 3070 - IndexError: list index out of range (+PTX)
Issue - State: closed - Opened by glarsson over 1 year ago - 1 comment
#84 - Is CUDA_VISIBLE_DEVICES necessary?
Issue - State: closed - Opened by neuhaus over 1 year ago - 1 comment
#83 - TypeError: expected string or bytes-like object
Issue - State: closed - Opened by ghost over 1 year ago - 1 comment
#82 - 4-bit is 10x slower compared to fp16 LLaMa
Issue - State: closed - Opened by fpgaminer over 1 year ago - 27 comments
#81 - Make installable with pip
Pull Request - State: closed - Opened by sterlind over 1 year ago - 1 comment
#80 - args is not defined after adding faster_kernel
Issue - State: closed - Opened by ye7iaserag over 1 year ago - 2 comments
#79 - Run with docker
Pull Request - State: closed - Opened by JamesDConley over 1 year ago - 1 comment
#78 - I can not reproduce 7b 6.09 Wiki2 PPL.
Issue - State: closed - Opened by USBhost over 1 year ago - 14 comments
#77 - Inference with 4bit is slow than fp32
Issue - State: closed - Opened by heya5 over 1 year ago - 2 comments
#76 - New: ~8% faster llama inference.
Pull Request - State: closed - Opened by aljungberg over 1 year ago - 1 comment
#75 - GPTQ Collaboration?
Issue - State: closed - Opened by dalistarh over 1 year ago - 4 comments
#74 - Installing cuda, cannot find ninja + cannot find file.
Issue - State: closed - Opened by jonplumb42 over 1 year ago - 1 comment
#73 - Inference using CPU
Issue - State: closed - Opened by lodorg over 1 year ago - 3 comments
#72 - error on amd gpu when starting setup_cuda
Issue - State: closed - Opened by maxime-fleury over 1 year ago - 1 comment
#71 - Error allocating RAM
Issue - State: closed - Opened by PeterDaGrape over 1 year ago - 3 comments
#70 - Running on CPU
Issue - State: closed - Opened by mayaeary over 1 year ago - 5 comments
#69 - TypeError: load_quant() missing 1 required positional argument: 'groupsize'
Issue - State: closed - Opened by matbee-eth over 1 year ago - 7 comments
#68 - The detected CUDA version (12.0) mismatches the version that was used to compile PyTorch (11.7)
Issue - State: closed - Opened by ThatCoffeeGuy over 1 year ago - 9 comments
#67 - adding ipynb fle for building on colab
Pull Request - State: closed - Opened by guccialex over 1 year ago - 3 comments
#66 - Is compute time expected to go up linearly with batch size?
Issue - State: closed - Opened by zphang over 1 year ago - 1 comment
#65 - Issue loading tokenizer when using local models
Issue - State: closed - Opened by iamlemec over 1 year ago - 1 comment
#64 - Having trouble using saved models
Issue - State: closed - Opened by dnhkng over 1 year ago - 6 comments
#63 - Extraneous data point
Issue - State: closed - Opened by philipturner over 1 year ago - 3 comments
#62 - opt.py python SyntaxError?
Issue - State: closed - Opened by alexl83 over 1 year ago - 3 comments
#61 - Issues with cuda setup
Issue - State: closed - Opened by IridiumMaster over 1 year ago - 1 comment
#60 - potential Mistakes in the test data selection for perplexity evaluation
Issue - State: closed - Opened by Green-Sky over 1 year ago - 2 comments
#59 - Error when installing cuda kernel
Issue - State: closed - Opened by plhosk over 1 year ago - 5 comments
#58 - Add support for devices with compute capability < 6.0
Pull Request - State: closed - Opened by tobbez over 1 year ago - 2 comments
#57 - How to fine-tune the 4-bit model?
Issue - State: closed - Opened by zsun227 over 1 year ago - 10 comments
#56 - Quantising on multiple GPU?
Issue - State: closed - Opened by dnhkng over 1 year ago - 1 comment
#55 - GPTQ+flexgen, is it possible?
Issue - State: closed - Opened by ye7iaserag over 1 year ago - 5 comments
#54 - Revert "Use the main transformers library, rename LLaMA to Llama"
Pull Request - State: closed - Opened by qwopqwop200 over 1 year ago - 4 comments
#53 - Revert "Use the main transformers library, rename LLaMA to Llama"
Pull Request - State: closed - Opened by qwopqwop200 over 1 year ago
#52 - Use the main transformers library, rename LLaMA to Llama
Pull Request - State: closed - Opened by oobabooga over 1 year ago
#51 - What would be required to quantize 65B model to 2-bit?
Issue - State: closed - Opened by Alcyon6 over 1 year ago - 2 comments
#50 - Will loras work with this?
Issue - State: closed - Opened by fblissjr over 1 year ago - 3 comments
#49 - Add alternative installation
Pull Request - State: closed - Opened by musabgultekin over 1 year ago - 1 comment
#48 - llama_inference RuntimeError: Internal: src/sentencepiece_processor.cc
Issue - State: closed - Opened by youkpan over 1 year ago - 1 comment
#47 - Problem with setup_cuda.py install
Issue - State: closed - Opened by farrael004 over 1 year ago - 14 comments
#46 - Quantizing GALACTICA?
Issue - State: closed - Opened by oobabooga over 1 year ago - 13 comments
#45 - [Request] Mixed Precission Quantization
Issue - State: closed - Opened by elephantpanda over 1 year ago - 7 comments
#44 - Nvcc fatal : Unsupported gpu architecture 'compute_86'
Issue - State: closed - Opened by DamonianoStudios over 1 year ago - 6 comments
#43 - RuntimeError: Tensors must have same number of dimensions: got 3 and 4
Issue - State: closed - Opened by enn-nafnlaus over 1 year ago - 4 comments
#42 - GPTQ C++ Implementation Question
Issue - State: closed - Opened by MarkSchmidty over 1 year ago - 1 comment
#41 - Bad results for WinoGrande - more testers searched
Issue - State: closed - Opened by DanielWe2 over 1 year ago - 1 comment
#40 - Script to execute Winogrande test
Pull Request - State: closed - Opened by DanielWe2 over 1 year ago - 2 comments
#39 - Bad performance of OPT models
Issue - State: closed - Opened by Zerogoki00 over 1 year ago - 2 comments
#38 - cuda extension problem
Issue - State: closed - Opened by WuNein over 1 year ago - 6 comments
#37 - Support other models?
Issue - State: closed - Opened by Ph0rk0z over 1 year ago - 2 comments
#36 - probability tensor contains either `inf`, `nan` or element < 0
Issue - State: closed - Opened by Minami-su over 1 year ago - 1 comment
#35 - How to convert the official ckp to fit your repo
Issue - State: closed - Opened by merlinarer over 1 year ago - 1 comment
#34 - Lonnnnnnnnng context load time before generation
Issue - State: closed - Opened by generic-username0718 over 1 year ago - 7 comments
#33 - The current installed version of g++ is greater than the maximum required version by CUDA
Issue - State: closed - Opened by frandmb over 1 year ago - 7 comments
#32 - converting local hf model with llama.py
Issue - State: closed - Opened by alexl83 over 1 year ago - 3 comments
#31 - PosixPath object has no attribute endswith Win11 WSL2
Issue - State: closed - Opened by rossbishop over 1 year ago - 5 comments
#30 - 4-bit llama gets progressively slower with each text generation
Issue - State: closed - Opened by 1aienthusiast over 1 year ago - 11 comments
#29 - 4-bit llama gets progressively slower with each generation
Issue - State: closed - Opened by 1aienthusiast over 1 year ago
#27 - Quantization produces non-deterministic weights
Issue - State: closed - Opened by MarkSchmidty over 1 year ago - 3 comments
#26 - Windows build fails with unresolved symbols
Issue - State: closed - Opened by powderluv over 1 year ago - 1 comment