Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / turboderp/exllamav2 issues and pull requests
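Each entry below lists an issue or pull request title followed by its type, state, author, and comment count. The same metadata can be retrieved programmatically from the service. The following is a minimal sketch in Python; the endpoint path, query parameter, and response field names are assumptions about the ecosyste.ms Issues API inferred from this listing, not a verified contract:

```python
# Minimal sketch: fetch issue and pull request metadata for one repository
# from the ecosyste.ms Issues API. The URL path, the per_page parameter,
# and every response field name below are assumptions, not a verified API.
import requests

BASE = "https://issues.ecosyste.ms/api/v1"   # assumed API base URL
repo = "turboderp%2Fexllamav2"               # "owner/name", URL-encoded

resp = requests.get(
    f"{BASE}/hosts/GitHub/repositories/{repo}/issues",
    params={"per_page": 50},                 # assumed pagination parameter
    timeout=30,
)
resp.raise_for_status()

for item in resp.json():
    # "pull_request", "number", "title", "state", and "comments_count"
    # are assumed field names mirroring the listing below.
    kind = "Pull Request" if item.get("pull_request") else "Issue"
    print(f"#{item['number']} - {item['title']}")
    print(f"{kind} - State: {item['state']} - {item.get('comments_count', 0)} comments")
```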
#49 - Does not access add_tokens config when creating config
Issue - State: closed - Opened by alimadelshin about 1 year ago - 9 comments
#47 - Llama 70B 2.5bpw does not fit in 24GB GPU
Issue - State: closed - Opened by Nikita-Sherstnev about 1 year ago - 10 comments
#46 - Ninja Build Error for ROCm
Issue - State: closed - Opened by lufixSch about 1 year ago - 11 comments
#45 - Error because of string in PyPI flash-attn 2.2.3.post2 version
Issue - State: closed - Opened by TehNomad about 1 year ago - 1 comment
#44 - Quantization subtly broken recently?
Issue - State: closed - Opened by QM60 about 1 year ago - 9 comments
#43 - Chat work greate but inference broken. Why?
Issue - State: closed - Opened by 50Bytes-dev about 1 year ago - 3 comments
#42 - don't support for batch inference?
Issue - State: closed - Opened by LZY-the-boys about 1 year ago - 2 comments
#41 - ninja compilation error - gcc11
Issue - State: closed - Opened by vasqu about 1 year ago - 5 comments
#40 - Tesla P40 performance is still very low.
Issue - State: closed - Opened by siriume about 1 year ago - 4 comments
#39 - Script to convert model, run quant, and save measurements
Pull Request - State: closed - Opened by lonestriker about 1 year ago
#38 - Rope scaling, length, measurement_length during EXL2 quantization
Issue - State: closed - Opened by grimulkan about 1 year ago - 4 comments
#37 - bpw calculation
Issue - State: closed - Opened by Chainfire about 1 year ago - 6 comments
#36 - Integrating Medusa
Issue - State: closed - Opened by KaruroChori about 1 year ago - 2 comments
#35 - Run a 34b model with a 4080 (16gb VRAM)
Issue - State: closed - Opened by ScottSump about 1 year ago - 3 comments
#34 - convert_safetensors doesn't force existing GPU
Issue - State: closed - Opened by Chainfire about 1 year ago - 1 comment
#33 - ROCM: Garbadge output
Issue - State: closed - Opened by Jipok about 1 year ago - 46 comments
#32 - What's the best way to train ext2?
Issue - State: closed - Opened by laoda513 about 1 year ago - 1 comment
#31 - Conversion: release CUDA cache after VRAM intensive quant blocks
Pull Request - State: closed - Opened by 19h about 1 year ago - 12 comments
#30 - Convert/Quantizer bf16 support
Issue - State: closed - Opened by Qubitium about 1 year ago - 7 comments
#29 - Calibration data format clarification
Issue - State: closed - Opened by Qubitium about 1 year ago - 5 comments
#28 - We should use exllama1 for GPTQ and exllama2 for exl2?
Issue - State: closed - Opened by BadisG about 1 year ago - 3 comments
#27 - Repetitive output in NVidia Jetson Orin Nano
Issue - State: closed - Opened by EraldoMJunior about 1 year ago - 4 comments
#25 - Conversion help
Issue - State: closed - Opened by Chainfire about 1 year ago - 4 comments
#24 - add comment on model.load() usage
Pull Request - State: closed - Opened by gojefferson about 1 year ago
#23 - Add copilot server example
Pull Request - State: open - Opened by chenhunghan about 1 year ago - 9 comments
#22 - Fix Compiling with HIP on Older Pytorch Version
Pull Request - State: closed - Opened by leonxia1018 about 1 year ago - 6 comments
#21 - nvcc fatal : Unknown option '-generate-dependencies-with-compile'
Issue - State: closed - Opened by Hashflower about 1 year ago - 1 comment
#20 - convert.py - RuntimeError: CUDA error: invalid configuration argument
Issue - State: closed - Opened by Thireus about 1 year ago - 22 comments
#18 - Support cfg sampler?
Issue - State: closed - Opened by win10ogod about 1 year ago - 1 comment
#17 - Fix typo in README.md
Pull Request - State: closed - Opened by eltociear about 1 year ago - 1 comment
#16 - Support Baichuan2?
Issue - State: closed - Opened by lx0126z about 1 year ago - 2 comments
#15 - Gibbish Output from 4bit EXL2 quantization
Issue - State: closed - Opened by fgdfgfthgr-fox about 1 year ago - 9 comments
#14 - Big difference in output between exllama1_hf and exllama2_hf
Issue - State: closed - Opened by BadisG about 1 year ago - 12 comments
#12 - aws gpu compatibility
Issue - State: closed - Opened by gtkafka about 1 year ago - 1 comment
#11 - OOM while trying to convert a 70B model to 4.75/4.25 bits on a 4090
Issue - State: closed - Opened by Panchovix about 1 year ago - 4 comments
#9 - Generation never stops
Issue - State: closed - Opened by ortegaalfredo about 1 year ago - 21 comments
#8 - Compile fail on P100.
Issue - State: closed - Opened by SlimeSli about 1 year ago - 1 comment
#7 - Can we run a 34B model with just 12gb Vram
Issue - State: closed - Opened by Gyro0o about 1 year ago - 3 comments
#6 - Convert script fails
Issue - State: closed - Opened by epicfilemcnulty about 1 year ago - 10 comments
#5 - Fix compiling and running on ROCm HIP
Pull Request - State: closed - Opened by ardfork about 1 year ago - 3 comments
#4 - Cannot split from textgen?
Issue - State: closed - Opened by Ph0rk0z about 1 year ago - 3 comments
#3 - It's using 19gb of vram for a 2.65bit 13b model
Issue - State: closed - Opened by BadisG about 1 year ago - 4 comments
#2 - Lower bits per weight
Issue - State: closed - Opened by IgnacioFDM about 1 year ago - 17 comments
#1 - What do you think of omniquant?
Issue - State: closed - Opened by Ph0rk0z about 1 year ago - 1 comment