Ecosyste.ms: Issues

An open API service providing issue and pull request metadata for open source projects.

GitHub / turboderp/exllamav2 issues and pull requests

#49 - Does not access add_tokens config when creating config

Issue - State: closed - Opened by alimadelshin about 1 year ago - 9 comments

#47 - Llama 70B 2.5bpw does not fit in 24GB GPU

Issue - State: closed - Opened by Nikita-Sherstnev about 1 year ago - 10 comments

#46 - Ninja Build Error for ROCm

Issue - State: closed - Opened by lufixSch about 1 year ago - 11 comments

#45 - Error because of string in PyPI flash-attn 2.2.3.post2 version

Issue - State: closed - Opened by TehNomad about 1 year ago - 1 comment

#44 - Quantization subtly broken recently?

Issue - State: closed - Opened by QM60 about 1 year ago - 9 comments

#43 - Chat work greate but inference broken. Why?

Issue - State: closed - Opened by 50Bytes-dev about 1 year ago - 3 comments

#42 - don't support for batch inference?

Issue - State: closed - Opened by LZY-the-boys about 1 year ago - 2 comments

#41 - ninja compilation error - gcc11

Issue - State: closed - Opened by vasqu about 1 year ago - 5 comments

#40 - Tesla P40 performance is still very low.

Issue - State: closed - Opened by siriume about 1 year ago - 4 comments

#39 - Script to convert model, run quant, and save measurements

Pull Request - State: closed - Opened by lonestriker about 1 year ago

#38 - Rope scaling, length, measurement_length during EXL2 quantization

Issue - State: closed - Opened by grimulkan about 1 year ago - 4 comments

#37 - bpw calculation

Issue - State: closed - Opened by Chainfire about 1 year ago - 6 comments

#36 - Integrating Medusa

Issue - State: closed - Opened by KaruroChori about 1 year ago - 2 comments

#35 - Run a 34b model with a 4080 (16gb VRAM)

Issue - State: closed - Opened by ScottSump about 1 year ago - 3 comments

#34 - convert_safetensors doesn't force existing GPU

Issue - State: closed - Opened by Chainfire about 1 year ago - 1 comment

#33 - ROCM: Garbadge output

Issue - State: closed - Opened by Jipok about 1 year ago - 46 comments

#32 - What's the best way to train ext2?

Issue - State: closed - Opened by laoda513 about 1 year ago - 1 comment

#31 - Conversion: release CUDA cache after VRAM intensive quant blocks

Pull Request - State: closed - Opened by 19h about 1 year ago - 12 comments

#30 - Convert/Quantizer bf16 support

Issue - State: closed - Opened by Qubitium about 1 year ago - 7 comments

#29 - Calibration data format clarification

Issue - State: closed - Opened by Qubitium about 1 year ago - 5 comments

#28 - We should use exllama1 for GPTQ and exllama2 for exl2?

Issue - State: closed - Opened by BadisG about 1 year ago - 3 comments

#27 - Repetitive output in NVidia Jetson Orin Nano

Issue - State: closed - Opened by EraldoMJunior about 1 year ago - 4 comments

#25 - Conversion help

Issue - State: closed - Opened by Chainfire about 1 year ago - 4 comments

#24 - add comment on model.load() usage

Pull Request - State: closed - Opened by gojefferson about 1 year ago

#23 - Add copilot server example

Pull Request - State: open - Opened by chenhunghan about 1 year ago - 9 comments

#22 - Fix Compiling with HIP on Older Pytorch Version

Pull Request - State: closed - Opened by leonxia1018 about 1 year ago - 6 comments

#21 - nvcc fatal : Unknown option '-generate-dependencies-with-compile'

Issue - State: closed - Opened by Hashflower about 1 year ago - 1 comment

#20 - convert.py - RuntimeError: CUDA error: invalid configuration argument

Issue - State: closed - Opened by Thireus about 1 year ago - 22 comments

#18 - Support cfg sampler?

Issue - State: closed - Opened by win10ogod about 1 year ago - 1 comment

#17 - Fix typo in README.md

Pull Request - State: closed - Opened by eltociear about 1 year ago - 1 comment

#16 - Support Baichuan2?

Issue - State: closed - Opened by lx0126z about 1 year ago - 2 comments

#15 - Gibbish Output from 4bit EXL2 quantization

Issue - State: closed - Opened by fgdfgfthgr-fox about 1 year ago - 9 comments

#14 - Big difference in output between exllama1_hf and exllama2_hf

Issue - State: closed - Opened by BadisG about 1 year ago - 12 comments

#12 - aws gpu compatibility

Issue - State: closed - Opened by gtkafka about 1 year ago - 1 comment

#11 - OOM while trying to convert a 70B model to 4.75/4.25 bits on a 4090

Issue - State: closed - Opened by Panchovix about 1 year ago - 4 comments

#9 - Generation never stops

Issue - State: closed - Opened by ortegaalfredo about 1 year ago - 21 comments

#8 - Compile fail on P100.

Issue - State: closed - Opened by SlimeSli about 1 year ago - 1 comment

#7 - Can we run a 34B model with just 12gb Vram

Issue - State: closed - Opened by Gyro0o about 1 year ago - 3 comments

#6 - Convert script fails

Issue - State: closed - Opened by epicfilemcnulty about 1 year ago - 10 comments

#5 - Fix compiling and running on ROCm HIP

Pull Request - State: closed - Opened by ardfork about 1 year ago - 3 comments

#4 - Cannot split from textgen?

Issue - State: closed - Opened by Ph0rk0z about 1 year ago - 3 comments

#3 - It's using 19gb of vram for a 2.65bit 13b model

Issue - State: closed - Opened by BadisG about 1 year ago - 4 comments

#2 - Lower bits per weight

Issue - State: closed - Opened by IgnacioFDM about 1 year ago - 17 comments

#1 - What do you think of omniquant?

Issue - State: closed - Opened by Ph0rk0z about 1 year ago - 1 comment