Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / mobiusml/hqq issues and pull requests

#134 - Saving quantized Aria weights

Issue - State: open - Opened by leon-seidel 9 days ago - 2 comments

#133 - model(aria): do not restrict sdpa to MATH in prefill phase

Pull Request - State: closed - Opened by xffxff 9 days ago - 1 comment

#130 - 8bit + Aten + compile

Issue - State: open - Opened by zhangy659 29 days ago - 6 comments

#129 - cache_size_limit reached

Issue - State: closed - Opened by zhangy659 about 1 month ago - 22 comments

#128 - 4bit slower?

Issue - State: closed - Opened by zhangy659 about 1 month ago - 3 comments

#126 - Fix SDPA context manager perf regression

Pull Request - State: closed - Opened by msaroufim about 1 month ago - 1 comment

#125 - Support for HQQ Quantization: Compatibility with LLava and Qwen Models?

Issue - State: closed - Opened by NEWbie0709 about 1 month ago - 10 comments

#123 - slow loading process of pretrained model for finetuning in transformers

Issue - State: closed - Opened by jiaqiw09 about 2 months ago - 4 comments

#122 - KeyError: 'offload_meta'

Issue - State: closed - Opened by kadirnar about 2 months ago - 1 comment

#121 - Fix filename in `setup_torch.py`

Pull Request - State: closed - Opened by larin92 2 months ago - 8 comments

#120 - CUDA error when trying to use llama3.1 8B 4bit quantized model sample

Issue - State: closed - Opened by PatrickDahlin 2 months ago - 8 comments

#119 - integrated into gpt-fast

Issue - State: closed - Opened by kaizizzzzzz 3 months ago - 1 comment

#118 - Hqq vs gguf

Issue - State: closed - Opened by blap 3 months ago - 3 comments

#116 - torch.compile() the quantization method

Pull Request - State: open - Opened by rationalism 3 months ago - 6 comments

#115 - question about fine-tuning a 1bit-quantized model

Issue - State: closed - Opened by zxbjushuai 3 months ago - 35 comments

#114 - Issue when loading the quantized model

Issue - State: closed - Opened by NEWbie0709 3 months ago - 5 comments

#113 - Question about Quantization

Issue - State: closed - Opened by NEWbie0709 3 months ago - 4 comments

#112 - docs: update Readme.md

Pull Request - State: open - Opened by eltociear 3 months ago

#111 - Question on the speed of generating the response

Issue - State: closed - Opened by NEWbie0709 3 months ago - 18 comments

#110 - `hqq/backends/torchao.py` line 177, KeyError: 'scale'

Issue - State: closed - Opened by egorsmkv 3 months ago - 13 comments

#109 - zero and scale quant

Issue - State: closed - Opened by kaizizzzzzz 3 months ago - 1 comment

#108 - RuntimeError: Expected in.dtype() == at::kInt to be true, but got false.

Issue - State: closed - Opened by egorsmkv 3 months ago - 9 comments

#107 - TypeError: Object of type dtype is not JSON serializable

Issue - State: closed - Opened by zxbjushuai 3 months ago - 11 comments

#106 - Add recommended inductor config for speedup

Pull Request - State: closed - Opened by yiliu30 3 months ago - 1 comment

#105 - Warning: failed to import the BitBlas backend

Issue - State: closed - Opened by jinz2014 4 months ago - 7 comments

#104 - Easy way to run lm evaluation harness

Issue - State: closed - Opened by pythonLoader 4 months ago - 1 comment

#103 - Expected in.dtype() == at::kInt to be true, but got false

Issue - State: closed - Opened by jonashaag 4 months ago - 14 comments

#102 - Bug of the saved model when applying zero and scale quantization

Issue - State: closed - Opened by kaizizzzzzz 4 months ago - 1 comment

#101 - Support Gemma quantization

Issue - State: closed - Opened by kaizizzzzzz 4 months ago - 2 comments

#100 - Weight Sharding

Issue - State: closed - Opened by winglian 4 months ago - 2 comments

#98 - Use GPTQModel for GPTQ quantization: 2x faster + better PPL

Pull Request - State: closed - Opened by Qubitium 4 months ago - 2 comments

#97 - 3-bit quantization weight data type issue

Issue - State: closed - Opened by BeichenHuang 4 months ago - 10 comments

#96 - About the implementation of .cpu()

Issue - State: open - Opened by reflectionie 4 months ago - 1 comment

#94 - bitblas introduces dependency on CUDA version

Issue - State: closed - Opened by zodiacg 5 months ago - 3 comments

#93 - Add way to save quantize config and can be loaded again

Pull Request - State: closed - Opened by fahadh4ilyas 5 months ago - 8 comments

#92 - module 'torch.library' has no attribute 'custom_op'

Issue - State: closed - Opened by fahadh4ilyas 5 months ago - 4 comments

#91 - Fix hf load

Pull Request - State: closed - Opened by fahadh4ilyas 5 months ago - 3 comments

#90 - 2-bit quantization representation

Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 3 comments

#88 - 1 bit inference

Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 4 comments

#87 - Group_Size setting

Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 1 comment

#86 - Activation quantization

Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 9 comments

#84 - Is HQQLinearLoRAWithFakeQuant differentiable?

Issue - State: closed - Opened by lippman1125 6 months ago - 1 comment

#83 - Question about quantization.

Issue - State: closed - Opened by mxjmtxrm 6 months ago - 2 comments

#82 - Running HQQ Quantized Models on CPU

Issue - State: closed - Opened by 49Simon 6 months ago - 3 comments

#80 - [Question] Model Outputting Gibberish After Quantization

Issue - State: closed - Opened by DefinitlyEvil 6 months ago - 4 comments

#79 - AttributeError: 'LlamaForCausalLM' object has no attribute '_setup_cache'

Issue - State: closed - Opened by ChuanhongLi 6 months ago - 3 comments
Labels: bug

#78 - HQQ for convolutional layers

Issue - State: closed - Opened by danishansari 6 months ago - 6 comments

#77 - prepare_for_inference error

Issue - State: closed - Opened by BeichenHuang 6 months ago - 17 comments

#76 - No module named 'hqq.engine' Error.

Issue - State: closed - Opened by yixuantt 6 months ago - 2 comments

#75 - Not able to save quantized model

Issue - State: closed - Opened by BeichenHuang 6 months ago - 5 comments

#74 - Can the quantization process be on CPU?

Issue - State: closed - Opened by mxjmtxrm 7 months ago - 4 comments

#73 - Does it support Hqq optimization algorithm in diffusion models?

Issue - State: closed - Opened by kadirnar 7 months ago - 1 comment

#71 - Add multi-gpu support for `from_quantized` call

Issue - State: closed - Opened by mobicham 7 months ago - 1 comment
Labels: enhancement

#70 - Problem in load from saved model

Issue - State: closed - Opened by uisikdag 7 months ago - 2 comments

#69 - axis fix

Pull Request - State: closed - Opened by envomp 7 months ago - 2 comments

#67 - pass along cache_dir during snapshot download

Pull Request - State: closed - Opened by andysalerno 7 months ago - 1 comment

#66 - Performance of quantized model

Issue - State: closed - Opened by thhung 7 months ago - 1 comment

#65 - Issue with torchao patching with loaded model

Issue - State: closed - Opened by rohit-gupta 7 months ago - 8 comments
Labels: bug

#64 - torch.compile() for quantized model

Issue - State: closed - Opened by DHKim0428 7 months ago - 3 comments

#62 - How to load quantized model with flash_attn?

Issue - State: closed - Opened by mxjmtxrm 7 months ago - 2 comments

#59 - Supported Model in README

Issue - State: closed - Opened by sanjeev-bhandari 7 months ago - 1 comment

#58 - sample code doesn't run

Issue - State: closed - Opened by LiangA 7 months ago - 6 comments

#57 - directly loading weights in specified device

Pull Request - State: closed - Opened by viraatdas 8 months ago - 9 comments

#56 - HQQ + Brevitas

Issue - State: closed - Opened by Giuseppe5 8 months ago - 1 comment
Labels: question

#55 - Issue with HQQLinear Layer in Stable Diffusion Model on Aten Backend

Issue - State: closed - Opened by DHKim0428 8 months ago - 7 comments

#54 - Readme save_quantized issue

Issue - State: closed - Opened by BeichenHuang 8 months ago - 1 comment

#52 - Support MPS

Issue - State: closed - Opened by benglewis 8 months ago - 5 comments

#51 - Initializing the model from state_dict

Pull Request - State: closed - Opened by envomp 8 months ago - 6 comments

#50 - Initializing the model from state_dict

Issue - State: closed - Opened by envomp 8 months ago - 3 comments

#49 - Request for amd support

Issue - State: closed - Opened by Wintoplay 8 months ago - 5 comments

#48 - forward cache_dir in HQQWrapper.from_quantized()

Pull Request - State: closed - Opened by MarkBenjamin 8 months ago - 1 comment

#47 - TypeError when load from_pretrain

Issue - State: closed - Opened by ghost 8 months ago - 10 comments

#46 - tensorflow or keras implementation

Issue - State: closed - Opened by patelprateek 8 months ago - 2 comments
Labels: enhancement

#45 - Difference between blog post and implementation

Issue - State: closed - Opened by dacorvo 8 months ago - 1 comment

#44 - Why does the 2bit 34b model take up 19GB of GPU memory

Issue - State: closed - Opened by Minami-su 8 months ago - 7 comments

#43 - How to accelerate the inference speed of 1bit+lora model

Issue - State: closed - Opened by Minami-su 8 months ago - 4 comments
Labels: enhancement

#42 - How to merge lora with 1bit model?

Issue - State: closed - Opened by Minami-su 8 months ago - 1 comment

#40 - transfer learning?

Issue - State: closed - Opened by NickyDark1 8 months ago - 2 comments
Labels: question

#39 - `.to` is not supported for HQQ-quantized models

Issue - State: closed - Opened by Abdullah-kwl 8 months ago - 5 comments
Labels: help wanted

#38 - torch cpu only?

Issue - State: closed - Opened by MarkBenjamin 8 months ago - 5 comments
Labels: question

#37 - Reuse Huggingface model cache directory as standard

Issue - State: closed - Opened by MarkBenjamin 8 months ago - 8 comments
Labels: bug

#36 - TypeError: HQQWrapper.from_quantized() got an unexpected keyword argument 'adapter'

Issue - State: closed - Opened by KainanYuval 8 months ago - 1 comment
Labels: help wanted

#35 - KeyError: 'self_attn.dense'

Issue - State: closed - Opened by Vasanthengineer4949 8 months ago - 2 comments
Labels: help wanted

#34 - Your session crashed after using all available RAM

Issue - State: closed - Opened by Abdullah-kwl 8 months ago - 1 comment

#33 - TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'

Issue - State: closed - Opened by Minami-su 8 months ago - 1 comment

#32 - Packing Format

Issue - State: closed - Opened by jeromeku 8 months ago - 9 comments
Labels: question

#31 - Load saved quant to continue training

Issue - State: closed - Opened by Sneakr 8 months ago - 12 comments
Labels: help wanted

#30 - Does Gemma support?

Issue - State: closed - Opened by NickyDark1 8 months ago - 2 comments
Labels: enhancement

#29 - HQQ OOMs on large models

Issue - State: closed - Opened by rationalism 8 months ago - 12 comments
Labels: enhancement

#28 - Support for Mixtral with vLLM

Issue - State: closed - Opened by HaritzPuerto 8 months ago - 1 comment
Labels: enhancement