Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / mobiusml/hqq issues and pull requests
#134 - Saving quantized Aria weights
Issue - State: open - Opened by leon-seidel 9 days ago - 2 comments
#133 - model(aria): do not restrict sdpa to MATH in prefill phase
Pull Request - State: closed - Opened by xffxff 9 days ago - 1 comment
#130 - 8bit + Aten + compile
Issue - State: open - Opened by zhangy659 29 days ago - 6 comments
#129 - cache_size_limit reached
Issue - State: closed - Opened by zhangy659 about 1 month ago - 22 comments
#128 - 4bit slower?
Issue - State: closed - Opened by zhangy659 about 1 month ago - 3 comments
#127 - [rfc][dont merge] Use the skip_guard_eval stance to remove torch.compile guard overhead
Pull Request - State: open - Opened by anijain2305 about 1 month ago - 3 comments
#126 - Fix SDPA context manager perf regression
Pull Request - State: closed - Opened by msaroufim about 1 month ago - 1 comment
#125 - Support for HQQ Quantization: Compatibility with LLava and Qwen Models?
Issue - State: closed - Opened by NEWbie0709 about 1 month ago - 10 comments
#124 - Group size and restrictions: documentation and implementation contradict each other
Issue - State: open - Opened by Maykeye about 1 month ago - 5 comments
#123 - slow loading process of pretrained model for finetuning in transformers
Issue - State: closed - Opened by jiaqiw09 about 2 months ago - 4 comments
#122 - KeyError: 'offload_meta'
Issue - State: closed - Opened by kadirnar about 2 months ago - 1 comment
#121 - Fix filename in `setup_torch.py`
Pull Request - State: closed - Opened by larin92 2 months ago - 8 comments
#120 - CUDA error when trying to use llama3.1 8B 4bit quantized model sample
Issue - State: closed - Opened by PatrickDahlin 2 months ago - 8 comments
#119 - integrated into gpt-fast
Issue - State: closed - Opened by kaizizzzzzz 3 months ago - 1 comment
#118 - Hqq vs gguf
Issue - State: closed - Opened by blap 3 months ago - 3 comments
#116 - torch.compile() the quantization method
Pull Request - State: open - Opened by rationalism 3 months ago - 6 comments
#115 - question about fine tune 1bit-quanitzed model
Issue - State: closed - Opened by zxbjushuai 3 months ago - 35 comments
#114 - Issue when loading the quantized model
Issue - State: closed - Opened by NEWbie0709 3 months ago - 5 comments
#113 - Question about Quantization
Issue - State: closed - Opened by NEWbie0709 3 months ago - 4 comments
#112 - docs: update Readme.md
Pull Request - State: open - Opened by eltociear 3 months ago
#111 - Quesiton on the speed for generating the response
Issue - State: closed - Opened by NEWbie0709 3 months ago - 18 comments
#110 - `hqq/backends/torchao.py` line 177, KeyError: 'scale'
Issue - State: closed - Opened by egorsmkv 3 months ago - 13 comments
#109 - zero and scale quant
Issue - State: closed - Opened by kaizizzzzzz 3 months ago - 1 comment
#108 - RuntimeError: Expected in.dtype() == at::kInt to be true, but got false.
Issue - State: closed - Opened by egorsmkv 3 months ago - 9 comments
#107 - TypeError: Object of type dtype is not JSON serializable
Issue - State: closed - Opened by zxbjushuai 3 months ago - 11 comments
#106 - Add recommended inductor config for speedup
Pull Request - State: closed - Opened by yiliu30 3 months ago - 1 comment
#105 - Warning: failed to import the BitBlas backend
Issue - State: closed - Opened by jinz2014 4 months ago - 7 comments
#104 - Easy way to run lm evaluation harness
Issue - State: closed - Opened by pythonLoader 4 months ago - 1 comment
#103 - Expected in.dtype() == at::kInt to be true, but got false
Issue - State: closed - Opened by jonashaag 4 months ago - 14 comments
#102 - Bug of the saved model when applying zero and scale quantization
Issue - State: closed - Opened by kaizizzzzzz 4 months ago - 1 comment
#101 - Support Gemma quantization
Issue - State: closed - Opened by kaizizzzzzz 4 months ago - 2 comments
#100 - Weight Sharding
Issue - State: closed - Opened by winglian 4 months ago - 2 comments
#99 - RuntimeError: Expected in.dtype() == at::kInt to be true, but got false. (Could this error message be improved? If so, please report an enhancement request to PyTorch.)
Issue - State: closed - Opened by kadirnar 4 months ago - 1 comment
#98 - Use GPTQModel for GPTQ quantization: 2x faster + better PPL
Pull Request - State: closed - Opened by Qubitium 4 months ago - 2 comments
#97 - 3-bit quantization weight data type issue
Issue - State: closed - Opened by BeichenHuang 4 months ago - 10 comments
#96 - About the implentation of .cpu()
Issue - State: open - Opened by reflectionie 4 months ago - 1 comment
#95 - OSError: libnvrtc.so.12: cannot open shared object file: No such file or directory
Issue - State: closed - Opened by kadirnar 5 months ago - 1 comment
#94 - bitblas introduces dependency on CUDA version
Issue - State: closed - Opened by zodiacg 5 months ago - 3 comments
#93 - Add way to save quantize config and can be loaded again
Pull Request - State: closed - Opened by fahadh4ilyas 5 months ago - 8 comments
#92 - module 'torch.library' has no attribute 'custom_op'
Issue - State: closed - Opened by fahadh4ilyas 5 months ago - 4 comments
#91 - Fix hf load
Pull Request - State: closed - Opened by fahadh4ilyas 5 months ago - 3 comments
#90 - 2-bit quantization representation
Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 3 comments
#89 - Weird problem in loading quantized_model + lora_adpter
Issue - State: closed - Opened by kaizizzzzzz 5 months ago
#88 - 1 bit inference
Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 4 comments
#87 - Group_Size setting
Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 1 comment
#86 - Activation quantization
Issue - State: closed - Opened by kaizizzzzzz 5 months ago - 9 comments
#85 - hqq+ lora ValueError || ValueError: Unable to create tensor, you should probably activate truncation and/or padding with 'padding=True' 'truncation=True'
Issue - State: closed - Opened by tellyoung 5 months ago - 3 comments
#84 - Is HQQLinearLoRAWithFakeQuant differentiable?
Issue - State: closed - Opened by lippman1125 6 months ago - 1 comment
#83 - Question about quantization.
Issue - State: closed - Opened by mxjmtxrm 6 months ago - 2 comments
#82 - Running HQQ Quantized Models on CPU
Issue - State: closed - Opened by 49Simon 6 months ago - 3 comments
#81 - AttributeError: 'HQQLinearTorchWeightOnlynt4' object has no attribute 'weight'
Issue - State: closed - Opened by ChuanhongLi 6 months ago - 7 comments
#80 - [Question] Model Outputting Gibberish After Quantization
Issue - State: closed - Opened by DefinitlyEvil 6 months ago - 4 comments
#79 - AttributeError: 'LlamaForCausalLM' object has no attribute '_setup_cache'
Issue - State: closed - Opened by ChuanhongLi 6 months ago - 3 comments - Labels: bug
#78 - HQQ for convolutional layers
Issue - State: closed - Opened by danishansari 6 months ago - 6 comments
#77 - prepare_for_inference error
Issue - State: closed - Opened by BeichenHuang 6 months ago - 17 comments
#76 - No module named 'hqq.engine' Error.
Issue - State: closed - Opened by yixuantt 6 months ago - 2 comments
#75 - Not able to save quantized model
Issue - State: closed - Opened by BeichenHuang 6 months ago - 5 comments
#74 - Can the quantization process be on CPU?
Issue - State: closed - Opened by mxjmtxrm 7 months ago - 4 comments
#73 - Does it support Hqq optimization algorithm in diffusion models?
Issue - State: closed - Opened by kadirnar 7 months ago - 1 comment
#72 - Compatibility Issue: TypeError for Union Type Hints with Python Versions Below 3.10
Issue - State: closed - Opened by hjh0119 7 months ago - 1 comment
#71 - Add multi-gpu support for `from_quantized` call
Issue - State: closed - Opened by mobicham 7 months ago - 1 comment - Labels: enhancement
#70 - Problem in load from saved model
Issue - State: closed - Opened by uisikdag 7 months ago - 2 comments
#69 - axis fix
Pull Request - State: closed - Opened by envomp 7 months ago - 2 comments
#67 - pass along cache_dir during snapshot download
Pull Request - State: closed - Opened by andysalerno 7 months ago - 1 comment
#66 - Performance of quantized model
Issue - State: closed - Opened by thhung 7 months ago - 1 comment
#65 - Issue with torchao patching with loaded model
Issue - State: closed - Opened by rohit-gupta 7 months ago - 8 comments - Labels: bug
#64 - torch.compile() for quantized model
Issue - State: closed - Opened by DHKim0428 7 months ago - 3 comments
#62 - How to load quantized model with flash_attn?
Issue - State: closed - Opened by mxjmtxrm 7 months ago - 2 comments
#61 - load the model into GPU or device_map using HQQModelForCausalLM.from_pretrained?
Issue - State: closed - Opened by icoicqico 7 months ago - 12 comments
#59 - Supported Model in README
Issue - State: closed - Opened by sanjeev-bhandari 7 months ago - 1 comment
#58 - smaple code doesn't run
Issue - State: closed - Opened by LiangA 7 months ago - 6 comments
#57 - directly loading weights in specified device
Pull Request - State: closed - Opened by viraatdas 8 months ago - 9 comments
#56 - HQQ + Brevitas
Issue - State: closed - Opened by Giuseppe5 8 months ago - 1 comment - Labels: question
#55 - Issue with HQQLinear Layer in Stable Diffusion Model on Aten Backend
Issue - State: closed - Opened by DHKim0428 8 months ago - 7 comments
#54 - Readme save_quantized issue
Issue - State: closed - Opened by BeichenHuang 8 months ago - 1 comment
#52 - Support MPS
Issue - State: closed - Opened by benglewis 8 months ago - 5 comments
#51 - Initializing the model from state_dict
Pull Request - State: closed - Opened by envomp 8 months ago - 6 comments
#50 - Initializing the model from state_dict
Issue - State: closed - Opened by envomp 8 months ago - 3 comments
#49 - Request for amd support
Issue - State: closed - Opened by Wintoplay 8 months ago - 5 comments
#48 - forward cache_dir in HQQWrapper.from_quantized()
Pull Request - State: closed - Opened by MarkBenjamin 8 months ago - 1 comment
#47 - TypeError when load from_pretrain
Issue - State: closed - Opened by ghost 8 months ago - 10 comments
#46 - tensorflow or keras implementation
Issue - State: closed - Opened by patelprateek 8 months ago - 2 comments - Labels: enhancement
#45 - Difference between blog post and implementation
Issue - State: closed - Opened by dacorvo 8 months ago - 1 comment
#44 - Why does the 2bit 34b model take up 19GB of GPU memory
Issue - State: closed - Opened by Minami-su 8 months ago - 7 comments
#43 - How to accelerate the inference speed of 1bit+lora model
Issue - State: closed - Opened by Minami-su 8 months ago - 4 comments - Labels: enhancement
#42 - How to merge lora with 1bit model?
Issue - State: closed - Opened by Minami-su 8 months ago - 1 comment
#41 - An error occurred when I was training a 1bit model using lora........(element 0 of tensors does not require grad and does not have a grad_fn)
Issue - State: closed - Opened by Minami-su 8 months ago - 16 comments - Labels: help wanted
#40 - transfer learning?
Issue - State: closed - Opened by NickyDark1 8 months ago - 2 comments - Labels: question
#39 - `.to` is not supported for HQQ-quantized models
Issue - State: closed - Opened by Abdullah-kwl 8 months ago - 5 comments - Labels: help wanted
#38 - torch cpu only?
Issue - State: closed - Opened by MarkBenjamin 8 months ago - 5 comments - Labels: question
#37 - Reuse Huggingface model cache directory as standard
Issue - State: closed - Opened by MarkBenjamin 8 months ago - 8 comments - Labels: bug
#36 - TypeError: HQQWrapper.from_quantized() got an unexpected keyword argument 'adapter'
Issue - State: closed - Opened by KainanYuval 8 months ago - 1 comment - Labels: help wanted
#35 - KeyError: 'self_attn.dense'
Issue - State: closed - Opened by Vasanthengineer4949 8 months ago - 2 comments - Labels: help wanted
#34 - Your session crashed after using all available RAM
Issue - State: closed - Opened by Abdullah-kwl 8 months ago - 1 comment
#33 - TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
Issue - State: closed - Opened by Minami-su 8 months ago - 1 comment
#32 - Packing Format
Issue - State: closed - Opened by jeromeku 8 months ago - 9 comments - Labels: question
#31 - Load saved quant to continue training
Issue - State: closed - Opened by Sneakr 8 months ago - 12 comments - Labels: help wanted
#30 - Does Gemma support?
Issue - State: closed - Opened by NickyDark1 8 months ago - 2 comments - Labels: enhancement
#29 - HQQ OOMs on large models
Issue - State: closed - Opened by rationalism 8 months ago - 12 comments - Labels: enhancement
#28 - Support for Mixtral with vLLM
Issue - State: closed - Opened by HaritzPuerto 8 months ago - 1 comment - Labels: enhancement
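The listing above follows a regular shape: a title line (`#N - title`) followed by metadata (`Issue`/`Pull Request`, `State: open|closed`, opener, age, optional comment count). As a minimal sketch, the entries can be tallied mechanically; `summarize` below is a hypothetical helper, and the regex assumes only the entry format visible in this listing (it tolerates the metadata appearing on the same line as the type or on the next line).

```python
import re

# Matches one entry: a "#N - title" line, then "Issue"/"Pull Request"
# followed (possibly after a line break) by "State: open" or "State: closed".
ENTRY_RE = re.compile(
    r"#(?P<number>\d+) - (?P<title>.+)\n"
    r"(?P<kind>Issue|Pull Request) -\s*State: (?P<state>open|closed)"
)

def summarize(listing: str) -> dict:
    """Count open/closed entries and issue/PR entries in a scraped listing."""
    counts = {"open": 0, "closed": 0, "Issue": 0, "Pull Request": 0}
    for m in ENTRY_RE.finditer(listing):
        counts[m.group("state")] += 1
        counts[m.group("kind")] += 1
    return counts

# Sample input copied from the first two entries of the listing above.
sample = """#134 - Saving quantized Aria weights
Issue - State: open - Opened by leon-seidel 9 days ago - 2 comments
#133 - model(aria): do not restrict sdpa to MATH in prefill phase
Pull Request - State: closed - Opened by xffxff 9 days ago - 1 comment"""

print(summarize(sample))  # → {'open': 1, 'closed': 1, 'Issue': 1, 'Pull Request': 1}
```

Feeding it the full page text would give the open/closed and issue/PR breakdown for the whole index.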