Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / casper-hansen/AutoAWQ issues and pull requests

#99 - GPTBigCode 15B

Issue - State: closed - Opened by SebastianBodza 12 months ago - 4 comments

#98 - SmoothQuant implementation

Pull Request - State: closed - Opened by casper-hansen 12 months ago - 2 comments

#97 - Pass arguments to AutoConfig

Pull Request - State: closed - Opened by s4rduk4r 12 months ago - 1 comment

#96 - Only apply attention mask if seqlen is greater than 1

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#95 - Refactor cache and embedding modules

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#94 - Error with demo code

Issue - State: closed - Opened by EricBLivingston 12 months ago - 3 comments

#93 - quant oom

Issue - State: closed - Opened by esmeetu 12 months ago - 3 comments

#92 - Turing inference support (Colab+Kaggle working)

Pull Request - State: closed - Opened by casper-hansen 12 months ago - 1 comment

#91 - Add quantconfig

Pull Request - State: closed - Opened by GTimothee 12 months ago - 1 comment

#90 - Mistral fused modules

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#89 - Fix Falcon n_kv_heads parameter

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#88 - Fix unexpected keyword

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#87 - Issue with model.bin file after quantizing

Issue - State: closed - Opened by RonanKMcGovern 12 months ago - 5 comments

#86 - QuantAttentionFused.forward() padding_mask Error

Issue - State: closed - Opened by RonanKMcGovern 12 months ago - 2 comments

#85 - Add LoRA fine-tuning to AWQ

Issue - State: open - Opened by RonanKMcGovern 12 months ago - 17 comments
Labels: enhancement

#84 - Faster build, fix "no space left".

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#83 - Add mistral

Pull Request - State: closed - Opened by jamesdborin 12 months ago

#81 - I have incompatible transformers

Issue - State: closed - Opened by Galaxia-mk 12 months ago - 1 comment

#80 - Add low_cpu_mem_usage=True in example

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#79 - Mistral support

Pull Request - State: closed - Opened by casper-hansen 12 months ago - 3 comments

#78 - Qwen support

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#77 - Offloading to cpu and disk

Pull Request - State: closed - Opened by s4rduk4r 12 months ago

#76 - AWQ inference is 50% slower than GPTQ

Issue - State: closed - Opened by gestalt73 12 months ago - 4 comments

#75 - Fix KV cache shapes error

Pull Request - State: closed - Opened by casper-hansen 12 months ago

#74 - AutoAWQ adoption in other projects yet?

Issue - State: closed - Opened by yhyu13 12 months ago - 2 comments

#73 - Error when installing autoawq with conda

Issue - State: closed - Opened by sunbeibei-hub 12 months ago - 5 comments

#72 - Is AWQ 8-bit quantization supported?

Issue - State: closed - Opened by sunbeibei-hub 12 months ago - 3 comments

#71 - INT8 support - SmoothQuant

Pull Request - State: closed - Opened by casper-hansen 12 months ago - 1 comment

#70 - CodeLlama 34B errors out after 3+ completions

Issue - State: closed - Opened by abacaj 12 months ago - 10 comments

#69 - Use typing classes over base types

Pull Request - State: closed - Opened by VikParuchuri 12 months ago - 1 comment

#66 - Improve model loading

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#65 - CUDA out of memory when quantizing LLaMA-70b

Issue - State: closed - Opened by Enjia about 1 year ago - 2 comments

#64 - Must split_k_iters be 8 for GEMM or GEMMv2 kernel?

Issue - State: closed - Opened by PYNing about 1 year ago - 2 comments

#62 - Refactor quantization code

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#61 - Add GPT BigCode support (StarCoder)

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#60 - Support kv_heads

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#59 - Strange interaction for DeepZero via text-generation-webui?

Issue - State: closed - Opened by cal066 about 1 year ago - 5 comments

#58 - 2x faster context processing with GEMV

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#57 - Load the model without using GPU memory, using only RAM and CPU

Issue - State: closed - Opened by sunbeibei-hub about 1 year ago - 1 comment

#56 - Added inference unit tests for all architectures except gpt-j

Pull Request - State: closed - Opened by bdambrosio about 1 year ago

#55 - Falcon-7b quantization failure

Issue - State: closed - Opened by bdambrosio about 1 year ago - 1 comment

#54 - Handle `n_kv_heads` for fused layers

Issue - State: closed - Opened by casper-hansen about 1 year ago
Labels: enhancement

#53 - support windows

Pull Request - State: closed - Opened by qwopqwop200 about 1 year ago - 4 comments

#52 - Support InternLM

Issue - State: open - Opened by casper-hansen about 1 year ago - 1 comment
Labels: help wanted, good first issue

#51 - Support Qwen

Issue - State: closed - Opened by casper-hansen about 1 year ago
Labels: help wanted, good first issue

#50 - Support Baichuan

Issue - State: closed - Opened by casper-hansen about 1 year ago - 2 comments
Labels: help wanted, good first issue

#49 - Support GPT-2

Issue - State: open - Opened by casper-hansen about 1 year ago - 1 comment
Labels: help wanted, good first issue

#48 - Optimize GEMV kernel - context and batch size

Issue - State: closed - Opened by casper-hansen about 1 year ago
Labels: help wanted

#47 - Safetensors and model sharding

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#46 - 'ellipsis' object has no attribute 'shape'

Issue - State: closed - Opened by samyoed about 1 year ago - 3 comments

#45 - INT8 quantization support

Issue - State: open - Opened by casper-hansen about 1 year ago - 3 comments
Labels: enhancement, help wanted

#44 - Implement metal kernel for GPUs on Mac

Issue - State: open - Opened by casper-hansen about 1 year ago
Labels: help wanted

#43 - Unit tests

Pull Request - State: closed - Opened by bdambrosio about 1 year ago - 3 comments

#42 - push_to_hub error

Issue - State: open - Opened by ryanshrott about 1 year ago - 1 comment

#41 - Support for `gpt-neox` model

Issue - State: closed - Opened by hyungwonchoi about 1 year ago - 1 comment

#40 - [NEW] GEMV kernel implementation

Pull Request - State: closed - Opened by casper-hansen about 1 year ago - 2 comments

#39 - Create tuning section in quant_config

Issue - State: open - Opened by casper-hansen about 1 year ago
Labels: enhancement

#38 - Flash attention and TorchAttention module

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#37 - Exllama optimize q4 matmul and fixbug

Pull Request - State: closed - Opened by qwopqwop200 about 1 year ago

#36 - Implement weight map and weight sharding

Issue - State: closed - Opened by casper-hansen about 1 year ago - 1 comment
Labels: help wanted, good first issue

#35 - Support Falcon 180B

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#34 - Benchmark test data

Issue - State: closed - Opened by wanzhenchn about 1 year ago - 8 comments

#33 - Any plan on updating bloom benchmark?

Issue - State: closed - Opened by DRL36 about 1 year ago - 1 comment

#32 - 📌 AutoAWQ Roadmap

Issue - State: closed - Opened by casper-hansen about 1 year ago - 11 comments

#31 - Add unit/integration testing

Issue - State: open - Opened by casper-hansen about 1 year ago - 4 comments
Labels: help wanted, good first issue

#30 - Exllama integration

Pull Request - State: closed - Opened by casper-hansen about 1 year ago - 15 comments

#29 - batching sample; no disable_fused_layers for FP16 model

Pull Request - State: closed - Opened by wanzhenchn about 1 year ago - 2 comments

#28 - [BUG] Fix illegal memory access + Quantized Multi-GPU support

Pull Request - State: closed - Opened by casper-hansen about 1 year ago - 1 comment

#27 - Allow user to use custom calibration data for quantization

Pull Request - State: closed - Opened by boehm-e about 1 year ago - 2 comments

#26 - Implement batch size for speed test

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#25 - support speedtest to benchmark FP16 model

Pull Request - State: closed - Opened by wanzhenchn about 1 year ago - 3 comments

#24 - remove fixed compute capabilities list

Pull Request - State: closed - Opened by wanzhenchn about 1 year ago - 3 comments

#23 - YaRN support for LLaMa models

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#22 - add llava model support

Issue - State: closed - Opened by qZhang88 about 1 year ago - 3 comments
Labels: enhancement, good first issue

#21 - fuse_layers bug fix

Pull Request - State: closed - Opened by qwopqwop200 about 1 year ago - 2 comments

#20 - Bug hunt: illegal memory access

Issue - State: closed - Opened by casper-hansen about 1 year ago - 10 comments
Labels: bug, help wanted

#19 - Implement xformers layernorm (2x faster than nn.LayerNorm)

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#18 - Refactor fused modules

Pull Request - State: closed - Opened by casper-hansen about 1 year ago

#17 - Add multi-gpu support to fused layers

Issue - State: closed - Opened by casper-hansen about 1 year ago - 1 comment
Labels: help wanted

#16 - windows support

Pull Request - State: closed - Opened by qwopqwop200 about 1 year ago - 1 comment

#15 - Support batch input for performance test

Issue - State: closed - Opened by wanzhenchn about 1 year ago - 2 comments

#14 - Windows build support

Issue - State: closed - Opened by casper-hansen about 1 year ago
Labels: help wanted

#13 - Cuda issue when trying to install

Issue - State: closed - Opened by mhenrichsen about 1 year ago - 5 comments

#12 - Recursion error when creating AutoTokenizer for llama-13b-hf

Issue - State: closed - Opened by wanzhenchn about 1 year ago - 4 comments

#11 - Compatibility in Python 3.8 when running entry.py

Issue - State: closed - Opened by wanzhenchn about 1 year ago - 2 comments

#10 - Quantize models with custom datasets

Issue - State: closed - Opened by casper-hansen about 1 year ago
Labels: enhancement

#9 - Release PyPi package + Create GitHub workflow

Pull Request - State: closed - Opened by casper-hansen about 1 year ago - 3 comments

#8 - Create class QuantConfig

Issue - State: closed - Opened by casper-hansen about 1 year ago - 8 comments
Labels: good first issue

#7 - Clean up fused modules

Issue - State: closed - Opened by casper-hansen about 1 year ago - 1 comment
Labels: good first issue

#6 - Interested in Hugging Face transformers integration?

Issue - State: closed - Opened by younesbelkada about 1 year ago - 2 comments

#5 - Implement BigCode models (StarCoder etc.)

Issue - State: closed - Opened by casper-hansen about 1 year ago - 7 comments

#4 - Experiment with implementing AWQ for BERT models

Issue - State: open - Opened by casper-hansen about 1 year ago - 3 comments
Labels: help wanted

#3 - Implement exllama q4_matmul kernel as alternative

Issue - State: open - Opened by casper-hansen about 1 year ago - 5 comments
Labels: enhancement, help wanted

#2 - Implement faster LayerNorm than nn.LayerNorm

Issue - State: closed - Opened by casper-hansen about 1 year ago - 1 comment
Labels: enhancement, help wanted

#1 - Add GPTJ Support

Pull Request - State: closed - Opened by jamesdborin about 1 year ago - 3 comments