Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / mit-han-lab/smoothquant issues and pull requests
#95 - Why only 4 layers?
Issue - State: open - Opened by VincentXWD 2 months ago
#94 - Support for Qwen2
Issue - State: open - Opened by JiaXinLI98 3 months ago
#93 - How to quantize the out_proj and fc2 module in OPT model family
Issue - State: open - Opened by yanchenmochen 3 months ago
#92 - How to quantize llama3?
Issue - State: open - Opened by jpyo0803 4 months ago
#91 - export_int8_model.py size issue
Issue - State: open - Opened by ljhyeok123 4 months ago - 1 comment
#90 - quantify other models,
Issue - State: open - Opened by AlexMa0 4 months ago
#89 - best Alpha value for Qwen 1.5 72B
Issue - State: open - Opened by Riskin1999 5 months ago
#88 - how to draw this result directly? is there any script?
Issue - State: open - Opened by foreverpiano 5 months ago - 1 comment
#87 - Huggingface_Hub Issue
Issue - State: open - Opened by faize5 6 months ago - 2 comments
#86 - Can SmoothQuant be used on ViT models?
Issue - State: open - Opened by n9s8a 7 months ago
#85 - Whether it can be supported stable diffusion
Issue - State: open - Opened by songh11 7 months ago
#84 - Inquiry about Int8 BMM overflow
Issue - State: open - Opened by luzai 7 months ago
#83 - Error when running smoothquant_opt_real_int8_demo.ipynb
Issue - State: open - Opened by kaijun924 7 months ago
#82 - how to use model.generate with smoothquant models
Issue - State: open - Opened by Hao-YunDeng 7 months ago
#81 - which version of transformer and datasets package do we need for this repo?
Issue - State: open - Opened by ghost 8 months ago - 2 comments
#80 - adjust activations
Issue - State: open - Opened by muzi0111 8 months ago
#79 - Question: why not need explicit scaling for activation X
Issue - State: open - Opened by ghost 8 months ago - 2 comments
#78 - RuntimeError: "clamp_min_cpu" not implemented for 'Half'
Issue - State: closed - Opened by ghost 8 months ago - 1 comment
#77 - Weight migration for Llama?
Issue - State: open - Opened by atyshka 8 months ago
#76 - Question about code
Issue - State: open - Opened by Lucky-Lance 8 months ago
#75 - How Can I Peft the Smoothquanted LLM?
Issue - State: open - Opened by LameloBally 8 months ago - 1 comment
#74 - bmm_s8t_s8n_s8t cannot run with this shape
Issue - State: closed - Opened by xiachong94 8 months ago
#73 - Can I reproduce SmoothQuant on CPU only since I see that torch-int8 requires a GPU, and I am only interested in inference on the CPU?
Issue - State: open - Opened by WCSY-YG 9 months ago
#72 - set quantize_output True the acc drop to 0
Issue - State: open - Opened by lonleyodd 10 months ago
#71 - ask for a function in linear.py for smoothquant in llama @Anizpz
Issue - State: open - Opened by msz12345 10 months ago
#70 - w8a8 Does it require dequantization during forward inference?
Issue - State: open - Opened by shatealaboxiaowang 11 months ago - 1 comment
#69 - general question about SmoothQuant kv-cache quantization
Issue - State: open - Opened by brisker 11 months ago
#68 - Got accuray=0 when trying _real_int8_demo.ipynb
Issue - State: open - Opened by leocnj 11 months ago
#67 - how to reproduce ppl of wikitext2?
Issue - State: open - Opened by Arthur-Ling 11 months ago - 1 comment
#66 - Activation scales for bloomz 7.1b
Issue - State: open - Opened by bil-ash 11 months ago - 1 comment
#65 - support auto search for per-layer smoothing alphas, and auto clip for weights, both bits-aware, can do W4A8 with minor loss
Pull Request - State: closed - Opened by yyfcc17 12 months ago - 2 comments
#64 - What does the accuracy in Figure 7 of the paper mean?
Issue - State: open - Opened by YundongGai 12 months ago
#63 - Demo code for Bloom model?
Issue - State: open - Opened by llCurious 12 months ago
#62 - Inference time decreases only by 7.5% on opt-6.7B
Issue - State: open - Opened by FurryMushroom about 1 year ago - 1 comment
#61 - llama-2-chat demo
Pull Request - State: closed - Opened by liquanfeng about 1 year ago
#60 - pickle.UnpicklingError: invalid load key, 'v'.
Issue - State: open - Opened by baiSongL about 1 year ago - 2 comments
#59 - failed to run int8 opt
Issue - State: closed - Opened by jackzhou121 about 1 year ago - 2 comments
#58 - UnpicklingError: invalid load key, 'v'.
Issue - State: closed - Opened by FurryMushroom about 1 year ago - 7 comments
#57 - add llama model support
Pull Request - State: open - Opened by AniZpZ about 1 year ago
#56 - which is faster between smoothquant and autogptq?
Issue - State: open - Opened by InkdyeHuang about 1 year ago
#55 - [BUG] Int8 inference with torch-int encounter errors
Issue - State: open - Opened by WelY1 about 1 year ago
#54 - How to calculate Alpha?
Issue - State: open - Opened by Triple-L about 1 year ago
#53 - Why do different models have the same size?
Issue - State: open - Opened by WelY1 about 1 year ago
#52 - Activation Channel Scales and Calibration
Issue - State: open - Opened by 520zw about 1 year ago - 1 comment
#51 - The ppl value of the opt-6.7b-smoothquant model shows abnormal performance
Issue - State: open - Opened by sitabulaixizawaluduo over 1 year ago - 1 comment
#50 - circular import
Issue - State: open - Opened by breaddance over 1 year ago
#49 - Can you explain in a step by step manner how we can implement this on our own model and dataset?
Issue - State: open - Opened by shahaamirbader over 1 year ago
#48 - How to reproduce the performance described in the paper
Issue - State: open - Opened by rolex-cjj over 1 year ago - 2 comments
#47 - How to conduct zero-shot experiments?
Issue - State: open - Opened by moodom over 1 year ago
#46 - Error loading `AutoModelForCausalLM` in `examples/generate_act_scales.py`
Issue - State: closed - Opened by julian-q over 1 year ago - 1 comment
#45 - Could not open smoothquant_opt_demo.ipynb
Issue - State: open - Opened by foreverpiano over 1 year ago - 1 comment
#44 - How can I make it support Bloom-7b?
Issue - State: open - Opened by moonlightian over 1 year ago
#43 - batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
Issue - State: closed - Opened by GuoYi0 over 1 year ago - 1 comment
#42 - Accuracy drop for Llama
Issue - State: open - Opened by fmo-mt over 1 year ago - 9 comments
#41 - No module named 'torch_int'
Issue - State: open - Opened by kaust2018 over 1 year ago - 7 comments
#40 - support GPTNEOX model
Issue - State: open - Opened by amazingkmy over 1 year ago
#39 - How to reproduce the result with lm-evaluation-harness
Issue - State: open - Opened by Ther-nullptr over 1 year ago
#38 - How can smoothquant be used in ConvNets
Issue - State: open - Opened by littletomatodonkey over 1 year ago - 1 comment
#37 - SmoothQuant for llama
Issue - State: open - Opened by shhn1 over 1 year ago - 2 comments
#36 - How to use SmoothQuant in FasterTransformer?
Issue - State: open - Opened by jiangsongHW over 1 year ago - 1 comment
#35 - can provide the relize of GLM model or ohter model which is in you paper?
Issue - State: open - Opened by o-github-o over 1 year ago
#34 - Doesn't work on gpt models.
Issue - State: closed - Opened by YaphetS-X over 1 year ago - 2 comments
#33 - git lfs pull ERROR
Issue - State: closed - Opened by lingffff over 1 year ago - 2 comments
#32 - How does it compares to Deepspeed?
Issue - State: open - Opened by LifeIsStrange over 1 year ago
#31 - git lfs is currently down,could you solve this problem?
Issue - State: closed - Opened by Anychnn over 1 year ago - 1 comment
#30 - No module named 'torch_int'
Issue - State: closed - Opened by liangxiaoyun over 1 year ago - 1 comment
#29 - 4bit weight quantization? 4bit activation quantization?
Issue - State: open - Opened by Thomas-MMJ over 1 year ago - 1 comment
#28 - what is the transformers' version
Issue - State: closed - Opened by lippman1125 over 1 year ago
#27 - How to implement this method combinded with decoder
Issue - State: open - Opened by lileilai over 1 year ago - 2 comments
#26 - Support for LLAMA
Issue - State: closed - Opened by fmac2000 over 1 year ago - 2 comments
#25 - Out of memory
Issue - State: open - Opened by lileilai over 1 year ago - 1 comment
#24 - Is O1 and O2 version for smoothquant available?
Issue - State: open - Opened by Ther-nullptr over 1 year ago
#23 - Missing the activation scales of opt-125m
Issue - State: closed - Opened by Ther-nullptr over 1 year ago - 1 comment
#22 - What is the difference between `get_act_scales` and `get_static_decoder_layer_scales`
Issue - State: open - Opened by CaffreyR almost 2 years ago - 1 comment
#21 - Post-LayerNorm support
Issue - State: open - Opened by minghaoBD almost 2 years ago - 1 comment
#20 - mseznec/export weights for ft fixes
Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago
#19 - add option to export scaling factors for FT
Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago - 1 comment
#18 - Visualization tool
Issue - State: open - Opened by ArulselvanMadhavan almost 2 years ago - 2 comments
#17 - Size mismatch
Issue - State: open - Opened by anujnayyar1 almost 2 years ago - 1 comment
#16 - Bloom code
Issue - State: open - Opened by Toan-Do almost 2 years ago - 2 comments
#15 - paper says smoothing all linear layers, but code seems to smooth only the qkv projection in attention and the first fc in ffn?
Issue - State: closed - Opened by chenho74 almost 2 years ago - 5 comments
#14 - Test smoothquant accuracy for just fc2 layer
Issue - State: closed - Opened by erichan1 almost 2 years ago - 7 comments
#13 - error encounctered when loading act_scales
Issue - State: closed - Opened by chenho74 almost 2 years ago - 2 comments
#12 - different smoothquant levels
Issue - State: closed - Opened by erichan1 almost 2 years ago - 3 comments
#11 - Input to ReLU is quantized to int8? An error in quantization_flow.png?
Issue - State: closed - Opened by chenho74 almost 2 years ago - 2 comments
#10 - The Naive W8A8 Quantized model accuracy of medium size model (e.g opt-2.7b)
Issue - State: closed - Opened by LiuShixing almost 2 years ago - 2 comments
#9 - Merge branch 'pr/mickaelseznec/3' into mickaelseznec-mseznec/fastertransformer-compat
Pull Request - State: closed - Opened by Guangxuan-Xiao almost 2 years ago
#8 - Does it support freeze the quantized pretrained model then use prefix tuning ?
Issue - State: closed - Opened by LiuShixing almost 2 years ago - 1 comment
#7 - Latency calculation for OPT 175B
Issue - State: closed - Opened by tangbinh almost 2 years ago - 2 comments
#6 - SmoothQuant real-INT8 inference for PyTorch
Pull Request - State: closed - Opened by Guangxuan-Xiao almost 2 years ago - Labels: enhancement
#5 - Eta for PyTorch
Issue - State: closed - Opened by erichan1 almost 2 years ago - 2 comments
#4 - Support for quantizing bf16 model
Issue - State: closed - Opened by erichan1 almost 2 years ago - 2 comments
#3 - mseznec/fastertransformer-compat
Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago - 2 comments
#2 - Calculating quantization scales for new models?
Issue - State: closed - Opened by singularperturbation almost 2 years ago - 1 comment
#1 - use smoothquant in different models architucture [proposed Label] Question
Issue - State: closed - Opened by deep-matter almost 2 years ago - 1 comment