Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / mit-han-lab/smoothquant issues and pull requests
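The listing below can also be retrieved programmatically from the API service. As a minimal sketch (the endpoint path and query parameter here are assumptions based on common ecosyste.ms API conventions, not confirmed against the service's docs), a request URL for this repository's issues might be built like:

```python
# Build a request URL for the ecosyste.ms issues API.
# NOTE: the route below is an assumption for illustration only;
# consult the service's own API documentation for the real path.
from urllib.parse import quote

def issues_url(host: str, repo: str, page: int = 1) -> str:
    base = "https://issues.ecosyste.ms/api/v1"
    # The repo name contains a slash, so it is percent-encoded.
    return f"{base}/hosts/{quote(host)}/repositories/{quote(repo, safe='')}/issues?page={page}"

url = issues_url("GitHub", "mit-han-lab/smoothquant")
```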

#95 - Why only 4 layers?

Issue - State: open - Opened by VincentXWD 2 months ago

#94 - Support for Qwen2

Issue - State: open - Opened by JiaXinLI98 3 months ago

#92 - How to quantize llama3?

Issue - State: open - Opened by jpyo0803 4 months ago

#91 - export_int8_model.py size issue

Issue - State: open - Opened by ljhyeok123 4 months ago - 1 comment

#90 - Quantize other models

Issue - State: open - Opened by AlexMa0 4 months ago

#89 - best Alpha value for Qwen 1.5 72B

Issue - State: open - Opened by Riskin1999 5 months ago

#88 - How to draw this result directly? Is there any script?

Issue - State: open - Opened by foreverpiano 5 months ago - 1 comment

#87 - Huggingface_Hub Issue

Issue - State: open - Opened by faize5 6 months ago - 2 comments

#86 - Can SmoothQuant be used on ViT models?

Issue - State: open - Opened by n9s8a 7 months ago

#85 - Can Stable Diffusion be supported?

Issue - State: open - Opened by songh11 7 months ago

#84 - Inquiry about Int8 BMM overflow

Issue - State: open - Opened by luzai 7 months ago

#82 - How to use model.generate with SmoothQuant models

Issue - State: open - Opened by Hao-YunDeng 7 months ago

#81 - Which versions of the transformers and datasets packages do we need for this repo?

Issue - State: open - Opened by ghost 8 months ago - 2 comments

#80 - adjust activations

Issue - State: open - Opened by muzi0111 8 months ago

#79 - Question: why is explicit scaling not needed for activation X?

Issue - State: open - Opened by ghost 8 months ago - 2 comments

#78 - RuntimeError: "clamp_min_cpu" not implemented for 'Half'

Issue - State: closed - Opened by ghost 8 months ago - 1 comment

#77 - Weight migration for Llama?

Issue - State: open - Opened by atyshka 8 months ago

#76 - Question about code

Issue - State: open - Opened by Lucky-Lance 8 months ago

#75 - How can I apply PEFT to a SmoothQuant-quantized LLM?

Issue - State: open - Opened by LameloBally 8 months ago - 1 comment

#74 - bmm_s8t_s8n_s8t cannot run with this shape

Issue - State: closed - Opened by xiachong94 8 months ago

#72 - Setting quantize_output=True makes accuracy drop to 0

Issue - State: open - Opened by lonleyodd 10 months ago

#70 - W8A8: does it require dequantization during forward inference?

Issue - State: open - Opened by shatealaboxiaowang 11 months ago - 1 comment
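
As background for questions like #70: in a typical W8A8 pipeline, the int8 matmul accumulates in int32, and a float result is recovered afterwards by multiplying the accumulator by the product of the activation and weight scales. A schematic sketch of the arithmetic only (all scale and tensor values here are illustrative, not the repo's actual kernels):

```python
# Schematic W8A8 flow: quantize inputs, int8 dot product with int32
# accumulation, then dequantize with the combined scale.
# Values are illustrative; real implementations use fused CUDA kernels.

def quantize(x, scale):
    q = round(x / scale)
    return max(-128, min(127, q))  # clamp to the int8 range

s_x, s_w = 0.1, 0.05            # per-tensor scales (illustrative)
x = [1.0, 2.0]                  # activations
w = [0.5, 0.25]                 # one weight column

qx = [quantize(v, s_x) for v in x]
qw = [quantize(v, s_w) for v in w]
acc = sum(a * b for a, b in zip(qx, qw))  # int32 accumulator
y = acc * (s_x * s_w)                     # dequantized float output, ~= 1.0*0.5 + 2.0*0.25
```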

#68 - Got accuracy=0 when trying _real_int8_demo.ipynb

Issue - State: open - Opened by leocnj 11 months ago

#67 - How to reproduce the perplexity (ppl) on wikitext2?

Issue - State: open - Opened by Arthur-Ling 11 months ago - 1 comment

#66 - Activation scales for bloomz 7.1b

Issue - State: open - Opened by bil-ash 11 months ago - 1 comment

#63 - Demo code for Bloom model?

Issue - State: open - Opened by llCurious 12 months ago

#62 - Inference time decreases only by 7.5% on opt-6.7B

Issue - State: open - Opened by FurryMushroom about 1 year ago - 1 comment

#61 - llama-2-chat demo

Pull Request - State: closed - Opened by liquanfeng about 1 year ago

#60 - pickle.UnpicklingError: invalid load key, 'v'.

Issue - State: open - Opened by baiSongL about 1 year ago - 2 comments

#59 - failed to run int8 opt

Issue - State: closed - Opened by jackzhou121 about 1 year ago - 2 comments

#58 - UnpicklingError: invalid load key, 'v'.

Issue - State: closed - Opened by FurryMushroom about 1 year ago - 7 comments

#57 - add llama model support

Pull Request - State: open - Opened by AniZpZ about 1 year ago

#56 - Which is faster, SmoothQuant or AutoGPTQ?

Issue - State: open - Opened by InkdyeHuang about 1 year ago

#55 - [BUG] Int8 inference with torch-int encounter errors

Issue - State: open - Opened by WelY1 about 1 year ago

#54 - How to calculate Alpha?

Issue - State: open - Opened by Triple-L about 1 year ago
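
For context on questions like #54 and #89: the SmoothQuant paper computes per-input-channel smoothing factors as s_j = max|X_j|^α / max|W_j|^(1−α), where α (0.5 by default) balances how much quantization difficulty is migrated from activations to weights. A minimal plain-Python sketch, with illustrative variable names:

```python
# Per-channel SmoothQuant smoothing factors:
#   s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)
# act_max and w_max are per-input-channel absolute maxima collected
# during calibration; names here are illustrative.

def smoothing_scales(act_max, w_max, alpha=0.5):
    return [a ** alpha / w ** (1.0 - alpha) for a, w in zip(act_max, w_max)]

# An outlier channel (|x| up to 10) gets a larger scale, shifting
# its quantization difficulty into the weights.
scales = smoothing_scales([10.0, 2.0], [0.5, 0.5])
```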

#53 - Why do different models have the same size?

Issue - State: open - Opened by WelY1 about 1 year ago

#52 - Activation Channel Scales and Calibration

Issue - State: open - Opened by 520zw about 1 year ago - 1 comment

#50 - circular import

Issue - State: open - Opened by breaddance over 1 year ago

#48 - How to reproduce the performance described in the paper

Issue - State: open - Opened by rolex-cjj over 1 year ago - 2 comments

#47 - How to conduct zero-shot experiments?

Issue - State: open - Opened by moodom over 1 year ago

#46 - Error loading `AutoModelForCausalLM` in `examples/generate_act_scales.py`

Issue - State: closed - Opened by julian-q over 1 year ago - 1 comment

#45 - Could not open smoothquant_opt_demo.ipynb

Issue - State: open - Opened by foreverpiano over 1 year ago - 1 comment

#44 - How can I make it support Bloom-7b?

Issue - State: open - Opened by moonlightian over 1 year ago

#42 - Accuracy drop for Llama

Issue - State: open - Opened by fmo-mt over 1 year ago - 9 comments

#41 - No module named 'torch_int'

Issue - State: open - Opened by kaust2018 over 1 year ago - 7 comments

#40 - Support GPT-NeoX model

Issue - State: open - Opened by amazingkmy over 1 year ago

#38 - How can smoothquant be used in ConvNets

Issue - State: open - Opened by littletomatodonkey over 1 year ago - 1 comment

#37 - SmoothQuant for llama

Issue - State: open - Opened by shhn1 over 1 year ago - 2 comments

#36 - How to use SmoothQuant in FasterTransformer?

Issue - State: open - Opened by jiangsongHW over 1 year ago - 1 comment

#34 - Doesn't work on GPT models

Issue - State: closed - Opened by YaphetS-X over 1 year ago - 2 comments

#33 - git lfs pull ERROR

Issue - State: closed - Opened by lingffff over 1 year ago - 2 comments

#32 - How does it compare to DeepSpeed?

Issue - State: open - Opened by LifeIsStrange over 1 year ago

#31 - git lfs is currently down, could you solve this problem?

Issue - State: closed - Opened by Anychnn over 1 year ago - 1 comment

#30 - No module named 'torch_int'

Issue - State: closed - Opened by liangxiaoyun over 1 year ago - 1 comment

#29 - 4bit weight quantization? 4bit activation quantization?

Issue - State: open - Opened by Thomas-MMJ over 1 year ago - 1 comment

#28 - What is the required transformers version?

Issue - State: closed - Opened by lippman1125 over 1 year ago

#27 - How to implement this method combined with a decoder

Issue - State: open - Opened by lileilai over 1 year ago - 2 comments

#26 - Support for LLAMA

Issue - State: closed - Opened by fmac2000 over 1 year ago - 2 comments

#25 - Out of memory

Issue - State: open - Opened by lileilai over 1 year ago - 1 comment

#24 - Are the O1 and O2 versions of SmoothQuant available?

Issue - State: open - Opened by Ther-nullptr over 1 year ago

#23 - Missing the activation scales of opt-125m

Issue - State: closed - Opened by Ther-nullptr over 1 year ago - 1 comment

#21 - Post-LayerNorm support

Issue - State: open - Opened by minghaoBD almost 2 years ago - 1 comment

#20 - mseznec/export weights for ft fixes

Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago

#19 - add option to export scaling factors for FT

Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago - 1 comment

#18 - Visualization tool

Issue - State: open - Opened by ArulselvanMadhavan almost 2 years ago - 2 comments

#17 - Size mismatch

Issue - State: open - Opened by anujnayyar1 almost 2 years ago - 1 comment

#16 - Bloom code

Issue - State: open - Opened by Toan-Do almost 2 years ago - 2 comments

#14 - Test smoothquant accuracy for just fc2 layer

Issue - State: closed - Opened by erichan1 almost 2 years ago - 7 comments

#13 - Error encountered when loading act_scales

Issue - State: closed - Opened by chenho74 almost 2 years ago - 2 comments

#12 - different smoothquant levels

Issue - State: closed - Opened by erichan1 almost 2 years ago - 3 comments

#11 - Input to ReLU is quantized to int8? An error in quantization_flow.png?

Issue - State: closed - Opened by chenho74 almost 2 years ago - 2 comments

#10 - Naive W8A8 quantized model accuracy for medium-size models (e.g. opt-2.7b)

Issue - State: closed - Opened by LiuShixing almost 2 years ago - 2 comments

#8 - Does it support freezing the quantized pretrained model and then using prefix tuning?

Issue - State: closed - Opened by LiuShixing almost 2 years ago - 1 comment

#7 - Latency calculation for OPT 175B

Issue - State: closed - Opened by tangbinh almost 2 years ago - 2 comments

#6 - SmoothQuant real-INT8 inference for PyTorch

Pull Request - State: closed - Opened by Guangxuan-Xiao almost 2 years ago
Labels: enhancement

#5 - ETA for PyTorch

Issue - State: closed - Opened by erichan1 almost 2 years ago - 2 comments

#4 - Support for quantizing bf16 model

Issue - State: closed - Opened by erichan1 almost 2 years ago - 2 comments

#3 - mseznec/fastertransformer-compat

Pull Request - State: closed - Opened by mickaelseznec almost 2 years ago - 2 comments

#2 - Calculating quantization scales for new models?

Issue - State: closed - Opened by singularperturbation almost 2 years ago - 1 comment

#1 - Use SmoothQuant on different model architectures [proposed label: Question]

Issue - State: closed - Opened by deep-matter almost 2 years ago - 1 comment