airaria/TextBrewer issues and pull requests

#121 - Student model weight initialization problem

Issue - State: open - Opened by cgh-code777 6 months ago - 2 comments

#120 - Is the BERT-of-Theseus distillation method supported?

Issue - State: closed - Opened by zhanghanweii 11 months ago - 3 comments
Labels: stale

#119 - Are llama models currently supported?

Issue - State: closed - Opened by StevensPrime about 1 year ago - 2 comments
Labels: stale

#118 - Can chatgpt be distilled into bert or T5?

Issue - State: closed - Opened by Hoogck about 1 year ago - 2 comments
Labels: stale

#117 - Evaluating the distilled model raises AxisError: axis 2 is out of bounds for array of dimension 1

Issue - State: closed - Opened by MrRace about 1 year ago - 5 comments
Labels: stale

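#116 - Hello, I'd like to ask: when distilling, for example, roberta into tinybert, the intermediate hidden states are projected to the same dimension through linear layers to compute mse. Doesn't that mean these gradient-updated linear layers serve no purpose at inference time? Are these linear layers only there to adjust the dimension?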

Issue - State: closed - Opened by lean-wang about 1 year ago - 2 comments

#115 - Hello, is there a demo of multi-task, multi-teacher distillation?

Issue - State: closed - Opened by lean-wang about 1 year ago - 4 comments
Labels: stale

#114 - The final trainer.evaluate() in msra_ner.ipynb reports CUDA out of memory; how much GPU memory does training require? Many thanks!

Issue - State: closed - Opened by jinxiaolinlin over 1 year ago - 2 comments
Labels: stale

#113 - Is there an example of distillation across different dimensions, reducing from 768 to 256?

Issue - State: closed - Opened by weidalan over 1 year ago - 4 comments
Labels: stale

#112 - notebook_examples/msra_ner.ipynb raises an error when run

Issue - State: closed - Opened by MrRace over 1 year ago - 12 comments
Labels: stale

#111 - About processing NER data

Issue - State: closed - Opened by Soulscb over 1 year ago - 2 comments
Labels: stale

#110 - In VisionTransformer

Issue - State: closed - Opened by zym1599 over 1 year ago - 5 comments
Labels: stale

#109 - CVE-2007-4559 Patch

Pull Request - State: closed - Opened by TrellixVulnTeam almost 2 years ago - 2 comments
Labels: stale

#108 - Does it support translation model?

Issue - State: closed - Opened by AIikai almost 2 years ago - 2 comments
Labels: stale

#107 - How about the distillation effect of gpt2?

Issue - State: closed - Opened by xk503775229 almost 2 years ago - 2 comments
Labels: stale

#106 - Picking right layers

Issue - State: closed - Opened by patryk-at-pieces almost 2 years ago - 3 comments
Labels: stale

#105 - Fix typo errors in README.md

Pull Request - State: closed - Opened by mehdie79 almost 2 years ago

#104 - Show the progress bar when training.

Issue - State: closed - Opened by Gridnn about 2 years ago - 3 comments
Labels: stale

#103 - interpreting intermediate matches

Issue - State: closed - Opened by kaliaanup about 2 years ago - 5 comments
Labels: stale

#102 - Where to find gs4210.pkl file or how to generate it? Thanks

Issue - State: closed - Opened by Shaukat-Hussain about 2 years ago - 2 comments
Labels: stale

#101 - pre-trained student weights

Issue - State: closed - Opened by roymiles about 2 years ago - 3 comments
Labels: stale

#100 - Modify batch type checking in distiller_utils.py

Pull Request - State: closed - Opened by yourplatanus about 2 years ago - 2 comments

#99 - Add conditional statement in distiller_utils.py

Pull Request - State: closed - Opened by yourplatanus about 2 years ago - 1 comment

#98 - TextBrewer/src/textbrewer/distiller_utils.py get_outputs_from_batch fails to check dicts properly for maskedLM

Issue - State: closed - Opened by AddedK over 2 years ago - 4 comments

#97 - Does TextBrewer/examples/notebook_examples/msra_ner.ipynb have a bug?

Issue - State: closed - Opened by 645187919 over 2 years ago - 1 comment

#96 - Can it be used directly for NLU and NLG in unilm?

Issue - State: closed - Opened by cingtiye over 2 years ago - 1 comment
Labels: stale

#95 - How to implement early stopping

Issue - State: closed - Opened by yuange555 over 2 years ago

#94 - Are there plans to add an early stopping mechanism?

Issue - State: closed - Opened by catqaq over 2 years ago - 3 comments

#93 - How to distill new features that are not organized by layer?

Issue - State: closed - Opened by catqaq over 2 years ago - 2 comments

#92 - Is there a distillation example for BertForMaskedLM?

Issue - State: closed - Opened by dongteng over 2 years ago - 3 comments
Labels: stale

#91 - Question about the loss function

Issue - State: closed - Opened by LLLLLLoki over 2 years ago - 1 comment

#90 - Question about the loss function

Issue - State: closed - Opened by LLLLLLoki over 2 years ago - 5 comments
Labels: stale

#89 - About task-agnostic distillation

Issue - State: closed - Opened by savannahfan over 2 years ago - 1 comment

#88 - mnli main_train

Issue - State: closed - Opened by Soulscb over 2 years ago - 7 comments
Labels: stale

#87 - Can pre-training distillation be done?

Issue - State: closed - Opened by YoungErm over 2 years ago - 3 comments
Labels: stale

#86 - Does the intermediate-layer loss update the parameters of later layers in the network?

Issue - State: closed - Opened by DvHuang over 2 years ago - 6 comments
Labels: stale

#85 - Bug when reproducing the msra_ner.ipynb code

Issue - State: closed - Opened by HXYstudy over 2 years ago - 7 comments
Labels: stale

#84 - can't open examples/notebook_examples/sst2.ipynb in colab

Issue - State: closed - Opened by josephcui over 2 years ago - 2 comments
Labels: stale

#83 - Is there any method to simplify the output during training?

Issue - State: closed - Opened by HarryHe11 over 2 years ago - 4 comments
Labels: stale

#82 - Notebook JSON is invalid

Issue - State: closed - Opened by tanyaroosta almost 3 years ago - 1 comment

#81 - random_token_example error

Issue - State: closed - Opened by tanyaroosta almost 3 years ago - 5 comments
Labels: stale

#80 - PyTorch Lightning

Issue - State: closed - Opened by tchaton almost 3 years ago - 2 comments
Labels: stale

#79 - examples/random_token_example: running python distill.py ends with exception Killed

Issue - State: closed - Opened by dulante00 almost 3 years ago - 2 comments
Labels: stale

#78 - CUDA Error with your Notebook Example

Issue - State: closed - Opened by cabisarri almost 3 years ago - 2 comments
Labels: stale

#77 - Cuda Error in scripts

Issue - State: closed - Opened by cabisarri almost 3 years ago

#76 - Using a custom network architecture

Issue - State: closed - Opened by zhangatao almost 3 years ago - 2 comments

#75 - Add note examples

Pull Request - State: closed - Opened by lokwq almost 3 years ago

#74 - Add note examples test2

Pull Request - State: closed - Opened by lokwq almost 3 years ago

#73 - Add note examples

Pull Request - State: closed - Opened by lokwq almost 3 years ago

#72 - About reproducing the MNLI task

Issue - State: closed - Opened by sunnan-nn about 3 years ago - 2 comments
Labels: stale

#71 - Is there a problem with this framework's loss function?

Issue - State: closed - Opened by Jay2Coomzz about 3 years ago - 2 comments
Labels: stale

#70 - The model is not being trained; the model weights saved at every epoch are identical.

Issue - State: closed - Opened by Jay2Coomzz about 3 years ago - 4 comments
Labels: stale

#69 - GeneralDistiller's train function raises an error

Issue - State: closed - Opened by Jay2Coomzz about 3 years ago - 6 comments
Labels: stale

#68 - Question about the distillation configuration of the t4 student model on the Chinese reading comprehension dataset

Issue - State: closed - Opened by SouthBays about 3 years ago - 5 comments
Labels: stale

#67 - examples/mnli_example: run_mnli_train.sh, the model is not trained

Issue - State: closed - Opened by Yin169 about 3 years ago - 15 comments
Labels: stale

#66 - Data preparation

Issue - State: closed - Opened by liuhl-source about 3 years ago - 4 comments
Labels: stale

#65 - TypeError: __init__() got an unexpected keyword argument 'unk_token'

Issue - State: closed - Opened by SnailDM about 3 years ago - 4 comments
Labels: stale

#64 - How is the optimization solution of the earth mover distance made differentiable?

Issue - State: closed - Opened by Youarerare about 3 years ago - 2 comments
Labels: stale

#63 - Dimension mismatch problem

Issue - State: closed - Opened by iamxinxin about 3 years ago - 1 comment

#62 - Teacher model accuracy is very low when training on the mnli task?

Issue - State: closed - Opened by binglinchengxiash about 3 years ago - 1 comment

#61 - In training for the mnli task, the first step trains the teacher model, so why does the configuration load the student model's config?

Issue - State: closed - Opened by binglinchengxiash about 3 years ago - 2 comments

#60 - mnli question

Issue - State: closed - Opened by binglinchengxiash about 3 years ago - 2 comments

#59 - Question about a runtime error.

Issue - State: closed - Opened by Noeverer about 3 years ago - 4 comments
Labels: stale

#58 - An error occurs at runtime

Issue - State: closed - Opened by Noeverer about 3 years ago

#57 - Unable to reproduce results

Issue - State: closed - Opened by houpanpan over 3 years ago - 6 comments
Labels: stale

#56 - kd_loss does not decrease and accuracy barely changes, yet the hidden layers seem to fit well; what could be the reason?

Issue - State: closed - Opened by Youarerare over 3 years ago - 1 comment

#55 - RuntimeError: Incoming model is an instance of torch.nn.parallel.DataParallel. Parallel wrappers should only be applied to the model(s) AFTER the model(s) have been returned from amp.initialize.

Issue - State: closed - Opened by Youarerare over 3 years ago - 4 comments

#54 - Rename DistillMultiBert.json to DistillMultiBertToTiny.json

Pull Request - State: closed - Opened by DA-southampton over 3 years ago

#53 - Unable to reproduce finetune results after switching models

Issue - State: closed - Opened by houpanpan over 3 years ago - 3 comments
Labels: stale

#52 - Running ner_ElectraTrain_dist.sh directly, F1 can reach above 90 within half an epoch. Could you check again whether there is a problem with the data or the model?

Issue - State: closed - Opened by houpanpan over 3 years ago - 5 comments
Labels: stale

#51 - In the NER task, how should the matches between the teacher model and the student model be set?

Issue - State: closed - Opened by houpanpan over 3 years ago - 3 comments
Labels: stale

#50 - Reason why distillation for multi-label classification tasks is not supported

Issue - State: closed - Opened by codemayq over 3 years ago - 3 comments

#49 - Unable to reproduce results on the msra named entity recognition task

Issue - State: closed - Opened by houpanpan over 3 years ago - 13 comments
Labels: stale

#48 - Unable to reproduce results in the train process of msra named entity recognition

Issue - State: closed - Opened by houpanpan over 3 years ago - 2 comments
Labels: stale

#47 - update mnli_example and bert-emd

Pull Request - State: closed - Opened by airaria over 3 years ago

#46 - update mnli_example and bert-emd

Pull Request - State: closed - Opened by airaria over 3 years ago

#45 - In the msra entity recognition task, matches is given no arguments; which distillation method is used by default?

Issue - State: closed - Opened by SunyanGu over 3 years ago - 3 comments
Labels: stale

#44 - How to customize teacher and student models, such as VGG in CV?

Issue - State: closed - Opened by LiXuanming over 3 years ago - 3 comments
Labels: stale

#43 - [Question] Some doubts about the data augmentation method in the paper

Issue - State: closed - Opened by wangyuxinwhy over 3 years ago - 2 comments
Labels: stale

#42 - About initializing a trained student model for inference and prediction

Issue - State: closed - Opened by chenjun0210 over 3 years ago - 1 comment

#41 - About the mnli_example experiments

Issue - State: closed - Opened by qq260612718 over 3 years ago - 6 comments
Labels: stale

#40 - Has the author done data augmentation for Chinese?

Issue - State: closed - Opened by hahlw over 3 years ago - 5 comments
Labels: stale

#39 - Is hard loss useful?

Issue - State: closed - Opened by sz128 over 3 years ago - 12 comments
Labels: stale

#38 - TypeError: model should be either torch.nn.Module or a dict

Issue - State: closed - Opened by dakota987 over 3 years ago - 1 comment

#37 - How to perform data augmentation?

Issue - State: closed - Opened by LorrinWWW over 3 years ago - 7 comments

#36 - About MMD loss

Issue - State: closed - Opened by sz128 over 3 years ago - 5 comments
Labels: stale

#35 - How do the mapping layers correspond during distillation?

Issue - State: closed - Opened by 652994331 over 3 years ago - 3 comments

#34 - Dimension mismatch when distilling roberta-wwm-ext

Issue - State: closed - Opened by junrong1 over 3 years ago - 6 comments

#33 - Distillation results are poor; how can this be resolved?

Issue - State: closed - Opened by cgq0816 over 3 years ago - 8 comments
Labels: stale

#32 - Shall we give some examples about text classification, like cmrc and ner tasks?

Issue - State: closed - Opened by AlexYoung757 over 3 years ago - 3 comments
Labels: stale

#31 - The link is broken

Issue - State: closed - Opened by Soulscb over 3 years ago - 3 comments
Labels: stale

#30 - Encountered the following problem when distilling BERT-wwm-ext; code is posted

Issue - State: closed - Opened by cgq0816 over 3 years ago - 10 comments

#29 - V0.2.1dev

Pull Request - State: closed - Opened by airaria over 3 years ago - 2 comments

#28 - Student model outputs garbled text after distillation

Issue - State: closed - Opened by peixin-lin over 3 years ago - 7 comments
Labels: stale

#27 - self.d_config.is_caching_logits=True about results_T

Issue - State: closed - Opened by chenhaoenen almost 4 years ago - 1 comment

#26 - Is there a sample of distilling BERT into a simple model, such as BiGRU or CNN?

Issue - State: closed - Opened by yongzhuo almost 4 years ago - 1 comment

#25 - What is CustomMatch?

Issue - State: closed - Opened by zhujiangang almost 4 years ago - 1 comment

#24 - On whether the name `temperature_scheduler` is reasonable

Issue - State: closed - Opened by shenfe almost 4 years ago - 1 comment

#23 - When computing mse loss on intermediate layers, this layer's loss does not converge, while the other kinds of loss all converge to varying degrees?

Issue - State: closed - Opened by LLLLLLoki almost 4 years ago - 6 comments

#22 - Question on different input format for teacher and student models

Issue - State: closed - Opened by meisyarahd almost 4 years ago - 4 comments
