Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / majumderb/rezero issues and pull requests
#19 - Bump torch from 1.4.0 to 2.2.0
Pull Request - State: open - Opened by dependabot[bot] 4 months ago
Labels: dependencies
#18 - Add batch_first, dtype, device arguments
Pull Request - State: open - Opened by lericson almost 2 years ago
#17 - Learning rate of the Param resweight
Issue - State: open - Opened by Polarisjame about 2 years ago
#16 - resweight is almost 0
Issue - State: open - Opened by burcehan about 3 years ago - 1 comment
#15 - Is ReZero applicable to fine-tuning?
Issue - State: open - Opened by encounter1997 over 3 years ago
#14 - weight decay for the resweight?
Issue - State: open - Opened by Kyeongpil about 4 years ago - 2 comments
#13 - Can you relaese the code for ResNet-56 in Table2 ?
Issue - State: open - Opened by cuge1995 over 4 years ago
#12 - The description of RZTXDecoderLayer is the same as EncoderLayer
Issue - State: closed - Opened by jiang-yuan over 4 years ago
#11 - Sry guys but your paper is not worth more than zero :)
Issue - State: closed - Opened by AmorfEvo over 4 years ago - 1 comment
#10 - can rezero be applied to cnn ?
Issue - State: closed - Opened by carr123 over 4 years ago - 1 comment
#9 - The order of dropout and *resweight
Issue - State: closed - Opened by OneDirection9 over 4 years ago - 3 comments
#8 - when apply rezero to bert or gpt, get NAN gradients
Issue - State: open - Opened by yyht over 4 years ago - 5 comments
#7 - Does it work in not so deep architectures?
Issue - State: closed - Opened by wotulong over 4 years ago - 3 comments
#6 - Relationship between ReZero and Zero gamma trick
Issue - State: closed - Opened by hukkai over 4 years ago - 2 comments
#5 - does rezero work in machine translation tasks?
Issue - State: closed - Opened by zherowolf over 4 years ago - 3 comments
#4 - rezero with norm
Issue - State: closed - Opened by AllenDun over 4 years ago - 1 comment
#3 - I don't see any other application other than NLP?
Issue - State: closed - Opened by nile649 over 4 years ago - 1 comment
#2 - Update README.md : add BiBTex header
Pull Request - State: closed - Opened by mpariente over 4 years ago
#1 - Can the method be applied to CNN?
Issue - State: closed - Opened by JunMa11 over 4 years ago - 1 comment