Beomi/InfiniTransformer issues and pull requests

#28 - Question related to the loss calculation

Issue - State: open - Opened by zeyuliu1037 2 months ago

#27 - Support Zero-3?

Issue - State: open - Opened by WF0511 7 months ago - 1 comment

#26 - Issue while runing test_train.small.gemma.infini.py

Issue - State: open - Opened by raghavgarg97 8 months ago - 2 comments

#25 - Model loses information very quickly

Issue - State: open - Opened by Lazy3valuation 8 months ago - 2 comments

#24 - BitLinear

Issue - State: open - Opened by DewEfresh 8 months ago

#23 - About memory missing location information

Issue - State: open - Opened by LzhinFdu 9 months ago - 6 comments

#22 - What is the min GPU memory required to fine-tune the model?

Issue - State: open - Opened by Ozawa333 9 months ago

#21 - mem and norm_term is nan？

Issue - State: closed - Opened by DavideHe 9 months ago - 15 comments

#20 - Segment and block size error

Issue - State: closed - Opened by Lazy3valuation 9 months ago - 1 comment

#19 - Update README.md

Pull Request - State: closed - Opened by eltociear 10 months ago - 1 comment

#18 - Minor fix typo in `dtype_memory_size_dict`

Pull Request - State: closed - Opened by Liberatedwinner 10 months ago - 1 comment

#17 - Are there any trained InfinityTransformer weights available?

Issue - State: open - Opened by PasiKoodaa 10 months ago - 1 comment

#16 - Inference code (with Segments)

Issue - State: closed - Opened by Beomi 10 months ago

#15 - Memory does not use PE

Issue - State: closed - Opened by Beomi 10 months ago
Labels: bug

#14 - Memory should be per layer

Issue - State: closed - Opened by Beomi 10 months ago
Labels: bug

#13 - Limitations of the method

Issue - State: open - Opened by fyang064 10 months ago - 2 comments

#12 - Add Memory optimized model

Pull Request - State: closed - Opened by Beomi 10 months ago

#11 - Model generating random sequence

Issue - State: open - Opened by Lazy3valuation 10 months ago - 8 comments

#10 - Suggest to use the constant memory gradient computation in Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

Issue - State: closed - Opened by huyufeng0407 10 months ago

#9 - load model failed

Issue - State: closed - Opened by 18140663659 10 months ago - 4 comments

#8 - question about norm_term_broadcastable

Issue - State: closed - Opened by ictzyqq 10 months ago - 5 comments
Labels: bug

#7 - config no attn_implementation = "eager"

Issue - State: closed - Opened by a1exyu 10 months ago - 4 comments

#6 - Code not running on GPU

Issue - State: closed - Opened by Lazy3valuation 10 months ago - 6 comments

#5 - Discord server for this?

Issue - State: closed - Opened by Lazy3valuation 10 months ago

#4 - question about activation function

Issue - State: closed - Opened by huyufeng0407 10 months ago - 2 comments
Labels: bug

#3 - can you support llama2 model?

Issue - State: closed - Opened by awzhgw 10 months ago - 1 comment
Labels: enhancement

#2 - Segment-Wise Attention

Issue - State: closed - Opened by jlamprou 10 months ago - 11 comments

#1 - When will the code be made public, please?

Issue - State: open - Opened by zzr-idam 10 months ago - 3 comments
