GitHub / lucidrains/routing-transformer issues and pull requests
#33 - How to reconstruct the full attention matrix?
Issue - State: open - Opened by FarzanT over 2 years ago - 2 comments
#32 - ONNX export hangs
Issue - State: closed - Opened by genolve about 3 years ago - 1 comment
#31 - Compound words
Issue - State: closed - Opened by wingedsheep over 3 years ago
#30 - Enquiry about BPC calculation
Issue - State: open - Opened by ShiweiLiuFdu almost 4 years ago
#29 - Could you explain KmeansAttention in detail? What is its principle and motivation? (translated from Chinese)
Issue - State: open - Opened by guotong1988 almost 4 years ago
#28 - Could you please explain more about KmeansAttention? Thank you very much!
Issue - State: open - Opened by guotong1988 almost 4 years ago
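Issues #28 and #29 both ask how KmeansAttention works. As background (hedged, based on the Routing Transformer paper by Roy et al. rather than this repository's code): queries and keys are clustered with k-means, and each token attends only to tokens in its own cluster, reducing attention cost from O(n² d) to roughly O(n^1.5 d) when the cluster count scales with √n. A toy NumPy sketch, with illustrative names and identity q/k/v projections (not the repository's implementation):

```python
import numpy as np

def kmeans_routing_attention(x, num_clusters=2, iters=5, seed=0):
    """Toy sketch of k-means routed attention: tokens are clustered,
    and softmax attention is computed only within each cluster."""
    n, d = x.shape
    q = k = v = x  # toy simplification: identity projections
    rng = np.random.default_rng(seed)
    # initialise centroids from random tokens, then a few Lloyd iterations
    centroids = q[rng.choice(n, num_clusters, replace=False)]
    for _ in range(iters):
        dists = ((q[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(axis=1)
        for c in range(num_clusters):
            members = q[assign == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    # final cluster assignment after the last centroid update
    dists = ((q[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    # attend only within each cluster
    out = np.zeros_like(x)
    for c in range(num_clusters):
        idx = np.where(assign == c)[0]
        if len(idx) == 0:
            continue
        scores = q[idx] @ k[idx].T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[idx] = w @ v[idx]
    return out, assign

x = np.random.default_rng(1).normal(size=(16, 8))
out, assign = kmeans_routing_attention(x)
```

With `num_clusters=1` the sketch reduces to ordinary full softmax attention, which is a handy sanity check on the routing step.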
#27 - input_mask behavior
Issue - State: open - Opened by AliOskooeiTR about 4 years ago
#26 - TPU support
Issue - State: open - Opened by abcp4 over 4 years ago
#25 - receives_context causes tensor mismatch error
Issue - State: closed - Opened by WeForgot over 4 years ago - 1 comment
#24 - Results on WikiText-103 or enwik8
Issue - State: open - Opened by yelongshen over 4 years ago
#23 - README typo
Issue - State: closed - Opened by rainmaker712 over 4 years ago - 1 comment
#22 - Music Routing Transformer Colab
Issue - State: open - Opened by asigalov61 over 4 years ago - 1 comment
#21 - Building and training a RoutingTransformerEncDec from pre-trained RoutingTransformerLMs
Issue - State: closed - Opened by AliOskooeiTR over 4 years ago - 7 comments
#20 - LM slower than the encoder-decoder with the same depth, max_seq_len, and window size
Issue - State: open - Opened by AliOskooeiTR over 4 years ago - 3 comments
#19 - Issue about input shape
Issue - State: closed - Opened by guohanyang1994 over 4 years ago - 1 comment
#18 - Usage for image generation
Issue - State: closed - Opened by Hosein47 over 4 years ago - 11 comments
#17 - Sequence length limited
Issue - State: closed - Opened by guohanyang1994 over 4 years ago - 14 comments
#16 - Error in the enwik8_simple training example
Issue - State: closed - Opened by guokr233 over 4 years ago - 1 comment
#15 - Tensor dimension mismatch when running the RoutingTransformerLM example
Issue - State: closed - Opened by guokr233 over 4 years ago - 1 comment
#14 - Add ReZero and ScaleNorm support
Pull Request - State: closed - Opened by tomweingarten almost 5 years ago - 7 comments
#13 - MoE doesn't work with reversible layers
Issue - State: closed - Opened by tomweingarten almost 5 years ago - 2 comments
#12 - Batch size 1
Issue - State: closed - Opened by matthew-jurewicz almost 5 years ago - 2 comments
#11 - Long dependencies
Issue - State: open - Opened by matthew-jurewicz almost 5 years ago - 2 comments
#10 - Normalize queries and keys before dot product
Pull Request - State: closed - Opened by lucidrains about 5 years ago
#9 - Why doesn't AutoregressiveWrapper sum the encoder aux loss?
Issue - State: closed - Opened by tomweingarten about 5 years ago - 8 comments
#8 - What does autoregressive mean?
Issue - State: closed - Opened by matthew-jurewicz about 5 years ago - 8 comments
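Issue #8 asks what "autoregressive" means. As a general note (not specific to this repository): an autoregressive model predicts each token conditioned only on the tokens before it, and at generation time each sampled token is appended to the context and fed back in. A toy illustration with a stand-in "model" (the `next_token` function is purely hypothetical):

```python
# Toy illustration of autoregressive generation, not the repository's
# AutoregressiveWrapper. The "model" here is a trivial stand-in that
# returns the previous token plus one, modulo the vocabulary size.
def next_token(context):
    return (context[-1] + 1) % 10

def generate(prompt, steps):
    seq = list(prompt)
    for _ in range(steps):
        seq.append(next_token(seq))  # feed the model's own output back in
    return seq

print(generate([3], 4))  # → [3, 4, 5, 6, 7]
```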
#7 - One-hot encoded input?
Issue - State: closed - Opened by matthew-jurewicz about 5 years ago - 4 comments
#6 - Missing key(s) in state_dict
Issue - State: closed - Opened by epetros about 5 years ago - 4 comments
#5 - AutoregressiveWrapper expects different input lengths based on type
Issue - State: closed - Opened by tomweingarten about 5 years ago - 3 comments
#4 - Encoder-decoder fails at KMeans attention
Issue - State: closed - Opened by tomweingarten about 5 years ago - 16 comments
#3 - use_evonorm no longer supported in PKM
Issue - State: closed - Opened by tomweingarten about 5 years ago - 1 comment
#2 - Fix top_p to define threshold similarly to top_k and not garble output.
Pull Request - State: closed - Opened by tomweingarten about 5 years ago - 1 comment
#1 - top_p returns wrong values and re-orders the data
Issue - State: closed - Opened by tomweingarten about 5 years ago - 7 comments
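Issues #1 and #2 concern top_p (nucleus) filtering re-ordering the data and garbling output. As a hedged aside (a generic sketch, not the repository's fixed code): a correct nucleus filter keeps the smallest set of highest-probability tokens whose cumulative mass exceeds p, zeroes the rest in their original vocabulary positions, and renormalizes, so token order is never disturbed:

```python
import numpy as np

def top_p_filter(probs, p=0.9):
    """Nucleus (top-p) filtering sketch: zero out the low-probability
    tail while keeping every token in its original position."""
    order = np.argsort(probs)[::-1]       # indices, most probable first
    cdf = np.cumsum(probs[order])
    # keep tokens up to and including the first one that pushes the
    # cumulative mass past p (mirrors how top_k keeps the top k)
    cutoff = np.searchsorted(cdf, p) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()

probs = np.array([0.05, 0.5, 0.1, 0.3, 0.05])
print(top_p_filter(probs, p=0.75))  # → approximately [0, 0.625, 0, 0.375, 0]
```

The key design point, and the substance of the fix in #2, is that the sort is used only to find the cutoff; the surviving probabilities are written back at their original indices rather than returned in sorted order.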