jadore801120/attention-is-all-you-need-pytorch issues and pull requests

#223 - Update Models.py make PE with torch instead of numpy

Pull Request - State: open - Opened by NGC13009 about 2 months ago

#155 - To make position embedding be implemented by PyTorch, and to support …

Pull Request - State: closed - Opened by zipzou over 4 years ago

#100 - Compare results between this one and the Tensorflow one

Issue - State: open - Opened by Frozenmad over 5 years ago - 1 comment

#99 - Meaning of `padding_idx` in `get_sinusoid_encoding_table`?

Issue - State: closed - Opened by colllin over 5 years ago - 2 comments

#98 - Translate sentence by sentence

Issue - State: closed - Opened by nikhiljaiswal over 5 years ago - 2 comments

#97 - the test demo

Issue - State: open - Opened by chenjun2hao over 5 years ago - 1 comment

#96 - Do you have a trained model dump ?

Issue - State: closed - Opened by gauravlath07 over 5 years ago - 1 comment

#95 - GPU transition problem

Issue - State: closed - Opened by HikaruSama233 over 5 years ago - 1 comment

#94 - Experienced a sudden drop in train loss

Issue - State: open - Opened by ftakanashi over 5 years ago - 2 comments

#93 - why is the BLEU score of the translate result so bad？

Issue - State: closed - Opened by qtxue over 5 years ago - 6 comments

#92 - HOW MultiHeadAttention WORKS

Issue - State: closed - Opened by 578123043 almost 6 years ago

#91 - Maybe there should be a NoneType check in transformer.SubLayers line 53

Issue - State: closed - Opened by Ylizin almost 6 years ago - 1 comment

#90 - the inference python,how long will the testing demo

Issue - State: closed - Opened by chenjun2hao almost 6 years ago - 1 comment

#89 - Missing $ on loop variable at preprocess.perl line 1.

Issue - State: closed - Opened by ankur6ue almost 6 years ago - 4 comments

#88 - Evaluation module

Pull Request - State: open - Opened by gcunhase almost 6 years ago

#87 - Why encoder and decoder use "non_pad_mask"?

Issue - State: closed - Opened by tamuhey almost 6 years ago - 5 comments

#86 - How to faster the training process

Issue - State: open - Opened by neu-teng almost 6 years ago - 1 comment

#85 - How can I evaluate the model on BLEU score?

Issue - State: open - Opened by Esaada almost 6 years ago - 1 comment

#84 - How to use multi-GPUS ?

Issue - State: closed - Opened by neu-teng almost 6 years ago - 2 comments

#83 - Read and write files using UTF-8 encoding

Pull Request - State: open - Opened by elvisyjlin almost 6 years ago

#82 - arguments problem?

Issue - State: closed - Opened by A6Matrix almost 6 years ago - 14 comments

#81 - Did you use LSTM or GRU in the encoder and attention part?

Issue - State: closed - Opened by liperrino almost 6 years ago - 2 comments

#80 - Is this really multiheaded attention?

Issue - State: closed - Opened by giannisdaras almost 6 years ago - 2 comments

#79 - Question about beamsearch

Issue - State: closed - Opened by ZhengkunTian almost 6 years ago - 2 comments

#78 - Pytorch Exception in Thread: ValueError: signal number 32 out of range

Issue - State: closed - Opened by Finley1991 almost 6 years ago - 1 comment

#77 - the reason why translate runs so slowly.

Issue - State: closed - Opened by huangnengCSU almost 6 years ago - 2 comments

#76 - confused about tgt_seq and gold, please give me some help

Issue - State: open - Opened by JavisPeng almost 6 years ago - 11 comments

#75 - Why in Beam needs 2 time topk operation?

Issue - State: closed - Opened by Genie-Liu almost 6 years ago - 3 comments

#74 - translate so slow, a batch composed of 64 instances will cost 13s+ to translate?

Issue - State: closed - Opened by BinWone almost 6 years ago - 2 comments

#73 - I think the best_score_and_idx should be scores[0] and ids[0]. Why is the index 1 here?

Issue - State: closed - Opened by JepsonWong almost 6 years ago - 1 comment

#72 - Dropout on Attention

Issue - State: closed - Opened by MilesQLi about 6 years ago - 1 comment

#71 - 'ascii' codec can't decode byte 0xc3 , What am I missing?!

Issue - State: closed - Opened by Esaada about 6 years ago - 2 comments

#70 - fix missing dropout on the embeddings

Pull Request - State: closed - Opened by JulesGM about 6 years ago - 1 comment

#69 - missing dropout on the embeddings

Issue - State: closed - Opened by JulesGM about 6 years ago - 2 comments

#68 - Any reproduced results?

Issue - State: closed - Opened by hfxunlp about 6 years ago - 2 comments

#67 - Update SubLayers.py

Pull Request - State: open - Opened by Sanyuan-Chen about 6 years ago

#66 - Added Package requirement file

Pull Request - State: open - Opened by y12uc231 about 6 years ago

#65 - How to reproduce the result in README?

Issue - State: closed - Opened by ljch2018 about 6 years ago - 1 comment

#64 - encounter OOM error when the default batch_size is set to a number more than 64, eg 65

Issue - State: closed - Opened by HenryWoodOTC about 6 years ago - 2 comments

#63 - Beam search

Issue - State: closed - Opened by charlesfufu about 6 years ago - 1 comment

#62 - Model accuracy

Issue - State: open - Opened by charlesfufu about 6 years ago - 1 comment

#61 - Something wrong with F.cross_entropy(pred, gold, ignore_index=Constants.PAD, reduction="sum")''! !

Issue - State: closed - Opened by charlesfufu about 6 years ago - 3 comments

#60 - accuracy reduce during the training

Issue - State: open - Opened by Sxx1995 about 6 years ago - 9 comments

#59 - KeyError on testing

Issue - State: closed - Opened by dksehdals216 over 6 years ago - 2 comments

#58 - Softmax layer for output probabilities

Issue - State: closed - Opened by RobinVdE over 6 years ago - 2 comments

#57 - Error about the mask in ScaledDotProductAttention

Issue - State: closed - Opened by yangze0930 over 6 years ago - 5 comments

#56 - Tensor data type error

Issue - State: closed - Opened by ylmeng over 6 years ago - 2 comments

#55 - eval() questions

Issue - State: closed - Opened by xuefei1 over 6 years ago - 2 comments

#54 - Model training error

Issue - State: closed - Opened by ejklektov over 6 years ago - 3 comments

#53 - Dropout when predicting

Issue - State: closed - Opened by ColdEyeLampr over 6 years ago - 1 comment

#52 - question about softmax

Issue - State: closed - Opened by Continue7777 over 6 years ago - 2 comments

#51 - RunTimeError During Training

Issue - State: closed - Opened by JulianRMedina over 6 years ago - 3 comments

#50 - About label smoothing.

Issue - State: closed - Opened by chenyangh over 6 years ago - 1 comment

#49 - Temper in ScaledDotProductAttention?

Issue - State: closed - Opened by ivan-bilan over 6 years ago - 3 comments

#48 - Fix an error on Windows about 32, 64-bit integer

Pull Request - State: open - Opened by kkppll-ss over 6 years ago

#47 - Pretrained models?

Issue - State: closed - Opened by ZeweiChu over 6 years ago - 3 comments

#46 - Code failing while translation

Issue - State: closed - Opened by anuj-rathore over 6 years ago - 1 comment

#45 - What is the performance on WMT'14 ENDE datasets ?

Issue - State: open - Opened by KelleyYin over 6 years ago

#44 - All translated sentences start with a same word

Issue - State: closed - Opened by zhangdistephen over 6 years ago - 19 comments

#43 - Model training error

Issue - State: closed - Opened by Crista23 over 6 years ago - 1 comment

#42 - Question about the value of temper in ScaledDotProductAttention and d_inner_hid default value

Issue - State: closed - Opened by zhangdistephen over 6 years ago - 1 comment

#41 - Preprocessing Error

Issue - State: closed - Opened by karanchahal almost 7 years ago - 3 comments

#40 - pass dropout param to ScaledDotProductAttention

Pull Request - State: closed - Opened by zalivnaja almost 7 years ago - 4 comments

#39 - add teacher forcing parameter

Pull Request - State: closed - Opened by zalivnaja almost 7 years ago - 1 comment

#38 - add seed option

Pull Request - State: closed - Opened by zalivnaja almost 7 years ago

#37 - Document strings' style do not accord PEP8

Issue - State: closed - Opened by ghrua almost 7 years ago - 1 comment

#36 - the accuracy in the process of training is always zero

Issue - State: closed - Opened by zhang-wen almost 7 years ago - 4 comments

#35 - What command can continue running program?

Issue - State: closed - Opened by cong1 about 7 years ago - 1 comment

#34 - Multi-GPUs?

Issue - State: closed - Opened by ayanamongol about 7 years ago - 2 comments

#33 - embedding of positional encoding?

Issue - State: closed - Opened by culurciello about 7 years ago - 7 comments

#32 - Positional Encoding

Issue - State: closed - Opened by jaseleephd about 7 years ago - 1 comment

#31 - how to change size of dictionaries

Issue - State: closed - Opened by cong1 about 7 years ago - 1 comment

#30 - how to reduce the size of directory?

Issue - State: closed - Opened by xumin2501 about 7 years ago - 1 comment

#29 - can not download mmt16_task1_test.tgz

Issue - State: closed - Opened by xumin2501 about 7 years ago - 1 comment

#28 - Bug in translating

Issue - State: closed - Opened by Yevgnen about 7 years ago - 2 comments

#27 - Feeding the output of the last encoding layer to the decoder

Issue - State: closed - Opened by Yevgnen about 7 years ago - 2 comments

#26 - need to check

Issue - State: closed - Opened by jiahuigeng about 7 years ago - 2 comments

#25 - Memory Problem?

Issue - State: closed - Opened by renqianluo about 7 years ago - 12 comments

#24 - Batch size limitation

Issue - State: open - Opened by rawmarshmellows over 7 years ago - 5 comments

#23 - d_word_vec and d_model must be equal in Encoder

Issue - State: closed - Opened by yangkky over 7 years ago - 2 comments

#22 - Correct translator

Pull Request - State: closed - Opened by ZiJianZhao over 7 years ago - 1 comment

#21 - Batch Beam Search Problem

Issue - State: closed - Opened by ZiJianZhao over 7 years ago - 3 comments

#20 - What is the score in beam search stands for?

Issue - State: closed - Opened by xyz2357 over 7 years ago - 3 comments

#19 - Masking bug?

Issue - State: closed - Opened by larspars over 7 years ago - 12 comments

#18 - Ubuntu Server Unable to recognise German Character

Issue - State: closed - Opened by gaopeng-eugene over 7 years ago - 1 comment

#17 - Correct LayerNormalization

Pull Request - State: closed - Opened by ZiJianZhao over 7 years ago - 1 comment

#16 - masking on tensor.data?

Issue - State: closed - Opened by chihyaoma over 7 years ago - 4 comments
