zihangdai/xlnet issues and pull requests

#297 - CPU->GPU Memcpy failed when finetuning with STS-B

Issue - State: open - Opened by xavinatalia 9 months ago

#296 - Why are activation and dropout added after the classification layer?

Issue - State: closed - Opened by MrInouye about 1 year ago

#295 - xlnet, transformer xl attention score function problem

Issue - State: open - Opened by wonjunchoi-arc about 1 year ago

#294 - Update data_utils.py

Pull Request - State: closed - Opened by ruxandrastancioi over 1 year ago

#293 - pre-train xlnet for French language

Issue - State: open - Opened by karimmahalian almost 2 years ago

#292 - XLnet colab example error.

Issue - State: open - Opened by AlexTrinityBlock almost 2 years ago - 1 comment

#291 - 【Huawei】2012Lab-Project Cooperation&Exchange Invitation&Job Invitation-Zihang Dai

Issue - State: open - Opened by HanLu1226 over 2 years ago

#290 - run error about "InternalError (see above for traceback): Blas xGEMMBatched launch failed : a.shape=[12,512,64], b.shape=[12,64,512], m=512, n=512, k=64, batch_size=12"

Issue - State: closed - Opened by ccutyear over 2 years ago - 2 comments

#289 - Tokens and values

Issue - State: open - Opened by Dhurim almost 3 years ago

#288 - Update data_utils.py

Pull Request - State: open - Opened by DLPerf over 3 years ago - 1 comment

#287 - Performance issue in data_utils.py (by P3)

Issue - State: open - Opened by DLPerf over 3 years ago - 1 comment

#286 - Performance issues in the program

Issue - State: open - Opened by DLPerf over 3 years ago

#285 - Performance issue in the program

Issue - State: open - Opened by DLPerf over 3 years ago - 1 comment

#284 - TypeError: Fetch argument None has invalid type <class 'NoneType'> in train_gpu.py

Issue - State: open - Opened by songhee-lee over 3 years ago - 1 comment

#283 - How to get the XLNet vocabulary from spiece.model file and store it to a .vocab file?

Issue - State: open - Opened by SambhawDrag over 3 years ago

#282 - Feature/enhance predictions workflow

Pull Request - State: closed - Opened by agrudkow over 3 years ago

#281 - How to pretrain on multiple GPU?

Issue - State: closed - Opened by DHZBill almost 4 years ago

#280 - checkpoint_management.py export info

Issue - State: open - Opened by dll1314 almost 4 years ago

#279 - How are the positional encodings derived

Issue - State: open - Opened by bnicholl almost 4 years ago

#278 - specify tf version 1.x

Pull Request - State: open - Opened by amrzv about 4 years ago

#277 - Why is the first layer of the query stream initialized with the same vector w rather than different vectors?

Issue - State: open - Opened by Huakui-Zhang about 4 years ago

#276 - GPT vs BERT, under same computation and data resource, which one is better for downstream tasks like GLUE?

Issue - State: open - Opened by guotong1988 over 4 years ago - 1 comment

#275 - XLNet can't actually consistently beat RoBERTa, can it?

Issue - State: closed - Opened by guotong1988 over 4 years ago - 1 comment

#274 - What is the function of _sample_mask method?

Issue - State: closed - Opened by guotong1988 over 4 years ago - 1 comment

#273 - Removing mem-reuse will not decrease the pretraining model performance for short text?

Issue - State: open - Opened by guotong1988 over 4 years ago

#272 - The relation of reuse_len and mem_len?

Issue - State: closed - Opened by guotong1988 over 4 years ago - 1 comment

#271 - reuse_len=0 means no mem? And no benefit for long text but not worse for short text?

Issue - State: closed - Opened by guotong1988 over 4 years ago - 1 comment

#270 - Problem with generating predictions from fine tuned classification model

Issue - State: open - Opened by abdullahkhilji over 4 years ago

#269 - Multi-gpu slower than single-gpu

Issue - State: open - Opened by weiyx15 over 4 years ago - 1 comment

#268 - OOM with least batch 2 in train_gpu.py

Issue - State: closed - Opened by eddatt over 4 years ago

#267 - colab notebook can not run under tensorflow 2.0

Issue - State: open - Opened by jlff over 4 years ago

#266 - _split_a_and_b

Issue - State: closed - Opened by FruVirus over 4 years ago

#265 - the special tokens of XLNet is different from BERT

Issue - State: open - Opened by lytum over 4 years ago - 2 comments

#264 - get_sequence_output is not contextualized

Issue - State: open - Opened by maziyarpanahi almost 5 years ago - 1 comment

#263 - Why the max_seq_length = 512 for XLNet?

Issue - State: open - Opened by vr25 almost 5 years ago - 4 comments

#262 - Is Next Sentence Prediction implemented in the code ?

Issue - State: open - Opened by GhaliaRehawi almost 5 years ago

#261 - How to use your pretrained model for question-answering?

Issue - State: open - Opened by Alla-Abdella almost 5 years ago - 2 comments

#260 - ValueError when running ./gpu_squad_base.sh

Issue - State: open - Opened by Omnis23 almost 5 years ago - 3 comments

#259 - OOM ERROR when using local batch size=128 on TPUv3-8

Issue - State: open - Opened by GhaliaRehawi almost 5 years ago - 1 comment

#258 - Is it possible feed xlnet to seq2seq encoder/decoder NMT (for low resource language)?

Issue - State: open - Opened by JohnasSolomon almost 5 years ago

#257 - Can you upload the processor code(run_classifier.py) for glue dataset(cola, qqp, sst-2, rte, mrpc)?

Issue - State: open - Opened by YJYJLee about 5 years ago - 1 comment

#256 - Number of training epochs in original publication

Issue - State: open - Opened by jjedele about 5 years ago

#255 - Docker support

Issue - State: open - Opened by sanjibnarzary about 5 years ago

#254 - [CLS] token / during training process

Issue - State: open - Opened by cherepanovic about 5 years ago

#253 - Is real factorization?

Issue - State: open - Opened by fangwch about 5 years ago

#252 - Python2 to Python3?

Issue - State: open - Opened by hammad26 about 5 years ago - 1 comment

#251 - Commands for training and testing on IMDB dataset.

Issue - State: open - Opened by VikasRajashekar about 5 years ago - 1 comment

#250 - Changing Vocab size

Issue - State: open - Opened by yusufani about 5 years ago

#249 - text classification on 3 classes

Issue - State: open - Opened by VikasRajashekar about 5 years ago - 2 comments

#248 - Normalization by NFKC

Issue - State: closed - Opened by Ina299 about 5 years ago - 1 comment

#247 - Merging various fixes for Colab, Cloud TPU, TPU Pod, ...

Pull Request - State: open - Opened by vochicong about 5 years ago - 2 comments

#246 - Has anyone run run_classifier.py with param is_regression=False

Issue - State: open - Opened by manhongxiang about 5 years ago

#245 - Added tpu support for colab notebooks. Changed run_squad.py, model_ut…

Pull Request - State: closed - Opened by gonwi about 5 years ago

#244 - Can I run sentence relevance appliance with only one 1080Ti GPU?

Issue - State: closed - Opened by manhongxiang about 5 years ago - 2 comments

#243 - Added sample code snippet to perform estimation (from the learned model)

Pull Request - State: open - Opened by bhaskar24 over 5 years ago

#242 - Xlnet Training Error

Issue - State: closed - Opened by yusufani over 5 years ago - 5 comments

#241 - Tokens predicted in permutation in pretraining.

Issue - State: open - Opened by Magnusnolsoe over 5 years ago

#240 - Pre-trained XLNet Base Model

Issue - State: closed - Opened by CapGOGO over 5 years ago

#239 - Config for TPU pod

Pull Request - State: open - Opened by vochicong over 5 years ago

#238 - How do we mask and predict words in sentence in xlnet? how to do next word prediction in xlnet?

Issue - State: open - Opened by MuruganR96 over 5 years ago

#237 - Issues with sentencepiece

Issue - State: open - Opened by Mahasweta-usc over 5 years ago - 2 comments

#236 - why the mask_emb is necessary here

Issue - State: open - Opened by ewrfcas over 5 years ago

#235 - Error while using bfloat16 in run_squad.py

Issue - State: open - Opened by guozhiyu over 5 years ago

#234 - No convergence in train.py

Issue - State: open - Opened by Mahasweta-usc over 5 years ago

#230 - Further pretrain in domain specific corpus

Issue - State: open - Opened by hexiaoyupku over 5 years ago - 2 comments

#224 - Text Classifier Prediction Problem

Issue - State: open - Opened by MissMcFly over 5 years ago - 2 comments

#222 - Is xlnet indeed context aware?

Issue - State: open - Opened by studiocardo over 5 years ago - 5 comments

#214 - Multigpus memory leak during pretraining

Issue - State: closed - Opened by huseinzol05 over 5 years ago - 4 comments

#209 - TPU num_shards and num_replicas error

Issue - State: open - Opened by huiwudiyi over 5 years ago - 2 comments

#200 - Adding GPU automatic mixed precision training

Pull Request - State: open - Opened by vinhngx over 5 years ago - 2 comments

#192 - About the input of the XLNET

Issue - State: open - Opened by wjn1996 over 5 years ago - 1 comment

#184 - no lr_layer_decay_rate for embedding

Issue - State: open - Opened by fyubang over 5 years ago - 3 comments

#183 - Using XLNetModel class for inference

Issue - State: open - Opened by OmriPi over 5 years ago - 6 comments

#180 - Pre-training on other language

Issue - State: open - Opened by 1234560o over 5 years ago - 4 comments

#177 - XLNet-Large has a very unstable performance, do you have the same problem?

Issue - State: open - Opened by yucc2018 over 5 years ago - 3 comments

#169 - What output should I expect during training?

Issue - State: closed - Opened by timnugent over 5 years ago - 5 comments

#166 - TypeError for pretraining on new dataset. `filenames` must be a `tf.data.Dataset` of `tf.string` elements.

Issue - State: closed - Opened by vanh17 over 5 years ago - 3 comments

#163 - Cache problem during pretraining

Issue - State: open - Opened by rkcalnode over 5 years ago - 5 comments

#158 - ValueError: Outputs of true_fn and false_fn must have the same type: int64, bool

Issue - State: open - Opened by yana-xuyan over 5 years ago - 4 comments

#151 - Extract Contextual Word Embeddings

Pull Request - State: open - Opened by Hazoom over 5 years ago - 17 comments

#139 - Code/Information for Document Ranking Task

Issue - State: open - Opened by lukemelas over 5 years ago - 4 comments

#133 - is there a vocabulary for xlnet

Issue - State: open - Opened by cotitan over 5 years ago - 3 comments

#129 - squad prediction does not return "no answer"

Issue - State: open - Opened by rakshanda22 over 5 years ago - 1 comment

#119 - Reproduction results for SQuAD 2.0

Issue - State: open - Opened by cooelf over 5 years ago - 3 comments

#117 - different runs on same data produce slightly different outputs

Issue - State: open - Opened by Enumaris over 5 years ago - 3 comments

#115 - Fine tuning XLNet on STS-B with V100 16GB GPU(s) does not converge

Issue - State: open - Opened by cockroachzl over 5 years ago - 9 comments

#113 - How to export?

Issue - State: closed - Opened by jinamshah over 5 years ago - 11 comments

#108 - attention when preprocessing with prepro_squad.sh

Pull Request - State: closed - Opened by kun126 over 5 years ago

#104 - Understanding _local_perm

Issue - State: open - Opened by astariul over 5 years ago - 3 comments

#99 - type error during finetuning on model pretrained with my own dataset

Issue - State: open - Opened by xieyuchen13 over 5 years ago - 1 comment

#97 - Understanding rel_shift function

Issue - State: open - Opened by JinhaoLei over 5 years ago - 2 comments

#88 - Can we perform CoLA task classification with 8GB GPU?

Issue - State: open - Opened by GeetDsa over 5 years ago - 3 comments

#80 - [Question]: Like bert xlnet also has a max_len of 512 tokens, what would be good way to process longer text

Issue - State: open - Opened by kapilkd13 over 5 years ago - 8 comments

#78 - Error while using use_bfloat16 in run_classifier.py

Issue - State: open - Opened by ericwtlin over 5 years ago - 2 comments

#64 - Fine Tuning - SQuAD 2.0 on GPU 8 GB

Issue - State: open - Opened by renatoviolin over 5 years ago - 20 comments

#52 - Pre-training: checkpoint files are not written

Issue - State: open - Opened by stefan-it over 5 years ago - 3 comments

#48 - Error while running the pretrained model on MNLI

Issue - State: open - Opened by LeenaShekhar over 5 years ago - 9 comments

#47 - Added Colab TPU support with Colab Notebook and modified repo

Pull Request - State: open - Opened by aditya-malte over 5 years ago - 9 comments

#40 - corpus_info_path can't be found in train_gpu.py

Issue - State: closed - Opened by xiguashuiguo over 5 years ago - 1 comment

#35 - XLNet stuck for Text Classification task

Issue - State: open - Opened by aisheh90 over 5 years ago - 11 comments
