karpathy/minGPT issues and pull requests

#143 - Reallly small correction

Pull Request - State: open - Opened by manuelsh about 1 month ago

#140 - <|endoftext|> token isn't encoded correctly

Issue - State: open - Opened by ttumiel 5 months ago - 2 comments

#138 - DEMO: minGPT on tinygrad

Pull Request - State: open - Opened by ziliangpeng 7 months ago

#100 - fix: add missing dependency in `setup.py`

Pull Request - State: open - Opened by ben-schulz almost 2 years ago

#99 - tests do not run in project as built

Issue - State: open - Opened by ben-schulz almost 2 years ago - 1 comment

#98 - Facilitating setup with popular tools

Issue - State: open - Opened by Utopiah almost 2 years ago

#97 - Fix typo in bpe.py

Pull Request - State: open - Opened by eltociear almost 2 years ago

#96 - Update README.md

Pull Request - State: closed - Opened by chinhaihour almost 2 years ago

#95 - Caching for generation

Issue - State: open - Opened by murbard almost 2 years ago - 1 comment

#94 - Renaming transformer.h into transformer.l

Issue - State: open - Opened by marxav almost 2 years ago

#93 - Wrong definition of Query, Key, Value matrices? They shouldn't have bias=True

Issue - State: open - Opened by LeoPerelli almost 2 years ago - 3 comments

#92 - Remove extraneous semicolons, remind users to get transformers repo if needed

Pull Request - State: open - Opened by danielgross about 2 years ago

#91 - `named_parameters` does not have to be recursive

Pull Request - State: open - Opened by Equim-chan about 2 years ago

#90 - cannot import name 'sample' from 'mingpt.utils'

Issue - State: open - Opened by chris-the-wiz about 2 years ago - 1 comment

#89 - Update readme.md

Pull Request - State: open - Opened by macleginn about 2 years ago

#88 - Update README.md

Pull Request - State: open - Opened by macleginn about 2 years ago

#87 - Add DataParallel and make Block support DataParallel

Pull Request - State: open - Opened by gngdb about 2 years ago

#86 - Add dtype support

Pull Request - State: open - Opened by younesbelkada over 2 years ago

#85 - Move token + pos embedding computation to a separate method

Pull Request - State: closed - Opened by ericjang over 2 years ago - 12 comments

#84 - Add setup.py to allow mingpt to be used as a third-party library

Pull Request - State: closed - Opened by ericjang over 2 years ago - 1 comment

#83 - Use XOR operator `^` for checking assertion `type_given XOR params_given`

Pull Request - State: closed - Opened by mishig25 over 2 years ago - 1 comment

#82 - Fix README.md typo

Pull Request - State: closed - Opened by neverix over 2 years ago - 1 comment

#81 - Add optimizer to Trainer's self for callbacks.

Pull Request - State: closed - Opened by luigidisotto over 2 years ago - 1 comment

#80 - Refactor for modern 2022 python style and usage

Pull Request - State: open - Opened by mattsta over 2 years ago - 3 comments

#79 - Is it more reasonable to only use causal attention in the first block of GPT

Issue - State: open - Opened by charlesxu90 over 2 years ago

#78 - Typos

Pull Request - State: closed - Opened by nat over 2 years ago - 1 comment

#77 - Meaning of "-1 because very last digit doesn't plug back"

Issue - State: open - Opened by vwxyzjn over 2 years ago

#76 - Refactor repo into script-based projects

Pull Request - State: closed - Opened by karpathy over 2 years ago

#75 - Perfect training and evaluation loss, but terrible test-time performance

Issue - State: closed - Opened by micahcarroll over 2 years ago - 1 comment

#74 - Add unit tests

Pull Request - State: closed - Opened by mishig25 over 2 years ago - 1 comment

#73 - Add tests

Pull Request - State: closed - Opened by mishig25 over 2 years ago

#72 - #71 use config n_head instead of hardcoded 4 heads

Pull Request - State: open - Opened by SpeedCoder5 over 2 years ago

#71 - model self-attention hardcoded to 4 heads

Issue - State: closed - Opened by SpeedCoder5 over 2 years ago - 1 comment

#70 - How to handle unequal sequence length in a batch

Issue - State: closed - Opened by luxuantao over 2 years ago - 2 comments

#69 - Added the condition for test_dataset's presence.

Pull Request - State: open - Opened by RohanAwhad over 2 years ago

#68 - Integration with HuggingFace

Issue - State: open - Opened by marxav over 2 years ago

#67 - play_math AdditionDataset.__get_item__ return value?

Issue - State: open - Opened by SpeedCoder5 over 2 years ago

#66 - Add distributed data parallel trainer

Pull Request - State: open - Opened by aravindsrinivas over 2 years ago

#65 - How to determine `warmup_tokens` and `final_tokens`?

Issue - State: open - Opened by fgolemo almost 3 years ago

#64 - modify to use pooling instead

Pull Request - State: closed - Opened by annasajkh almost 3 years ago

#63 - Fix broken hugging face link & add link to huggingface / transformers

Pull Request - State: closed - Opened by mishig25 almost 3 years ago - 2 comments

#62 - initialize position embeddings

Pull Request - State: closed - Opened by t-vi about 3 years ago - 1 comment

#61 - Code completion notebook

Pull Request - State: closed - Opened by tetelestia over 3 years ago - 2 comments

#60 - Implemented play_word.ipynb example

Pull Request - State: open - Opened by emukans over 3 years ago - 1 comment

#59 - Error when I provide test dataset (custom minGPT)

Issue - State: open - Opened by asigalov61 over 3 years ago

#58 - minor typo

Pull Request - State: closed - Opened by project-delphi over 3 years ago - 1 comment

#57 - TPU/GPU training: KeyError 'pos_emb'

Issue - State: open - Opened by tech509201941 over 3 years ago

#56 - How do I see test loss?

Issue - State: open - Opened by aletote almost 4 years ago - 1 comment

#55 - perform sqrt-d scaling on q instead of att matrix

Pull Request - State: closed - Opened by aravindsrinivas almost 4 years ago - 2 comments

#54 - Question about memory usage for play_math

Issue - State: open - Opened by pablogranolabar almost 4 years ago

#53 - move instantiation of DataLoader

Pull Request - State: closed - Opened by waynemystir almost 4 years ago - 1 comment

#52 - Layer norm should be after residual block

Pull Request - State: closed - Opened by Scikud about 4 years ago - 1 comment

#51 - Use of amp.autocast does not improve performance

Issue - State: open - Opened by aurotripathy about 4 years ago - 1 comment

#50 - Add a multi-language example

Pull Request - State: closed - Opened by marxav about 4 years ago

#49 - Add a "How to cite this work" section in README.md

Issue - State: open - Opened by marxav about 4 years ago - 2 comments

#48 - How to apply to time series?

Issue - State: closed - Opened by thinkingparticle about 4 years ago - 1 comment

#47 - Will this repo add the reformer or teach people how to implement reformer in Pytorch?

Issue - State: closed - Opened by JonathanSum about 4 years ago - 2 comments

#46 - If possible, please add TQDM (auto)

Pull Request - State: closed - Opened by asigalov61 about 4 years ago - 5 comments

#45 - Is there a Tensorflow-version of minGPT? Or will someone implement it?

Issue - State: closed - Opened by guotong1988 about 4 years ago - 2 comments

#44 - GPT vs BERT, under same computation and data resource, which one is better for downstream tasks like GLUE?

Issue - State: open - Opened by guotong1988 about 4 years ago - 1 comment

#43 - #39 Adding B, T, C description in comment.

Pull Request - State: closed - Opened by JonathanSum about 4 years ago - 1 comment

#42 - Sharing Pretrained Checkpoints

Issue - State: open - Opened by barisbatuhan about 4 years ago - 2 comments

#41 - typo 'terations'

Pull Request - State: closed - Opened by JonathanSum about 4 years ago - 1 comment

#40 - Potential encoding issue in addition problem in play_math notebook?

Issue - State: closed - Opened by ravi-annaswamy about 4 years ago - 3 comments
