Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / karpathy/build-nanogpt issues and pull requests
#83 - Adding Resume Training Functionality
Pull Request -
State: open - Opened by yousefelsharkawy about 1 month ago
#82 - How is the autoregressive loss handled?
Issue -
State: closed - Opened by BabyCNM about 1 month ago
- 2 comments
#81 - Avoid tiktoken.decode panic on unknown tokens.
Issue -
State: open - Opened by IggShaman 3 months ago
- 1 comment
#80 - Avoid tiktoken.decode panic on unknown tokens.
Pull Request -
State: open - Opened by IggShaman 3 months ago
#79 - torch.compile-d models do not work with example generation and hellaswag eval
Issue -
State: open - Opened by IggShaman 3 months ago
- 1 comment
#78 - Make example generation work when the model is torch.compile-d
Pull Request -
State: open - Opened by IggShaman 3 months ago
#77 - fix: Progress bar not complete after starting a new shard
Pull Request -
State: open - Opened by sanbaiw 3 months ago
#73 - Update train_gpt2.py with compile mode for torch.compile
Pull Request -
State: closed - Opened by manu-chauhan 3 months ago
- 1 comment
#72 - Ensure Consistency Between GPTConfig.block_size and Sequence Length T
Pull Request -
State: open - Opened by Benetti-Hub 3 months ago
- 1 comment
#66 - Refactor: Replace Class Name with cls in from_pretrained Method - Fix subclassing
Pull Request -
State: open - Opened by mohsenhariri 4 months ago
#64 - Add `torch_xla` support to `build-nanogpt`
Pull Request -
State: closed - Opened by miladm 4 months ago
#62 - Enable torch compile with DDP
Pull Request -
State: open - Opened by stevebako 4 months ago
#60 - Fix torch.compile Issue - Error with HellaSwag eval and Generation
Issue -
State: closed - Opened by ML-Guy 4 months ago
#59 - Run the script on MacOS
Pull Request -
State: open - Opened by amjadmajid 4 months ago
#57 - Run fineweb.py on MacOS
Pull Request -
State: closed - Opened by amjadmajid 4 months ago
#56 - Text generation can use raw_model instead of model
Issue -
State: open - Opened by sapphire008 4 months ago
#54 - consistently change model to raw_model
Pull Request -
State: open - Opened by hdocmsu 4 months ago
#52 - Dataloader now shuffles the shards and documents within
Pull Request -
State: open - Opened by fraserlove 4 months ago
- 1 comment
#50 - Different inference results between flash attention and manually implemented attention appeared.
Issue -
State: open - Opened by Jaeckel-d 4 months ago
#49 - How to support padding in the train dataset for training ?
Issue -
State: open - Opened by mrhimanshu 4 months ago
- 2 comments
#48 - Integrating GPT-2 with deepspeed Zero-1, Zero-2 and Zero-3
Issue -
State: open - Opened by Devadeut 4 months ago
- 1 comment
#47 - Cannot get the log file "log124M_40B/log.txt"?
Issue -
State: open - Opened by dtdo90 4 months ago
- 5 comments
#46 - Fineweb sharding multi-process bugfix
Pull Request -
State: closed - Opened by GrahamTheCoder 4 months ago
- 1 comment
#45 - Running codes on Windows issues
Issue -
State: open - Opened by gerardaristizabalpla4 4 months ago
- 2 comments
#44 - RuntimeError: User specified an unsupported autocast device_type 'cuda:0'
Issue -
State: closed - Opened by 0smboy 4 months ago
- 1 comment
#43 - Update README.md
Pull Request -
State: open - Opened by sriramgkn 5 months ago
#42 - Fix out of bounds check
Pull Request -
State: open - Opened by lukasugar 5 months ago
#41 - Executing with 1 GPU raises "OutOfMemory Exception", executing with 2 GPUs "RuntimeError: CUDA error: invalid device ordinal"
Issue -
State: closed - Opened by nmerkle 5 months ago
- 2 comments
#40 - add requirements
Pull Request -
State: open - Opened by mchen610 5 months ago
#39 - add requirements
Pull Request -
State: closed - Opened by mchen610 5 months ago
#38 - 564
Pull Request -
State: closed - Opened by testwithtesty 5 months ago
#36 - Add .gitignore
Pull Request -
State: closed - Opened by lutzroeder 5 months ago
#32 - Fix typo in configure_optimizers
Pull Request -
State: open - Opened by lukasugar 5 months ago
#31 - Is dataloader making optimal batches?
Issue -
State: closed - Opened by paraschopra 5 months ago
- 1 comment
#30 - fix sync issue that results in incorrect gradient accumulation and incorrect loss
Pull Request -
State: closed - Opened by WilsonCWu 5 months ago
- 1 comment
#29 - NO dropout in MLP and CausalSelfAttention
Issue -
State: closed - Opened by peter-ni-noob 5 months ago
- 2 comments
#28 - fix `Attempted to set non-positive bottom ylim` UserWarning in play.ipynb
Pull Request -
State: closed - Opened by WilsonCWu 5 months ago
#27 - fix script for non-cuda devices
Pull Request -
State: closed - Opened by adamskrodzki 5 months ago
- 1 comment
#26 - current position should be 0 at the start of a shard
Pull Request -
State: closed - Opened by eliebak 5 months ago
- 2 comments
#25 - Sharding the dataset not completing?
Issue -
State: open - Opened by dustinwloring1988 5 months ago
- 7 comments
#24 - Extending GPT2 for audio generation
Pull Request -
State: open - Opened by nivibilla 5 months ago
#23 - fix the double-jump for out-of-bound check in DataLoaderLite
Pull Request -
State: closed - Opened by fatemi 5 months ago
- 3 comments
#22 - Chunking method in the original GPT-2 training dataset
Issue -
State: closed - Opened by rasbt 5 months ago
- 2 comments
#20 - Fix typo in README.md
Pull Request -
State: closed - Opened by awesomebytes 5 months ago
- 2 comments
#19 - Propose validation loss calculation to improve accuracy by reducing floating-point errors
Pull Request -
State: open - Opened by aakashapoorv 5 months ago
- 2 comments
#18 - Embeddings are initialized with std of 0.02
Issue -
State: open - Opened by eryk-mazus 5 months ago
- 2 comments
#17 - Implement tensor parallelism
Issue -
State: closed - Opened by marib00 5 months ago
- 4 comments
#16 - Fix device type in autocast
Pull Request -
State: closed - Opened by banyan-god 5 months ago
#9 - Update train_gpt2.py
Pull Request -
State: closed - Opened by zhangfaen 5 months ago
#8 - Update README.md
Pull Request -
State: closed - Opened by aepiotti 5 months ago
#6 - Consider using `torch.compile(model, fullgraph=True, mode="reduce-overhead")`
Issue -
State: open - Opened by lezcano 5 months ago
- 11 comments
#5 - Typo fix in readme
Pull Request -
State: closed - Opened by rasbt 5 months ago
#4 - Fix fineweb's parallel processing on windows
Pull Request -
State: open - Opened by ltogniolli 5 months ago
- 3 comments