Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / karpathy/build-nanogpt issues and pull requests
#90 - Saving `raw_model.state_dict()` checkpoints
Issue -
State: open - Opened by anw-g01 23 days ago
- 1 comment
#89 - Why does weights sharing have to be in that direction?
Issue -
State: open - Opened by BetterWang about 1 month ago
- 1 comment
#88 - Fixed fineweb script to run on Windows.
Pull Request -
State: open - Opened by koiker about 1 month ago
#86 - How Can I extract Last Layer Representation?
Issue -
State: open - Opened by shantanu778 4 months ago
#83 - Adding Resume Training Functionality
Pull Request -
State: open - Opened by yousefelsharkawy 5 months ago
#82 - How is the autoregressive loss handled?
Issue -
State: closed - Opened by BabyCNM 5 months ago
- 2 comments
#81 - Avoid tiktoken.decode panic on unknown tokens.
Issue -
State: open - Opened by IggShaman 6 months ago
- 1 comment
#80 - Avoid tiktoken.decode panic on unknown tokens.
Pull Request -
State: open - Opened by IggShaman 6 months ago
#79 - torch.compile-d models do not work with example generation and hellaswag eval
Issue -
State: open - Opened by IggShaman 6 months ago
- 1 comment
#78 - Make example generation work when the model is torch.compile-d
Pull Request -
State: open - Opened by IggShaman 6 months ago
#77 - fix: Progress bar not complete after starting a new shard
Pull Request -
State: open - Opened by sanbaiw 6 months ago
#73 - Update train_gpt2.py with compile mode for torch.compile
Pull Request -
State: closed - Opened by manu-chauhan 6 months ago
- 1 comment
#72 - Ensure Consistency Between GPTConfig.block_size and Sequence Length T
Pull Request -
State: open - Opened by Benetti-Hub 6 months ago
- 1 comment
#66 - Refactor: Replace Class Name with cls in from_pretrained Method - Fix subclassing
Pull Request -
State: open - Opened by mohsenhariri 7 months ago
#64 - Add `torch_xla` support to `build-nanogpt`
Pull Request -
State: closed - Opened by miladm 7 months ago
#62 - Enable torch compile with DDP
Pull Request -
State: open - Opened by stevebako 7 months ago
#60 - Fix torch.compile Issue - Error with HellaSwag eval and Generation
Issue -
State: closed - Opened by ML-Guy 7 months ago
#59 - Run the script on MacOS
Pull Request -
State: closed - Opened by amjadmajid 7 months ago
#57 - Run fineweb.py on MacOS
Pull Request -
State: closed - Opened by amjadmajid 7 months ago
#56 - Text generation can use raw_model instead of model
Issue -
State: open - Opened by sapphire008 7 months ago
#54 - consistently change model to raw_model
Pull Request -
State: open - Opened by hdocmsu 8 months ago
#52 - Dataloader now shuffles the shards and documents within
Pull Request -
State: open - Opened by fraserlove 8 months ago
- 1 comment
#50 - Different inference results between flash attention and manually implemented attention appeared.
Issue -
State: open - Opened by Jaeckel-d 8 months ago
#49 - How to support padding in the train dataset for training ?
Issue -
State: open - Opened by mrhimanshu 8 months ago
- 2 comments
#48 - Integrating GPT-2 with deepspeed Zero-1, Zero-2 and Zero-3
Issue -
State: open - Opened by Devadeut 8 months ago
- 1 comment
#47 - Cannot get the log file "log124M_40B/log.txt"?
Issue -
State: open - Opened by dtdo90 8 months ago
- 5 comments
#46 - Fineweb sharding multi-process bugfix
Pull Request -
State: closed - Opened by GrahamTheCoder 8 months ago
- 1 comment
#45 - Running codes on Windows issues
Issue -
State: open - Opened by gerardaristizabalpla4 8 months ago
- 2 comments
#44 - RuntimeError: User specified an unsupported autocast device_type 'cuda:0'
Issue -
State: closed - Opened by 0smboy 8 months ago
- 1 comment
#43 - Update README.md
Pull Request -
State: open - Opened by sriramgkn 8 months ago
#42 - Fix out of bounds check
Pull Request -
State: open - Opened by lukasugar 8 months ago
#41 - Executing with 1 GPU raises "OutOfMemory Exception", executing with 2 GPUs "RuntimeError: CUDA error: invalid device ordinal"
Issue -
State: closed - Opened by nmerkle 8 months ago
- 2 comments
#40 - add requirements
Pull Request -
State: open - Opened by mchen610 8 months ago
#39 - add requirements
Pull Request -
State: closed - Opened by mchen610 8 months ago
#38 - 564
Pull Request -
State: closed - Opened by testwithtesty 8 months ago
#36 - Add .gitignore
Pull Request -
State: closed - Opened by lutzroeder 8 months ago
#32 - Fix typo in configure_optimizers
Pull Request -
State: open - Opened by lukasugar 8 months ago
#31 - Is dataloader making optimal batches?
Issue -
State: closed - Opened by paraschopra 8 months ago
- 1 comment
#30 - fix sync issue that results in incorrect gradient accumulation and incorrect loss
Pull Request -
State: closed - Opened by WilsonCWu 8 months ago
- 1 comment
#29 - NO dropout in MLP and CausalSelfAttention
Issue -
State: closed - Opened by peter-ni-noob 8 months ago
- 2 comments
#28 - fix `Attempted to set non-positive bottom ylim` UserWarning in play.ipynb
Pull Request -
State: closed - Opened by WilsonCWu 8 months ago
#27 - fix script for non-cuda devices
Pull Request -
State: closed - Opened by adamskrodzki 8 months ago
- 1 comment
#26 - current position should be 0 at the start of a shard
Pull Request -
State: closed - Opened by eliebak 8 months ago
- 2 comments
#25 - Sharding the dataset not completing?
Issue -
State: open - Opened by dustinwloring1988 8 months ago
- 7 comments
#24 - Extending GPT2 for audio generation
Pull Request -
State: open - Opened by nivibilla 8 months ago
#23 - fix the double-jump for out-of-bound check in DataLoaderLite
Pull Request -
State: closed - Opened by fatemi 8 months ago
- 3 comments
#22 - Chunking method in the original GPT-2 training dataset
Issue -
State: closed - Opened by rasbt 8 months ago
- 2 comments
#20 - Fix typo in README.md
Pull Request -
State: closed - Opened by awesomebytes 8 months ago
- 2 comments
#19 - Propose validation loss calculation to improve accuracy by reducing floating-point errors
Pull Request -
State: open - Opened by aakashapoorv 8 months ago
- 2 comments
#18 - Embeddings are initialized with std of 0.02
Issue -
State: open - Opened by eryk-mazus 8 months ago
- 2 comments
#17 - Implement tensor parallelism
Issue -
State: closed - Opened by marib00 8 months ago
- 4 comments
#16 - Fix device type in autocast
Pull Request -
State: closed - Opened by banyan-god 8 months ago
#9 - Update train_gpt2.py
Pull Request -
State: closed - Opened by zhangfaen 8 months ago
#8 - Update README.md
Pull Request -
State: closed - Opened by aepiotti 8 months ago
#6 - Consider using `torch.compile(model, fullgraph=True, mode="reduce-overhead")`
Issue -
State: open - Opened by lezcano 8 months ago
- 11 comments
#5 - Typo fix in readme
Pull Request -
State: closed - Opened by rasbt 8 months ago
#4 - Fix fineweb's parallel processing on windows
Pull Request -
State: open - Opened by ltogniolli 8 months ago
- 3 comments