Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / karpathy/nanoGPT issues and pull requests
#594 - A GUI version of nanoGPT
Issue -
State: open - Opened by ystemsrx 9 days ago
#593 - Refactor: Optimize device handling and DDP setup
Pull Request -
State: closed - Opened by metacritical 15 days ago
- 1 comment
#592 - Refactor: Optimize device handling and DDP setup
Pull Request -
State: closed - Opened by metacritical 15 days ago
#591 - attempt to refactor nanoGPT
Issue -
State: open - Opened by tesla-cat 19 days ago
#590 - RoPE implementation with a shakespeare-char-rope test
Pull Request -
State: open - Opened by albertvucinovic 26 days ago
- 1 comment
#589 - Why seed 1337?
Issue -
State: open - Opened by sillymultifora 30 days ago
- 1 comment
#588 - Refactored code from different base based on leyan_branch
Pull Request -
State: open - Opened by cesposo about 1 month ago
#587 - Add the Quantized model and also a Demo of the Quantized model
Pull Request -
State: open - Opened by Ruhaan838 about 1 month ago
#586 - why in transformer we compute for all tokens but then use only the last token for prediction?
Issue -
State: open - Opened by Ahmedd-Wahdan about 1 month ago
#585 - Add rotary
Pull Request -
State: open - Opened by sunddytwo about 1 month ago
#584 - Dec branch commit
Pull Request -
State: open - Opened by cesposo about 2 months ago
#583 - tuh
Pull Request -
State: open - Opened by ftDyuthi about 2 months ago
#582 - Create Raja king
Pull Request -
State: open - Opened by RockyRajhacker 2 months ago
#581 - Added weight pruning
Pull Request -
State: closed - Opened by aswinr19 2 months ago
#580 - added dataset
Pull Request -
State: closed - Opened by Gaurav-B-R 2 months ago
#579 - how fix it?
Issue -
State: closed - Opened by WhiteSnowGirl 2 months ago
- 1 comment
#578 - fix: ensure non-zero learning rate during warmup at iteration 0
Pull Request -
State: closed - Opened by silasalberti 2 months ago
- 1 comment
#577 - NanoGPT and RTX 4090
Issue -
State: open - Opened by ArtHughes 3 months ago
#576 - Test
Pull Request -
State: closed - Opened by rkdgmlqja 3 months ago
#575 - Feature/concrete dropout
Pull Request -
State: closed - Opened by javiermas 3 months ago
#574 - Merge for comprehension when filtering parameters without grad
Pull Request -
State: open - Opened by tsdeng 3 months ago
#573 - Oren/amd mess
Pull Request -
State: closed - Opened by OrenLeung 3 months ago
#572 - Oren/config
Pull Request -
State: closed - Opened by OrenLeung 3 months ago
#571 - cancel
Pull Request -
State: closed - Opened by Zhao-Yuting 3 months ago
#570 - NaniGpt
Issue -
State: open - Opened by ashokkumar272 4 months ago
#569 - added fix to type comparison to enable fused AdamW
Pull Request -
State: open - Opened by seanjudelyons 4 months ago
#568 - Spring cleaning
Pull Request -
State: closed - Opened by ckgresla 4 months ago
#567 - How best to implement a differential transformer?
Issue -
State: open - Opened by Wilsontomass 4 months ago
- 2 comments
#566 - the things
Pull Request -
State: closed - Opened by drisspg 4 months ago
#565 - Normalized gpt
Pull Request -
State: closed - Opened by santiagoakle 4 months ago
- 1 comment
#564 - Ddp do not sync when not needed
Pull Request -
State: closed - Opened by OrenLeung 4 months ago
#563 - Refactor to stop inductor mess
Pull Request -
State: closed - Opened by OrenLeung 4 months ago
#562 - Moe
Pull Request -
State: closed - Opened by hellozmz 4 months ago
#561 - Clean
Pull Request -
State: closed - Opened by simran-arora 4 months ago
#560 - Windows 11: FileExistsError: [WinError 183] Cannot create a file when that file already exists
Issue -
State: open - Opened by VyBui 5 months ago
- 2 comments
#559 - Update README.md
Pull Request -
State: closed - Opened by eshwarram 5 months ago
#558 - Updated README.md to include table of contents, why this project is useful, and how to contribute, and added an output for one command
Pull Request -
State: open - Opened by arhaque09 5 months ago
#557 - Updated README.md to include table of contents, why this project is useful, and how to contribute
Pull Request -
State: closed - Opened by arhaque09 5 months ago
#556 - Updated README.md to include table of contents, why this project is useful, and how to contribute
Pull Request -
State: closed - Opened by arhaque09 5 months ago
#555 - Adding NVIDIA hardware performance detection
Pull Request -
State: open - Opened by fparisio 5 months ago
#554 - Pretraining loss explosion
Issue -
State: open - Opened by mattgorb 5 months ago
- 3 comments
#553 - Add fire finetuning
Pull Request -
State: open - Opened by gkielian 6 months ago
#552 - why is the warmup_iters set 2000?
Issue -
State: open - Opened by luxunxiansheng 6 months ago
#551 - The Positional Encoding is not using sin / cos?
Issue -
State: open - Opened by mw66 6 months ago
- 1 comment
#550 - Remove flashattention from model.py
Pull Request -
State: closed - Opened by chughtapan 6 months ago
#549 - Implement muP and add code for mup guide blog
Pull Request -
State: closed - Opened by ndey96 6 months ago
#548 - Perplexity
Issue -
State: open - Opened by Precola 6 months ago
#547 - Progressive training?
Issue -
State: open - Opened by immartian 6 months ago
- 5 comments
#546 - Add support for 0 temperature
Pull Request -
State: open - Opened by jmccrosky 6 months ago
#545 - torchrun on L40S Error:torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
Issue -
State: closed - Opened by Precola 6 months ago
- 1 comment
#544 - Rocm support?
Issue -
State: open - Opened by ilovethensa 6 months ago
- 1 comment
#543 - Calculation of Batch Size
Issue -
State: closed - Opened by Precola 7 months ago
- 1 comment
#542 - configuration for Macs(apple silicon)
Issue -
State: open - Opened by bawsi99 7 months ago
#541 - Adding gpt2 training experiment
Pull Request -
State: closed - Opened by NewtonSander 7 months ago
#540 - Use weights_only for loading
Pull Request -
State: open - Opened by kit1980 7 months ago
#539 - What to change for training on two T4 GPUs ?
Issue -
State: open - Opened by noorchauhan 7 months ago
- 1 comment
#538 - Update train.py for more efficiency
Pull Request -
State: open - Opened by Jesseonmi 7 months ago
#537 - Simple Use Case Demonstration with Old School Runescape Terminology
Issue -
State: open - Opened by Omarch47 7 months ago
#536 - Solution to Exercise 1 from Youtube Lecture (Batching the heads) - Why does it work?
Issue -
State: closed - Opened by Andrew-Luo1 7 months ago
- 1 comment
#535 - Nano GPT
Issue -
State: open - Opened by phanee123 7 months ago
#534 - ddp on macbook CPU
Pull Request -
State: closed - Opened by langong347 7 months ago
#533 - free up state_dict variable memory after loading checkpoint
Pull Request -
State: open - Opened by adistomar 8 months ago
#532 - FileNotFoundError: [Errno 2] No such file or directory: 'data/openwebtext/train.bin'
Issue -
State: open - Opened by HarikrishnanK9 8 months ago
- 1 comment
#531 - About the get_batch
Issue -
State: open - Opened by leo-young 8 months ago
- 1 comment
#530 - Add automatic detection of number of CPU cores
Pull Request -
State: open - Opened by Jakobovski 8 months ago
- 1 comment
#529 - Data cleaning for openwebtext
Issue -
State: open - Opened by zzkzzkjsw 8 months ago
#528 - fix val dataset size code comment
Pull Request -
State: open - Opened by vhmth 8 months ago
#527 - fix(train.py): mfu estimation to respect CPU-GPU sync point
Pull Request -
State: open - Opened by JasonLiJT 8 months ago
#526 - code gpt v1
Pull Request -
State: closed - Opened by shatrugna 8 months ago
#525 - "RuntimeError: Internal Triton PTX codegen error" is raised when I train shakespeare_char with a GPU
Issue -
State: open - Opened by shenbb 8 months ago
- 5 comments
#524 - Pretraining Divergence
Issue -
State: open - Opened by egoetz 8 months ago
- 3 comments
#493 - Overfitting of the small GPU model
Issue -
State: open - Opened by Bachstelze 9 months ago
#492 - Drop in performance when changing dtype to float32
Issue -
State: open - Opened by blaisedelattre 9 months ago
- 2 comments
#491 - Improvements to RWKV v5.1
Pull Request -
State: closed - Opened by faresobeid 9 months ago
#490 - Update README.md - added alternative running instructions
Pull Request -
State: closed - Opened by dnordfors 9 months ago
#489 - What does "prioritize teeth over education" even mean?
Issue -
State: open - Opened by dw61 9 months ago
- 2 comments
#488 - sign descent seems to do better than adamw?
Pull Request -
State: open - Opened by nullonesix 9 months ago
#487 - Update README.md
Pull Request -
State: closed - Opened by jellehak 9 months ago
#486 - [Q] Async prefetch next batch while model is doing forward pass
Issue -
State: open - Opened by GM-git-dotcom 9 months ago
- 1 comment
#485 - Shouldnt the ddp check be on ZERO instead of -1
Issue -
State: open - Opened by sajinpgupta 9 months ago
#484 - Hyperparameter Tuning
Issue -
State: closed - Opened by SinanCavusoglu 9 months ago
#483 - Index out of range when training on custom dataset
Issue -
State: open - Opened by TayTT 9 months ago
- 1 comment
#482 - What is the meaning of nh and hs
Issue -
State: closed - Opened by Bachstelze 9 months ago
- 1 comment
#481 - Fix: conditional use of GradScaler based on device_type and dtype in train.py
Pull Request -
State: open - Opened by BRAINIAC2677 10 months ago
#480 - neverMind
Issue -
State: closed - Opened by Zemulax 10 months ago
#479 - Implement multi-token prediction option for models
Issue -
State: open - Opened by tmostak 10 months ago
- 7 comments
#478 - nanoGPT/model.py where `manual implementation of attention`,Is it correct to modify it like I did?
Issue -
State: open - Opened by wmx-github 10 months ago
- 1 comment
#477 - Training fails on Python 3.12 on either GPU or CPU
Issue -
State: closed - Opened by tigran123 10 months ago
- 3 comments
#476 - Recommendation for something smaller
Issue -
State: open - Opened by diamondfishtools 10 months ago
- 1 comment
#475 - [Question] Why use `__call__` to do forward.
Issue -
State: closed - Opened by Felix-Zhenghao 10 months ago
- 2 comments
#474 - could nanoGPT be the AI assistant for the development of CAX software?
Issue -
State: open - Opened by fengsim 10 months ago
- 1 comment
#473 - [Question] The mask size seems wrong?
Issue -
State: closed - Opened by Felix-Zhenghao 10 months ago
#472 - [Question] why bias is init to zero?
Issue -
State: closed - Opened by michael8090 10 months ago
- 1 comment
#471 - Citing this project in research
Issue -
State: open - Opened by davmacario 11 months ago
- 4 comments
#470 - CUDA error: device-side assert triggered
Issue -
State: closed - Opened by ecsfu 11 months ago
#469 - How to Set "vocab_size" and "block_size" for Word Embedding?
Issue -
State: open - Opened by haibao-yu 11 months ago
- 1 comment
#468 - Is this loss curve normal
Issue -
State: open - Opened by banyan-god 11 months ago
- 21 comments
#467 - Resume Training
Issue -
State: open - Opened by tiredsoul21 11 months ago
- 3 comments