Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / karpathy/nanoGPT issues and pull requests
#97 - High validation loss when fine-tuning Shakespeare on gpt-xl?
Issue -
State: open - Opened by tombenj over 1 year ago
- 1 comment
#96 - added `star-history`
Pull Request -
State: closed - Opened by hemangjoshi37a over 1 year ago
#95 - Add Common Crawl Dataset
Issue -
State: open - Opened by DrissiReda over 1 year ago
#94 - Training on "Shakespeare" dataset is faster by using MacBook Air (M2)
Issue -
State: open - Opened by xiningnlp over 1 year ago
- 7 comments
#93 - GPU specs for finetuning gpt2-xl
Issue -
State: open - Opened by yogi-miraje over 1 year ago
- 2 comments
#92 - Making nano chatgpt
Issue -
State: open - Opened by nebyu08 over 1 year ago
- 8 comments
#91 - SparseGPT + nanoGPT
Issue -
State: open - Opened by Grabber over 1 year ago
- 1 comment
#90 - Out of Memory
Issue -
State: open - Opened by kevinsaner91 over 1 year ago
#89 - Dataset load
Issue -
State: open - Opened by thremilien over 1 year ago
- 5 comments
#88 - Cuda out of Memory
Issue -
State: open - Opened by hanfluid over 1 year ago
- 3 comments
#87 - Use GradScaler in model only if dtype is float16
Pull Request -
State: closed - Opened by johnwildauer over 1 year ago
- 2 comments
#86 - Fix python builtin redefined
Pull Request -
State: closed - Opened by tpaviot over 1 year ago
- 2 comments
#85 - Google Coral
Issue -
State: open - Opened by Helvio88 over 1 year ago
#84 - my gpu only supports float16, how do i train a model?
Issue -
State: closed - Opened by breadbrowser over 1 year ago
- 5 comments
#83 - token_embedding and pos_embedding
Issue -
State: closed - Opened by pure-water over 1 year ago
- 2 comments
#82 - Missed two spots while relative pathing
Pull Request -
State: closed - Opened by danielgross over 1 year ago
- 1 comment
#81 - GPT with UNet architecture gets the loss down to ~1.0 with no significant computation costs.
Issue -
State: closed - Opened by englertbruno over 1 year ago
- 15 comments
#80 - Fix Issue with running prepare.py (modified repos/nanoGPT/data/openwebtext/prepare.py)
Pull Request -
State: closed - Opened by pierrebhat over 1 year ago
- 5 comments
#79 - Fix Issue with running prepare.py Description: This PR fixes an issue with running `python prepare.py` by modifying files in the repos/nanoGPT/data directory.
Pull Request -
State: closed - Opened by pierrebhat over 1 year ago
#78 - Fix Issue with running prepare.py in the nanoGPT repo Description: Fixes an issue with running `python prepare.py` that results in a `DatasetGenerationError` by modifying the files ['repos/nanoGPT/data/openwebtext/prepare.py'].
Pull Request -
State: closed - Opened by pierrebhat over 1 year ago
#77 - Fix Issue with running prepare.py - Modify prepare.py, shakespeare/prepare.py & shakespeare_char/prepare.py
Pull Request -
State: closed - Opened by pierrebhat over 1 year ago
#76 - Cache the KV projection history when generating
Pull Request -
State: closed - Opened by dfyz over 1 year ago
- 12 comments
#75 - create an openwebtext for non-english language
Issue -
State: open - Opened by toozande over 1 year ago
- 1 comment
#74 - Small fix to decode fn in shakespeare_char/prepare.py
Pull Request -
State: closed - Opened by venusatuluri over 1 year ago
#73 - Use relative paths
Pull Request -
State: closed - Opened by danielgross over 1 year ago
- 1 comment
#72 - replace copy add with inplace add in the Block
Pull Request -
State: closed - Opened by KucicM over 1 year ago
- 1 comment
#71 - Zero-grad more aggressively to save memory
Pull Request -
State: closed - Opened by cchan over 1 year ago
- 8 comments
#70 - OpenWebTextCorpus DataLoader
Issue -
State: open - Opened by vgoklani over 1 year ago
- 3 comments
#69 - A question on getting garbage in sample.py (Generator)
Issue -
State: open - Opened by hka3rs over 1 year ago
#68 - Add motivation for why to use Fabric
Pull Request -
State: closed - Opened by awaelchli over 1 year ago
- 1 comment
#67 - Just a question
Issue -
State: open - Opened by jpbruneton over 1 year ago
- 2 comments
#66 - fix typo ( params -> tokens)
Pull Request -
State: closed - Opened by PWhiddy over 1 year ago
#65 - More a question - is there an easy way to test generation?
Issue -
State: closed - Opened by fblissjr over 1 year ago
- 1 comment
#64 - Proposal for a slightly improved minimal configuration system
Issue -
State: open - Opened by adonath over 1 year ago
#63 - Support TensorFlow 2
Issue -
State: closed - Opened by pure-rgb over 1 year ago
- 1 comment
#62 - Error when using Pytorch 2.0 (Compile=False)
Issue -
State: open - Opened by hanfluid over 1 year ago
- 1 comment
#61 - CUDA out of memory
Issue -
State: closed - Opened by hanfluid over 1 year ago
- 3 comments
#60 - checkpoints don't seem to be working
Issue -
State: closed - Opened by eniompw over 1 year ago
- 2 comments
#59 - Why using learnable position embedding just like token embedding?
Issue -
State: open - Opened by tiendung over 1 year ago
- 3 comments
#58 - what is the main speed up trick for nanoGPT?
Issue -
State: open - Opened by brando90 over 1 year ago
- 3 comments
#57 - Improve readability of huge numbers
Pull Request -
State: closed - Opened by ryouze over 1 year ago
- 1 comment
#55 - DDP on multinode [not yet working]
Pull Request -
State: closed - Opened by karpathy over 1 year ago
- 3 comments
#54 - Give tqdm some love :)
Pull Request -
State: closed - Opened by MicroPanda123 over 1 year ago
- 3 comments
#53 - Please add a package manager and requirements
Issue -
State: open - Opened by muddi900 over 1 year ago
- 1 comment
#52 - Finetune code translation tasks
Issue -
State: open - Opened by edgarriba over 1 year ago
#51 - implements torch sdpa
Pull Request -
State: closed - Opened by LucasLLC over 1 year ago
#50 - Issue with running prepare.py
Issue -
State: open - Opened by torial over 1 year ago
- 3 comments
#49 - Got stuck at `dataset = load_dataset("openwebtext")`
Issue -
State: closed - Opened by hanfluid over 1 year ago
- 1 comment
#48 - Another thank you
Issue -
State: closed - Opened by greydanus over 1 year ago
- 1 comment
#47 - How to load the GPT-2 model
Issue -
State: open - Opened by strangeoptics over 1 year ago
- 2 comments
#46 - Support for Logging with Comet!
Pull Request -
State: closed - Opened by sherpan over 1 year ago
- 1 comment
#45 - Doesn't have a CONTRIBUTING.md file
Issue -
State: open - Opened by izam-mohammed over 1 year ago
- 1 comment
#44 - Corrected some mistakes in README.md file
Pull Request -
State: closed - Opened by izam-mohammed over 1 year ago
- 2 comments
#43 - Use classes for examples
Pull Request -
State: closed - Opened by acheong08 over 1 year ago
#42 - Pluck last token before lm_head(x) during inference?
Issue -
State: closed - Opened by jxtps over 1 year ago
- 2 comments
#41 - Is it possible: davinci-003?
Issue -
State: open - Opened by gameveloster over 1 year ago
- 4 comments
#40 - copy model args from checkpoint model when resuming the training
Pull Request -
State: closed - Opened by yogi-miraje over 1 year ago
#39 - Perhaps another dependency is on the transformers package
Issue -
State: closed - Opened by amiramir over 1 year ago
- 1 comment
#38 - Make wandb training logs public
Issue -
State: open - Opened by tcapelle over 1 year ago
- 2 comments
#37 - Hardware requirements for inference?
Issue -
State: closed - Opened by jjtolton over 1 year ago
- 1 comment
#36 - Stop words?
Issue -
State: open - Opened by BoyuanJackChen over 1 year ago
- 3 comments
#35 - Thank you
Issue -
State: closed - Opened by agamemnonc over 1 year ago
- 2 comments
#34 - Add gradient accumulation support
Pull Request -
State: closed - Opened by VHellendoorn over 1 year ago
- 6 comments
#33 - What is nanoGPT and how to use it?
Issue -
State: open - Opened by sudo-sand over 1 year ago
- 1 comment
#32 - is there a google colab / jupyter notebook implementation of this project?
Issue -
State: open - Opened by SadafShafi over 1 year ago
- 2 comments
#31 - Using float16 via Gradscaler
Issue -
State: open - Opened by acheong08 over 1 year ago
- 1 comment
#30 - Training on AMD Ryzen 5 5600H with Radeon Graphics, 3301 Mhz (RTX 3050 Laptop), 6 Cores, 12 Threads
Issue -
State: closed - Opened by ElJaian over 1 year ago
- 2 comments
#29 - Use argparse in configurator.py
Pull Request -
State: closed - Opened by plotguy over 1 year ago
- 3 comments
#28 - Training on M1 "MPS"
Issue -
State: open - Opened by okpatil4u over 1 year ago
- 45 comments
#27 - Don't hard-code device in autocast
Pull Request -
State: closed - Opened by lantiga over 1 year ago
- 11 comments
#26 - Argparse but vars remain at global level and minimal boilerplate
Pull Request -
State: closed - Opened by murbard over 1 year ago
- 5 comments
#25 - I explored the functionalities of prepare.py on my own and prepared a post in Spanish
Issue -
State: open - Opened by lzeladam over 1 year ago
- 1 comment
#24 - change whitelist to allowlist and blacklist to blocklist
Pull Request -
State: closed - Opened by JonathanSum over 1 year ago
- 1 comment
#23 - Inefficiencies
Pull Request -
State: closed - Opened by Anri-Lombard over 1 year ago
- 2 comments
#22 - # note: each worker gets a different seed
Issue -
State: closed - Opened by vgoklani over 1 year ago
- 2 comments
#21 - Tie LM Head Weight to Token Embedding to match official GPT2 Code
Pull Request -
State: closed - Opened by fattorib over 1 year ago
- 9 comments
#20 - Make wandb import conditioned to wandb_log=True
Pull Request -
State: closed - Opened by lantiga over 1 year ago
- 7 comments
#19 - Strip unwanted prefix from state keys when loading model in sample.py
Pull Request -
State: closed - Opened by nat over 1 year ago
- 1 comment
#18 - Simple ml-collections instrumentation
Pull Request -
State: closed - Opened by tcapelle over 1 year ago
- 3 comments
#17 - Log the config params to wandb
Pull Request -
State: closed - Opened by tcapelle over 1 year ago
- 3 comments
#16 - Update README.md
Pull Request -
State: closed - Opened by jorahn over 1 year ago
- 1 comment
#15 - Requirements & encoding
Pull Request -
State: closed - Opened by nil-andreu over 1 year ago
- 4 comments
#14 - Requirements & Encoding
Pull Request -
State: closed - Opened by nil-andreu over 1 year ago
#13 - PyTorch-nightly dependency chain
Issue -
State: open - Opened by nlathia over 1 year ago
- 1 comment
#12 - Jax/Flax Rewrite
Issue -
State: open - Opened by jenkspt over 1 year ago
- 3 comments
#11 - Remove @torch.jit.script decorator when compiling the model?
Issue -
State: closed - Opened by vgoklani over 1 year ago
- 12 comments
#10 - batch file write
Pull Request -
State: closed - Opened by LaihoE over 1 year ago
- 3 comments
#9 - cpu support
Pull Request -
State: closed - Opened by Ricardicus over 1 year ago
- 9 comments
#8 - Running train.py on 2060 GPU
Issue -
State: open - Opened by lzeladam over 1 year ago
- 6 comments
#7 - Is there an extra charge?
Issue -
State: closed - Opened by phonefixnicole over 1 year ago
#6 - README.md
Pull Request -
State: closed - Opened by jarede-dev over 1 year ago
- 1 comment
#5 - batch and multiprocess file write
Pull Request -
State: closed - Opened by LaihoE over 1 year ago
- 2 comments
#4 - prepare.py: single-threaded write with mmap only once
Pull Request -
State: closed - Opened by proger over 1 year ago
- 8 comments
#3 - pytorch gelu tanh approximation
Pull Request -
State: closed - Opened by zacwellmer over 1 year ago
- 2 comments
#2 - Try using gelu approximate = 'tanh'
Issue -
State: closed - Opened by drisspg over 1 year ago
- 2 comments
#1 - Minor Frozen GPTConfig
Pull Request -
State: closed - Opened by ankandrew over 1 year ago
- 1 comment