EleutherAI/pythia issues and pull requests

#187 - include full 'end_iteration' batch

Pull Request - State: closed - Opened by efittschen 3 months ago - 1 comment

#186 - 'end_iteration' batch truncated in batch_viewer

Issue - State: closed - Opened by efittschen 3 months ago

#185 - fix dead links in README.md

Pull Request - State: closed - Opened by KarolisRam 5 months ago - 1 comment

#184 - PolyPythias: Include scripts for finding HMM training maps

Pull Request - State: open - Opened by oskarvanderwal 5 months ago

#183 - Shard hashes for `EleutherAI/pile-deduped-pythia-preshuffled`

Issue - State: open - Opened by pietrolesci 5 months ago

#182 - Add code to reproduce polypythia paper plots

Pull Request - State: closed - Opened by pietrolesci 5 months ago

#181 - fix wandb link and add some more

Pull Request - State: closed - Opened by Quentin-Anthony 8 months ago

#180 - Can't find the index file

Issue - State: closed - Opened by jaydeepborkar 9 months ago - 1 comment

#179 - add wandb link to loss curves

Pull Request - State: closed - Opened by Quentin-Anthony 9 months ago - 1 comment

#178 - `torch.concat` is supported when reproducing results with docker

Issue - State: open - Opened by pingzhili 10 months ago

#177 - Pythia 160M is giving unreasonable logit values

Issue - State: open - Opened by danielmisrael 10 months ago - 1 comment

#176 - Update batch_viewer docs to accurately reflect data indexing

Pull Request - State: closed - Opened by jeffreygwang 10 months ago - 3 comments

#175 - No EOD Tokens in EleutherAI/pile-deduped-pythia-preshuffled

Issue - State: open - Opened by markschoene 10 months ago - 1 comment

#174 - Iclr

Pull Request - State: closed - Opened by sunnyddelight 10 months ago - 1 comment

#173 - The possibility of modifying the checkpoint and reloading the model parameter

Issue - State: open - Opened by peteryang1031 10 months ago

#172 - Questions regarding the WSC evaluation results

Issue - State: open - Opened by mutiann 11 months ago

#171 - Clarification of Pythia Deduped Precision - bf16 or fp16?

Issue - State: closed - Opened by RylanSchaeffer 11 months ago - 1 comment

#170 - Update README.md

Pull Request - State: closed - Opened by MeDott29 12 months ago - 1 comment

#169 - Inquiry about Re-uploading Additional Pythia-410M Model Variants(i.e., seed1-9)

Issue - State: open - Opened by liudan193 about 1 year ago

#168 - Refactoring

Pull Request - State: closed - Opened by sunnyddelight about 1 year ago - 1 comment

#167 - open-source the training data used between two adjacent checkpoints

Issue - State: open - Opened by txy77 about 1 year ago

#166 - make README easier to follow

Pull Request - State: open - Opened by Arvid-pku about 1 year ago - 1 comment

#165 - Issue while showering NLO events with NLO

Issue - State: open - Opened by rash-eng about 1 year ago - 4 comments

#164 - How to use the Huggingface dataset /EleutherAI/pythia-memorized-evals in predictable-memorization?

Issue - State: open - Opened by Happy2Git about 1 year ago

#163 - cache_dir cannot be the same as model name

Issue - State: open - Opened by arunasank about 1 year ago

#162 - Pythia 12b flash config

Issue - State: open - Opened by jvendrow about 1 year ago

#161 - Sparse

Pull Request - State: closed - Opened by sunnyddelight about 1 year ago - 1 comment

#160 - how to use Pythia

Issue - State: open - Opened by gaohang about 1 year ago - 1 comment

#159 - Convert to GGUF

Issue - State: open - Opened by yanxon about 1 year ago

#158 - Reshape error in batch viewer

Issue - State: closed - Opened by activatedgeek about 1 year ago - 1 comment

#157 - Update README.md

Pull Request - State: closed - Opened by borgr over 1 year ago - 1 comment

#156 - tokenizer.pad_token

Issue - State: open - Opened by vincent317 over 1 year ago - 1 comment

#155 - instruct-tuned pythia

Issue - State: open - Opened by WilliamsToTo over 1 year ago

#154 - Correct link to huggingface

Pull Request - State: closed - Opened by l-ma over 1 year ago - 1 comment

#153 - Provide the shuffled index_mapping npy files for ease of reproducing training data

Issue - State: open - Opened by ziqi-zhang over 1 year ago - 1 comment

#152 - Optimizer states in HF format

Issue - State: open - Opened by seyuboglu over 1 year ago - 1 comment

#151 - Weird inconsistency in Tokenizer vocabulary

Issue - State: open - Opened by javirandor over 1 year ago - 1 comment

#150 - Is there existing code to resume training from specific checkpoint?

Issue - State: closed - Opened by javirandor over 1 year ago - 1 comment

#149 - "gas" configuration doesn't do anything

Issue - State: open - Opened by segyges over 1 year ago

#148 - Adding _warmup_mmap_file function missing from MMapIndexedDataset

Pull Request - State: closed - Opened by rdiehlmartinez over 1 year ago - 1 comment

#147 - Add training loss data

Pull Request - State: open - Opened by pietrolesci over 1 year ago - 1 comment

#146 - Update README.md

Pull Request - State: closed - Opened by speed1313 over 1 year ago - 1 comment

#145 - Would it be possible to share training loss curves on the original Pythia models?

Issue - State: closed - Opened by itsnamgyu over 1 year ago - 4 comments

#144 - [Pythia on Pile-Dedup] Training for ~1.5 epochs: how to identify the repeated sequences (i.e., the additional .5 epoch)?

Issue - State: open - Opened by pietrolesci over 1 year ago - 3 comments

#143 - Fix ToC

Pull Request - State: closed - Opened by osanseviero over 1 year ago - 1 comment

#142 - Details about "EleutherAI/pythia-160m-seed*" models

Issue - State: closed - Opened by IanMagnusson over 1 year ago - 3 comments

#141 - Missing / undownloadable checkpoints on huggingface

Issue - State: closed - Opened by mirandrom over 1 year ago - 3 comments

#140 - .

Issue - State: closed - Opened by ParthaKrPaul over 1 year ago - 2 comments

#139 - Wrong files in eval?

Issue - State: open - Opened by borgr over 1 year ago

#138 - Pythia or GPT-neox?

Issue - State: closed - Opened by borgr over 1 year ago - 1 comment

#137 - Deduplicated Pile dataset with Domain Attribution

Issue - State: closed - Opened by michaelduan8 over 1 year ago - 1 comment

#136 - Replicating the Training Data Order

Issue - State: closed - Opened by prakharg24 over 1 year ago - 1 comment

#135 - Inconsistent init methods of pythia-6.9b model

Issue - State: open - Opened by mqyqlx over 1 year ago - 3 comments

#134 - Update README.md

Pull Request - State: closed - Opened by segyges over 1 year ago

#133 - Add checksum for data from huggingface

Pull Request - State: closed - Opened by segyges over 1 year ago - 1 comment

#132 - The value of weight decay

Issue - State: closed - Opened by yehuitang over 1 year ago - 1 comment

#131 - Update requirements.txt

Pull Request - State: closed - Opened by segyges over 1 year ago

#130 - Typos in readme.md

Pull Request - State: closed - Opened by segyges over 1 year ago - 1 comment

#129 - Model Initialization Question

Issue - State: closed - Opened by yanlai00 over 1 year ago - 1 comment

#128 - Update readme to load preshuffled datasets

Pull Request - State: closed - Opened by uSaiPrashanth over 1 year ago

#127 - Has the data been shuffled?

Issue - State: open - Opened by Lisennlp over 1 year ago - 2 comments

#126 - Reading data is slow!

Issue - State: open - Opened by Lisennlp over 1 year ago - 1 comment

#125 - Automatically calculate shard size

Pull Request - State: closed - Opened by uSaiPrashanth almost 2 years ago

#124 - Automatically determine shard size

Pull Request - State: closed - Opened by uSaiPrashanth almost 2 years ago - 1 comment

#123 - Batch Viewer : Why Sequence Length 2049?

Issue - State: closed - Opened by prakharg24 almost 2 years ago - 15 comments

#122 - The performance about pythia and LLaMA model architecture

Issue - State: closed - Opened by peiyingxin almost 2 years ago - 1 comment

#121 - Any results on the validation set?

Issue - State: open - Opened by chujiezheng almost 2 years ago - 1 comment

#120 - README Update

Pull Request - State: closed - Opened by StellaAthena almost 2 years ago - 1 comment

#119 - Update README.md

Pull Request - State: closed - Opened by StellaAthena almost 2 years ago

#118 - Mismatch about the evaluation results

Issue - State: closed - Opened by yuzc19 almost 2 years ago - 11 comments

#117 - Weights tying

Issue - State: closed - Opened by link-er almost 2 years ago - 1 comment

#116 - Convert the huggingface checkpoint to GPT-Neox checkpoint

Issue - State: closed - Opened by ZhiYuanZeng almost 2 years ago - 2 comments

#115 - Clarification of Pythia tokenizer(s) at different sizes, steps and data preprocessing?

Issue - State: closed - Opened by RylanSchaeffer almost 2 years ago - 1 comment

#114 - Error when running unshard_memmap.py

Issue - State: closed - Opened by ShaneeyS about 2 years ago - 2 comments

#113 - Can I provide custom data and continue training Pythia on this new data?

Issue - State: closed - Opened by GeorgiAngelov about 2 years ago - 1 comment

#112 - Difference between LFS and HuggingFace datasets?

Issue - State: closed - Opened by eric-mitchell about 2 years ago - 1 comment

#111 - Batch viewer

Pull Request - State: closed - Opened by uSaiPrashanth about 2 years ago

#110 - Multiple training runs of same model with different random seed for weight initialisation

Issue - State: closed - Opened by KarolisRam about 2 years ago - 1 comment

#109 - Update documentation for installing `batch_viewer.py` deps

Pull Request - State: closed - Opened by haileyschoelkopf about 2 years ago

#108 - Possible error in Pythia-12B-deduped step 32000

Issue - State: closed - Opened by smahdavi4 about 2 years ago - 2 comments

#107 - pythia-12b checkpoints missing on HuggingFace for step4000 and step32000

Issue - State: closed - Opened by byungdoh about 2 years ago - 2 comments

#106 - Is there a template boilerplate for the prompt used in C.1 gender bias intervention?

Issue - State: closed - Opened by ruyuan-zuo about 2 years ago - 1 comment

#105 - Draft new repo structure

Pull Request - State: closed - Opened by haileyschoelkopf about 2 years ago - 2 comments

#104 - Add Memorization Evals to repo

Pull Request - State: closed - Opened by uSaiPrashanth about 2 years ago - 1 comment

#103 - Added instructions for reproducing a Pythia training

Pull Request - State: closed - Opened by BaruchG about 2 years ago - 1 comment

#102 - Train/valid/test split

Issue - State: closed - Opened by choidami about 2 years ago - 1 comment

#101 - release of checkpoints of different steps

Issue - State: closed - Opened by TobiasLee about 2 years ago - 5 comments

#100 - Ensure flash attention in configs

Pull Request - State: closed - Opened by haileyschoelkopf about 2 years ago

#99 - Revamp experiment organization and migrate code when necessary

Issue - State: closed - Opened by StellaAthena over 2 years ago
Labels: documentation

#98 - Will memorization experimental codes be released?

Issue - State: closed - Opened by chujiezheng over 2 years ago - 2 comments

#97 - the loss of pythia training

Issue - State: closed - Opened by Wangpeiyi9979 over 2 years ago - 3 comments

#96 - Fine-tuning recommendations

Issue - State: closed - Opened by RainIwakura over 2 years ago - 2 comments

#95 - Update License

Pull Request - State: closed - Opened by StellaAthena over 2 years ago - 1 comment

#94 - Pythia 6.9B Model Missing Checkpoint

Issue - State: closed - Opened by chujiezheng over 2 years ago - 1 comment

#93 - Update README.md to remove work-in-progress disclaimer

Pull Request - State: closed - Opened by haileyschoelkopf over 2 years ago - 1 comment

#92 - Is there an access to the deduplicated version of the data with meta info?

Issue - State: closed - Opened by Jason3900 over 2 years ago - 7 comments

#91 - Add a citation to Readme

Pull Request - State: closed - Opened by haileyschoelkopf over 2 years ago

#90 - Cleanup old files

Pull Request - State: closed - Opened by haileyschoelkopf over 2 years ago

#89 - Fine tune for text generation on custom data

Issue - State: closed - Opened by samarthsarin over 2 years ago - 1 comment

#88 - Add paper to README

Pull Request - State: closed - Opened by Quentin-Anthony over 2 years ago
