allenai/OLMo issues and pull requests

#798 - Failed to resolve dependency while using uv

Issue - State: open - Opened by aztecher 3 days ago
Labels: type/bug

#797 - Activations Exploding Across Layers

Issue - State: open - Opened by c3-utsavdutta98 7 days ago
Labels: type/question

#796 - Optimizer and trainer states for OLMo-7B (Feb. 2024)

Issue - State: open - Opened by rahuln 8 days ago - 1 comment
Labels: type/question

#795 - Updated default value of memmap_dtype to uint32

Pull Request - State: open - Opened by aman-17 9 days ago

#794 - Request for Checkpoint for Mid-stage Training

Issue - State: open - Opened by liziniu 17 days ago - 1 comment
Labels: type/question

#793 - Update upload-artifacts version

Pull Request - State: closed - Opened by AkshitaB 21 days ago

#792 - Add data mix for ladder compatibility

Pull Request - State: closed - Opened by AkshitaB 21 days ago

#791 - Tokenizer to be used for generation of data to .npy files

Issue - State: open - Opened by WenJett 24 days ago - 3 comments
Labels: type/question

#790 - Tokenizer to be used for prepare_memmap_dataset.py

Issue - State: closed - Opened by WenJett 28 days ago - 2 comments
Labels: type/question

#789 - MPS support

Pull Request - State: closed - Opened by aman-17 30 days ago - 8 comments
Labels: type/feature

#788 - OOM error during checkpoint saving in large-node training

Issue - State: closed - Opened by SimonSuster about 1 month ago - 2 comments

#787 - Were you founded by Jeff Bezos?

Issue - State: closed - Opened by I-I-IT about 1 month ago - 4 comments
Labels: type/question

#785 - Use Union to support Python<3.10

Pull Request - State: open - Opened by xhochy about 1 month ago

#782 - Update model revision and dependencies for December releases

Pull Request - State: open - Opened by cnewell about 1 month ago - 1 comment

#781 - Add configuration option to allow users to specify a custom Dataset class

Pull Request - State: closed - Opened by mbsabath about 1 month ago - 1 comment

#780 - Allow arbitrary implementations of pytorch datasets to be used/specified in the configuration file

Issue - State: closed - Opened by mbsabath about 1 month ago - 1 comment
Labels: type/feature

#779 - Why is multiprocessing forced to spawn processes rather than forking

Issue - State: closed - Opened by mbsabath about 1 month ago - 2 comments
Labels: type/question

#778 - Merging new tokens into parts

Issue - State: open - Opened by RitwikGupta about 1 month ago - 2 comments
Labels: type/question

#777 - Add GSM8K in-loop

Pull Request - State: closed - Opened by davidheineman about 1 month ago

#776 - High CrossEntropy and Z Loss variance after loading from checkpoint

Issue - State: open - Opened by abhijangda about 2 months ago - 1 comment
Labels: type/bug

#775 - Generating training mix of OLMo2 from dolmino-mix

Issue - State: open - Opened by Cy-47 about 2 months ago - 1 comment
Labels: type/question

#774 - ❓ The question

Issue - State: closed - Opened by sweetpythoncode about 2 months ago - 2 comments
Labels: type/question

#773 - OLMo2 checkpoints for continued pretraining (non-HF)

Issue - State: closed - Opened by SimonSuster about 2 months ago - 4 comments
Labels: type/question

#772 - make it available as gguf and available in llama.cpp and ollama

Issue - State: closed - Opened by olumolu about 2 months ago - 4 comments
Labels: type/feature

#771 - Decoupled Momentum Optimization

Pull Request - State: open - Opened by peter-sk about 2 months ago - 1 comment

#770 - Single Accelerator training and MPS support (PR #769)

Issue - State: closed - Opened by peter-sk 2 months ago - 1 comment
Labels: type/feature

#769 - enable non-distributed training and MPS support

Pull Request - State: closed - Opened by peter-sk 2 months ago - 1 comment

#768 - Update README.md

Pull Request - State: closed - Opened by codeviking 2 months ago

#767 - Added OLMo-2-1124 intermediate checkpoints

Pull Request - State: closed - Opened by aman-17 2 months ago

#766 - Sudden data error during training

Issue - State: closed - Opened by faresobeid 2 months ago - 17 comments
Labels: type/bug

#765 - tokenizer.encode function`s param add_special_tokens=False not work.

Issue - State: open - Opened by xiaohan2909 2 months ago
Labels: type/bug

#764 - Pass dir as str to wandb.init

Pull Request - State: closed - Opened by 2015aroras 2 months ago

#764 - Pass dir as str to wandb.init

Pull Request - State: open - Opened by 2015aroras 2 months ago

#763 - How to inspect training data in a particular batch?

Issue - State: closed - Opened by explanare 2 months ago - 3 comments
Labels: type/question

#762 - Difference Between DDP and FSDP Modes

Issue - State: closed - Opened by lllabmaster 3 months ago - 1 comment
Labels: type/question

#761 - adding example script for hosting an OpenAI API server for OLMo 2 on Modal.com

Pull Request - State: closed - Opened by cnewell 3 months ago - 4 comments

#760 - update mean reduction zloss to ignore labels == ignore_index

Pull Request - State: open - Opened by jasonkrone 3 months ago

#759 - How to train the tinymodel(Like 300M or 150M)

Issue - State: closed - Opened by yongding-tao 3 months ago - 2 comments
Labels: type/question

#758 - Question about the OLMo2 Stage 2 training procedures: was the optimizer state from Stage 1 used during the training of Stage 2?

Issue - State: closed - Opened by Taoer1996 3 months ago - 2 comments
Labels: type/question

#757 - About eos_token_id in config file (20M, 1B)

Issue - State: open - Opened by lllabmaster 3 months ago
Labels: type/question

#756 - Adds extra configs for anneals to the Readme

Pull Request - State: closed - Opened by dirkgr 3 months ago

#755 - OLMo-2 held-out validation data

Issue - State: closed - Opened by chawins 3 months ago - 3 comments
Labels: type/question

#754 - More checkpoint information

Pull Request - State: closed - Opened by dirkgr 3 months ago

#753 - Figure for plotting Pareto frontier (Flops x Perf)

Pull Request - State: closed - Opened by kyleclo 3 months ago - 1 comment

#752 - Add OLMo 2 checkpoint converter and update docs

Pull Request - State: closed - Opened by 2015aroras 3 months ago

#751 - Update README.md

Pull Request - State: closed - Opened by revbucket 3 months ago - 1 comment

#750 - Legal Whammy for 7B

Pull Request - State: closed - Opened by dirkgr 3 months ago

#749 - Barely Legal Whammy

Pull Request - State: closed - Opened by dirkgr 3 months ago

#748 - Add test and train sets to in-loop oe-eval (for ladder work)

Pull Request - State: closed - Opened by liujch1998 3 months ago

#747 - Add intermediate size to hf_olmo

Pull Request - State: closed - Opened by 2015aroras 3 months ago - 2 comments

#746 - Difference between 0724 and 0424 7B models

Issue - State: closed - Opened by jiahai-feng 3 months ago - 1 comment
Labels: type/documentation

#745 - Documentation Improvements

Pull Request - State: closed - Opened by aman-17 3 months ago
Labels: type/documentation

#744 - dependency issue when running scripts/unshard.py

Issue - State: closed - Opened by viking-sudo-rm 4 months ago - 2 comments
Labels: type/bug

#743 - TypeError - running example code

Issue - State: closed - Opened by KPK101 4 months ago - 1 comment
Labels: type/bug

#742 - Improved support for Google Storage

Pull Request - State: closed - Opened by dirkgr 4 months ago

#741 - Fail to load tokenizer for checkpoints

Issue - State: open - Opened by tresiwald 4 months ago
Labels: type/bug

#740 - Adds support for converting from safetensors

Pull Request - State: open - Opened by soldni 4 months ago

#739 - Peteish13

Pull Request - State: closed - Opened by dirkgr 4 months ago - 1 comment

#738 - Annealing configs

Pull Request - State: closed - Opened by dirkgr 4 months ago - 1 comment

#737 - Error Encountered During Multi-Node Pretraining with Torchrun

Issue - State: open - Opened by Zehui127 4 months ago
Labels: type/bug

#736 - Create an eval-only script for existing ckpts

Pull Request - State: open - Opened by liujch1998 4 months ago - 1 comment

#735 - fixed up changelog

Pull Request - State: closed - Opened by revbucket 4 months ago

#734 - reduce the dataset size - update readme for default conda environment

Pull Request - State: closed - Opened by amazloumi 4 months ago

#733 - Update version.py

Pull Request - State: closed - Opened by revbucket 4 months ago

#732 - OLMo Checkpoints Website Down?

Issue - State: closed - Opened by jhsansom 4 months ago - 2 comments
Labels: type/bug

#731 - Adding script for processing many intermediate checkpoints at once for offline evals

Pull Request - State: open - Opened by IanMagnusson 5 months ago - 2 comments

#730 - Add regression tests for training

Pull Request - State: open - Opened by 2015aroras 5 months ago - 1 comment

#729 - I added some script to help people set up the env on vista

Pull Request - State: closed - Opened by leo-liuzy 5 months ago

#728 - Getting training data by sources

Issue - State: closed - Opened by chawins 5 months ago - 2 comments
Labels: type/question

#727 - Compile support for peteish13

Pull Request - State: closed - Opened by dirkgr 5 months ago

#726 - Missing OLMo checkpoints

Issue - State: open - Opened by mirandrom 5 months ago - 1 comment

#725 - Fix build errors

Pull Request - State: closed - Opened by 2015aroras 5 months ago

#724 - Update LUMI scripts

Pull Request - State: closed - Opened by 2015aroras 5 months ago

#723 - docker

Issue - State: closed - Opened by jacky080808 5 months ago - 1 comment
Labels: type/question

#722 - 8-bit allgather support

Issue - State: closed - Opened by yaroslavvb 5 months ago - 1 comment
Labels: type/question

#721 - Bump torch version

Pull Request - State: closed - Opened by vwxyzjn 5 months ago - 1 comment

#716 - Performance degrades after converting checkpoint to HF

Issue - State: closed - Opened by ahmadshapiro 6 months ago - 1 comment
Labels: type/question

#715 - Expected Data Format

Issue - State: open - Opened by aflah02 6 months ago - 1 comment
Labels: type/question

#714 - Which mmlu validation setting is recommend?

Issue - State: open - Opened by mathfinder 6 months ago - 1 comment
Labels: type/question

#713 - Criteria for Selecting acc vs. len_norm Metrics

Issue - State: closed - Opened by mathfinder 6 months ago - 1 comment
Labels: type/question

#706 - OLMoThreadError: generator thread data thread 0 failed

Issue - State: closed - Opened by ybdesire 6 months ago - 2 comments
Labels: type/question

#697 - Number of tokens Olmo-1B was trained: 2T or 3T?

Issue - State: closed - Opened by jiyeonkimd 7 months ago - 1 comment
Labels: type/question

#695 - Gflops computation is faulty for FSDP due to bug in `OLMo.num_params()`

Issue - State: open - Opened by AkshitaB 7 months ago - 1 comment

#692 - why CrossEntropyLoss is zero,i

Issue - State: closed - Opened by aizhweiwei 7 months ago - 2 comments
Labels: type/question

#687 - Extend functionality of Wandb Config Diff script

Pull Request - State: closed - Opened by kyleclo 7 months ago - 1 comment

#682 - DNM: Loss issue checkpoint with refine1b setups

Pull Request - State: open - Opened by undfined 7 months ago

#678 - Initial Loss increased from 10 (0.3.0 v) to 60 (0.4.0) !

Issue - State: closed - Opened by Xuekai-Zhu 7 months ago - 10 comments
Labels: type/bug

#677 - Ladder 1xC

Pull Request - State: open - Opened by AkshitaB 7 months ago

#675 - Alternative evals

Pull Request - State: open - Opened by AkshitaB 7 months ago

#655 - Can long text be splitted into short texts?

Issue - State: open - Opened by CoinCheung 7 months ago - 1 comment
Labels: type/question

#639 - MoE

Pull Request - State: open - Opened by Muennighoff 8 months ago - 5 comments

#632 - How the 1B and 7B model are initialized?

Issue - State: open - Opened by sanyalsunny111 8 months ago - 1 comment
Labels: type/question

#609 - Finetuning config file

Issue - State: closed - Opened by joellliu 9 months ago - 5 comments
Labels: type/question

#596 - why is the total_grad_norm increasing across training?

Issue - State: open - Opened by ryanyxw 9 months ago - 12 comments
Labels: type/question
