mosaicml/llm-foundry issues and pull requests

#127 - The error of test the Dataloader

Issue - State: closed - Opened by sysusicily over 1 year ago - 3 comments

#126 - How to choose a Chinese tokenizer?

Issue - State: open - Opened by sysusicily over 1 year ago

#125 - How to use composer to train mpt-7b on a single gpu?

Issue - State: closed - Opened by LisaWang0306 over 1 year ago - 3 comments

#124 - Finetuning with llm-foundry (composer format)

Issue - State: open - Opened by xgal over 1 year ago - 1 comment

#123 - MPT-7b models take very long to load (before actual model is loaded)

Issue - State: open - Opened by SinanAkkoyun over 1 year ago

#122 - How to install MPT-7B?

Issue - State: closed - Opened by SinanAkkoyun over 1 year ago - 10 comments

#121 - Fix licensing typo & context window typo

Pull Request - State: closed - Opened by brianjking over 1 year ago - 1 comment

#120 - MPT-7B Finetuning Jupyter notebook request

Issue - State: closed - Opened by GeorvityLabs over 1 year ago - 7 comments

#119 - Crash in hf_chat.py after lots of activity with longer output

Issue - State: closed - Opened by patrickhwood over 1 year ago - 4 comments

#118 - Set pad_token_id to tokenizer.pad_token_id if not set on command line

Pull Request - State: open - Opened by patrickhwood over 1 year ago

#117 - GPU memory issue

Issue - State: open - Opened by zhranj over 1 year ago - 3 comments

#116 - Add features to hf_generate

Pull Request - State: closed - Opened by alextrott16 over 1 year ago

#115 - larger scale of mpt

Issue - State: closed - Opened by WangJW424 over 1 year ago - 2 comments

#114 - how to run in v100 GPU

Issue - State: open - Opened by sysusicily over 1 year ago - 5 comments

#113 - fatal error: cuda.h: No such file or directory

Issue - State: open - Opened by Babramson over 1 year ago - 1 comment

#112 - does the non commercial license of the Chat model come from the dataset used to finetune?

Issue - State: closed - Opened by vince62s over 1 year ago - 1 comment

#111 - What are the hardware requirements?

Issue - State: closed - Opened by soloist-tech over 1 year ago - 1 comment

#110 - Add minimum `mosaicml-streaming` version

Pull Request - State: closed - Opened by hanlint over 1 year ago

#109 - Precision.AMP_BF16 is not supported for CPU training

Issue - State: closed - Opened by wj210 over 1 year ago - 2 comments

#108 - How to fine-tune instruct mpt-7b model?

Issue - State: open - Opened by dydx-git over 1 year ago - 6 comments

#107 - the problem about mosaicml-streaming

Issue - State: closed - Opened by sysusicily over 1 year ago - 1 comment

#106 - the problem about mosaicml-streaming

Issue - State: closed - Opened by sysusicily over 1 year ago - 1 comment

#105 - Generate shorter sentences

Issue - State: closed - Opened by NarenZen over 1 year ago - 1 comment

#104 - Question about the "streaming" package

Issue - State: closed - Opened by LisaWang0306 over 1 year ago - 1 comment

#103 - Consider using PyTorch 2.0 version of FlashAttention (remove dependency on flash-attn)

Issue - State: closed - Opened by Sciumo over 1 year ago - 2 comments

#102 - Update dataloader.py

Pull Request - State: closed - Opened by nelsontkq over 1 year ago

#101 - Make mpt7b finetuning more obvious

Pull Request - State: closed - Opened by samhavens over 1 year ago - 1 comment

#100 - Error in FSDP with composer

Issue - State: closed - Opened by bjoernpl over 1 year ago - 3 comments

#99 - Protobuf version conflict

Issue - State: closed - Opened by junaidid over 1 year ago - 1 comment

#98 - Add slack and license buttons to readme

Pull Request - State: closed - Opened by growlix over 1 year ago

#97 - Disable image for pypi

Pull Request - State: closed - Opened by mvpatel2000 over 1 year ago

#96 - FasterTransformer inference with MPT-7b

Issue - State: closed - Opened by SinanAkkoyun over 1 year ago - 4 comments

#95 - How to install torch 1.13.1+cu117?

Issue - State: closed - Opened by ighodgao over 1 year ago - 4 comments

#94 - Finetune MPT models with local dataset

Issue - State: closed - Opened by arpitkk over 1 year ago - 15 comments

#93 - Finetune MPT-7B with 48GiB of VRAM?

Issue - State: closed - Opened by juanps90 over 1 year ago - 1 comment

#92 - getting messy response (response includes #'s)

Issue - State: closed - Opened by NarenZen over 1 year ago

#91 - How to use the train.py finetuning the pre-trained MPT-7B?

Issue - State: open - Opened by metacarbon over 1 year ago - 1 comment

#90 - hf dict cfg overrides

Pull Request - State: closed - Opened by vchiley over 1 year ago - 1 comment

#89 - TypeError: Object of type DictConfig is not JSON serializable for HFCausalLM

Issue - State: closed - Opened by zanussbaum over 1 year ago - 2 comments

#88 - Evaluation result mismatch

Issue - State: closed - Opened by congyingxia over 1 year ago - 9 comments

#87 - Fix sed command for xentropy

Pull Request - State: closed - Opened by mvpatel2000 over 1 year ago

#86 - Remove xentropy from pypi

Pull Request - State: closed - Opened by mvpatel2000 over 1 year ago

#85 - Updates to prefixlm and t5

Pull Request - State: closed - Opened by alextrott16 over 1 year ago - 2 comments

#84 - Bump composer version

Pull Request - State: closed - Opened by vchiley over 1 year ago

#83 - HF Auth Token issue

Issue - State: closed - Opened by ighodgao over 1 year ago - 4 comments

#82 - Broken on docker image?

Issue - State: closed - Opened by tginart over 1 year ago - 14 comments

#81 - Installation issue

Issue - State: closed - Opened by ighodgao over 1 year ago - 4 comments

#80 - Fix pypi

Pull Request - State: closed - Opened by mvpatel2000 over 1 year ago

#79 - URL not found

Issue - State: closed - Opened by dariocazzani over 1 year ago - 4 comments

#78 - Error in Triton implementation

Issue - State: closed - Opened by NarenZen over 1 year ago - 1 comment

#77 - Error in triton implementation

Issue - State: closed - Opened by NarenZen over 1 year ago

#76 - Error in triton implementation

Issue - State: closed - Opened by NarenZen over 1 year ago

#75 - does it work on local machine or someone with limited resources

Issue - State: closed - Opened by rabsher over 1 year ago - 6 comments

#74 - Remove todo in workflow

Pull Request - State: closed - Opened by mvpatel2000 over 1 year ago

#73 - Update version

Pull Request - State: closed - Opened by dakinggg over 1 year ago

#72 - Update README.md

Pull Request - State: closed - Opened by ejyuen over 1 year ago

#71 - Windows support ?

Issue - State: closed - Opened by deepbeepmeep over 1 year ago - 7 comments

#70 - Configs say optimizer is AdamW but blog post says the optimizer is LION

Issue - State: closed - Opened by Craigacp over 1 year ago - 3 comments

#69 - Add venv instructions to readme

Pull Request - State: closed - Opened by dakinggg over 1 year ago

#68 - Finetuning

Issue - State: closed - Opened by canamika27 over 1 year ago - 7 comments

#67 - FasterTransformer

Issue - State: closed - Opened by xgal over 1 year ago - 34 comments

#66 - multilingual ability

Issue - State: closed - Opened by yangjianxin1 over 1 year ago - 10 comments

#65 - This project does not rely on DeepSpeed and Megatron?

Issue - State: closed - Opened by lc222 over 1 year ago - 1 comment
Labels: good first issue

#64 - Not an issue, a question - Peft/LoRa finetuning a possibility?

Issue - State: open - Opened by jamesd256 over 1 year ago - 11 comments

#63 - Issues following instructions on Ubuntu 22.04

Issue - State: closed - Opened by jamesd256 over 1 year ago - 8 comments

#62 - Fused Cross Entropy is not installed. Either (1) have a CUDA-compatible GPU and `pip install .[gpu]`, or (2) set your config model.loss_fn=torch_crossentropy.

Issue - State: closed - Opened by Stanlito-AI over 1 year ago - 6 comments

#61 - could you share links to the WandB runs of your trained models?

Issue - State: closed - Opened by DanqingZ over 1 year ago - 1 comment

#60 - Add GGML support

Issue - State: closed - Opened by jploski over 1 year ago - 5 comments

#59 - Reproduce result of Boolq on LLaMA-7B

Issue - State: closed - Opened by mx8435 over 1 year ago - 14 comments

#58 - mismatched config compared with blog

Issue - State: closed - Opened by donglixp over 1 year ago - 3 comments

#57 - Can we use PyTorch Version 2 to train models?

Issue - State: closed - Opened by linhduongtuan over 1 year ago - 1 comment

#56 - Any plans for 13B+ models?

Issue - State: closed - Opened by artyemk over 1 year ago - 2 comments

#55 - Update inference benchmarking script

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago - 1 comment

#54 - getting OOM on 8 nvidia GPUs with 40GB memory each

Issue - State: closed - Opened by arpitg1991 over 1 year ago - 13 comments

#53 - Update generation scripts

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#52 - Update README.md

Pull Request - State: closed - Opened by eltociear over 1 year ago - 1 comment

#51 - Add links to README

Pull Request - State: closed - Opened by hanlint over 1 year ago

#50 - Update README.md

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#49 - name -> hf_name

Pull Request - State: closed - Opened by alextrott16 over 1 year ago

#48 - Update header photo, add `eval.py` to quickstart

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#47 - Adding loss to MPTForCausalLM fwd if labels arg is not None

Pull Request - State: closed - Opened by vchiley over 1 year ago - 2 comments

#46 - update readme and updt init

Pull Request - State: closed - Opened by vchiley over 1 year ago

#45 - Update README.md

Pull Request - State: closed - Opened by growlix over 1 year ago - 1 comment

#44 - Fix local link in train README

Pull Request - State: closed - Opened by sashaDoubov over 1 year ago

#43 - Fix broken links in inference README

Pull Request - State: closed - Opened by sashaDoubov over 1 year ago

#42 - Cleanup README, YAMLs, `hf_generate.py`

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#41 - fix inference scripts and readme

Pull Request - State: closed - Opened by dakinggg over 1 year ago

#40 - Icl tests

Pull Request - State: closed - Opened by bmosaicml over 1 year ago - 2 comments

#39 - Fix Llama spelling

Pull Request - State: closed - Opened by bcui19 over 1 year ago

#38 - Fix PIQA continuation_delimiters

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#35 - gate warning with verbose flag

Pull Request - State: closed - Opened by vchiley over 1 year ago

#33 - HF, JSON, and CSV dataset prep

Pull Request - State: closed - Opened by codestar12 over 1 year ago - 1 comment

#32 - Add LLaMa and MPT HFCausalLM support

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#25 - mv cfg defaults

Pull Request - State: closed - Opened by vchiley over 1 year ago

#23 - convert examples v004 state dict to llm-foundry state dict

Pull Request - State: closed - Opened by vchiley over 1 year ago - 2 comments

#16 - Improve HF model loading behavior

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#14 - inference instructor example

Pull Request - State: closed - Opened by RR4787 over 1 year ago - 1 comment

#13 - Make ICL configurations more concise

Pull Request - State: closed - Opened by abhi-mosaic over 1 year ago

#12 - updt model config; add attn config

Pull Request - State: closed - Opened by vchiley over 1 year ago

#11 - Update README.md

Pull Request - State: closed - Opened by jacobfulano over 1 year ago
