jzhang38/TinyLlama issues and pull requests

#200 - Can this model reply correct answer?

Issue - State: open - Opened by xkungxfu 15 days ago

#199 - How to Load the Pretrained Dataset after Downloading?

Issue - State: open - Opened by JT-Ushio about 2 months ago

#198 - error while loading Tiny llama

Issue - State: open - Opened by survivebycoding 2 months ago

#197 - will it work on cpu?

Issue - State: open - Opened by survivebycoding 2 months ago

#196 - inconsistent size when I use the huggingface model

Issue - State: open - Opened by mathetian 2 months ago

#195 - TinyLlama Chat1.1B Q4

Issue - State: open - Opened by igor967 3 months ago

#193 - Hello, is there code for instruction fine-tuning TinyLlama? And what does the fine-tuning template format look like?

Issue - State: closed - Opened by fang-siqi 4 months ago - 1 comment

#173 - Why FSDP not DPP？

Issue - State: open - Opened by noforit 8 months ago - 1 comment

#171 - Pretraining failing on IndexError: list index out of range in file packed_dataset.py

Issue - State: closed - Opened by databillm 8 months ago - 2 comments

#168 - Help me pls

Issue - State: open - Opened by aritralegndery 9 months ago - 3 comments

#164 - revise continue train from initial_iter

Pull Request - State: open - Opened by peiji1981 9 months ago - 2 comments

#100 - ncclRemoteError

Issue - State: closed - Opened by JerryDaHeLian about 1 year ago - 2 comments

#99 - Can anyone pre train tinyllama. py on v100s?

Issue - State: closed - Opened by JerryDaHeLian about 1 year ago - 1 comment

#98 - Update EVAL.md

Pull Request - State: closed - Opened by jzhang38 about 1 year ago

#97 - Consider providing safetensor files

Issue - State: closed - Opened by Calandiel about 1 year ago - 2 comments

#96 - Update EVAL.md

Pull Request - State: closed - Opened by TianduoWang about 1 year ago

#95 - Does it support Chinese conversation?

Issue - State: closed - Opened by xman1991 about 1 year ago - 1 comment

#94 - Status of chat model

Issue - State: closed - Opened by galleon about 1 year ago - 2 comments

#93 - data mixture

Issue - State: closed - Opened by NonvolatileMemory about 1 year ago - 1 comment

#92 - Update init

Pull Request - State: closed - Opened by jzhang38 about 1 year ago

#91 - test

Pull Request - State: closed - Opened by TianduoWang about 1 year ago

#90 - Update convert_lit_checkpoint.py

Pull Request - State: closed - Opened by TianduoWang about 1 year ago

#89 - fix: Gradient problem when the number of devices is 1

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago

#88 - Gradient problem when the number of devices is 1

Issue - State: closed - Opened by SivilTaram about 1 year ago - 4 comments

#87 - Update prepare_starcoder.py

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago

#86 - How to know the max_step

Issue - State: closed - Opened by wangxidong06 about 1 year ago - 1 comment

#85 - fix bugs

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago

#84 - Releasing TinyLlama T1.5

Issue - State: closed - Opened by erfanzar about 1 year ago - 1 comment

#83 - Why do we not set the `ignore_index` of `FusedCrossEntropy` to `bos_id`?

Issue - State: closed - Opened by larrylawl about 1 year ago - 3 comments

#82 - quantized version like GPTQ or INT4?

Issue - State: closed - Opened by tigerinus about 1 year ago - 1 comment

#81 - 1.5T release?

Issue - State: closed - Opened by binarycrayon about 1 year ago - 2 comments

#80 - What is the purpose of the "sanity check" which in the tinyllama.py?

Issue - State: closed - Opened by JerryDaHeLian about 1 year ago - 2 comments

#79 - How to disable flash attention?

Issue - State: closed - Opened by ZhouqyCH about 1 year ago - 3 comments

#78 - TinyLlama is not working with huggingface assisted generated

Issue - State: closed - Opened by kapilsprinklr about 1 year ago - 1 comment

#77 - Why is there a significant drop in `val_ppl` after fixing data-loading bug?

Issue - State: open - Opened by bobqianic about 1 year ago - 7 comments

#76 - Which tokenizer settings were used to process the dataset?

Issue - State: closed - Opened by awaelchli about 1 year ago - 3 comments

#75 - A question about Dataset combination

Issue - State: closed - Opened by PeiqinSun about 1 year ago - 3 comments

#74 - The queries generated a lot of repetitions. Possible to provide 1T again "fix"?

Issue - State: open - Opened by hiqsociety about 1 year ago - 6 comments

#73 - How to get the huggingface format model and config?

Issue - State: closed - Opened by SinclairCoder about 1 year ago - 6 comments

#72 - Current state of TinyLlama

Issue - State: closed - Opened by ghost about 1 year ago - 1 comment

#71 - How to control the num of epochs?

Issue - State: closed - Opened by SinclairCoder about 1 year ago - 1 comment

#70 - Update README.md: TinyLlama 1.5T checkpoint postponed

Pull Request - State: closed - Opened by TianduoWang about 1 year ago

#69 - Fix bug in dataloader

Pull Request - State: closed - Opened by jzhang38 about 1 year ago - 1 comment

#68 - release of 1.5T checkpoint

Issue - State: closed - Opened by sadransh about 1 year ago - 4 comments

#67 - Possible bug in TinyLlama's dataloading

Issue - State: closed - Opened by larrylawl about 1 year ago - 9 comments

#66 - compare train loss with larger models

Issue - State: closed - Opened by sadransh about 1 year ago - 2 comments

#65 - great work! can you do a mistral 1b tinyllama?

Issue - State: closed - Opened by hiqsociety about 1 year ago - 5 comments

#64 - How do you choose the value of batch-size?

Issue - State: closed - Opened by PeiqinSun about 1 year ago - 1 comment

#63 - Replay Finetuning & store as GGML

Issue - State: closed - Opened by galleon about 1 year ago - 1 comment

#62 - How to speedup tokenizer.encode?

Issue - State: closed - Opened by PeiqinSun about 1 year ago - 2 comments

#61 - Can the code run in a Windows environment?

Issue - State: closed - Opened by JerryDaHeLian about 1 year ago - 1 comment

#60 - Converting Saved Model Files to Hugging Face Transformers Format

Issue - State: closed - Opened by dtxwhzw about 1 year ago - 2 comments

#59 - .

Issue - State: closed - Opened by ypxie about 1 year ago

#58 - fix special tokens for tokenizer in sft scripts

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago

#57 - Very very poor perf using faraday and amd gpu ?

Issue - State: closed - Opened by maxime-fleury about 1 year ago - 1 comment

#56 - Problem with TinyLlama-1.1B-Chat-v0.3 tokenizer

Issue - State: closed - Opened by galleon about 1 year ago - 5 comments

#55 - Notes on chat fine-tuning and datacontent

Issue - State: closed - Opened by RonanKMcGovern about 1 year ago - 4 comments

#54 - Could you share the list of Python environment packages?

Issue - State: closed - Opened by lightcome about 1 year ago - 5 comments

#53 - Running on CPU using llama.cpp

Issue - State: closed - Opened by galleon about 1 year ago - 10 comments

#52 - Continue pretrain

Pull Request - State: closed - Opened by jzhang38 about 1 year ago

#51 - Jzhang38 patch 2

Pull Request - State: closed - Opened by jzhang38 about 1 year ago

#50 - update eval results for 1T checkpoint

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago - 1 comment

#49 - when will be 1T check point ready?

Issue - State: closed - Opened by xiaoyunwu about 1 year ago - 7 comments

#48 - How to finetune on custom dataset

Issue - State: closed - Opened by hrsmanian about 1 year ago - 11 comments

#47 - Why train three epochs? not one epoch?

Issue - State: closed - Opened by PeiqinSun about 1 year ago - 1 comment

#46 - Any plans for the ONNX runtime?

Issue - State: closed - Opened by VatsaDev about 1 year ago - 3 comments

#45 - Update README_zh-CN.md

Pull Request - State: closed - Opened by ChaosCodes about 1 year ago

#44 - Downloading the dataset is inconvenient

Issue - State: closed - Opened by scorpjr1 about 1 year ago - 3 comments

#43 - Update prepare_slimpajama.py

Pull Request - State: closed - Opened by michael-c-max about 1 year ago

#42 - How to compute token numbers for a dataset？

Issue - State: closed - Opened by Arcmoon-Hu about 1 year ago - 4 comments

#41 - Why is the vocab size of `TinyLlama-1.1B-Chat-V0.1` 32001?

Issue - State: closed - Opened by Chillee about 1 year ago - 5 comments

#40 - Question Regarding the Absence of BOS and EOS Tokens in Tokenizer Encoding

Issue - State: closed - Opened by dtxwhzw about 1 year ago - 4 comments

#39 - Release format + artefact

Issue - State: closed - Opened by PierreColombo about 1 year ago - 3 comments

#38 - Update EVAL.md

Pull Request - State: closed - Opened by jzhang38 about 1 year ago

#37 - Why is Swiglu packed_weights = False?

Issue - State: closed - Opened by larrylawl about 1 year ago - 1 comment

#36 - Resuming training

Issue - State: open - Opened by artnoage about 1 year ago - 8 comments

#35 - TinyLlama-1.1B-orca-gpt4

Issue - State: closed - Opened by acalatrava about 1 year ago - 1 comment

#34 - info when load model

Issue - State: closed - Opened by shyoulala about 1 year ago - 3 comments

#33 - How did you determine the size of the TinyLlama model?

Issue - State: closed - Opened by dtxwhzw about 1 year ago - 2 comments

#32 - eval loss become nan after a single batch

Issue - State: closed - Opened by ThibaultCastells about 1 year ago - 4 comments

#31 - Request: Finetune the Model on more Data?

Issue - State: closed - Opened by VatsaDev about 1 year ago - 1 comment

#30 - Working Chat Demo

Pull Request - State: closed - Opened by VatsaDev about 1 year ago - 8 comments

#29 - TinyLlama-chat outputs truncated/small?

Issue - State: closed - Opened by VatsaDev about 1 year ago - 1 comment

#28 - Fix the finetune directory link in README.md

Pull Request - State: closed - Opened by VatsaDev about 1 year ago - 1 comment

#27 - Minimum learning rate

Issue - State: closed - Opened by artnoage about 1 year ago - 1 comment

#26 - Model mirror within China

Issue - State: closed - Opened by Ma-Yongqiang about 1 year ago - 9 comments

#25 - Why does a dimension mismatch occur when I use AutoModelForCausalLM to load a model?

Issue - State: closed - Opened by BaenRH about 1 year ago - 2 comments

#24 - Getting gibberish output when running on llama.cpp

Issue - State: closed - Opened by luungoc2005 about 1 year ago - 33 comments

#23 - Why is tokenizer.model_max_length set to 1000000000000000019884624838656?

Issue - State: closed - Opened by kevinhu about 1 year ago - 2 comments

#22 - A guide to adding more datasets

Issue - State: closed - Opened by VatsaDev about 1 year ago - 3 comments

#21 - Has anyone used this code base for incremental pretraining of llama-2-7b?

Issue - State: closed - Opened by s1ghhh about 1 year ago - 4 comments

#20 - I want to use this model

Issue - State: closed - Opened by ChuXNobody about 1 year ago - 17 comments

#19 - where is "rotary_emb"?

Issue - State: closed - Opened by ScottishFold007 about 1 year ago - 1 comment

#18 - Credit to the FlashAttention repo

Pull Request - State: closed - Opened by tridao about 1 year ago - 1 comment

#17 - Are there any provided 4bit quant weights, or like a colab detailing quantization?

Issue - State: closed - Opened by VatsaDev about 1 year ago - 2 comments

#16 - Simple WebUI for the project

Pull Request - State: closed - Opened by VatsaDev about 1 year ago - 5 comments

#15 - Cleaned Notebook

Issue - State: closed - Opened by VatsaDev about 1 year ago - 2 comments

#14 - Can it run on CPU?

Issue - State: closed - Opened by abdul-jabbar-ms about 1 year ago - 10 comments

#13 - How to train model with databricks-dolly-15k.jsonl dataset format.

Issue - State: closed - Opened by TapendraBaduwal about 1 year ago - 4 comments

#12 - Have you considered code llama?

Issue - State: closed - Opened by IgorTodorovskiIBM about 1 year ago - 7 comments
