Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / dvlab-research/LongLoRA issues and pull requests
#195 - not able to reproduce the passkey retrieval accuracy
Issue -
State: open - Opened by zhuconv about 2 months ago
- 4 comments
#194 - LongBench evaluation
Issue -
State: open - Opened by Clement25 3 months ago
#193 - Is supervised fine-tuning supported for models such as GPT2?
Issue -
State: open - Opened by CharRic 3 months ago
#192 - How LongAlpaca Data was constructed?
Issue -
State: open - Opened by S1s-Z 3 months ago
#191 - Does this code support fine-tuning qwen/baichuan into a Chinese long-text model, and what code changes would be needed?
Issue -
State: open - Opened by jy-101361-1810897 4 months ago
#190 - Doesn't the norm layer have no parameter matrix?
Issue -
State: open - Opened by changanxunyi 5 months ago
#189 - Update README.md
Pull Request -
State: open - Opened by Dominic789654 5 months ago
#188 - I am unable to reproduce the results from the paper for llama-7B-32k-longlora ppl.
Issue -
State: open - Opened by masteryqq 6 months ago
- 1 comment
#187 - The model cannot produce normal output at all
Issue -
State: closed - Opened by Tangent-90C 6 months ago
- 1 comment
#186 - Why is the embedding resized to 32001?
Issue -
State: open - Opened by momandai 6 months ago
#185 - Something wrong with the torch version
Issue -
State: open - Opened by dian1414 6 months ago
#184 - What trainset is used to obtain “Model with context extension via improved LoRA fine-tuning” (LoRA+)?
Issue -
State: open - Opened by ZackZikaiXiao 7 months ago
#183 - How did make questions and answers for long context(LongAlpaca)?
Issue -
State: open - Opened by ddoyles 8 months ago
#182 - When I set `per_device_train_batch_size=2`, the S2-Attn would not shift as expected
Issue -
State: open - Opened by linhaojia13 9 months ago
- 2 comments
#181 - HF models missing rope scaling in the config
Issue -
State: open - Opened by hsiehjackson 9 months ago
#180 - Machine don't install Flash Attention
Issue -
State: open - Opened by huilong-chen 9 months ago
#179 - global_step file
Issue -
State: open - Opened by xxcoco763 9 months ago
#178 - Add callback for saving trainable parameters and model config
Pull Request -
State: open - Opened by GirinMan 9 months ago
#177 - Regarding the results in Table 8 and Table 14
Issue -
State: open - Opened by Statisticss 9 months ago
#176 - About the different datasets and corresponding models
Issue -
State: open - Opened by Statisticss 9 months ago
#175 - The proof-pile/test-sample-ids is not the exact ids for the proof-pile-testsample.bin
Issue -
State: closed - Opened by pangjh3 10 months ago
#174 - Memory usage "too small" for 7B Llama-2
Issue -
State: open - Opened by Linohong 10 months ago
#173 - training a LLM w/ shifted sparse attention from the scratch?
Issue -
State: open - Opened by we1k 10 months ago
#172 - merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge
Issue -
State: open - Opened by Spongeorge 10 months ago
#171 - Distributed inference issue
Issue -
State: open - Opened by yixliu1 10 months ago
#170 - For the evaluation results in the paper, is the attention used at inference shifted sparse attention or full attention?
Issue -
State: open - Opened by zhangxiann 10 months ago
#169 - Is it possible to increase the context length of phi-2 using LongLora? If yes, what changes need to be done to support it?
Issue -
State: open - Opened by dbanka 10 months ago
- 1 comment
#168 - the value of loss is too unstable when supervised-finetune the 7b-100k-ft model
Issue -
State: open - Opened by seanxuu 10 months ago
- 1 comment
#167 - streaming llm problem
Issue -
State: open - Opened by seanxuu 10 months ago
#166 - How can I use the Llama-2-7b-longlora-100k-ft model correctly
Issue -
State: open - Opened by seanxuu 10 months ago
#165 - bug report : RuntimeError: probability tensor contains either inf, nan or element < 0
Issue -
State: open - Opened by seanxuu 10 months ago
#164 - Is LongLoRA can be mixed with YaRN ?
Issue -
State: open - Opened by DevNullx64 11 months ago
#163 - GPU memory allocation during inference
Issue -
State: open - Opened by xxcoco763 11 months ago
- 2 comments
#162 - Adapting to new models
Issue -
State: open - Opened by epinnock 11 months ago
- 2 comments
#161 - How to include training of the embed and norm layers in LoRA training?
Issue -
State: open - Opened by Zheng-Jay 11 months ago
#160 - Lora+deepspeed zero3: unable to save lora weights
Issue -
State: closed - Opened by AresXD 11 months ago
- 6 comments
#159 - What llama attn replacement to use for SFT-based inference?
Issue -
State: open - Opened by spring1915 11 months ago
#158 - With no error reported, LongAlpaca-7B responds only to the first paragraph of the text
Issue -
State: open - Opened by waleyW 11 months ago
#157 - Configs in inference.py necessary for context length expansion in model serving?
Issue -
State: open - Opened by spring1915 11 months ago
#156 - What extrapolation method is used during training?
Issue -
State: open - Opened by IT-five 11 months ago
#155 - Is fine-tuning of Chinese models such as qwen and baichuan supported?
Issue -
State: open - Opened by kevinuserdd 11 months ago
#154 - inference OOM
Issue -
State: open - Opened by PharMolix 11 months ago
#153 - Is LongAlpaca model fine-tuned from llama-2 or the Alpaca model?
Issue -
State: open - Opened by Mooler0410 11 months ago
#152 - Can LongLoRA be used for incremental pre-training?
Issue -
State: open - Opened by Zheng-Jay 11 months ago
#151 - the current text generation call will exceed the model's predefined maximum length (4096)
Issue -
State: open - Opened by waleyW 12 months ago
- 4 comments
#150 - Fine-tuning data
Issue -
State: closed - Opened by Go4miii 12 months ago
#149 - Inference: group divisibility problem
Issue -
State: closed - Opened by Michelleable 12 months ago
- 1 comment
#148 - LongLoRA + Flash Attention 2 causing illegal memory access
Issue -
State: open - Opened by ArturNiederfahrenhorst 12 months ago
- 7 comments
#147 - 32k inference result is garbled
Issue -
State: open - Opened by zhanglv0209 12 months ago
- 8 comments
#146 - torch.cuda.OutOfMemoryError: CUDA out of memory
Issue -
State: closed - Opened by zhanglv0209 12 months ago
- 3 comments
#145 - Progress in the Chinese-language domain
Issue -
State: closed - Opened by ccp123456789 12 months ago
- 1 comment
#144 - Added multiple GPUs evaluation.
Pull Request -
State: closed - Opened by weicheng113 12 months ago
- 1 comment
#143 - After expanding the vocabulary, without changing other code or parameters, can the newly added tokens be trained during pre-training?
Issue -
State: closed - Opened by THUchenzhou 12 months ago
#142 - Questions about dynamic NTK interpolation fine-tuning and non-linear interpolation methods
Issue -
State: open - Opened by Yiyi-philosophy 12 months ago
- 1 comment
#141 - Question about inference use Llama-2-7b-longlora-8k-ft output nothing
Issue -
State: closed - Opened by ysanimals 12 months ago
- 4 comments
#140 - Inquiry Regarding the Tokenize Function
Issue -
State: closed - Opened by thanaphatt1 12 months ago
- 3 comments
#139 - To save model in HF format after supervised-fine-tune-qlora
Issue -
State: open - Opened by MyBruso 12 months ago
- 7 comments
#138 - How did you design questions and answers in the LongQA dataset?
Issue -
State: closed - Opened by finallymint 12 months ago
- 1 comment
#137 - How to eval Llama-2-7b-longlora-16k-ft?
Issue -
State: closed - Opened by rabi-fei almost 1 year ago
- 4 comments
#136 - Perplexity Validation Error
Issue -
State: closed - Opened by panpanli521 almost 1 year ago
- 2 comments
#135 - SFT Problem: Attention Mask doesn't match
Issue -
State: closed - Opened by Busdriver26 about 1 year ago
- 1 comment
#134 - Confused with eval.py perplexity implementation
Issue -
State: closed - Opened by weicheng113 about 1 year ago
- 1 comment
#133 - Cannot Convert Checkpoint to Trainable Model
Issue -
State: open - Opened by believewhat about 1 year ago
- 3 comments
#132 - intel xpu qlora support related code changes
Pull Request -
State: closed - Opened by rnwang04 about 1 year ago
#131 - intel xpu qlora support related code changes
Pull Request -
State: closed - Opened by rnwang04 about 1 year ago
#130 - Bitsandbytes library version error with sft
Issue -
State: closed - Opened by Breno-de-Angelo about 1 year ago
- 1 comment
#129 - How to train LongLoRA step-by-step ?
Issue -
State: closed - Opened by dhcode-cpp about 1 year ago
- 1 comment
#128 - uploaded inference script using qlora
Pull Request -
State: closed - Opened by zhounu about 1 year ago
- 1 comment
#127 - Torch.compile switches model back to training mode
Issue -
State: closed - Opened by gianlucamacri about 1 year ago
- 1 comment
#126 - Help to confirm understanding of forward_flashattn
Issue -
State: closed - Opened by weicheng113 about 1 year ago
- 2 comments
#125 - Is supervised-fine-tune.py required to run merge_lora_weight after fine-tuning?
Issue -
State: closed - Opened by caochuxueeee about 1 year ago
#124 - fix starting token repetition
Pull Request -
State: closed - Opened by gianlucamacri about 1 year ago
- 1 comment
#123 - Saving pytorch_model.bin with QLORA
Issue -
State: closed - Opened by grimulkan about 1 year ago
- 7 comments
#122 - No LongLora 100K Llama 2 7B?
Issue -
State: closed - Opened by TamirHCL about 1 year ago
#121 - Model training information?
Issue -
State: closed - Opened by TamirHCL about 1 year ago
- 6 comments
#120 - Could you provide code for S^2 Attention inference?
Issue -
State: open - Opened by hxs91 about 1 year ago
- 4 comments
#119 - About the sft experiment results
Issue -
State: closed - Opened by AresXD about 1 year ago
- 5 comments
#118 - Transformers <= 4.34.0 requirement
Issue -
State: closed - Opened by Breno-de-Angelo about 1 year ago
- 3 comments
#117 - Model differences?
Issue -
State: closed - Opened by TamirHCL about 1 year ago
- 2 comments
#116 - Catch none-valued rope scaling configs
Pull Request -
State: closed - Opened by j-frei about 1 year ago
- 1 comment
#115 - supervised fine_tuning for domain specific question-answering
Issue -
State: closed - Opened by MyBruso about 1 year ago
- 2 comments
#114 - turning exception into warning for flash attention inference
Pull Request -
State: closed - Opened by gianlucamacri about 1 year ago
- 1 comment
#113 - Added management of rope factor in previous configuration
Pull Request -
State: closed - Opened by gianlucamacri about 1 year ago
- 1 comment
#111 - RedPajama-Data-1T-Sample tokenization stuck
Issue -
State: closed - Opened by weicheng113 about 1 year ago
- 6 comments
#110 - Hardware requirements for 7B 100k
Issue -
State: closed - Opened by nedRad88 about 1 year ago
- 1 comment
#107 - support multiple round conversation
Issue -
State: closed - Opened by coranholmes about 1 year ago
- 15 comments
Labels: enhancement
#106 - Abnormal loss curve for supervised fine tuning on one GPU
Issue -
State: closed - Opened by Oscilloscope98 about 1 year ago
- 6 comments
#103 - Question: Why use "instruct" prompting on top of original LLaMa-2 prompting?
Issue -
State: closed - Opened by pseudotensor about 1 year ago
- 3 comments
Labels: enhancement
#102 - zero_to_fp32
Issue -
State: closed - Opened by bdytx5 about 1 year ago
- 2 comments
#100 - Can your code be used to continue SFT on a model obtained by fine-tuning llama-2-7b-chat-hf on a Chinese corpus?
Issue -
State: closed - Opened by YinSonglin1997 about 1 year ago
- 6 comments
#99 - What's the difference between finetune and supervised-finetune?
Issue -
State: closed - Opened by zejunwang1 about 1 year ago
- 2 comments
#98 - Get trainable weights from SFT
Issue -
State: closed - Opened by mces89 about 1 year ago
- 2 comments
#97 - 70B SFT out of memory?
Issue -
State: closed - Opened by mces89 about 1 year ago
- 2 comments
#96 - Code/model inference bug?
Issue -
State: closed - Opened by xxzcc about 1 year ago
- 1 comment
#95 - addinput host and port in args for demo
Pull Request -
State: closed - Opened by jayxio about 1 year ago
- 1 comment
#94 - Is it possible to use with Mistral or Zephyr models?
Issue -
State: closed - Opened by versae about 1 year ago
- 1 comment
#93 - Applied flash attention usage
Issue -
State: closed - Opened by gyuwon12 about 1 year ago
- 5 comments
#92 - Chinese long-text model
Issue -
State: closed - Opened by ccp123456789 about 1 year ago
- 1 comment
#91 - supervised fine tuning 7b GPU requirement - CUDA out of memory
Issue -
State: closed - Opened by weicheng113113 about 1 year ago
- 22 comments
#90 - the rolling problem
Issue -
State: closed - Opened by teslacool about 1 year ago
- 1 comment