Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / dvlab-research/LongLoRA issues and pull requests
#195 - not able to reproduce the passkey retrieval accuracy
Issue -
State: open - Opened by zhuconv 5 months ago
- 4 comments
#194 - LongBench evaluation
Issue -
State: open - Opened by Clement25 5 months ago
#193 - Is supervised fine-tuning supported for models like GPT2?
Issue -
State: open - Opened by CharRic 6 months ago
#192 - How LongAlpaca Data was constructed?
Issue -
State: open - Opened by S1s-Z 6 months ago
#191 - Does this code support fine-tuning qwen/baichuan into a Chinese long-context model, and what code changes would be needed?
Issue -
State: open - Opened by jy-101361-1810897 7 months ago
#190 - Doesn't the norm layer have no parameter matrix?
Issue -
State: open - Opened by changanxunyi 8 months ago
#189 - Update README.md
Pull Request -
State: open - Opened by Dominic789654 8 months ago
#188 - I am unable to reproduce the results from the paper for llama-7B-32k-longlora ppl.
Issue -
State: open - Opened by masteryqq 8 months ago
- 1 comment
#187 - The model cannot produce normal output at all
Issue -
State: closed - Opened by Tangent-90C 9 months ago
- 1 comment
#186 - Why is the embedding resized to 32001?
Issue -
State: open - Opened by momandai 9 months ago
#185 - Something wrong with the torch version
Issue -
State: open - Opened by dian1414 9 months ago
#184 - What's the trainset is used to obtain “Model with context extension via improved LoRA fine-tuning” (LoRA+)?
Issue -
State: open - Opened by ZackZikaiXiao 10 months ago
#183 - How did make questions and answers for long context(LongAlpaca)?
Issue -
State: open - Opened by ddoyles 11 months ago
#182 - When I set `per_device_train_batch_size=2`, the S2-Attn would not shift as expected
Issue -
State: open - Opened by linhaojia13 11 months ago
- 2 comments
#181 - HF models missing rope scaling in the config
Issue -
State: open - Opened by hsiehjackson 11 months ago
#180 - Machine don't install Flash Attention
Issue -
State: open - Opened by huilong-chen 12 months ago
#179 - global_step file
Issue -
State: open - Opened by xxcoco763 12 months ago
#178 - Add callback for saving trainable parameters and model config
Pull Request -
State: open - Opened by GirinMan 12 months ago
#177 - Regarding the results in Table 8 and Table 14
Issue -
State: open - Opened by Statisticss about 1 year ago
#176 - About the different datasets and corresponding models
Issue -
State: open - Opened by Statisticss about 1 year ago
#175 - The proof-pile/test-sample-ids is not the exact ids for the proof-pile-testsample.bin
Issue -
State: closed - Opened by pangjh3 about 1 year ago
#174 - Memory usage "too small" for 7B Llama-2
Issue -
State: open - Opened by Linohong about 1 year ago
#173 - training a LLM w/ shifted sparse attention from the scratch?
Issue -
State: open - Opened by we1k about 1 year ago
#172 - merge_lora_weights_and_save_hf_model.py Error while deserializing header: HeaderTooLarge
Issue -
State: open - Opened by Spongeorge about 1 year ago
#171 - Distributed inference issue
Issue -
State: open - Opened by yixliu1 about 1 year ago
#170 - For the evaluation results in the paper, is the attention used at inference shifted sparse attention or full attention?
Issue -
State: open - Opened by zhangxiann about 1 year ago
#169 - Is it possible to increase the context length of phi-2 using LongLora? If yes, what changes need to be done to support it?
Issue -
State: open - Opened by dbanka about 1 year ago
- 1 comment
#168 - the value of loss is too unstable when supervised-finetune the 7b-100k-ft model
Issue -
State: open - Opened by seanxuu about 1 year ago
- 1 comment
#167 - streaming llm problem
Issue -
State: open - Opened by seanxuu about 1 year ago
#166 - How can I use the Llama-2-7b-longlora-100k-ft model correctly
Issue -
State: open - Opened by seanxuu about 1 year ago
#165 - bug report : RuntimeError: probability tensor contains either inf, nan or element < 0
Issue -
State: open - Opened by seanxuu about 1 year ago
#164 - Is LongLoRA can be mixed with YaRN ?
Issue -
State: open - Opened by DevNullx64 about 1 year ago
#163 - GPU memory allocation during inference
Issue -
State: open - Opened by xxcoco763 about 1 year ago
- 2 comments
#162 - Adapting to new models
Issue -
State: open - Opened by epinnock about 1 year ago
- 2 comments
#161 - How to include training of the embed and norm layers in LoRA training?
Issue -
State: open - Opened by Zheng-Jay about 1 year ago
#160 - Lora+deepspeed zero3: unable to save lora weights
Issue -
State: closed - Opened by AresXD about 1 year ago
- 6 comments
#159 - What llama attn replacement to use for SFT-based inference?
Issue -
State: open - Opened by spring1915 about 1 year ago
#158 - With no error reported, LongAlpaca-7B responds only to the first paragraph of the text
Issue -
State: open - Opened by waleyW about 1 year ago
#157 - Configs in inference.py necessary for context length expansion in model serving?
Issue -
State: open - Opened by spring1915 about 1 year ago
#156 - What extrapolation method is used during training?
Issue -
State: open - Opened by IT-five about 1 year ago
#155 - Is fine-tuning of Chinese models such as qwen and baichuan supported?
Issue -
State: open - Opened by kevinuserdd about 1 year ago
#154 - inference OOM
Issue -
State: open - Opened by PharMolix about 1 year ago
#153 - Is LongAlpaca model fine-tuned from llama-2 or the Alpaca model?
Issue -
State: open - Opened by Mooler0410 about 1 year ago
#152 - Can LongLoRA be used for incremental pre-training?
Issue -
State: open - Opened by Zheng-Jay about 1 year ago
#151 - the current text generation call will exceed the model's predefined maximum length (4096)
Issue -
State: open - Opened by waleyW about 1 year ago
- 4 comments
#150 - Fine-tuning data
Issue -
State: closed - Opened by Go4miii about 1 year ago
#149 - Inference: group divisibility problem
Issue -
State: closed - Opened by Michelleable about 1 year ago
- 1 comment
#148 - LongLoRA + Flash Attention 2 causing illegal memory access
Issue -
State: open - Opened by ArturNiederfahrenhorst about 1 year ago
- 7 comments
#147 - 32k inference result is garbled
Issue -
State: open - Opened by zhanglv0209 about 1 year ago
- 8 comments
#146 - torch.cuda.OutOfMemoryError: CUDA out of memory
Issue -
State: closed - Opened by zhanglv0209 about 1 year ago
- 3 comments
#145 - Progress in the Chinese domain
Issue -
State: closed - Opened by ccp123456789 about 1 year ago
- 1 comment
#144 - Added multiple GPUs evaluation.
Pull Request -
State: closed - Opened by weicheng113 about 1 year ago
- 1 comment
#143 - After expanding the vocabulary, without changing other code or parameters, can the newly added tokens be trained during pre-training?
Issue -
State: closed - Opened by THUchenzhou about 1 year ago
#142 - Questions about dynamic NTK interpolation fine-tuning and non-linear interpolation methods
Issue -
State: open - Opened by Yiyi-philosophy about 1 year ago
- 1 comment
#141 - Question about inference use Llama-2-7b-longlora-8k-ft output nothing
Issue -
State: closed - Opened by ysanimals about 1 year ago
- 4 comments
#140 - Inquiry Regarding the Tokenize Function
Issue -
State: closed - Opened by thanaphatt1 about 1 year ago
- 3 comments
#139 - To save model in HF format after supervised-fine-tune-qlora
Issue -
State: open - Opened by MyBruso about 1 year ago
- 7 comments
#138 - How did you design questions and answers in the LongQA dataset?
Issue -
State: closed - Opened by finallymint about 1 year ago
- 1 comment
#137 - How to eval Llama-2-7b-longlora-16k-ft?
Issue -
State: closed - Opened by rabi-fei about 1 year ago
- 4 comments
#136 - Perplexity Validation Error
Issue -
State: closed - Opened by panpanli521 about 1 year ago
- 2 comments
#135 - SFT Problem: Attention Mask doesn't match
Issue -
State: closed - Opened by Busdriver26 about 1 year ago
- 1 comment
#134 - Confused with eval.py perplexity implementation
Issue -
State: closed - Opened by weicheng113 about 1 year ago
- 1 comment
#133 - Cannot Convert Checkpoint to Trainable Model
Issue -
State: open - Opened by believewhat about 1 year ago
- 3 comments
#132 - intel xpu qlora support related code changes
Pull Request -
State: closed - Opened by rnwang04 about 1 year ago
#131 - intel xpu qlora support related code changes
Pull Request -
State: closed - Opened by rnwang04 about 1 year ago
#130 - Bitsandbytes library version error with sft
Issue -
State: closed - Opened by Breno-de-Angelo about 1 year ago
- 1 comment
#129 - How to train LongLoRA step-by-step ?
Issue -
State: closed - Opened by dhcode-cpp about 1 year ago
- 1 comment
#128 - uploaded inference script using qlora
Pull Request -
State: closed - Opened by zhounu over 1 year ago
- 1 comment
#127 - Torch.compile switches model back to training mode
Issue -
State: closed - Opened by gianlucamacri over 1 year ago
- 1 comment
#126 - Help to confirm understanding of forward_flashattn
Issue -
State: closed - Opened by weicheng113 over 1 year ago
- 2 comments
#125 - Is supervised-fine-tune.py required to run merge_lora_weight after fine-tuning?
Issue -
State: closed - Opened by caochuxueeee over 1 year ago
#124 - fix starting token repetition
Pull Request -
State: closed - Opened by gianlucamacri over 1 year ago
- 1 comment
#123 - Saving pytorch_model.bin with QLORA
Issue -
State: closed - Opened by grimulkan over 1 year ago
- 7 comments
#122 - No LongLora 100K Llama 2 7B?
Issue -
State: closed - Opened by TamirHCL over 1 year ago
#121 - Model training information?
Issue -
State: closed - Opened by TamirHCL over 1 year ago
- 6 comments
#120 - Could you provide code for S^2 Attention inference?
Issue -
State: open - Opened by hxs91 over 1 year ago
- 4 comments
#119 - About the SFT experimental results
Issue -
State: closed - Opened by AresXD over 1 year ago
- 5 comments
#118 - Transformers <= 4.34.0 requirement
Issue -
State: closed - Opened by Breno-de-Angelo over 1 year ago
- 3 comments
#117 - Model differences?
Issue -
State: closed - Opened by TamirHCL over 1 year ago
- 2 comments
#116 - Catch none-valued rope scaling configs
Pull Request -
State: closed - Opened by j-frei over 1 year ago
- 1 comment
#115 - supervised fine_tuning for domain specific question-answering
Issue -
State: closed - Opened by MyBruso over 1 year ago
- 2 comments
#114 - turning exception into warning for flash attention inference
Pull Request -
State: closed - Opened by gianlucamacri over 1 year ago
- 1 comment
#113 - Added management of rope factor in previous configuration
Pull Request -
State: closed - Opened by gianlucamacri over 1 year ago
- 1 comment
#111 - RedPajama-Data-1T-Sample tokenization stuck
Issue -
State: closed - Opened by weicheng113 over 1 year ago
- 6 comments
#110 - Hardware requirements for 7B 100k
Issue -
State: closed - Opened by nedRad88 over 1 year ago
- 1 comment
#107 - support multiple round conversation
Issue -
State: closed - Opened by coranholmes over 1 year ago
- 15 comments
Labels: enhancement
#106 - Abnormal loss curve for supervised fine tuning on one GPU
Issue -
State: closed - Opened by Oscilloscope98 over 1 year ago
- 6 comments
#103 - Question: Why use "instruct" prompting on top of original LLaMa-2 prompting?
Issue -
State: closed - Opened by pseudotensor over 1 year ago
- 3 comments
Labels: enhancement
#102 - zero_to_fp32
Issue -
State: closed - Opened by bdytx5 over 1 year ago
- 2 comments
#100 - Can your code be used to continue SFT on a model obtained by fine-tuning llama-2-7b-chat-hf on a Chinese corpus?
Issue -
State: closed - Opened by YinSonglin1997 over 1 year ago
- 6 comments
#99 - What's the difference between finetune and supervised-finetune?
Issue -
State: closed - Opened by zejunwang1 over 1 year ago
- 2 comments
#98 - Get trainable weights from SFT
Issue -
State: closed - Opened by mces89 over 1 year ago
- 2 comments
#97 - 70B SFT out of memory?
Issue -
State: closed - Opened by mces89 over 1 year ago
- 2 comments
#96 - Code/model inference bug?
Issue -
State: closed - Opened by xxzcc over 1 year ago
- 1 comment
#95 - addinput host and port in args for demo
Pull Request -
State: closed - Opened by jayxio over 1 year ago
- 1 comment
#94 - Is it possible to use with Mistral or Zephyr models?
Issue -
State: closed - Opened by versae over 1 year ago
- 1 comment
#93 - Applied flash attention usage
Issue -
State: closed - Opened by gyuwon12 over 1 year ago
- 5 comments
#92 - Chinese long-context model
Issue -
State: closed - Opened by ccp123456789 over 1 year ago
- 1 comment
#91 - supervised fine tuning 7b GPU requirement - CUDA out of memory
Issue -
State: closed - Opened by weicheng113113 over 1 year ago
- 22 comments
#90 - the rolling problem
Issue -
State: closed - Opened by teslacool over 1 year ago
- 1 comment