Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / dvlab-research/LongLoRA issues and pull requests

#195 - not able to reproduce the passkey retrieval accuracy

Issue - State: open - Opened by zhuconv about 2 months ago - 4 comments

#194 - LongBench evaluation

Issue - State: open - Opened by Clement25 3 months ago

#193 - Is supervised fine-tuning supported for models such as GPT2?

Issue - State: open - Opened by CharRic 3 months ago

#192 - How was the LongAlpaca data constructed?

Issue - State: open - Opened by S1s-Z 3 months ago

#190 - Don't norm layers have no parameter matrix?

Issue - State: open - Opened by changanxunyi 5 months ago

#189 - Update README.md

Pull Request - State: open - Opened by Dominic789654 5 months ago

#187 - The model cannot produce normal output at all

Issue - State: closed - Opened by Tangent-90C 6 months ago - 1 comment

#186 - Why does the embedding need to be resized to 32001?

Issue - State: open - Opened by momandai 6 months ago

#185 - Something wrong with the torch version

Issue - State: open - Opened by dian1414 6 months ago

#181 - HF models missing rope scaling in the config

Issue - State: open - Opened by hsiehjackson 9 months ago

#180 - Machine doesn't install Flash Attention

Issue - State: open - Opened by huilong-chen 9 months ago

#179 - global_step file

Issue - State: open - Opened by xxcoco763 9 months ago

#178 - Add callback for saving trainable parameters and model config

Pull Request - State: open - Opened by GirinMan 9 months ago

#177 - Regarding the results in Table 8 and Table 14

Issue - State: open - Opened by Statisticss 9 months ago

#174 - Memory usage "too small" for 7B Llama-2

Issue - State: open - Opened by Linohong 10 months ago

#171 - Distributed inference issue

Issue - State: open - Opened by yixliu1 10 months ago

#167 - streaming llm problem

Issue - State: open - Opened by seanxuu 10 months ago

#164 - Can LongLoRA be mixed with YaRN?

Issue - State: open - Opened by DevNullx64 11 months ago

#163 - GPU memory allocation during inference

Issue - State: open - Opened by xxcoco763 11 months ago - 2 comments

#162 - Adapting to new models

Issue - State: open - Opened by epinnock 11 months ago - 2 comments

#160 - Lora+deepspeed zero3: unable to save lora weights

Issue - State: closed - Opened by AresXD 11 months ago - 6 comments

#156 - What extrapolation method is used during training?

Issue - State: open - Opened by IT-five 11 months ago

#155 - Is fine-tuning supported for Chinese models such as qwen and baichuan?

Issue - State: open - Opened by kevinuserdd 11 months ago

#154 - inference OOM

Issue - State: open - Opened by PharMolix 11 months ago

#152 - Can LongLoRA be used for incremental pre-training?

Issue - State: open - Opened by Zheng-Jay 11 months ago

#150 - Fine-tuning data

Issue - State: closed - Opened by Go4miii 12 months ago

#149 - Group divisibility problem during inference

Issue - State: closed - Opened by Michelleable 12 months ago - 1 comment

#148 - LongLoRA + Flash Attention 2 causing illegal memory access

Issue - State: open - Opened by ArturNiederfahrenhorst 12 months ago - 7 comments

#147 - 32k inference result is garbled

Issue - State: open - Opened by zhanglv0209 12 months ago - 8 comments

#146 - torch.cuda.OutOfMemoryError: CUDA out of memory

Issue - State: closed - Opened by zhanglv0209 12 months ago - 3 comments

#145 - Progress in the Chinese-language domain

Issue - State: closed - Opened by ccp123456789 12 months ago - 1 comment

#144 - Added multiple GPUs evaluation.

Pull Request - State: closed - Opened by weicheng113 12 months ago - 1 comment

#141 - Question about inference: Llama-2-7b-longlora-8k-ft outputs nothing

Issue - State: closed - Opened by ysanimals 12 months ago - 4 comments

#140 - Inquiry Regarding the Tokenize Function

Issue - State: closed - Opened by thanaphatt1 12 months ago - 3 comments

#139 - To save model in HF format after supervised-fine-tune-qlora

Issue - State: open - Opened by MyBruso 12 months ago - 7 comments

#138 - How did you design questions and answers in the LongQA dataset?

Issue - State: closed - Opened by finallymint 12 months ago - 1 comment

#137 - How to eval Llama-2-7b-longlora-16k-ft?

Issue - State: closed - Opened by rabi-fei almost 1 year ago - 4 comments

#136 - Perplexity Validation Error

Issue - State: closed - Opened by panpanli521 almost 1 year ago - 2 comments

#135 - SFT Problem: Attention Mask doesn't match

Issue - State: closed - Opened by Busdriver26 about 1 year ago - 1 comment

#134 - Confused with eval.py perplexity implementation

Issue - State: closed - Opened by weicheng113 about 1 year ago - 1 comment

#133 - Cannot Convert Checkpoint to Trainable Model

Issue - State: open - Opened by believewhat about 1 year ago - 3 comments

#132 - intel xpu qlora support related code changes

Pull Request - State: closed - Opened by rnwang04 about 1 year ago

#131 - intel xpu qlora support related code changes

Pull Request - State: closed - Opened by rnwang04 about 1 year ago

#130 - bitsandbytes library version error with sft

Issue - State: closed - Opened by Breno-de-Angelo about 1 year ago - 1 comment

#129 - How to train LongLoRA step-by-step?

Issue - State: closed - Opened by dhcode-cpp about 1 year ago - 1 comment

#128 - uploaded inference script using qlora

Pull Request - State: closed - Opened by zhounu about 1 year ago - 1 comment

#127 - Torch.compile switches model back to training mode

Issue - State: closed - Opened by gianlucamacri about 1 year ago - 1 comment

#126 - Help to confirm understanding of forward_flashattn

Issue - State: closed - Opened by weicheng113 about 1 year ago - 2 comments

#124 - fix starting token repetition

Pull Request - State: closed - Opened by gianlucamacri about 1 year ago - 1 comment

#123 - Saving pytorch_model.bin with QLORA

Issue - State: closed - Opened by grimulkan about 1 year ago - 7 comments

#122 - No LongLora 100K Llama 2 7B?

Issue - State: closed - Opened by TamirHCL about 1 year ago

#121 - Model training information?

Issue - State: closed - Opened by TamirHCL about 1 year ago - 6 comments

#120 - Could you provide code for S^2 Attention inference?

Issue - State: open - Opened by hxs91 about 1 year ago - 4 comments

#119 - About sft experiment results

Issue - State: closed - Opened by AresXD about 1 year ago - 5 comments

#118 - Transformers <= 4.34.0 requirement

Issue - State: closed - Opened by Breno-de-Angelo about 1 year ago - 3 comments

#117 - Model differences?

Issue - State: closed - Opened by TamirHCL about 1 year ago - 2 comments

#116 - Catch none-valued rope scaling configs

Pull Request - State: closed - Opened by j-frei about 1 year ago - 1 comment

#115 - supervised fine_tuning for domain specific question-answering

Issue - State: closed - Opened by MyBruso about 1 year ago - 2 comments

#114 - turning exception into warning for flash attention inference

Pull Request - State: closed - Opened by gianlucamacri about 1 year ago - 1 comment

#113 - Added management of rope factor in previous configuration

Pull Request - State: closed - Opened by gianlucamacri about 1 year ago - 1 comment

#111 - RedPajama-Data-1T-Sample tokenization stuck

Issue - State: closed - Opened by weicheng113 about 1 year ago - 6 comments

#110 - Hardware requirements for 7B 100k

Issue - State: closed - Opened by nedRad88 about 1 year ago - 1 comment

#107 - support multiple round conversation

Issue - State: closed - Opened by coranholmes about 1 year ago - 15 comments
Labels: enhancement

#106 - Abnormal loss curve for supervised fine tuning on one GPU

Issue - State: closed - Opened by Oscilloscope98 about 1 year ago - 6 comments

#103 - Question: Why use "instruct" prompting on top of original LLaMa-2 prompting?

Issue - State: closed - Opened by pseudotensor about 1 year ago - 3 comments
Labels: enhancement

#102 - zero_to_fp32

Issue - State: closed - Opened by bdytx5 about 1 year ago - 2 comments

#99 - What's the difference between finetune and supervised-finetune?

Issue - State: closed - Opened by zejunwang1 about 1 year ago - 2 comments

#98 - Get trainable weights from SFT

Issue - State: closed - Opened by mces89 about 1 year ago - 2 comments

#97 - 70B SFT out of memory?

Issue - State: closed - Opened by mces89 about 1 year ago - 2 comments

#96 - Code/model inference bug?

Issue - State: closed - Opened by xxzcc about 1 year ago - 1 comment

#95 - add input host and port in args for demo

Pull Request - State: closed - Opened by jayxio about 1 year ago - 1 comment

#94 - Is it possible to use with Mistral or Zephyr models?

Issue - State: closed - Opened by versae about 1 year ago - 1 comment

#93 - Applied flash attention usage

Issue - State: closed - Opened by gyuwon12 about 1 year ago - 5 comments

#92 - Chinese long-text model

Issue - State: closed - Opened by ccp123456789 about 1 year ago - 1 comment

#91 - supervised fine tuning 7b GPU requirement - CUDA out of memory

Issue - State: closed - Opened by weicheng113113 about 1 year ago - 22 comments

#90 - the rolling problem

Issue - State: closed - Opened by teslacool about 1 year ago - 1 comment