OpenNMT/CTranslate2 issues and pull requests

#1360 - Representing non-ASCII characters using ASCII

Pull Request - State: closed - Opened by BrightXiaoHan over 1 year ago - 2 comments

#1355 - LLAMA 2 support [Question] [Enhancement]

Issue - State: closed - Opened by trholding over 1 year ago - 4 comments

#1351 - While using the latest meta model (LLAMA-2-7b-chat-hf) converted with ctranslate2, getting ValueError: DequantizeGemmOutput: output should have a float type

Issue - State: closed - Opened by Apoorv7092 over 1 year ago - 5 comments

#1349 - Support left padding to forward batch prompts in a single step

Issue - State: open - Opened by guillaumekln over 1 year ago
Labels: enhancement

#1348 - A keyerror is raised when using the FALCON 40B model converted by ctranslate2

Issue - State: closed - Opened by srimouli04 over 1 year ago - 9 comments

#1343 - CPP inference Error. ** Error in `./run': double free or corruption (!prev):

Issue - State: closed - Opened by ustcdane over 1 year ago - 7 comments

#1337 - Get encoding from flan T5

Issue - State: open - Opened by Alexander-Jin over 1 year ago - 1 comment

#1333 - Continuous batching

Issue - State: open - Opened by andreapiso over 1 year ago - 6 comments
Labels: enhancement

#1330 - CMake error: CUDA_cublas_LIBRARY set to NOTFOUND

Issue - State: closed - Opened by Geremia over 1 year ago - 4 comments

#1329 - Code for chat inference server

Issue - State: closed - Opened by hobodrifterdavid over 1 year ago - 19 comments

#1324 - Exception when exporting bloomz model

Issue - State: open - Opened by jordimas over 1 year ago - 2 comments
Labels: bug

#1322 - How to use custom stopping criteria with the parameter callback in generate_batch() function

Issue - State: closed - Opened by curname over 1 year ago - 6 comments

#1320 - ct2-transformers-converter fails on falcon-rw-1b

Issue - State: closed - Opened by julianmukaj over 1 year ago - 3 comments
Labels: bug

#1306 - This CTranslate2 package was not compiled with CUDA support

Issue - State: closed - Opened by ciayomin over 1 year ago - 15 comments

#1300 - Request to support FlashAttention in cuda attention.cc

Issue - State: closed - Opened by nemoramo over 1 year ago - 23 comments
Labels: enhancement

#1296 - BERT Models: Huge difference in last hidden states of similar examples

Issue - State: closed - Opened by vakkov over 1 year ago - 5 comments

#1285 - Question for asynchronous in generate_batch

Issue - State: closed - Opened by Snowdar over 1 year ago - 2 comments

#1283 - can't convert opennmt.py model with alibi or rotary embeddings to ctranslate2

Issue - State: open - Opened by totaltube over 1 year ago - 8 comments
Labels: enhancement

#1250 - CUDA 12 support (libcublas.so.11 is not found)

Issue - State: closed - Opened by digitalsignalperson over 1 year ago - 25 comments

#1239 - Keep FFN output layer in float32 for T5 models

Pull Request - State: open - Opened by guillaumekln almost 2 years ago - 3 comments

#1238 - Do not hardcode the library major version in CMakeLists.txt

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1237 - same GPU memory between GPT2-13B-fp16 to GPT2-13B-int8 in CTranslate2

Issue - State: open - Opened by vicwer almost 2 years ago - 2 comments

#1236 - Repeated text with Marian Model generation

Issue - State: open - Opened by zzgsty almost 2 years ago - 1 comment

#1235 - Fix compilation with BUILD_SHARED_LIBS=OFF

Pull Request - State: closed - Opened by panosk almost 2 years ago - 3 comments

#1234 - Assisted Generation feature

Issue - State: open - Opened by wsxiaoys almost 2 years ago

#1233 - Quantized RedPajama responds in chinese

Issue - State: closed - Opened by NeonBohdan almost 2 years ago - 1 comment

#1232 - ValueError: Tokenizer class BloomTokenizer does not exist or is not currently imported.

Issue - State: open - Opened by moseshu almost 2 years ago - 1 comment

#1231 - Support for GPTBigCodeForCausalLM (StarCoder/ SantaCoder)

Issue - State: open - Opened by michaelfeil almost 2 years ago - 4 comments

#1230 - Adding support for transformers - Salesforce/CodeGen architecture

Pull Request - State: closed - Opened by michaelfeil almost 2 years ago - 3 comments

#1229 - Support for CodeT5pConfig

Issue - State: open - Opened by ferboz almost 2 years ago - 1 comment

#1228 - Support the MPT model from MosaicML

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1227 - Support paths with Unicode characters on Windows

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1226 - Generalize conversion of encoder-decoder models from OpenNMT-tf

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1225 - Raise asynchronous exception from generate_tokens method

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1224 - Invalid opcode on older AMD opteron cpu (avx support?)

Issue - State: closed - Opened by agittins almost 2 years ago - 3 comments

#1223 - add fairseq nllb instructions

Issue - State: open - Opened by Omicronlawful almost 2 years ago - 1 comment

#1222 - Fix installation of Intel MKL package in manylinux2014

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1221 - add onmt-py converter for llama-onmt mpt-onmt

Pull Request - State: closed - Opened by vince62s almost 2 years ago

#1220 - lmsys/fastchat-t5-3b-v1.0: inconsistent generated output with converted model

Issue - State: closed - Opened by Matthieu-Tinycoaching almost 2 years ago - 12 comments

#1219 - CTranslate2 can support Llama?

Issue - State: closed - Opened by moseshu almost 2 years ago - 1 comment

#1218 - binary version v67324752 load problem

Issue - State: closed - Opened by syngokhan almost 2 years ago - 3 comments

#1217 - Extract last hidden state

Issue - State: open - Opened by dathudeptrai almost 2 years ago - 4 comments
Labels: enhancement

#1216 - ct2-fairseq-converter --vocab_mapping

Issue - State: closed - Opened by Omicronlawful almost 2 years ago - 1 comment

#1215 - ct2-fairseq-converter

Issue - State: closed - Opened by Omicronlawful almost 2 years ago

#1214 - It works fine, but gives an error.

Issue - State: closed - Opened by mayjack0312 almost 2 years ago - 2 comments

#1213 - How to add context to translation models?

Issue - State: closed - Opened by eyalmazuz almost 2 years ago - 11 comments

#1212 - Support for Mosaic ML MPT 7B

Issue - State: closed - Opened by praneetreddy017 almost 2 years ago - 4 comments

#1211 - How to trans a model with Parallel encoder

Issue - State: open - Opened by wangshauitj almost 2 years ago - 1 comment

#1210 - Resume model execution from where it stopped

Issue - State: closed - Opened by NeonBohdan almost 2 years ago - 1 comment
Labels: enhancement

#1209 - Different generation parameters in the same batch

Issue - State: open - Opened by juliensalinas almost 2 years ago

#1208 - Python Interface AutoModelConvert for Huggingface Transformers

Issue - State: closed - Opened by michaelfeil almost 2 years ago - 3 comments

#1207 - Model running fine on cpu but not on gpu

Issue - State: closed - Opened by mayanksinha900 almost 2 years ago - 4 comments

#1206 - GPT-NeoX

Issue - State: open - Opened by palladium123 almost 2 years ago - 3 comments

#1205 - MKL not used when static linking with WHOLE_ARCHIVE

Issue - State: open - Opened by panosk almost 2 years ago - 4 comments

#1204 - Optimize rotary embedding recreation

Issue - State: closed - Opened by janekb04 almost 2 years ago - 2 comments

#1203 - How to build the prebuild binaries?

Issue - State: closed - Opened by JustFrederik almost 2 years ago - 7 comments

#1202 - support ChatGLM

Issue - State: open - Opened by nghuyong almost 2 years ago - 6 comments

#1201 - Manually destroy cuBLAS and cuDNN handles before threads exit

Pull Request - State: open - Opened by guillaumekln almost 2 years ago

#1200 - GPT-J on Tesla T4: the target device or backend do not support efficient float16 computation

Issue - State: closed - Opened by juliensalinas almost 2 years ago - 2 comments

#1199 - Does CT2 support loading of two GPUs

Issue - State: open - Opened by lx0126z almost 2 years ago - 1 comment

#1198 - Update docstring for end_token argument

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1197 - Older architectures are not inserted to CUDA_ARCH_LIST

Issue - State: closed - Opened by panosk almost 2 years ago - 2 comments

#1196 - CMake errors when using -DBUILD_SHARED_LIBS=OFF after #1178

Issue - State: closed - Opened by panosk almost 2 years ago - 14 comments

#1195 - Add option to keep the end token in the results

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1194 - Centos 8 support

Issue - State: closed - Opened by kolserdav almost 2 years ago - 1 comment

#1193 - Different output for inference running inside a docker container on CPU

Issue - State: closed - Opened by kafan1986 almost 2 years ago - 1 comment

#1192 - Fix vocabulary loading when some tokens end with the carriage return

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1191 - Update Ruy submodule to commit 363f2522

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1190 - Llama 7B generator incorrectly creates sequence_ids from input tokens

Issue - State: closed - Opened by vancoykendall almost 2 years ago - 1 comment

#1189 - Add token streaming example in the documentation

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1188 - Fallback to a custom threading implementation when OpenMP is not used

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1187 - Make callback argument public

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1186 - Support peft's LoRa for HF transformer models.

Issue - State: open - Opened by Palmik almost 2 years ago - 4 comments
Labels: enhancement

#1185 - How much memory do I need to convert GPT-NeoX?

Issue - State: closed - Opened by palladium123 almost 2 years ago - 2 comments

#1184 - Skip callback for prefix tokens with include_prompt_in_result=False

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1183 - Idea: support for embeddings

Issue - State: closed - Opened by janekb04 almost 2 years ago - 10 comments

#1182 - Accept multiple tokens in argument end_token

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1181 - Importing `numpy` after `ctranslate2` causes segfault

Issue - State: open - Opened by janekb04 almost 2 years ago

#1180 - Support GPT-NeoX architecture

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1179 - Implement fused rotary kernel

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1178 - Expose CTranslate2::ctranslate2 as CMake package

Pull Request - State: closed - Opened by HennerM almost 2 years ago - 1 comment

#1177 - Use non blocking CUDA streams

Pull Request - State: open - Opened by guillaumekln almost 2 years ago

#1176 - GPT-NeoX

Issue - State: closed - Opened by sudo-carson almost 2 years ago - 1 comment
Labels: enhancement

#1175 - CUDA 12 Support

Issue - State: closed - Opened by jzju almost 2 years ago - 3 comments

#1174 - Prevent generating segments with zero duration

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1173 - Expose flag low_cpu_mem_usage of Transformers method from_pretrained

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1172 - Also load weights as FP16 when converting to int8_float16

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1171 - Raise an error if end_token or suppress_sequences contain OOV tokens

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1170 - Fix Whisper.align crash when num_frames//2 <= median_filter_width

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1169 - Idea: Add CoreML / Apple Neural Engine backend

Issue - State: open - Opened by janekb04 almost 2 years ago - 4 comments
Labels: enhancement

#1168 - Add alternative rotary implementation that slices the head dimensions

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1167 - Support variable number of frames in Whisper.align batch implementation

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1166 - Optimize quantization of FP16 weights during conversion

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1165 - Add method generate_tokens to return a generator of tokens

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1164 - Add note on Visual C++ runtime dependency on Windows

Pull Request - State: closed - Opened by jordimas almost 2 years ago

#1163 - CTranslate2 missing dependency in Windows

Issue - State: closed - Opened by jordimas almost 2 years ago - 1 comment

#1162 - Train Model Load

Issue - State: open - Opened by syngokhan almost 2 years ago - 3 comments

#1161 - Token Streaming

Issue - State: closed - Opened by henyee almost 2 years ago - 2 comments

#1160 - Expose parameters ffn_glu and rms_norm for TransformerDecoderModelSpec

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago

#1159 - Clarify that the Docker image contains the required NVIDIA libs

Pull Request - State: closed - Opened by guillaumekln almost 2 years ago
