OpenNMT/CTranslate2 issues and pull requests

#1495 - Support yarn 128k model

Issue - State: open - Opened by trannhatquy about 1 year ago - 4 comments
Labels: enhancement

#1494 - Whisper alignment not working with batches

Issue - State: closed - Opened by Perpure about 1 year ago - 2 comments

#1493 - Difference in output for BART when beam size > 1

Issue - State: closed - Opened by shrey-singla about 1 year ago - 2 comments

#1492 - [Question] Model conversion

Issue - State: closed - Opened by FSchmidt-FUNKE about 1 year ago - 1 comment

#1491 - No conversion is registered for the model configuration Blip2Config

Issue - State: open - Opened by maiquanshen about 1 year ago - 2 comments

#1490 - CUDA 11.6: CUBLAS_STATUS_INVALID_VALUE on Gemm

Issue - State: closed - Opened by philip30 about 1 year ago - 1 comment

#1489 - Made it possible to return logits of each words

Pull Request - State: closed - Opened by philip30 about 1 year ago - 2 comments

#1488 - Memory increase

Issue - State: open - Opened by AIApprentice101 about 1 year ago - 5 comments

#1487 - Output is different when flan-t5-xl model is converted

Issue - State: closed - Opened by prashantkh19 about 1 year ago - 6 comments

#1486 - ctranslate2 compared to ggml/gguf and gptq

Issue - State: closed - Opened by BBC-Esq about 1 year ago - 3 comments

#1485 - Added converter for MixFormerSequentialForCausalLM

Pull Request - State: closed - Opened by AkashKarnatak about 1 year ago - 5 comments

#1484 - update install documentation

Pull Request - State: closed - Opened by zxdvd about 1 year ago

#1483 - Converting CodeLlama-34b-Python-hf

Issue - State: closed - Opened by BBC-Esq about 1 year ago - 2 comments

#1482 - Weird speed behavior int8* quantization

Issue - State: open - Opened by b-joris about 1 year ago - 2 comments

#1481 - Support for all llama2 models not just ones ending in "-hf"

Issue - State: closed - Opened by BBC-Esq about 1 year ago - 7 comments

#1480 - The impact of prompt length differences on inference speed

Issue - State: closed - Opened by zj2009 about 1 year ago - 2 comments

#1479 - Support Baichuan2?

Issue - State: open - Opened by lx0126z about 1 year ago - 6 comments

#1478 - Support for UMT5

Issue - State: open - Opened by QLutz about 1 year ago - 2 comments

#1477 - ID's from generate_batch and forward_batch don't agree

Issue - State: closed - Opened by RockyZhu29 about 1 year ago - 6 comments

#1476 - Remove unnecessary MKL build number in the Dockerfile

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1475 - Potential bugs in generating num hypothesis?

Issue - State: closed - Opened by waterhorse1 about 1 year ago - 4 comments

#1474 - Support Speculative Decoding

Issue - State: open - Opened by JOHW85 about 1 year ago - 5 comments
Labels: enhancement

#1473 - Support for MixFormerSequentialForCausalLM (Phi1.5)

Issue - State: closed - Opened by wsxiaoys about 1 year ago
Labels: enhancement

#1472 - attempting to convert tiiuae/falcon-180B-chat

Issue - State: closed - Opened by silvacarl2 about 1 year ago - 12 comments
Labels: enhancement

#1471 - Llama 2 example producing gibberish / German at times

Issue - State: closed - Opened by amrrs about 1 year ago - 3 comments

#1470 - Update requirements to include accelerate for llama demo

Pull Request - State: closed - Opened by amrrs about 1 year ago - 1 comment

#1469 - Support Constraint Generation

Issue - State: open - Opened by jgcb00 about 1 year ago - 1 comment
Labels: enhancement

#1468 - Remove epsilon in the Softmax CPU kernel for consistency

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1467 - Incorrect device index in `StorageView`

Issue - State: open - Opened by sumitupadhye12 about 1 year ago - 2 comments
Labels: enhancement

#1466 - can ct2-transformers-converter support tensorflow models?

Issue - State: closed - Opened by ILG2021 about 1 year ago - 2 comments

#1465 - support for gguf

Issue - State: closed - Opened by thistleknot about 1 year ago - 1 comment

#1464 - Optimize indexing in function negative_dtw

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1463 - Use a macro to factorize declaration of some float specializations

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1462 - I hope that in future versions, the Whisper-large-v2 model fine-tuned with peft can be converted :)

Issue - State: closed - Opened by YLQY about 1 year ago - 4 comments

#1461 - Merge BERT and DistilBERT test functions

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1460 - cannot find the profiling information

Issue - State: closed - Opened by zj2009 about 1 year ago - 3 comments

#1459 - Support for ELECTRA models

Issue - State: open - Opened by Hazqeel09 about 1 year ago - 1 comment
Labels: enhancement

#1458 - Encoder Cuda out of memory

Issue - State: closed - Opened by jgcb00 about 1 year ago - 1 comment

#1457 - Accept variable-length batch prompts for Whisper

Pull Request - State: open - Opened by guillaumekln about 1 year ago

#1456 - Codellama-34b conversion

Issue - State: closed - Opened by pshivraj about 1 year ago - 6 comments

#1455 - Mbart50 many to one conversion not working

Issue - State: closed - Opened by Latrolage about 1 year ago - 1 comment

#1454 - Support for TinyLLama 1.1B

Issue - State: closed - Opened by Apoorv7092 about 1 year ago - 3 comments

#1453 - Accept batch inputs in generate_tokens and add async_generate_tokens

Pull Request - State: closed - Opened by jgcb00 about 1 year ago - 7 comments

#1452 - Support for DeBERTa Models

Issue - State: open - Opened by sharhabeel about 1 year ago - 2 comments
Labels: enhancement

#1451 - Fix shape error in models using both MQA and relative positions

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1450 - Any way to manually clear the cache for static prompt for generator.generate_tokens?

Issue - State: open - Opened by waterhorse1 about 1 year ago - 3 comments

#1449 - Implement converter for FalconConfig (#1405)

Pull Request - State: closed - Opened by natsegal about 1 year ago

#1448 - Converting "embedding" models and running them on ctranslate2

Issue - State: closed - Opened by BBC-Esq about 1 year ago - 11 comments

#1447 - Fix max_time definition in PositionEncoder

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1446 - Relax shape checks for Whisper input features

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1445 - Make the RoPE base period configurable

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1444 - Increase revision of TransformerDecoderModelSpec to 8

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1443 - OpenNMT-py Multi Query not functioning in CT2

Issue - State: closed - Opened by ArtanisTheOne about 1 year ago - 6 comments

#1442 - Implement linear RoPE scaling

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1441 - Remove the default system prompt in the Llama 2 chat example

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1440 - Restore the original batch_id before calling the callback

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1439 - Build wheels for Python 3.12 and drop support for 3.7

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1438 - Fix AVX512 compilation with GCC 7

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1437 - Add property hypothesis_id in GenerationStepResult

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1436 - Support byte_fallback in inference callback

Issue - State: closed - Opened by wsxiaoys about 1 year ago - 1 comment

#1435 - NLLB-200 54b (MOE)

Issue - State: closed - Opened by ymoslem about 1 year ago - 1 comment
Labels: enhancement

#1434 - HF and LangChain support

Issue - State: closed - Opened by kripper about 1 year ago - 4 comments

#1433 - Is it possible to control the number of seconds the input audio file is chunked in CTranslate2 Whisper?

Issue - State: closed - Opened by Lorenzoncina about 1 year ago - 1 comment

#1432 - Example for XLM-RoBERTa

Issue - State: closed - Opened by sharhabeel about 1 year ago - 6 comments

#1431 - Weird behavior on V100 32GB

Issue - State: open - Opened by AmgadHasan about 1 year ago - 3 comments

#1430 - Can the quantized model be splitted to multi shards and loaded to different GPUs?

Issue - State: closed - Opened by ZJXNEFU about 1 year ago - 1 comment

#1429 - Support for Facebook's new SeamlessM4T (Multilingual + Multimodal)

Issue - State: open - Opened by Infinitay about 1 year ago - 4 comments
Labels: enhancement

#1428 - Instructions

Issue - State: closed - Opened by BBC-Esq about 1 year ago - 1 comment

#1427 - Size of converted "lmsys/vicuna-7b-v1.5" models

Issue - State: closed - Opened by Matthieu-Tinycoaching about 1 year ago - 2 comments

#1426 - Exception in c++ app

Issue - State: closed - Opened by Tonku about 1 year ago - 1 comment

#1425 - Token streaming and batch_ids

Issue - State: closed - Opened by fergusbarratt about 1 year ago - 3 comments
Labels: bug

#1424 - nucleus sampler problem?

Issue - State: open - Opened by SebastianBodza about 1 year ago - 4 comments

#1423 - Is batch streaming possible with the Text Generation functions?

Issue - State: closed - Opened by suhjohn about 1 year ago - 5 comments

#1422 - Why do I need to "import torch" first so ct2-transformers-converter doesn't give "free(): invalid pointer"?

Issue - State: closed - Opened by Geremia about 1 year ago - 2 comments

#1421 - Update model_spec.py

Pull Request - State: closed - Opened by SebastianBodza about 1 year ago

#1420 - whisper batch inference

Issue - State: closed - Opened by maiquanshen about 1 year ago - 2 comments

#1419 - How to use whisper inference using C++

Issue - State: closed - Opened by endink about 1 year ago - 2 comments

#1418 - token type ids can be set by optional argument up to python wrapper

Pull Request - State: closed - Opened by hachall about 1 year ago - 2 comments

#1417 - ct2-transformers-converter → "free(): invalid pointer"

Issue - State: closed - Opened by Geremia about 1 year ago - 2 comments

#1416 - Distributed mode

Issue - State: closed - Opened by EnricoBeltramo about 1 year ago - 5 comments
Labels: enhancement

#1415 - Huggingface added_tokens.json creates size mismatch in converter

Issue - State: closed - Opened by gbmarc1 about 1 year ago - 3 comments

#1414 - Unable to use translate_batch of the NLLB model via multiprocessing

Issue - State: closed - Opened by abhishekukmeesho about 1 year ago - 4 comments

#1413 - Different translation results from converted CTranslate2 model and original OpenNMT-py model.

Issue - State: closed - Opened by 13633491388 about 1 year ago - 5 comments

#1412 - Multi GPU

Issue - State: closed - Opened by zachNA2 about 1 year ago - 1 comment

#1411 - Loading model on low CPU memory

Issue - State: open - Opened by barschiiii about 1 year ago - 5 comments
Labels: enhancement

#1410 - Generator.generate_batch: Token X is not in the vocabulary when having a tokenizer with special tokens added?

Issue - State: closed - Opened by salahzoubi about 1 year ago

#1409 - KV caching?

Issue - State: closed - Opened by bryanhpchiang about 1 year ago - 1 comment

#1408 - Error occur when cmake kernels_avx512.cc.o

Issue - State: closed - Opened by Faken93 about 1 year ago - 1 comment

#1407 - int8 Quantization fails for LLaMa 13B

Issue - State: closed - Opened by QLutz about 1 year ago - 6 comments

#1406 - Compile error with master code if cpu not support avx512

Issue - State: closed - Opened by smallccn about 1 year ago - 1 comment

#1405 - Not able to convert falcon-rw-1b into ctranslate2

Issue - State: closed - Opened by prince-0911 about 1 year ago - 4 comments
Labels: enhancement

#1404 - Converted Vicuna-7b-v1.5-16k and longdoes not seem to generate sensible output

Issue - State: closed - Opened by chiiyeh about 1 year ago - 1 comment

#1403 - New vicuna 1.5-16k models are repeating words incorrectly

Issue - State: closed - Opened by marcelgoya about 1 year ago - 2 comments
Labels: enhancement

#1402 - Vectorize GEMM output dequantization and fuse bias & activation

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1401 - using whisper related c++ interface is slow

Issue - State: closed - Opened by syq163 about 1 year ago - 4 comments

#1400 - Use max function in CUDA ReLU functor

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1399 - Implement DistilBERT converter

Pull Request - State: closed - Opened by natsegal about 1 year ago

#1398 - Cleanup specialization of functor absolute_maximum_func

Pull Request - State: closed - Opened by guillaumekln about 1 year ago

#1397 - Fix typo in decoding.md

Pull Request - State: closed - Opened by eltociear about 1 year ago

#1396 - Cannot load model on multi gpus

Issue - State: closed - Opened by weiqisun about 1 year ago - 4 comments
