NVIDIA/FasterTransformer issues and pull requests

#715 - # feature request # GPT-Q 4 bit support

Issue - State: open - Opened by Xingxiangrui over 1 year ago - 6 comments

#713 - GPTNeox decoding argumentation

Issue - State: open - Opened by w775739733 over 1 year ago - 3 comments

#701 - build failed with tf-op

Issue - State: open - Opened by jackzhou121 over 1 year ago - 1 comment
Labels: bug

#696 - when fastertransformer support continuous batching and PagedAttention ?

Issue - State: open - Opened by ppppppppig over 1 year ago - 9 comments

#688 - [Compile] Compilation Failed when CMake using CUDA 11.8 on Windows.

Issue - State: closed - Opened by FdyCN over 1 year ago - 3 comments
Labels: bug

#680 - infer_visiontransformer_op.py error

Issue - State: open - Opened by macrocredit over 1 year ago - 2 comments
Labels: bug

#674 - Are there plans to support INT8 PTQ for other models (GPTNeox)

Issue - State: open - Opened by aitorormazabal over 1 year ago - 1 comment

#669 - Support for Falcon models

Issue - State: open - Opened by ankit201 over 1 year ago - 1 comment

#667 - Is the shape of positional embedding wrong?

Issue - State: open - Opened by dongluw over 1 year ago - 1 comment

#658 - how to pack fastertransformer

Issue - State: open - Opened by 77h2l over 1 year ago - 1 comment

#653 - gptneox & gptj int8 quantization & share context

Pull Request - State: open - Opened by rahuan over 1 year ago

#645 - GPT2 FP8 don't work on L4/L40 platform

Issue - State: open - Opened by champson over 1 year ago - 3 comments
Labels: bug

#639 - AssertionError: tensor_para_size * pipeline_para_size must be equal to world_size. world_size always equals to -1

Issue - State: open - Opened by starlitsky2010 over 1 year ago - 2 comments
Labels: bug

#638 - libth_transformer.so: cannot open shared object file: No such file or directory

Issue - State: open - Opened by ma-siddiqui over 1 year ago - 13 comments

#634 - The model processed by SFT infers garbled results on the FasterTransformer backend

Issue - State: closed - Opened by WackyGem over 1 year ago - 12 comments
Labels: bug

#632 - Fix for docker image build

Pull Request - State: closed - Opened by fredr over 1 year ago

#625 - What's the difference between FasterTransformer and TensorRT

Issue - State: open - Opened by puyuanOT over 1 year ago - 3 comments

#615 - Support for chatglm-6B/GLM models?

Issue - State: open - Opened by tianmala over 1 year ago - 16 comments

#603 - Support for hugging face GPTBigCode model

Issue - State: open - Opened by jiaozaer over 1 year ago - 6 comments

#602 - GPT-NeoX gives poor results using FP16

Issue - State: open - Opened by eycheung over 1 year ago - 1 comment
Labels: bug

#596 - MPT-7B model conversion?

Issue - State: open - Opened by SinanAkkoyun over 1 year ago - 2 comments

#594 - CUDA memory is not released after inference

Issue - State: open - Opened by DayDayupupupup over 1 year ago - 5 comments

#592 - [FT][ERROR] CUDA runtime error: operation not supported /workspace/FasterTransformer/src/fastertransformer/utils/allocator.h:160

Issue - State: open - Opened by Z3TA over 1 year ago - 3 comments
Labels: bug

#571 - Int4 Support

Issue - State: open - Opened by atyshka over 1 year ago - 3 comments

#570 - Can support BLIP-2 model from huggingface?

Issue - State: open - Opened by joewale over 1 year ago - 3 comments

#562 - T5 MoE docs need updates

Issue - State: open - Opened by jokerwyt over 1 year ago - 5 comments

#542 - Could NOT find MPI_CXX (missing: MPI_CXX_LIB_NAMES MPI_CXX_HEADER_DIR MPI_CXX_WORKS)

Issue - State: closed - Opened by amazingkmy over 1 year ago - 5 comments
Labels: bug

#532 - CUDA error: an illegal memory access was encountered

Issue - State: closed - Opened by ywfwyht over 1 year ago - 5 comments

#520 - `generic_activation` is using too many registers

Issue - State: closed - Opened by lzhangzz over 1 year ago - 2 comments
Labels: bug

#519 - How to convert vit weights from pytorch to fastertransformer format for plugin use ?

Issue - State: closed - Opened by ywfwyht over 1 year ago - 9 comments
Labels: question

#506 - LLaMA support

Issue - State: open - Opened by michaelroyzen over 1 year ago - 176 comments
Labels: enhancement

#492 - CUTLASS upgrade

Issue - State: open - Opened by no42name42 over 1 year ago - 2 comments

#449 - error with mpirun

Issue - State: open - Opened by lambda7xx over 1 year ago - 4 comments
Labels: bug

#439 - Build from source without container errors

Issue - State: open - Opened by YJHMITWEB almost 2 years ago - 2 comments
Labels: bug

#435 - fix swin train param is not match bug

Pull Request - State: closed - Opened by wm901115nwpu almost 2 years ago - 2 comments

#381 - Cutlass missing from 3rdparty in new 5.2 release

Issue - State: open - Opened by michaelroyzen almost 2 years ago - 2 comments
Labels: bug

#378 - CMake Error at CMakeLists.txt:199 (message): PyTorch >= 1.5.0 is needed for TorchScript mode.

Issue - State: closed - Opened by clam004 almost 2 years ago - 7 comments
Labels: bug

#349 - T5 Beam Search Answer wrong

Issue - State: open - Opened by Tangzixia about 2 years ago - 25 comments
Labels: bug

#314 - /usr/include/c++/11/bits/std_function.h:435:145: error: parameter packs not expanded with ‘...’: 435 | function(_Functor&& __f) |

Issue - State: closed - Opened by lucasjinreal about 2 years ago - 2 comments
Labels: bug

#220 - undefined symbol: _ZN17fastertransformer13cublasAlgoMapC1ESsSs

Issue - State: closed - Opened by NJU-yasuo over 2 years ago - 12 comments

#148 - [WIP] feat: Modify the API of NCCL to fit the single GPU gpt without NCCL/MPI

Pull Request - State: closed - Opened by byshiue about 3 years ago - 1 comment

#56 - [Faster transformer] having a guide on how to use weights from a Hugginface transfomer model (Roberta based) with faster transformer 3.1

Issue - State: closed - Opened by pommedeterresautee over 3 years ago - 4 comments
