Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / NVIDIA/FasterTransformer issues and pull requests
#795 - Can FasterTransformer support SAM2 ( Meta segment anything model 2)?
Issue -
State: open - Opened by jackwei86 2 months ago
#794 - How to implement a gemm with FP16 and INT4 using kernel in FasterTransformer/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm
Issue -
State: open - Opened by AkatsukiChiri 4 months ago
#793 - An error occurred for the specific cuda version
Issue -
State: open - Opened by CSLiuPeng 6 months ago
Labels: bug
#792 - can be used in diffusion models,like sd and sdxl? how?where is the demos?tks
Issue -
State: open - Opened by henbucuoshanghai 7 months ago
Labels: bug
#791 - fix: fix position_encoding_table memory error.
Pull Request -
State: open - Opened by johnson-magic 8 months ago
#790 - bug: memory of position_encoding_table is not malloced correctly.
Issue -
State: open - Opened by johnson-magic 8 months ago
Labels: bug
#789 - error: ‘CUDNN_DATA_BFLOAT16’ was not declared in this scope; did you mean ‘CUDNN_DATA_FLOAT’
Issue -
State: closed - Opened by johnson-magic 9 months ago
#788 - How to know the correspondence between versions vcr.io/nvidia/pytorch:xx.xx-py3 and pytorch?
Issue -
State: open - Opened by johnson-magic 9 months ago
#787 - what is the mean of EFF-FT?
Issue -
State: open - Opened by johnson-magic 9 months ago
#786 - Are `fuseQKV masked attention` and Flash Attention the same?
Issue -
State: open - Opened by likejazz 9 months ago
#785 - M2M
Pull Request -
State: closed - Opened by sfc-gh-ybsat 10 months ago
- 1 comment
#784 - on H800 can not exec nvidia/pytorch:23.09-py3 container success
Issue -
State: open - Opened by chenglinjun 11 months ago
#783 - Confidence is not returned in the decoding example?
Issue -
State: open - Opened by liuzhuang1024 11 months ago
Labels: bug
#782 - multi_block_mode performance issue
Issue -
State: closed - Opened by akhoroshev 11 months ago
- 1 comment
#781 - Does FasterTransformer support multi-stream pipeline parallelism ?
Issue -
State: open - Opened by FlyingPotatoZ 12 months ago
#780 - Free memory buffer - Llama
Pull Request -
State: closed - Opened by sfc-gh-ybsat about 1 year ago
#779 - error You need C++17 to compile PyTorch
Issue -
State: open - Opened by ranggihwang about 1 year ago
Labels: bug
#778 - can support decoder only bart? such as MBartForCausalLM
Issue -
State: open - Opened by sjtu-cz about 1 year ago
#777 - repetition_penalty logic in FT has bug
Issue -
State: closed - Opened by hezeli123 about 1 year ago
- 1 comment
Labels: bug
#776 - Update README.md
Pull Request -
State: open - Opened by eltociear about 1 year ago
#775 - Sparsity support
Issue -
State: open - Opened by zhang662817 about 1 year ago
Labels: bug
#774 - How to get started?
Issue -
State: open - Opened by turbobuilt about 1 year ago
#773 - Fix shape mismatch on the masked_tokens param in decoder masked multi-head attention kernel.
Pull Request -
State: open - Opened by FengDSP about 1 year ago
#772 - How to serving multi-gpu inference?
Issue -
State: closed - Opened by Alone-wl about 1 year ago
- 1 comment
#771 - Is llama2 70b supported? Do you know minimal configuration?
Issue -
State: open - Opened by ChristineSeven about 1 year ago
- 1 comment
#770 - Include stdio.h
Pull Request -
State: open - Opened by JihaoXin about 1 year ago
- 2 comments
#769 - Supporting for expert parallelism in MoE inference
Issue -
State: open - Opened by iteratorlee about 1 year ago
#768 - Whether fastertransformer supports gpt-2 classification model, such as GPT2ForSequenceClassification?
Issue -
State: open - Opened by cabbagetalk about 1 year ago
#767 - cuSPARSELt is slower?
Issue -
State: open - Opened by BDHU about 1 year ago
- 1 comment
#766 - Incorrect inline ptx device assembly code usage
Issue -
State: open - Opened by zhiweij1 about 1 year ago
Labels: bug
#765 - CUDA code compile error with clang: function template partial specialization is not allowed
Issue -
State: open - Opened by zhiweij1 about 1 year ago
Labels: bug
#764 - How to calculate local batch size?
Issue -
State: open - Opened by fotstrt about 1 year ago
#763 - src/fastertransformer/kernels/decoder_masked_multihead_attention /decoder_masked_multihead_attention_template.hpp:36 open this macro definition, it'll find a build error
Issue -
State: open - Opened by pengl about 1 year ago
Labels: bug
#762 - Ft llama opt
Pull Request -
State: open - Opened by dypshong about 1 year ago
#761 - terminate called after throwing an instance of 'std::runtime_error'
Issue -
State: open - Opened by HalFTeen about 1 year ago
#760 - fastertransformer/utils/nccl_utils.cc:62 'unhandled cuda error'
Issue -
State: open - Opened by wangweiwei1188 about 1 year ago
Labels: bug
#759 - Support for "no_repeat_ngram_size" parameter for generation
Issue -
State: open - Opened by shreysingla11 about 1 year ago
- 2 comments
#758 - Does FasterTransformer use FlashAttention?
Issue -
State: open - Opened by niyunsheng about 1 year ago
#757 - Which part should I modify to achieve inference pipeline schedule (like micro-batch)?
Issue -
State: open - Opened by dannyxiaocn about 1 year ago
#756 - Support Seq length up to 8K
Pull Request -
State: open - Opened by zhen-jia about 1 year ago
#755 - [cmake] fix cmake policy for ENABLE_FP8
Pull Request -
State: closed - Opened by DefTruth about 1 year ago
#754 - flashattention only enabled for gpt-styled models
Issue -
State: open - Opened by flexwang about 1 year ago
- 7 comments
#753 - How to get a npz file that satisfy the input requirement?
Issue -
State: open - Opened by jy00161yang about 1 year ago
- 1 comment
Labels: bug
#752 - [Long seq length] GPT Seq length constrain
Issue -
State: open - Opened by zhen-jia about 1 year ago
- 14 comments
#751 - specify the recognition language for Whisper
Issue -
State: open - Opened by echodjx about 1 year ago
#750 - [BugFix] GPT inference error when pipeline_para_size > 1 and int8_mode != 0
Pull Request -
State: open - Opened by 00why00 about 1 year ago
#749 - Is it possible to serve GPT-NeoX ONNX exported through optimum?
Issue -
State: open - Opened by sonientaegi about 1 year ago
#748 - [feature request] transformer on orin
Issue -
State: open - Opened by superpigforever about 1 year ago
#747 - How to run multi_gpu_gpt_examples.py without mpirun/mpiexe
Issue -
State: closed - Opened by ZZWHU about 1 year ago
#746 - core dumped of swin model
Issue -
State: open - Opened by chiemon about 1 year ago
- 1 comment
Labels: bug
#744 - Failed building t5 model in FastTransformer (Reached 82% then stopped)
Issue -
State: open - Opened by EmanElrefai12 about 1 year ago
- 3 comments
Labels: bug
#736 - Using faster transformers to infer the bloom model, the accuracy rate is 0
Issue -
State: open - Opened by hurun over 1 year ago
- 2 comments
Labels: bug
#735 - OSError: lib/libth_transformer.so: cannot open shared object file: No such file or directory
Issue -
State: open - Opened by arnabmanna619 over 1 year ago
- 1 comment
#734 - TP=2, Loss of accuracy
Issue -
State: open - Opened by coderchem over 1 year ago
- 2 comments
#730 - Compatibility issue with CUDA 12.2
Issue -
State: open - Opened by MinghaoYan over 1 year ago
- 6 comments
Labels: bug
#729 - llama support inference?
Issue -
State: open - Opened by double-vin over 1 year ago
- 2 comments
#728 - [Question] Is it possible to use my own pretrained weights for ViT QAT
Issue -
State: open - Opened by proevgenii over 1 year ago
- 3 comments
Labels: bug
#727 - Are MQA and GQA in development?
Issue -
State: open - Opened by ljayx over 1 year ago
- 8 comments
#720 - docker/Dockerfile.torch occurs errors
Issue -
State: closed - Opened by b3y0nd over 1 year ago
- 4 comments
Labels: bug