Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / NVIDIA/apex issues and pull requests
#1471 - RuntimeError: Error compiling objects for extension error: subprocess-exited-with-error
Issue -
State: open - Opened by SamAct over 2 years ago
- 3 comments
#1455 - .cu files should not include torch/extension.h
Pull Request -
State: open - Opened by lostmsu over 2 years ago
- 4 comments
#1439 - Add BF16 support to FusedMixedPrecisionLamb
Pull Request -
State: closed - Opened by nv-joseli over 2 years ago
#1420 - Could not find permutation search CUDA kernels, falling back to CPU path
Issue -
State: closed - Opened by te-shi over 2 years ago
- 5 comments
Labels: bug
#1415 - NVCC --threads option is hardcoded
Issue -
State: open - Opened by wvidana over 2 years ago
- 2 comments
Labels: bug
#1408 - how to invoke amp.initialize() and amp.scale_loss() from different module
Issue -
State: closed - Opened by kehuanfeng over 2 years ago
- 2 comments
Labels: bug
#1400 - [transformer] Port Sequence Parallelism (takeover of #1396)
Pull Request -
State: closed - Opened by crcrpar over 2 years ago
- 1 comment
#1394 - FusedDenseGeluDense output NAN
Issue -
State: open - Opened by gongjingcs over 2 years ago
- 2 comments
Labels: bug
#1326 - Installation Error
Issue -
State: closed - Opened by GMN23362 almost 3 years ago
- 2 comments
#1314 - `fused_weight_gradient_mlp_cuda` module not found. gradient accumulation fusion with weight gradient computation disabled.
Issue -
State: open - Opened by adore1979 almost 3 years ago
- 25 comments
#1293 - The following error occurred while installing apex
Issue -
State: closed - Opened by xxw11 almost 3 years ago
- 2 comments
#1282 - Handle len(cached_x.grad_fn.next_functions) == 1 in cached_cast
Pull Request -
State: open - Opened by jiafatom almost 3 years ago
- 8 comments
#1230 - Using apex leads to a `CUDA out of memory` on an A100
Issue -
State: closed - Opened by StrangeTcy about 3 years ago
- 2 comments
#1229 - [FMHA] add support for later CUDA (8.x)
Pull Request -
State: closed - Opened by jqueguiner about 3 years ago
- 4 comments
#1227 - I am a Research Institute of Microsoft Research Institute. When I used apex in mmdection software, the following error occurred, We look forward to your answer. Thank you very much
Issue -
State: closed - Opened by xianglei3 about 3 years ago
- 3 comments
#1204 - pipeline_parallel - ModuleNotFoundError: No module named 'amp_C'
Issue -
State: open - Opened by MatthieuCed about 3 years ago
- 20 comments
#1193 - RuntimeError: apex.optimizers.FusedAdam requires cuda extensions
Issue -
State: open - Opened by life97 over 3 years ago
- 18 comments
#1178 - BFloat16 support in multi_tensor_*
Issue -
State: closed - Opened by zhengwy888 over 3 years ago
- 2 comments
#1175 - no_sync equivalent used for gradient accumulation
Issue -
State: open - Opened by amsword over 3 years ago
- 2 comments
#1141 - install apex error, flatten_unflatten.obj cannot open
Issue -
State: open - Opened by MrBook2019 over 3 years ago
- 6 comments
#1089 - Failed to install apex on CUDA 10.1 torch 1.6.0
Issue -
State: closed - Opened by Ema1997 over 3 years ago
- 2 comments
#1072 - FastLayerNorm ext not found after install on master
Issue -
State: closed - Opened by sshleifer almost 4 years ago
- 3 comments
#990 - TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
Issue -
State: open - Opened by KrisWongz about 4 years ago
- 14 comments
#965 - RuntimeError: expected scalar type Float but found Half
Issue -
State: open - Opened by superlwx over 4 years ago
- 7 comments
#961 - Error occurs when building 'apex_C' extension: no such file -> 'flatten_unflatten.o'
Issue -
State: closed - Opened by selous123 over 4 years ago
- 3 comments
#957 - fatal error: cublas_v2.h: No such file or directory
Issue -
State: open - Opened by shizhediao over 4 years ago
- 6 comments
#955 - could not install with "pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./"
Issue -
State: open - Opened by songtaoshi over 4 years ago
- 6 comments
#954 - ModuleNotFoundError: No module named 'fused_layer_norm_cuda'
Issue -
State: closed - Opened by ajesujoba over 4 years ago
- 3 comments
#874 - Anaconda fail to build with "--cpp_ext" and "--cuda_ext" options
Issue -
State: open - Opened by BurguerJohn over 4 years ago
- 2 comments
#865 - distributed lamb breaks python-only amp
Issue -
State: closed - Opened by lisadunlap over 4 years ago
- 10 comments
#855 - LAMB and gradient clipping (instructions vs api)
Issue -
State: open - Opened by ggaemo over 4 years ago
- 2 comments
#832 - Expected tensor for argument #1 'input' to have the same type as tensor for argument #2 'rois'; but type torch.cuda.HalfTensor does not equal torch.cuda.FloatTensor
Issue -
State: open - Opened by sarmientoj24 over 4 years ago
- 7 comments
#810 - super slow to build Apex from source in docker
Issue -
State: open - Opened by alexucb over 4 years ago
- 1 comment
#802 - Build error (error: expected primary-expression before 'some' token)
Issue -
State: open - Opened by kkjh0723 over 4 years ago
- 24 comments
#777 - "ZeroDivisionError: float division by zero" in scaler.py
Issue -
State: closed - Opened by qmpzzpmq almost 5 years ago
- 2 comments
#774 - Grad norm cut in half every 2000 steps?
Issue -
State: closed - Opened by PCerles almost 5 years ago
#769 - cupy.cuda.driver.CUDADriverError: CUDA_ERROR_ILLEGAL_ADDRESS: an illegal memory access was encountered
Issue -
State: open - Opened by MittalShruti almost 5 years ago
- 3 comments
#715 - problems with fp16 on multi-gpu training
Issue -
State: closed - Opened by ssp573 almost 5 years ago
- 1 comment
#702 - Update pyprof for nsight
Pull Request -
State: closed - Opened by ghost almost 5 years ago
- 3 comments
#698 - Avoid exception when initializing FusedNovoGrad with amp
Pull Request -
State: closed - Opened by henrymai almost 5 years ago
#694 - Multiple independent models, only one requires apex.amp, crash in non-amp CPU model
Issue -
State: open - Opened by lopuhin almost 5 years ago
- 13 comments
#635 - Gradient overflow. Skipping step, loss scaler 0 reducing loss scale to
Issue -
State: open - Opened by zsun1029 about 5 years ago
- 11 comments
#621 - ImportError: cannot import name 'amp'
Issue -
State: open - Opened by vr25 about 5 years ago
- 13 comments
#573 - Original ImportError was: ModuleNotFoundError("No module named 'amp_C')
Issue -
State: closed - Opened by misslibra about 5 years ago
- 9 comments
#550 - cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.
Issue -
State: closed - Opened by antgr about 5 years ago
- 15 comments
#548 - Problem installation
Issue -
State: open - Opened by emsansone over 5 years ago
- 7 comments
#547 - Module 'torch.nn' has no attribute 'backends'
Issue -
State: closed - Opened by YuryBolkonsky over 5 years ago
- 8 comments
#533 - Not able to observe any speedup on a Nvidia T4 (Turing arch)
Issue -
State: open - Opened by aditya1709 over 5 years ago
- 4 comments
#519 - RuntimeError: main thread is not in main loop
Issue -
State: open - Opened by H-YunHui over 5 years ago
- 3 comments
#497 - Installation Error.
Issue -
State: open - Opened by chunyuanY over 5 years ago
- 2 comments
#466 - remove deprecated backend.FunctionBackend calls
Pull Request -
State: closed - Opened by ptrblck over 5 years ago
- 2 comments
#465 - AttributeError: 'DistributedDataParallel' object has no attribute 'buckets_ready_size'
Issue -
State: open - Opened by makslevental over 5 years ago
- 3 comments
#464 - Keep certain modules as FP32
Issue -
State: closed - Opened by yaysummeriscoming over 5 years ago
- 3 comments
#393 - I try the example when init init_process_group got an error
Issue -
State: closed - Opened by PistonY over 5 years ago
- 15 comments
#370 - undefined symbol: __ZN2at19UndefinedTensorImpl10_singletonE
Issue -
State: closed - Opened by rmrao over 5 years ago
- 4 comments
#368 - FileNotFoundError: [Errno 2] No such file or directory: ':/usr/local/cuda:/usr/local/cuda-10.1/bin/nvcc': ':/usr/local/cuda:/usr/local/cuda-10.1/bin/nvcc'
Issue -
State: open - Opened by allianceai over 5 years ago
- 21 comments
#323 - Hard error on mismatch between torch.version.cuda and the Cuda toolkit version being used to compile Apex
Pull Request -
State: closed - Opened by mcarilli over 5 years ago
- 24 comments
#318 - How to handle gradient overflow when training a deep model with mixed precision?
Issue -
State: open - Opened by tfwu over 5 years ago
- 29 comments
#187 - bugs after apex installation
Issue -
State: open - Opened by yinwenpeng almost 6 years ago
- 7 comments
Labels: extension build
#161 - No module named 'fused_layer_norm_cuda'
Issue -
State: closed - Opened by alvin-leong almost 6 years ago
- 23 comments
#116 - TypeError: Class advice impossible in Python3
Issue -
State: closed - Opened by lynnna-xu about 6 years ago
- 15 comments
#107 - AttributeError: 'DistributedDataParallel' object has no attribute 'callback_queued'
Issue -
State: closed - Opened by LightToYang about 6 years ago
- 7 comments
#99 - ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable WORLD_SIZE expected, but not set
Issue -
State: closed - Opened by yuanfuqiang456 about 6 years ago
- 5 comments
#86 - Warning: apex was installed without --cuda_ext.
Issue -
State: closed - Opened by amuier about 6 years ago
- 35 comments