Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / NVIDIA/k8s-device-plugin issues and pull requests

#443 - MPS with Kubernetes on NVIDIA GPU

Issue - State: open - Opened by selinnilesy 12 months ago - 27 comments
Labels: feature

#426 - GPU gets marked as unhealthy on systemctl daemon-reloads + kubelet restarts (on Kubernetes Upgrades)

Issue - State: open - Opened by sstrk about 1 year ago - 5 comments
Labels: needs-triage

#404 - Questions about GPU time-sharing on Kubernetes

Issue - State: open - Opened by jxl4650152 over 1 year ago

#403 - Add ability to restart container on device failures

Pull Request - State: closed - Opened by lxpbl over 1 year ago

#402 - Clarify FS watcher error with path

Pull Request - State: open - Opened by Dentrax over 1 year ago - 1 comment

#401 - Who is maintaining this repo??

Issue - State: closed - Opened by maaft over 1 year ago - 3 comments

#398 - feat(plugin): Make resource name configurable

Pull Request - State: closed - Opened by YitzyD over 1 year ago

#397 - failed to construct NVML resource managers

Issue - State: closed - Opened by shortwavedave over 1 year ago - 1 comment

#396 - Bump github.com/opencontainers/runc from 1.1.4 to 1.1.5

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#393 - gitlab location is where developers should open PRs on

Pull Request - State: closed - Opened by kannon92 over 1 year ago

#392 - K8 Job does not get marked as completed after the pod succeeds in AKS version 1.25.5

Issue - State: closed - Opened by narendrakumar-nj over 1 year ago - 2 comments
Labels: lifecycle/stale

#391 - Add prestart-hook for device plugin

Pull Request - State: closed - Opened by kannon92 over 1 year ago - 7 comments

#390 - NVIDIA device plugin isn't advertising the GPUs

Issue - State: closed - Opened by glopezdiest over 1 year ago - 12 comments

#386 - report index as RegisteredDevices when device-id-strategy set to index

Pull Request - State: closed - Opened by borgerli over 1 year ago - 4 comments

#385 - Bump golang.org/x/text from 0.3.3 to 0.3.8

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: dependencies

#384 - Add documentation for CRI-O

Issue - State: open - Opened by AndreasMurk over 1 year ago

#382 - 0/1 nodes are available: 1 Insufficient nvidia.com/gpu.

Issue - State: open - Opened by liufangpeng over 1 year ago - 2 comments

#381 - nvidia-cuda-mps-control: command not found

Issue - State: closed - Opened by LoRKaa over 1 year ago

#380 - fix expired link

Pull Request - State: closed - Opened by zhouhao3 over 1 year ago - 2 comments

#379 - About GPU cleanup features

Issue - State: open - Opened by zhouhao3 over 1 year ago

#378 - can not distinguish t4 and a100 ?

Issue - State: open - Opened by ggjjlldd over 1 year ago - 1 comment

#377 - Plug in does not detect Tegra device Jetson Nano

Issue - State: open - Opened by VladoPortos over 1 year ago - 9 comments

#376 - The plugin container registry is inaccessible via IPv6

Issue - State: open - Opened by osipov over 1 year ago

#375 - Update README.md

Pull Request - State: closed - Opened by HeGaoYuan over 1 year ago - 1 comment

#374 - Update README.md

Pull Request - State: closed - Opened by HeGaoYuan over 1 year ago

#373 - container failed to start after the VM node migrated to another host

Issue - State: open - Opened by borgerli over 1 year ago - 3 comments

#372 - nvidia-device-plugin getting CrashLoopBackOff while installing using helm

Issue - State: open - Opened by captainsk7 over 1 year ago - 2 comments

#370 - Update README.md

Pull Request - State: open - Opened by hholst80 almost 2 years ago

#369 - apt-key is deprecated

Issue - State: open - Opened by hholst80 almost 2 years ago

#368 - k8s-device-plugin restarts on k3s deployment (on top of containerd)

Issue - State: open - Opened by hholst80 almost 2 years ago - 15 comments
Labels: lifecycle/stale

#354 - timeslice config

Issue - State: open - Opened by segamishuichi almost 2 years ago

#353 - Enable resource renaming in time-slicing shared GPUs

Issue - State: closed - Opened by Telemaco019 almost 2 years ago - 4 comments

#352 - k3s nvidia-device-plugin-daemonset report error - Fixed

Issue - State: open - Opened by liujie1008cn almost 2 years ago - 2 comments

#351 - compatible gpu type with k8s gpu time slicing

Issue - State: open - Opened by alirezadaghigh99 almost 2 years ago

#350 - compatible gpu type with k8s gpu time slicing

Issue - State: open - Opened by alirezadaghigh99 almost 2 years ago

#349 - How do I know that the upgrade of the NVIDIA device plugin went well?

Issue - State: closed - Opened by Borchies almost 2 years ago - 4 comments

#347 - how to support nvlink between several k8s pods?

Issue - State: open - Opened by pokerc almost 2 years ago - 3 comments

#346 - "nvidia-smi": executable file not found in $PATH: unknown

Issue - State: open - Opened by devriesewouter89 almost 2 years ago - 3 comments

#345 - How does GPU Pod dynamically schedule clusters

Issue - State: open - Opened by Kry1702 almost 2 years ago - 1 comment

#344 - GPU is not available with a GPU EC2 instance in EKS cluster (1.23)

Issue - State: open - Opened by garyyang6 almost 2 years ago - 1 comment

#343 - Question about MIG config persistent

Issue - State: open - Opened by slow-zhang almost 2 years ago - 11 comments

#342 - Share GPU in same pod using volume-mounts strategy

Issue - State: open - Opened by dcarrion87 almost 2 years ago - 5 comments

#341 - nvdi-smi hogs CPU

Issue - State: open - Opened by duk0011 almost 2 years ago - 2 comments

#340 - MIG for A6000

Issue - State: closed - Opened by arijitthegame almost 2 years ago - 2 comments

#339 - Question about safely upgrading device plugin

Issue - State: open - Opened by henrysecond1 almost 2 years ago

#338 - Device driver panics randomly with unknown error

Issue - State: open - Opened by olemarkus almost 2 years ago - 2 comments

#337 - Update README.md

Pull Request - State: closed - Opened by tico88612 almost 2 years ago - 1 comment

#336 - How do I run my application on the GPU assigned by k8s

Issue - State: closed - Opened by c-android almost 2 years ago - 2 comments

#335 - Pods with GPU terminating very slowly

Issue - State: open - Opened by Zhurik about 2 years ago

#334 - Previous issue - #297, Is plugin ready for jetson nano devices

Issue - State: open - Opened by ravinayag about 2 years ago - 8 comments

#332 - Getting GPU device minor number: Not Supported

Issue - State: open - Opened by zengzhengrong about 2 years ago - 13 comments

#331 - Function not found for nvml methods

Issue - State: open - Opened by kbkartik about 2 years ago - 3 comments

#330 - Startup race condition with dcgm-exporter

Issue - State: closed - Opened by skraga about 2 years ago - 3 comments

#329 - How can I use nvidia gpu in kubernetes pod?

Issue - State: open - Opened by misupopo about 2 years ago - 1 comment

#328 - Pods are not scheduled in all GPUs of a physical server.

Issue - State: closed - Opened by shan100github about 2 years ago - 25 comments

#327 - Make repo go install friendly

Issue - State: open - Opened by anthonyrisinger about 2 years ago

#326 - undefined symbol nvmlGpuInstanceGetComputeInstanceProfileInfoV in v12+

Issue - State: closed - Opened by anthonyrisinger about 2 years ago - 3 comments

#323 - CUDA memory error

Issue - State: closed - Opened by Borchies about 2 years ago - 8 comments

#321 - Unable to get nvidia.com/gpu: "1" greater than 1 for Quadro P2000

Issue - State: closed - Opened by brianbrady about 2 years ago - 13 comments

#320 - Incorrect indentation for securityContext > capability in daemonset

Issue - State: closed - Opened by waynelwh about 2 years ago - 2 comments

#318 - How to check the NVIDIA k8-device-plugin version?

Issue - State: closed - Opened by esparig about 2 years ago - 7 comments

#317 - NVIDIA A10 GPUs - are these drivers in the NVIDIA / k8s-device-plugin

Issue - State: open - Opened by jeffreydahan about 2 years ago - 9 comments

#315 - nvidia-device-plugin daemonset has 0 desired and no pod is launched

Issue - State: open - Opened by blackjack2015 over 2 years ago - 4 comments

#314 - device plugin default_runtime_name requirement and documentation

Issue - State: closed - Opened by rptaylor over 2 years ago - 2 comments

#311 - Fix containerd config in README.md

Pull Request - State: closed - Opened by gmrukwa over 2 years ago - 2 comments

#302 - How to use the device plugin with new k8s 1.24 version?

Issue - State: open - Opened by Zigko over 2 years ago - 21 comments

#300 - [add]: support for hostNetwork parameter in daemonset deployment

Pull Request - State: closed - Opened by vasudev-singhc-by over 2 years ago - 2 comments

#298 - fix: Fixed a build error on io.ReadAll

Pull Request - State: closed - Opened by aelgasser over 2 years ago - 6 comments

#297 - Cannot run nvidia-device-plugins on arm64 with cuda 10.2

Issue - State: closed - Opened by Fvoiretryzig over 2 years ago - 3 comments

#292 - how to schedule jobs to different type of gpus?

Issue - State: closed - Opened by silverlining21 over 2 years ago - 6 comments

#289 - pod fail to find gpu some time after created

Issue - State: closed - Opened by JuHyung-Son over 2 years ago - 14 comments

#284 - update the Dockerfile: NVIDIA_DRIVER_CAPABILITIES=utility,compute

Pull Request - State: closed - Opened by alex337 almost 3 years ago - 2 comments

#274 - Cannot find GPU information in Capacity when kubectl describe a K8s GPU node.

Issue - State: open - Opened by Zeyu-ZEYU almost 3 years ago - 3 comments

#267 - k8s-device-plugin seems to think gpu healthy when it is not usable due to Uncorrectable ECC Error

Issue - State: open - Opened by tingweiwu about 3 years ago - 8 comments
Labels: lifecycle/stale

#258 - Multi card

Pull Request - State: closed - Opened by archlitchi about 3 years ago - 2 comments

#253 - Installation failed k8s-device-plugin(v0.9.0)

Issue - State: open - Opened by Kwonho over 3 years ago - 12 comments

#199 - Setting "failOnInitError" unexpectedly "works" with a small 2 node cluster.

Issue - State: closed - Opened by supertetelman almost 4 years ago - 2 comments

#169 - Support sharing GPUs

Issue - State: open - Opened by ktarplee over 4 years ago - 37 comments

#151 - Update nvidia-device-plugin.yml

Pull Request - State: closed - Opened by frankenstien-831 almost 5 years ago - 3 comments

#143 - How to use specific NVIDIA GPU type(model) in pod yaml

Issue - State: closed - Opened by estherxyz almost 5 years ago - 8 comments