GitHub / NVIDIA/gpu-operator issues and pull requests
#1539 - Bump the k8sio group with 4 updates
Pull Request -
State: open - Opened by dependabot[bot] 17 days ago
Labels: dependencies
#1538 - Bump k8s.io/code-generator from 0.33.2 to 0.33.3 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 17 days ago
Labels: dependencies
#1449 - Install On Air-Gapped OKD 4.15.0-0 FCOS
Issue -
State: open - Opened by jvincze84 2 months ago
#1448 - Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.6 to 1.17.7
Pull Request -
State: open - Opened by dependabot[bot] 2 months ago
- 1 comment
Labels: dependencies
#1447 - Bump NVIDIA Container Toolkit to 1.17.7
Pull Request -
State: open - Opened by JunAr7112 3 months ago
- 1 comment
#1446 - Bump k8s.io/code-generator from 0.32.3 to 0.33.1 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
- 1 comment
Labels: dependencies
#1445 - Added SECURITY.md to the repo
Pull Request -
State: closed - Opened by JunAr7112 3 months ago
- 1 comment
#1444 - [dcgm-exporter] add support for setting dcgmexporter service type and internalTrafficPolicy
Pull Request -
State: closed - Opened by tariq1890 3 months ago
- 3 comments
Labels: needs-backport
#1443 - Bump Device Plugin 0.17.2
Pull Request -
State: closed - Opened by JunAr7112 3 months ago
- 2 comments
Labels: needs-backport
#1442 - Bump github.com/NVIDIA/go-nvlib from 0.7.1 to 0.7.2
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 3 comments
Labels: dependencies
#1441 - bump golang version to v1.24.3
Pull Request -
State: closed - Opened by tariq1890 3 months ago
- 1 comment
Labels: needs-backport
#1440 - bump DCGM/DCGM-Exporter to version 4.2.3
Pull Request -
State: closed - Opened by tariq1890 3 months ago
- 1 comment
Labels: needs-backport
#1439 - DCGM Exporter Service ignores internalTrafficPolicy: Local in v25.3.0
Issue -
State: closed - Opened by Sicarius07 3 months ago
- 4 comments
#1438 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.81.0 to 0.82.2
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
- 1 comment
Labels: dependencies
#1437 - Security Vulnerability: GNU C Library (glibc) 2.13 <= 2.40 - Local Arbitrary Code Execution Vulnerability - 2.41
Issue -
State: open - Opened by shwethadec01 3 months ago
#1436 - For some reason, the ubuntu24.04 daemonset is selecting a 22.04 binary driver image (reopens #722)
Issue -
State: open - Opened by doctorpangloss 3 months ago
#1435 - reuse utils.GetObjectHash() in object_controls.go
Pull Request -
State: closed - Opened by tariq1890 3 months ago
- 1 comment
#1434 - Bump sigs.k8s.io/controller-tools from 0.17.2 to 0.18.0 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
- 1 comment
Labels: dependencies
#1433 - Issue: GPU Operator Fails on Jetson Orin (ARM64) — Needed for Kai Scheduler
Issue -
State: open - Opened by Ashwinraj2000 3 months ago
#1432 - Helm install failed, the `defaultRuntime` is required by CRD
Issue -
State: open - Opened by LinkMaq 3 months ago
- 2 comments
#1431 - Bump nvidia/cuda from 12.8.1-base-ubi9 to 12.9.0-base-ubi9 in /docker
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 3 comments
Labels: dependencies, docker, needs-backport
#1430 - Bump nvidia/cuda from 12.8.1-base-ubi9 to 12.9.0-base-ubi9 in /validator
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 3 comments
Labels: dependencies, docker, needs-backport
#1429 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.81.0 to 0.82.1
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
- 1 comment
Labels: dependencies
#1428 - CUDA_ERROR_SYSTEM_DRIVER_MISMATCH
Issue -
State: open - Opened by RangaSamudrala 3 months ago
- 1 comment
#1427 - [unit-test-coverage] integrate coveralls into the CI pipeline
Pull Request -
State: closed - Opened by tariq1890 3 months ago
#1426 - Bump golangci/golangci-lint-action from 7 to 8
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 3 comments
Labels: dependencies, github_actions
#1425 - Use drop-ins for containerd configuration when available
Issue -
State: open - Opened by mkjpryor 3 months ago
#1424 - Add unit test for TransformNodeStatusExporter
Pull Request -
State: closed - Opened by shivakunv 3 months ago
- 5 comments
#1423 - Allow k8s-device-plugin daemonset to run in non-privileged mode
Pull Request -
State: open - Opened by curlup 3 months ago
- 2 comments
#1422 - #1421 Add port annotation for prometheus
Pull Request -
State: open - Opened by DominicWatson 3 months ago
- 2 comments
#1421 - Prometheus unable to scrape stats as the scrape port annotation is not set on the dcgm-exporter service
Issue -
State: open - Opened by DominicWatson 3 months ago
#1420 - Bump github.com/operator-framework/api from 0.30.0 to 0.31.0
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: dependencies
#1419 - [release-25.3] Cherrypick commits
Pull Request -
State: closed - Opened by tariq1890 3 months ago
- 1 comment
#1418 - bump NFD to v0.17.3
Pull Request -
State: closed - Opened by tariq1890 3 months ago
Labels: needs-backport
#1417 - Switch to distroless base image
Pull Request -
State: open - Opened by cdesiniotis 3 months ago
Labels: needs-backport
#1416 - Update conditions for a stale driver ds in the NVIDIADriver controller
Pull Request -
State: closed - Opened by cdesiniotis 3 months ago
Labels: needs-backport
#1415 - [Dockerfile] add support for removing rpm packages
Pull Request -
State: closed - Opened by tariq1890 3 months ago
#1414 - Bump k8s.io/code-generator from 0.32.3 to 0.33.0 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: dependencies
#1413 - Bump github.com/regclient/regclient from 0.8.2 to 0.8.3
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: dependencies
#1412 - Bump github.com/NVIDIA/nvidia-container-toolkit from 1.17.5 to 1.17.6
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: dependencies
#1411 - Bump github.com/mittwald/go-helm-client from 0.12.16 to 0.12.17
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: dependencies
#1410 - Bump the k8sio group with 4 updates
Pull Request -
State: open - Opened by dependabot[bot] 3 months ago
- 1 comment
Labels: dependencies
#1409 - update driver versions to 570.133.20 550.163.01 & 535.247.01
Pull Request -
State: closed - Opened by tariq1890 3 months ago
Labels: needs-backport
#1408 - Bump GDRCopy driver image to v2.5
Pull Request -
State: closed - Opened by cdesiniotis 3 months ago
Labels: needs-backport
#1407 - Bump NVIDIA Container Toolkit to 1.17.6
Pull Request -
State: closed - Opened by cdesiniotis 3 months ago
Labels: needs-backport
#1406 - nvidia-driver-ctr sleep
Pull Request -
State: closed - Opened by tariq1890 3 months ago
#1405 - Facing issue with DCGM exporter due to Nvidia GPU Operator initialization problem
Issue -
State: open - Opened by jaipreetnagpal 3 months ago
- 5 comments
#1404 - [Do not merge] Test requestor upgrade integration. Disabled by default
Pull Request -
State: closed - Opened by heyvister1 4 months ago
- 3 comments
#1403 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.81.0 to 0.82.0
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 3 comments
Labels: dependencies
#1402 - Bump golang.org/x/net from 0.36.0 to 0.38.0 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies, go
#1401 - Updated .release:staging to stage images in nvstaging
Pull Request -
State: open - Opened by JunAr7112 4 months ago
- 1 comment
#1400 - Multi-GPU allocation with precise control in shared environment
Issue -
State: open - Opened by FourierMourier 4 months ago
- 6 comments
#1399 - Is it possible to enable MIG only on specific nodes when using the GPU Operator?
Issue -
State: open - Opened by larcane97 4 months ago
#1398 - Everything seems to be ok, but it doesn't work Ubuntu 24.04, Operator v25.3.0
Issue -
State: open - Opened by blumfontein 4 months ago
- 2 comments
#1397 - GPU Operator v25.3.0 with DCGM exporter v4.1.1-2: 'DCGM_FI_PROF_GR_ENGINE_ACTIVE': metric not enabled
Issue -
State: open - Opened by gseidlerhpe 4 months ago
#1396 - [cherrypick][release-25.3] update golang to 1.24.2
Pull Request -
State: closed - Opened by tariq1890 4 months ago
#1395 - update golang to 1.24.2
Pull Request -
State: closed - Opened by tariq1890 4 months ago
Labels: needs-backport
#1394 - Bump helm.sh/helm/v3 from 3.16.4 to 3.17.3
Pull Request -
State: open - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies, go
#1393 - When gdrcopy is enabled, nvidia-driver-daemonset pods go into CrashLoopBackOff with GPU Operator v25.3.0
Issue -
State: open - Opened by Katakam-Rakesh 4 months ago
#1392 - Clarification on Automatic Component Cleanup When Node Labels Change (e.g., `container` ↔ `vm-passthrough`)
Issue -
State: open - Opened by kingeasternsun 4 months ago
#1391 - GPU related labels persists on the nodes even after uninstalling GPU Operator
Issue -
State: closed - Opened by rajendragosavi 4 months ago
- 4 comments
#1390 - Unable to load the kernel module 'nvidia.ko'
Issue -
State: open - Opened by rajendragosavi 4 months ago
- 2 comments
#1389 - Bump github.com/prometheus/client_golang from 1.21.1 to 1.22.0
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies
#1388 - In the case of multiple GPUs, specify the GPU to use
Issue -
State: closed - Opened by geekopslc 4 months ago
- 1 comment
#1387 - [cherrypick] bump DCGM and DCGM-Exporter versions to 4.2.0
Pull Request -
State: closed - Opened by tariq1890 4 months ago
#1386 - bump DCGM and DCGM-Exporter versions to 4.2.0
Pull Request -
State: closed - Opened by tariq1890 4 months ago
Labels: needs-backport
#1385 - Bump github.com/onsi/ginkgo/v2 from 2.23.3 to 2.23.4
Pull Request -
State: open - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies
#1384 - Feat: Allow to configure kubelet root directory
Pull Request -
State: open - Opened by apten-fors 4 months ago
- 2 comments
#1383 - Can’t understand the audit log related to taint assigned by gpu-operator
Issue -
State: closed - Opened by 1eedaegon 4 months ago
- 2 comments
#1382 - Bump github.com/onsi/gomega from 1.36.3 to 1.37.0
Pull Request -
State: open - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies
#1381 - Understanding the Role of Virtual GPU Software and Licensing
Issue -
State: open - Opened by JT-Trend 4 months ago
#1380 - Bump NVIDIA/holodeck from 0.2.6 to 0.2.7
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 3 comments
Labels: dependencies, github_actions
#1379 - Bump sigs.k8s.io/controller-tools from 0.17.2 to 0.17.3 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 2 comments
Labels: dependencies
#1378 - sync top-of-tree OLM bundle with that of v25.3.0
Pull Request -
State: closed - Opened by tariq1890 4 months ago
#1377 - sync top-of-tree OLM bundle with that of v25.3.0
Pull Request -
State: closed - Opened by tariq1890 4 months ago
Labels: needs-backport
#1376 - Cherrypick: Add OLM bundle for v25.3.0
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
#1375 - Nvidia Operator fails to detect the vGPU devices on OpenShift Cluster with A100 GPU node
Issue -
State: open - Opened by sderohan 4 months ago
#1374 - toolkit-validation container fails with "nvidia-smi": executable file not found in $PATH (after clean installation on Ubuntu 24.04.2 + Kubespray 1.32.2)
Issue -
State: open - Opened by botterweck 4 months ago
- 15 comments
#1373 - Update github issue template for raising bug reports
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
#1372 - Nvidia GPU operator issue on openshift(4.17.20)
Issue -
State: open - Opened by Nikhil-VW 4 months ago
- 6 comments
#1371 - gpu-operator on RKE2 using precompiled driver
Issue -
State: open - Opened by govindkailas 4 months ago
- 10 comments
#1370 - Bump actions/upload-pages-artifact to v4
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
#1369 - Add OLM bundle for v25.3.0
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
- 1 comment
Labels: needs-backport
#1368 - NVIDIADriver CRD: Endless Termination Cycle of NVIDIA Driver Pods
Issue -
State: closed - Opened by leoleg91 4 months ago
- 6 comments
#1367 - MicroK8s containerd-template.toml is wrong when docker is installed in parallel
Issue -
State: open - Opened by s-bernhardt 4 months ago
- 1 comment
#1366 - Helm release for v25.3.0
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
#1365 - Bump project version to v25.3.0
Pull Request -
State: closed - Opened by cdesiniotis 4 months ago
- 4 comments
#1364 - nvidia-operator-validator toolkit-validation fails Init:CrashLoopBackOff
Issue -
State: open - Opened by RangaSamudrala 4 months ago
#1363 - Bump golangci/golangci-lint-action from 6 to 7
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 3 comments
Labels: dependencies, github_actions
#1362 - Bump sigs.k8s.io/controller-runtime from 0.20.3 to 0.20.4
Pull Request -
State: open - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies
#1361 - Containers get stuck starting up after driver upgrade from 560 to 570
Issue -
State: open - Opened by dasantonym 4 months ago
- 7 comments
#1360 - Bump github.com/onsi/gomega from 1.36.2 to 1.36.3
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 1 comment
Labels: dependencies
#1359 - Bump github.com/onsi/ginkgo/v2 from 2.23.0 to 2.23.3
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 2 comments
Labels: dependencies
#1358 - Could DCGMExporterSpec add a field to specify a custom path for the kubelet rootDir?
Issue -
State: open - Opened by xiaoyao 4 months ago
- 3 comments
#1357 - Driver validation doesn't succeed because /usr/bin is a symlink
Issue -
State: open - Opened by wokalski 4 months ago
- 5 comments
#1356 - kata-nvidia-gpu runtime pod return failed to create containerd task: failed to create shim task: Failed to Check if grpc server is working: ttrpc: closed: unknown
Issue -
State: open - Opened by garygan89 4 months ago
- 5 comments
#1355 - Bump github.com/onsi/ginkgo/v2 from 2.23.0 to 2.23.2
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 2 comments
Labels: dependencies
#1354 - bump mig-manager to v0.12.1
Pull Request -
State: closed - Opened by tariq1890 4 months ago
#1353 - Initramfs scan failed ("lsinitrd requires a file path argument")
Issue -
State: open - Opened by lindhe 4 months ago
#1352 - Bump github.com/onsi/ginkgo/v2 from 2.23.0 to 2.23.1
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
- 2 comments
Labels: dependencies