Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / NVIDIA/gpu-operator issues and pull requests
#1078 - Re-instate NV-GHA
Pull Request -
State: closed - Opened by ArangoGutierrez 19 days ago
- 1 comment
#1077 - Bump sigs.k8s.io/controller-tools from 0.16.4 to 0.16.5 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 19 days ago
Labels: dependencies
#1076 - Always add 'config' emptyDir volume to MPS daemonset
Pull Request -
State: closed - Opened by cdesiniotis 20 days ago
#1075 - Bump nvidia-container-toolkit to v0.17.0-rc.2 and k8s-device-plugin to v0.17.0-rc.1
Pull Request -
State: closed - Opened by cdesiniotis 20 days ago
- 2 comments
#1074 - nfd-worker shocks on motherboard name
Issue -
State: open - Opened by qdii 22 days ago
#1073 - bump mig-manager to v0.10.0
Pull Request -
State: closed - Opened by tariq1890 23 days ago
#1072 - Rollback NV-GitHub-Actions runner changes
Pull Request -
State: closed - Opened by tariq1890 23 days ago
#1071 - re-enable github actions pipelines for pull requests
Pull Request -
State: closed - Opened by tariq1890 23 days ago
#1070 - Add init container to GFD for handling imex nodes config mount
Pull Request -
State: closed - Opened by cdesiniotis 23 days ago
- 2 comments
#1069 - Exhaustive list of Kubernetes Objects / Pods
Issue -
State: open - Opened by jube-pimy 23 days ago
#1068 - Copy pr boot
Pull Request -
State: closed - Opened by ArangoGutierrez 23 days ago
#1067 - Bump github.com/NVIDIA/k8s-kata-manager from 0.2.0 to 0.2.2
Pull Request -
State: closed - Opened by dependabot[bot] 23 days ago
Labels: dependencies
#1066 - Bump sigs.k8s.io/controller-runtime from 0.19.0 to 0.19.1
Pull Request -
State: open - Opened by dependabot[bot] 23 days ago
Labels: dependencies
#1065 - GPU resources are not recovered even XID error is resolved
Issue -
State: open - Opened by jslouisyou 24 days ago
- 1 comment
#1064 - Add missing 1g.36gb config to default-mig-parted-config
Pull Request -
State: closed - Opened by cdesiniotis 24 days ago
#1063 - chroot: failed to run command 'nvidia-smi': No such file or directory
Issue -
State: open - Opened by vanloswang 24 days ago
#1062 - Migrate to NV-GHA Runners
Pull Request -
State: closed - Opened by ArangoGutierrez 24 days ago
- 3 comments
#1061 - Bump k8s.io/code-generator from 0.31.1 to 0.31.2 in /tools
Pull Request -
State: open - Opened by dependabot[bot] 24 days ago
Labels: dependencies
#1060 - Bump the k8sio group with 4 updates
Pull Request -
State: open - Opened by dependabot[bot] 24 days ago
- 1 comment
Labels: dependencies
#1059 - bump GDS image to 2.20.5
Pull Request -
State: closed - Opened by tariq1890 25 days ago
#1058 - bump R550 and R535 drivers to 550.127.05 and 535.216.01
Pull Request -
State: closed - Opened by tariq1890 25 days ago
#1057 - Add GH200 144G HBM3e (x234810DE) to default-mig-parted-config
Pull Request -
State: closed - Opened by cdesiniotis 25 days ago
- 1 comment
#1056 - amazon linux 2023 support
Pull Request -
State: closed - Opened by shivakunv 25 days ago
#1055 - Make the IMEX nodes config file available to GFD
Pull Request -
State: closed - Opened by cdesiniotis 26 days ago
#1054 - ArgoCD issue - Failed to clone 'cnt-ci'
Issue -
State: closed - Opened by ErrickVDW 26 days ago
#1053 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.76.2 to 0.77.2
Pull Request -
State: open - Opened by dependabot[bot] 26 days ago
Labels: dependencies
#1052 - Disable `node-feature-discovery` when `nfd.enabled` is false
Pull Request -
State: open - Opened by Baalekshan 27 days ago
#1051 - add OCP 4.17 to the supported Openshift versions list
Pull Request -
State: closed - Opened by tariq1890 27 days ago
#1050 - add glibc/lib to library search paths
Pull Request -
State: closed - Opened by Hexoplon 29 days ago
- 1 comment
#1049 - Garbage Collector OOM get OOM Killed
Issue -
State: open - Opened by sebgoa about 1 month ago
#1048 - Bump github.com/prometheus/client_golang from 1.20.4 to 1.20.5
Pull Request -
State: open - Opened by dependabot[bot] about 1 month ago
Labels: dependencies
#1047 - bump k8s-kata-manager to v0.2.2
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1046 - bump openshift-client-go to the latest main
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1045 - bump kubevirt-gpu-device-plugin to v1.2.10
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1044 - bump node-feature-discovery to v0.16.5
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1043 - Stucking into init or crashbackoff status on some pods (nvidia-cuda-validator and nvidia-operator-validator)
Issue -
State: closed - Opened by thsmfe001 about 1 month ago
- 1 comment
#1042 - bump gdrcopy, vgpu-device-mgr and k8s-driver-mgr versions
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1041 - bump cuda base image in helm chart and OLM bundle
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1040 - Bump nvidia/cuda from 12.6.1-base-ubi9 to 12.6.2-base-ubi9 in /validator
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
Labels: dependencies, docker
#1039 - Bump nvidia/cuda from 12.6.1-base-ubi9 to 12.6.2-base-ubi9 in /docker
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
Labels: dependencies, docker
#1038 - ServiceAccount `node-feature-discovery` should not be included in ClusterRoleBinding when `nfd.enabled: false`
Issue -
State: open - Opened by cmontemuino about 1 month ago
#1037 - Update OpenSSL to address CVE-2022-1292 and CVE-2022-2068
Issue -
State: open - Opened by chris-lsn about 1 month ago
#1036 - Bump github.com/urfave/cli/v2 from 2.27.4 to 2.27.5
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
Labels: dependencies
#1035 - Add support to GPU operator for RHEL 9
Issue -
State: open - Opened by kimminw00 about 1 month ago
- 3 comments
#1034 - Revert "disable privileged mode for toolkit-validation init containers"
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1033 - Bump sigs.k8s.io/controller-tools from 0.16.3 to 0.16.4 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
- 1 comment
Labels: dependencies
#1032 - unified define k8s-driver-manager image info in values.yaml
Pull Request -
State: open - Opened by lengrongfu about 1 month ago
#1031 - [gpu-operator-validator] minor code cleanup
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1030 - Allow adding custom labels and securityContext to the components deployed by ClusterPolicy
Issue -
State: open - Opened by inesshz about 1 month ago
#1029 - Bump sigs.k8s.io/kustomize/kustomize/v5 from 5.4.3 to 5.5.0 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
Labels: dependencies
#1028 - Bump github.com/NVIDIA/go-nvlib from 0.6.1 to 0.7.0
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
Labels: dependencies
#1027 - How to modify /etc/nvidia-container-runtime/config.toml?
Issue -
State: open - Opened by janetat about 1 month ago
- 2 comments
#1026 - [state-driver] add downward API envars to fetch node name and IP
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
- 2 comments
#1025 - bump NFD to v0.16.4
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1024 - enable gpu-operator and NFD CRD updates by default
Pull Request -
State: closed - Opened by tariq1890 about 1 month ago
#1023 - [v24.6.2] add RHOCP certified OLM bundle
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
#1022 - bump cuda base img to 12.6.1 and go to 1.22.8
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
#1021 - GPU already used, showing up in multiple containers
Issue -
State: open - Opened by astranero about 2 months ago
- 1 comment
#1020 - 403 Unauthorized for helm image
Issue -
State: closed - Opened by AriBerisha about 2 months ago
- 1 comment
#1019 - set RUNTIME_CONFIG and RUNTIME_SOCKET envars to support new toolkit versions
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
#1018 - vGPU pods stuck/fail after the installation
Issue -
State: open - Opened by tunahanertekin about 2 months ago
#1017 - added runtimeClassName to fix Cuda version error on gpu-pod.yaml test
Pull Request -
State: open - Opened by armagankaratosun about 2 months ago
#1016 - Nvidia-driver-daemonset stuck in CrashLoopBackOff
Issue -
State: open - Opened by CarlGJ about 2 months ago
#1015 - failed to create NVIDIA device nodes
Issue -
State: closed - Opened by dstrbad about 2 months ago
- 7 comments
#1014 - Bump github.com/NVIDIA/nvidia-container-toolkit from 1.16.1 to 1.16.2
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
Labels: dependencies
#1013 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.76.2 to 0.77.1
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
- 1 comment
Labels: dependencies
#1012 - Helm release for v24.6.2
Pull Request -
State: closed - Opened by cdesiniotis about 2 months ago
#1011 - Cherry-picks for 24.6.2
Pull Request -
State: closed - Opened by cdesiniotis about 2 months ago
#1010 - Bump github.com/mittwald/go-helm-client from 0.12.13 to 0.12.14
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
Labels: dependencies
#1009 - Verification of Kubernetes compatibility
Issue -
State: open - Opened by BrianV801 about 2 months ago
- 1 comment
#1008 - Bump project version to 24.6.2
Pull Request -
State: closed - Opened by cdesiniotis about 2 months ago
#1007 - Support the `DevicePluginCDIDevices` feature gate
Pull Request -
State: open - Opened by jfroy about 2 months ago
- 1 comment
#1006 - vsphere e2e tests setup
Pull Request -
State: closed - Opened by shivakunv about 2 months ago
- 1 comment
#1005 - fix govet issues and pin golangci-lint version
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
#1004 - [release-24.6] bump cuda base images to fix CVE 2024-6345
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
- 3 comments
#1003 - bump dcgm and dcgm-exporter to versions 3.3.8 and 3.3.8-3.6.0
Pull Request -
State: closed - Opened by tariq1890 about 2 months ago
#1002 - Not able to view Gpu utilization metrics in openshift dashboard
Issue -
State: open - Opened by umeshvw about 2 months ago
- 7 comments
#1001 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.76.2 to 0.77.0
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
- 1 comment
Labels: dependencies
#1000 - DriverToolkit is enabled in the GPU Operator ClusterPolicy, but the NFD version deployed in the cluster is too old to support it.
Issue -
State: closed - Opened by CarlGJ about 2 months ago
#999 - drop dist tag suffix when referencing images in scan and sign jobs
Pull Request -
State: closed - Opened by tariq1890 2 months ago
#998 - Bump github.com/prometheus/client_golang from 1.20.3 to 1.20.4
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies
#997 - add gpu driver container 550.90.12
Pull Request -
State: closed - Opened by tariq1890 2 months ago
#996 - [nvidia-ci] drop dist tag suffix when cloning ghcr.io images
Pull Request -
State: closed - Opened by tariq1890 2 months ago
#995 - Bump nvidia/cuda from 12.6.0-base-ubi9 to 12.6.1-base-ubi9 in /docker
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies, docker
#994 - Bump nvidia/cuda from 12.6.0-base-ubi9 to 12.6.1-base-ubi9 in /validator
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies, docker
#993 - downgrade go from 1.23.0 to 1.22.7
Pull Request -
State: closed - Opened by tariq1890 2 months ago
#992 - Following gpu-operator documentation will break RKE2 cluster after reboot
Issue -
State: open - Opened by aiicore 2 months ago
- 4 comments
#991 - containerd restart from nvidia-container-toolkit causes other daemonsets to get stuck
Issue -
State: open - Opened by chiragjn 2 months ago
- 1 comment
#990 - Fatal Error: Openshift 4.16.10 not compatible with Nvidia-GPU-Operator-24.6.1
Issue -
State: closed - Opened by jayteaftw 2 months ago
- 12 comments
#989 - Bump sigs.k8s.io/controller-tools from 0.16.2 to 0.16.3 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies
#988 - Bump k8s.io/code-generator from 0.31.0 to 0.31.1 in /tools
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies
#987 - Bump the k8sio group with 4 updates
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 2 comments
Labels: dependencies
#986 - [RBAC cleanup] move namespaced resources to Role from ClusterRole
Pull Request -
State: closed - Opened by tariq1890 2 months ago
- 1 comment
#985 - Bump github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring from 0.76.1 to 0.76.2
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
Labels: dependencies
#984 - configure crun as the low-level runtime to prioritise when using CRI-O
Pull Request -
State: closed - Opened by tariq1890 2 months ago
#983 - DCGM_FI_DEV_GPU_UTIL metric giving empty value from prometheus
Issue -
State: open - Opened by Vijaygawate 2 months ago
#982 - nvidia.com/gpu.deploy.driver label is not pre-installed
Issue -
State: open - Opened by lengrongfu 2 months ago
- 2 comments
#981 - How to use GPU Operator with MIG to configure 2 GPUs on one node separately
Issue -
State: closed - Opened by marlowsw 2 months ago
- 3 comments
#980 - helm install gpu-operator was in Init stage for a long time
Issue -
State: open - Opened by JShuang7711 2 months ago
- 1 comment
#979 - update K8s version used by holodeck to v1.31
Pull Request -
State: closed - Opened by tariq1890 2 months ago