GitHub / ROCm/k8s-device-plugin issues and pull requests
#143 - Can Multiple Containers Share the Same GPU?
Issue -
State: open - Opened by kristian-lemurian 17 days ago
#140 - Bump rocm-docs-core from 1.18.2 to 1.21.1 in /docs/sphinx
Pull Request -
State: open - Opened by dependabot[bot] about 1 month ago
- 1 comment
Labels: documentation
#139 - Bump rocm-docs-core from 1.18.2 to 1.21.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] about 1 month ago
- 2 comments
Labels: documentation
#138 - Update Helm Chart to 0.20.0 to point to release 1.31.0.7
Pull Request -
State: closed - Opened by sriram-30 about 1 month ago
#137 - Pod in privileged mode
Issue -
State: open - Opened by mimellin about 1 month ago
#136 - Add KubeVirt GPU Support: SR-IOV VF and PF Passthrough for AMD GPUs
Pull Request -
State: open - Opened by bhatnitish about 2 months ago
#135 - Bump rocm-docs-core from 1.18.2 to 1.20.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
- 2 comments
Labels: documentation
#134 - [Feature]: Upload Helm Chart as OCI
Issue -
State: open - Opened by LucaDev about 2 months ago
#133 - Bump rocm-docs-core from 1.18.2 to 1.20.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] about 2 months ago
- 2 comments
Labels: documentation
#132 - Bump rocm-docs-core from 1.18.2 to 1.19.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 2 comments
Labels: documentation
#131 - Bump rocm-docs-core from 1.18.2 to 1.19.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 2 months ago
- 2 comments
Labels: documentation
#130 - [Issue]: AMD Radeon (Raphael) support
Issue -
State: open - Opened by bernardgut 2 months ago
#129 - [Issue]: alexnet-tf-gpu-pod is broken
Issue -
State: open - Opened by bernardgut 2 months ago
#128 - [Feature]: expose gpu model name as resource
Issue -
State: open - Opened by baddoub 3 months ago
- 1 comment
#127 - Bump rocm-docs-core from 1.18.2 to 1.18.4 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 3 months ago
- 2 comments
Labels: documentation
#126 - Add documentation for new gpu-partition related node labeller arguments
Pull Request -
State: closed - Opened by sriram-30 3 months ago
#125 - add fallback mechanism in case allocator init fails
Pull Request -
State: closed - Opened by biluriuday 3 months ago
- 1 comment
#124 - update example device-plugin yaml file
Pull Request -
State: closed - Opened by biluriuday 3 months ago
#123 - Fix k8s-node-labeller cleanup on node labels
Pull Request -
State: closed - Opened by yansun1996 3 months ago
#122 - Support args, command for helm chart
Pull Request -
State: open - Opened by kwang1121 3 months ago
#121 - Support updateStrategy for helm chart daemonsets
Pull Request -
State: closed - Opened by jaeyung1001 3 months ago
#120 - Update rhubi-based image labels for OpenShift certification
Pull Request -
State: closed - Opened by yansun1996 4 months ago
#119 - Device Plugin and Allocator Documentation
Pull Request -
State: closed - Opened by sriram-30 4 months ago
- 2 comments
#118 - Bump rocm-docs-core from 1.18.1 to 1.18.2 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 4 months ago
Labels: documentation
#117 - Device Plugin and Node Labeller support for gpu partitions
Pull Request -
State: closed - Opened by sriram-30 4 months ago
- 1 comment
#116 - Change SimpleHealthCheck to use `/sys/class/kfd`
Pull Request -
State: closed - Opened by fluidnumerics-joe 4 months ago
- 3 comments
#115 - add support for gpu partitioning
Pull Request -
State: closed - Opened by biluriuday 4 months ago
#114 - Sphinx config and updated docs for device plugin
Pull Request -
State: closed - Opened by AMD-melliott 5 months ago
- 3 comments
Labels: documentation
#113 - Bump rocm-docs-core from 1.17.1 to 1.18.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 5 months ago
Labels: documentation
#112 - [Issue]: Product-name information missing for MI300 GPUs in AKS
Issue -
State: open - Opened by lalitmsft 5 months ago
- 4 comments
#111 - [Feature]: time-slicing GPUs
Issue -
State: open - Opened by aviallon 5 months ago
#110 - Device Plugin and Node Labeller Partition Changes
Pull Request -
State: closed - Opened by sriram-30 5 months ago
#109 - Bump rocm-docs-core from 1.17.0 to 1.17.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 5 months ago
Labels: documentation
#108 - Node labeller vram label support for partitions
Pull Request -
State: closed - Opened by sriram-30 5 months ago
- 5 comments
#107 - remove pulse from base images, add timeout grpc req
Pull Request -
State: closed - Opened by spraveenio 5 months ago
#106 - Bump rocm-docs-core from 1.15.0 to 1.17.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 5 months ago
Labels: documentation
#105 - Update mounts of labeller in helm chart
Pull Request -
State: closed - Opened by amdlin 6 months ago
#104 - node labeller "-product-name" failing for some platforms on k8s
Pull Request -
State: closed - Opened by spraveenio 6 months ago
#103 - [Issue]: Helm Chart 0.17.0 Breaks Talos Support
Issue -
State: closed - Opened by j0sh3rs 6 months ago
- 1 comment
#102 - skip empty product name for older platforms
Pull Request -
State: closed - Opened by spraveenio 6 months ago
#101 - Bump rocm-docs-core from 1.14.1 to 1.15.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 6 months ago
Labels: documentation
#100 - exporter endpoint svc to check for gpu health
Pull Request -
State: closed - Opened by spraveenio 6 months ago
- 2 comments
#99 - Disabled metrics server in manager of controller-runtime
Pull Request -
State: closed - Opened by amdlin 6 months ago
#98 - Bump rocm-docs-core from 1.13.0 to 1.14.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 6 months ago
Labels: documentation
#97 - Update helm chart and README.md of labeller
Pull Request -
State: closed - Opened by amdlin 6 months ago
#96 - [Documentation]: Missing compatibility matrix for device plugin to supported k8s version
Issue -
State: closed - Opened by balkrishna93 7 months ago
- 3 comments
#95 - Bump rocm-docs-core from 1.12.1 to 1.13.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 7 months ago
Labels: documentation
#94 - Bump rocm-docs-core from 1.12.0 to 1.12.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 7 months ago
- 1 comment
Labels: documentation
#93 - [Issue]: Run ROCm containers as non-root, unprivileged user
Issue -
State: closed - Opened by hmoazzem 7 months ago
- 1 comment
#92 - MLO-12: Changes to support AMD GPUS
Pull Request -
State: closed - Opened by taddeusb90 8 months ago
#91 - Bump rocm-docs-core from 1.11.0 to 1.12.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
Labels: documentation
#89 - exporter endpoint svc to check for gpu health
Pull Request -
State: closed - Opened by spraveenio 8 months ago
#88 - Bump rocm-docs-core from 1.10.0 to 1.11.0 in /docs/sphinx
Pull Request -
State: open - Opened by dependabot[bot] 8 months ago
Labels: documentation
#87 - Add license to ubi based images for certification request
Pull Request -
State: closed - Opened by yansun1996 8 months ago
#86 - Add the example to deploy vLLM serve
Pull Request -
State: open - Opened by AlexHe99 8 months ago
#85 - Bump rocm-docs-core from 1.9.0 to 1.10.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
Labels: documentation
#84 - Bump rocm-docs-core from 1.9.0 to 1.9.2 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
- 1 comment
Labels: documentation
#83 - Bump rocm-docs-core from 1.9.0 to 1.9.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
- 1 comment
Labels: documentation
#82 - Bump rocm-docs-core from 1.8.5 to 1.9.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
Labels: documentation
#81 - Create Dockerfile for building images with ubi base image
Pull Request -
State: closed - Opened by yansun1996 8 months ago
#80 - Bump rocm-docs-core from 1.8.4 to 1.8.5 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 8 months ago
Labels: documentation
#79 - Bump rocm-docs-core from 1.8.3 to 1.8.4 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 9 months ago
Labels: documentation
#78 - [Issue]: Getting errors with TensorFlow sample
Issue -
State: open - Opened by bsctl 9 months ago
- 4 comments
#77 - Bump rocm-docs-core from 1.8.2 to 1.8.3 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 9 months ago
Labels: documentation
#76 - [Issue]: Unable to Update ( 1.25.2.7 → 1.25.2.8 )
Issue -
State: open - Opened by KeyboardDabbler 10 months ago
- 6 comments
#75 - Bump cryptography from 43.0.0 to 43.0.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 10 months ago
Labels: documentation
#74 - Bump rocm-docs-core from 1.8.1 to 1.8.2 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 10 months ago
Labels: documentation
#73 - Bump rocm-docs-core from 1.7.2 to 1.8.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 11 months ago
Labels: documentation
#72 - Bump rocm-docs-core from 1.7.2 to 1.8.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 11 months ago
- 1 comment
Labels: documentation
#71 - Bump rocm-docs-core from 1.7.0 to 1.7.2 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 11 months ago
Labels: documentation
#70 - Bump rocm-docs-core from 1.7.0 to 1.7.1 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 12 months ago
- 1 comment
Labels: documentation
#69 - [Feature]: Sharing GPUs among all containers of a K8s pod
Issue -
State: open - Opened by yx-lamini 12 months ago
- 1 comment
#68 - Bump rocm-docs-core from 1.6.2 to 1.7.0 in /docs/sphinx
Pull Request -
State: closed - Opened by dependabot[bot] 12 months ago
Labels: documentation
#67 - Set up initial configurations for Read the Docs
Pull Request -
State: closed - Opened by samjwu 12 months ago
Labels: documentation
#66 - [Feature]: Sharing without restrictions
Issue -
State: open - Opened by sfxworks about 1 year ago
- 4 comments
#65 - Device permissions set by the device-plugin cause unexpected access() syscall responses, ending up in Pytorch failures
Issue -
State: closed - Opened by elukey about 1 year ago
- 8 comments
#64 - nit: update broken ROCm docs link
Pull Request -
State: closed - Opened by servusdei2018 about 1 year ago
#63 - [Documentation]: Introduction link broken
Issue -
State: closed - Opened by chipzoller about 1 year ago
#62 - update use of namespaces in chart
Pull Request -
State: closed - Opened by rptaylor over 1 year ago
- 5 comments
#61 - [Issue]: need a way to overwrite node_selector labels
Issue -
State: open - Opened by rptaylor over 1 year ago
- 2 comments
#60 - [Issue]: namespace configuration in Helm chart
Issue -
State: closed - Opened by rptaylor over 1 year ago
#59 - question about privileged mode for labeller
Issue -
State: open - Opened by rptaylor over 1 year ago
- 4 comments
#58 - add node_selector_enabled variable
Pull Request -
State: closed - Opened by rptaylor over 1 year ago
- 1 comment
#57 - [Documentation]: clarification on kernel/driver installation
Issue -
State: closed - Opened by rptaylor over 1 year ago
- 1 comment
#56 - [Documentation]: MI200 support
Issue -
State: closed - Opened by rptaylor over 1 year ago
- 1 comment
#55 - Remove reference to deprecated allow-privileged flag for kubelet
Pull Request -
State: closed - Opened by rptaylor over 1 year ago
- 3 comments
#54 - [Feature]: Support detection, allocation and resetting of GPU partitions in CDNA cards
Issue -
State: open - Opened by lohbe over 1 year ago
- 2 comments
#53 - fix benchmark test example manifest
Pull Request -
State: closed - Opened by andy108369 over 1 year ago
- 1 comment
#52 - [Issue]: benchmark example is broken
Issue -
State: closed - Opened by andy108369 over 1 year ago
- 1 comment
#51 - [Issue]: ArtifactHUB helm install points to non-existent chart repo
Issue -
State: closed - Opened by jkoelker over 1 year ago
- 1 comment
#50 - Updated references to GitHub Org
Pull Request -
State: closed - Opened by dgaliffiAMD over 1 year ago
- 1 comment
#49 - [Issue]: Helm Chart unavailable
Issue -
State: closed - Opened by LucaDev over 1 year ago
- 2 comments
#48 - Runtime Error with AMD GPU Helm Chart Installation in Kubernetes
Issue -
State: open - Opened by maarten-blokker over 1 year ago
- 2 comments
#47 - Update broken link to ROCM docs in readme file
Pull Request -
State: closed - Opened by maarten-blokker over 1 year ago
#46 - Fix dead link to ROCm system requirement
Pull Request -
State: closed - Opened by y2kenny over 1 year ago
#45 - GPU isolation options
Issue -
State: open - Opened by andy108369 over 1 year ago
#44 - Libraries/binaries mounted in the container (analogous to NVIDIA_DRIVER_CAPABILITIES)
Issue -
State: open - Opened by andy108369 over 1 year ago
#43 - 404 link in README.md
Issue -
State: closed - Opened by hy-tomas-terala over 1 year ago
#42 - Is AMD Radeon Vega 8 supported?
Issue -
State: closed - Opened by dmfrey over 1 year ago
- 18 comments
#41 - Updating labeller rbac version
Pull Request -
State: closed - Opened by LarryGF over 1 year ago