Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / GoogleCloudPlatform/container-engine-accelerators issues and pull requests

#322 - Remove check for existing driver modules in driver installer

Pull Request - State: closed - Opened by Jiaqicao257 about 1 year ago - 2 comments

#321 - Remove mount update

Pull Request - State: closed - Opened by grac3gao about 1 year ago

#320 - tcpx: update nccl-plugin-gpudirecttcpx image

Pull Request - State: closed - Opened by samuelkarp about 1 year ago

#319 - tcpx: update nccl-plugin-gpudirecttcpx image

Pull Request - State: closed - Opened by samuelkarp about 1 year ago

#318 - gpudirect-tcpx: update test images

Pull Request - State: closed - Opened by samuelkarp about 1 year ago - 2 comments

#317 - gpudirect-tcpx: update installer image tag

Pull Request - State: closed - Opened by samuelkarp about 1 year ago

#303 - Multiple high severity CVEs on latest nvidia-device-plugin(v1.0.20)

Issue - State: open - Opened by sakshisharma84 over 1 year ago - 1 comment

#301 - Bump google.golang.org/grpc from 1.28.1 to 1.53.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#295 - Update to latest device plugin manifest

Pull Request - State: closed - Opened by grac3gao over 1 year ago

#294 - go-nvml follow-up fix

Pull Request - State: closed - Opened by grac3gao over 1 year ago

#293 - bump up version

Pull Request - State: closed - Opened by grac3gao over 1 year ago

#292 - Bump up version

Pull Request - State: closed - Opened by grac3gao over 1 year ago

#291 - Use go-nvml library for metrics

Pull Request - State: closed - Opened by grac3gao over 1 year ago

#290 - merge master into dev branch

Pull Request - State: closed - Opened by crystalzhaizhai over 1 year ago

#289 - Update partition gpu base image and golang version

Pull Request - State: closed - Opened by Jiaqicao257 over 1 year ago - 1 comment

#288 - Remove unsupported tf serving demo

Pull Request - State: closed - Opened by kyewei over 1 year ago

#287 - Delete tensorflow-notebook-image

Pull Request - State: closed - Opened by kyewei over 1 year ago

#286 - Remove outdated TPU resnet demo

Pull Request - State: closed - Opened by kyewei over 1 year ago

#285 - Migrate podresources API from v1alpha to v1

Pull Request - State: closed - Opened by danielvegamyhre over 1 year ago - 1 comment

#284 - Bump golang.org/x/sys from 0.0.0-20201214210602-f9fddec55a1e to 0.1.0

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#283 - Bump golang.org/x/net from 0.0.0-20201110031124-69a78807bb2b to 0.7.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#282 - Update NCCL fast socket Dockerfile base image

Pull Request - State: closed - Opened by melody789 over 1 year ago

#281 - Bump golang.org/x/text from 0.3.3 to 0.3.8

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#280 - bump to 1.0.22

Pull Request - State: closed - Opened by melody789 over 1 year ago

#279 - Update GPU Xid workaround method

Pull Request - State: closed - Opened by melody789 over 1 year ago - 2 comments

#278 - Bump github.com/prometheus/client_golang from 1.9.0 to 1.11.1

Pull Request - State: open - Opened by dependabot[bot] almost 2 years ago
Labels: dependencies

#277 - Update NCCL fast socket Dockerfile base image

Pull Request - State: closed - Opened by melody789 almost 2 years ago

#276 - Change CPU to serialized

Pull Request - State: closed - Opened by avkudryashov almost 2 years ago - 2 comments

#274 - bump to 1.0.21

Pull Request - State: closed - Opened by grac3gao almost 2 years ago

#273 - Update preloaded NVIDIA driver installer DaemonSet

Pull Request - State: closed - Opened by Jiaqicao257 almost 2 years ago

#272 - Update the NCCL fast Socket base image

Pull Request - State: closed - Opened by melody789 almost 2 years ago

#271 - Change NCCL daemonset name

Pull Request - State: closed - Opened by melody789 almost 2 years ago

#270 - update golang version from 1.15 to 1.19

Pull Request - State: closed - Opened by grac3gao almost 2 years ago - 1 comment

#269 - Specify type in the nccl installer

Pull Request - State: closed - Opened by grac3gao almost 2 years ago

#268 - Combine MIG and non-MIG to use the same driver installer yaml file.

Pull Request - State: closed - Opened by grac3gao almost 2 years ago

#267 - Revert "Combine the installer for normal GPU and MIG GPU"

Pull Request - State: closed - Opened by grac3gao almost 2 years ago

#266 - #265 moved misplaced volume spec to within volumes

Pull Request - State: closed - Opened by capfish almost 2 years ago - 2 comments

#265 - unknown field "spec.template.spec.initContainers[0].volumeMounts[5].hostPath

Issue - State: open - Opened by maxpain almost 2 years ago - 1 comment

#264 - Add fast Socket notes on README.md

Pull Request - State: closed - Opened by melody789 almost 2 years ago

#263 - add option to mount nvidia tools into container

Pull Request - State: open - Opened by fmoessbauer almost 2 years ago - 1 comment

#262 - xid error mitigation

Pull Request - State: open - Opened by crystalzhaizhai almost 2 years ago - 1 comment

#261 - Typo fix in flag description

Pull Request - State: closed - Opened by ashaltu almost 2 years ago

#260 - Typo fix in flag description

Pull Request - State: closed - Opened by ashaltu almost 2 years ago

#259 - Combine the installer for normal GPU and MIG GPU

Pull Request - State: closed - Opened by grac3gao almost 2 years ago

#258 - Update the base image used in Dockerfile

Pull Request - State: closed - Opened by Jiaqicao257 about 2 years ago

#257 - Add system-node-critical priority to driver installer

Pull Request - State: closed - Opened by alculquicondor about 2 years ago - 2 comments

#256 - Update test device plugin yaml for MPS test

Pull Request - State: closed - Opened by grac3gao about 2 years ago

#255 - Update metrics to avoid possible segfaults

Pull Request - State: closed - Opened by grac3gao about 2 years ago

#254 - Update test device plugin yaml for MPS

Pull Request - State: closed - Opened by grac3gao about 2 years ago

#253 - Update test device plugin yaml for MPS

Pull Request - State: closed - Opened by grac3gao about 2 years ago

#252 - Update test device plugin yaml for MPS

Pull Request - State: closed - Opened by grac3gao about 2 years ago

#251 - Update device plugin yaml for MPS test

Pull Request - State: closed - Opened by grac3gao about 2 years ago - 1 comment

#250 - Add support for non A100 GPUs - using NVML

Pull Request - State: open - Opened by fmoessbauer about 2 years ago - 2 comments

#249 - update the base image to bullseye-v1.4.2-gke.4

Pull Request - State: closed - Opened by melody789 about 2 years ago

#248 - CUDA unknown error when checking torch.cuda.is_available

Issue - State: open - Opened by csaroff about 2 years ago

#247 - Change NCCL fast socket base image

Pull Request - State: closed - Opened by melody789 about 2 years ago

#246 - Add github actions that trigger checks

Pull Request - State: closed - Opened by kyewei about 2 years ago - 2 comments

#245 - Updating the v1.0.20 images to daemonsets

Pull Request - State: closed - Opened by crystalzhaizhai over 2 years ago

#244 - Update fast-socket-installer Dockerfile to fix the base image

Pull Request - State: closed - Opened by richardsliu over 2 years ago

#243 - Update Makefile

Pull Request - State: closed - Opened by richardsliu over 2 years ago - 1 comment

#242 - Add support for non A100 GPUs

Pull Request - State: closed - Opened by fmoessbauer over 2 years ago - 6 comments

#241 - Add new mig partition for the new accelerator type

Pull Request - State: closed - Opened by crystalzhaizhai over 2 years ago - 1 comment

#238 - Update request validation GPU sharing

Pull Request - State: closed - Opened by grac3gao over 2 years ago

#236 - version change

Pull Request - State: closed - Opened by crystalzhaizhai over 2 years ago

#235 - Update nvidia-gpu-partition image

Pull Request - State: closed - Opened by pradvenkat over 2 years ago

#234 - Perform more graceful node reboot after enabling MIG mode on GPUs

Pull Request - State: closed - Opened by pradvenkat over 2 years ago - 1 comment

#233 - Update version to 1.0.16

Pull Request - State: closed - Opened by grac3gao over 2 years ago

#232 - Update version to 1.0.15

Pull Request - State: closed - Opened by grac3gao over 2 years ago

#231 - Update GPU config validation

Pull Request - State: closed - Opened by grac3gao over 2 years ago

#230 - Fastsocket installer folder

Pull Request - State: closed - Opened by melody789 over 2 years ago

#229 - Add fast-socket-installer.yaml file

Pull Request - State: closed - Opened by melody789 over 2 years ago - 1 comment

#228 - Auto detect Xid 79 (Hardware Error ) on GPU

Pull Request - State: closed - Opened by crystalzhaizhai over 2 years ago

#227 - add .phony for fastsocket_installer

Pull Request - State: open - Opened by melody789 over 2 years ago - 1 comment

#226 - add fastsocket_installer code

Pull Request - State: closed - Opened by melody789 over 2 years ago - 1 comment

#225 - Add Dockerfile for NCCL Fast Socket installer

Pull Request - State: closed - Opened by richardsliu over 2 years ago

#224 - Add a new image for driver installer for testing purpose

Pull Request - State: closed - Opened by grac3gao over 2 years ago

#223 - Add daemonset for using --version=latest in COS GPU driver installers

Pull Request - State: closed - Opened by arnav-kansal over 2 years ago - 2 comments

#222 - Use GPU index instead of UUID for metrics

Pull Request - State: closed - Opened by thebinaryone1 almost 3 years ago

#221 - Merge pull request #1 from GoogleCloudPlatform/master

Pull Request - State: closed - Opened by thebinaryone1 almost 3 years ago

#220 - Add the ability to export GPU metrics on the node level

Pull Request - State: closed - Opened by thebinaryone1 almost 3 years ago

#219 - Delete alpha plugin from pkg/gpu/nvidia

Pull Request - State: closed - Opened by richardsliu almost 3 years ago - 1 comment

#218 - Delete alpha plugin from pkg/gpu/nvidia

Pull Request - State: closed - Opened by richardsliu almost 3 years ago

#217 - Update device-plugin.yaml

Pull Request - State: closed - Opened by richardsliu almost 3 years ago

#216 - Add NVIDIA MPS support to device plugin

Pull Request - State: closed - Opened by pradvenkat almost 3 years ago

#215 - Add retry for a race condition

Pull Request - State: closed - Opened by grac3gao almost 3 years ago

#214 - Add MIG/MIG+time-sharing UT for plugin

Pull Request - State: closed - Opened by grac3gao almost 3 years ago

#213 - Fix `/home/kubernetes/bin/nvidia is not a directory`

Pull Request - State: open - Opened by c202c almost 3 years ago - 1 comment