Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / leptonai/gpud issues and pull requests

#69 - feat(nvidia/query): fabric manager debugging info from journalctl

Pull Request - State: closed - Opened by gyuho 2 months ago

#68 - feat(pkg/process): rename stop to abort, add systemd/journal utils

Pull Request - State: closed - Opened by gyuho 2 months ago

#67 - feat(pkg/systemd): remove redundant utils, move "pkg/update"

Pull Request - State: closed - Opened by gyuho 2 months ago

#66 - feat(internal/session): add missing writer close for session writer

Pull Request - State: closed - Opened by gyuho 3 months ago - 2 comments

#65 - client(v1): move examples, add info by component

Pull Request - State: closed - Opened by gyuho 3 months ago

#64 - feat(docker): list all containers in docker

Pull Request - State: closed - Opened by cardyok 3 months ago

#63 - feat(goreleaser): use ubuntu 20.04 build as default linux artifact

Pull Request - State: closed - Opened by gyuho 3 months ago

#62 - fix(components/docker): do not set not healthy if docker client version incompatible

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#61 - fix(update): check update version in "gpud update" command

Pull Request - State: closed - Opened by hm2501 3 months ago
Labels: bug

#60 - fix(nvidia/query): handle error when lsmod reader is already closed for peermem checker

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#59 - feat(systemd): enable gpud service

Pull Request - State: closed - Opened by cardyok 3 months ago

#58 - feat(client/v1): add basic get/read v1 API calls

Pull Request - State: closed - Opened by gyuho 3 months ago

#57 - doc(README): add badges, official links

Pull Request - State: closed - Opened by gyuho 3 months ago

#56 - fix(accelerator/nvidia/gpm): add missing Healthy: true field

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#55 - fix(event): add timestamp for xid/sxid error event

Pull Request - State: closed - Opened by cardyok 3 months ago

#54 - fix(session): handle io closed on write failure

Pull Request - State: closed - Opened by cardyok 3 months ago

#53 - fix(accelerator/nvidia): panic when ibstat command fails, when recording errors

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#51 - fix(nvidia/nvml): mark xid 68 as user app error, document

Pull Request - State: closed - Opened by gyuho 3 months ago

#50 - feat(nvidia/nvml): include device uuid for xid event

Pull Request - State: closed - Opened by gyuho 3 months ago

#48 - fix(nvidia): skip clock events NVML check if not supported by old drivers

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#47 - fix(docs): pkg.go.dev links, add Makefile CGO_ENABLED=1

Pull Request - State: closed - Opened by flyer103 3 months ago - 1 comment

#46 - doc(nvidia/error/xid): document how xid error is detected using dmesg

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: documentation

#45 - fix(dmesg): fallback in case "dmesg --since" flag doesn't exist in older versions

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#44 - doc(nvidia/query): explain gpm sm occupancy in more detail

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: documentation

#43 - Fix readme typo

Pull Request - State: closed - Opened by erjanmx 3 months ago

#42 - docs(WHY): simple comparison with dcgm

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: documentation

#41 - fix(cmd/gpud): clean up "gpud down" logs

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: enhancement

#40 - feat(nvidia): support GPM metrics (SM occupancy), lower xid poll frequency

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: feature, nvidia

#39 - feat(nvidia/query): check lsmod output with retries

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: enhancement

#38 - fix(components/fd): skip error if descriptor file is deleted

Pull Request - State: closed - Opened by gyuho 3 months ago

#37 - fix(systemd): handle exit status for "gpud status" command when gpud is inactive

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: bug

#36 - feat(install.sh): detect ubuntu distro version

Pull Request - State: closed - Opened by gyuho 3 months ago

#35 - doc(README): fix wording for auto-updates

Pull Request - State: closed - Opened by gyuho 3 months ago

#34 - feat(goreleaser): support linux/amd64 ubuntu 24.04 builds

Pull Request - State: closed - Opened by gyuho 3 months ago - 1 comment

#33 - fix(query/log/tail): do not show output on scan failures

Pull Request - State: closed - Opened by gyuho 3 months ago - 1 comment

#32 - gpud scan (dmesg) fills the stderr with kernel logs, and fails with unknown flags

Issue - State: closed - Opened by eicca 3 months ago - 12 comments
Labels: bug, awaiting feedback

#31 - Installation on Ubuntu 20.04.6 LTS `GLIBC_2.33' not found

Issue - State: closed - Opened by eicca 3 months ago - 10 comments
Labels: dependency-issue

#30 - feat(gpud): rename/add "run --web-enable --enable-auto-update" flags

Pull Request - State: closed - Opened by gyuho 3 months ago

#29 - feat(update): flag for controlling auto update

Pull Request - State: closed - Opened by hm2501 3 months ago

#28 - feat(integration): add integration doc

Pull Request - State: closed - Opened by cardyok 3 months ago

#27 - adding ubuntu 20.04

Pull Request - State: closed - Opened by sysbot 3 months ago - 3 comments

#26 - feat(pkg/process): initial commit, fix dmesg scan on non-root

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: feature

#25 - nits(*): add missing package descriptions, clean up imports

Pull Request - State: closed - Opened by gyuho 3 months ago
Labels: documentation

#24 - feat(components/file): initial commit for file status tracking

Pull Request - State: closed - Opened by gyuho 3 months ago - 3 comments
Labels: feature

#22 - feat(gossip): remove hostname and public ip in gossip payload

Pull Request - State: closed - Opened by cardyok 3 months ago

#21 - nits(gpud): remove workspace id mention in command-line helps

Pull Request - State: closed - Opened by gyuho 3 months ago

#20 - Update Readme

Pull Request - State: closed - Opened by bobmayuze 3 months ago

#19 - fix(install): follow the latest tar archive format

Pull Request - State: closed - Opened by hm2501 3 months ago
