Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / leptonai/gpud issues and pull requests
#69 - feat(nvidia/query): fabric manager debugging info from journalctl
Pull Request -
State: closed - Opened by gyuho 2 months ago
#68 - feat(pkg/process): rename stop to abort, add systemd/journal utils
Pull Request -
State: closed - Opened by gyuho 2 months ago
#67 - feat(pkg/systemd): remove redundant utils, move "pkg/update"
Pull Request -
State: closed - Opened by gyuho 2 months ago
#66 - feat(internal/session): add missing writer close for session writer
Pull Request -
State: closed - Opened by gyuho 3 months ago
- 2 comments
#65 - client(v1): move examples, add info by component
Pull Request -
State: closed - Opened by gyuho 3 months ago
#64 - feat(docker): list all containers in docker
Pull Request -
State: closed - Opened by cardyok 3 months ago
#63 - feat(goreleaser): use ubuntu 20.04 build as default linux artifact
Pull Request -
State: closed - Opened by gyuho 3 months ago
#62 - fix(components/docker): do not set not healthy if docker client version incompatible
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#61 - fix(update): check update version in "gpud update" command
Pull Request -
State: closed - Opened by hm2501 3 months ago
Labels: bug
#60 - fix(nvidia/query): handle error when lsmod reader is already closed for peermem checker
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#59 - feat(systemd): enable gpud service
Pull Request -
State: closed - Opened by cardyok 3 months ago
#58 - feat(client/v1): add basic get/read v1 API calls
Pull Request -
State: closed - Opened by gyuho 3 months ago
#57 - doc(README): add badges, official links
Pull Request -
State: closed - Opened by gyuho 3 months ago
#56 - fix(accelerator/nvidia/gpm): add missing Healthy: true field
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#55 - fix(event): add timestamp for xid/sxid error event
Pull Request -
State: closed - Opened by cardyok 3 months ago
#54 - fix(session): handle io closed on write failure
Pull Request -
State: closed - Opened by cardyok 3 months ago
#53 - fix(accelerator/nvidia): panic when ibstat command fails, when recording errors
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#52 - fix(components/fd): use system-wide file descriptor limit, add default 1-million threshold limit, remove "_avg" metrics in fd component
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#51 - fix(nvidia/nvml): mark xid 68 as user app error, document
Pull Request -
State: closed - Opened by gyuho 3 months ago
#50 - feat(nvidia/nvml): include device uuid for xid event
Pull Request -
State: closed - Opened by gyuho 3 months ago
#49 - fix(nvidia/nvml): remove xid event polling gaps, log when event happens
Pull Request -
State: closed - Opened by gyuho 3 months ago
#48 - fix(nvidia): skip clock events NVML check if not supported by old drivers
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#47 - fix(docs): pkg.go.dev links, add Makefile CGO_ENABLED=1
Pull Request -
State: closed - Opened by flyer103 3 months ago
- 1 comment
#46 - doc(nvidia/error/xid): document how xid error is detected using dmesg
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: documentation
#45 - fix(dmesg): fallback in case "dmesg --since" flag doesn't exist in older versions
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#44 - doc(nvidia/query): explain gpm sm occupancy in more detail
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: documentation
#43 - Fix readme typo
Pull Request -
State: closed - Opened by erjanmx 3 months ago
#42 - docs(WHY): simple comparison with dcgm
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: documentation
#41 - fix(cmd/gpud): clean up "gpud down" logs
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: enhancement
#40 - feat(nvidia): support GPM metrics (SM occupancy), lower xid poll frequency
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: feature, nvidia
#39 - feat(nvidia/query): check lsmod output with retries
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: enhancement
#38 - fix(components/fd): skip error if descriptor file is deleted
Pull Request -
State: closed - Opened by gyuho 3 months ago
#37 - fix(systemd): handle exit status for "gpud status" command when gpud is inactive
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: bug
#36 - feat(install.sh): detect ubuntu distro version
Pull Request -
State: closed - Opened by gyuho 3 months ago
#35 - doc(README): fix wording for auto-updates
Pull Request -
State: closed - Opened by gyuho 3 months ago
#34 - feat(goreleaser): support linux/amd64 ubuntu 24.04 builds
Pull Request -
State: closed - Opened by gyuho 3 months ago
- 1 comment
#33 - fix(query/log/tail): do not show output on scan failures
Pull Request -
State: closed - Opened by gyuho 3 months ago
- 1 comment
#32 - gpud scan (dmesg) fills the stderr with kernel logs, and fails with unknown flags
Issue -
State: closed - Opened by eicca 3 months ago
- 12 comments
Labels: bug, awaiting feedback
#31 - Installation on Ubuntu 20.04.6 LTS `GLIBC_2.33' not found
Issue -
State: closed - Opened by eicca 3 months ago
- 10 comments
Labels: dependency-issue
#30 - feat(gpud): rename/add "run --web-enable --enable-auto-update" flags
Pull Request -
State: closed - Opened by gyuho 3 months ago
#29 - feat(update): flag for controlling auto update
Pull Request -
State: closed - Opened by hm2501 3 months ago
#28 - feat(integration): add integration doc
Pull Request -
State: closed - Opened by cardyok 3 months ago
#27 - adding ubuntu 20.04
Pull Request -
State: closed - Opened by sysbot 3 months ago
- 3 comments
#26 - feat(pkg/process): initial commit, fix dmesg scan on non-root
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: feature
#25 - nits(*): add missing package descriptions, clean up imports
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: documentation
#24 - feat(components/file): initial commit for file status tracking
Pull Request -
State: closed - Opened by gyuho 3 months ago
- 3 comments
Labels: feature
#23 - doc(README): clarify env var, remove unused public ip field, remove more workspace mention in the token docs
Pull Request -
State: closed - Opened by gyuho 3 months ago
Labels: documentation
#22 - feat(gossip): remove hostname and public ip in gossip payload
Pull Request -
State: closed - Opened by cardyok 3 months ago
#21 - nits(gpud): remove workspace id mention in command-line helps
Pull Request -
State: closed - Opened by gyuho 3 months ago
#20 - Update Readme
Pull Request -
State: closed - Opened by bobmayuze 3 months ago
#19 - fix(install): follow the latest tar archive format
Pull Request -
State: closed - Opened by hm2501 3 months ago