An open API service that provides issue and pull request metadata for open source projects.

GitHub / kubeflow/mpi-operator issues and pull requests

#700 - New fix kustomize5 warnings

Pull Request - State: closed - Opened by vikas-saxena02 3 months ago - 4 comments
Labels: approved, lgtm, size/M

#699 - Bump golang.org/x/net from 0.36.0 to 0.38.0

Pull Request - State: closed - Opened by dependabot[bot] 4 months ago - 1 comment
Labels: approved, lgtm, size/M, dependencies, go

#698 - [feature] pull image from ghcr in manifest

Pull Request - State: open - Opened by mahdikhashan 4 months ago - 1 comment
Labels: size/S

#697 - Use cncf-hosted gha runners

Pull Request - State: open - Opened by jeefy 4 months ago - 2 comments
Labels: size/XS

#696 - remove zw0610 from reviewer

Pull Request - State: closed - Opened by zw0610 4 months ago - 5 comments
Labels: approved, lgtm, size/XS

#695 - Upgrade Go version to 1.24

Pull Request - State: closed - Opened by tenzen-y 4 months ago - 2 comments
Labels: lgtm, size/S

#694 - Bump golang.org/x/crypto from 0.31.0 to 0.35.0

Pull Request - State: closed - Opened by dependabot[bot] 4 months ago - 1 comment
Labels: approved, lgtm, size/M, dependencies, go

#693 - Remove alculquicondor from OWNERS

Pull Request - State: closed - Opened by alculquicondor 4 months ago - 6 comments
Labels: approved, size/XS

#692 - Trust the Intel OneAPI PGP key until it satisfies new APT PGP requirements

Pull Request - State: closed - Opened by tenzen-y 4 months ago - 2 comments
Labels: approved, lgtm, size/XS

#690 - Fix missing ReplicaIndexLabel when using RunLauncherAsWorker

Pull Request - State: closed - Opened by GonzaloSaez 4 months ago - 6 comments
Labels: approved, lgtm, size/S

#689 - [feature]: migrate docker image push to ghcr

Pull Request - State: open - Opened by mahdikhashan 4 months ago - 10 comments
Labels: size/S

#688 - Fix kustomize5 warnings

Pull Request - State: closed - Opened by vikas-saxena02 4 months ago - 13 comments
Labels: size/M, do-not-merge/hold

#686 - Perform Image building in parallel in CI

Pull Request - State: closed - Opened by tenzen-y 5 months ago - 2 comments
Labels: approved, size/S

#685 - Upgrade Debian version to trixie for OpenMPI v5.0

Pull Request - State: closed - Opened by tenzen-y 5 months ago - 1 comment
Labels: approved, lgtm, size/S

#684 - Upload container images to Github Container Registry

Issue - State: open - Opened by tenzen-y 5 months ago - 4 comments
Labels: kind/feature

#683 - Bump golang.org/x/net from 0.28.0 to 0.36.0

Pull Request - State: closed - Opened by dependabot[bot] 5 months ago - 3 comments
Labels: approved, lgtm, size/XS, dependencies, go

#682 - bug(MPI Training) : Scheduling Policy doc bug for MPIJob

Issue - State: closed - Opened by ttakahashi21 5 months ago - 9 comments
Labels: kind/bug

#681 - increase `intel-oneapi-mpi-devel` version to 2021.14

Pull Request - State: open - Opened by mahdikhashan 6 months ago - 1 comment
Labels: do-not-merge/work-in-progress, size/XS

#679 - chore: update k8s to v1.32

Pull Request - State: open - Opened by dongjiang1989 6 months ago - 5 comments
Labels: approved, lgtm, size/XXL, do-not-merge/hold

#678 - Upgrade Intel MPI version to 2021.14

Issue - State: open - Opened by tenzen-y 7 months ago - 6 comments
Labels: help wanted, kind/bug

#677 - Bump golang.org/x/net from 0.28.0 to 0.33.0

Pull Request - State: closed - Opened by dependabot[bot] 7 months ago - 4 comments
Labels: approved, lgtm, size/XS, dependencies

#676 - Fix E2E Intel MPI integ tests

Pull Request - State: closed - Opened by GonzaloSaez 7 months ago - 2 comments
Labels: approved, lgtm, size/M

#675 - Failed IntelMPI E2E tests

Issue - State: closed - Opened by tenzen-y 7 months ago - 10 comments
Labels: kind/bug

#674 - Expose job controller's workqueue rate limiting configs

Pull Request - State: closed - Opened by roteme-runai 7 months ago - 8 comments
Labels: approved, lgtm, size/M

#673 - chore: bump golang.org/x/crypto from v0.26.0 to v0.31.0

Pull Request - State: closed - Opened by cmontemuino 8 months ago - 8 comments
Labels: approved, size/M

#672 - CVE-2024-45337 in golang.org/x/crypto package

Issue - State: closed - Opened by cmontemuino 8 months ago - 1 comment

#671 - DO NOT MERGE: E2E CI CHECK

Pull Request - State: closed - Opened by tenzen-y 9 months ago - 4 comments
Labels: size/L, do-not-merge/hold

#670 - Do not create the launcher job if the job starts suspended

Pull Request - State: open - Opened by GonzaloSaez 9 months ago - 3 comments
Labels: size/M

#669 - Fix crash in podgroup when runLauncherAsWorker is true

Pull Request - State: closed - Opened by GonzaloSaez 9 months ago - 9 comments
Labels: approved, lgtm, size/L

#668 - Update image tag with release-0.6

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 4 comments
Labels: approved, lgtm, size/XS

#667 - Reuse the core kubernetes API reason for the BackoffLimitExceeded

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 2 comments
Labels: approved, lgtm, size/S

#666 - Fix the 'printf: non-constant format string in call to fmt.Errorf (govet)' lint errors

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 2 comments
Labels: approved, lgtm, size/M

#665 - Prepare v0.6.0 release

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 6 comments
Labels: approved, lgtm, size/S

#664 - Bump to k8s 1.31

Pull Request - State: closed - Opened by ArangoGutierrez 10 months ago - 11 comments
Labels: approved, lgtm, size/XXL

#663 - Obviously specify the supported platforms in Makefile

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 2 comments
Labels: approved, lgtm, size/S

#662 - Error Building Custom MPI Image Following Documentation

Issue - State: open - Opened by luancaarvalho 10 months ago - 1 comment

#661 - Introduce debian bookworm

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 3 comments
Labels: approved, lgtm, size/S

#660 - Add support for linux/ppc64le for MPICH

Issue - State: open - Opened by tenzen-y 10 months ago
Labels: kind/feature

#659 - Upgrade volcano version to v1.10.0

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 3 comments
Labels: approved, lgtm, size/XS

#658 - Issue connecting to nodes that are not within the same cluster

Issue - State: open - Opened by yxusnapchat 10 months ago - 2 comments

#657 - Upgrade the k8s dependency versions to 1.30

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 3 comments
Labels: approved, lgtm, size/XXL

#656 - Adjust the comment for managedBy

Pull Request - State: closed - Opened by mszadkow 10 months ago - 2 comments
Labels: approved, lgtm, size/XS

#655 - Bump K8s to 1.31

Pull Request - State: closed - Opened by ArangoGutierrez 10 months ago - 2 comments
Labels: size/XXL

#654 - Release v0.6.0 requirements

Issue - State: closed - Opened by tenzen-y 10 months ago - 7 comments

#653 - Upgrade the scheduler-plugins to v0.29.8

Pull Request - State: closed - Opened by tenzen-y 10 months ago - 4 comments
Labels: approved, lgtm, size/M

#652 - Next release date with updated k8s libraries for 1.31

Issue - State: closed - Opened by klueska 10 months ago - 9 comments

#650 - Introduce ManagedBy field in RunPolicy

Pull Request - State: closed - Opened by mszadkow 10 months ago - 4 comments
Labels: approved, lgtm, size/L, ok-to-test

#649 - How the file at tensorflow-benchmarks.yaml can run an MPI job ?

Issue - State: closed - Opened by luancaarvalho 11 months ago - 4 comments

#648 - What scale can mpi-operator support?

Issue - State: open - Opened by yxzhao6 11 months ago - 3 comments

#646 - Add support for the managedBy field

Issue - State: closed - Opened by mimowo 12 months ago - 6 comments

#644 - ttlSecondsAfterFinished for MPIJob, not only launcher

Issue - State: open - Opened by hy00nc about 1 year ago - 6 comments

#643 - "cleanPodPolicy: All" does not clean up launcher pod

Issue - State: open - Opened by hy00nc about 1 year ago - 1 comment

#642 - Connection reset

Issue - State: closed - Opened by bbenshab about 1 year ago - 4 comments

#641 - how could mpijob of mpi operator worker get the hostname of launcher

Issue - State: closed - Opened by Oneal65 about 1 year ago - 2 comments

#640 - fix #639 provide NCCL tests example

Pull Request - State: open - Opened by samos123 over 1 year ago - 4 comments
Labels: do-not-merge/work-in-progress, size/L

#639 - NCCL tests example

Issue - State: open - Opened by samos123 over 1 year ago - 1 comment

#638 - Update image tag with 0.5

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 2 comments
Labels: approved, lgtm, size/XS

#637 - Upgrade golang and controller-gen

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 2 comments
Labels: approved, lgtm, size/XXL

#636 - Upgrade golang and controller-gen

Pull Request - State: closed - Opened by alculquicondor over 1 year ago - 9 comments
Labels: size/XXL

#635 - Replace original pointer methods with ptr libs

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 6 comments
Labels: approved, lgtm, size/L

#634 - Introduce resource multiplication

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 4 comments
Labels: approved, lgtm, size/S

#633 - Upgrade K8s dependencies to v1.29

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 12 comments
Labels: approved, lgtm, size/XXL

#632 - Promote @tenzen-y to approver

Pull Request - State: closed - Opened by terrytangyuan over 1 year ago - 2 comments
Labels: approved, size/XS

#631 - Prepare for release 0.5.0

Pull Request - State: closed - Opened by alculquicondor over 1 year ago - 5 comments
Labels: approved, lgtm, size/S

#630 - Remove unnecessary RBAC rule for mpijobs-admin***

Pull Request - State: open - Opened by vishvajit79 over 1 year ago - 2 comments
Labels: size/XS

#629 - Bump google.golang.org/protobuf from 1.31.0 to 1.33.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: approved, lgtm, size/XS, dependencies

#628 - Fix: no overwrite when run launcher as worker

Pull Request - State: closed - Opened by kuizhiqing over 1 year ago - 1 comment
Labels: approved, lgtm, size/L

#627 - Deprecated pointer, use ptr instead

Pull Request - State: closed - Opened by kuizhiqing over 1 year ago - 2 comments
Labels: approved, lgtm, size/L

#626 - make namespace parsing and informers pluggable

Pull Request - State: open - Opened by emsixteeen over 1 year ago - 9 comments
Labels: size/L

#625 - removing klog.Fatalf in favor of a shutdown request

Pull Request - State: closed - Opened by emsixteeen over 1 year ago - 6 comments
Labels: size/XS

#624 - adding Mac .DS_Store to gitignore

Pull Request - State: closed - Opened by emsixteeen over 1 year ago - 1 comment
Labels: approved, lgtm, size/XS

#623 - update auto gen file year to verify generate

Pull Request - State: closed - Opened by kuizhiqing over 1 year ago - 2 comments
Labels: approved, lgtm, size/M

#622 - Fix: add ns filter to podLister

Pull Request - State: closed - Opened by kuizhiqing over 1 year ago - 3 comments
Labels: approved, lgtm, size/XS

#621 - Wrong host info in discover_hosts.sh

Issue - State: closed - Opened by kuizhiqing over 1 year ago

#620 - Running in a subset of namespaces

Issue - State: open - Opened by emsixteeen over 1 year ago - 8 comments

#619 - Fails mpi-operator early if access to list or watch objects is denied

Pull Request - State: closed - Opened by emsixteeen over 1 year ago - 8 comments
Labels: approved, lgtm, size/S

#618 - adding timeout for cache sync

Pull Request - State: closed - Opened by emsixteeen over 1 year ago - 14 comments
Labels: size/S

#617 - fix the condition

Pull Request - State: open - Opened by wang-mask over 1 year ago - 12 comments
Labels: size/XS

#616 - change1 mv to cp

Pull Request - State: closed - Opened by wang-mask over 1 year ago - 3 comments
Labels: approved, lgtm, size/XS

#614 - "make generate" command run failed

Issue - State: closed - Opened by wang-mask over 1 year ago

#613 - Replace the plain pod workers with Indexed Job

Issue - State: open - Opened by tenzen-y over 1 year ago - 4 comments

#612 - run worker process in launcher pod

Pull Request - State: closed - Opened by kuizhiqing over 1 year ago - 31 comments
Labels: approved, lgtm, size/L

#611 - Work with DeepSpeed for large scale training

Issue - State: open - Opened by kuizhiqing over 1 year ago - 28 comments

#610 - add deepspeed example

Pull Request - State: open - Opened by kuizhiqing over 1 year ago - 5 comments
Labels: do-not-merge/work-in-progress, size/M

#609 - Bump golang.org/x/crypto from 0.14.0 to 0.17.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: approved, lgtm, size/S, dependencies

#606 - fix bug about status absence when worker pod spec is invalid

Pull Request - State: open - Opened by congpeiqing over 1 year ago - 1 comment
Labels: size/S

#604 - Can't get mpijob status when pod template is invalid

Issue - State: open - Opened by congpeiqing over 1 year ago - 9 comments

#603 - Bumping opentelemetry libraries

Pull Request - State: closed - Opened by tenzen-y over 1 year ago - 2 comments
Labels: approved, lgtm, size/L

#602 - Bump go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc from 0.35.0 to 0.46.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 4 comments
Labels: size/M, dependencies

#601 - Fix invalid link for horovod cpu-only example Dockerfile

Pull Request - State: closed - Opened by lianghao208 over 1 year ago - 2 comments
Labels: approved, lgtm, size/XS

#600 - Fix invalid link for horovod cpu-only example

Pull Request - State: closed - Opened by lianghao208 over 1 year ago - 1 comment
Labels: approved, lgtm, size/XS