GitHub / ray-project/kuberay issues and pull requests
#3872 - [Bug] Restore random head pod name when Ray < 2.48 and Autoscaler v2 and GCS FT are enabled
Pull Request -
State: open - Opened by machichima 17 days ago
#3837 - Get rid of wget dependency
Issue -
State: open - Opened by jjyao about 1 month ago
- 4 comments
Labels: 1.5.0
#3836 - Feature/cron scheduling rayjob 2426
Pull Request -
State: open - Opened by DW-Han about 1 month ago
#3835 - Use Go 1.24.0 in go module
Pull Request -
State: open - Opened by tenzen-y about 1 month ago
#3834 - Avoid requiring specific Go patch version in go module
Issue -
State: open - Opened by tenzen-y about 1 month ago
- 1 comment
#3833 - Add RayCluster YAML for verl example
Pull Request -
State: closed - Opened by kevin85421 about 1 month ago
#3832 - [Feature] [kubectl-plugin] Improve support for autoscaling clusters
Issue -
State: open - Opened by jleben about 1 month ago
- 2 comments
Labels: enhancement, triage
#3831 - [feat][operator] validate Ray resource metadata in webhook
Pull Request -
State: open - Opened by davidxia about 1 month ago
#3830 - [feat][python-client]: add rayjob support to kuberay python-client
Pull Request -
State: open - Opened by kryanbeane about 1 month ago
#3829 - [Feature] Add RayJob support to Kuberay python-client
Issue -
State: open - Opened by kryanbeane about 1 month ago
- 1 comment
Labels: enhancement, triage
#3828 - [cherry-pick] Cherry-pick #3826 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3827 - [Bug] Exiting because this node manager has mistakenly been marked as dead by the GCS
Issue -
State: closed - Opened by hobin2017 about 1 month ago
Labels: bug, triage
#3826 - Fix ray nightly image env var setup
Pull Request -
State: closed - Opened by dayshah about 1 month ago
- 1 comment
#3825 - [Test][Release] Change upgrade test version to test upgrade from 1.3.2 to 1.4.0
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
- 1 comment
#3824 - [release] Update upgrade test during release process
Issue -
State: closed - Opened by kevin85421 about 1 month ago
Labels: enhancement, release-blocker, 1.4.0
#3823 - [Fix][Release] Fix Krew release indentation error
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
- 2 comments
#3822 - KubeRay v1.4.0 default non-login BASH shell issue tracking
Issue -
State: open - Opened by MortalHappiness about 1 month ago
#3821 - [kubectl-plugin] Remove ephemeral storage check
Pull Request -
State: open - Opened by win5923 about 1 month ago
#3820 - [Feature] Add prometheus metrics reset support
Issue -
State: open - Opened by win5923 about 1 month ago
Labels: enhancement, triage
#3819 - [Chore] Remove CHANGELOG.md
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3818 - [Fix] changelog-generator.py failed to parse some commit messages
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
- 3 comments
#3817 - [Doc] Update RayJob Quick Start Job When V1.5 Release
Issue -
State: open - Opened by weizhaowz about 1 month ago
- 1 comment
Labels: 1.5.0
#3816 - [Release] Update KubeRay version references for 1.4.0
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3815 - [kubectl-plugin] use solid value as default value in get and create
Pull Request -
State: open - Opened by fscnick about 1 month ago
- 1 comment
#3814 - [cherry-pick] Cherry-pick #3809 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3813 - [cherry-pick] Cherry-pick #3804 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3812 - [apiserver] Add migration doc from v1 to v2
Pull Request -
State: open - Opened by nadongjun about 1 month ago
#3811 - [Fix] kubectl ray create cluster config file CPU overwrites the whole resource requests and limits
Pull Request -
State: open - Opened by CheyuWu about 1 month ago
#3810 - [Cherry-pick][Helm Chart] Set honorLabel of serviceMonitor to true (#3805)
Pull Request -
State: closed - Opened by owenowenisme about 1 month ago
- 1 comment
#3809 - [kubectl-plugin] fix get cluster all namespace
Pull Request -
State: closed - Opened by fscnick about 1 month ago
- 1 comment
#3808 - [Bug] Add default value for entrypoint flags in job_submit.go
Pull Request -
State: open - Opened by 400Ping about 1 month ago
- 1 comment
#3807 - [Bug] Wrong default value for entrypoint flags in job_submit.go
Issue -
State: open - Opened by 400Ping about 1 month ago
#3806 - [Bug] kubectl-plugin get cluster without all-namespace still shows all-namespace.
Issue -
State: closed - Opened by fscnick about 1 month ago
Labels: bug, release-blocker, 1.4.0
#3805 - [Helm Chart] Set honorLabel of serviceMonitor to `true`
Pull Request -
State: closed - Opened by owenowenisme about 1 month ago
- 1 comment
Labels: release-blocker, 1.4.0
#3804 - [Docs] Add kubectl plugin create cluster sample yaml config files
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
- 1 comment
Labels: release-blocker, 1.4.0
#3803 - [Bug] kubectl ray create cluster config file CPU overwrites the whole resource requests and limits
Issue -
State: open - Opened by MortalHappiness about 1 month ago
- 1 comment
Labels: bug, cli, 1.5.0
#3802 - [Doc] `ray job submit -- echo aaa; echo bbb` and `echo aaa` will be executed in the cluster but `echo bbb` will be executed in the submitter
Issue -
State: open - Opened by kevin85421 about 1 month ago
Labels: enhancement, triage
#3801 - [Bug] Wrong default value for head and worker ray start params in create_cluster.go
Issue -
State: open - Opened by MortalHappiness about 1 month ago
- 1 comment
Labels: bug, good-first-issue, cli
#3800 - chore: reduce memory allocation on handling http response
Pull Request -
State: open - Opened by fscnick about 1 month ago
#3799 - [Bug] RayCluster unavailable due to health probe failure
Issue -
State: open - Opened by ChenYi015 about 1 month ago
Labels: bug, triage
#3798 - [cherry-pick] Cherry-pick #3795 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3797 - [cherry-pick] Cherry-pick #3796 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3796 - [Chore][Sample-yaml] Upgrade pytorch-lightning to 1.8.5 for `ray-job.pytorch-distributed-training.yaml`
Pull Request -
State: closed - Opened by MortalHappiness about 1 month ago
#3795 - [Metrics] Remove serviceMonitor.yaml
Pull Request -
State: closed - Opened by owenowenisme about 1 month ago
- 1 comment
#3794 - Question: Autoscaler v1 vs v2 configuration and performance
Issue -
State: open - Opened by testinfected about 2 months ago
- 12 comments
#3793 - [Feature] Support `runtimeClassName` in values.yaml
Issue -
State: open - Opened by unclebenel about 2 months ago
Labels: enhancement, triage
#3792 - [Test][Autoscaler] add fake single-host TPU tests
Pull Request -
State: open - Opened by davidxia about 2 months ago
#3791 - [Bug] why readinessProbe port 8000 check
Issue -
State: closed - Opened by Helion-z about 2 months ago
- 8 comments
Labels: bug, triage
#3790 - Use ImplementationSpecific in ray-cluster.separate-ingress.yaml (#3781)
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3789 - [cherry-pick] Cherry-pick #3786 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3788 - [cherry-pick] Cherry-pick #3782 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3787 - [cherry-pick] Cherry-pick #3779 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3786 - Remove vLLM examples in favor of Ray Serve LLM
Pull Request -
State: closed - Opened by kevin85421 about 2 months ago
Labels: release-blocker, 1.4.0
#3785 - pass client when call batchscheduler.New()
Pull Request -
State: open - Opened by KunWuLuan about 2 months ago
- 3 comments
#3784 - [Release] Update KubeRay version references for 1.4.0-rc.2
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3783 - [cherry-pick] Cherry-pick #3780 into release-1.4 branch
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3782 - Update update-ray-job.kueue-toy-sample.yaml
Pull Request -
State: closed - Opened by troychiu about 2 months ago
- 3 comments
Labels: release-blocker, 1.4.0
#3781 - Use ImplementationSpecific in ray-cluster.separate-ingress.yaml
Pull Request -
State: closed - Opened by troychiu about 2 months ago
- 3 comments
Labels: release-blocker, 1.4.0
#3780 - [Doc][Fix] correct the indention of storageClass in ray-cluster.persistent-redis.yaml
Pull Request -
State: closed - Opened by rueian about 2 months ago
- 3 comments
#3779 - [Feat] Add e2e test for applying `ray-job.interactive-mode.yaml`
Pull Request -
State: closed - Opened by CheyuWu about 2 months ago
- 4 comments
#3778 - [Feat] Add e2e test for applying `ray-job.interactive-mode.yaml`
Issue -
State: closed - Opened by CheyuWu about 2 months ago
Labels: bug, triage
#3777 - [cherry-pick][doc] Improve APIServer v2 doc (#3773)
Pull Request -
State: closed - Opened by kevin85421 about 2 months ago
#3776 - [Feature] gcsFaultToleranceOptions spec support in RayCluster Helm Chart
Issue -
State: open - Opened by tplass-ias about 2 months ago
Labels: enhancement, triage
#3775 - [kubectl-plugin] Use a more Golang-native approach to retrieve the CR status for testing
Pull Request -
State: open - Opened by machichima about 2 months ago
#3774 - [Release] Reset ray-operator version in root go.mod to v0.0.0
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
- 1 comment
#3773 - [doc] Improve APIServer v2 doc
Pull Request -
State: closed - Opened by kevin85421 about 2 months ago
Labels: release-blocker, 1.4.0
#3772 - [cherry-pick] Cherry-pick #3771 into release-1.4
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
#3771 - Revert "Fix issue where unescaped semicolons caused task execution failures. (#3691)"
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
- 2 comments
#3769 - [scheduler-plugins] Support second scheduler mode
Issue -
State: open - Opened by kevin85421 about 2 months ago
- 1 comment
Labels: enhancement, raycluster, 1.5.0, scheduling
#3764 - [Bug] `ray-job.use-existing-raycluster.yaml` entrypoint error
Issue -
State: closed - Opened by MortalHappiness about 2 months ago
- 8 comments
Labels: bug, operator, P0, release-blocker, 1.4.0
#3763 - [Doc] Reference helm chart version in `helm-chart/kuberay-operator/README.md.gotmpl` with go template
Pull Request -
State: closed - Opened by MortalHappiness about 2 months ago
- 1 comment
#3746 - [scheduler-plugins] Kuberay should pass client when call batchscheduler.New()
Issue -
State: open - Opened by KunWuLuan about 2 months ago
- 2 comments
Labels: raycluster, 1.5.0, scheduling
#3741 - [Bug] ENABLE_RAY_HEAD_CLUSTER_IP_SERVICE set true does not create Ray Head with CLUSTER-IP set as expected
Issue -
State: closed - Opened by amholler about 2 months ago
- 1 comment
Labels: bug, triage
#3740 - [RayService][Test] create curl pod waiting until running
Pull Request -
State: open - Opened by fscnick about 2 months ago
#3736 - test: enable upgrade to image built from source
Pull Request -
State: open - Opened by pawelpaszki about 2 months ago
- 2 comments
#3731 - [RayJob] Support deletion policies based on job status
Pull Request -
State: closed - Opened by weizhaowz 2 months ago
- 8 comments
#3714 - [Feature] Update RayJob DeletionPolicy API to differentiate success/failure scenarios
Issue -
State: open - Opened by andrewsykim 2 months ago
- 14 comments
Labels: enhancement, triage
#3703 - [Helm] Add priorityClassName for kuberay-operator chart
Pull Request -
State: open - Opened by win5923 2 months ago
- 1 comment
#3659 - [Feature] Provide a better experience to manage the KubeRay with ArgoCD and GitOps
Issue -
State: open - Opened by cmontemuino 2 months ago
- 12 comments
Labels: enhancement, triage, 1.5.0
#3643 - [Grafana] Allow auto-load dashboard jsons
Pull Request -
State: open - Opened by owenowenisme 2 months ago
- 1 comment
#3642 - [ray-operator][Bug] Rayjob is Failed or Succeed, but Raycluster status(jobDeploymentStatus) is still Running(#3553)
Pull Request -
State: open - Opened by dushulin 2 months ago
#3641 - [Feature] Support for Volcano Network Topology Aware Scheduling
Issue -
State: open - Opened by thaison1496 2 months ago
Labels: enhancement, triage
#3640 - Single go.mod file
Pull Request -
State: open - Opened by troychiu 2 months ago
#3639 - [Feature][Kubectl-plugin] Make flags override yaml file for kubectl ray job submit
Issue -
State: open - Opened by MortalHappiness 2 months ago
- 2 comments
Labels: cli
#3638 - [Feature] RayJob labels should be propagated to job and pod
Issue -
State: open - Opened by anson627 2 months ago
Labels: enhancement, triage
#3637 - [Feature] Propagate labels from RayJob to submitter Kubernetes job
Pull Request -
State: open - Opened by anson627 2 months ago
#3636 - test: reduce requests in sample ray service yaml config
Pull Request -
State: closed - Opened by pawelpaszki 2 months ago
- 1 comment
#3635 - [SLI Metrics] Add metric kuberay_cluster_condition_provisioned
Pull Request -
State: open - Opened by win5923 2 months ago
- 1 comment
#3634 - [Bug] KubeRay not compatible with CUDA MPS of NVIDIA device plugin for Kubernetes
Issue -
State: open - Opened by PaoPaoYue 2 months ago
Labels: bug, triage
#3633 - Use go workspace
Pull Request -
State: closed - Opened by troychiu 2 months ago
#3632 - Bump the kubernetes group across 3 directories with 10 updates
Pull Request -
State: open - Opened by dependabot[bot] 2 months ago
Labels: go, dependencies
#3631 - Bump the google-golang group across 3 directories with 1 update
Pull Request -
State: open - Opened by dependabot[bot] 2 months ago
Labels: go, dependencies
#3630 - [RayCluster] add annotation to enable non-login bash
Pull Request -
State: open - Opened by fscnick 3 months ago
- 5 comments
#3629 - [Test][Autoscaler] Add an E2E test for CPU tasks on GPU nodes.
Pull Request -
State: closed - Opened by LeoLiao123 3 months ago
- 2 comments
#3628 - [Doc][CI] Align K8s version in Doc and CI with minimal required version
Pull Request -
State: closed - Opened by kenchung285 3 months ago
- 1 comment
#3627 - [Feature] [kubectl-plugin] Expose setting `shutdownAfterJobFinishes` and `ttlSecondsAfterFinished` in ray job submit
Pull Request -
State: open - Opened by CheyuWu 3 months ago
- 2 comments
#3626 - [kubectl-plugin] Use a more Golang-native approach to retrieve the CR status for testing
Issue -
State: open - Opened by kevin85421 3 months ago
- 1 comment
Labels: enhancement, cli, 1.4.0
#3625 - [Feature] Support node selector for running kuberay-operator via helm chart
Issue -
State: open - Opened by pkgajulapalli 3 months ago
Labels: enhancement, triage
#3624 - [kubectl-plugin] Handle multiple jobs in `ray job list`
Issue -
State: open - Opened by LeoLiao123 3 months ago
Labels: enhancement, triage
#3623 - [Test][Autoscaler] Add an E2E test for updating maxReplicas on a worker group
Pull Request -
State: open - Opened by machichima 3 months ago
- 4 comments