Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / skypilot-org/skypilot issues and pull requests

#4267 - [jobs] autodown managed job clusters

Pull Request - State: closed - Opened by cg505 10 days ago

#4266 - [k8s] Add `lsof` to k8s base image

Issue - State: closed - Opened by romilbhardwaj 11 days ago - 1 comment

#4265 - runpod 4090 spot not available

Issue - State: open - Opened by alita-moore 11 days ago - 4 comments

#4263 - [Core] Submitting 1000 jobs to a cluster

Issue - State: closed - Opened by Michaelvll 11 days ago - 1 comment
Labels: P0

#4262 - [docs]: OCI key_file path clarification

Pull Request - State: closed - Opened by HysunHe 12 days ago - 2 comments

#4261 - [k8s] Add flag to disable ssh setup

Pull Request - State: closed - Opened by romilbhardwaj 12 days ago - 2 comments

#4260 - [Core] Ray job refused to submit jobs in PENDING status

Issue - State: open - Opened by Michaelvll 12 days ago
Labels: P0

#4259 - Linting updates

Pull Request - State: open - Opened by andylizf 12 days ago - 1 comment

#4258 - Add a pre commit config to help format before pushing

Pull Request - State: open - Opened by zpoint 12 days ago - 5 comments

#4257 - [Jobs] Allowing to specify intermediate bucket for file upload

Pull Request - State: open - Opened by zpoint 12 days ago - 3 comments

#4253 - [Dashboard] Add a simple status filter.

Pull Request - State: closed - Opened by concretevitamin 13 days ago

#4251 - SSH Agent forwarding not working for `run` section

Issue - State: open - Opened by chris-aeviator 13 days ago - 1 comment

#4250 - Bug: `stream_logs_by_id` incorrectly handles task retry logic

Issue - State: open - Opened by andylizf 14 days ago - 2 comments

#4249 - Refactor `stream_logs_by_id` to extract single task monitoring logic

Pull Request - State: open - Opened by andylizf 14 days ago - 1 comment

#4248 - [Jobs] Limit number of concurrent jobs & launches.

Pull Request - State: open - Opened by cblmemo 14 days ago

#4247 - do not redirect stderr to /dev/null when submitting job

Pull Request - State: closed - Opened by cg505 15 days ago - 2 comments

#4246 - Disable more potential unattended upgrade sources for AWS

Pull Request - State: closed - Opened by yika-luo 15 days ago

#4245 - [Jobs] A way to keep the managed job for a while after user program failure

Issue - State: open - Opened by Michaelvll 15 days ago - 1 comment
Labels: P0, triage

#4244 - [AWS/Azure] Avoid error out during image size check

Pull Request - State: closed - Opened by Michaelvll 15 days ago

#4243 - [Jobs] Managed job controller process taking too much memory during peak time

Issue - State: open - Opened by Michaelvll 15 days ago - 1 comment
Labels: P0

#4242 - [k8s] pod resource limit

Issue - State: closed - Opened by bgyoon 15 days ago

#4241 - [UX] Support --tail parameter for sky logs

Pull Request - State: closed - Opened by zpoint 15 days ago - 3 comments

#4240 - [k8s] Parallelize setup for faster multi-node provisioning

Pull Request - State: closed - Opened by romilbhardwaj 15 days ago - 2 comments

#4239 - [Storage] Avoid opt-in regions for S3

Pull Request - State: closed - Opened by romilbhardwaj 16 days ago

#4238 - [tests] Exclude runpod from smoke tests unless specified

Pull Request - State: closed - Opened by romilbhardwaj 16 days ago

#4237 - [Core] dpkg lock showing up with AWS custom ubuntu image

Issue - State: closed - Opened by Michaelvll 16 days ago - 1 comment
Labels: P0

#4236 - [Tests] `test_tpu_vm_pod` failing on master

Issue - State: open - Opened by romilbhardwaj 16 days ago

#4233 - [k8s] Prevent mounting of /dev/shm in pods

Issue - State: open - Opened by roclark 16 days ago - 1 comment

#4232 - [Core/UX] Improve the display of returncode for multi-node

Issue - State: open - Opened by Michaelvll 16 days ago
Labels: P0

#4231 - [ux] add sky jobs launch --fast

Pull Request - State: closed - Opened by cg505 16 days ago - 1 comment

#4230 - [UX] Show 0.25 on controller queue

Pull Request - State: closed - Opened by Michaelvll 16 days ago - 2 comments

#4229 - [k8s] Parallelize pod initialization steps

Issue - State: open - Opened by romilbhardwaj 16 days ago

#4228 - [Release] Release 0.7.0

Pull Request - State: closed - Opened by romilbhardwaj 16 days ago - 2 comments

#4227 - [Core] Make home address replacement more robust

Pull Request - State: closed - Opened by Michaelvll 16 days ago

#4225 - [k8s] Skip SSH setup for faster provisioning

Issue - State: open - Opened by romilbhardwaj 16 days ago

#4224 - Update K8s docker image build and the source artifact registry

Pull Request - State: closed - Opened by yika-luo 17 days ago - 1 comment

#4223 - fix docstring for write_cluster_config

Pull Request - State: closed - Opened by cg505 17 days ago

#4222 - [UX] `sky logs` should be able to tail the last lines of the logs instead of showing all logs

Issue - State: closed - Opened by Michaelvll 17 days ago - 1 comment
Labels: good first issue, P0

#4221 - [Docs] Tpu v6 docs

Pull Request - State: closed - Opened by Michaelvll 17 days ago

#4220 - [Core] Support TPU v6

Pull Request - State: closed - Opened by cblmemo 17 days ago

#4218 - [Catalog] Add TPU V6e.

Pull Request - State: closed - Opened by cblmemo 17 days ago - 1 comment

#4217 - [test] smoke test fixes for managed jobs

Pull Request - State: closed - Opened by cg505 17 days ago

#4216 - [Tests] Fix public bucket tests

Pull Request - State: closed - Opened by romilbhardwaj 17 days ago - 2 comments

#4215 - [TPU] TPU v6 support

Pull Request - State: closed - Opened by Michaelvll 17 days ago - 1 comment

#4214 - [Tests] Add test for `max_restarts_on_errors`

Issue - State: open - Opened by Michaelvll 17 days ago

#4213 - [Jobs] Fix jobs name

Pull Request - State: closed - Opened by Michaelvll 17 days ago - 1 comment

#4210 - [UI] Ads on the SkyPilot documentation page

Issue - State: open - Opened by MaoZiming 17 days ago - 1 comment

#4209 - [Core] Fix issue with the wrong path of setup logs

Pull Request - State: closed - Opened by Michaelvll 17 days ago

#4208 - [k8s] Fix show-gpus when limited permissions are available

Pull Request - State: closed - Opened by romilbhardwaj 18 days ago - 4 comments

#4207 - [Jobs] Support syncing down logs for `sky jobs logs`

Pull Request - State: open - Opened by euclidgame 18 days ago

#4206 - [k8s] Add validation for `pod_config`

Issue - State: open - Opened by romilbhardwaj 18 days ago

#4205 - [Performance] Speed up Azure A10 instance creation

Pull Request - State: closed - Opened by yika-luo 18 days ago - 1 comment

#4204 - Upgrade Azure SDK version requirement

Pull Request - State: closed - Opened by yika-luo 18 days ago

#4203 - Update packer scripts

Pull Request - State: closed - Opened by yika-luo 18 days ago

#4202 - [Azure] Update azure dependencies in setup.py

Issue - State: closed - Opened by romilbhardwaj 18 days ago - 1 comment

#4201 - [serve] fix aws s3 sync in other regions

Pull Request - State: closed - Opened by cg505 19 days ago - 1 comment

#4200 - [ux] re-provision cluster if --fast but skypilot wheel is outdated

Pull Request - State: closed - Opened by cg505 19 days ago

#4199 - [UX] Better logging when user program OOM'ed

Issue - State: closed - Opened by Michaelvll 19 days ago - 1 comment
Labels: P0

#4198 - [UX] Improve Formatting of Post Job Creation Logs

Pull Request - State: closed - Opened by andylizf 19 days ago - 4 comments

#4197 - Skypilot only wants to spawn 4 core cpu controller when sky serve up

Issue - State: open - Opened by mainey 19 days ago - 3 comments

#4196 - Remove outdated pylint disabling comments

Pull Request - State: closed - Opened by andylizf 20 days ago - 1 comment

#4195 - [Jobs DAG] Flexible DAG Workflow Job Cancellation Policy

Issue - State: open - Opened by andylizf 20 days ago - 3 comments

#4194 - [Jobs] Fix `is_chain` to check in- and out-degrees

Pull Request - State: closed - Opened by andylizf 21 days ago

#4193 - [Core] Fix job race condition.

Pull Request - State: closed - Opened by cblmemo 21 days ago - 7 comments

#4192 - [Core][Tests] Several smoke test failed on latest master

Issue - State: open - Opened by cblmemo 21 days ago - 5 comments

#4191 - [k8s] Inconsistent Display in Setting up a Local Cluster

Issue - State: closed - Opened by root-hbx 21 days ago - 8 comments

#4189 - Switching to AWS account with insufficient permissions with AWS enabled crashes `sky launch`

Issue - State: open - Opened by concretevitamin 21 days ago
Labels: good first issue, friction-log, interface/ux

#4187 - Bug: `is_chain` function inaccurately detects chain structure

Issue - State: closed - Opened by andylizf 21 days ago - 1 comment

#4186 - [Job] Support DAG execution by replacing `is_chain` with `is_dag` check

Pull Request - State: closed - Opened by andylizf 21 days ago - 1 comment

#4185 - [Jobs] Refactor: Extract task failure state update helper

Pull Request - State: closed - Opened by andylizf 21 days ago - 1 comment

#4184 - [dev] restrict pylint to changed files

Pull Request - State: closed - Opened by cg505 22 days ago - 1 comment

#4183 - Minor: Jobs docs fix.

Pull Request - State: closed - Opened by concretevitamin 22 days ago

#4182 - [test] update default clouds for smoke tests

Pull Request - State: closed - Opened by cg505 22 days ago

#4180 - [OCI] lazy import

Pull Request - State: closed - Opened by asaiacai 22 days ago - 2 comments

#4179 - [Core] No oci installation abort the `sky launch`

Issue - State: closed - Opened by cblmemo 22 days ago - 1 comment
Labels: P0

#4178 - Fix OCI import issue

Pull Request - State: closed - Opened by yika-luo 22 days ago

#4177 - [Docs] Update Managed Jobs page.

Pull Request - State: closed - Opened by concretevitamin 22 days ago - 1 comment

#4176 - [k8s] Add retry for apparmor failures

Pull Request - State: closed - Opened by romilbhardwaj 23 days ago

#4175 - [Core] Remove backward compatibility code for 0.6.0 & 0.7.0

Pull Request - State: closed - Opened by cblmemo 23 days ago - 12 comments

#4173 - [UX] remove all uses of deprecated `sky jobs`

Pull Request - State: closed - Opened by cg505 23 days ago - 1 comment

#4171 - [UX] remove deprecated `sky spot` CLI and `sky.spot_xxx` API

Pull Request - State: open - Opened by cg505 23 days ago

#4169 - [Jobs] Add option to specify `max_restarts_on_errors`

Pull Request - State: closed - Opened by Michaelvll 23 days ago

#4168 - Remove --system-site-packages when setup sky cluster

Pull Request - State: closed - Opened by yika-luo 23 days ago