Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / skypilot-org/skypilot issues and pull requests
#4268 - [k8s] Jobs controller on stale context needs better error messages
Issue -
State: open - Opened by romilbhardwaj 10 days ago
- 2 comments
#4267 - [jobs] autodown managed job clusters
Pull Request -
State: closed - Opened by cg505 10 days ago
#4266 - [k8s] Add `lsof` to k8s base image
Issue -
State: closed - Opened by romilbhardwaj 11 days ago
- 1 comment
#4265 - runpod 4090 spot not available
Issue -
State: open - Opened by alita-moore 11 days ago
- 4 comments
#4264 - [Core] Avoid PENDING job to be set to FAILED and speed up job scheduling
Pull Request -
State: closed - Opened by Michaelvll 11 days ago
#4263 - [Core] Submitting 1000 jobs to a cluster
Issue -
State: closed - Opened by Michaelvll 11 days ago
- 1 comment
Labels: P0
#4262 - [docs]: OCI key_file path clarification
Pull Request -
State: closed - Opened by HysunHe 12 days ago
- 2 comments
#4261 - [k8s] Add flag to disable ssh setup
Pull Request -
State: closed - Opened by romilbhardwaj 12 days ago
- 2 comments
#4260 - [Core] Ray job refused to submit jobs in PENDING status
Issue -
State: open - Opened by Michaelvll 12 days ago
Labels: P0
#4259 - Linting updates
Pull Request -
State: open - Opened by andylizf 12 days ago
- 1 comment
#4258 - Add a pre commit config to help format before pushing
Pull Request -
State: open - Opened by zpoint 12 days ago
- 5 comments
#4257 - [Jobs] Allowing to specify intermediate bucket for file upload
Pull Request -
State: open - Opened by zpoint 12 days ago
- 3 comments
#4256 - Add Envoy as an alternative Sky Serve load balancer implementation
Pull Request -
State: open - Opened by ejj 12 days ago
#4255 - Implement Automatic Bucket Creation and Data Transfer in `with_data` API
Issue -
State: open - Opened by andylizf 13 days ago
- 1 comment
#4254 - Implement `with_data` API for Edge-Based Data Flow in Task DAGs
Issue -
State: open - Opened by andylizf 13 days ago
#4253 - [Dashboard] Add a simple status filter.
Pull Request -
State: closed - Opened by concretevitamin 13 days ago
#4252 - [AWS] Disable additional auto update services for ubuntu image with cloud-init
Pull Request -
State: closed - Opened by Michaelvll 13 days ago
#4251 - SSH Agent forwarding not working for `run` section
Issue -
State: open - Opened by chris-aeviator 13 days ago
- 1 comment
#4250 - Bug: `stream_logs_by_id` incorrectly handles task retry logic
Issue -
State: open - Opened by andylizf 14 days ago
- 2 comments
#4249 - Refactor `stream_logs_by_id` to extract single task monitoring logic
Pull Request -
State: open - Opened by andylizf 14 days ago
- 1 comment
#4248 - [Jobs] Limit number of concurrent jobs & launches.
Pull Request -
State: open - Opened by cblmemo 14 days ago
#4247 - do not redirect stderr to /dev/null when submitting job
Pull Request -
State: closed - Opened by cg505 15 days ago
- 2 comments
#4246 - Disable more potential unattended upgrade sources for AWS
Pull Request -
State: closed - Opened by yika-luo 15 days ago
#4245 - [Jobs] A way to keep the managed job for a while after user program failure
Issue -
State: open - Opened by Michaelvll 15 days ago
- 1 comment
Labels: P0, triage
#4244 - [AWS/Azure] Avoid error out during image size check
Pull Request -
State: closed - Opened by Michaelvll 15 days ago
#4243 - [Jobs] Managed job controller process taking too much memory during peak time
Issue -
State: open - Opened by Michaelvll 15 days ago
- 1 comment
Labels: P0
#4242 - [k8s] pod resource limit
Issue -
State: closed - Opened by bgyoon 15 days ago
#4241 - [UX] Support --tail parameter for sky logs
Pull Request -
State: closed - Opened by zpoint 15 days ago
- 3 comments
#4240 - [k8s] Parallelize setup for faster multi-node provisioning
Pull Request -
State: closed - Opened by romilbhardwaj 15 days ago
- 2 comments
#4239 - [Storage] Avoid opt-in regions for S3
Pull Request -
State: closed - Opened by romilbhardwaj 16 days ago
#4238 - [tests] Exclude runpod from smoke tests unless specified
Pull Request -
State: closed - Opened by romilbhardwaj 16 days ago
#4237 - [Core] dpkg lock showing up with AWS custom ubuntu image
Issue -
State: closed - Opened by Michaelvll 16 days ago
- 1 comment
Labels: P0
#4236 - [Tests] `test_tpu_vm_pod` failing on master
Issue -
State: open - Opened by romilbhardwaj 16 days ago
#4235 - cannot run `sky jobs logs -n <job_name>` on SUCCEEDED job
Issue -
State: open - Opened by cg505 16 days ago
#4234 - [Managed Jobs] Reduce the resource requirement for the controller process for more parallel jobs
Issue -
State: open - Opened by Michaelvll 16 days ago
#4233 - [k8s] Prevent mounting of /dev/shm in pods
Issue -
State: open - Opened by roclark 16 days ago
- 1 comment
#4232 - [Core/UX] Improve the display of returncode for multi-node
Issue -
State: open - Opened by Michaelvll 16 days ago
Labels: P0
#4231 - [ux] add sky jobs launch --fast
Pull Request -
State: closed - Opened by cg505 16 days ago
- 1 comment
#4230 - [UX] Show 0.25 on controller queue
Pull Request -
State: closed - Opened by Michaelvll 16 days ago
- 2 comments
#4229 - [k8s] Parallelize pod initialization steps
Issue -
State: open - Opened by romilbhardwaj 16 days ago
#4228 - [Release] Release 0.7.0
Pull Request -
State: closed - Opened by romilbhardwaj 16 days ago
- 2 comments
#4227 - [Core] Make home address replacement more robust
Pull Request -
State: closed - Opened by Michaelvll 16 days ago
#4225 - [k8s] Skip SSH setup for faster provisioning
Issue -
State: open - Opened by romilbhardwaj 16 days ago
#4224 - Update K8s docker image build and the source artifact registry
Pull Request -
State: closed - Opened by yika-luo 17 days ago
- 1 comment
#4223 - fix docstring for write_cluster_config
Pull Request -
State: closed - Opened by cg505 17 days ago
#4222 - [UX] `sky logs` should be able to tail the last lines of the logs instead of showing all logs
Issue -
State: closed - Opened by Michaelvll 17 days ago
- 1 comment
Labels: good first issue, P0
#4221 - [Docs] Tpu v6 docs
Pull Request -
State: closed - Opened by Michaelvll 17 days ago
#4220 - [Core] Support TPU v6
Pull Request -
State: closed - Opened by cblmemo 17 days ago
#4219 - Add user toolkits to all sky custom images and fix PyTorch issue on A10
Pull Request -
State: closed - Opened by yika-luo 17 days ago
#4218 - [Catalog] Add TPU V6e.
Pull Request -
State: closed - Opened by cblmemo 17 days ago
- 1 comment
#4217 - [test] smoke test fixes for managed jobs
Pull Request -
State: closed - Opened by cg505 17 days ago
#4216 - [Tests] Fix public bucket tests
Pull Request -
State: closed - Opened by romilbhardwaj 17 days ago
- 2 comments
#4215 - [TPU] TPU v6 support
Pull Request -
State: closed - Opened by Michaelvll 17 days ago
- 1 comment
#4214 - [Tests] Add test for `max_restarts_on_errors`
Issue -
State: open - Opened by Michaelvll 17 days ago
#4213 - [Jobs] Fix jobs name
Pull Request -
State: closed - Opened by Michaelvll 17 days ago
- 1 comment
#4212 - Mitigating the Impact of Pylint's Inherent Limitations on Functionality of `format.sh`
Pull Request -
State: open - Opened by root-hbx 17 days ago
- 13 comments
#4211 - [Tests] Managed Jobs smoke test failed on latest master
Issue -
State: closed - Opened by cblmemo 17 days ago
#4210 - [UI] Ads on the SkyPilot documentation page
Issue -
State: open - Opened by MaoZiming 17 days ago
- 1 comment
#4209 - [Core] Fix issue with the wrong path of setup logs
Pull Request -
State: closed - Opened by Michaelvll 17 days ago
#4208 - [k8s] Fix show-gpus when limited permissions are available
Pull Request -
State: closed - Opened by romilbhardwaj 18 days ago
- 4 comments
#4207 - [Jobs] Support syncing down logs for `sky jobs logs`
Pull Request -
State: open - Opened by euclidgame 18 days ago
#4206 - [k8s] Add validation for `pod_config`
Issue -
State: open - Opened by romilbhardwaj 18 days ago
#4205 - [Performance] Speed up Azure A10 instance creation
Pull Request -
State: closed - Opened by yika-luo 18 days ago
- 1 comment
#4204 - Upgrade Azure SDK version requirement
Pull Request -
State: closed - Opened by yika-luo 18 days ago
#4203 - Update packer scripts
Pull Request -
State: closed - Opened by yika-luo 18 days ago
#4202 - [Azure] Update azure dependencies in setup.py
Issue -
State: closed - Opened by romilbhardwaj 18 days ago
- 1 comment
#4201 - [serve] fix aws s3 sync in other regions
Pull Request -
State: closed - Opened by cg505 19 days ago
- 1 comment
#4200 - [ux] re-provision cluster if --fast but skypilot wheel is outdated
Pull Request -
State: closed - Opened by cg505 19 days ago
#4199 - [UX] Better logging when user program OOM'ed
Issue -
State: closed - Opened by Michaelvll 19 days ago
- 1 comment
Labels: P0
#4198 - [UX] Improve Formatting of Post Job Creation Logs
Pull Request -
State: closed - Opened by andylizf 19 days ago
- 4 comments
#4197 - Skypilot only wants to spawn 4 core cpu controller when sky serve up
Issue -
State: open - Opened by mainey 19 days ago
- 3 comments
#4196 - Remove outdated pylint disabling comments
Pull Request -
State: closed - Opened by andylizf 20 days ago
- 1 comment
#4195 - [Jobs DAG] Flexible DAG Workflow Job Cancellation Policy
Issue -
State: open - Opened by andylizf 20 days ago
- 3 comments
#4194 - [Jobs] Fix `is_chain` to check in- and out-degrees
Pull Request -
State: closed - Opened by andylizf 21 days ago
#4193 - [Core] Fix job race condition.
Pull Request -
State: closed - Opened by cblmemo 21 days ago
- 7 comments
#4192 - [Core][Tests] Several smoke test failed on latest master
Issue -
State: open - Opened by cblmemo 21 days ago
- 5 comments
#4191 - [k8s] Inconsistent Display in Setting up a Local Cluster
Issue -
State: closed - Opened by root-hbx 21 days ago
- 8 comments
#4190 - [k8s] Requesting `--cpus 1.5` and starting a user Ray program crashes
Issue -
State: open - Opened by concretevitamin 21 days ago
Labels: k8s
#4189 - Switching to AWS account with insufficient permissions with AWS enabled crashes `sky launch`
Issue -
State: open - Opened by concretevitamin 21 days ago
Labels: good first issue, friction-log, interface/ux
#4188 - [k8s] Support in-cluster and kubeconfig auth simultaneously
Pull Request -
State: open - Opened by romilbhardwaj 21 days ago
#4187 - Bug: `is_chain` function inaccurately detects chain structure
Issue -
State: closed - Opened by andylizf 21 days ago
- 1 comment
#4186 - [Job] Support DAG execution by replacing `is_chain` with `is_dag` check
Pull Request -
State: closed - Opened by andylizf 21 days ago
- 1 comment
#4185 - [Jobs] Refactor: Extract task failure state update helper
Pull Request -
State: closed - Opened by andylizf 21 days ago
- 1 comment
#4184 - [dev] restrict pylint to changed files
Pull Request -
State: closed - Opened by cg505 22 days ago
- 1 comment
#4183 - Minor: Jobs docs fix.
Pull Request -
State: closed - Opened by concretevitamin 22 days ago
#4182 - [test] update default clouds for smoke tests
Pull Request -
State: closed - Opened by cg505 22 days ago
#4181 - [Tests] Add unit test to avoid SkyPilot from failing when only some clouds are enabled
Issue -
State: open - Opened by Michaelvll 22 days ago
#4180 - [OCI] lazy import
Pull Request -
State: closed - Opened by asaiacai 22 days ago
- 2 comments
#4179 - [Core] No oci installation abort the `sky launch`
Issue -
State: closed - Opened by cblmemo 22 days ago
- 1 comment
Labels: P0
#4178 - Fix OCI import issue
Pull Request -
State: closed - Opened by yika-luo 22 days ago
#4177 - [Docs] Update Managed Jobs page.
Pull Request -
State: closed - Opened by concretevitamin 22 days ago
- 1 comment
#4176 - [k8s] Add retry for apparmor failures
Pull Request -
State: closed - Opened by romilbhardwaj 23 days ago
#4175 - [Core] Remove backward compatibility code for 0.6.0 & 0.7.0
Pull Request -
State: closed - Opened by cblmemo 23 days ago
- 12 comments
#4174 - [k8s] Using AppArmor causes provisioning failures on certain k8s clusters
Issue -
State: closed - Opened by romilbhardwaj 23 days ago
#4173 - [UX] remove all uses of deprecated `sky jobs`
Pull Request -
State: closed - Opened by cg505 23 days ago
- 1 comment
#4172 - [Test] `tests/unit_tests/test_controller_utils.py` fails when controller resources is set
Issue -
State: open - Opened by cblmemo 23 days ago
#4171 - [UX] remove deprecated `sky spot` CLI and `sky.spot_xxx` API
Pull Request -
State: open - Opened by cg505 23 days ago
#4170 - [k8s][gke][dws] autodown not toggled if file sync fails
Issue -
State: open - Opened by asaiacai 23 days ago
#4169 - [Jobs] Add option to specify `max_restarts_on_errors`
Pull Request -
State: closed - Opened by Michaelvll 23 days ago
#4168 - Remove --system-site-packages when setup sky cluster
Pull Request -
State: closed - Opened by yika-luo 23 days ago