Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / skypilot-org/skypilot issues and pull requests
#4369 - Mount cached mode
Pull Request -
State: open - Opened by landscapepainter 1 day ago
#4368 - [Serve] Feature request: support num_nodes for the Controller
Issue -
State: open - Opened by HysunHe 1 day ago
#4367 - [Core] NoCloudAccessError check is escaped from storage sync
Issue -
State: open - Opened by HysunHe 1 day ago
- 1 comment
#4366 - [Core] NoCloudAccessError check is escaped from storage sync
Pull Request -
State: closed - Opened by HysunHe 1 day ago
#4365 - Preliminary Vast AI support
Pull Request -
State: open - Opened by kristopolous 1 day ago
#4364 - [DAG] Run global optimization on controller for task placement
Pull Request -
State: closed - Opened by andylizf 2 days ago
- 1 comment
#4363 - [Core] Environment variables should be parsed at task execution, not `sky.Task` instantiation
Issue -
State: open - Opened by romilbhardwaj 2 days ago
#4362 - [WIP][Serve] Enable launching multiple external LB on controller.
Pull Request -
State: open - Opened by cblmemo 2 days ago
#4361 - [Docs] Fix some issues with Managed Jobs example.
Pull Request -
State: open - Opened by concretevitamin 2 days ago
#4360 - GCS file mount sync hangs if GCP credentials are expired
Issue -
State: open - Opened by cg505 2 days ago
#4359 - [FluidStack] Fix provisioning and add new gpu types
Pull Request -
State: open - Opened by mjibril 2 days ago
#4358 - [Jobs] Remove assertion for one single controller resources.
Pull Request -
State: closed - Opened by cblmemo 2 days ago
#4357 - [k8s] fix managed job issue on k8s
Pull Request -
State: open - Opened by nkwangleiGIT 2 days ago
#4356 - [Serve] Enable multiple ports in SkyServe replicas
Pull Request -
State: open - Opened by Conless 3 days ago
- 1 comment
#4355 - [Core] Unblock user program for SIGINT
Issue -
State: open - Opened by Michaelvll 3 days ago
Labels: triage
#4354 - [Docs] resize image and move path up a level.
Pull Request -
State: closed - Opened by concretevitamin 3 days ago
#4353 - decorated functions are not properly typechecked
Issue -
State: open - Opened by cg505 3 days ago
#4352 - [Docs] Update k8s docs
Pull Request -
State: closed - Opened by romilbhardwaj 3 days ago
#4351 - remove empty file mount from yaml config
Issue -
State: open - Opened by cg505 3 days ago
#4350 - [AWS] Not robust identity checking
Issue -
State: closed - Opened by Michaelvll 3 days ago
- 1 comment
#4349 - [Serve] Failure-count based unrecoverable failure detection
Issue -
State: open - Opened by cblmemo 3 days ago
#4348 - [Serve] Fall back to latest ready version when detects unrecoverable failure
Issue -
State: open - Opened by cblmemo 3 days ago
#4347 - Added user agent string for catalog downloading request
Pull Request -
State: closed - Opened by shashank2000 3 days ago
#4346 - sky jobs launch on Kubernetes seems not working now
Issue -
State: open - Opened by nkwangleiGIT 3 days ago
- 13 comments
#4345 - Update `--env-file` to sky doc
Pull Request -
State: closed - Opened by zpoint 3 days ago
#4344 - Doesn't use right GCP config path on Windows
Issue -
State: open - Opened by alexkreidler 4 days ago
#4343 - [k8s] Leaked kubectl port-forward processes
Issue -
State: open - Opened by romilbhardwaj 4 days ago
#4342 - [Docs] Add a concept page.
Pull Request -
State: closed - Opened by concretevitamin 4 days ago
#4341 - [perf] optimizations for sky jobs launch
Pull Request -
State: open - Opened by cg505 4 days ago
#4340 - [timeline] disable trace collection if SKYPILOT_TIMELINE_FILE_PATH is not set
Issue -
State: open - Opened by cg505 4 days ago
#4339 - [Docs] Use `--fast` for job submission in tutorials
Pull Request -
State: open - Opened by Michaelvll 4 days ago
- 2 comments
#4338 - [OCI] Enable SkyServe for OCI
Pull Request -
State: closed - Opened by HysunHe 4 days ago
- 8 comments
#4337 - [k8s] support to use custom gpu resource name if it's not nvidia.com/gpu
Pull Request -
State: open - Opened by nkwangleiGIT 4 days ago
- 1 comment
#4336 - [UX] user-friendly message shown if Kubernetes is not enabled.
Pull Request -
State: open - Opened by zpoint 4 days ago
#4335 - [smoke] if --generic-cloud is set, force enable that cloud
Pull Request -
State: closed - Opened by cg505 5 days ago
#4334 - [Core] Importing `sky` and `sky.status(refresh=True)` takes about 65MB / 200MB memory
Issue -
State: open - Opened by Michaelvll 5 days ago
- 3 comments
#4333 - [Serve] Update log pattern in `_follow_replica_logs` for new UX 3.0
Pull Request -
State: closed - Opened by andylizf 5 days ago
- 2 comments
#4332 - [ux] cache cluster status of autostop or spot clusters for 2s
Pull Request -
State: open - Opened by cg505 5 days ago
#4331 - improve tracing reporting and coverage
Pull Request -
State: closed - Opened by cg505 5 days ago
#4330 - fix broken links when read the docs
Pull Request -
State: open - Opened by nkwangleiGIT 5 days ago
#4329 - [Serve] Temporary failure: infinite retry on GCP `compute.images.useReadOnly` permission error
Issue -
State: open - Opened by andylizf 5 days ago
#4328 - [fast] if cluster is INIT, force refresh before deciding to provision
Pull Request -
State: closed - Opened by cg505 5 days ago
#4327 - [Bug] Smoke tests `--generic-cloud` flag is ignored when specified cloud is not in `default_clouds_to_run`
Issue -
State: closed - Opened by andylizf 6 days ago
- 3 comments
#4326 - Add hourly price and instance type to env SKYPILOT_CLUSTER_INFO
Pull Request -
State: open - Opened by tylerweitzman 6 days ago
#4325 - [Tests] Fix smoke tests for new job creation log format
Pull Request -
State: closed - Opened by andylizf 6 days ago
#4324 - [Kubernates] Not user-friendly message shown if Kubernates is not enabled.
Issue -
State: closed - Opened by HysunHe 6 days ago
- 2 comments
#4323 - Refactor: Consolidate log streaming logic into centralized `log_utils.follow_logs()`
Pull Request -
State: closed - Opened by andylizf 6 days ago
- 1 comment
#4322 - [Catalog] fix GCP catalog missing SKUs
Pull Request -
State: closed - Opened by cblmemo 6 days ago
#4321 - [Jobs] Fast jobs cancellation for PENDING managed jobs
Pull Request -
State: open - Opened by Michaelvll 6 days ago
#4320 - [DAG] Integrate Data Storage Buckets for Data-Bearing Edges in Optimization
Pull Request -
State: open - Opened by euclidgame 7 days ago
- 4 comments
#4319 - [WIP] Advanced DAG Workflow.
Pull Request -
State: open - Opened by cblmemo 7 days ago
#4318 - [Core] Replace ray job submit for 3x/8.5x faster job scheduling for cluster/managed jobs
Pull Request -
State: closed - Opened by Michaelvll 7 days ago
- 10 comments
#4317 - [Storage] Call `sync_file_mounts` when either rsync or storage file_mounts are specified
Pull Request -
State: open - Opened by romilbhardwaj 7 days ago
#4316 - [docs][azure] Update config doc for azure resource group specification
Pull Request -
State: closed - Opened by landscapepainter 7 days ago
#4315 - [Storage] set_storage_mounts not working in python API
Issue -
State: open - Opened by romilbhardwaj 7 days ago
#4313 - [feature] the ability to recover skypilot data or commit to git
Issue -
State: open - Opened by alita-moore 7 days ago
#4312 - [feature] better handling of failed rollouts
Issue -
State: open - Opened by alita-moore 7 days ago
- 2 comments
#4311 - [Core] Allow more PENDING jobs to be scheduled concurrently (1.4x faster)
Pull Request -
State: open - Opened by Michaelvll 7 days ago
- 1 comment
#4310 - [Core] Avoid job scheduling race condition
Pull Request -
State: closed - Opened by Michaelvll 8 days ago
- 5 comments
#4309 - [DAG] Update Diamond Example For New Tentative Data API
Pull Request -
State: closed - Opened by andylizf 8 days ago
- 1 comment
#4308 - Flaky test: `test_optimizer_dryruns.py` occasionally fails
Issue -
State: closed - Opened by andylizf 8 days ago
- 1 comment
#4307 - [Core] Add `NO_UPLOAD` for `remote_identity`
Pull Request -
State: open - Opened by romilbhardwaj 8 days ago
#4306 - Custom benchmark for inference
Issue -
State: open - Opened by tylerweitzman 8 days ago
#4305 - [AWS] SSH issue when a large number of nodes are used in a cluster
Issue -
State: open - Opened by Michaelvll 8 days ago
Labels: triage
#4304 - [k8s] Remove `lsof` dependence for tailing logs
Pull Request -
State: closed - Opened by romilbhardwaj 8 days ago
#4303 - Fix AWS Route Table caching which causes invalid failures in other regions after an initial valid failure.
Pull Request -
State: closed - Opened by sfrolich 8 days ago
- 1 comment
#4302 - [Test] Fix unittest for region infer
Pull Request -
State: closed - Opened by Michaelvll 8 days ago
#4301 - sky serve update doesn't roll out updated service unless the yaml config changes
Issue -
State: open - Opened by alita-moore 8 days ago
- 2 comments
#4300 - [UX] Unnecessary logs from ray
Issue -
State: open - Opened by Michaelvll 8 days ago
- 1 comment
#4299 - [Jobs] Jobs launch --fast does not start the dashboard
Issue -
State: open - Opened by Michaelvll 8 days ago
- 1 comment
#4298 - Replace `len()` Zero Checks with Pythonic Empty Sequence Checks
Pull Request -
State: open - Opened by andylizf 9 days ago
- 1 comment
#4297 - [k8s] Parallelize multi-node setup
Pull Request -
State: closed - Opened by romilbhardwaj 9 days ago
#4296 - [Jobs] Cancelling managed jobs can take a long time
Issue -
State: open - Opened by Michaelvll 9 days ago
Labels: P0
#4295 - [Core] Speed up job scheduling speed on unmanaged jobs
Issue -
State: closed - Opened by Michaelvll 9 days ago
Labels: P0
#4294 - [Jobs] Speed up the time for managed jobs to be scheduled
Issue -
State: closed - Opened by Michaelvll 9 days ago
Labels: P0
#4293 - [Core] Cancel 1000 jobs can take 5-10 mins
Issue -
State: closed - Opened by Michaelvll 9 days ago
Labels: P0
#4292 - Refactor: Consolidate log streaming logic into centralized `log_utils.follow_logs()`
Pull Request -
State: closed - Opened by andylizf 9 days ago
- 2 comments
#4291 - Update Lambda Cloud regions
Pull Request -
State: closed - Opened by cbrownstein-lambda 9 days ago
#4290 - [Core] Make ssh connection more robust with custom proxy
Pull Request -
State: closed - Opened by Michaelvll 9 days ago
#4289 - make --fast robust against credential or wheel updates
Pull Request -
State: open - Opened by cg505 9 days ago
#4288 - [RunPod] Fix assertion in ports query.
Pull Request -
State: closed - Opened by cblmemo 9 days ago
#4287 - [Core][Docker] Support docker login on RunPod.
Pull Request -
State: open - Opened by cblmemo 9 days ago
- 1 comment
#4286 - AssertionError after manually deleting runpod instance
Issue -
State: closed - Opened by alita-moore 9 days ago
- 7 comments
#4285 - stuck at "STARTING" when launching with a custom image on runpod
Issue -
State: open - Opened by alita-moore 9 days ago
- 3 comments
#4284 - Support event based smoke test instead of sleep time based to reduce flaky test and faster test
Pull Request -
State: open - Opened by zpoint 9 days ago
#4283 - [Jobs] Managed jobs database use WAL mode
Pull Request -
State: closed - Opened by Michaelvll 9 days ago
#4281 - [AWS] Explicitly check credential and refresh if needed
Pull Request -
State: closed - Opened by Michaelvll 9 days ago
- 1 comment
#4280 - [DAG] Add Edge-Based Data Flow Support
Pull Request -
State: closed - Opened by andylizf 9 days ago
- 2 comments
#4279 - [DAG] Add DAG Visualization with Jupyter Support
Pull Request -
State: closed - Opened by andylizf 9 days ago
- 2 comments
#4278 - Set minimum port number a Ray worker can listen on to 11002
Pull Request -
State: closed - Opened by cbrownstein-lambda 9 days ago
#4277 - Add Basic Visualization Support for DAGs
Issue -
State: closed - Opened by andylizf 10 days ago
#4276 - [K8s] list_pod_for_all_namespaces gives ApiException: (403) if the user doesn't have necessary permissions
Issue -
State: open - Opened by hemildesai 10 days ago
- 3 comments
#4275 - [AWS] Credential retry for rotation is not effective
Issue -
State: open - Opened by Michaelvll 10 days ago
Labels: P0
#4274 - Fix `stream_logs` Duplicate Job Handling and TypeError
Pull Request -
State: closed - Opened by andylizf 10 days ago
- 1 comment
#4273 - Bug: `stream_logs` Fails Due to Incorrect Job ID Handling and Duplicate Job Names in Managed Jobs
Issue -
State: closed - Opened by andylizf 10 days ago
#4272 - Update comments pointing to Lambda's docs
Pull Request -
State: closed - Opened by cbrownstein-lambda 10 days ago
#4271 - [Admin Policy] Apply policy in CLI
Pull Request -
State: closed - Opened by Michaelvll 10 days ago
#4270 - [k8s] Fix check pod privileges
Pull Request -
State: closed - Opened by romilbhardwaj 10 days ago
- 1 comment
#4269 - runpod docker credentials not working when using image_id from private repository
Issue -
State: open - Opened by alita-moore 10 days ago
- 5 comments
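Each entry above follows the same plain-text layout: a `#number - title` line, an `Issue -` or `Pull Request -` line, a `State: ...` line, and optional `Labels:` and `- N comments` lines. The sketch below shows one way that layout could be parsed into records. It assumes only the layout visible on this page; `Entry` and `parse_listing` are hypothetical names for illustration and are not part of the Ecosyste.ms API.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# These patterns mirror the text layout of this page only (an assumption),
# not any documented Ecosyste.ms format.
HEADER_RE = re.compile(r"^#(\d+) - (.*)$")
STATE_RE = re.compile(r"^State: (open|closed) - Opened by (\S+) (.+ ago)$")
COMMENTS_RE = re.compile(r"^- (\d+) comments?$")


@dataclass
class Entry:
    """Hypothetical record for one listing entry."""
    number: int
    title: str
    kind: str = ""      # "Issue" or "Pull Request"
    state: str = ""     # "open" or "closed"
    author: str = ""
    opened: str = ""    # relative age as shown, e.g. "1 day ago"
    comments: int = 0   # stays 0 when no comment line is present
    labels: List[str] = field(default_factory=list)


def parse_listing(text: str) -> List[Entry]:
    """Parse the plain-text listing into Entry records, in page order."""
    entries: List[Entry] = []
    current: Optional[Entry] = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER_RE.match(line)
        if header:
            current = Entry(number=int(header.group(1)), title=header.group(2))
            entries.append(current)
            continue
        if current is None:
            continue  # page header lines that precede the first entry
        if line in ("Issue -", "Pull Request -"):
            current.kind = line[:-2]  # drop the trailing " -"
            continue
        state = STATE_RE.match(line)
        if state:
            current.state, current.author, current.opened = state.groups()
            continue
        comments = COMMENTS_RE.match(line)
        if comments:
            current.comments = int(comments.group(1))
            continue
        if line.startswith("Labels: "):
            current.labels = line[len("Labels: "):].split(", ")
    return entries


SAMPLE = """\
#4367 - [Core] NoCloudAccessError check is escaped from storage sync
Issue -
State: open - Opened by HysunHe 1 day ago
- 1 comment
#4355 - [Core] Unblock user program for SIGINT
Issue -
State: open - Opened by Michaelvll 3 days ago
Labels: triage
"""

if __name__ == "__main__":
    for entry in parse_listing(SAMPLE):
        print(entry.number, entry.kind, entry.state, entry.author, entry.comments)
```

Records parsed this way can then be filtered in ordinary Python, for example to keep only open issues labelled `P0`.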