Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / facebookincubator/submitit issues and pull requests
#1781 - Setting `env` in `CommandFunction` overrides PATH env var
Issue -
State: open - Opened by niansong1996 about 22 hours ago
#1780 - Gracefully exit rank != 0 job steps on slurm cluster
Pull Request -
State: open - Opened by erjel 18 days ago
Labels: CLA Signed
#1779 - Run in container
Issue -
State: closed - Opened by philmod-h about 1 month ago
- 1 comment
#1778 - Max jobs progressively
Pull Request -
State: closed - Opened by NickKocher about 2 months ago
- 1 comment
#1777 - Include email parameters in job submission
Issue -
State: open - Opened by nicolazilio about 2 months ago
#1776 - Determine if a job failed due to exceeding the time limit
Issue -
State: open - Opened by lee-jin-gyu96 2 months ago
#1775 - How to catch timed out array jobs
Issue -
State: closed - Opened by lee-jin-gyu96 2 months ago
- 2 comments
#1774 - Fix black
Pull Request -
State: closed - Opened by jrapin 2 months ago
Labels: CLA Signed
#1773 - Update CI packages
Pull Request -
State: closed - Opened by jrapin 2 months ago
Labels: CLA Signed
#1772 - ModuleNotFoundError: No module named "models"
Issue -
State: closed - Opened by victoris93 3 months ago
- 4 comments
#1771 - Keep original tmp slurm submission file as a hidden symlink
Pull Request -
State: closed - Opened by xman1979 3 months ago
- 1 comment
Labels: CLA Signed
#1770 - Consider PREEMPTED state as "not done"
Pull Request -
State: open - Opened by denizokt 3 months ago
- 1 comment
#1769 - Out Of Memory
Issue -
State: open - Opened by owaisCS 3 months ago
#1768 - Activate github actions
Pull Request -
State: closed - Opened by jrapin 4 months ago
Labels: CLA Signed
#1767 - Allow hiding extra env variables in clean_env()
Pull Request -
State: closed - Opened by baldassarreFe 4 months ago
- 1 comment
Labels: CLA Signed
#1766 - How to retrieve a running job and cancel it
Issue -
State: open - Opened by mzhang2FW 6 months ago
- 4 comments
#1765 - Failure of submitit 1.5.1 with python 3.12 because of missing pkg_resources
Issue -
State: open - Opened by GianlucaFicarelli 6 months ago
#1764 - Can I use torchrun with submitit?
Issue -
State: open - Opened by vasudev-sharma 9 months ago
- 1 comment
#1763 - Using `RsyncSnapshot` with an editable package install
Issue -
State: closed - Opened by jc-audet 9 months ago
- 2 comments
#1762 - Submitit jobs die with no error on cluster with SLURM 19.05
Issue -
State: open - Opened by mihdalal 10 months ago
- 1 comment
#1761 - Unexpected behavior of memory specification between `AutoExecutor` and `SlurmExecutor`
Issue -
State: open - Opened by mshvartsman 10 months ago
#1760 - Turn off Signal Handling
Issue -
State: open - Opened by lukasbm 11 months ago
- 2 comments
#1759 - Too many sacct requests for batched tasks
Issue -
State: open - Opened by Fadelis98 11 months ago
#1758 - Failed to launch: Invalid wckey specification
Issue -
State: open - Opened by rskwesterman 11 months ago
- 1 comment
#1757 - When 'submitit' meet 'mpirun', there will be a very strange BUG.
Issue -
State: closed - Opened by yinkaaiwu 12 months ago
- 3 comments
#1756 - Improving performance with NVidia GPU affinity?
Issue -
State: open - Opened by giorgos117 12 months ago
#1755 - Bump version to 1.5.1
Pull Request -
State: closed - Opened by jrapin about 1 year ago
Labels: CLA Signed
#1754 - Add optional setup step for local executor
Pull Request -
State: closed - Opened by jrapin about 1 year ago
- 1 comment
Labels: CLA Signed
#1752 - Update version to 1.5.0 (and drop support for Python 3.6 and 3.7)
Pull Request -
State: closed - Opened by jrapin about 1 year ago
Labels: CLA Signed
#1751 - Update pylint version
Pull Request -
State: closed - Opened by jrapin about 1 year ago
Labels: CLA Signed
#1750 - Enable python executable selection for local executor
Pull Request -
State: closed - Opened by jrapin about 1 year ago
- 3 comments
Labels: CLA Signed
#1749 - Update black and isort versions
Pull Request -
State: closed - Opened by jrapin about 1 year ago
- 1 comment
Labels: CLA Signed
#1748 - Does setting `folder` in `AutoExecutor` interfere with sattach?
Issue -
State: open - Opened by fleimgruber about 1 year ago
- 1 comment
#1747 - Update mypy (and CI python version to 3.8)
Pull Request -
State: closed - Opened by jrapin about 1 year ago
- 1 comment
Labels: CLA Signed
#1746 - Make local job instances picklable
Pull Request -
State: closed - Opened by jrapin about 1 year ago
Labels: CLA Signed
#1745 - Add an option for not using srun
Pull Request -
State: closed - Opened by jrapin about 1 year ago
- 1 comment
Labels: CLA Signed
#1744 - Add support for OAR Scheduler
Pull Request -
State: open - Opened by ychiat35 about 1 year ago
- 3 comments
Labels: CLA Signed
#1743 - Update version tag to 1.4.6 for release
Pull Request -
State: closed - Opened by jrapin about 1 year ago
Labels: CLA Signed
#1742 - timeout_min=0 results in pending jobs when a Slurm partition timelimit is set
Issue -
State: closed - Opened by ddangu525 about 1 year ago
#1741 - Support Slurm Heterogeneous Job
Issue -
State: open - Opened by sunshine-syz about 1 year ago
- 2 comments
#1740 - Add nodelist/mail params/dependency as first class slurm parameters
Pull Request -
State: closed - Opened by jrapin over 1 year ago
- 1 comment
Labels: CLA Signed
#1739 - Enabling sbatch file re-use.
Issue -
State: open - Opened by alexnwang over 1 year ago
- 2 comments
#1738 - AttributeError , AutoExecutor attribute not recognised by submitit
Issue -
State: closed - Opened by willianck over 1 year ago
- 1 comment
#1737 - Conda version out of date
Issue -
State: open - Opened by Ubadub over 1 year ago
- 1 comment
#1736 - array_parallelism on local machine
Issue -
State: closed - Opened by sparisi over 1 year ago
- 1 comment
#1735 - Documentation of `executor.update_parameters` arguments
Issue -
State: open - Opened by JoeZiminski over 1 year ago
- 1 comment
#1734 - Submitit with SLURM sub-scheduling
Issue -
State: open - Opened by giorgos117 over 1 year ago
- 2 comments
#1733 - Running on Galahad
Pull Request -
State: closed - Opened by mb010 over 1 year ago
- 1 comment
#1732 - Requeueing on timeouts when launching jobs with CommandFunction
Issue -
State: open - Opened by Niccolo-Ajroldi over 1 year ago
#1731 - SLURM Job keeps running after Successful Job Completion (Hydra Submitit Plugin)
Issue -
State: open - Opened by subho406 over 1 year ago
- 2 comments
#1730 - SLURM Jobs keep running after successful job completion.
Issue -
State: closed - Opened by subho406 over 1 year ago
#1729 - Add singularity compatibility #1608
Pull Request -
State: closed - Opened by gwenzek over 1 year ago
- 3 comments
Labels: CLA Signed
#1728 - Add custom options to sbatch command in SLURM
Issue -
State: open - Opened by nilskober over 1 year ago
- 4 comments
#1727 - duplicate tasks when using `SlurmExecutor.map_array`
Issue -
State: closed - Opened by eringrant almost 2 years ago
- 3 comments
#1726 - Submitit with sbatch
Issue -
State: open - Opened by pfrwilson almost 2 years ago
- 6 comments
Labels: question
#1725 - Remove `#SBATCH --nodes=1`
Issue -
State: closed - Opened by sgbaird almost 2 years ago
- 3 comments
#1724 - Compute Canada
Issue -
State: closed - Opened by kaijieshi7 almost 2 years ago
- 1 comment
#1723 - Can submitit manage chain dependencies?
Issue -
State: closed - Opened by eserie almost 2 years ago
- 1 comment
#1722 - Should we submit job on login node?
Issue -
State: open - Opened by surajmenon72 almost 2 years ago
- 1 comment
Labels: question
#1721 - No user code logging output is shown in logs
Issue -
State: closed - Opened by fleimgruber about 2 years ago
- 2 comments
#1720 - be tolerating about sacct error?
Issue -
State: closed - Opened by min-xu-ai about 2 years ago
- 2 comments
#1719 - Consider supporting slurm rest api
Issue -
State: open - Opened by zeronewb about 2 years ago
- 2 comments
Labels: question
#1718 - array_parallelism for LocalExecutor
Issue -
State: closed - Opened by se-ok about 2 years ago
- 2 comments
#1717 - How to load the original code point when preempted and rescheduled if the code is changed before rescheduling?
Issue -
State: closed - Opened by dahyun-kang about 2 years ago
- 1 comment
#1716 - InfoWatch might get previous jobid info after slurm restart
Issue -
State: open - Opened by Liangtaiwan about 2 years ago
- 1 comment
Labels: question
#1715 - reraise exception back to user
Pull Request -
State: open - Opened by gwenzek about 2 years ago
- 6 comments
Labels: CLA Signed
#1714 - Printing in Signal Handlers May Be Unsafe
Issue -
State: open - Opened by Queuecumber about 2 years ago
- 1 comment
Labels: enhancement
#1713 - Add a timeout to scontrol requeue + explicitly delete function before pickling
Pull Request -
State: closed - Opened by jrapin about 2 years ago
- 1 comment
Labels: CLA Signed
#1712 - UnicodeDecodeError fails the job
Issue -
State: open - Opened by phtu-cs about 2 years ago
- 2 comments
Labels: question
#1711 - Submit Over SSH?
Issue -
State: open - Opened by JRJacoby about 2 years ago
- 4 comments
Labels: enhancement
#1710 - submitit.core.utils.FailedJobError: sbatch: error: Parameter --gres=gpu:1 no longer acceptable, please switch to --gpus=1
Issue -
State: closed - Opened by RoyAmoyal about 2 years ago
- 1 comment
#1709 - Switching from USR1 Breaks Pytorch Lightning
Issue -
State: open - Opened by Queuecumber about 2 years ago
- 4 comments
#1708 - Unwanted behavior after a slurm job time limit
Issue -
State: closed - Opened by ofir1080 over 2 years ago
- 1 comment
#1707 - Recover jobs after kernel dies
Issue -
State: closed - Opened by SamuelGabriel over 2 years ago
- 1 comment
#1706 - Fallback to slurm for TorchDistributedEnv
Pull Request -
State: open - Opened by jrapin over 2 years ago
- 1 comment
Labels: CLA Signed
#1705 - Option to overwrite exported variables in TorchDistributedEnvironment
Pull Request -
State: closed - Opened by qasfb over 2 years ago
- 3 comments
Labels: CLA Signed
#1704 - Submitit puts all tasks on a single GPU
Issue -
State: closed - Opened by Bai-YT over 2 years ago
- 3 comments
#1703 - Add helper class to facilitate PyTorch distributed initialization
Pull Request -
State: closed - Opened by patricklabatut over 2 years ago
- 3 comments
Labels: CLA Signed
#1702 - Revert to using USR2 ("cleaner" option)
Pull Request -
State: closed - Opened by jrapin over 2 years ago
Labels: CLA Signed
#1701 - Use SIGHUP as default for preemption signal
Pull Request -
State: closed - Opened by jrapin over 2 years ago
Labels: CLA Signed
#1700 - [Tentative] Rerun jobs more easily
Pull Request -
State: open - Opened by jrapin over 2 years ago
- 1 comment
Labels: CLA Signed
#1699 - Add a helper for temporary removing slurm and submitit env variables
Pull Request -
State: closed - Opened by jrapin over 2 years ago
Labels: CLA Signed
#1698 - submit job array to multiple partitions
Issue -
State: open - Opened by MinkyuHa over 2 years ago
#1697 - [NCCL Conflict] Use USR2 instead of USR1
Pull Request -
State: closed - Opened by jrapin over 2 years ago
- 2 comments
Labels: CLA Signed
#1696 - How to specify GPUs when executing locally?
Issue -
State: closed - Opened by j0ma over 2 years ago
- 5 comments
#1695 - Fix submissions running on Windows
Pull Request -
State: open - Opened by tmct over 2 years ago
Labels: CLA Signed
#1694 - Skip unnecessary pickle file checks
Pull Request -
State: closed - Opened by tmct over 2 years ago
- 5 comments
Labels: CLA Signed
#1693 - Fix truncation of "Executor" in executor class name
Pull Request -
State: closed - Opened by tmct over 2 years ago
- 5 comments
Labels: CLA Signed
#1692 - NodeList Declaration
Issue -
State: closed - Opened by Bontempogianpaolo1 over 2 years ago
- 2 comments
#1691 - make a git tag on `make release`
Pull Request -
State: closed - Opened by gwenzek over 2 years ago
Labels: CLA Signed
#1690 - Latest versions' tags not on Github
Issue -
State: closed - Opened by tmct over 2 years ago
- 2 comments
#1689 - Task does not wait for GPU memory resources
Issue -
State: closed - Opened by chirico85 over 2 years ago
- 1 comment
#1688 - `make integration` should also run `pip install --upgrade`
Pull Request -
State: closed - Opened by gwenzek over 2 years ago
Labels: CLA Signed
#1687 - filedescriptor out of range in select()
Issue -
State: closed - Opened by timlacroix over 2 years ago
- 1 comment
#1686 - Use poll instead of select
Pull Request -
State: closed - Opened by timlacroix over 2 years ago
- 2 comments
Labels: CLA Signed
#1685 - `tasks_per_node=1` does not keep the number of tasks to 1 for the `LocalExecutor`
Issue -
State: open - Opened by ihowell over 2 years ago
- 4 comments
Labels: enhancement
#1684 - Progress Bar for Jobs (Implementation)
Issue -
State: closed - Opened by yuvalkirstain over 2 years ago
- 4 comments
#1683 - Update broken link in nevergrad.md
Pull Request -
State: closed - Opened by charmoniumQ over 2 years ago
- 3 comments
Labels: CLA Signed
#1682 - Release version 1.4 to pypi
Issue -
State: closed - Opened by OhadRubin over 2 years ago
- 1 comment
#1681 - [enhancement] Time info, like time taken, within Job objects
Issue -
State: open - Opened by mennowitteveen over 2 years ago
- 2 comments
Labels: enhancement