Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / aws/sagemaker-pytorch-training-toolkit issues and pull requests

#253 - "Train": executable file not found in $PATH

Issue - State: open - Opened by celsofranssa 9 months ago

#252 - change: bypass DNS check for studio local exec

Pull Request - State: closed - Opened by mufaddal-rohawala 9 months ago - 12 comments

#251 - Fix: pin coverage version to fix pipeline issue

Pull Request - State: closed - Opened by yl-to about 1 year ago - 4 comments

#250 - Add PyTorch version environment variable, to facilitate SMTT

Pull Request - State: closed - Opened by yongyanrao about 1 year ago - 6 comments

#248 - feature: Add torch_distributed support for Trainium

Pull Request - State: closed - Opened by satishpasumarthi over 1 year ago - 12 comments

#247 - CVE-2007-4559 Patch

Pull Request - State: open - Opened by TrellixVulnTeam over 1 year ago

#246 - documentation: update README and contributing guidelines

Pull Request - State: closed - Opened by satishpasumarthi almost 2 years ago - 4 comments

#245 - Update README.rst with how it related to SMTT

Pull Request - State: closed - Opened by gilinachum almost 2 years ago - 3 comments

#244 - fix: provide option to use native process launcher

Pull Request - State: closed - Opened by satishpasumarthi almost 2 years ago - 30 comments

#243 - aaa

Pull Request - State: closed - Opened by cyberitech almost 2 years ago

#242 - aaa

Pull Request - State: closed - Opened by cyberitech almost 2 years ago

#241 - Feature: Support new distribution mechanism for PT-XLA

Pull Request - State: closed - Opened by Lokiiiiii almost 2 years ago - 8 comments

#240 - Test/fix

Pull Request - State: closed - Opened by nish2104 almost 2 years ago - 5 comments

#239 - test: empty commit

Pull Request - State: closed - Opened by nish21 almost 2 years ago - 4 comments

#238 - fix: derive master node from training environment

Pull Request - State: closed - Opened by satishpasumarthi almost 2 years ago - 8 comments

#237 - upodate

Pull Request - State: closed - Opened by gijayah213 almost 2 years ago - 4 comments

#236 - feature: add support for native PT DDP distribution

Pull Request - State: closed - Opened by vishwakaria almost 2 years ago - 28 comments

#235 - feature: Add Heterogeneous Cluster support

Pull Request - State: closed - Opened by satishpasumarthi almost 2 years ago - 17 comments

#234 - fix: CI changes

Pull Request - State: closed - Opened by satishpasumarthi almost 2 years ago - 29 comments

#233 - empty commit to trigger ci

Pull Request - State: closed - Opened by nish21 about 2 years ago - 16 comments

#231 - feature: Added Native Pytorch DDP support

Pull Request - State: closed - Opened by satishpasumarthi over 2 years ago - 8 comments

#227 - Example use case

Issue - State: open - Opened by akinolawilson over 3 years ago - 2 comments
Labels: type: question, type: documentation

#226 - Error importing torchaudio

Issue - State: open - Opened by bbalaji-ucsd over 3 years ago - 2 comments
Labels: type: bug

#225 - feature: add reinvent 2020 features

Pull Request - State: closed - Opened by ChoiByungWook over 3 years ago - 73 comments

#224 - fix: not raising excpetion if no image to delete

Pull Request - State: open - Opened by chuyang-deng over 3 years ago - 4 comments

#223 - Getting cudnn error while training on ml.p2.xlarge instance

Issue - State: closed - Opened by shubhamsharma1609 almost 4 years ago - 2 comments

#222 - cannot recognize num_gpus for more than 1 gpu per instance

Issue - State: closed - Opened by zhaoanbei almost 4 years ago - 4 comments
Labels: type: feature request

#221 - change: Update main buildspec to only perform CPU integration tests

Pull Request - State: closed - Opened by bveeramani almost 4 years ago - 15 comments

#220 - change: Pin SageMaker version to less than v2

Pull Request - State: closed - Opened by bveeramani almost 4 years ago - 3 comments

#219 - docs: Fix docstring style in training.py

Pull Request - State: closed - Opened by bveeramani almost 4 years ago - 6 comments

#218 - change: Add GPU and unit test buildspecs

Pull Request - State: closed - Opened by bveeramani almost 4 years ago - 4 comments

#217 - feature: Use MPIRunnerType

Pull Request - State: closed - Opened by bveeramani almost 4 years ago - 55 comments

#216 - feature: update pytorch vanilla version to 1.6.0

Pull Request - State: closed - Opened by chuyang-deng almost 4 years ago - 3 comments

#215 - FastAI v1.0.59 causes failed training job

Issue - State: closed - Opened by dean-cpi about 4 years ago - 1 comment

#214 - infra: add issue templates

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 4 comments

#213 - doc: remove confusing information from the Readme.

Pull Request - State: closed - Opened by nadiaya about 4 years ago - 3 comments

#212 - infra: do not duplicate test dependencies in tox.ini

Pull Request - State: closed - Opened by nadiaya about 4 years ago - 20 comments

#211 - fix: Rename buildspec files.

Pull Request - State: closed - Opened by nadiaya about 4 years ago - 4 comments

#210 - fix: bump version of sagemaker-training for script entry point fix

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 4 comments

#209 - infra: Make docker folder read only, remove unused tests.

Pull Request - State: closed - Opened by nadiaya about 4 years ago - 6 comments

#208 - unable to build final dockerfile.cpu

Issue - State: closed - Opened by Vertika09 about 4 years ago - 4 comments

#207 - Pytorch 1.5 build issue

Issue - State: closed - Opened by dwang-sflscientific about 4 years ago - 2 comments

#206 - change: install ipywidgets in 1.5.0 Python 3 Dockerfiles

Pull Request - State: closed - Opened by laurenyu about 4 years ago - 2 comments

#205 - fix: Bump version of sagemaker-training for typing fix

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 1 comment

#204 - feature: add Python 3.7 support

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 2 comments

#203 - fix: upgrade dependency versions

Pull Request - State: closed - Opened by chuyang-deng about 4 years ago - 7 comments

#202 - Pin Smdebug to the latest version (0.7.2)

Pull Request - State: closed - Opened by TusharKanekiDey about 4 years ago - 2 comments

#201 - infra: use tox in buildspecs

Pull Request - State: closed - Opened by chuyang-deng about 4 years ago - 3 comments

#200 - feature: add Dockerfiles for PyTorch 1.5.0

Pull Request - State: closed - Opened by TusharKanekiDey about 4 years ago - 20 comments

#199 - breaking: Replace sagemaker-containers with sagemaker-training

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 5 comments

#198 - infra: parallelize SageMaker integ test runs

Pull Request - State: closed - Opened by laurenyu about 4 years ago - 2 comments

#196 - fix: change miniconda installation in 1.4.0 Dockerfiles

Pull Request - State: closed - Opened by laurenyu about 4 years ago - 1 comment

#195 - infra: remove (unused) model_fn from training scripts

Pull Request - State: closed - Opened by laurenyu about 4 years ago - 3 comments

#194 - infra: add requirements.txt integ test

Pull Request - State: closed - Opened by laurenyu about 4 years ago - 1 comment

#193 - upgrade pillow etc. to fix safety issues in 1.4.0 dockerfiles

Pull Request - State: closed - Opened by YYStreet over 4 years ago - 1 comment

#192 - Upgrade sagemaker-containers and test with more than 1 epoch

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 1 comment

#191 - upgrade Pillow and use pip to install

Pull Request - State: closed - Opened by YYStreet over 4 years ago - 2 comments

#190 - Bump smdebug version

Pull Request - State: closed - Opened by NihalHarish over 4 years ago - 2 comments

#189 - requirements.txt not working

Issue - State: closed - Opened by hrsma2i over 4 years ago - 2 comments
Labels: type: bug, status: pending release

#188 - infra: run test-toolkit unit tests for release

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 1 comment

#187 - fix: upgrade sagemaker-containers to 2.8.2

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 1 comment

#186 - Install jupyter_client 5.3.4 in advance for py2 gpu image

Pull Request - State: closed - Opened by YYStreet over 4 years ago - 1 comment

#185 - update smdebug

Pull Request - State: closed - Opened by vandanavk over 4 years ago - 2 comments

#184 - Revert "Update smdebug to 0.7.0"

Pull Request - State: closed - Opened by YYStreet over 4 years ago - 1 comment

#183 - infra: run build steps only when necessary.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 4 comments

#182 - feature: Install toolkit from PyPI.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 5 comments

#181 - Issue with torchvision::nms using custom Pytorch and TorchVision

Issue - State: closed - Opened by mmaybeno over 4 years ago - 20 comments

#180 - skip on smexperiments import error

Pull Request - State: closed - Opened by danabens over 4 years ago - 3 comments

#179 - install sm experiments always when python 3.6 or greater

Pull Request - State: closed - Opened by danabens over 4 years ago - 1 comment

#178 - Custom serving code with framework_version beyond 1.1.0

Issue - State: closed - Opened by austinmw over 4 years ago - 5 comments
Labels: type: question

#177 - set min version instead of exact version for sm experiments requirement

Pull Request - State: closed - Opened by danabens over 4 years ago - 1 comment

#176 - skip python2 for experiments test

Pull Request - State: closed - Opened by danabens over 4 years ago - 1 comment

#175 - install sagemaker-experiments package only for 3.6

Pull Request - State: closed - Opened by danabens over 4 years ago - 1 comment

#174 - infra: refactor toolkit tests.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 5 comments

#173 - WA to torchvision dataset issue

Pull Request - State: closed - Opened by vandanavk over 4 years ago - 2 comments

#172 - Update smdebug to 0.7.0

Pull Request - State: closed - Opened by vandanavk over 4 years ago - 14 comments

#171 - Install awscli from pypi instead of conda for PyTorch containers

Pull Request - State: closed - Opened by YYStreet over 4 years ago - 7 comments

#170 - change: install SageMaker Python SDK into Python 3 images

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 3 comments

#169 - change: Fix python 2 tox dependencies.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 1 comment

#168 - change: copy all tests to test-toolkit folder.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 2 comments

#167 - feature: Remove unnecessary dependencies.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 1 comment

#166 - Training on GPU with a custom container based on official pytorch-training container

Issue - State: closed - Opened by jason-morgan over 4 years ago - 2 comments
Labels: type: question

#165 - update: Update license URL

Pull Request - State: closed - Opened by saimidu over 4 years ago - 2 comments

#163 - upgrade to latest sagemaker-experiments

Pull Request - State: closed - Opened by danabens over 4 years ago - 23 comments

#162 - change: Fix flake8 erros. Add flake configuration to run during PR.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 2 comments

#161 - Add twine section to tox.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 1 comment

#160 - feature: Add release to PyPI. Change package name to sagemaker-pytorch-training.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 5 comments

#159 - fix: remove call to deprecated function download_and_install

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 3 comments

#158 - Adding changes for PyTorch 1.4.0 DLC

Pull Request - State: closed - Opened by abhinavs95 over 4 years ago - 10 comments

#157 - Sagemaker PyTorch Not Recognizing Model_FN

Issue - State: closed - Opened by zacharyFerretti over 4 years ago - 9 comments

#152 - Pytorch Lightning pkg pin request in AWS sagemaker Pytorch base container

Issue - State: open - Opened by amitmukh over 4 years ago
Labels: type: feature request

#139 - Prebuilt PyTorch image difference

Issue - State: closed - Opened by ruijianw over 4 years ago - 15 comments
Labels: type: question