Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / aws/sagemaker-tensorflow-training-toolkit issues and pull requests

#426 - fix: Update deprecated dependency package name from sklearn to scikit-learn

Pull Request - State: closed - Opened by kace almost 2 years ago - 4 comments

#425 - CVE-2007-4559 Patch

Pull Request - State: open - Opened by TrellixVulnTeam almost 2 years ago

#424 - documentation: update README and add CONTRIBUTING.md

Pull Request - State: open - Opened by satishpasumarthi about 2 years ago - 4 comments

#423 - documentation: update README and CONTRIBUTING guidelines

Pull Request - State: closed - Opened by satishpasumarthi about 2 years ago - 1 comment

#422 - Update README.txt with how this toolkit related to SMTT

Pull Request - State: open - Opened by gilinachum about 2 years ago - 2 comments

#421 - feature: Add heterogeneous cluster changes

Pull Request - State: closed - Opened by satishpasumarthi about 2 years ago - 15 comments

#420 - Testing MWMS in TF 2.9.1 with TF Model Garden

Pull Request - State: open - Opened by Lokiiiiii over 2 years ago - 42 comments

#419 - Fix/ci

Pull Request - State: closed - Opened by nish21 over 2 years ago - 137 comments

#418 - deprecation: drop py2 support, Update python and other CI

Pull Request - State: closed - Opened by satishpasumarthi over 2 years ago - 79 comments

#417 - TF 2 py37 update

Pull Request - State: closed - Opened by ydaiming over 2 years ago - 8 comments

#416 - Py37 update

Pull Request - State: closed - Opened by ydaiming over 2 years ago - 7 comments

#415 - Feature: Cluster setup for MultiWorkerMirroredStrategy

Pull Request - State: closed - Opened by Lokiiiiii over 2 years ago - 159 comments

#414 - fix: upgrade to sagemaker-training 3.7.1

Pull Request - State: closed - Opened by icywang86rui almost 4 years ago - 8 comments

#413 - infra: include granular buildspecs for dlc and generic cpu and gpu testing

Pull Request - State: closed - Opened by metrizable almost 4 years ago - 38 comments

#412 - DO NOT MERGE testing webhook

Pull Request - State: closed - Opened by icywang86rui almost 4 years ago - 2 comments

#411 - feature: use tensorflow 2.3.1 and add data parallel integ test

Pull Request - State: closed - Opened by ChoiByungWook almost 4 years ago - 19 comments

#410 - feature: include sm-data-distributed and upgrade dependencies

Pull Request - State: closed - Opened by metrizable almost 4 years ago - 5 comments

#409 - feature: Add reinvent 2020 features

Pull Request - State: closed - Opened by ChoiByungWook almost 4 years ago - 15 comments

#408 - fix: workaround to print stderr when capture_error is True

Pull Request - State: closed - Opened by ajaykarpur almost 4 years ago - 2 comments
Labels: priority: high

#407 - fix: workaround to print stderr when capture_error is True

Pull Request - State: closed - Opened by ajaykarpur almost 4 years ago - 2 comments
Labels: priority: high

#406 - fix: propagate log level

Pull Request - State: closed - Opened by ajaykarpur almost 4 years ago - 3 comments

#405 - fix: propagate log level

Pull Request - State: closed - Opened by ajaykarpur almost 4 years ago - 4 comments

#404 - add condition to avoid error when 'model_dir' is None

Pull Request - State: closed - Opened by yijiezh almost 4 years ago - 1 comment

#403 - fix: add condition to avoid error when 'model_dir' is None

Pull Request - State: closed - Opened by chuyang-deng almost 4 years ago - 2 comments

#402 - add condition to avoid error when 'model_dir' is None

Pull Request - State: closed - Opened by yijiezh almost 4 years ago - 2 comments

#400 - fix: missing comma in conftest

Pull Request - State: closed - Opened by chuyang-deng about 4 years ago - 3 comments

#399 - fix: call entry_point.run with capture_error=True

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 4 comments
Labels: type: bug, type: maintenance

#398 - fix: call entry_point.run with capture_error=True

Pull Request - State: closed - Opened by ajaykarpur about 4 years ago - 3 comments
Labels: type: bug, type: maintenance

#397 - feature: tensorflow 2.3 support

Pull Request - State: closed - Opened by chuyang-deng about 4 years ago - 8 comments

#396 - infra: add integration test for MPI env vars propagation

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 3 comments
Labels: priority: high, related: Horovod

#395 - infra: add integration test for MPI env vars propagation

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 2 comments
Labels: priority: high, related: Horovod

#394 - Update horovod version

Pull Request - State: closed - Opened by moaradwan over 4 years ago - 5 comments

#393 - infra: add issue templates

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 6 comments

#392 - How to get evaluation metrics in output logs

Issue - State: open - Opened by MelissaKR over 4 years ago - 5 comments
Labels: type: question

#391 - Support different tf.distribute.Strategies for distributed training on SageMaker

Issue - State: closed - Opened by anirudhacharya over 4 years ago - 14 comments
Labels: type: question

#390 - infra: add single-instance, multi-process Horovod test for local GPU

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 3 comments

#389 - infra: add single-instance, multi-process Horovod test for local GPU

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 3 comments

#388 - local-gpu for TF horovod local sagemaker integration test?

Issue - State: closed - Opened by ChaiBapchya over 4 years ago - 5 comments
Labels: type: enhancement

#387 - doc: remove confusing information from the Readme.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 5 comments

#386 - Parameter Server entrypoint

Issue - State: closed - Opened by ChaiBapchya over 4 years ago - 1 comment
Labels: type: question

#385 - infra: Rename buildspec files.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 2 comments

#384 - infra: Make docker folder read only, remove unused tests.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 7 comments

#383 - doc: Update README.rst

Pull Request - State: closed - Opened by ChaiBapchya over 4 years ago - 3 comments

#382 - Chai bapchya patch 2

Pull Request - State: closed - Opened by ChaiBapchya over 4 years ago - 2 comments

#381 - doc: remove functional test info from tf-2

Pull Request - State: closed - Opened by chuyang-deng over 4 years ago - 4 comments

#380 - doc: remove functional test info from master

Pull Request - State: closed - Opened by chuyang-deng over 4 years ago - 11 comments

#379 - pytest test/integration error

Issue - State: open - Opened by ChaiBapchya over 4 years ago - 4 comments
Labels: type: question

#378 - Incorrect usage: pytest tests/functional

Issue - State: closed - Opened by ChaiBapchya over 4 years ago - 2 comments
Labels: type: documentation

#377 - doc: Update README.rst

Pull Request - State: closed - Opened by ChaiBapchya over 4 years ago - 10 comments

#376 - doc: Update README.rst

Pull Request - State: closed - Opened by ChaiBapchya over 4 years ago - 6 comments

#375 - fix dl_container not found

Pull Request - State: closed - Opened by ChaiBapchya over 4 years ago - 5 comments

#374 - fix: bump version of sagemaker-training for script entry point fix

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 4 comments

#373 - fix: bump version of sagemaker-training for script entry point fix

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 7 comments

#372 - Install nginx for SageMaker endpoint deployment

Pull Request - State: closed - Opened by cuongvng over 4 years ago - 4 comments

#371 - Trigger PR

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 10 comments

#370 - infra: Make docker folder read only, remove unused tests.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 20 comments

#369 - change: update sagemaker-tensorflow-training version

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 3 comments

#368 - doc: update image-building instructions

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 1 comment

#367 - change: update sagemaker-tensorflow-training version

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 3 comments

#366 - infra: remove unused build scripts.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 2 comments

#365 - fix: Bump version of sagemaker-training for typing fix

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 6 comments

#364 - fix: Bump version of sagemaker-training for typing fix

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 2 comments

#363 - infra: fix typo in release buildspec.

Pull Request - State: closed - Opened by nadiaya over 4 years ago - 2 comments

#362 - feature: add Python 3.7 support

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 2 comments

#361 - change: Add py37 to sm tests

Pull Request - State: closed - Opened by saimidu over 4 years ago - 4 comments

#360 - remove sagemaker, keras pks in py37 docker files

Pull Request - State: closed - Opened by Satish615 over 4 years ago - 2 comments

#359 - Fix sm integration issues

Pull Request - State: closed - Opened by Satish615 over 4 years ago - 4 comments

#358 - infra: Fix buildspecs

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 2 comments

#357 - infra: use tox in buildspecs

Pull Request - State: closed - Opened by chuyang-deng over 4 years ago - 1 comment

#356 - add dockerfiles for tf 1.15.2 py37 containers

Pull Request - State: closed - Opened by Satish615 over 4 years ago - 19 comments

#355 - breaking: Replace sagemaker-containers with sagemaker-training

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 5 comments

#354 - infra: remove CHANGELOG entries from failed builds

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 2 comments

#353 - feature: Python 3.7 support

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 11 comments

#352 - infra: bump version to prepare for new version scheme

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 2 comments

#351 - doc: remove extra newlines for consistency

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 1 comment

#349 - infra: add training script to benchmark directory

Pull Request - State: closed - Opened by laurenyu over 4 years ago - 4 comments

#348 - breaking: Replace sagemaker-containers with sagemaker-training

Pull Request - State: closed - Opened by ajaykarpur over 4 years ago - 11 comments

#346 - Spot instances: checkpoint paths for intermediate data?

Issue - State: closed - Opened by dav009 over 4 years ago - 1 comment
Labels: type: question

#345 - Submodule Error

Issue - State: closed - Opened by larsll over 4 years ago - 4 comments
Labels: type: bug

#344 - Restore from checkpoints Tensorflow eager execution

Issue - State: closed - Opened by jenishah over 4 years ago - 2 comments

#341 - Revert "Smdebug version bump for TF2"

Pull Request - State: closed - Opened by TusharKanekiDey over 4 years ago - 1 comment

#340 - --model_dir is inconsistent, confusing, and unnecessary

Issue - State: open - Opened by athewsey over 4 years ago - 4 comments
Labels: type: enhancement

#320 - Revert "Add smdebug to TF 2.x"

Pull Request - State: closed - Opened by ghost over 4 years ago

#318 - Revert "update smdebug wheel"

Pull Request - State: closed - Opened by ghost over 4 years ago - 1 comment

#271 - pytest unit test error

Issue - State: closed - Opened by xzy0223 over 4 years ago - 4 comments

#260 - Custom CUDA Operations

Issue - State: closed - Opened by JKurzer almost 5 years ago - 2 comments

#248 - docker/1.14.0/py3/Dockfile.cpu can't be build through.

Issue - State: closed - Opened by samlovestech almost 5 years ago - 5 comments

#202 - Where are the container images for script mode ?

Issue - State: closed - Opened by keerath over 5 years ago - 6 comments
Labels: type: bug

#188 - Multiple inputs failing with different types

Issue - State: closed - Opened by mhwilder over 5 years ago - 3 comments
Labels: type: enhancement

#174 - Multi-record requests (at least with CSV) don't work

Issue - State: closed - Opened by andremoeller over 5 years ago - 2 comments
Labels: type: bug

#153 - [Minor] script-mode branch lacks Dockerfile patch to deal with S3 response timeout configurability

Issue - State: closed - Opened by zmjjmz over 5 years ago - 4 comments
Labels: type: bug, type: enhancement

#135 - Default GRPC timeout for EI & Allow timeout to be configurable

Pull Request - State: closed - Opened by ChoiByungWook almost 6 years ago - 1 comment

#66 - Upgrade botocore package to higher version

Issue - State: closed - Opened by tin-trong-nguyen about 6 years ago - 4 comments
Labels: type: bug

#62 - Support Distributed Training Strategies

Issue - State: closed - Opened by andrewortman about 6 years ago - 6 comments
Labels: type: enhancement

#50 - Can we have dockerfiles for python 3

Issue - State: closed - Opened by yshvrdhn over 6 years ago - 6 comments
Labels: type: enhancement

#46 - Custom request timeout for TF-Serving

Issue - State: closed - Opened by ghost over 6 years ago - 7 comments
Labels: type: enhancement

#43 - How to put model for training in the container?

Issue - State: closed - Opened by nekojan over 6 years ago - 6 comments
Labels: type: enhancement