Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / aws/sagemaker-tensorflow-training-toolkit issues and pull requests
#426 - fix: Update deprecated dependency package name from sklearn to scikit-learn
Pull Request -
State: closed - Opened by kace almost 2 years ago
- 4 comments
#425 - CVE-2007-4559 Patch
Pull Request -
State: open - Opened by TrellixVulnTeam almost 2 years ago
#424 - documentation: update README and add CONTRIBUTING.md
Pull Request -
State: open - Opened by satishpasumarthi about 2 years ago
- 4 comments
#423 - documentation: update README and CONTRIBUTING guidelines
Pull Request -
State: closed - Opened by satishpasumarthi about 2 years ago
- 1 comment
#422 - Update README.txt with how this toolkit related to SMTT
Pull Request -
State: open - Opened by gilinachum about 2 years ago
- 2 comments
#421 - feature: Add heterogeneous cluster changes
Pull Request -
State: closed - Opened by satishpasumarthi about 2 years ago
- 15 comments
#420 - Testing MWMS in TF 2.9.1 with TF Model Garden
Pull Request -
State: open - Opened by Lokiiiiii over 2 years ago
- 42 comments
#419 - Fix/ci
Pull Request -
State: closed - Opened by nish21 over 2 years ago
- 137 comments
#418 - deprecation: drop py2 support, Update python and other CI
Pull Request -
State: closed - Opened by satishpasumarthi over 2 years ago
- 79 comments
#417 - TF 2 py37 update
Pull Request -
State: closed - Opened by ydaiming over 2 years ago
- 8 comments
#416 - Py37 update
Pull Request -
State: closed - Opened by ydaiming over 2 years ago
- 7 comments
#415 - Feature: Cluster setup for MultiWorkerMirroredStrategy
Pull Request -
State: closed - Opened by Lokiiiiii over 2 years ago
- 159 comments
#414 - fix: upgrade to sagemaker-training 3.7.1
Pull Request -
State: closed - Opened by icywang86rui almost 4 years ago
- 8 comments
#413 - infra: include granular buildspecs for dlc and generic cpu and gpu testing
Pull Request -
State: closed - Opened by metrizable almost 4 years ago
- 38 comments
#412 - DO NOT MERGE testing webhook
Pull Request -
State: closed - Opened by icywang86rui almost 4 years ago
- 2 comments
#411 - feature: use tensorflow 2.3.1 and add data parallel integ test
Pull Request -
State: closed - Opened by ChoiByungWook almost 4 years ago
- 19 comments
#410 - feature: include sm-data-distributed and upgrade dependencies
Pull Request -
State: closed - Opened by metrizable almost 4 years ago
- 5 comments
#409 - feature: Add reinvent 2020 features
Pull Request -
State: closed - Opened by ChoiByungWook almost 4 years ago
- 15 comments
#408 - fix: workaround to print stderr when capture_error is True
Pull Request -
State: closed - Opened by ajaykarpur almost 4 years ago
- 2 comments
Labels: priority: high
#407 - fix: workaround to print stderr when capture_error is True
Pull Request -
State: closed - Opened by ajaykarpur almost 4 years ago
- 2 comments
Labels: priority: high
#406 - fix: propagate log level
Pull Request -
State: closed - Opened by ajaykarpur almost 4 years ago
- 3 comments
#405 - fix: propagate log level
Pull Request -
State: closed - Opened by ajaykarpur almost 4 years ago
- 4 comments
#404 - add condition to avoid error when 'model_dir' is None
Pull Request -
State: closed - Opened by yijiezh almost 4 years ago
- 1 comment
#403 - fix: add condition to avoid error when 'model_dir' is None
Pull Request -
State: closed - Opened by chuyang-deng almost 4 years ago
- 2 comments
#402 - add condition to avoid error when 'model_dir' is None
Pull Request -
State: closed - Opened by yijiezh almost 4 years ago
- 2 comments
#401 - Model deployment is failing with the error "The primary container for production variant AllTraffic did not pass the ping health check.
Issue -
State: open - Opened by vishwath96 about 4 years ago
- 5 comments
Labels: type: question
#400 - fix: missing comma in conftest
Pull Request -
State: closed - Opened by chuyang-deng about 4 years ago
- 3 comments
#399 - fix: call entry_point.run with capture_error=True
Pull Request -
State: closed - Opened by ajaykarpur about 4 years ago
- 4 comments
Labels: type: bug, type: maintenance
#398 - fix: call entry_point.run with capture_error=True
Pull Request -
State: closed - Opened by ajaykarpur about 4 years ago
- 3 comments
Labels: type: bug, type: maintenance
#397 - feature: tensorflow 2.3 support
Pull Request -
State: closed - Opened by chuyang-deng about 4 years ago
- 8 comments
#396 - infra: add integration test for MPI env vars propagation
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 3 comments
Labels: priority: high, related: Horovod
#395 - infra: add integration test for MPI env vars propagation
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 2 comments
Labels: priority: high, related: Horovod
#394 - Update horovod version
Pull Request -
State: closed - Opened by moaradwan over 4 years ago
- 5 comments
#393 - infra: add issue templates
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 6 comments
#392 - How to get evaluation metrics in output logs
Issue -
State: open - Opened by MelissaKR over 4 years ago
- 5 comments
Labels: type: question
#391 - Support different tf.distribute.Strategies for distributed training on SageMaker
Issue -
State: closed - Opened by anirudhacharya over 4 years ago
- 14 comments
Labels: type: question
#390 - infra: add single-instance, multi-process Horovod test for local GPU
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 3 comments
#389 - infra: add single-instance, multi-process Horovod test for local GPU
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 3 comments
#388 - local-gpu for TF horovod local sagemaker integration test?
Issue -
State: closed - Opened by ChaiBapchya over 4 years ago
- 5 comments
Labels: type: enhancement
#387 - doc: remove confusing information from the Readme.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 5 comments
#386 - Parameter Server entrypoint
Issue -
State: closed - Opened by ChaiBapchya over 4 years ago
- 1 comment
Labels: type: question
#385 - infra: Rename buildspec files.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 2 comments
#384 - infra: Make docker folder read only, remove unused tests.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 7 comments
#383 - doc: Update README.rst
Pull Request -
State: closed - Opened by ChaiBapchya over 4 years ago
- 3 comments
#382 - Chai bapchya patch 2
Pull Request -
State: closed - Opened by ChaiBapchya over 4 years ago
- 2 comments
#381 - doc: remove functional test info from tf-2
Pull Request -
State: closed - Opened by chuyang-deng over 4 years ago
- 4 comments
#380 - doc: remove functional test info from master
Pull Request -
State: closed - Opened by chuyang-deng over 4 years ago
- 11 comments
#379 - pytest test/integration error
Issue -
State: open - Opened by ChaiBapchya over 4 years ago
- 4 comments
Labels: type: question
#378 - Incorrect usage: pytest tests/functional
Issue -
State: closed - Opened by ChaiBapchya over 4 years ago
- 2 comments
Labels: type: documentation
#377 - doc: Update README.rst
Pull Request -
State: closed - Opened by ChaiBapchya over 4 years ago
- 10 comments
#376 - doc: Update README.rst
Pull Request -
State: closed - Opened by ChaiBapchya over 4 years ago
- 6 comments
#375 - fix dl_container not found
Pull Request -
State: closed - Opened by ChaiBapchya over 4 years ago
- 5 comments
#374 - fix: bump version of sagemaker-training for script entry point fix
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 4 comments
#373 - fix: bump version of sagemaker-training for script entry point fix
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 7 comments
#372 - Install nginx for SageMaker endpoint deployment
Pull Request -
State: closed - Opened by cuongvng over 4 years ago
- 4 comments
#371 - Trigger PR
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 10 comments
#370 - infra: Make docker folder read only, remove unused tests.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 20 comments
#369 - change: update sagemaker-tensorflow-training version
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 3 comments
#368 - doc: update image-building instructions
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 1 comment
#367 - change: update sagemaker-tensorflow-training version
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 3 comments
#366 - infra: remove unused build scripts.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 2 comments
#365 - fix: Bump version of sagemaker-training for typing fix
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 6 comments
#364 - fix: Bump version of sagemaker-training for typing fix
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 2 comments
#363 - infra: fix typo in release buildspec.
Pull Request -
State: closed - Opened by nadiaya over 4 years ago
- 2 comments
#362 - feature: add Python 3.7 support
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 2 comments
#361 - change: Add py37 to sm tests
Pull Request -
State: closed - Opened by saimidu over 4 years ago
- 4 comments
#360 - remove sagemaker, keras pks in py37 docker files
Pull Request -
State: closed - Opened by Satish615 over 4 years ago
- 2 comments
#359 - Fix sm integration issues
Pull Request -
State: closed - Opened by Satish615 over 4 years ago
- 4 comments
#358 - infra: Fix buildspecs
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 2 comments
#357 - infra: use tox in buildspecs
Pull Request -
State: closed - Opened by chuyang-deng over 4 years ago
- 1 comment
#356 - add dockerfiles for tf 1.15.2 py37 containers
Pull Request -
State: closed - Opened by Satish615 over 4 years ago
- 19 comments
#355 - breaking: Replace sagemaker-containers with sagemaker-training
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 5 comments
#354 - infra: remove CHANGELOG entries from failed builds
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 2 comments
#353 - feature: Python 3.7 support
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 11 comments
#352 - infra: bump version to prepare for new version scheme
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 2 comments
#351 - doc: remove extra newlines for consistency
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 1 comment
#349 - infra: add training script to benchmark directory
Pull Request -
State: closed - Opened by laurenyu over 4 years ago
- 4 comments
#348 - breaking: Replace sagemaker-containers with sagemaker-training
Pull Request -
State: closed - Opened by ajaykarpur over 4 years ago
- 11 comments
#347 - Failed. Reason: The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint
Issue -
State: closed - Opened by ghost over 4 years ago
- 1 comment
Labels: type: question
#346 - Spot instances: checkpoint paths for intermediate data?
Issue -
State: closed - Opened by dav009 over 4 years ago
- 1 comment
Labels: type: question
#345 - Submodule Error
Issue -
State: closed - Opened by larsll over 4 years ago
- 4 comments
Labels: type: bug
#344 - Restore from checkpoints Tensorflow eager execution
Issue -
State: closed - Opened by jenishah over 4 years ago
- 2 comments
#341 - Revert "Smdebug version bump for TF2"
Pull Request -
State: closed - Opened by TusharKanekiDey over 4 years ago
- 1 comment
#340 - --model_dir is inconsistent, confusing, and unnecessary
Issue -
State: open - Opened by athewsey over 4 years ago
- 4 comments
Labels: type: enhancement
#320 - Revert "Add smdebug to TF 2.x"
Pull Request -
State: closed - Opened by ghost over 4 years ago
#318 - Revert "update smdebug wheel"
Pull Request -
State: closed - Opened by ghost over 4 years ago
- 1 comment
#271 - pytest unit test error
Issue -
State: closed - Opened by xzy0223 over 4 years ago
- 4 comments
#263 - RuntimeError: Failed to run: ['docker-compose', '-f', '/tmp/tmp93Sn5U/docker-compose.yaml', 'up', '--build', '--abort-on-container-exit'], Process exited with code: 127
Issue -
State: closed - Opened by starrylive almost 5 years ago
- 8 comments
#260 - Custom CUDA Operations
Issue -
State: closed - Opened by JKurzer almost 5 years ago
- 2 comments
#248 - docker/1.14.0/py3/Dockfile.cpu can't be build through.
Issue -
State: closed - Opened by samlovestech almost 5 years ago
- 5 comments
#202 - Where are the container images for script mode ?
Issue -
State: closed - Opened by keerath over 5 years ago
- 6 comments
Labels: type: bug
#188 - Multiple inputs failing with different types
Issue -
State: closed - Opened by mhwilder over 5 years ago
- 3 comments
Labels: type: enhancement
#174 - Multi-record requests (at least with CSV) don't work
Issue -
State: closed - Opened by andremoeller over 5 years ago
- 2 comments
Labels: type: bug
#153 - [Minor] script-mode branch lacks Dockerfile patch to deal with S3 response timeout configurability
Issue -
State: closed - Opened by zmjjmz over 5 years ago
- 4 comments
Labels: type: bug, type: enhancement
#135 - Default GRPC timeout for EI & Allow timeout to be configurable
Pull Request -
State: closed - Opened by ChoiByungWook almost 6 years ago
- 1 comment
#66 - Upgrade botocore package to higher version
Issue -
State: closed - Opened by tin-trong-nguyen about 6 years ago
- 4 comments
Labels: type: bug
#62 - Support Distributed Training Strategies
Issue -
State: closed - Opened by andrewortman about 6 years ago
- 6 comments
Labels: type: enhancement
#50 - Can we have dockerfiles for python 3
Issue -
State: closed - Opened by yshvrdhn over 6 years ago
- 6 comments
Labels: type: enhancement
#46 - Custom request timeout for TF-Serving
Issue -
State: closed - Opened by ghost over 6 years ago
- 7 comments
Labels: type: enhancement
#43 - How to put model for training in the container?
Issue -
State: closed - Opened by nekojan over 6 years ago
- 6 comments
Labels: type: enhancement