GitHub / aws/sagemaker-python-sdk issues and pull requests
Labelled with: bug
#4227 - `probability_threshold_attribute` in ModelQualityCheckConfig cannot be a PipelineVariable
Issue -
State: closed - Opened by leviBernadine about 2 years ago
- 4 comments
Labels: bug, component: pipelines
#4198 - local_gpu not working on an ml.g5.2xlarge
Issue -
State: closed - Opened by matthewchung74 about 2 years ago
- 2 comments
Labels: bug
#4195 - ModuleNotFoundError: Sagemaker only copies `entry_point` file to `/opt/ml/code/` instead of the wholly cloned source code
Issue -
State: closed - Opened by celsofranssa about 2 years ago
Labels: bug
#4194 - EETQ not available when using TGI via get_huggingface_llm_image_uri
Issue -
State: open - Opened by TRT-BradleyB about 2 years ago
- 4 comments
Labels: bug, component: image uri
#4193 - Sagemaker mistakenly creates a `command train` instead of the specified `main.py`
Issue -
State: closed - Opened by celsofranssa about 2 years ago
- 1 comment
Labels: bug
#4191 - "Train": executable file not found in $PATH: unknown
Issue -
State: closed - Opened by celsofranssa about 2 years ago
- 3 comments
Labels: bug
#4183 - v2.192
Issue -
State: closed - Opened by DRKolev-code about 2 years ago
Labels: bug, Local Mode
#4179 - HuggingFace Estimator requires `py_version` when `image_uri` is specified
Issue -
State: closed - Opened by j-adamczyk about 2 years ago
- 1 comment
Labels: bug
#4168 - Remove upper bound on urllib in the `local_requirements.txt` for CVE-2023-43804
Issue -
State: closed - Opened by jmahlik about 2 years ago
- 2 comments
Labels: bug
#4166 - JumpstartEstimator doesn't recognize trn1 instance types
Issue -
State: closed - Opened by mstfldmr about 2 years ago
- 2 comments
Labels: bug
#4142 - Pipeline code upload location invalid (standalone job pattern applied) for steps included in Condition step
Issue -
State: closed - Opened by AndreiVoinovTR about 2 years ago
- 4 comments
Labels: bug, component: pipelines
#4137 - docker compose v2
Issue -
State: open - Opened by DRKolev-code about 2 years ago
Labels: bug, type: feature request, Docker
#4130 - Valid JSONPath failing in QualityCheckStep
Issue -
State: closed - Opened by vmatekole about 2 years ago
- 4 comments
Labels: bug, Pending information, component: pipelines, component: Inference APIs
#4120 - SageMaker Model Cards UI doesn't show Model Cards created using model package details
Issue -
State: closed - Opened by l-m-j about 2 years ago
- 4 comments
Labels: bug, UI
#4115 - Endpoint failing after initially passing ping health check
Issue -
State: open - Opened by nfarley-soaren about 2 years ago
- 2 comments
Labels: bug, component: hosting
#4113 - [FATAL tini (7)] exec train failed: No such file or directory
Issue -
State: closed - Opened by celsofranssa about 2 years ago
- 12 comments
Labels: bug, type: logging/error reporting
#4106 - Sagemaker jumpstart session breaks sagemaker SDK import when partial AWS credentials are present
Issue -
State: closed - Opened by elemakil over 2 years ago
- 3 comments
Labels: bug
#4097 - Estimator modifies input InstanceGroup configurations in-place, preventing them from being reused
Issue -
State: closed - Opened by saimidu over 2 years ago
Labels: bug
#4090 - Remote function fails on windows machine
Issue -
State: closed - Opened by andrevus over 2 years ago
Labels: bug, OS: Windows
#4071 - Sagemaker with Dockerfile
Issue -
State: closed - Opened by celsofranssa over 2 years ago
- 4 comments
Labels: bug
#4048 - Cron is 1 day off
Issue -
State: open - Opened by numeric-lee over 2 years ago
Labels: bug
#4043 - Jumpstart with Cross-Account Role Assumption Fails with GetObject Access Denial
Issue -
State: closed - Opened by mencarellic over 2 years ago
- 1 comment
Labels: bug
#4042 - MultiRecord Batch Transform is not working for a Multi container Model (Inference Pipeline)
Issue -
State: open - Opened by Elkinmt19 over 2 years ago
Labels: bug, type: logging/error reporting
#4041 - Register method in PipelineModel does not return a ModelPackage instance as said in documentation
Issue -
State: closed - Opened by SachieTran over 2 years ago
Labels: bug, type: documentation
#4038 - Docker compose v2 support in local mode
Issue -
State: closed - Opened by sateeshmannar over 2 years ago
- 4 comments
Labels: bug
#4034 - `DatasetBuilder.to_dataframe()` fails if S3 buckets are encrypted with server-side KMS encryption and a KMS key is supplied.
Issue -
State: open - Opened by groverpr over 2 years ago
Labels: bug
#4028 - TensorFlowProcessor tries to run python script using /bin/bash as its entrypoint
Issue -
State: closed - Opened by svpino over 2 years ago
- 2 comments
Labels: bug, Tensorflow
#4024 - Local mode errors for PySparkProcessor for instance_type="local"
Issue -
State: closed - Opened by j-adamczyk over 2 years ago
- 1 comment
Labels: bug
#4020 - DJLModel inference error - Allocation larger than expected: tag 'qkv'
Issue -
State: closed - Opened by yapweiyih over 2 years ago
- 1 comment
Labels: bug
#4017 - SageMaker pipeline parallelism_config doesn't work
Issue -
State: closed - Opened by jrevuelta-chwy over 2 years ago
- 1 comment
Labels: bug, component: pipelines
#4016 - Pipeline parameters validation for Local mode
Issue -
State: closed - Opened by patrick-239 over 2 years ago
- 2 comments
Labels: bug, component: pipelines
#4006 - Fine-tuning Tensorflow object detection algorithm on custom data error related to 'transfer_learning.py'
Issue -
State: closed - Opened by yangwu-hpa over 2 years ago
- 2 comments
Labels: bug
#3997 - Transform job from Model does not pass role
Issue -
State: open - Opened by shakedel over 2 years ago
Labels: bug, component: Inference APIs
#3993 - ValueError(f"Bad value for instance type: '{instance_type}'")
Issue -
State: closed - Opened by mlsquareup over 2 years ago
- 7 comments
Labels: bug, component: pipelines
#3989 - SageMaker experiments not being created within training job when mandated to have tags
Issue -
State: open - Opened by inchara1990 over 2 years ago
- 8 comments
Labels: bug, component: experiments
#3988 - ContextualVersionConflict: Sagemaker
Issue -
State: open - Opened by MorganWeiss over 2 years ago
- 1 comment
Labels: bug
#3974 - Pipeline AttributeError: 'ParameterString' object has no attribute 'startswith'
Issue -
State: closed - Opened by urirosenberg over 2 years ago
- 3 comments
Labels: bug, component: pipelines
#3959 - Cannot schedule model quality job
Issue -
State: closed - Opened by Nick-McElroy over 2 years ago
- 3 comments
Labels: bug, component: Inference APIs
#3955 - Sagemaker Pipeline DHCP error in VPC config
Issue -
State: closed - Opened by m-rajput over 2 years ago
- 7 comments
Labels: bug, component: processing
#3952 - Pytorch 2.0 dataloader workers have wrong cpu affinity set.
Issue -
State: closed - Opened by usamec over 2 years ago
- 3 comments
Labels: bug, PyTorch
#3931 - Bloomz models having task name as textgeneration1 on JumpStart
Issue -
State: open - Opened by mrgiba over 2 years ago
- 2 comments
Labels: bug, In progress, component: jumpstart
#3928 - Unable to upgrade to new sagemaker version due to PyYAML conflict
Issue -
State: closed - Opened by brianloyal over 2 years ago
- 1 comment
Labels: bug
#3917 - Passing instance type as a ParameterString to PyTorch estimator in 2.163.0 crashes
Issue -
State: closed - Opened by pviolette3 over 2 years ago
- 6 comments
Labels: bug, component: pipelines
#3908 - AutoMLStep Does Not Support Constant-Valued problem_type
Issue -
State: closed - Opened by mbbourgo over 2 years ago
- 8 comments
Labels: bug, component: auto-ml
#3905 - Deploy Falcon-7b model on Sagemaker endpoint
Issue -
State: closed - Opened by karthikgali over 2 years ago
- 10 comments
Labels: bug
#3883 - Customized model data download timeout is not supported in China regions (cn-north-1 & cn-northwest-1)
Issue -
State: closed - Opened by KraftZzz over 2 years ago
- 1 comment
Labels: bug, Pending information
#3876 - No support for llama model type because version 4.28 of transformers is not supported
Issue -
State: open - Opened by SKLC1 over 2 years ago
Labels: bug, LLama2
#3874 - KeyError: 'ResourceName'
Issue -
State: closed - Opened by cybor0 over 2 years ago
Labels: bug, component: Utility APIs
#3860 - Object of type ParameterString is not JSON serializable
Issue -
State: open - Opened by aravinddeveloper over 2 years ago
Labels: bug
#3857 - Relax local-mode PyPI requirements on urllib3
Issue -
State: closed - Opened by ozancaglayan over 2 years ago
Labels: bug, Local Mode
#3815 - cannot import name 'get_base_python_image_uri' from 'sagemaker.image_uris' (/home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages/sagemaker/image_uris.py)
Issue -
State: closed - Opened by KraftZzz over 2 years ago
- 5 comments
Labels: bug, remote-function
#3803 - Neuron Image URI config file is out of date
Issue -
State: open - Opened by mmcclean-aws over 2 years ago
- 1 comment
Labels: bug, component: neo, type: config missing, component: image uri
#3772 - SageMaker exhibits unpredictable behavior when passing entry point to `Model`.
Issue -
State: closed - Opened by collincunn over 2 years ago
- 1 comment
Labels: bug
#3769 - DJL not passing through sagemaker session when downloading s3 artifact
Issue -
State: closed - Opened by jbarz1 over 2 years ago
- 1 comment
Labels: bug
#3758 - sagemaker:CreateProcessingJob doesn't accept an empty string, unlike local mode, which can handle it
Issue -
State: closed - Opened by idanmoradarthas over 2 years ago
- 1 comment
Labels: bug, component: processing
#3748 - DJLModel doesn't support ml.p4d.24xlarge for deployment
Issue -
State: closed - Opened by andjsmi over 2 years ago
- 1 comment
Labels: bug
#3743 - The `to_input_req` method of the TuningJobCompletionCriteriaConfig doesn't work and makes it unusable
Issue -
State: open - Opened by konradsemsch over 2 years ago
- 1 comment
Labels: bug
#3742 - %pip install --upgrade pip sagemaker==2.140.1 fails with error
Issue -
State: closed - Opened by teraiyam over 2 years ago
- 1 comment
Labels: bug
#3721 - Training job conflict error when installing local src
Issue -
State: open - Opened by cirofdo over 2 years ago
- 1 comment
Labels: bug
#3702 - sagemaker.image_uris.retrieve "container_version" parameter does not work as expected
Issue -
State: open - Opened by arjkesh almost 3 years ago
Labels: bug, component: Utility APIs
#3699 - KMS key not supported for FrameworkProcessor.run()
Issue -
State: closed - Opened by huyqd almost 3 years ago
Labels: bug
#3690 - Predictor: SSL validation failed
Issue -
State: closed - Opened by jordanparker6 almost 3 years ago
- 4 comments
Labels: bug
#3677 - Deprecate from Error to Warning: "KMS key is not supported for NVMe instance storage"
Issue -
State: closed - Opened by sermolin almost 3 years ago
- 2 comments
Labels: bug, type: logging/error reporting
#3673 - list_runs doesn't work twice. Raises a ValueError about the name length of the run.
Issue -
State: closed - Opened by JanetMatsen almost 3 years ago
- 3 comments
Labels: bug, component: experiments
#3619 - sagemaker limitations
Issue -
State: open - Opened by Shaked35 almost 3 years ago
- 3 comments
Labels: bug
#3557 - TransformStep transforms files it should not
Issue -
State: closed - Opened by HarryPommier almost 3 years ago
- 2 comments
Labels: bug, component: pipelines
#3494 - Baseline Job for ModelExplainabilityMonitor failing due to ClientError: An error occurred (ModelError) when calling the InvokeEndpoint operation (reached max retries: 0): Received server error (500)
Issue -
State: open - Opened by irdanish11 about 3 years ago
- 1 comment
Labels: bug, component: clarify, component: model monitor
#3491 - Make sourcedir.tar.gz and repacked model.tar.gz structure consistent
Issue -
State: open - Opened by plienhar about 3 years ago
Labels: bug, component: Inference APIs
#3455 - Training job not saved to S3 despite providing S3 output location due to no model artifact saved under path /opt/ml/model
Issue -
State: closed - Opened by yshen92 about 3 years ago
- 3 comments
Labels: bug, component: training
#3454 - Estimator .fit() fails in Jakarta region with built-in algo, due to added "USE_SMDEBUG" environment variable when debugger is not supported
Issue -
State: open - Opened by yudho about 3 years ago
Labels: bug, component: training
#3452 - Unexpected keyword argument 'strategy_config' at Session._map_tuning_config()
Issue -
State: closed - Opened by yshen92 about 3 years ago
- 8 comments
Labels: bug, type: documentation
#3397 - Can not set endpoint environment variables when deploying a model package
Issue -
State: closed - Opened by l3ku about 3 years ago
- 2 comments
Labels: bug
#3392 - SageMaker pipeline Caching Pipeline Steps not working for Transform step
Issue -
State: closed - Opened by mouhannadali about 3 years ago
- 5 comments
Labels: bug, component: pipelines
#3389 - FrameworkProcessor does not use output_kms_key when uploading the code artifact
Issue -
State: closed - Opened by szamarin about 3 years ago
- 2 comments
Labels: bug
#3386 - .get_result() method of async endpoint response not working
Issue -
State: closed - Opened by joaopcm1996 about 3 years ago
- 4 comments
Labels: bug
#3381 - GIT_SSH script file handle not yet closed when clone command issued.
Issue -
State: closed - Opened by croth1 about 3 years ago
- 1 comment
Labels: bug, contributions welcome, Good First Issue
#3376 - Deploying Huggingface model into ml.inf1.xlarge succeeded, but ml.inf1.2xlarge failed
Issue -
State: closed - Opened by tagucci about 3 years ago
- 5 comments
Labels: bug, type: question, component: neo, HuggingFace, Neo
#3367 - DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
Issue -
State: closed - Opened by humanzz about 3 years ago
- 1 comment
Labels: bug
#3361 - HuggingFaceModel does not properly accept script mode environment variables
Issue -
State: closed - Opened by athewsey about 3 years ago
- 1 comment
Labels: bug, contributions welcome, HuggingFace, Good First Issue
#3360 - Sagemaker: AttributeError: 'LocalSagemakerClient' object has no attribute 'create_feature_group'
Issue -
State: open - Opened by akramIOT about 3 years ago
- 1 comment
Labels: bug, type: question, Local Mode
#3357 - OOM when resuming training from checkpoint
Issue -
State: open - Opened by renziver about 3 years ago
- 1 comment
Labels: bug, HuggingFace
#3356 - sagemaker.session.download_data() is unable to download S3 content.
Issue -
State: closed - Opened by anrikus about 3 years ago
- 1 comment
Labels: bug
#3332 - Passed in session should be used for feature group ingest
Issue -
State: open - Opened by sampoorna over 3 years ago
- 3 comments
Labels: bug, component: feature store
#3331 - Increase shared memory for local mode
Issue -
State: closed - Opened by bengruher over 3 years ago
- 2 comments
Labels: bug, contributions welcome, Local Mode
#3319 - Huggingface estimator tries to import tensorflow when pytorch is defined
Issue -
State: closed - Opened by marinone94 over 3 years ago
- 3 comments
Labels: bug, HuggingFace
#3295 - New line causes ValidationException (potential race condition)
Issue -
State: closed - Opened by l1x over 3 years ago
- 2 comments
Labels: bug
#3279 - v2.102 crashes when launching Pytorch estimator job
Issue -
State: closed - Opened by rahul003 over 3 years ago
- 2 comments
Labels: bug
#3267 - AthenaDatasetDefinition has required params
Issue -
State: open - Opened by hes-dev23 over 3 years ago
Labels: bug
#3250 - Missing Tensorflow 2.9 inference image
Issue -
State: closed - Opened by plumdog over 3 years ago
- 2 comments
Labels: bug
#3243 - Inference of PyTorch Model when mapped to Elastic Inference Accelerator is 15 times slower than CPU inference
Issue -
State: closed - Opened by Bilal-Yousaf over 3 years ago
- 3 comments
Labels: bug, component: hosting
#3229 - SageMaker Image Classification - Validation accuracy inconsistent
Issue -
State: open - Opened by rauldiaz over 3 years ago
- 1 comment
Labels: bug, Pending information
#3225 - Local mode does not work on EC2 instances
Issue -
State: closed - Opened by MatthewCaseres over 3 years ago
- 3 comments
Labels: bug, contributions welcome, Local Mode, Docker
#3166 - FrameworkProcessor doesn't install packages in requirements.txt if it's in a Sagemaker Project
Issue -
State: closed - Opened by mstfldmr over 3 years ago
- 4 comments
Labels: bug, Pending information
#3147 - setup.py sets upper bound on `importlib-metadata`
Issue -
State: closed - Opened by dror-weiss over 3 years ago
- 2 comments
Labels: bug
#3090 - Feature Store methods are broken
Issue -
State: closed - Opened by Guillem96 over 3 years ago
- 5 comments
Labels: bug, contributions welcome, component: feature store
#3079 - AttributeError: 'CustomFramework' object has no attribute 'framework_version'
Issue -
State: closed - Opened by jagadeesr over 3 years ago
- 3 comments
Labels: bug, Custom Framework
#3062 - Sagemaker arbitrarily stops copying checkpoints to S3
Issue -
State: closed - Opened by mnslarcher over 3 years ago
- 13 comments
Labels: bug
#3030 - `sagemaker.pytorch.PyTorchModel` requires `framework_version` even if `image_uri` is provided
Issue -
State: closed - Opened by alar0330 over 3 years ago
- 1 comment
Labels: bug, PyTorch, component: hosting
#3012 - Cannot deploy Huggingface model onto serverless endpoint
Issue -
State: closed - Opened by Peter-Devine over 3 years ago
- 5 comments
Labels: bug, Pending information
#2952 - SM Elastic Inference Accelerators are not available during inference
Issue -
State: closed - Opened by vinayak-shanawad almost 4 years ago
- 1 comment
Labels: bug, component: Inference APIs