Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / oap-project/cloudtik issues and pull requests
#1362 - Examples: Horovod on Spark examples GPU support
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1361 - Examples: torch checkpoint to save model in cpu location
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1360 - Examples: synthetic ImageNet example for PyTorch distributed
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1359 - Examples: fix the makedirs permission issue
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1358 - ML: Fix driver NIC issue for Horovod
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1357 - Examples: example folder change the name to examples
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1356 - Examples: PyTorch examples to support GPU
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1355 - Templates: Smaller head for standard and small GPU templates
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1354 - Templates: add very small GPU templates for use of testing
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1353 - Templates: make the GPU templates consistent on naming
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1352 - Alibaba Cloud: add integration test cases
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1351 - Example: add cluster examples for ml (CPU, GPU and oneAPI)
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1350 - Example: ML example for resnet50 with IPEX (need workaround fix)
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1349 - Core: no wait for minimal nodes with an operating quorum
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1348 - Core: fix the quorum launch check logic
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1347 - Core: implement the quorum management of minimal nodes
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1346 - Core: allow minimal nodes cluster to avoid scale on failure
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1345 - Examples: add zookeeper test example
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1344 - Tools: install dlib which removed from the core ml runtime
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1343 - Tools: use the fixed intelai-models commit
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1342 - Core: by default disable head automatic runtime detection
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1341 - Core: commands to handle GPU resource info
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1340 - Providers: auto detect GPU resources from instance type
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1339 - Azure: built-in GPU templates for Azure
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1338 - AWS: refine GPU templates with a base config
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1337 - GCP: Fix the wait for driver
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1336 - GCP: check driver installation only on worker
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1335 - AWS: gpu templates rename to lower case
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1334 - GCP: built-in templates for GPU instances
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1333 - Core: retrying commands as common practice because of possible restart
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1332 - Core: config merge support advanced list appending
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1331 - AWS: aws gpu templates
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1330 - ML: consistent GPU cuda libraries version
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1329 - GCP: Fix the order of setting image source
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1328 - AWS: update the latest image ids of the regions for GPU
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1327 - Core: docker to use GPU tagged image based on runtime
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1326 - Alibaba Cloud: use cpu or gpu image based on runtime
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1325 - GCP: choose the cpu or gpu image based on runtime
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1324 - Azure: choose cpu or gpu image at bootstrap
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1323 - AWS: auto configure the image id if gpu is configured
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1322 - AWS: refine the database instance management for workspace
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1321 - ML: Fix the ML runtime docker to set the right env
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1320 - Dev: release docker with GPU options
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1319 - ML: Initial code for ML to support GPU
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1318 - Support cloud database for AWS.
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
- 1 comment
#1317 - Get HuaweiCloud provider default cluster image
Pull Request -
State: closed - Opened by kiwik almost 2 years ago
- 1 comment
#1316 - Benchmarks: Add models original code about DLRM dist training.
Pull Request -
State: closed - Opened by yao531441 almost 2 years ago
#1315 - Dev: improve the release docker to release image individually
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1314 - ML: upgrade MLflow from 2.1.1 to 2.2.2
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1313 - Copy the source code of RNNT and SSD-RESNET to CloudTik.
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
- 3 comments
#1312 - Add op_svc_userid into worker node metadata
Pull Request -
State: closed - Opened by kiwik almost 2 years ago
#1311 - ML: Fix the examples for legacy optimizers change from TensorFlow 2.11
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1310 - ML: upgrade TensorFlow to 2.12.0 for oneAPI ML runtime
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1309 - ML: Horovod upgrade to 0.27.0 for TensorFlow 2.12.0
Pull Request -
State: closed - Opened by jerrychenhf almost 2 years ago
#1308 - Benchmarks: support to run maskcnn with or without IPEX.
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
- 1 comment
#1307 - Add source code for maskcnn of ai-model
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
#1306 - Patch ai models during running bootstrap-models.sh
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
- 1 comment
#1305 - Benchmarks: Marked modifications to bert-large and ResNet50 distributed training in models.
Pull Request -
State: closed - Opened by yao531441 almost 2 years ago
#1304 - Support to run training or inference for ssd-resnet34 without IPEX.
Pull Request -
State: closed - Opened by haojinIntel almost 2 years ago
- 2 comments
#1293 - Add HUAWEICLOUD integration test
Pull Request -
State: closed - Opened by kiwik almost 2 years ago
#1194 - Can cloudtik support Alicloud?
Issue -
State: closed - Opened by george-gu-2021 almost 2 years ago
- 2 comments
#1011 - [Feature] Add HuaweiCloud provider
Issue -
State: open - Opened by kiwik about 2 years ago
- 49 comments