Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / GoogleCloudPlatform/ml-auto-solutions issues and pull requests

#426 - Adding a multislice test for maxdiffusion

Pull Request - State: closed - Opened by ssusie about 2 months ago

#425 - update to 2-5 for v6e test name

Pull Request - State: closed - Opened by ManfeiBai about 2 months ago

#424 - Install the optional dependencies with axlearn.

Pull Request - State: closed - Opened by lukebaumann about 2 months ago - 2 comments

#423 - Add Trillium CI tests for 2.5 release

Pull Request - State: closed - Opened by bhavya01 about 2 months ago

#422 - Update PyTorch/XLA r2.5 benchmarking with rc5 wheel

Pull Request - State: closed - Opened by ManfeiBai about 2 months ago

#421 - Increase timeouts for tensorflow tests

Pull Request - State: closed - Opened by chandrasekhard2 about 2 months ago

#420 - Yijiaj/mlperf a2

Pull Request - State: open - Opened by jyj0w0 about 2 months ago - 1 comment

#419 - add "multipod_team" tag to bite DAG

Pull Request - State: closed - Opened by sadikneipp about 2 months ago

#418 - Add redundancy to multipod codeowners

Pull Request - State: closed - Opened by jonb377 about 2 months ago - 1 comment

#417 - Add jetstream-pytorch gemma 7b tests

Pull Request - State: closed - Opened by sixiang-google about 2 months ago

#416 - Fixing incorrect command which is causing tests to fail

Pull Request - State: closed - Opened by parambole about 2 months ago

#415 - Change configs for inference dashboard & adjust num_prompts

Pull Request - State: closed - Opened by sixiang-google about 2 months ago

#414 - Clean up code owners

Pull Request - State: open - Opened by will-cromar about 2 months ago

#413 - Trillium addition - maxtext inference microbenchmarking

Pull Request - State: closed - Opened by mailvijayasingh about 2 months ago

#412 - Fixing test images and run command

Pull Request - State: closed - Opened by parambole about 2 months ago

#411 - Add vLLM inference benchmarks

Pull Request - State: open - Opened by richardsliu about 2 months ago

#410 - Add maxtext single host benchmark runner

Pull Request - State: open - Opened by ortibazar about 2 months ago

#409 - Change a3 cluster and its zone

Pull Request - State: closed - Opened by NinaCai about 2 months ago - 2 comments

#408 - Update test workflow to use the new mistral end-to-end script.

Pull Request - State: closed - Opened by shralex about 2 months ago

#407 - Add different batch sizes for llama and gemma

Pull Request - State: closed - Opened by mailvijayasingh about 2 months ago

#406 - Remove team tag from example dag

Pull Request - State: open - Opened by jonb377 about 2 months ago

#405 - make mantaray_gcs_bucket a param

Pull Request - State: closed - Opened by wenxindongwork about 2 months ago

#404 - Add CI tests for Trillium

Pull Request - State: closed - Opened by bhavya01 about 2 months ago - 2 comments

#403 - Create DAGs for microbenchmarks

Pull Request - State: closed - Opened by qinyiyan about 2 months ago - 1 comment

#402 - Fix device type in v5e cluster

Pull Request - State: closed - Opened by jonb377 about 2 months ago - 1 comment

#401 - Split GPU and TPU MaxText end_to_end tests

Pull Request - State: closed - Opened by jonb377 about 2 months ago

#400 - Update imageTag to nightly_3.10

Pull Request - State: closed - Opened by bhavya01 about 2 months ago

#399 - Fix for resource congestion & add tests with different batch sizes

Pull Request - State: closed - Opened by sixiang-google about 2 months ago

#398 - Refactoring Workload Stable Stack Images

Pull Request - State: closed - Opened by parambole about 2 months ago

#397 - Centralize XPK cluster details

Pull Request - State: closed - Opened by jonb377 about 2 months ago - 1 comment

#396 - Remove dangling Jax Stable Stack task group

Pull Request - State: closed - Opened by parambole 2 months ago

#395 - Adding maxdiffusion test to assert training loss

Pull Request - State: closed - Opened by parambole 2 months ago

#394 - fix multihost tf tests

Pull Request - State: closed - Opened by chandrasekhard2 2 months ago

#393 - Update trillium test cluster

Pull Request - State: closed - Opened by raymondzouu 2 months ago

#392 - Fix max* tensorboard file location

Pull Request - State: closed - Opened by raymondzouu 2 months ago

#391 - Update PyTorch/XLA nightly link.

Pull Request - State: closed - Opened by bhavya01 2 months ago - 1 comment

#389 - Added a new DAG and tests for MaxDiffusion

Pull Request - State: closed - Opened by parambole 2 months ago

#387 - Move v4-128 tests to new cluster

Pull Request - State: closed - Opened by bvandermoon 2 months ago

#386 - add data pipeline to convergence test

Pull Request - State: open - Opened by aireenmei 2 months ago

#385 - Add MaxText perf tests on trillium

Pull Request - State: closed - Opened by raymondzouu 2 months ago

#384 - Delete flax tests

Pull Request - State: closed - Opened by wenxindongwork 2 months ago - 1 comment

#383 - Update common.libsonnet to use built nightly wheel

Pull Request - State: open - Opened by ManfeiBai 2 months ago

#382 - asd

Pull Request - State: closed - Opened by ManfeiBai 2 months ago

#381 - Nightly torch wheel

Pull Request - State: closed - Opened by ManfeiBai 2 months ago

#380 - Update airflow tests for PyTorch/XLA 2.5 release

Pull Request - State: closed - Opened by ManfeiBai 2 months ago - 3 comments
Labels: 2.5 release

#379 - Update and rename r2_4.py to r2_5.py

Pull Request - State: closed - Opened by ManfeiBai 2 months ago

#377 - Update owners

Pull Request - State: closed - Opened by yeandy 3 months ago

#375 - Fixing JAX Stable Stack end-to-end breaking tests

Pull Request - State: closed - Opened by parambole 3 months ago

#374 - Jetstream-pytorch fix setup & add tests

Pull Request - State: closed - Opened by sixiang-google 3 months ago - 1 comment

#373 - Use reserved v5p

Pull Request - State: closed - Opened by zpcore 3 months ago

#372 - Remove libtpu lockfile before reloading tpu-runtime

Pull Request - State: closed - Opened by chandrasekhard2 3 months ago - 1 comment

#371 - Add AoT test for A3 GPU

Pull Request - State: closed - Opened by jonb377 3 months ago

#370 - Support preemptible TPU using GCE

Pull Request - State: closed - Opened by zpcore 4 months ago - 2 comments

#369 - JAX Stable Stack Docker Image for AOT Hybridsim Tests

Pull Request - State: closed - Opened by parambole 4 months ago

#368 - end_to_end tests for mixtral-8x22b

Pull Request - State: closed - Opened by rdyro 4 months ago - 2 comments

#367 - Fix setup instructions for TF DLRM model

Pull Request - State: closed - Opened by chandrasekhard2 4 months ago

#366 - Adding temp MXLA test for v5p GA as requested.

Pull Request - State: closed - Opened by tonyjohnchen 4 months ago

#365 - Fix r2.4 GPU docker link

Pull Request - State: closed - Opened by zpcore 4 months ago - 1 comment

#364 - Update composer_env.py

Pull Request - State: closed - Opened by wenxindongwork 4 months ago

#363 - fix flax tests by disabling profiling from the flax tests

Pull Request - State: closed - Opened by ZhaoyueCheng 4 months ago - 2 comments

#362 - Package common library code in `xlml`

Pull Request - State: closed - Opened by will-cromar 4 months ago - 3 comments

#361 - fix test version not used

Pull Request - State: closed - Opened by zpcore 4 months ago

#360 - Remove references to `torch_xla[tpuvm]` optional dependency

Pull Request - State: closed - Opened by will-cromar 4 months ago

#359 - update setuptools version

Pull Request - State: closed - Opened by zpcore 4 months ago - 2 comments

#358 - add r2.4 test and fix setuptool version

Pull Request - State: closed - Opened by zpcore 4 months ago - 1 comment

#357 - Fix permissions for PT/XLA 2.4 release tests

Pull Request - State: closed - Opened by bhavya01 4 months ago - 1 comment

#356 - Update Pytorch/XLA 2.4 release wheel to rc8

Pull Request - State: closed - Opened by bhavya01 4 months ago - 1 comment

#355 - Fix typo in PT/XLA llama2 test setup

Pull Request - State: closed - Opened by will-cromar 4 months ago

#354 - Integrate Mantaray into XLML

Pull Request - State: closed - Opened by wenxindongwork 4 months ago

#353 - Modified DAGs to consume jax-stable-stack docker image

Pull Request - State: closed - Opened by parambole 4 months ago - 1 comment

#352 - Fix permissions for PT/XLA llama tests

Pull Request - State: closed - Opened by will-cromar 4 months ago

#351 - Add to inference code owners

Pull Request - State: closed - Opened by JoeZijunZhou 4 months ago

#350 - Remove moe_matmul parameter in JetStream+Maxtext benchmarks

Pull Request - State: closed - Opened by yeandy 4 months ago

#349 - Return empty string if gcs file location not found

Pull Request - State: closed - Opened by raymondzouu 4 months ago - 2 comments

#348 - Increase number of dependencies in v5e maxtext perf tests

Pull Request - State: closed - Opened by raymondzouu 4 months ago

#347 - add @sixiang-google to codeowners

Pull Request - State: closed - Opened by sixiang-google 4 months ago

#346 - Add dependencies between v5e perf tests

Pull Request - State: closed - Opened by raymondzouu 5 months ago - 3 comments

#345 - Update Mixtral BS

Pull Request - State: open - Opened by yeandy 5 months ago - 3 comments

#344 - Fix check of bool str for quantize_kvcache

Pull Request - State: closed - Opened by yeandy 5 months ago

#343 - Add pathways v5e perf tests

Pull Request - State: closed - Opened by raymondzouu 5 months ago

#342 - add some comment

Pull Request - State: closed - Opened by morgandu 5 months ago

#341 - Update PyTorch/XLA 2.4 release version for tests

Pull Request - State: closed - Opened by bhavya01 5 months ago

#340 - Remove pt-nightly-resnet50-mp-plugin* tests

Pull Request - State: closed - Opened by vanbasten23 5 months ago

#339 - Add MaxText Llama2 to v5e perf tests

Pull Request - State: closed - Opened by raymondzouu 5 months ago

#338 - Add Mixtral-8x7B inference nightly/stable runs

Pull Request - State: closed - Opened by vipannalla 5 months ago - 2 comments

#337 - Multi-Model MLPerf4.0 Reproduce on A3

Pull Request - State: closed - Opened by jyj0w0 5 months ago - 2 comments

#336 - Change cluster zone for a3+ cluster

Pull Request - State: closed - Opened by NinaCai 5 months ago

#335 - [test] Update llama2-model.libsonnet

Pull Request - State: closed - Opened by ManfeiBai 5 months ago - 1 comment

#334 - add jetstream-pytorch initial automation tests

Pull Request - State: closed - Opened by sixiang-google 5 months ago - 4 comments

#333 - Modify release tests for 2.17.0

Pull Request - State: closed - Opened by chandrasekhard2 5 months ago

#332 - Update for jetstream maxtext

Pull Request - State: closed - Opened by morgandu 5 months ago

#331 - fix torchxla2 GPU failure

Pull Request - State: closed - Opened by zpcore 5 months ago

#330 - some changes

Pull Request - State: closed - Opened by morgandu 5 months ago

#329 - Add GPU test for torchxla2

Pull Request - State: closed - Opened by zpcore 5 months ago

#328 - Minor update to GKE and update torchbench terraform

Pull Request - State: closed - Opened by zpcore 5 months ago - 1 comment