Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / sparticlesteve/cosmoflow-benchmark issues and pull requests
#49 - inflate() failed with error -4: incorrect header check
Issue -
State: open - Opened by RumitAP over 2 years ago
- 5 comments
#48 - Resuming from checkpoint always runs at least one epoch despite target
Issue -
State: open - Opened by sparticlesteve over 2 years ago
#47 - update docker container to ngc 22.04
Pull Request -
State: closed - Opened by sparticlesteve over 2 years ago
#46 - Various cleanups / updates
Pull Request -
State: closed - Opened by sparticlesteve over 2 years ago
#45 - Reproducibility updates
Pull Request -
State: closed - Opened by sparticlesteve over 2 years ago
#44 - Fix shard + shuffle
Pull Request -
State: closed - Opened by sparticlesteve almost 3 years ago
#43 - Configs and code for MLPerf HPC v1.0 RCPs
Pull Request -
State: closed - Opened by sparticlesteve about 3 years ago
#42 - Delete cosmo_dryrun.yaml
Pull Request -
State: closed - Opened by sparticlesteve about 3 years ago
#41 - fix global shuffling
Issue -
State: closed - Opened by sparticlesteve over 3 years ago
Labels: bug
#40 - Adding mlperf log events for weak-scaling metrics
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#39 - Fix epoch numbering for mlperf logging
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#38 - Added new HPs to mlperf logging
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#37 - Implement random seed for reproducibility
Issue -
State: closed - Opened by sparticlesteve over 3 years ago
- 1 comment
#36 - Logging bug in TF 2.3 - 2.4
Issue -
State: closed - Opened by sparticlesteve over 3 years ago
- 1 comment
#35 - set default prefetch and parallel reads to use autotune
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#34 - dockerfile for nvidia gpu systems
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#33 - update build_model function defaults to mlperf closed-div config
Pull Request -
State: closed - Opened by sparticlesteve over 3 years ago
#32 - Update python default settings to match closed division config
Issue -
State: closed - Opened by sparticlesteve almost 4 years ago
#31 - Revert to horovod built-in load_model function
Pull Request -
State: closed - Opened by sparticlesteve almost 4 years ago
- 1 comment
#30 - Problem continuing from checkpoint with AMP+Horovod
Issue -
State: closed - Opened by sparticlesteve almost 4 years ago
- 2 comments
#29 - Metrics names inconsistent
Issue -
State: closed - Opened by sparticlesteve almost 4 years ago
- 2 comments
#28 - Add mixed precision support
Pull Request -
State: closed - Opened by sparticlesteve almost 4 years ago
- 1 comment
#27 - Update code to work with TF 2.2
Pull Request -
State: closed - Opened by sparticlesteve almost 4 years ago
- 2 comments
#26 - add support for mixed precision
Issue -
State: closed - Opened by sparticlesteve almost 4 years ago
#25 - Update code to TF 2
Issue -
State: closed - Opened by sparticlesteve almost 4 years ago
#24 - Updated data preprocessing
Pull Request -
State: closed - Opened by sparticlesteve almost 4 years ago
#23 - Error when reading data
Issue -
State: closed - Opened by jaidayal almost 4 years ago
- 1 comment
#22 - adding W&B logging
Pull Request -
State: closed - Opened by sparticlesteve about 4 years ago
- 1 comment
#21 - update legal+license
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#20 - Update cpu docker build file
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#19 - Stopping callback
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#18 - Logging data staging
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#17 - Target accuracy and hyperparameters
Issue -
State: closed - Opened by undertherain over 4 years ago
- 1 comment
#16 - Adding more performance tuning knobs
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#15 - updated timemory metric tracking around fit
Pull Request -
State: open - Opened by jbalma over 4 years ago
#14 - Docker/shifter workflow
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#13 - WIP adding mlperf logging
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#12 - Move data staging into main training script
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
- 1 comment
#11 - New LR schedule
Pull Request -
State: closed - Opened by sparticlesteve over 4 years ago
#10 - Updating to new 2019_05_4parE dataset
Pull Request -
State: closed - Opened by sparticlesteve almost 5 years ago
- 1 comment
#9 - Added epoch performance summary across all MPI processes
Pull Request -
State: closed - Opened by jbalma almost 5 years ago
#8 - create submit_scaling_theta.sh
Pull Request -
State: closed - Opened by memani1 about 5 years ago
- 1 comment
#7 - create train_theta.sh
Pull Request -
State: closed - Opened by memani1 about 5 years ago
- 1 comment
#6 - create setup_theta.sh file
Pull Request -
State: closed - Opened by memani1 about 5 years ago
- 1 comment
#5 - Dataset checksum
Issue -
State: open - Opened by ekuznetsov139 about 5 years ago
#4 - Summit staff, adding scripts, notebook, and modify the cosmo.py and t…
Pull Request -
State: closed - Opened by tsaris about 5 years ago
#3 - summit staff
Pull Request -
State: closed - Opened by tsaris about 5 years ago
#2 - summit staff new
Pull Request -
State: closed - Opened by tsaris about 5 years ago
#1 - summit staff
Pull Request -
State: closed - Opened by tsaris about 5 years ago
- 2 comments