GitHub / rwth-i6/returnn issues and pull requests
#1737 - `DistributeFilesDataset`: `distrib_shard_files=True` leads to assertion error
Issue -
State: closed - Opened by Icemole 17 days ago
- 2 comments
#1726 - Distributed trainings: wait for data at every step necessary?
Issue -
State: open - Opened by NeoLegends 2 months ago
- 1 comment
#1725 - Add tag_prefix as a parameter to LmDataset
Pull Request -
State: closed - Opened by dorian-K 3 months ago
#1724 - PT: No backend type associated with device type cpu
Issue -
State: open - Opened by NeoLegends 3 months ago
- 4 comments
#1723 - test_rel_pos_self_attention (nondeterministic? rare?) exception: dim_orig is None
Issue -
State: open - Opened by albertz 3 months ago
#1722 - get_complete_frac fix when num_seqs is None
Pull Request -
State: closed - Opened by albertz 3 months ago
#1721 - Dataset's num_seqs is None with TF backend for Viterbi training
Issue -
State: closed - Opened by Marvin84 3 months ago
- 2 comments
#1720 - Drop older Python support
Issue -
State: open - Opened by albertz 3 months ago
#1719 - tooling: change formatter to ruff
Pull Request -
State: closed - Opened by NeoLegends 3 months ago
- 16 comments
#1718 - FileCache.get_file error: No space left on device
Issue -
State: open - Opened by albertz 3 months ago
#1717 - `DistributeFilesDataset`: allow specifying files via list file
Pull Request -
State: closed - Opened by NeoLegends 3 months ago
Labels: enhancement
#1716 - HDFDataset startup for huge dataset is slow
Issue -
State: open - Opened by albertz 3 months ago
- 1 comment
#1715 - PostprocessingDataset serialization fails due to map_seq_stream_preserves_num_seqs
Issue -
State: closed - Opened by albertz 4 months ago
#1714 - RF RunCtx train_flag per func
Pull Request -
State: closed - Opened by albertz 4 months ago
#1713 - SimpleHDFWriter, sanity checks
Pull Request -
State: closed - Opened by albertz 4 months ago
#1712 - RF train_flag_ctx potentially not what you want, enable_dropout_ctx or enable_regularization_ctx or so instead
Issue -
State: closed - Opened by albertz 4 months ago
- 6 comments
#1711 - Fix incorrect `complete_frac` passed to `_print_process`
Pull Request -
State: closed - Opened by dorian-K 4 months ago
- 2 comments
#1710 - `torch.load` crash, due to changed defaults in torch >= 2.6
Issue -
State: open - Opened by NeoLegends 4 months ago
- 3 comments
Labels: bug
#1709 - FileCache: hold lock and refresh mtime during cleanup
Pull Request -
State: closed - Opened by NeoLegends 4 months ago
- 14 comments
Labels: bug
#1708 - RF combine inconsistent between native and pure Python
Issue -
State: open - Opened by albertz 4 months ago
- 2 comments
#1707 - Forward: OOM split batch crash on `epoch` data key
Issue -
State: closed - Opened by NeoLegends 4 months ago
#1706 - Tests failing, AttributeError: module 'torch' has no attribute 'compiler', transformers lib
Issue -
State: closed - Opened by albertz 4 months ago
- 3 comments
#1705 - Error from changes in engine.py
Issue -
State: closed - Opened by mmueller00 4 months ago
- 2 comments
#1704 - Add tensorboard to torch engine
Pull Request -
State: open - Opened by robin-p-schmitt 4 months ago
- 2 comments
#1703 - compile_tf_graph.py error when using --rec_step_by_step for an AED network
Issue -
State: open - Opened by jiangj-dc 4 months ago
#1702 - LLVM ERROR: Symbol not found: __svml_cosf8_ha
Issue -
State: closed - Opened by albertz 4 months ago
- 4 comments
#1701 - PostprocessingDataset with multi-processing
Issue -
State: open - Opened by albertz 4 months ago
- 2 comments
#1700 - MixingDataset needed
Issue -
State: open - Opened by albertz 4 months ago
- 3 comments
#1699 - RF tests, enable test_single_batch_entry globally
Pull Request -
State: closed - Opened by albertz 5 months ago
#1698 - PT: also use `complete_frac` for progress reporting
Pull Request -
State: closed - Opened by NeoLegends 5 months ago
Labels: enhancement
#1697 - PT: add randomization to bucket batching
Pull Request -
State: closed - Opened by NeoLegends 5 months ago
- 1 comment
#1696 - RF conv/pool, fix same padding with striding
Pull Request -
State: closed - Opened by albertz 5 months ago
#1695 - RF merge_dims, fix for mult dyn dims
Pull Request -
State: closed - Opened by albertz 5 months ago
#1694 - Frontend `merge_dims` problematic on dynamic dims
Issue -
State: closed - Opened by albertz 5 months ago
#1693 - Frontend `conv`/`pool` 'same' padding with striding is inconsistent
Issue -
State: closed - Opened by albertz 5 months ago
- 3 comments
#1692 - RF conv/pool/etc, stft, window, use_mask, new behavior version 23
Pull Request -
State: closed - Opened by albertz 5 months ago
- 1 comment
#1691 - Frontend: masking for more functions, global setting
Issue -
State: closed - Opened by albertz 5 months ago
- 7 comments
Labels: potential-new-behavior, returnn-frontend
#1690 - OggZip: add option to resample audio via ffmpeg
Pull Request -
State: closed - Opened by NeoLegends 5 months ago
#1689 - PT: pass dist rank/size via env to subprocesses
Pull Request -
State: closed - Opened by NeoLegends 5 months ago
- 3 comments
#1688 - PT preload_from_files ignores when no params are matching
Issue -
State: open - Opened by albertz 6 months ago
#1687 - `rf.set_default_device` (`torch.set_default_device`?) before model creation?
Issue -
State: open - Opened by albertz 6 months ago
#1686 - pytest collecting phase is slow
Issue -
State: open - Opened by albertz 6 months ago
#1685 - `DFDataset`: do not pickle sharding info
Pull Request -
State: closed - Opened by NeoLegends 6 months ago
- 6 comments
#1684 - fix build, publish wheels
Pull Request -
State: closed - Opened by dimbleby 6 months ago
- 1 comment
#1683 - Automatically sorting dataset does not work with Torch engine forward + MetaDatasets
Issue -
State: open - Opened by albertz 6 months ago
- 4 comments
#1682 - SprintCacheDataset issue with torch backend
Issue -
State: open - Opened by robin-p-schmitt 6 months ago
#1681 - Fix assert in LaplaceOrdering
Pull Request -
State: closed - Opened by dorian-K 6 months ago
- 2 comments
#1680 - Implicit `Tensor.__bool__` can cause unexpected behavior
Issue -
State: closed - Opened by albertz 6 months ago
- 1 comment
#1679 - Tensor, disallow __bool__
Pull Request -
State: closed - Opened by albertz 6 months ago
- 2 comments
#1678 - DistributeFilesDataset _num_shards issue
Issue -
State: open - Opened by Judyxujj 6 months ago
- 4 comments
#1677 - Add option to passthrough num_seqs in PostprocessingDataset
Pull Request -
State: closed - Opened by dorian-K 7 months ago
- 5 comments
#1676 - Dataset: implement global `dataset_distribution` option
Pull Request -
State: open - Opened by NeoLegends 7 months ago
- 4 comments
Labels: enhancement
#1675 - `FileNotFoundError` when updating mtime of files in file cache
Issue -
State: closed - Opened by NeoLegends 7 months ago
- 8 comments
Labels: bug
#1674 - Interrupt main thread on Exception in sub thread
Pull Request -
State: closed - Opened by NeoLegends 7 months ago
- 1 comment
#1673 - Dataset: allow LR scheduling based on `get_complete_frac`
Pull Request -
State: closed - Opened by NeoLegends 7 months ago
- 21 comments
#1672 - Fix _distribute_evenly_by_size for duplicate entries in files_order
Pull Request -
State: closed - Opened by dorian-K 7 months ago
- 1 comment
#1671 - SimpleHDFWriter extra seq lens not correct, not supporting custom seq lens
Issue -
State: open - Opened by albertz 7 months ago
#1670 - Cleanup `returnn.tf.compat`
Issue -
State: open - Opened by albertz 8 months ago
Labels: TensorFlow
#1669 - Loading large HDFDatasets inside MetaDataset is slow
Issue -
State: open - Opened by dorian-K 8 months ago
- 6 comments
#1668 - Bump required Python 3.7 -> 3.8, drop TF1 support
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 7 comments
#1667 - Unhandled exceptions in threads should halt the program
Issue -
State: closed - Opened by albertz 8 months ago
- 1 comment
#1666 - TODO add test for batching in RF RelPosSelfAttention
Issue -
State: closed - Opened by albertz 8 months ago
#1665 - RF cum_concat_step simplify and other RF things
Pull Request -
State: closed - Opened by albertz 8 months ago
- 5 comments
#1663 - `LockFile`: inspect other processes to check whether lockfile is held
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 1 comment
#1661 - PT: add uniform likelihood bucket batching
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 4 comments
#1660 - PT: allow custom batching, add bucket batching
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 1 comment
#1659 - Allow skipping sequences in forward config option
Pull Request -
State: closed - Opened by dorian-K 8 months ago
- 1 comment
#1658 - Allow inserting 0-length elements into HDF other
Pull Request -
State: closed - Opened by dorian-K 8 months ago
- 1 comment
#1657 - Bump github action versions to their most recent version
Pull Request -
State: closed - Opened by dorian-K 8 months ago
#1656 - PT: print padding amount per batch and subepoch
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 1 comment
Labels: enhancement
#1655 - `PPDataset`: implement `BucketOrdering`
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
- 6 comments
#1654 - Step count is not reset when loading a checkpoint and resetting the epoch
Issue -
State: open - Opened by mmueller00 8 months ago
- 2 comments
#1653 - Add extra_labels to SimpleHDFWriter
Pull Request -
State: closed - Opened by dorian-K 8 months ago
#1652 - `PPDataset`: be strict about `seq_order` and `seq_list` in `init_seq_order`
Pull Request -
State: closed - Opened by NeoLegends 8 months ago
#1651 - PostprocessingDataset init_seq_order with given seq_list or seq_order wrong (at least with map_seq_stream)
Issue -
State: closed - Opened by albertz 8 months ago
#1650 - MultiEpochDataset: implement get_current_seq_order
Pull Request -
State: closed - Opened by dorian-K 8 months ago
- 4 comments
#1649 - Train proc manager restarts after Bus error crash, still consumes GPU memory, get OutOfMemoryError
Issue -
State: open - Opened by albertz 9 months ago
#1648 - Unexpected bus error encountered in worker
Issue -
State: open - Opened by albertz 9 months ago
- 2 comments
#1647 - remove Nose dependency
Issue -
State: closed - Opened by albertz 9 months ago
- 3 comments
#1646 - `MPDataset`: make compatible with being wrapped in `PPDataset`
Pull Request -
State: closed - Opened by NeoLegends 9 months ago
Labels: bug
#1645 - Plan for packed dims
Issue -
State: open - Opened by albertz 9 months ago
- 3 comments
#1644 - FileCache: update mtime of lockfile immediately after acquiring it
Pull Request -
State: closed - Opened by NeoLegends 9 months ago
- 2 comments
#1643 - FileCache assertion on previous copy attempt age triggered
Issue -
State: closed - Opened by NeoLegends 9 months ago
- 11 comments
#1642 - RF (PT) meaning of losses with `as_error`
Issue -
State: open - Opened by albertz 9 months ago
#1641 - `Tensor` `Dim`, support `Dim.capacity > max(Dim.dyn_size_ext)`
Issue -
State: open - Opened by albertz 9 months ago
Labels: TPU, JAX
#1640 - MultiProcDataset, implement get_all_tags
Pull Request -
State: closed - Opened by albertz 9 months ago
#1639 - MultiEpochDataset, and some other smaller things
Pull Request -
State: closed - Opened by albertz 9 months ago
#1638 - Potential timeout during data caching in multi-node trainings
Issue -
State: open - Opened by NeoLegends 9 months ago
- 2 comments
Labels: bug
#1637 - Some fix for invalid broadcasting
Pull Request -
State: open - Opened by albertz 10 months ago
- 5 comments
#1636 - RF cross_entropy (matmul, gather) should maybe have allow_broadcast?
Issue -
State: open - Opened by albertz 10 months ago
#1635 - Remove outdated Python header attribs?
Issue -
State: open - Opened by albertz 10 months ago
#1634 - Sharding for multi-GPU training
Issue -
State: open - Opened by albertz 10 months ago
- 2 comments
#1633 - Dim declare_same_as, fix when existing same_as
Pull Request -
State: closed - Opened by albertz 10 months ago
#1632 - `LaplaceOrdering`: avoid spiky CPU utilization
Pull Request -
State: closed - Opened by NeoLegends 10 months ago
- 1 comment
Labels: bug
#1631 - `LaplaceOrdering` interacts badly w/ MultiProcDataset
Issue -
State: closed - Opened by NeoLegends 10 months ago
Labels: bug
#1630 - Datasets: implement support for within-dataset sharding
Pull Request -
State: closed - Opened by NeoLegends 10 months ago
- 4 comments
#1629 - DeepCopyError in gradient checkpointing when using `param_variational_noise`
Issue -
State: closed - Opened by mmz33 11 months ago
- 5 comments
#1628 - PT: regularly sync progress during eval, fix tensor assignment
Pull Request -
State: closed - Opened by NeoLegends 11 months ago
- 1 comment
#1627 - PT: regularly sync progress during eval
Pull Request -
State: closed - Opened by NeoLegends 11 months ago
Labels: bug
#1624 - `OggZipDataset`: normalize // to / when reading files from archive
Pull Request -
State: open - Opened by NeoLegends 11 months ago
- 2 comments