Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / kempnerinstitute/tatm issues and pull requests
#100 - env based config loading, add metadata config
Pull Request -
State: closed - Opened by mbsabath about 2 months ago
#99 - add data loading scripts
Pull Request -
State: closed - Opened by mbsabath about 2 months ago
#98 - Add basic image-text dataset code
Pull Request -
State: closed - Opened by timothyngo about 2 months ago
#97 - Lacking image-text support
Issue -
State: closed - Opened by timothyngo about 2 months ago
#96 - get_dataset() doesn't require context_length, but the TatmMemmapDataset constructor does
Issue -
State: open - Opened by toizzy about 2 months ago
- 1 comment
#95 - Write Metadata Store Loading Scripts
Issue -
State: closed - Opened by mbsabath about 2 months ago
#94 - Implement OpenMetadata based metadata backend
Issue -
State: closed - Opened by mbsabath about 2 months ago
#93 - Implement backend configuration via a shared, admin maintained config file
Issue -
State: closed - Opened by mbsabath about 2 months ago
#92 - initial metadata store interface with lightweight json store
Pull Request -
State: closed - Opened by mbsabath about 2 months ago
#91 - add parent datasets to tokenized data metadata
Pull Request -
State: closed - Opened by mbsabath about 2 months ago
#90 - add file not found error for non-existent dataset path
Pull Request -
State: closed - Opened by mbsabath about 2 months ago
#89 - Create general interface for interacting with a metadata store
Issue -
State: closed - Opened by mbsabath about 2 months ago
#88 - Allow combination of TatmDatasets from multiple directories
Issue -
State: open - Opened by mbsabath about 2 months ago
Labels: enhancement
#87 - Add information about source data to tokenized dataset
Issue -
State: closed - Opened by mbsabath 2 months ago
#86 - Upgrade aiohttp
Pull Request -
State: closed - Opened by mbsabath 2 months ago
#85 - no workflow concurrency
Pull Request -
State: closed - Opened by mbsabath 2 months ago
#84 - Prevent concurrent workflow execution
Issue -
State: closed - Opened by mbsabath 2 months ago
#83 - Opaque error in get_dataset() when the input path does not exist
Issue -
State: closed - Opened by toizzy 2 months ago
#82 - Use threading to speed up data iteration in tokenization
Issue -
State: open - Opened by mbsabath 3 months ago
#81 - Allow split specification at CLI and with `get_data`
Issue -
State: open - Opened by mbsabath 4 months ago
#80 - clarify api reference docs
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#79 - add dtype param to tokenization, set default to uint32
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#78 - convert (str,Enums) to TatmOptionEnum
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#77 - Save tokenized data in torch.int and convert to torch.long while loading
Issue -
State: closed - Opened by MJK12341234 4 months ago
- 1 comment
#76 - Allow customization of stored token data type at CLI
Issue -
State: closed - Opened by mbsabath 4 months ago
- 1 comment
#75 - Convert tokens to `torch.long` on load
Issue -
State: closed - Opened by mbsabath 4 months ago
#74 - remove kempner from README
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#73 - Benchmarking infrastructure
Issue -
State: open - Opened by Naeemkh 4 months ago
- 1 comment
Labels: enhancement
#72 - Add API Reference to the docs
Issue -
State: closed - Opened by Naeemkh 4 months ago
Labels: documentation
#71 - Remove Kempner Institute Reference in Readme
Issue -
State: closed - Opened by mbsabath 4 months ago
#70 - remove references to cannon cluster
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#69 - Remove Explicit References to Cannon Cluster
Issue -
State: closed - Opened by mbsabath 4 months ago
#68 - minimal tokenized data load example
Pull Request -
State: closed - Opened by mbsabath 4 months ago
- 2 comments
#67 - sparse getting started example
Issue -
State: closed - Opened by mbsabath 4 months ago
#66 - support directory based corpus splits
Pull Request -
State: closed - Opened by mbsabath 4 months ago
- 1 comment
#65 - Iss62 link to docs
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#64 - Support HF downloaded datasets where corpus is based on datadir
Issue -
State: closed - Opened by mbsabath 4 months ago
#63 - Build RTD docs
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#62 - Link to Github Pages/RTD in package readme
Issue -
State: closed - Opened by mbsabath 4 months ago
#61 - Improve Slurm Job submission error reporting
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#60 - support negative indices
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#59 - add default torch collate fn to the TatmMemmapDataset
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#58 - Fix padding for memory-mapped arrays
Pull Request -
State: closed - Opened by timothyngo 4 months ago
#57 - Memory-mapped array outputs not padded
Issue -
State: closed - Opened by timothyngo 4 months ago
#56 - make document mask and document ids optional
Pull Request -
State: closed - Opened by mbsabath 4 months ago
#55 - Don't create document masks for Memmap dataset items by default
Issue -
State: closed - Opened by mbsabath 4 months ago
#54 - Fix memory-mapped arrays when context length does not divide number of tokens
Pull Request -
State: closed - Opened by timothyngo 4 months ago
#53 - MemmapArray Length Function Off by 1 Sometimes
Issue -
State: closed - Opened by timothyngo 4 months ago
#52 - Support indices less than 1 in the memmap dataset
Issue -
State: closed - Opened by mbsabath 4 months ago
#51 - Add more informative error message on slurm submission failure
Issue -
State: closed - Opened by mbsabath 4 months ago
#50 - Include default collate function for TatmTextDataset
Issue -
State: closed - Opened by mbsabath 4 months ago
#49 - Issue 47 - PR Review Edits (Documentation)
Pull Request -
State: closed - Opened by SarahL88 4 months ago
- 1 comment