Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / huggingface/datasets issues and pull requests
#5154 - Test latest fsspec in CI
Pull Request -
State: closed - Opened by lhoestq almost 2 years ago
- 2 comments
#5150 - Problems after upgrading to 2.6.1
Issue -
State: open - Opened by pietrolesci almost 2 years ago
- 10 comments
#5131 - WikiText 103 tokenizer hangs
Issue -
State: closed - Opened by TrentBrick almost 2 years ago
- 1 comment
Labels: bug
#5127 - [WIP] WebDataset export
Pull Request -
State: closed - Opened by lhoestq almost 2 years ago
- 2 comments
#5123 - datasets freezes with streaming mode in multiple-gpu
Issue -
State: open - Opened by jackfeinmann5 almost 2 years ago
- 11 comments
Labels: bug
#5117 - Progress bars have color red and never completed to 100%
Issue -
State: closed - Opened by echatzikyriakidis almost 2 years ago
- 5 comments
Labels: bug
#5096 - Transfer some canonical datasets under an organization namespace
Issue -
State: closed - Opened by albertvillanova almost 2 years ago
- 11 comments
Labels: dataset contribution
#5084 - IterableDataset formatting in numpy/torch/tf/jax
Pull Request -
State: closed - Opened by lhoestq almost 2 years ago
- 3 comments
#5083 - Support numpy/torch/tf/jax formatting for IterableDataset
Issue -
State: closed - Opened by lhoestq almost 2 years ago
- 2 comments
Labels: enhancement, streaming, good second issue
#5045 - Automatically revert to last successful commit to hub when a push_to_hub is interrupted
Issue -
State: closed - Opened by jorahn about 2 years ago
- 5 comments
Labels: enhancement
#5044 - integrate `load_from_disk` into `load_dataset`
Issue -
State: open - Opened by stas00 about 2 years ago
- 11 comments
Labels: enhancement
#5018 - Create all YAML dataset_info
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 2 comments
Labels: dataset contribution
#5012 - Force JSON format regardless of file naming on S3
Issue -
State: closed - Opened by junwang-wish about 2 years ago
- 4 comments
Labels: enhancement
#5001 - Support loading XML datasets
Pull Request -
State: open - Opened by albertvillanova about 2 years ago
- 3 comments
#4983 - How to convert torch.utils.data.Dataset to huggingface dataset?
Issue -
State: closed - Opened by DEROOCE about 2 years ago
- 15 comments
Labels: enhancement
#4975 - Add `fn_kwargs` param to `IterableDataset.map`
Pull Request -
State: closed - Opened by mariosasko about 2 years ago
- 4 comments
#4973 - [GH->HF] Load datasets from the Hub
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 2 comments
#4965 - [Apple M1] MemoryError: Cannot allocate write+execute memory for ffi.callback()
Issue -
State: closed - Opened by hoangtnm about 2 years ago
- 6 comments
Labels: bug
#4952 - Add test-datasets CI job
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 2 comments
#4947 - Try to fix the Windows CI after TF update 2.10
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 1 comment
#4926 - Dataset infos in yaml
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 6 comments
Labels: dataset contribution
#4906 - Can't import datasets AttributeError: partially initialized module 'datasets' has no attribute 'utils' (most likely due to a circular import)
Issue -
State: closed - Opened by OPterminator about 2 years ago
- 6 comments
Labels: bug
#4883 - With dataloader RSS memory consumed by HF datasets monotonically increases
Issue -
State: open - Opened by apsdehal about 2 years ago
- 44 comments
Labels: bug
#4881 - Language names and language codes: connecting to a big database (rather than slow enrichment of custom list)
Issue -
State: open - Opened by alexis-michaud about 2 years ago
- 49 comments
Labels: enhancement
#4847 - Test win ci
Pull Request -
State: closed - Opened by Mr-Robot-001 about 2 years ago
#4828 - Support PIL Image objects in `add_item`/`add_column`
Pull Request -
State: open - Opened by mariosasko about 2 years ago
- 3 comments
#4804 - streaming dataset with concatenating splits raises an error
Issue -
State: open - Opened by Bing-su about 2 years ago
- 4 comments
Labels: bug
#4803 - Support `pipeline` argument in inspect.py functions
Issue -
State: open - Opened by severo about 2 years ago
- 1 comment
Labels: enhancement
#4800 - support LargeListArray in pyarrow
Pull Request -
State: closed - Opened by Jiaxin-Wen about 2 years ago
- 22 comments
#4799 - video dataset loader/parser
Issue -
State: closed - Opened by nollied about 2 years ago
- 3 comments
Labels: enhancement
#4796 - ArrowInvalid: Could not convert <PIL.Image.Image image mode=RGB when adding image to Dataset
Issue -
State: open - Opened by NielsRogge about 2 years ago
- 17 comments
Labels: bug
#4760 - Issue with offline mode
Issue -
State: closed - Opened by SaulLu about 2 years ago
- 15 comments
Labels: bug
#4755 - Datasets.map causes incorrect overflow_to_sample_mapping when used with tokenizers and small batch size
Issue -
State: open - Opened by srobertjames about 2 years ago
- 3 comments
Labels: bug
#4711 - Document how to create a dataset loading script for audio/vision
Issue -
State: closed - Opened by albertvillanova about 2 years ago
- 1 comment
Labels: documentation
#4702 - Domain specific dataset discovery on the Hugging Face hub
Issue -
State: open - Opened by davanstrien about 2 years ago
- 11 comments
Labels: enhancement
#4694 - Distributed data parallel training for streaming datasets
Issue -
State: open - Opened by cyk1337 about 2 years ago
- 6 comments
Labels: enhancement
#4686 - Align logging with Transformers (again)
Pull Request -
State: closed - Opened by mariosasko about 2 years ago
- 2 comments
#4624 - Remove all paperswithcode_id: null
Pull Request -
State: closed - Opened by lhoestq about 2 years ago
- 3 comments
#4602 - Upgrade setuptools in windows CI
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 1 comment
#4601 - Upgrade pip in WIN CI
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 2 comments
#4584 - Add binary classification task IDs
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 4 comments
#4573 - Fix evaluation metadata for ncbi_disease
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 2 comments
Labels: dataset contribution
#4571 - move under the facebook org?
Issue -
State: open - Opened by lewtun over 2 years ago
- 3 comments
#4567 - Add evaluation data for amazon_reviews_multi
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 2 comments
Labels: dataset contribution
#4560 - Add evaluation metadata to imagenet-1k
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 2 comments
Labels: dataset contribution
#4558 - Add evaluation metadata to wmt14
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 2 comments
Labels: dataset contribution
#4557 - Add evaluation metadata to wmt16
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 3 comments
Labels: dataset contribution
#4529 - Ecoset
Issue -
State: closed - Opened by DiGyt over 2 years ago
- 3 comments
Labels: dataset request
#4504 - Can you please add the Stanford dog dataset?
Issue -
State: closed - Opened by dgrnd4 over 2 years ago
- 15 comments
Labels: good first issue, dataset request
#4482 - Test that TensorFlow is not imported on startup
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 3 comments
#4463 - Use config_id to check split sizes instead of config name
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 2 comments
#4461 - AttributeError: module 'datasets' has no attribute 'load_dataset'
Issue -
State: closed - Opened by AlexNLP over 2 years ago
- 4 comments
Labels: bug
#4448 - New Preprocessing Feature - Deduplication [Request]
Issue -
State: open - Opened by yuvalkirstain over 2 years ago
- 2 comments
Labels: duplicate, enhancement
#4443 - Dataset Viewer issue for openclimatefix/nimrod-uk-1km
Issue -
State: open - Opened by ZYMXIXI over 2 years ago
- 7 comments
#4395 - Add Pascal VOC dataset
Pull Request -
State: closed - Opened by nateraw over 2 years ago
- 6 comments
Labels: dataset contribution
#4394 - trainer became extremely slow after reload dataset by `load_from_disk`
Issue -
State: open - Opened by conan1024hao over 2 years ago
- 5 comments
Labels: bug
#4365 - Remove dots in config names
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 2 comments
#4334 - Adding eval metadata for billsum
Pull Request -
State: closed - Opened by sashavor over 2 years ago
#4284 - Issues in processing very large datasets
Issue -
State: closed - Opened by sajastu over 2 years ago
- 2 comments
Labels: bug
#4197 - Add remove_columns=True
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 4 comments
#4184 - [Librispeech] Add 'all' config
Pull Request -
State: closed - Opened by patrickvonplaten over 2 years ago
- 29 comments
#4183 - Document librispeech configs
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 5 comments
#4175 - Add WIT Dataset
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 6 comments
#4129 - dataset metadata for reproducibility
Issue -
State: open - Opened by nbroad1881 over 2 years ago
- 1 comment
Labels: enhancement
#4117 - AttributeError: module 'huggingface_hub' has no attribute 'hf_api'
Issue -
State: closed - Opened by arymbe over 2 years ago
- 13 comments
Labels: bug
#4114 - Allow downloading just some columns of a dataset
Issue -
State: open - Opened by osanseviero over 2 years ago
- 8 comments
Labels: enhancement
#4104 - Add time series data - stock market
Issue -
State: open - Opened by INF800 over 2 years ago
- 10 comments
Labels: dataset request
#4102 - [hub] Fix `api.create_repo` call?
Pull Request -
State: closed - Opened by julien-c over 2 years ago
- 2 comments
#4096 - Add support for streaming Zarr stores for hosted datasets
Issue -
State: closed - Opened by jacobbieker over 2 years ago
- 11 comments
Labels: enhancement
#4062 - Loading mozilla-foundation/common_voice_7_0 dataset failed
Issue -
State: closed - Opened by aapot over 2 years ago
- 10 comments
Labels: dataset bug
#4038 - [DO NOT MERGE] Test doc-builder with skipped installation feature
Pull Request -
State: closed - Opened by lewtun over 2 years ago
- 2 comments
#4036 - Fix building of documentation
Pull Request -
State: closed - Opened by albertvillanova over 2 years ago
- 2 comments
#3984 - Local and automatic tests fail
Issue -
State: closed - Opened by MarkusSagen over 2 years ago
- 1 comment
Labels: bug
#3983 - Infinitely attempting lock
Issue -
State: closed - Opened by jyrr over 2 years ago
- 4 comments
#3979 - Fix google drive streaming for small files
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 4 comments
#3978 - I can't view HFcallback dataset for ASR Space
Issue -
State: open - Opened by kingabzpro over 2 years ago
- 4 comments
#3960 - Load local dataset error
Issue -
State: open - Opened by TXacs over 2 years ago
- 13 comments
Labels: bug, dataset bug
#3956 - TypeError: __init__() missing 1 required positional argument: 'scheme'
Issue -
State: closed - Opened by amirj over 2 years ago
- 8 comments
Labels: bug
#3946 - Add newline to text dataset builder for controlling universal newlines mode
Pull Request -
State: closed - Opened by albertvillanova over 2 years ago
- 3 comments
#3941 - billsum dataset: Checksums didn't match for dataset source files:
Issue -
State: closed - Opened by XingxingZhang over 2 years ago
- 3 comments
Labels: bug
#3913 - Deterministic split order in DatasetDict.map
Pull Request -
State: closed - Opened by lhoestq over 2 years ago
- 3 comments
#3912 - add draft of registering function for pandas
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
- 3 comments
#3867 - Update for the rename doc-builder -> hf-doc-utils
Pull Request -
State: closed - Opened by sgugger over 2 years ago
- 4 comments
#3865 - Add logo img
Pull Request -
State: closed - Opened by mishig25 over 2 years ago
- 2 comments
#3854 - load only England English dataset from common voice english dataset
Issue -
State: closed - Opened by amanjaiswal777 over 2 years ago
- 2 comments
Labels: question
#3847 - Datasets' cache not re-used
Issue -
State: open - Opened by gejinchen over 2 years ago
- 26 comments
Labels: bug
#3838 - Add a data type for labeled images (image segmentation)
Issue -
State: open - Opened by severo over 2 years ago
Labels: enhancement
#3792 - Checksums didn't match for dataset source
Issue -
State: closed - Opened by rafikg over 2 years ago
- 26 comments
Labels: dataset-viewer
#3753 - Expanding streaming capabilities
Issue -
State: open - Opened by lvwerra over 2 years ago
- 6 comments
Labels: enhancement
#3735 - Performance of `datasets` at scale
Issue -
State: open - Opened by lvwerra over 2 years ago
- 6 comments
#3720 - Builder Configuration Update Required on Common Voice Dataset
Issue -
State: closed - Opened by aasem over 2 years ago
- 7 comments
Labels: bug
#3700 - Unable to load a dataset
Issue -
State: closed - Opened by PaulchauvinAI over 2 years ago
- 3 comments
Labels: bug
#3681 - Fix TestCommand to move dataset_infos instead of copying
Pull Request -
State: closed - Opened by albertvillanova over 2 years ago
- 6 comments
#3658 - Dataset viewer issue for *P3*
Issue -
State: closed - Opened by jeffistyping over 2 years ago
- 4 comments
#3650 - Allow 'to_json' to run in unordered fashion in order to lower memory footprint
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 6 comments
#3644 - Add a GROUP BY operator
Issue -
State: open - Opened by felix-schneider over 2 years ago
- 11 comments
Labels: enhancement
#3638 - AutoTokenizer hash value got change after datasets.map
Issue -
State: open - Opened by tshu-w over 2 years ago
- 12 comments
Labels: bug
#3618 - TIMIT Dataset not working with GPU
Issue -
State: closed - Opened by TheSeamau5 over 2 years ago
- 3 comments
Labels: bug
#3595 - Add ImageNet toy datasets from fastai
Pull Request -
State: closed - Opened by mariosasko over 2 years ago
- 1 comment
Labels: dataset contribution
#3578 - label information get lost after parquet serialization
Issue -
State: closed - Opened by Tudyx over 2 years ago
- 2 comments
Labels: bug