Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / huggingface/datasets issues and pull requests
#3567 - Fix push to hub to allow individual split push
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 1 comment
#3547 - Datasets created with `push_to_hub` can't be accessed in offline mode
Issue -
State: closed - Opened by TevenLeScao over 2 years ago
- 18 comments
Labels: bug
#3504 - Unable to download PUBMED_title_abstracts_2019_baseline.jsonl.zst
Issue -
State: closed - Opened by ToddMorrill almost 3 years ago
- 10 comments
Labels: bug, dataset bug
#3474 - Decode images when iterating
Pull Request -
State: closed - Opened by lhoestq almost 3 years ago
#3468 - Add COCO dataset
Pull Request -
State: closed - Opened by mariosasko almost 3 years ago
- 7 comments
Labels: dataset contribution
#3465 - Unable to load 'cnn_dailymail' dataset
Issue -
State: closed - Opened by talha1503 almost 3 years ago
- 4 comments
Labels: bug, duplicate, dataset bug
#3460 - Don't encode lists as strings when using `Value("string")`
Pull Request -
State: closed - Opened by lhoestq almost 3 years ago
- 3 comments
#3455 - Easier information editing
Issue -
State: closed - Opened by borgr almost 3 years ago
- 2 comments
Labels: enhancement, generic discussion
#3450 - Unexpected behavior doing Split + Filter
Issue -
State: closed - Opened by jbrachat almost 3 years ago
- 1 comment
Labels: bug
#3449 - Add `__add__()`, `__iadd__()` and similar to `Dataset` class
Issue -
State: closed - Opened by sgraaf almost 3 years ago
- 2 comments
Labels: enhancement, generic discussion
#3446 - Remove redundant local path information in audio/image datasets
Pull Request -
State: closed - Opened by mariosasko almost 3 years ago
- 3 comments
Labels: dataset contribution
#3444 - Align the Dataset and IterableDataset processing API
Issue -
State: open - Opened by lhoestq almost 3 years ago
- 8 comments
Labels: enhancement, generic discussion
#3401 - Add Wikimedia pre-processed datasets
Issue -
State: open - Opened by albertvillanova almost 3 years ago
- 1 comment
Labels: dataset request
#3365 - Add task tags for multimodal datasets
Issue -
State: closed - Opened by albertvillanova almost 3 years ago
- 1 comment
Labels: enhancement
#3338 - [WIP] Add doctests for tutorials
Pull Request -
State: closed - Opened by stevhliu almost 3 years ago
- 1 comment
#3336 - Add support for multiple dynamic dimensions and to_pandas conversion for dynamic arrays
Pull Request -
State: closed - Opened by mariosasko almost 3 years ago
#3334 - Integrate Polars library
Issue -
State: closed - Opened by albertvillanova almost 3 years ago
- 8 comments
Labels: enhancement
#3299 - Add option to find unique elements in nested sequences when calling `Dataset.unique`
Issue -
State: open - Opened by mariosasko almost 3 years ago
- 4 comments
Labels: enhancement
#3220 - Add documentation about dataset viewer feature
Issue -
State: open - Opened by albertvillanova almost 3 years ago
- 1 comment
Labels: enhancement, dataset-viewer
#3178 - "Property couldn't be hashed properly" even though fully picklable
Issue -
State: closed - Opened by BramVanroy almost 3 years ago
- 26 comments
Labels: bug
#3172 - `SystemError 15` thrown in `Dataset.__del__` when using `Dataset.map()` with `num_proc>1`
Issue -
State: closed - Opened by vlievin almost 3 years ago
- 12 comments
Labels: bug
#3142 - Provide a way to write a streamed dataset to the disk
Issue -
State: open - Opened by severo almost 3 years ago
- 2 comments
Labels: enhancement, dataset-viewer
#3134 - Couldn't reach https://raw.githubusercontent.com/huggingface/datasets/1.11.0/metrics/rouge/rouge.py
Issue -
State: closed - Opened by yananchen1989 almost 3 years ago
- 4 comments
Labels: bug
#3113 - Loading Data from HDF files
Issue -
State: open - Opened by FeryET almost 3 years ago
- 7 comments
Labels: enhancement, good second issue
#2976 - Can't load dataset
Issue -
State: closed - Opened by mskovalova about 3 years ago
- 4 comments
Labels: bug
#2969 - medical-dialog error
Issue -
State: closed - Opened by smeyerhot about 3 years ago
- 3 comments
Labels: bug
#2964 - Error when calculating Matthews Correlation Coefficient loaded with `load_metric`
Issue -
State: closed - Opened by alvarobartt about 3 years ago
- 1 comment
Labels: bug
#2956 - Cache problem in the `load_dataset` method for local compressed file(s)
Issue -
State: open - Opened by SaulLu about 3 years ago
- 1 comment
Labels: bug
#2924 - "File name too long" error for file locks
Issue -
State: closed - Opened by gar1t about 3 years ago
- 12 comments
Labels: bug
#2869 - TypeError: 'NoneType' object is not callable
Issue -
State: closed - Opened by Chenfei-Kang about 3 years ago
- 9 comments
Labels: bug
#2868 - Add Common Objects in 3D (CO3D)
Issue -
State: open - Opened by nateraw about 3 years ago
Labels: dataset request, vision
#2838 - Add error_bad_chunk to the JSON loader
Pull Request -
State: open - Opened by lhoestq about 3 years ago
- 4 comments
#2825 - The datasets.map function does not load cached dataset after moving python script
Issue -
State: closed - Opened by hobbitlzy about 3 years ago
- 6 comments
Labels: bug
#2818 - cannot load data from my local path
Issue -
State: closed - Opened by yang-collect about 3 years ago
- 1 comment
Labels: bug
#2788 - How to sample every file in a list of files making up a split in a dataset when loading?
Issue -
State: closed - Opened by brijow about 3 years ago
- 1 comment
#2787 - ConnectionError: Couldn't reach https://raw.githubusercontent.com
Issue -
State: closed - Opened by jinec about 3 years ago
- 9 comments
Labels: bug
#2775 - `generate_random_fingerprint()` deterministic with 🤗Transformers' `set_seed()`
Issue -
State: closed - Opened by mbforbes about 3 years ago
- 3 comments
Labels: bug
#2773 - Remove dataset_infos.json
Issue -
State: closed - Opened by albertvillanova about 3 years ago
- 1 comment
Labels: enhancement, generic discussion
#2763 - English wikipedia datasets is not clean
Issue -
State: closed - Opened by lucadiliello about 3 years ago
- 1 comment
Labels: bug
#2699 - cannot combine splits merging and streaming?
Issue -
State: open - Opened by eyaler about 3 years ago
- 5 comments
Labels: bug
#2689 - cannot save the dataset to disk after rename_column
Issue -
State: closed - Opened by PaulLerner about 3 years ago
- 4 comments
Labels: bug
#2666 - Adds CodeClippy dataset [WIP]
Pull Request -
State: closed - Opened by arampacha about 3 years ago
- 2 comments
Labels: dataset contribution
#2656 - Change `from_csv` default arguments
Pull Request -
State: closed - Opened by SBrandeis about 3 years ago
- 1 comment
#2655 - Allow the selection of multiple columns at once
Issue -
State: closed - Opened by Dref360 about 3 years ago
- 5 comments
Labels: enhancement
#2650 - [load_dataset] shard and parallelize the process
Issue -
State: closed - Opened by stas00 about 3 years ago
- 4 comments
Labels: enhancement
#2642 - Support multi-worker with streaming dataset (IterableDataset).
Issue -
State: open - Opened by cccntu about 3 years ago
- 3 comments
Labels: enhancement
#2618 - `filelock.py` Error
Issue -
State: closed - Opened by liyucheng09 about 3 years ago
- 2 comments
Labels: bug
#2514 - Can datasets remove duplicated rows?
Issue -
State: open - Opened by liuxinglan over 3 years ago
- 12 comments
Labels: enhancement
#2462 - Merge DatasetDict and Dataset
Issue -
State: open - Opened by albertvillanova over 3 years ago
- 2 comments
Labels: enhancement, generic discussion
#2377 - ArrowDataset.save_to_disk produces files that cannot be read using pyarrow.feather
Issue -
State: open - Opened by Ark-kun over 3 years ago
- 4 comments
Labels: bug
#2371 - Align question answering tasks with sub-domains
Issue -
State: closed - Opened by lewtun over 3 years ago
- 1 comment
Labels: enhancement
#2370 - Adding HendrycksTest dataset
Pull Request -
State: closed - Opened by andyzoujm over 3 years ago
- 5 comments
#2252 - Slow dataloading with big datasets issue persists
Issue -
State: closed - Opened by hwijeen over 3 years ago
- 70 comments
#2212 - Can't reach "https://storage.googleapis.com/illuin/fquad/train.json.zip" when trying to load fquad dataset
Issue -
State: closed - Opened by hanss0n over 3 years ago
- 5 comments
#2096 - CoNLL 2003 dataset not including German
Issue -
State: closed - Opened by rxian over 3 years ago
- 2 comments
Labels: dataset request
#2089 - Add documentation for dataset README.md files
Issue -
State: closed - Opened by PhilipMay over 3 years ago
- 8 comments
#2060 - Filtering refactor
Pull Request -
State: closed - Opened by theo-m over 3 years ago
- 10 comments
#2058 - Is it possible to convert a `tfds` to HuggingFace `dataset`?
Issue -
State: closed - Opened by abarbosa94 over 3 years ago
- 1 comment
#2035 - wiki40b/wikipedia for almost all languages cannot be downloaded
Issue -
State: closed - Opened by dorost1234 over 3 years ago
- 11 comments
#2003 - Messages are being printed to the `stdout`
Issue -
State: closed - Opened by mahnerak over 3 years ago
- 3 comments
#1992 - `datasets.map` multi processing much slower than single processing
Issue -
State: open - Opened by hwijeen over 3 years ago
- 14 comments
Labels: bug
#1933 - Use arrow ipc file format
Pull Request -
State: closed - Opened by lhoestq over 3 years ago
- 3 comments
#1835 - Add CHiME4 dataset
Issue -
State: open - Opened by patrickvonplaten over 3 years ago
- 4 comments
Labels: dataset request, speech
#1796 - Filter on dataset too much slowww
Issue -
State: open - Opened by ayubSubhaniya over 3 years ago
- 9 comments
#1781 - AttributeError: module 'pyarrow' has no attribute 'PyExtensionType' during import
Issue -
State: closed - Opened by PalaashAgrawal over 3 years ago
- 9 comments
#1774 - is it possible to make slice to be more compatible like python list and numpy?
Issue -
State: closed - Opened by world2vec over 3 years ago
- 2 comments
#1742 - Add GLUE Compat (compatible with transformers<3.5.0)
Pull Request -
State: closed - Opened by JetRunner over 3 years ago
- 2 comments
#1627 - `Dataset.map` disable progress bar
Issue -
State: closed - Opened by Nickil21 almost 4 years ago
- 3 comments
#1600 - AttributeError: 'DatasetDict' object has no attribute 'train_test_split'
Issue -
State: closed - Opened by david-waterworth almost 4 years ago
- 7 comments
Labels: question
#1443 - Add OPUS Wikimedia Translations Dataset
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
- 1 comment
Labels: dataset contribution
#1407 - Add Tweet Eval Dataset
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
- 4 comments
#1297 - OPUS Ted Talks 2013
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
#1245 - Add Google Turkish Treebank Dataset
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
- 1 comment
Labels: dataset contribution
#1243 - Add Google Noun Verb Dataset
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
- 1 comment
Labels: dataset contribution
#1240 - Multi Domain Sentiment Analysis Dataset (MDSA)
Pull Request -
State: closed - Opened by abhishekkrthakur almost 4 years ago
- 9 comments
Labels: dataset contribution
#1206 - Adding Enriched WebNLG dataset
Pull Request -
State: closed - Opened by TevenLeScao almost 4 years ago
- 3 comments
#961 - sample multiple datasets
Issue -
State: closed - Opened by rabeehk almost 4 years ago
- 6 comments
#960 - Add code to automate parts of the dataset card
Pull Request -
State: closed - Opened by patrickvonplaten almost 4 years ago
#937 - Local machine/cluster Beam Datasets example/tutorial
Issue -
State: closed - Opened by shangw-nvidia almost 4 years ago
- 2 comments
#876 - imdb dataset cannot be loaded
Issue -
State: closed - Opened by rabeehk almost 4 years ago
- 6 comments
#873 - load_dataset('cnn_dalymail', '3.0.0') gives a 'Not a directory' error
Issue -
State: closed - Opened by vishal-burman almost 4 years ago
- 13 comments
#868 - Consistent metric outputs
Pull Request -
State: closed - Opened by lhoestq almost 4 years ago
- 2 comments
Labels: transfer-to-evaluate
#856 - Add open book corpus
Pull Request -
State: closed - Opened by vblagoje almost 4 years ago
- 21 comments
#843 - use_custom_baseline still produces errors for bertscore
Issue -
State: closed - Opened by penatbater almost 4 years ago
- 5 comments
Labels: metric bug
#824 - Discussion using datasets in offline mode
Issue -
State: closed - Opened by mandubian almost 4 years ago
- 11 comments
Labels: enhancement, generic discussion
#759 - (Load dataset failure) ConnectionError: Couldn’t reach https://raw.githubusercontent.com/huggingface/datasets/1.1.2/datasets/cnn_dailymail/cnn_dailymail.py
Issue -
State: closed - Opened by AI678 almost 4 years ago
- 19 comments
#693 - Rachel ker add dataset/mlsum
Pull Request -
State: closed - Opened by pdhg almost 4 years ago
- 1 comment
#662 - Created dataset card snli.md
Pull Request -
State: closed - Opened by mcmillanmajora about 4 years ago
- 1 comment
Labels: Dataset discussion
#645 - Don't use take on dataset table in pyarrow 1.0.x
Pull Request -
State: closed - Opened by lhoestq about 4 years ago
- 4 comments
#615 - Offset overflow when slicing a big dataset with an array of indices in Pyarrow >= 1.0.0
Issue -
State: closed - Opened by lhoestq about 4 years ago
- 16 comments
#605 - [Datasets] Transmit format to children
Pull Request -
State: closed - Opened by thomwolf about 4 years ago
- 1 comment
#599 - Add MATINF dataset
Pull Request -
State: closed - Opened by JetRunner about 4 years ago
- 2 comments
#562 - [Reproducibility] Allow to pin versions of datasets/metrics
Pull Request -
State: closed - Opened by thomwolf about 4 years ago
- 1 comment
#546 - Very slow data loading on large dataset
Issue -
State: closed - Opened by agemagician about 4 years ago
- 26 comments
#480 - Column indexing hotfix
Pull Request -
State: closed - Opened by TevenLeScao about 4 years ago
- 2 comments
#462 - add DoQA (ACL 2020) dataset
Pull Request -
State: closed - Opened by mariamabarham about 4 years ago
#461 - Doqa
Pull Request -
State: closed - Opened by mariamabarham about 4 years ago
#456 - add crd3(ACL 2020) dataset
Pull Request -
State: closed - Opened by mariamabarham about 4 years ago
#449 - add reuters21578 dataset
Pull Request -
State: closed - Opened by mariamabarham about 4 years ago
- 3 comments
#406 - Faster Shuffling?
Issue -
State: closed - Opened by mitchellgordon95 about 4 years ago
- 7 comments