Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / huggingface/datasets issues and pull requests
#6147 - ValueError when running BeamBasedBuilder with GCS path in cache_dir
Issue -
State: closed - Opened by ktrk115 about 1 year ago
- 2 comments
#6146 - DatasetGenerationError when load glue benchmark datasets from `load_dataset`
Issue -
State: closed - Opened by yusx-swapp about 1 year ago
- 4 comments
#6145 - Export to_iterable_dataset to document
Pull Request -
State: closed - Opened by npuichigo about 1 year ago
- 2 comments
#6144 - NIH exporter file not found
Issue -
State: open - Opened by brando90 about 1 year ago
- 6 comments
#6142 - the-stack-dedup fails to generate
Issue -
State: closed - Opened by michaelroyzen about 1 year ago
- 4 comments
#6141 - TypeError: ClientSession._request() got an unexpected keyword argument 'https'
Issue -
State: closed - Opened by q935970314 about 1 year ago
- 1 comment
#6140 - Misalignment between file format specified in configs metadata YAML and the inferred builder
Issue -
State: closed - Opened by albertvillanova about 1 year ago
Labels: bug
#6139 - Offline dataset viewer
Issue -
State: closed - Opened by yuvalkirstain about 1 year ago
- 7 comments
Labels: enhancement, dataset-viewer
#6138 - Ignore CI lint rule violation in Pickler.memoize
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6137 - (`from_spark()`) Unable to connect HDFS in pyspark YARN setting
Issue -
State: open - Opened by kyoungrok0517 about 1 year ago
#6136 - CI check_code_quality error: E721 Do not compare types, use `isinstance()`
Issue -
State: closed - Opened by albertvillanova about 1 year ago
Labels: maintenance
#6135 - Remove unused allowed_extensions param
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 4 comments
#6134 - `datasets` cannot be installed alongside `apache-beam`
Issue -
State: closed - Opened by boyleconnor about 1 year ago
- 1 comment
#6133 - Dataset is slower after calling `to_iterable_dataset`
Issue -
State: open - Opened by npuichigo about 1 year ago
- 2 comments
#6132 - to_iterable_dataset is missing in document
Issue -
State: closed - Opened by npuichigo about 1 year ago
- 1 comment
#6131 - AttributeError: type object 'tqdm' has no attribute '_lock'
Issue -
State: open - Opened by NielsRogge about 1 year ago
- 1 comment
#6130 - default config name doesn't work when config kwargs are specified.
Issue -
State: closed - Opened by npuichigo about 1 year ago
- 15 comments
#6129 - Release 2.14.4
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 5 comments
#6128 - IndexError: Invalid key: 88 is out of bounds for size 0
Issue -
State: closed - Opened by TomasAndersonFang about 1 year ago
- 5 comments
#6127 - Fix authentication issues
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 8 comments
#6126 - Private datasets do not load when passing token
Issue -
State: closed - Opened by albertvillanova about 1 year ago
- 4 comments
Labels: bug
#6125 - Reinforcement Learning and Robotics are not task categories in HF datasets metadata
Issue -
State: closed - Opened by StoneT2000 about 1 year ago
#6124 - Datasets crashing runs due to KeyError
Issue -
State: closed - Opened by conceptofmind about 1 year ago
- 7 comments
#6123 - Inaccurate Bounding Boxes in "wildreceipt" Dataset
Issue -
State: closed - Opened by HamzaGbada about 1 year ago
- 1 comment
#6122 - Upload README via `push_to_hub`
Issue -
State: closed - Opened by liyucheng09 about 1 year ago
- 1 comment
Labels: enhancement
#6121 - Small typo in the code example of create imagefolder dataset
Pull Request -
State: closed - Opened by WangXin93 about 1 year ago
- 1 comment
#6120 - Lookahead streaming support?
Issue -
State: open - Opened by PicoCreator about 1 year ago
- 1 comment
Labels: enhancement
#6119 - [Docs] Add description of `select_columns` to guide
Pull Request -
State: closed - Opened by unifyh about 1 year ago
- 2 comments
#6118 - IterableDataset.from_generator() fails with pickle error when provided a generator or iterator
Issue -
State: open - Opened by finkga about 1 year ago
- 2 comments
#6117 - Set dev version
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6116 - [Docs] The "Process" how-to guide lacks description of `select_columns` function
Issue -
State: closed - Opened by unifyh about 1 year ago
- 1 comment
Labels: enhancement
#6115 - Release: 2.14.3
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 6 comments
#6114 - Cache not being used when loading commonvoice 8.0.0
Issue -
State: closed - Opened by clabornd about 1 year ago
- 2 comments
#6113 - load_dataset() fails with streamlit caching inside docker
Issue -
State: closed - Opened by fierval about 1 year ago
- 1 comment
#6112 - yaml error using push_to_hub with generated README.md
Issue -
State: closed - Opened by kevintee about 1 year ago
- 1 comment
#6111 - raise FileNotFoundError("Directory {dataset_path} is neither a `Dataset` directory nor a `DatasetDict` directory." )
Issue -
State: closed - Opened by 2catycm about 1 year ago
- 3 comments
#6110 - [BUG] Dataset initialized from in-memory data does not create cache.
Issue -
State: closed - Opened by MattYoon about 1 year ago
- 1 comment
#6109 - Problems in downloading Amazon reviews from HF
Issue -
State: closed - Opened by 610v4nn1 about 1 year ago
- 2 comments
#6108 - Loading local datasets got strangely stuck
Issue -
State: open - Opened by LoveCatc about 1 year ago
- 6 comments
#6107 - Fix deprecation of use_auth_token in file_utils
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6106 - load local json_file as dataset
Issue -
State: closed - Opened by CiaoHe about 1 year ago
- 2 comments
#6105 - Fix error when loading from GCP bucket
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 5 comments
#6104 - HF Datasets data access is extremely slow even when in memory
Issue -
State: open - Opened by NightMachinery about 1 year ago
- 1 comment
#6103 - Set dev version
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6102 - Release 2.14.2
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 4 comments
#6101 - Release 2.14.2
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6100 - TypeError when loading from GCP bucket
Issue -
State: closed - Opened by bilelomrani1 about 1 year ago
- 2 comments
#6099 - How do i get "amazon_us_reviews
Issue -
State: closed - Opened by IqraBaluch about 1 year ago
- 10 comments
Labels: enhancement
#6098 - Expanduser in save_to_disk()
Pull Request -
State: closed - Opened by Unknown3141592 about 1 year ago
- 3 comments
#6097 - Dataset.get_nearest_examples does not return all feature values for the k most similar datapoints - side effect of Dataset.set_format
Issue -
State: closed - Opened by aschoenauer-sebag about 1 year ago
- 1 comment
#6096 - Add `fsspec` support for `to_json`, `to_csv`, and `to_parquet`
Pull Request -
State: closed - Opened by alvarobartt about 1 year ago
- 5 comments
#6095 - Fix deprecation of errors in TextConfig
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6094 - Fix deprecation of use_auth_token in DownloadConfig
Pull Request -
State: closed - Opened by albertvillanova about 1 year ago
- 3 comments
#6093 - Deprecate `download_custom`
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 6 comments
#6092 - Minor fix in `iter_files` for hidden files
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 3 comments
#6091 - Bump fsspec from 2021.11.1 to 2022.3.0
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 3 comments
#6090 - FilesIterable skips all the files after a hidden file
Issue -
State: closed - Opened by dkrivosic about 1 year ago
- 1 comment
#6089 - AssertionError: daemonic processes are not allowed to have children
Issue -
State: open - Opened by codingl2k1 about 1 year ago
- 2 comments
#6088 - Loading local data files initiates web requests
Issue -
State: closed - Opened by lytning98 about 1 year ago
#6087 - fsspec dependency is set too low
Issue -
State: closed - Opened by iXce about 1 year ago
- 1 comment
#6086 - Support `fsspec` in `Dataset.to_<format>` methods
Issue -
State: closed - Opened by mariosasko about 1 year ago
- 5 comments
Labels: enhancement
#6085 - Fix `fsspec` download
Pull Request -
State: open - Opened by mariosasko about 1 year ago
- 3 comments
#6084 - Changing pixel values of images in the Winoground dataset
Issue -
State: open - Opened by ZitengWangNYU about 1 year ago
#6083 - set dev version
Pull Request -
State: closed - Opened by lhoestq about 1 year ago
- 3 comments
#6082 - Release: 2.14.1
Pull Request -
State: closed - Opened by lhoestq about 1 year ago
- 6 comments
#6081 - Deprecate `Dataset.export`
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 2 comments
#6080 - Remove README link to deprecated Colab notebook
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 3 comments
#6079 - Iterating over DataLoader based on HF datasets is stuck forever
Issue -
State: closed - Opened by arindamsarkar93 about 1 year ago
- 15 comments
#6078 - resume_download with streaming=True
Issue -
State: closed - Opened by NicolasMICAUX about 1 year ago
- 3 comments
#6077 - Mapping gets stuck at 99%
Issue -
State: open - Opened by Laurent2916 about 1 year ago
- 6 comments
#6076 - No gzip encoding from github
Pull Request -
State: closed - Opened by lhoestq about 1 year ago
- 3 comments
#6075 - Error loading music files using `load_dataset`
Issue -
State: closed - Opened by susnato about 1 year ago
- 2 comments
#6074 - Misc doc improvements
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 3 comments
#6073 - version2.3.2 load_dataset()data_files can't include .xxxx in path
Issue -
State: closed - Opened by BUAAChuanWang about 1 year ago
- 1 comment
#6072 - Fix fsspec storage_options from load_dataset
Pull Request -
State: closed - Opened by lhoestq about 1 year ago
- 6 comments
#6071 - storage_options provided to load_dataset not fully piping through since datasets 2.14.0
Issue -
State: closed - Opened by exs-avianello about 1 year ago
- 2 comments
#6070 - Fix Quickstart notebook link
Pull Request -
State: closed - Opened by mariosasko about 1 year ago
- 3 comments
#6069 - KeyError: dataset has no key "image"
Issue -
State: closed - Opened by etetteh about 1 year ago
- 7 comments
#6068 - fix tqdm lock deletion
Pull Request -
State: closed - Opened by lhoestq about 1 year ago
- 5 comments
#6066 - AttributeError: '_tqdm_cls' object has no attribute '_lock'
Issue -
State: closed - Opened by codingl2k1 about 1 year ago
- 7 comments
#6065 - Add column type guessing from map return function
Pull Request -
State: closed - Opened by piercefreeman about 1 year ago
- 5 comments
#6060 - Dataset.map() execute twice when in PyTorch DDP mode
Issue -
State: closed - Opened by wanghaoyucn about 1 year ago
- 4 comments
#6059 - Provide ability to load label mappings from file
Issue -
State: open - Opened by david-waterworth about 1 year ago
- 3 comments
Labels: enhancement
#6057 - Why is the speed difference of gen example so big?
Issue -
State: closed - Opened by pixeli99 about 1 year ago
- 1 comment
#6056 - Implement proper checkpointing for dataset uploading with resume function that does not require remapping shards that have already been uploaded
Pull Request -
State: open - Opened by AntreasAntoniou about 1 year ago
- 6 comments
#6053 - Change package name from "datasets" to something less generic
Issue -
State: closed - Opened by geajack about 1 year ago
- 1 comment
Labels: enhancement
#6049 - Update `ruff` version in pre-commit config
Pull Request -
State: closed - Opened by polinaeterna about 1 year ago
- 2 comments
#6046 - Support proxy and user-agent in fsspec calls
Issue -
State: open - Opened by lhoestq about 1 year ago
- 8 comments
Labels: enhancement, good second issue
#6036 - Deprecate search API
Pull Request -
State: open - Opened by mariosasko about 1 year ago
- 9 comments
#6032 - DownloadConfig.proxies not work when load_dataset_builder calling HfApi.dataset_info
Issue -
State: open - Opened by codingl2k1 about 1 year ago
- 5 comments
#6020 - Inconsistent "The features can't be aligned" error when combining map, multiprocessing, and variable length outputs
Issue -
State: open - Opened by kheyer about 1 year ago
- 3 comments
#6014 - Request to Share/Update Dataset Viewer Code
Issue -
State: closed - Opened by lilyorlilypad about 1 year ago
- 10 comments
Labels: duplicate
#6012 - [FR] Transform Chaining, Lazy Mapping
Issue -
State: open - Opened by NightMachinery about 1 year ago
- 7 comments
Labels: enhancement
#6010 - Improve `Dataset`'s string representation
Issue -
State: open - Opened by mariosasko about 1 year ago
- 3 comments
Labels: enhancement
#6007 - Get an error "OverflowError: Python int too large to convert to C long" when loading a large dataset
Issue -
State: open - Opened by silverriver about 1 year ago
- 8 comments
Labels: arrow
#5990 - Pushing a large dataset on the hub consistently hangs
Issue -
State: open - Opened by AntreasAntoniou over 1 year ago
- 45 comments
Labels: bug
#5984 - AutoSharding IterableDataset's when num_workers > 1
Issue -
State: open - Opened by mathephysicist over 1 year ago
- 8 comments
Labels: enhancement
#5983 - replaced PathLike as a variable for save_to_disk for dataset_path wit…
Pull Request -
State: closed - Opened by benjaminbrown038 over 1 year ago
#5981 - Only two cores are getting used in sagemaker with pytorch 3.10 kernel
Issue -
State: closed - Opened by mmr-crexi over 1 year ago
- 4 comments
#5968 - Common Voice datasets still need `use_auth_token=True`
Issue -
State: closed - Opened by patrickvonplaten over 1 year ago
- 4 comments