Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / huggingface/datasets issues and pull requests

#7414 - Gracefully cancel async tasks

Pull Request - State: open - Opened by lhoestq 2 days ago - 1 comment

#7410 - Set dev version

Pull Request - State: closed - Opened by lhoestq 4 days ago - 1 comment

#7409 - Release: 3.3.1

Pull Request - State: closed - Opened by lhoestq 4 days ago - 1 comment

#7408 - Fix filter speed regression

Pull Request - State: closed - Opened by lhoestq 4 days ago - 1 comment

#7406 - Adding Core Maintainer List to CONTRIBUTING.md

Issue - State: open - Opened by jp1924 4 days ago - 3 comments
Labels: enhancement

#7405 - Lazy loading of environment variables

Issue - State: open - Opened by nikvaessen 5 days ago - 1 comment

#7404 - Performance regression in `dataset.filter`

Issue - State: closed - Opened by ttim 5 days ago - 2 comments

#7404 - Performance regression in `dataset.filter`

Issue - State: closed - Opened by ttim 5 days ago - 3 comments

#7402 - Fix a typo in arrow_dataset.py

Pull Request - State: open - Opened by jingedawang 5 days ago

#7401 - set dev version

Pull Request - State: closed - Opened by lhoestq 7 days ago - 1 comment

#7399 - Synchronize parameters for various datasets

Issue - State: open - Opened by grofte 7 days ago - 2 comments

#7398 - Release: 3.3.0

Pull Request - State: closed - Opened by lhoestq 7 days ago - 1 comment

#7397 - Kannada dataset(Conversations, Wikipedia etc)

Pull Request - State: open - Opened by Likhith2612 7 days ago

#7396 - Update README.md

Pull Request - State: closed - Opened by lhoestq 8 days ago - 1 comment

#7395 - Update docs

Pull Request - State: closed - Opened by lhoestq 8 days ago - 1 comment

#7393 - Optimized sequence encoding for scalars

Pull Request - State: closed - Opened by lukasgd 10 days ago - 1 comment

#7390 - Re-add py.typed

Issue - State: open - Opened by NeilGirdhar 11 days ago
Labels: enhancement

#7389 - Getting statistics about filtered examples

Issue - State: closed - Opened by jonathanasdf 11 days ago - 2 comments

#7388 - OSError: [Errno 22] Invalid argument forbidden character

Issue - State: closed - Opened by langflogit 11 days ago - 2 comments

#7387 - Dynamic adjusting dataloader sampling weight

Issue - State: open - Opened by whc688 11 days ago - 3 comments

#7386 - Add bookfolder Dataset Builder for Digital Book Formats

Issue - State: closed - Opened by shikanime 13 days ago - 1 comment
Labels: enhancement

#7385 - Make IterableDataset (optionally) resumable

Pull Request - State: open - Opened by yzhangcs 17 days ago - 1 comment

#7384 - Support async functions in map()

Pull Request - State: closed - Opened by lhoestq 18 days ago - 2 comments

#7382 - Add Pandas, PyArrow and Polars docs

Pull Request - State: closed - Opened by lhoestq 21 days ago - 1 comment

#7381 - Iterating over values of a column in the IterableDataset

Issue - State: open - Opened by TopCoder2K 24 days ago - 2 comments
Labels: enhancement

#7380 - fix: dill default for version bigger 0.3.8

Pull Request - State: open - Opened by sam-hey 26 days ago

#7378 - Allow pushing config version to hub

Issue - State: open - Opened by momeara about 1 month ago - 1 comment
Labels: enhancement

#7377 - Support for sparse arrays with the Arrow Sparse Tensor format?

Issue - State: open - Opened by JulesGM about 1 month ago - 1 comment
Labels: enhancement

#7376 - [docs] uv install

Pull Request - State: open - Opened by stevhliu about 1 month ago

#7375 - vllm batch inference raises an error

Issue - State: open - Opened by YuShengzuishuai about 1 month ago - 1 comment

#7374 - Remove .h5 from imagefolder extensions

Pull Request - State: closed - Opened by lhoestq about 1 month ago

#7373 - Excessive RAM Usage After Dataset Concatenation concatenate_datasets

Issue - State: open - Opened by sam-hey about 1 month ago - 1 comment

#7371 - 500 Server error with pushing a dataset

Issue - State: open - Opened by martinmatak about 1 month ago - 1 comment

#7370 - Support faster processing using pandas or polars functions in `IterableDataset.map()`

Pull Request - State: closed - Opened by lhoestq about 1 month ago - 2 comments

#7368 - Add with_split to DatasetDict.map

Pull Request - State: open - Opened by jp1924 about 1 month ago - 5 comments

#7366 - Dataset.from_dict() can't handle large dict

Issue - State: open - Opened by CSU-OSS about 1 month ago

#7364 - API endpoints for gated dataset access requests

Issue - State: closed - Opened by jerome-white about 1 month ago - 3 comments
Labels: enhancement

#7363 - ImportError: To support decoding images, please install 'Pillow'.

Issue - State: open - Opened by jamessdixon about 1 month ago - 3 comments

#7362 - HuggingFace CLI dataset download raises error

Issue - State: closed - Opened by ajayvohra2005 about 1 month ago - 3 comments

#7361 - Fix lock permission

Pull Request - State: open - Opened by cih9088 about 2 months ago

#7360 - error when loading dataset in Hugging Face: NoneType error is not callable

Issue - State: open - Opened by nanu23333 about 2 months ago - 3 comments

#7358 - Fix remove_columns in the formatted case

Pull Request - State: open - Opened by lhoestq about 2 months ago - 1 comment

#7357 - Python process aborded with GIL issue when using image dataset

Issue - State: open - Opened by AlexKoff88 about 2 months ago - 1 comment

#7356 - How about adding a feature to pass the key when performing map on DatasetDict?

Issue - State: open - Opened by jp1924 about 2 months ago - 6 comments
Labels: enhancement

#7355 - Not available datasets[audio] on python 3.13

Issue - State: open - Opened by sergiosinlimites about 2 months ago - 1 comment

#7353 - changes to MappedExamplesIterable to resolve #7345

Pull Request - State: closed - Opened by vttrifonov about 2 months ago - 2 comments

#7352 - fsspec 2024.12.0

Pull Request - State: closed - Opened by lhoestq about 2 months ago - 1 comment

#7350 - Bump hfh to 0.24 to fix ci

Pull Request - State: closed - Opened by lhoestq about 2 months ago - 1 comment

#7349 - Webdataset special columns in last position

Pull Request - State: closed - Opened by lhoestq about 2 months ago - 1 comment

#7348 - Catch OSError for arrow

Pull Request - State: closed - Opened by lhoestq about 2 months ago - 1 comment

#7347 - Converting Arrow to WebDataset TAR Format for Offline Use

Issue - State: closed - Opened by katie312 about 2 months ago - 4 comments
Labels: enhancement

#7346 - OSError: Invalid flatbuffers message.

Issue - State: closed - Opened by antecede about 2 months ago - 3 comments

#7345 - Different behaviour of IterableDataset.map vs Dataset.map with remove_columns

Issue - State: closed - Opened by vttrifonov about 2 months ago - 1 comment

#7342 - Update LICENSE

Pull Request - State: closed - Opened by eliebak 2 months ago - 1 comment

#7341 - minor video docs on how to install

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7340 - don't import soundfile in tests

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7339 - Update CONTRIBUTING.md

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7336 - Clarify documentation or Create DatasetCard

Issue - State: open - Opened by August-murr 2 months ago
Labels: enhancement

#7335 - Too many open files: '/root/.cache/huggingface/token'

Issue - State: open - Opened by kopyl 2 months ago

#7328 - Fix typo in arrow_dataset

Pull Request - State: closed - Opened by AndreaFrancis 2 months ago - 1 comment

#7327 - .map() is not caching and ram goes OOM

Issue - State: open - Opened by simeneide 2 months ago - 1 comment

#7326 - Remove upper bound for fsspec

Issue - State: open - Opened by fellhorn 2 months ago - 1 comment

#7325 - Introduce pdf support (#7318)

Pull Request - State: open - Opened by yabramuvdi 2 months ago - 3 comments

#7323 - Unexpected cache behaviour using load_dataset

Issue - State: closed - Opened by Moritz-Wirth 2 months ago - 1 comment

#7321 - ImportError: cannot import name 'set_caching_enabled' from 'datasets'

Issue - State: open - Opened by sankexin 2 months ago - 2 comments

#7319 - set dev version

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7318 - Introduce support for PDFs

Issue - State: open - Opened by yabramuvdi 2 months ago - 6 comments
Labels: enhancement

#7317 - Release: 3.2.0

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7316 - More docs to from_dict to mention that the result lives in RAM

Pull Request - State: closed - Opened by lhoestq 2 months ago - 1 comment

#7314 - Resolved for empty datafiles

Pull Request - State: open - Opened by sahillihas 2 months ago - 2 comments

#7313 - Cannot create a dataset with relative audio path

Issue - State: open - Opened by sedol1339 2 months ago - 3 comments

#7311 - How to get the original dataset name with username?

Issue - State: open - Opened by npuichigo 3 months ago - 2 comments
Labels: enhancement

#7311 - How to get the original dataset name with username?

Issue - State: open - Opened by npuichigo 3 months ago
Labels: enhancement