Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / dask/dask issues and pull requests

#9825 - Docstring of dask.array.coarsen should provide more details on valid reduction functions

Issue - State: open - Opened by jni almost 2 years ago - 3 comments
Labels: array, documentation, needs attention

#9824 - Groupby Quantile

Issue - State: open - Opened by patcao almost 2 years ago - 1 comment
Labels: dataframe, needs attention, enhancement

#9821 - Update deploying-kubernetes to use `dask_kubernetes.operator`.

Issue - State: open - Opened by TomAugspurger almost 2 years ago - 1 comment
Labels: documentation, needs attention

#9819 - Improve error message when cudf-backend dispatching fails

Pull Request - State: open - Opened by rjzamora almost 2 years ago
Labels: needs attention, enhancement

#9814 - Update `RemovedIn20Warning` filtering

Pull Request - State: open - Opened by jrbourbeau almost 2 years ago
Labels: dataframe, io, needs attention

#9804 - test_tokenize_object_with_recursion_error() triggers stack overflow on Windows due to IPython imports

Issue - State: open - Opened by pfox89 almost 2 years ago - 1 comment
Labels: tests, needs attention

#9803 - Provide a clearer error message when using dataframe.backend if dask_cudf is not installed

Issue - State: open - Opened by randerzander almost 2 years ago - 3 comments
Labels: dataframe, needs attention, enhancement

#9800 - ValueError: divisions must be sorted when using int64 index

Issue - State: open - Opened by cbyrohl almost 2 years ago - 3 comments
Labels: needs info, needs attention

#9798 - Indexing a dask DataFrame with a dask boolean array

Issue - State: open - Opened by gcaria almost 2 years ago - 2 comments
Labels: dataframe, needs attention, enhancement

#9795 - Optimization is slow

Issue - State: open - Opened by martindurant almost 2 years ago - 29 comments
Labels: highlevelgraph, needs attention

#9791 - Consider adding support for https://github.com/eto-ai/lance

Issue - State: open - Opened by asmith26 almost 2 years ago - 1 comment
Labels: dataframe, io, needs attention

#9780 - Fix read_parquet index preservation when pyarrow-schema is specified

Pull Request - State: open - Opened by rjzamora almost 2 years ago
Labels: dataframe, io, needs attention

#9767 - `Series.std()` fails when using extension dtypes

Issue - State: closed - Opened by jrbourbeau almost 2 years ago - 1 comment
Labels: dataframe, needs attention, bug

#9763 - It raises Error to Index dask.dataframe's index by applied series.

Issue - State: open - Opened by Crispy13 almost 2 years ago - 5 comments
Labels: dataframe, tests

#9756 - Fix ``corr`` and ``cov`` on a single-row partition

Pull Request - State: closed - Opened by j-bennet almost 2 years ago
Labels: dataframe, needs attention

#9732 - Remove source and target optimization in array.store

Pull Request - State: open - Opened by djhoese almost 2 years ago - 6 comments
Labels: array

#9728 - Pivot table min max functions

Pull Request - State: closed - Opened by sorenwacker almost 2 years ago - 5 comments
Labels: dataframe

#9725 - Upgrade `tiledb-py` to `>=0.18.3` in CI

Pull Request - State: open - Opened by graingert almost 2 years ago - 3 comments

#9718 - DOC: add example showing dask.delayed on a range, with usage of `nout`

Issue - State: open - Opened by NickleDave almost 2 years ago - 4 comments
Labels: documentation, enhancement

#9694 - Update flake8-bugbear with 'upadup'

Pull Request - State: closed - Opened by sirosen almost 2 years ago - 1 comment
Labels: dataframe, array, needs attention

#9666 - Use repeat to build nearest boundary

Pull Request - State: closed - Opened by j2bbayle about 2 years ago - 8 comments
Labels: array, needs attention

#9632 - Load nullables from dask

Pull Request - State: closed - Opened by hayesgb about 2 years ago - 1 comment
Labels: dataframe, io

#9626 - Failure in assignment of `np.ma.masked` to obect-type `Array`

Issue - State: open - Opened by davidhassell about 2 years ago - 4 comments
Labels: array, needs attention, bug

#9621 - Error when computing a cloned graph from xarray.open_dataset

Issue - State: closed - Opened by tierriminator about 2 years ago - 8 comments
Labels: array, needs attention, bug

#9619 - Read_parquet is slower than expected with S3

Issue - State: open - Opened by mrocklin about 2 years ago - 51 comments
Labels: dataframe, io, parquet

#9470 - bug of svd_compressed needs to fix

Pull Request - State: open - Opened by LUOXIAO92 about 2 years ago - 5 comments
Labels: dataframe, io, needs attention

#9468 - Add compat code for unsupported groupby `dropna` cases

Pull Request - State: closed - Opened by charlesbluca about 2 years ago
Labels: dataframe, needs attention

#9374 - Hypothesis strategy for chunking arrays

Pull Request - State: open - Opened by TomNicholas over 2 years ago - 10 comments
Labels: array, documentation

#9311 - 2022.7.1: documentation build fails wth sphinx 5.x

Issue - State: open - Opened by kloczek over 2 years ago - 13 comments
Labels: documentation

#9308 - Document Dask DataFrame `Index` methods in the API reference

Issue - State: closed - Opened by pavithraes over 2 years ago - 3 comments
Labels: good first issue, documentation

#9271 - Add a filter for the `numeric_only` warning.

Pull Request - State: closed - Opened by jsignell over 2 years ago
Labels: dataframe

#9269 - Pass `numeric_only` through

Pull Request - State: closed - Opened by jsignell over 2 years ago
Labels: dataframe, upstream

#9226 - ddf.cor and ddf.cov fail for single-row partition

Issue - State: closed - Opened by rjzamora over 2 years ago - 1 comment
Labels: dataframe, needs attention, bug

#9205 - FIX: update versioneer for py312

Pull Request - State: closed - Opened by tacaswell over 2 years ago - 3 comments

#9079 - dd.concat crashes with unhelpful error message when column types are incompatible

Issue - State: open - Opened by eric-yu-snorkel over 2 years ago - 2 comments
Labels: dataframe, bug

#9056 - Is it intended? Column loaded twice if listed twice in "columns=" when reading a Parquet file.

Issue - State: open - Opened by bsesar over 2 years ago - 6 comments
Labels: dataframe, parquet, needs attention, bug

#9008 - Metadata error when dropping a list of columns, and then later updating that list of columns

Issue - State: open - Opened by multimeric over 2 years ago - 8 comments
Labels: dataframe, enhancement

#8939 - Sort Values Division By Zero

Issue - State: closed - Opened by kevjumba over 2 years ago - 4 comments
Labels: dataframe

#8917 - Python Array API in Dask issue tracking

Issue - State: open - Opened by tomwhite over 2 years ago - 3 comments
Labels: array

#8853 - Type hints doesn't work as expected, because they aren't present in dask source code.

Issue - State: open - Opened by karolzlot over 2 years ago - 7 comments
Labels: needs attention, feature

#8787 - [Bug] [Dask-on-Ray] Partd files are not cleaned automatically

Issue - State: closed - Opened by mikwieczorek over 2 years ago - 9 comments
Labels: core, needs info

#8658 - Groupby Rank

Issue - State: open - Opened by beckernick almost 3 years ago - 2 comments
Labels: dataframe, needs attention, feature

#8645 - [DISCUSSION] What to do about `None` vs `no_default` as a pandas kwargs

Issue - State: open - Opened by jsignell almost 3 years ago - 8 comments
Labels: dataframe, discussion

#8638 - Allow some boolean indexing operations to return correct shape

Issue - State: open - Opened by Illviljan almost 3 years ago - 3 comments
Labels: array, needs attention, enhancement

#8635 - Blockwise optimization doesn't combine task names, like low-level fusion does

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 7 comments
Labels: highlevelgraph, needs attention

#8620 - `read_sql_query` with meta converts dtypes from 32 to 64.

Issue - State: open - Opened by jsignell almost 3 years ago - 9 comments
Labels: needs attention, bug, p3

#8616 - [DISCUSSION] Layer-by-Layer Graph Execution

Issue - State: open - Opened by rjzamora almost 3 years ago - 10 comments
Labels: discussion, highlevelgraph, needs attention

#8581 - Blockwise serialization can fail with LocalCluster(processes=False)

Issue - State: open - Opened by rjzamora almost 3 years ago - 3 comments
Labels: core, highlevelgraph

#8570 - Culling massive Blockwise graphs is very slow, not constant-time

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 10 comments
Labels: highlevelgraph, needs attention

#8549 - OSError: Could not load shared object file: llvmlite.dll

Issue - State: open - Opened by crusaderky almost 3 years ago - 4 comments
Labels: upstream, needs attention

#8546 - Allow coroutines to be used in dask.bag operations

Issue - State: open - Opened by ianliu almost 3 years ago - 3 comments
Labels: bag, needs attention

#8530 - dask.dataframe.describe error with nullable data types

Issue - State: open - Opened by scharlottej13 almost 3 years ago - 2 comments
Labels: dataframe, p3

#8528 - From Delayed throws exception when column names are out of order

Issue - State: open - Opened by mlahir1 almost 3 years ago - 3 comments
Labels: dataframe, needs attention

#8506 - Failing Windows tests

Issue - State: closed - Opened by jsignell almost 3 years ago - 1 comment
Labels: needs attention

#8499 - test_development_guidelines_matches_ci fails from sdist

Issue - State: open - Opened by QuLogic almost 3 years ago - 4 comments

#8481 - Array/DataFrame optimization requires HLG

Pull Request - State: open - Opened by gjoseph92 almost 3 years ago - 1 comment
Labels: dataframe, array, needs attention

#8480 - `test_scheduler_highlevel_graph_unpack_import` flaky

Issue - State: open - Opened by jrbourbeau almost 3 years ago - 1 comment
Labels: tests, needs attention

#8476 - Avoid materialization for ArrayOverlapLayer in methods __len__ & get_output_keys

Pull Request - State: open - Opened by GenevieveBuckley almost 3 years ago
Labels: needs attention

#8460 - Assignment using 1D dask Array index

Issue - State: open - Opened by TLouf almost 3 years ago - 5 comments
Labels: array, needs attention

#8448 - Add fusion optimization for Delayed

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 4 comments
Labels: delayed, highlevelgraph, needs attention

#8447 - Consider reactivating low-level DataFrame optimization when not all layers are Blockwise

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 1 comment
Labels: dataframe, needs attention

#8442 - Optimize groupby when `by` contains `ddf.index`

Pull Request - State: open - Opened by jsignell almost 3 years ago - 10 comments
Labels: dataframe, needs attention

#8437 - When `divisions` has repeats, `set_index` puts all data in the last partition instead of balancing it

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 1 comment
Labels: dataframe, needs attention

#8435 - [Discussion] Don't compute divisions by default in `set_index`?

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 8 comments
Labels: dataframe, discussion, needs attention

#8430 - Make sort on groupby also affect the groupby sort on the chunk

Pull Request - State: open - Opened by jsignell almost 3 years ago - 3 comments
Labels: dataframe, needs attention

#8421 - DataFrame.groupby sorts group keys even with sort=False

Issue - State: open - Opened by ghost almost 3 years ago - 1 comment
Labels: dataframe, needs attention, bug, p3

#8415 - Documentation for `set_index(col, compute=True)` is unclear/inaccurate

Issue - State: open - Opened by DahnJ almost 3 years ago - 13 comments
Labels: dataframe, documentation, needs attention

#8380 - da.store loses dependency information

Issue - State: open - Opened by djhoese about 3 years ago - 12 comments
Labels: array, needs attention, bug

#8361 - Optimized groupby aggregations when grouping by a sorted index

Issue - State: open - Opened by gjoseph92 about 3 years ago - 11 comments
Labels: dataframe, needs attention

#8355 - Could not deserialize task when using `npartitions="auto"` in `DataFrame.set_index()`

Issue - State: open - Opened by aloysius-lim about 3 years ago - 5 comments
Labels: array, needs attention, bug

#8353 - ignore_index is not used in dd.concat

Issue - State: open - Opened by boazmohar about 3 years ago - 2 comments
Labels: dataframe, needs attention

#8335 - `aiobotocore` releated test failures

Issue - State: closed - Opened by jrbourbeau about 3 years ago - 9 comments

#8334 - Use map instead of batch submit in local.get_async

Issue - State: open - Opened by SebastienDorgan about 3 years ago - 1 comment
Labels: needs attention

#8294 - Shuffle prototype: Feedback (disk usage + workers dying)

Issue - State: open - Opened by DahnJ about 3 years ago - 6 comments
Labels: needs attention

#8292 - graph became invalid in 2021.10.0

Issue - State: open - Opened by chrisroat about 3 years ago - 20 comments
Labels: dataframe

#8291 - Fix test_describe_empty to work without global -Werror

Pull Request - State: closed - Opened by mgorny about 3 years ago - 4 comments
Labels: tests, almost done

#8289 - #4012 for read_csv?

Issue - State: open - Opened by y-he2 about 3 years ago - 5 comments
Labels: dataframe, io, needs attention

#8280 - computing std of sparse matrix produces an error

Issue - State: open - Opened by vttrifonov about 3 years ago - 3 comments
Labels: array, needs attention

#8262 - Use Rich more broadly?

Issue - State: open - Opened by mrocklin about 3 years ago - 1 comment
Labels: discussion, needs attention

#8247 - [WIP] fix OOM error of dask-glm with cupy on GPU

Pull Request - State: open - Opened by daxiongshu about 3 years ago - 10 comments
Labels: array

#8245 - Implement `unstack()` and/or `pivot()`

Issue - State: open - Opened by DahnJ about 3 years ago - 4 comments
Labels: dataframe, needs attention

#8233 - Monthly community meeting

Issue - State: open - Opened by jrbourbeau about 3 years ago - 4 comments
Labels: community

#8229 - Unexpected behaviour with out-of-bound indices

Issue - State: open - Opened by fnattino about 3 years ago - 3 comments
Labels: array, needs attention

#8216 - Pandas 1.2.0 compatibility - column reductions are applied column-wise (when possible)

Issue - State: open - Opened by jsignell about 3 years ago - 1 comment
Labels: dataframe

#8196 - Remove `try..except` block in `set_partitions_pre`

Pull Request - State: open - Opened by charlesbluca about 3 years ago - 2 comments
Labels: dataframe, needs attention

#8172 - botocore error when writing parquet to S3

Issue - State: open - Opened by cliffplaysdrums about 3 years ago - 4 comments
Labels: dataframe, io, parquet, needs attention

#8147 - Unified data reader / writer interfaces

Issue - State: open - Opened by MrPowers about 3 years ago - 3 comments
Labels: io, parquet

#8143 - A few threaded scheduler fixups

Pull Request - State: open - Opened by jcrist about 3 years ago - 7 comments

#8062 - Flaky `test_create_metadata_file`

Issue - State: closed - Opened by jrbourbeau about 3 years ago - 10 comments
Labels: dataframe, io, tests, parquet, needs attention

#8058 - [Discussion] Improve Parquet-Metadata Processing in read_parquet

Issue - State: open - Opened by rjzamora about 3 years ago - 8 comments
Labels: dataframe, io, discussion, parquet, needs attention

#8020 - Use uniform distribution in `timeseries` demo

Pull Request - State: open - Opened by jrbourbeau over 3 years ago - 3 comments
Labels: dataframe, io, needs attention

#8001 - in code suggestion of when to use split_out in dask.dataframe.groupby

Issue - State: open - Opened by raybellwaves over 3 years ago - 6 comments
Labels: dataframe, documentation, needs attention

#7999 - Possible bug when using dask.array.bincount and dask.array.apply_along_axis

Issue - State: open - Opened by miguelcarcamov over 3 years ago - 4 comments
Labels: array, needs attention

#7996 - Flaky `test_setitem_extended_API_2d[index13-value13]`

Issue - State: open - Opened by pentschev over 3 years ago - 2 comments
Labels: tests, needs attention

#7977 - Pyarrow metadata `RuntimeError` in `to_parquet`

Issue - State: open - Opened by jrbourbeau over 3 years ago - 26 comments
Labels: dataframe, io, parquet, needs attention

#7971 - Fix slicing Dask arrays with NumPy array of ndim=0

Pull Request - State: open - Opened by afofa over 3 years ago - 1 comment
Labels: array, needs attention

#7957 - series1.eq(series2) gives different results than pandas with certain dtypes

Issue - State: open - Opened by thehomebrewnerd over 3 years ago - 2 comments
Labels: dataframe, needs attention

#7951 - Slicing Dask arrays with NumPy scalars

Issue - State: open - Opened by jrbourbeau over 3 years ago - 2 comments
Labels: array, needs attention

#7950 - Improve tensordot performance with auto-rechunking

Pull Request - State: open - Opened by GenevieveBuckley over 3 years ago - 17 comments
Labels: array, needs attention