Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / dask/dask issues and pull requests

#10009 - Support complex moment

Pull Request - State: closed - Opened by wkrasnicki over 1 year ago - 15 comments
Labels: array, needs attention, needs review

#9995 - `dask.order` over-prioritizes root tasks in some situations

Issue - State: closed - Opened by gjoseph92 over 1 year ago - 4 comments
Labels: scheduler, core

#9992 - ENH: Allow `__array_ufunc__` to dispatch on `where=` in NumPy?

Issue - State: open - Opened by seberg over 1 year ago - 3 comments
Labels: upstream, needs triage

#9981 - Port option parsing logic from `dask.dataframe.read_parquet` to `to_parquet`

Pull Request - State: closed - Opened by antonl over 1 year ago - 6 comments
Labels: dataframe, io

#9972 - dd.to_hdf tries to write with multiple processes (errno 11, unable to lock file )

Issue - State: open - Opened by MarcoPasta over 1 year ago - 1 comment
Labels: io, needs attention

#9969 - Abandon encoded tuples as task definition in dsk graphs

Issue - State: closed - Opened by fjetter over 1 year ago - 6 comments
Labels: scheduler, core, discussion, highlevelgraph, needs attention, enhancement

#9967 - Generalize some NumPy vs CuPy logic

Pull Request - State: closed - Opened by sarahyurick over 1 year ago - 1 comment
Labels: dataframe, array

#9961 - Allow passing index_col=False in dd.read_csv

Pull Request - State: closed - Opened by michaeldleslie over 1 year ago - 3 comments
Labels: dataframe, io

#9942 - [DNM] Try `copy_on_write` in CI

Pull Request - State: open - Opened by jrbourbeau over 1 year ago - 2 comments
Labels: upstream, needs attention

#9941 - Documenting compatibility of NumPy functions with Dask functions

Pull Request - State: closed - Opened by cmarmo over 1 year ago - 5 comments
Labels: documentation

#9925 - Do not ravel array when slicing with same shape

Pull Request - State: open - Opened by fjetter over 1 year ago - 2 comments
Labels: array, needs attention

#9924 - Make `repartition` a no-op when divisions match

Pull Request - State: closed - Opened by jrbourbeau over 1 year ago - 4 comments
Labels: dataframe

#9917 - [DOC] Favicon not loading for generated API docs

Issue - State: open - Opened by charlesbluca over 1 year ago - 4 comments
Labels: documentation, needs attention, bug

#9915 - `axis=None` behavior in `pandas` 2.0 for `skew` and `kurtosis`

Issue - State: open - Opened by jrbourbeau over 1 year ago
Labels: dataframe, needs attention

#9914 - From-array dispatch support

Pull Request - State: open - Opened by quasiben over 1 year ago
Labels: array, needs attention

#9913 - `numeric_only` compatibility with `pandas>=2.0`, handling datetime and timedelta

Issue - State: open - Opened by j-bennet over 1 year ago
Labels: dataframe, needs attention

#9902 - Add support for grouping by tuple keys

Pull Request - State: open - Opened by charlesbluca over 1 year ago - 2 comments
Labels: dataframe, needs attention

#9899 - Consider migrating some Dask-SQL utility functions into Dask

Issue - State: closed - Opened by sarahyurick over 1 year ago - 2 comments
Labels: needs triage

#9894 - Links to changelog sub-sections are not stable

Issue - State: open - Opened by hendrikmakait over 1 year ago - 2 comments
Labels: documentation, needs attention

#9888 - Reuse of keys in blockwise fusion can cause spurious KeyErrors on distributed cluster

Issue - State: open - Opened by fjetter over 1 year ago - 16 comments
Labels: highlevelgraph, needs attention

#9883 - [WIP] Support `dataframe.dtype_backend` globally

Pull Request - State: open - Opened by jrbourbeau over 1 year ago
Labels: dataframe, io, needs attention

#9882 - idxmin/idxmax on GroupBy does not returns the same result as Pandas

Issue - State: open - Opened by j-bennet over 1 year ago - 5 comments
Labels: dataframe, needs attention, bug

#9881 - Generalize ``dd.to_datetime`` for GPU-backed collections, introduce ``get_meta_library`` utility

Pull Request - State: closed - Opened by charlesbluca over 1 year ago - 4 comments
Labels: dataframe

#9880 - Generalize `dd.to_datetime` to support cuDF-backed Series

Issue - State: closed - Opened by sarahyurick over 1 year ago - 1 comment
Labels: dataframe

#9879 - Apply `dataframe.dtype_backend` configuration option globally

Issue - State: open - Opened by jrbourbeau over 1 year ago - 3 comments
Labels: dataframe, needs attention

#9874 - Add `dataframe.nullable_dtypes` configuration option

Pull Request - State: open - Opened by jrbourbeau over 1 year ago - 2 comments
Labels: dataframe, io, needs attention

#9866 - Da fill diagonal

Pull Request - State: open - Opened by talkhanz over 1 year ago - 5 comments
Labels: array, needs attention

#9861 - Issue writing parquet files on s3 in append mode with pyarrow

Issue - State: open - Opened by antoinebon over 1 year ago - 4 comments
Labels: dataframe, io, parquet, needs attention

#9859 - Utility that prints out delayed values when they're computed?

Issue - State: open - Opened by gjoseph92 over 1 year ago - 8 comments
Labels: core, needs attention, feature

#9858 - dask.dataframe should aggressively cast object strings to PyArrow

Issue - State: closed - Opened by crusaderky over 1 year ago - 2 comments
Labels: dataframe

#9857 - `DataFrame.__len__` / `Series.__len__` should raise a warning if it triggers a compute

Issue - State: open - Opened by crusaderky over 1 year ago - 6 comments
Labels: dataframe, needs attention

#9856 - Allow structs in Parquet files to be flattened on read

Issue - State: open - Opened by te-x over 1 year ago - 8 comments
Labels: io, parquet, needs attention, feature

#9849 - Add blocksize to read_parquet and read_json (non-line json)

Issue - State: open - Opened by crusaderky over 1 year ago - 13 comments
Labels: needs attention, feature

#9848 - DataFrame.mask fails for single-row partitions

Issue - State: closed - Opened by albarji over 1 year ago - 7 comments
Labels: dataframe, bug

#9847 - We should rethink categorize()

Issue - State: open - Opened by crusaderky over 1 year ago - 3 comments
Labels: dataframe, discussion, needs attention

#9846 - implementation of np.fill_diagonal

Issue - State: open - Opened by talkhanz over 1 year ago - 11 comments
Labels: array, needs attention, feature

#9840 - Config option dataframe.dtype_backend: pyarrow doesn't seem to work

Issue - State: closed - Opened by crusaderky over 1 year ago - 4 comments
Labels: dataframe, io

#9837 - Documentation on how to set an s3 profile

Issue - State: open - Opened by crusaderky over 1 year ago - 2 comments
Labels: io, documentation, needs attention

#9835 - Don't import sizeof entrypoints immediately

Pull Request - State: open - Opened by martindurant over 1 year ago - 7 comments
Labels: needs attention

#9825 - Docstring of dask.array.coarsen should provide more details on valid reduction functions

Issue - State: open - Opened by jni over 1 year ago - 3 comments
Labels: array, documentation, needs attention

#9824 - Groupby Quantile

Issue - State: open - Opened by patcao over 1 year ago - 1 comment
Labels: dataframe, needs attention, enhancement

#9821 - Update deploying-kubernetes to use `dask_kubernetes.operator`.

Issue - State: open - Opened by TomAugspurger over 1 year ago - 1 comment
Labels: documentation, needs attention

#9819 - Improve error message when cudf-backend dispatching fails

Pull Request - State: open - Opened by rjzamora over 1 year ago
Labels: needs attention, enhancement

#9814 - Update `RemovedIn20Warning` filtering

Pull Request - State: open - Opened by jrbourbeau over 1 year ago
Labels: dataframe, io, needs attention

#9804 - test_tokenize_object_with_recursion_error() triggers stack overflow on Windows due to IPython imports

Issue - State: open - Opened by pfox89 over 1 year ago - 1 comment
Labels: tests, needs attention

#9803 - Provide a clearer error message when using dataframe.backend if dask_cudf is not installed

Issue - State: open - Opened by randerzander over 1 year ago - 3 comments
Labels: dataframe, needs attention, enhancement

#9800 - ValueError: divisions must be sorted when using int64 index

Issue - State: open - Opened by cbyrohl over 1 year ago - 3 comments
Labels: needs info, needs attention

#9798 - Indexing a dask DataFrame with a dask boolean array

Issue - State: open - Opened by gcaria over 1 year ago - 2 comments
Labels: dataframe, needs attention, enhancement

#9795 - Optimization is slow

Issue - State: open - Opened by martindurant over 1 year ago - 29 comments
Labels: highlevelgraph, needs attention

#9791 - Consider adding support for https://github.com/eto-ai/lance

Issue - State: open - Opened by asmith26 almost 2 years ago - 1 comment
Labels: dataframe, io, needs attention

#9780 - Fix read_parquet index preservation when pyarrow-schema is specified

Pull Request - State: open - Opened by rjzamora almost 2 years ago
Labels: dataframe, io, needs attention

#9767 - `Series.std()` fails when using extension dtypes

Issue - State: closed - Opened by jrbourbeau almost 2 years ago - 1 comment
Labels: dataframe, needs attention, bug

#9763 - It raises Error to Index dask.dataframe's index by applied series.

Issue - State: open - Opened by Crispy13 almost 2 years ago - 5 comments
Labels: dataframe, tests

#9756 - Fix ``corr`` and ``cov`` on a single-row partition

Pull Request - State: closed - Opened by j-bennet almost 2 years ago
Labels: dataframe, needs attention

#9732 - Remove source and target optimization in array.store

Pull Request - State: open - Opened by djhoese almost 2 years ago - 6 comments
Labels: array

#9728 - Pivot table min max functions

Pull Request - State: closed - Opened by sorenwacker almost 2 years ago - 5 comments
Labels: dataframe

#9725 - Upgrade `tiledb-py` to `>=0.18.3` in CI

Pull Request - State: open - Opened by graingert almost 2 years ago - 3 comments

#9718 - DOC: add example showing dask.delayed on a range, with usage of `nout`

Issue - State: open - Opened by NickleDave almost 2 years ago - 4 comments
Labels: documentation, enhancement

#9694 - Update flake8-bugbear with 'upadup'

Pull Request - State: closed - Opened by sirosen almost 2 years ago - 1 comment
Labels: dataframe, array, needs attention

#9666 - Use repeat to build nearest boundary

Pull Request - State: open - Opened by j2bbayle almost 2 years ago - 5 comments
Labels: array

#9632 - Load nullables from dask

Pull Request - State: closed - Opened by hayesgb almost 2 years ago - 1 comment
Labels: dataframe, io

#9626 - Failure in assignment of `np.ma.masked` to obect-type `Array`

Issue - State: open - Opened by davidhassell almost 2 years ago - 4 comments
Labels: array, needs attention, bug

#9621 - Error when computing a cloned graph from xarray.open_dataset

Issue - State: closed - Opened by tierriminator almost 2 years ago - 8 comments
Labels: array, needs attention, bug

#9619 - Read_parquet is slower than expected with S3

Issue - State: open - Opened by mrocklin almost 2 years ago - 51 comments
Labels: dataframe, io, parquet

#9470 - bug of svd_compressed needs to fix

Pull Request - State: open - Opened by LUOXIAO92 about 2 years ago - 5 comments
Labels: dataframe, io, needs attention

#9374 - Hypothesis strategy for chunking arrays

Pull Request - State: open - Opened by TomNicholas about 2 years ago - 10 comments
Labels: array, documentation

#9311 - 2022.7.1: documentation build fails wth sphinx 5.x

Issue - State: open - Opened by kloczek about 2 years ago - 13 comments
Labels: documentation

#9308 - Document Dask DataFrame `Index` methods in the API reference

Issue - State: closed - Opened by pavithraes about 2 years ago - 3 comments
Labels: good first issue, documentation

#9271 - Add a filter for the `numeric_only` warning.

Pull Request - State: closed - Opened by jsignell about 2 years ago
Labels: dataframe

#9269 - Pass `numeric_only` through

Pull Request - State: closed - Opened by jsignell about 2 years ago
Labels: dataframe, upstream

#9226 - ddf.cor and ddf.cov fail for single-row partition

Issue - State: closed - Opened by rjzamora over 2 years ago - 1 comment
Labels: dataframe, needs attention, bug

#9205 - FIX: update versioneer for py312

Pull Request - State: closed - Opened by tacaswell over 2 years ago - 3 comments

#9079 - dd.concat crashes with unhelpful error message when column types are incompatible

Issue - State: open - Opened by eric-yu-snorkel over 2 years ago - 2 comments
Labels: dataframe, bug

#9056 - Is it intended? Column loaded twice if listed twice in "columns=" when reading a Parquet file.

Issue - State: open - Opened by bsesar over 2 years ago - 6 comments
Labels: dataframe, parquet, needs attention, bug

#9008 - Metadata error when dropping a list of columns, and then later updating that list of columns

Issue - State: open - Opened by multimeric over 2 years ago - 8 comments
Labels: dataframe, enhancement

#8939 - Sort Values Division By Zero

Issue - State: closed - Opened by kevjumba over 2 years ago - 4 comments
Labels: dataframe

#8917 - Python Array API in Dask issue tracking

Issue - State: open - Opened by tomwhite over 2 years ago - 3 comments
Labels: array

#8853 - Type hints doesn't work as expected, because they aren't present in dask source code.

Issue - State: open - Opened by karolzlot over 2 years ago - 7 comments
Labels: needs attention, feature

#8787 - [Bug] [Dask-on-Ray] Partd files are not cleaned automatically

Issue - State: closed - Opened by mikwieczorek over 2 years ago - 9 comments
Labels: core, needs info

#8658 - Groupby Rank

Issue - State: open - Opened by beckernick over 2 years ago - 2 comments
Labels: dataframe, needs attention, feature

#8645 - [DISCUSSION] What to do about `None` vs `no_default` as a pandas kwargs

Issue - State: open - Opened by jsignell over 2 years ago - 8 comments
Labels: dataframe, discussion

#8638 - Allow some boolean indexing operations to return correct shape

Issue - State: open - Opened by Illviljan over 2 years ago - 3 comments
Labels: array, needs attention, enhancement

#8635 - Blockwise optimization doesn't combine task names, like low-level fusion does

Issue - State: open - Opened by gjoseph92 over 2 years ago - 7 comments
Labels: highlevelgraph, needs attention

#8620 - `read_sql_query` with meta converts dtypes from 32 to 64.

Issue - State: open - Opened by jsignell over 2 years ago - 9 comments
Labels: needs attention, bug, p3

#8616 - [DISCUSSION] Layer-by-Layer Graph Execution

Issue - State: open - Opened by rjzamora over 2 years ago - 10 comments
Labels: discussion, highlevelgraph, needs attention

#8581 - Blockwise serialization can fail with LocalCluster(processes=False)

Issue - State: open - Opened by rjzamora over 2 years ago - 3 comments
Labels: core, highlevelgraph

#8570 - Culling massive Blockwise graphs is very slow, not constant-time

Issue - State: open - Opened by gjoseph92 over 2 years ago - 10 comments
Labels: highlevelgraph, needs attention

#8549 - OSError: Could not load shared object file: llvmlite.dll

Issue - State: open - Opened by crusaderky over 2 years ago - 4 comments
Labels: upstream, needs attention

#8546 - Allow coroutines to be used in dask.bag operations

Issue - State: open - Opened by ianliu over 2 years ago - 3 comments
Labels: bag, needs attention

#8530 - dask.dataframe.describe error with nullable data types

Issue - State: open - Opened by scharlottej13 over 2 years ago - 2 comments
Labels: dataframe, p3

#8528 - From Delayed throws exception when column names are out of order

Issue - State: open - Opened by mlahir1 over 2 years ago - 3 comments
Labels: dataframe, needs attention

#8499 - test_development_guidelines_matches_ci fails from sdist

Issue - State: open - Opened by QuLogic almost 3 years ago - 4 comments

#8481 - Array/DataFrame optimization requires HLG

Pull Request - State: open - Opened by gjoseph92 almost 3 years ago - 1 comment
Labels: dataframe, array, needs attention

#8480 - `test_scheduler_highlevel_graph_unpack_import` flaky

Issue - State: open - Opened by jrbourbeau almost 3 years ago - 1 comment
Labels: tests, needs attention

#8476 - Avoid materialization for ArrayOverlapLayer in methods __len__ & get_output_keys

Pull Request - State: open - Opened by GenevieveBuckley almost 3 years ago
Labels: needs attention

#8460 - Assignment using 1D dask Array index

Issue - State: open - Opened by TLouf almost 3 years ago - 5 comments
Labels: array, needs attention

#8448 - Add fusion optimization for Delayed

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 4 comments
Labels: delayed, highlevelgraph, needs attention

#8447 - Consider reactivating low-level DataFrame optimization when not all layers are Blockwise

Issue - State: open - Opened by gjoseph92 almost 3 years ago - 1 comment
Labels: dataframe, needs attention

#8442 - Optimize groupby when `by` contains `ddf.index`

Pull Request - State: open - Opened by jsignell almost 3 years ago - 10 comments
Labels: dataframe, needs attention