Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / dask/dask issues and pull requests

#11307 - Out of memory

Issue - State: open - Opened by dbalabka about 2 months ago - 12 comments
Labels: dataframe, dask-expr

#11306 - ⚠️ Upstream CI failed ⚠️

Issue - State: closed - Opened by github-actions[bot] about 2 months ago - 1 comment
Labels: upstream

#11305 - order: ensure runnable tasks are certainly runnable

Pull Request - State: closed - Opened by fjetter about 2 months ago - 2 comments

#11304 - Fix upstream numpy build

Pull Request - State: closed - Opened by phofl about 2 months ago - 1 comment
Labels: upstream

#11303 - order: Choose better target for branches with multiple leaf nodes

Pull Request - State: closed - Opened by phofl about 2 months ago - 3 comments

#11302 - Columns are missing after rename

Issue - State: closed - Opened by guozhans about 2 months ago - 2 comments
Labels: needs info, dask-expr

#11301 - Enable slicing with only one unknown chunk

Pull Request - State: closed - Opened by phofl about 2 months ago - 1 comment

#11300 - Fix slicing for masked arrays

Pull Request - State: closed - Opened by phofl about 2 months ago - 3 comments

#11299 - DataFrame object vs string datatype

Issue - State: closed - Opened by FBruzzesi about 2 months ago - 3 comments
Labels: needs triage

#11298 - Dask 2024.8 started failing when indexing result of numpy.flatnonzero

Issue - State: closed - Opened by lagru about 2 months ago - 15 comments
Labels: array

#11296 - `2024.08.0` array slicing does not preserve masks

Issue - State: closed - Opened by rcomer about 2 months ago - 1 comment
Labels: array, needs triage

#11295 - Error in addition of dask dataframe and array when reading from parquet

Issue - State: open - Opened by kbbat about 2 months ago - 3 comments
Labels: dask-expr, array-expr

#11294 - Prune conda nightlies on release

Pull Request - State: open - Opened by charlesbluca about 2 months ago - 2 comments

#11293 - Error when saving joined dataframe with index name to parquet

Issue - State: closed - Opened by joshua-gould about 2 months ago
Labels: dataframe

#11292 - order: dask order returns suboptimal ordering for xr.rechunk().groupby().reduce("cohorts")

Issue - State: closed - Opened by phofl about 2 months ago
Labels: dask-order

#11291 - Make shuffle a no-op if possible

Pull Request - State: closed - Opened by phofl about 2 months ago - 1 comment

#11290 - `2024.8.0` breaks `sparse` array indexing

Issue - State: open - Opened by ilan-gold about 2 months ago - 13 comments

#11289 - Link to dask vs spark benchmarks on dask docs

Pull Request - State: closed - Opened by scharlottej13 about 2 months ago - 2 comments

#11288 - array: fix `asarray` for array input with `dtype`

Pull Request - State: closed - Opened by lucascolley about 2 months ago - 7 comments

#11285 - BUG: `array.asarray` does not respect `dtype` arg

Issue - State: closed - Opened by lucascolley about 2 months ago - 1 comment
Labels: array

#11284 - Deprecate split-large-chunks option

Pull Request - State: open - Opened by phofl about 2 months ago - 5 comments

#11282 - Automatically rechunk in array-shuffle if groups are too large

Issue - State: closed - Opened by phofl about 2 months ago
Labels: array

#11281 - Ensure that array-shuffle with range-like full indexer is a no-op

Issue - State: closed - Opened by phofl about 2 months ago
Labels: array

#11273 - Keep chunksize consistent in reshape

Pull Request - State: closed - Opened by phofl about 2 months ago - 4 comments

#11271 - Add more docstring examples for ``normalize_chunks``

Pull Request - State: closed - Opened by Illviljan about 2 months ago - 1 comment

#11266 - Dask indexing problem with cupy

Issue - State: open - Opened by miguelcarcamov about 2 months ago - 3 comments
Labels: array, bug, gpu

#11248 - Add a Task class to replace tuples for task specification

Pull Request - State: closed - Opened by fjetter 2 months ago - 9 comments

#11242 - Update gpuCI `RAPIDS_VER` to `24.10`

Pull Request - State: closed - Opened by github-actions[bot] 2 months ago - 5 comments

#11240 - Optimizer applies parquet `filters` after loading when using `read_parquet(...).map_partitions(...).compute()`

Issue - State: open - Opened by Timost 2 months ago - 1 comment
Labels: needs info, parquet

#11239 - `read_sql_table` no longer sets index name in the resulting ddf meta

Issue - State: closed - Opened by Timost 2 months ago - 2 comments
Labels: dataframe, needs info

#11235 - Pyarrow <NA> filters are not being applied in read_parquet

Issue - State: closed - Opened by benrutter 2 months ago - 3 comments
Labels: dataframe, parquet

#11220 - Handle np.frombuffer

Issue - State: closed - Opened by anruijian 3 months ago - 3 comments
Labels: needs triage

#11217 - Add array annotations

Pull Request - State: open - Opened by jason-trinidad 3 months ago - 4 comments

#11188 - Bug in map_blocks when iterating over multiple arrays

Issue - State: open - Opened by astrofrog 3 months ago - 1 comment
Labels: array

#11182 - Ensure we test against numpy 2 in CI

Pull Request - State: closed - Opened by jrbourbeau 3 months ago - 3 comments

#11155 - 'SeriesGroupBy' object has no attribute 'nunique_approx'

Issue - State: closed - Opened by LeilaGold 4 months ago - 7 comments
Labels: needs info

#11128 - Fix map_overlap with new_axis

Pull Request - State: closed - Opened by dstansby 4 months ago - 4 comments

#11124 - Overlap with `new_axis` option is not trimmed correctly

Issue - State: closed - Opened by chourroutm 5 months ago - 1 comment
Labels: needs triage

#11111 - ⚠️ Upstream CI failed ⚠️

Issue - State: closed - Opened by github-actions[bot] 5 months ago - 1 comment
Labels: upstream

#11075 - Add Scheduling section to DataFrame best practices

Pull Request - State: open - Opened by phofl 5 months ago - 1 comment

#11055 - Substantial memory usage in dask.order

Issue - State: closed - Opened by fjetter 5 months ago - 4 comments
Labels: needs triage

#11040 - Add lazy "cudf" registration for p2p-related dispatch functions

Pull Request - State: closed - Opened by rjzamora 6 months ago - 4 comments
Labels: bug

#11039 - Make python 3.11.9 fix a bit safer

Pull Request - State: closed - Opened by rjzamora 6 months ago - 2 comments
Labels: dataframe

#11038 - Incompatibility with python 3.11.9

Issue - State: closed - Opened by briceruzand 6 months ago - 2 comments
Labels: dataframe, needs triage

#11036 - Remove skips for named aggregations

Pull Request - State: closed - Opened by phofl 6 months ago - 1 comment

#11035 - Fix ``dask.dataframe`` import error for Python 3.11.9

Pull Request - State: closed - Opened by rjzamora 6 months ago - 7 comments
Labels: bug

#11034 - dask-expr with drop_duplicates messes with dtypes

Issue - State: closed - Opened by aimran-adroll 6 months ago - 4 comments
Labels: needs triage

#11033 - Column aggregation with "list" produces incorrect output

Issue - State: closed - Opened by aimran-adroll 6 months ago - 2 comments
Labels: needs triage

#11031 - Print functions are wrong inside of map_blocks

Issue - State: closed - Opened by leo333000 6 months ago - 2 comments
Labels: array, needs triage

#11030 - Does not work with AWS - aiobotocore related error

Issue - State: closed - Opened by openSourcerer9000 6 months ago - 2 comments
Labels: needs triage

#11029 - Adjust `test_set_index` for "cudf" backend

Pull Request - State: closed - Opened by rjzamora 6 months ago - 2 comments

#11028 - Remove xfail tracebacks from testsuite

Pull Request - State: closed - Opened by phofl 6 months ago - 1 comment

#11027 - Fix ci for upstream pandas changes

Pull Request - State: closed - Opened by phofl 6 months ago - 1 comment

#11026 - Poor scheduling with `flox`, leading to high memory usage and eventual failure

Issue - State: open - Opened by ivirshup 6 months ago - 7 comments
Labels: needs triage

#11025 - Use ``to/from_legacy_dataframe`` instead of ``to/from_dask_dataframe``

Pull Request - State: closed - Opened by rjzamora 6 months ago - 2 comments
Labels: dataframe, dask-expr

#11024 - Friendly import error message for dask-expr

Pull Request - State: closed - Opened by benrutter 6 months ago - 5 comments

#11023 - Fix value_counts raising if branch exists of nans only

Pull Request - State: closed - Opened by phofl 6 months ago - 1 comment

#11021 - Preserving divisions when reading/loading dataframes with structs containing multiple fields

Issue - State: open - Opened by PhilippeMoussalli 6 months ago - 1 comment
Labels: dataframe, io

#11019 - Hash join transfer with error cannot pickle '_contextvars.ContextVar' object

Issue - State: open - Opened by guozhans 6 months ago - 5 comments
Labels: dataframe, p2

#11018 - `vindex` as outer indexer: memory and time performance

Issue - State: open - Opened by ilan-gold 6 months ago
Labels: array, needs triage

#11017 - ``new_dd_object``'s array logic always assumes the metadata is ``numpy``

Issue - State: open - Opened by rjzamora 6 months ago
Labels: dataframe, array

#11016 - Minimal dd.to_datetime to convert a string column no longer works

Issue - State: closed - Opened by benrutter 6 months ago
Labels: needs triage

#11015 - .loc fails to select columns from boolean array (after dask-expr update)

Issue - State: closed - Opened by benrutter 6 months ago
Labels: needs triage

#11014 - Build nightlies on tag releases

Pull Request - State: closed - Opened by charlesbluca 6 months ago - 1 comment

#11013 - Enable custom expressions in ``dask_cudf``

Pull Request - State: closed - Opened by rjzamora 6 months ago - 1 comment

#11012 - [Docs] Add Hugging Face `hf://` to the list of `fsspec` compatible remote services

Pull Request - State: closed - Opened by lhoestq 6 months ago - 6 comments
Labels: documentation

#11011 - value_counts with NaN sometimes raises ValueError: No objects to concatenate

Issue - State: closed - Opened by m-rossi 6 months ago - 2 comments
Labels: needs triage

#11010 - Update gpuCI `RAPIDS_VER` to `24.06`

Pull Request - State: open - Opened by github-actions[bot] 6 months ago - 1 comment

#11009 - Bump actions/checkout from 4.1.1 to 4.1.2

Pull Request - State: closed - Opened by dependabot[bot] 6 months ago - 1 comment
Labels: dependencies

#11008 - Add HyperSpy to ecosystem.rst

Pull Request - State: closed - Opened by jlaehne 6 months ago - 2 comments
Labels: documentation

#11007 - raise ImportError instead of ValueError when dask-expr cannot be imported

Pull Request - State: closed - Opened by jameslamb 7 months ago - 1 comment

#11006 - as of v2024.3.1, comparing a 1D dask.array.Array to a dask.dataframe.Series fails

Issue - State: closed - Opened by jameslamb 7 months ago - 1 comment
Labels: bug, dask-expr

#11005 - dask.dataframe.DataFrame.reduction fails on `split_every=False` if query planning is in effect

Issue - State: closed - Opened by cbourjau 7 months ago - 1 comment
Labels: needs triage

#11004 - Ensure that repack collections only return tuple if necessary

Pull Request - State: open - Opened by fjetter 7 months ago - 3 comments

#11003 - Only warn if dask-expr is not installed

Pull Request - State: closed - Opened by fjetter 7 months ago - 1 comment

#11002 - Dataframe constructed from single partition bag cannot be shuffled with query planning enabled

Issue - State: closed - Opened by b-phi 7 months ago - 2 comments
Labels: bug, dask-expr

#11001 - Dask query planning string column unique bug

Issue - State: closed - Opened by b-phi 7 months ago - 2 comments
Labels: needs triage

#11000 - dask.dataframe.Series.reduction is not available when using query planning

Issue - State: closed - Opened by cbourjau 7 months ago - 4 comments
Labels: bug, dask-expr

#10999 - TypeError: float() argument must be a string or a real number, not 'csr_matrix'

Issue - State: closed - Opened by erico-imgproj 7 months ago - 1 comment
Labels: needs triage

#10998 - dask.bag.Bag.to_dataframe behavior change in 2024.3.0 - setting dtype to string rather than object by default

Issue - State: open - Opened by kbuma 7 months ago - 4 comments
Labels: dataframe, convert-string

#10997 - Dumb code error in the Example code in Dask-SQL Homepage

Issue - State: closed - Opened by tiraldj 7 months ago - 3 comments
Labels: needs triage

#10996 - importing dask.dataframe changes pandas behaviour in 2024.3.0

Issue - State: closed - Opened by ivirshup 7 months ago - 11 comments
Labels: dask-expr

#10995 - Feedback - DataFrame query planning

Issue - State: open - Opened by fjetter 7 months ago - 7 comments
Labels: dataframe, discussion, dask-expr

#10992 - Implement setting config variables that contain the dot in name

Pull Request - State: open - Opened by dbalabka 7 months ago - 2 comments

#10991 - Combined save and calculation is using excessive memory

Issue - State: open - Opened by pp-mo 7 months ago - 3 comments
Labels: needs triage

#10986 - CI is printing tracebacks for all xfailed tests which can be very confusing

Issue - State: closed - Opened by phofl 7 months ago
Labels: needs triage

#10982 - Dask Nunique bug under dask 2024.2.1

Issue - State: open - Opened by frbelotto 7 months ago - 7 comments
Labels: dataframe

#10962 - Drop pandas 1.X support?

Issue - State: open - Opened by fjetter 7 months ago - 1 comment
Labels: dataframe, discussion

#10951 - UnicodeDecodeError when using a Dataframe with byte data and pandas 2

Issue - State: closed - Opened by danmar3 7 months ago - 2 comments
Labels: needs triage

#10949 - Issue repartitioning a time series by frequency when loaded from parquet file

Issue - State: open - Opened by pvaezi 7 months ago - 5 comments
Labels: dataframe

#10906 - Rename futures to tasks

Pull Request - State: open - Opened by milesgranger 8 months ago - 1 comment

#10896 - [DNM] Test numba tokenization

Pull Request - State: open - Opened by crusaderky 8 months ago

#10895 - Bump codecov/codecov-action from 3 to 4

Pull Request - State: open - Opened by dependabot[bot] 8 months ago - 1 comment
Labels: dependencies

#10894 - Bump peter-evans/create-pull-request from 5 to 6

Pull Request - State: open - Opened by dependabot[bot] 8 months ago - 1 comment
Labels: dependencies

#10893 - Test against pandas 2.0

Pull Request - State: open - Opened by crusaderky 8 months ago - 2 comments