Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / dask/dask issues and pull requests

#11354 - Improve normalize_chunks calculation for "auto" setting

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11353 - Futures not always resolved when using dataframe.reduction

Issue - State: open - Opened by DaniJG 3 months ago
Labels: needs triage, dask-expr

#11352 - Parquet read with `filesystem="arrow"` fails when `distributed` isn't imported first

Issue - State: open - Opened by rjzamora 3 months ago - 3 comments
Labels: bug, dask-expr

#11351 - Deprecate legacy DataFrame implementation

Issue - State: closed - Opened by phofl 3 months ago - 1 comment
Labels: dataframe, deprecation

#11350 - Add changelog entries for shuffle, vindex and blockwise_reshape

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment

#11349 - map_overlap passes wrong block_info[:]['array-location']

Issue - State: open - Opened by bnavigator 3 months ago - 2 comments
Labels: array

#11348 - Ensure persisted collections are released without GC

Pull Request - State: closed - Opened by fjetter 3 months ago - 2 comments

#11347 - Full support for task spec in dask.order

Pull Request - State: closed - Opened by fjetter 3 months ago - 1 comment

#11346 - KilledWorker (exceeded 95% memory budget) with new optimizer

Issue - State: open - Opened by noreentry 3 months ago - 5 comments
Labels: needs triage

#11345 - Increase visibility of GPU CI updates

Pull Request - State: closed - Opened by charlesbluca 3 months ago - 1 comment

#11344 - Weird RecursionError during `tokenize`

Issue - State: closed - Opened by hanjinliu 3 months ago - 5 comments
Labels: needs info

#11343 - Bug: Can't perform a (meaningful) "outer" concatenation with dask-expr on `axis=1`

Issue - State: closed - Opened by benrutter 3 months ago - 1 comment
Labels: dask-expr

#11342 - Better chunk size value for chunks=auto setting

Issue - State: open - Opened by phofl 3 months ago
Labels: array

#11341 - Improve how normalize_chunks selects chunk sizes if auto is given

Issue - State: open - Opened by phofl 3 months ago - 7 comments
Labels: array

#11340 - Update ``numpy`` and ``pyarrow`` versions in install docs

Pull Request - State: closed - Opened by jrbourbeau 3 months ago

#11339 - Suggesting updates on the doc of `dask.dataframe.read_sql_query`

Issue - State: open - Opened by ParsifalXu 3 months ago - 2 comments
Labels: dataframe, documentation

#11338 - Fixup dask and distributed dependencies

Pull Request - State: closed - Opened by phofl 3 months ago

#11337 - Choose automatically between tasks-based and p2p rechunking

Pull Request - State: closed - Opened by hendrikmakait 3 months ago - 7 comments

#11336 - An inconsistency between the documentation of `dask.array.percentile` and code implementation

Issue - State: open - Opened by ParsifalXu 3 months ago - 2 comments
Labels: array, documentation

#11335 - Add ``crick`` back to Python 3.11+ CI builds

Pull Request - State: closed - Opened by jrbourbeau 3 months ago - 2 comments

#11334 - gpuCI failing

Issue - State: closed - Opened by jrbourbeau 3 months ago - 3 comments
Labels: tests, gpu

#11332 - Fix docstring formatting for map_overlap

Pull Request - State: closed - Opened by Tao-VanJS 3 months ago - 3 comments

#11331 - Bump `numpy>=1.24` and `pyarrow>=14.0.1` minimum versions

Pull Request - State: closed - Opened by jrbourbeau 3 months ago - 4 comments

#11330 - Preserve chunksizes in vindex

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11329 - `map_blocks()` with `new_axis` output has incorrect shape

Issue - State: closed - Opened by dstansby 3 months ago - 5 comments
Labels: array

#11328 - Implement blockwise reshape

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11327 - Fix NumPy overflowing for prod on 2.0

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11326 - Make rechunking in shuffle more intelligent to distribute unevenly if necessary

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment

#11325 - read_sql_table would throw an exception when calling for unique values of a column

Issue - State: closed - Opened by phalvesmbai 3 months ago
Labels: dataframe, io

#11324 - Add changelog entry for reshape and ordering improvements

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11323 - Bump mindeps for pyarrow and numpy

Issue - State: closed - Opened by fjetter 3 months ago - 3 comments
Labels: needs triage

#11322 - Avoid casting arrow dtypes to numpy object for tokenize

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11321 - Revert "Test ordering on distributed scheduler (#11310)"

Pull Request - State: closed - Opened by fjetter 3 months ago - 2 comments

#11320 - Ensure pickle does not change tokens

Pull Request - State: closed - Opened by fjetter 3 months ago - 12 comments

#11319 - Pass additional parameters to `rechunk_p2p`

Pull Request - State: closed - Opened by hendrikmakait 3 months ago - 1 comment

#11318 - cannot access local variable 'divisions' where it is not associated with a value

Issue - State: closed - Opened by Cognitus-Stuti 3 months ago - 1 comment
Labels: needs triage

#11317 - Rename chunksize-tolerance option

Pull Request - State: closed - Opened by phofl 3 months ago - 5 comments

#11316 - Requested dask.distributed scheduler but no Client active

Issue - State: open - Opened by Cognitus-Stuti 3 months ago - 6 comments
Labels: needs info

#11315 - ⚠️ Upstream CI failed ⚠️

Issue - State: closed - Opened by github-actions[bot] 3 months ago - 3 comments
Labels: upstream

#11313 - Add tests to cover more cases of new reshape implementation

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment

#11312 - gpuCI broken

Issue - State: open - Opened by fjetter 3 months ago - 7 comments
Labels: needs triage

#11311 - Implement automatic rechunking for shuffle

Pull Request - State: closed - Opened by phofl 3 months ago - 2 comments

#11310 - Test ordering on distributed scheduler

Pull Request - State: closed - Opened by fjetter 3 months ago - 3 comments

#11309 - Upgrade gpuCI and fix Dask Array failures with "cupy" backend

Pull Request - State: closed - Opened by rjzamora 3 months ago - 3 comments
Labels: array, bug, gpu

#11308 - Unexpected Behavior When Using `dask.delayed` with `xarray` to Load a Chunked Dataset

Issue - State: open - Opened by Eis-ba-er 3 months ago - 3 comments
Labels: needs triage

#11307 - Out of memory

Issue - State: open - Opened by dbalabka 3 months ago - 12 comments
Labels: dataframe, dask-expr

#11306 - ⚠️ Upstream CI failed ⚠️

Issue - State: closed - Opened by github-actions[bot] 3 months ago - 1 comment
Labels: upstream

#11305 - order: ensure runnable tasks are certainly runnable

Pull Request - State: closed - Opened by fjetter 3 months ago - 2 comments

#11304 - Fix upstream numpy build

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment
Labels: upstream

#11303 - order: Choose better target for branches with multiple leaf nodes

Pull Request - State: closed - Opened by phofl 3 months ago - 3 comments

#11302 - Columns are missing after rename

Issue - State: closed - Opened by guozhans 3 months ago - 2 comments
Labels: needs info, dask-expr

#11301 - Enable slicing with only one unknown chunk

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment

#11300 - Fix slicing for masked arrays

Pull Request - State: closed - Opened by phofl 3 months ago - 3 comments

#11299 - DataFrame object vs string datatype

Issue - State: closed - Opened by FBruzzesi 3 months ago - 3 comments
Labels: needs triage

#11298 - Dask 2024.8 started failing when indexing result of numpy.flatnonzero

Issue - State: closed - Opened by lagru 3 months ago - 15 comments
Labels: array

#11296 - `2024.08.0` array slicing does not preserve masks

Issue - State: closed - Opened by rcomer 3 months ago - 1 comment
Labels: array, needs triage

#11295 - Error in addition of dask dataframe and array when reading from parquet

Issue - State: open - Opened by kbbat 3 months ago - 3 comments
Labels: dask-expr, array-expr

#11294 - Prune conda nightlies on release

Pull Request - State: open - Opened by charlesbluca 3 months ago - 2 comments

#11293 - Error when saving joined dataframe with index name to parquet

Issue - State: closed - Opened by joshua-gould 3 months ago
Labels: dataframe

#11291 - Make shuffle a no-op if possible

Pull Request - State: closed - Opened by phofl 3 months ago - 1 comment

#11290 - `2024.8.0` breaks `sparse` array indexing

Issue - State: closed - Opened by ilan-gold 3 months ago - 13 comments

#11289 - Link to dask vs spark benchmarks on dask docs

Pull Request - State: closed - Opened by scharlottej13 3 months ago - 2 comments

#11288 - array: fix `asarray` for array input with `dtype`

Pull Request - State: closed - Opened by lucascolley 3 months ago - 7 comments

#11285 - BUG: `array.asarray` does not respect `dtype` arg

Issue - State: closed - Opened by lucascolley 3 months ago - 1 comment
Labels: array

#11284 - Deprecate split-large-chunks option

Pull Request - State: closed - Opened by phofl 3 months ago - 6 comments

#11282 - Automatically rechunk in array-shuffle if groups are too large

Issue - State: closed - Opened by phofl 3 months ago
Labels: array

#11281 - Ensure that array-shuffle with range-like full indexer is a no-op

Issue - State: closed - Opened by phofl 3 months ago
Labels: array

#11273 - Keep chunksize consistent in reshape

Pull Request - State: closed - Opened by phofl 3 months ago - 4 comments

#11271 - Add more docstring examples for ``normalize_chunks``

Pull Request - State: closed - Opened by Illviljan 4 months ago - 1 comment

#11266 - Dask indexing problem with cupy

Issue - State: open - Opened by miguelcarcamov 4 months ago - 5 comments
Labels: array, bug, gpu

#11248 - Add a Task class to replace tuples for task specification

Pull Request - State: closed - Opened by fjetter 4 months ago - 9 comments

#11242 - Update gpuCI `RAPIDS_VER` to `24.10`

Pull Request - State: closed - Opened by github-actions[bot] 4 months ago - 5 comments

#11240 - Optimizer applies parquet `filters` after loading when using `read_parquet(...).map_partitions(...).compute()`

Issue - State: open - Opened by Timost 4 months ago - 1 comment
Labels: needs info, parquet

#11239 - `read_sql_table` no longer sets index name in the resulting ddf meta

Issue - State: closed - Opened by Timost 4 months ago - 2 comments
Labels: dataframe, needs info

#11235 - Pyarrow <NA> filters are not being applied in read_parquet

Issue - State: closed - Opened by benrutter 4 months ago - 3 comments
Labels: dataframe, parquet

#11234 - Array slicing is using low level materialized graphs

Issue - State: closed - Opened by fjetter 4 months ago - 2 comments
Labels: array, highlevelgraph, dask-expr

#11224 - Fix ambiguous errors in sqlalchemy statements with join

Pull Request - State: open - Opened by semohr 4 months ago - 2 comments

#11220 - Handle np.frombuffer

Issue - State: closed - Opened by anruijian 4 months ago - 3 comments
Labels: needs triage

#11217 - Add array annotations

Pull Request - State: open - Opened by jason-trinidad 4 months ago - 5 comments

#11188 - Bug in map_blocks when iterating over multiple arrays

Issue - State: open - Opened by astrofrog 5 months ago - 2 comments
Labels: array

#11182 - Ensure we test against numpy 2 in CI

Pull Request - State: closed - Opened by jrbourbeau 5 months ago - 3 comments

#11155 - 'SeriesGroupBy' object has no attribute 'nunique_approx'

Issue - State: closed - Opened by LeilaGold 6 months ago - 7 comments
Labels: needs info

#11146 - Dask 2024.5.1 removed `.attrs`

Issue - State: open - Opened by LucaMarconato 6 months ago - 13 comments
Labels: needs triage

#11145 - Concat with unknown divisions raises TypeError

Issue - State: closed - Opened by manschoe 6 months ago - 3 comments
Labels: needs triage

#11128 - Fix map_overlap with new_axis

Pull Request - State: closed - Opened by dstansby 6 months ago - 4 comments

#11124 - Overlap with `new_axis` option is not trimmed correctly

Issue - State: closed - Opened by chourroutm 6 months ago - 1 comment
Labels: needs triage

#11118 - Fix na casting behavior for groupby.agg with arrow dtypes

Pull Request - State: closed - Opened by phofl 6 months ago - 1 comment

#11116 - When using PyArrow dtypes, aggregations create NaNs of unexpected type

Issue - State: closed - Opened by nprihodko 6 months ago - 1 comment
Labels: needs triage

#11111 - ⚠️ Upstream CI failed ⚠️

Issue - State: closed - Opened by github-actions[bot] 6 months ago - 1 comment
Labels: upstream

#11075 - Add Scheduling section to DataFrame best practices

Pull Request - State: open - Opened by phofl 7 months ago - 1 comment

#11070 - Turning off query planning is difficult

Issue - State: closed - Opened by jtilly 7 months ago - 5 comments
Labels: needs triage

#11069 - Backport Python import error patch to 2024.2.1

Issue - State: closed - Opened by jtilly 7 months ago - 1 comment
Labels: needs triage

#11055 - Substantial memory usage in dask.order

Issue - State: closed - Opened by fjetter 7 months ago - 4 comments
Labels: needs triage

#11040 - Add lazy "cudf" registration for p2p-related dispatch functions

Pull Request - State: closed - Opened by rjzamora 8 months ago - 4 comments
Labels: bug

#11039 - Make python 3.11.9 fix a bit safer

Pull Request - State: closed - Opened by rjzamora 8 months ago - 2 comments
Labels: dataframe