Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / NVIDIA/spark-rapids issues and pull requests

#11249 - Release Checklist v24.10

Issue - State: open - Opened by caryr35 2 months ago
Labels: documentation

#11239 - [FEA] Big Data Fingerprinting tool

Issue - State: open - Opened by revans2 2 months ago - 3 comments
Labels: feature request

#11174 - Opcode Suite fails for Scala 2.13.8+

Issue - State: open - Opened by razajafri 3 months ago - 2 comments
Labels: bug

#11015 - Fix tests failures in parquet_test.py

Issue - State: open - Opened by razajafri 4 months ago - 2 comments
Labels: bug, Spark 4.0+

#11004 - Fix test failures for Spark 4.0.0

Issue - State: open - Opened by razajafri 4 months ago
Labels: bug, Spark 4.0+

#10968 - [FEA] support min_by function

Issue - State: closed - Opened by nvliyuan 4 months ago - 2 comments
Labels: feature request

#10955 - [FEA] Remove support for Spark 3.1.x

Issue - State: open - Opened by sameerz 4 months ago - 3 comments
Labels: feature request

#10922 - from_json cannot support line separator in the input string.

Issue - State: open - Opened by Feng-Jiang28 4 months ago
Labels: bug

#10911 - from_json: when input is a bad json string, rapids would throw an exception.

Issue - State: closed - Opened by Feng-Jiang28 4 months ago - 1 comment
Labels: bug

#10898 - [BUG] test_from_json_struct_decimal integration test failed with DATAGEN_SEED=1716745329

Issue - State: open - Opened by nartal1 4 months ago - 2 comments
Labels: bug

#10856 - Release Checklist v24.08

Issue - State: closed - Opened by caryr35 5 months ago - 3 comments
Labels: documentation

#10814 - [FEA] Use `cudf::io::config_host_memory_resource`

Issue - State: closed - Opened by abellina 5 months ago - 1 comment
Labels: task

#10798 - Optimizing Expand+Aggregate in sqls with many count distinct

Pull Request - State: closed - Opened by binmahone 5 months ago - 14 comments
Labels: performance

#10798 - Optimizing Expand+Aggregate in sqls with many count distinct [WIP]

Pull Request - State: open - Opened by binmahone 5 months ago - 10 comments
Labels: performance

#10770 - [BUG] Slow/no progress with cascaded pandas udfs/mapInPandas in Databricks

Issue - State: closed - Opened by eordentlich 5 months ago - 8 comments
Labels: bug

#10687 - [AUDIT] [SPARK-46832] The new expressions Collate and Collation can change StringType

Issue - State: open - Opened by revans2 6 months ago - 3 comments
Labels: audit_4.0.0

#10661 - [FEA] Add support for Databricks 14.3 ML LTS

Issue - State: open - Opened by sameerz 6 months ago
Labels: feature request

#10534 - [BUG] Need Improved JSON Validation

Issue - State: closed - Opened by revans2 7 months ago - 1 comment
Labels: bug, cudf_dependency

#10479 - [BUG] JsonToStructs and ScanJson should return null for non-numeric, non-boolean non-quoted strings

Issue - State: closed - Opened by revans2 7 months ago - 2 comments
Labels: bug, cudf_dependency

#10457 - [BUG] ScanJson and JsonToStructs allow unquoted control chars by default

Issue - State: closed - Opened by revans2 8 months ago
Labels: bug

#10437 - [FEA] Add Spark 3.5.2 snapshot support

Issue - State: closed - Opened by tgravescs 8 months ago - 5 comments
Labels: feature request

#10254 - [FEA] Fix GetJsonObject

Issue - State: open - Opened by revans2 8 months ago - 2 comments
Labels: epic

#10231 - Improve the Maven distro download workaround

Issue - State: closed - Opened by gerashegalov 9 months ago
Labels: good first issue, build

#9393 - [BUG] failed to build integration images due to mamba solver incompatible issue

Issue - State: closed - Opened by pxLi 12 months ago - 1 comment
Labels: bug, build

#8587 - Improve Databricks runtime shim detection

Issue - State: closed - Opened by gerashegalov over 1 year ago - 3 comments
Labels: task, improve

#8558 - [BUG] `from_json` generated inconsistent result comparing with CPU for input column with nested json strings

Issue - State: closed - Opened by cindyyuanjiang over 1 year ago - 11 comments
Labels: bug, cudf_dependency

#8317 - Fixed peak device memory metrics doesn't work in parquet coalesce reading

Pull Request - State: open - Opened by thirtiseven over 1 year ago - 4 comments

#8316 - [EPIC] Spark 3.4 Remaining Functionality

Issue - State: open - Opened by NVnavkumar over 1 year ago
Labels: feature request, ? - Needs Triage, epic

#8315 - [FEA] Support Retry for GpuBaseLimitExec

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage, reliability

#8314 - [FEA] SupportSplitAndRetry for GpuRangeExec

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage, reliability

#8313 - [FEA] Support SplitAndRetry for GpuFastSampleExec

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage, reliability

#8312 - [FEA] Support SplitAndRetry for GpuSampleExec

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage

#8311 - [FEA] Support Split And Retry for GpuProjectAstExec

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage, reliability

#8310 - [FEA] Support Split and Retry for GpuTopN

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, ? - Needs Triage, reliability

#8309 - Fix GpuTopN with offset for multiple batches

Pull Request - State: open - Opened by revans2 over 1 year ago - 1 comment
Labels: bug, Spark 3.4+

#8308 - [BUG] Device Memory leak seen in integration_tests when AssertEmptyNulls are enabled

Issue - State: open - Opened by razajafri over 1 year ago
Labels: bug, ? - Needs Triage

#8307 - Full ordinal support in GetArrayItem

Pull Request - State: open - Opened by revans2 over 1 year ago - 4 comments

#8306 - Update code to deal with new retry semantics

Pull Request - State: open - Opened by revans2 over 1 year ago - 2 comments
Labels: reliability

#8305 - [FEA] Remove referenced weak reference in JNI MemoryCleaner as soon as possible to save memory

Issue - State: open - Opened by res-life over 1 year ago - 2 comments
Labels: feature request, ? - Needs Triage, reliability

#8304 - Support combining small files for multi-threaded ORC reads

Pull Request - State: open - Opened by firestarman over 1 year ago - 4 comments

#8303 - [BUG] GpuExpression columnarEval can return scalars from subqueries that may be unhandled

Issue - State: open - Opened by jlowe over 1 year ago - 3 comments
Labels: bug, ? - Needs Triage

#8302 - Add support for DecimalType in Remainder for Spark 3.4 and DB 11.3 [databricks]

Pull Request - State: open - Opened by NVnavkumar over 1 year ago - 1 comment
Labels: bug, Spark 3.4+

#8301 - [FEA] semaphore prioritization

Issue - State: open - Opened by abellina over 1 year ago
Labels: feature request, performance, reliability

#8300 - [FEA] Detect app host memory misconfiguration

Issue - State: open - Opened by abellina over 1 year ago
Labels: feature request, reliability

#8298 - Append new authorized user to blossom-ci whitelist [skip ci]

Pull Request - State: closed - Opened by thirtiseven over 1 year ago - 1 comment
Labels: build

#8297 - [BUG] The metrics of "peak device memory" in Gpu parquet scan doesn't work when using coalesce reading

Issue - State: open - Opened by GaryShen2008 over 1 year ago - 3 comments
Labels: bug, ? - Needs Triage

#8296 - Fix Multithreaded Readers working with Unity Catalog on Databricks [databricks]

Pull Request - State: closed - Opened by tgravescs over 1 year ago - 6 comments
Labels: bug

#8295 - Fix ORC reader for `CHAR(N)` columns written from Hive

Pull Request - State: closed - Opened by mythrocks over 1 year ago - 5 comments
Labels: bug

#8294 - [BUG] ORC `CHAR(N)` columns written from Hive unreadable with RAPIDS plugin

Issue - State: closed - Opened by mythrocks over 1 year ago
Labels: bug

#8292 - [FEA] multi-threaded shuffle above 200 partitions

Issue - State: open - Opened by revans2 over 1 year ago
Labels: feature request, performance

#8291 - Fix delta stats tracker conf [databricks]

Pull Request - State: closed - Opened by jlowe over 1 year ago - 1 comment
Labels: bug

#8290 - Pre-merge docker build stage to support containerd runtime [skip ci]

Pull Request - State: closed - Opened by pxLi over 1 year ago - 1 comment
Labels: build

#8289 - Improve regex test readability by using raw literals to reduce escape `\` usage

Issue - State: open - Opened by gerashegalov over 1 year ago - 1 comment
Labels: test, task

#8288 - Enable Spark-3.4 build & unit test in pre-merge.

Issue - State: closed - Opened by NVnavkumar over 1 year ago
Labels: build

#8287 - Fix Delta write stats if data schema is missing columns relative to table schema [databricks]

Pull Request - State: closed - Opened by jlowe over 1 year ago - 2 comments
Labels: bug

#8286 - Add Tencent cosn:// to default cloud schemes

Pull Request - State: closed - Opened by tgravescs over 1 year ago - 3 comments
Labels: feature request

#8285 - [DOC] Feedback for qualification tool documentation

Issue - State: open - Opened by NVnavkumar over 1 year ago
Labels: documentation, ? - Needs Triage, tools

#8284 - [FEA] Look into running some/all of our integration tests distributed on databricks

Issue - State: open - Opened by revans2 over 1 year ago - 1 comment
Labels: feature request, ? - Needs Triage

#8284 - [FEA] Look into running some/all of our integration tests distributed on databricks

Issue - State: open - Opened by revans2 over 1 year ago - 2 comments
Labels: feature request, ? - Needs Triage

#8283 - Add split and retry support for filter [databricks]

Pull Request - State: closed - Opened by revans2 over 1 year ago - 7 comments
Labels: reliability

#8282 - WIP: Add 332db shim

Pull Request - State: open - Opened by andygrove over 1 year ago - 5 comments
Labels: feature request

#8281 - [BUG] ParquetCachedBatchSerializer is crashing on count

Issue - State: open - Opened by razajafri over 1 year ago
Labels: bug, ? - Needs Triage

#8281 - [BUG] ParquetCachedBatchSerializer is crashing on count

Issue - State: open - Opened by razajafri over 1 year ago
Labels: bug

#8280 - [DOC] Suggestions for getting-started-on-prem.md

Issue - State: open - Opened by nartal1 over 1 year ago
Labels: documentation, ? - Needs Triage

#8280 - [DOC] Suggestions for getting-started-on-prem.md

Issue - State: open - Opened by nartal1 over 1 year ago
Labels: documentation

#8279 - [BUG] GpuDynamicPartitionDataSingleWriter can call closeFile multiple times for the same file

Issue - State: open - Opened by jlowe over 1 year ago
Labels: bug, ? - Needs Triage

#8278 - [BUG] NDS query 16 hangs at SF30K

Issue - State: open - Opened by mattahrens over 1 year ago
Labels: bug, reliability

#8277 - [BUG] unhelpful branch key in the version info for builds in the jar

Issue - State: open - Opened by gerashegalov over 1 year ago
Labels: bug, build

#8276 - Fallback to CPU for `enableDateTimeParsingFallback` configuration

Pull Request - State: closed - Opened by rwlee over 1 year ago - 2 comments
Labels: Spark 3.4+

#8276 - Fallback to CPU for `enableDateTimeParsingFallback` configuration

Pull Request - State: open - Opened by rwlee over 1 year ago - 1 comment
Labels: Spark 3.4+

#8275 - [FEA] use matrix to combine multiple jdk* jobs in maven-verify CI

Issue - State: open - Opened by pxLi over 1 year ago
Labels: test, build

#8274 - Add a unit test for reordered canonicalized expressions in BinaryComparison

Pull Request - State: closed - Opened by NVnavkumar over 1 year ago - 2 comments
Labels: spark 3.3+, Spark 3.4+

#8273 - Add support for escaped dot in character class in regexp parser

Pull Request - State: closed - Opened by andygrove over 1 year ago - 4 comments
Labels: feature request

#8271 - Added documentation for assertion on non-empty nulls [skip ci]

Pull Request - State: open - Opened by razajafri over 1 year ago
Labels: documentation

#8269 - [CI] It is not obvious where to find hs_err_pid files in CI

Issue - State: open - Opened by abellina over 1 year ago - 2 comments
Labels: test, reliability

#8268 - [BUG] NDS query 16 fails on EMR 6.10 with java.lang.ClassCastException

Issue - State: open - Opened by mattahrens over 1 year ago - 5 comments
Labels: bug

#8267 - [Doc] Update gh-pages branch for 23.04.1 patch release [skip ci]

Pull Request - State: closed - Opened by nvliyuan over 1 year ago - 2 comments
Labels: documentation

#8266 - Add test to confirm correct behavior for decimal average in Spark 3.4

Pull Request - State: closed - Opened by andygrove over 1 year ago - 4 comments
Labels: test

#8265 - Small code cleanup for pattern matching on Decimal type

Pull Request - State: closed - Opened by andygrove over 1 year ago - 1 comment
Labels: task

#8264 - Make tables spillable by default

Pull Request - State: open - Opened by abellina over 1 year ago - 2 comments
Labels: performance, reliability

#8262 - [FEA] Implement SpillableTable

Issue - State: open - Opened by andygrove over 1 year ago
Labels: feature request, reliability

#8261 - Generate markdown for filecache.checkStale

Pull Request - State: closed - Opened by gerashegalov over 1 year ago - 3 comments
Labels: documentation, build