Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / ray-project/deltacat issues and pull requests

#160 - Add pytest benchmarking for Parquet reads

Pull Request - State: closed - Opened by jaychia over 1 year ago

#159 - Move s3_client_kwargs default setter to parent scope

Pull Request - State: closed - Opened by rkenmi over 1 year ago

#158 - Allow s3_client_kwargs to be passed into repartition

Pull Request - State: closed - Opened by rkenmi over 1 year ago

#157 - Honor profile name in s3 client kwargs

Pull Request - State: closed - Opened by raghumdani over 1 year ago

#156 - Pre-release version bump to 0.1.18.b9

Pull Request - State: closed - Opened by pfaraone over 1 year ago

#155 - Allow s3 client kwargs as argument of compact_partition

Pull Request - State: closed - Opened by raghumdani over 1 year ago

#154 - add dd parallelism params

Pull Request - State: closed - Opened by valiantljk over 1 year ago

#152 - RetryHandlerSkeleton

Pull Request - State: open - Opened by ekaschaw over 1 year ago - 1 comment

#151 - Adding a data model to represent compact_partition function parameters

Pull Request - State: closed - Opened by pfaraone over 1 year ago
Labels: enhancement

#149 - Refactoring the compact_partition to support different object store implementations

Pull Request - State: closed - Opened by raghumdani over 1 year ago - 5 comments

#146 - Capturing all the performance metrics in an audit

Pull Request - State: closed - Opened by raghumdani over 1 year ago - 1 comment

#144 - Integration tests for materialize: skip untouched files improvement

Issue - State: closed - Opened by Zyiqin-Miranda over 1 year ago - 1 comment

#143 - remove dependency on rebase_source_partition

Pull Request - State: closed - Opened by valiantljk over 1 year ago

#142 - Logging memory consumed to validate worker estimation correctness

Pull Request - State: closed - Opened by raghumdani over 1 year ago - 1 comment

#141 - Log the memory consumption to calculate correctness of worker estimation

Issue - State: closed - Opened by raghumdani over 1 year ago - 1 comment

#140 - Always persist a high watermark for source table

Issue - State: open - Opened by valiantljk over 1 year ago - 2 comments

#139 - Setting hash bucket count based on POC results

Pull Request - State: closed - Opened by raghumdani over 1 year ago

#137 - [skip untouched files]Enable skipping untouched files during materialize

Pull Request - State: closed - Opened by Zyiqin-Miranda over 1 year ago - 6 comments

#136 - Unittest for Repartition

Pull Request - State: closed - Opened by valiantljk over 1 year ago - 1 comment

#135 - Replace DistributedDataset with Ray Dataset

Issue - State: open - Opened by valiantljk over 1 year ago

#134 - Add unit test for repartition API

Issue - State: closed - Opened by valiantljk over 1 year ago

#133 - Refactor the existing discover_delta to handle different cases

Issue - State: closed - Opened by valiantljk over 1 year ago - 1 comment

#132 - Repartition

Pull Request - State: closed - Opened by valiantljk over 1 year ago - 1 comment

#131 - Read Iceberg to DeltacatDataset

Pull Request - State: closed - Opened by JonasJ-ap over 1 year ago - 1 comment

#130 - Fix the stream position of the delta

Pull Request - State: closed - Opened by raghumdani almost 2 years ago - 2 comments

#129 - Update hash bucket and dedupe result dataclasses to tuple type

Pull Request - State: closed - Opened by rkenmi almost 2 years ago - 1 comment

#128 - Use NumPy return types for dedupe tasks

Issue - State: open - Opened by rkenmi almost 2 years ago
Labels: enhancement

#127 - Audit relevant execution info

Issue - State: closed - Opened by rkenmi almost 2 years ago - 1 comment
Labels: enhancement, P0

#126 - Add assertions to fast fail compaction in case of a metadata/execution issue

Pull Request - State: closed - Opened by rkenmi almost 2 years ago - 1 comment

#125 - support read_kwargs_provider arg to compact_partition

Pull Request - State: closed - Opened by raghumdani almost 2 years ago - 1 comment

#123 - Ensure the list_deltas call returns in ascending order

Pull Request - State: closed - Opened by raghumdani almost 2 years ago

#122 - fix bug due to assumption that list_deltas returns in ascending order

Issue - State: closed - Opened by raghumdani almost 2 years ago - 1 comment

#121 - [Iceberg] Implement compaction for partitioned tables

Issue - State: open - Opened by jackye1995 almost 2 years ago - 1 comment
Labels: iceberg

#120 - [Iceberg] Implement compaction support for unpartitioned tables

Issue - State: open - Opened by jackye1995 almost 2 years ago - 1 comment
Labels: iceberg

#119 - [Iceberg] Describe strategy to commit partitions and deltas

Issue - State: open - Opened by jackye1995 almost 2 years ago
Labels: iceberg

#118 - [Iceberg] Support commit operation in pyiceberg

Issue - State: open - Opened by jackye1995 almost 2 years ago
Labels: iceberg

#117 - [Iceberg] Describe how Iceberg table maps to DeltaCAT Delta concept

Issue - State: closed - Opened by jackye1995 almost 2 years ago - 2 comments
Labels: iceberg

#116 - Support read Iceberg dataset in Ray in distributed mode

Issue - State: closed - Opened by jackye1995 almost 2 years ago - 8 comments

#115 - Support read Iceberg dataset in Ray

Issue - State: closed - Opened by jackye1995 almost 2 years ago - 1 comment

#114 - raise error when the compaction requires multiple rounds

Pull Request - State: closed - Opened by raghumdani almost 2 years ago

#112 - fixing high watermark deserialization and TypeError

Pull Request - State: closed - Opened by raghumdani almost 2 years ago - 1 comment

#111 - adding kwargs to all invocations of _discover_deltas

Pull Request - State: closed - Opened by raghumdani almost 2 years ago

#110 - Add an arg for list_deltas kwargs to disable compacted table resolution

Pull Request - State: closed - Opened by raghumdani almost 2 years ago - 1 comment

#109 - Allow passing list_deltas kwargs to compact_partition

Issue - State: closed - Opened by raghumdani almost 2 years ago

#108 - Investigation: Discover and document all DeltaCAT storage APIs used by compaction

Issue - State: closed - Opened by pdames almost 2 years ago - 1 comment
Labels: P0

#107 - Ensure s3fs version is compatible with boto3 version used by the unde…

Pull Request - State: closed - Opened by raghumdani almost 2 years ago

#106 - Fix key type

Pull Request - State: closed - Opened by valiantljk almost 2 years ago - 1 comment

#105 - Compaction with no primary key index

Pull Request - State: closed - Opened by valiantljk almost 2 years ago - 1 comment

#104 - deltacat version bump and support for graviton

Pull Request - State: closed - Opened by raghumdani almost 2 years ago

#103 - Make memray install optional and support custom logging config

Pull Request - State: closed - Opened by pdames almost 2 years ago - 1 comment

#102 - add property info

Pull Request - State: closed - Opened by valiantljk almost 2 years ago - 1 comment

#101 - Create an enum for storing locators

Issue - State: open - Opened by valiantljk almost 2 years ago

#100 - Compaction recover from partition commit failures

Issue - State: open - Opened by valiantljk almost 2 years ago

#98 - [ci][linting] Updated pre-commit repository mapping to be compatible with Python 3.7

Pull Request - State: closed - Opened by pfaraone almost 2 years ago - 1 comment

#96 - [metrics] Add customized CloudWatch metrics support

Pull Request - State: closed - Opened by Zyiqin-Miranda almost 2 years ago - 1 comment

#95 - Hotfix: Remove @ray.remote annotation in compaction session and zero_copy_only in materialize

Pull Request - State: closed - Opened by rkenmi almost 2 years ago - 1 comment

#94 - Revert "#37- Retroactively applying linter (black, isort) + Lint on C…

Pull Request - State: closed - Opened by pfaraone almost 2 years ago

#93 - ran pre-commit

Pull Request - State: closed - Opened by pfaraone almost 2 years ago

#92 - Bugfix broken from linter

Pull Request - State: closed - Opened by pfaraone almost 2 years ago

#91 - Bugfix/breaking linter pr changes

Pull Request - State: closed - Opened by pfaraone almost 2 years ago - 1 comment

#90 - reran linting

Pull Request - State: closed - Opened by pfaraone almost 2 years ago - 1 comment

#89 - [dev] Checking linter breaking changes

Pull Request - State: closed - Opened by pfaraone almost 2 years ago

#88 - [profiler] Add memray profiler to compaction

Pull Request - State: closed - Opened by Zyiqin-Miranda almost 2 years ago - 2 comments

#87 - [metrics]Add customized CloudWatch metrics support; Enable straggler detection

Pull Request - State: closed - Opened by Zyiqin-Miranda almost 2 years ago - 2 comments

#86 - Compaction with no PKI

Pull Request - State: closed - Opened by valiantljk almost 2 years ago

#85 - Enhance logging observability with Ray Runtime Context and Process ID

Pull Request - State: closed - Opened by rkenmi almost 2 years ago - 5 comments

#84 - Assign new group when locator is different

Pull Request - State: closed - Opened by valiantljk almost 2 years ago - 1 comment

#83 - Support different delta locator during dedupe and materialize

Issue - State: closed - Opened by valiantljk almost 2 years ago
Labels: P0

#82 - Rebatching based on delta locator

Issue - State: closed - Opened by valiantljk almost 2 years ago
Labels: P0

#81 - Remove Primary Key Index Building in Main and Create a Separate Branch for PKI Support

Issue - State: closed - Opened by valiantljk almost 2 years ago - 3 comments
Labels: P0

#80 - Optimize compaction materialize step by not reading files that delta records haven't touched

Issue - State: closed - Opened by raghumdani almost 2 years ago - 1 comment
Labels: P1