Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / pinterest/iceberg issues and pull requests
#69 - [pinterest] skip updating hms table schema for thrift table
Pull Request -
State: closed - Opened by puchengy about 1 year ago
- 1 comment
Labels: HIVE
#68 - [DRAFT] Allow bucketing partition data to increase parallelism during snapshot table creation
Pull Request -
State: open - Opened by puchengy about 1 year ago
Labels: SPARK, DATA
#67 - [temp] add spark.sql.pinterest.experiment.parquet.read.threads to use multi-thread to read parquet file metadata to speed up partition imports
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: DATA
#66 - Cache filesToImport variable to avoid duplicated compute
Pull Request -
State: closed - Opened by puchengy about 1 year ago
- 1 comment
Labels: SPARK
#65 - update build for flink iceberg module and use Flink 1.15 as default to match internal Flink version
Pull Request -
State: closed - Opened by zzhhhzz about 1 year ago
Labels: INFRA, BUILD, FLINK
#64 - Spark 3.2: Skip duplicate check on deleted file path when import file…
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK
#63 - Add a checker to raise exception when string type partition column is used to compare against datetype
Pull Request -
State: closed - Opened by puchengy about 1 year ago
- 1 comment
Labels: SPARK
#62 - [temp] remove TestRemoveEmptyFilesProcedure test
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK
#61 - Fix style (to be merged with remove_empty_files diff when rebase)
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK
#60 - [spark 3.2] add remove_empty_files procedure (if there is a same PR on upstream, get rid of this)
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK
#59 - [spark 3.2] skip empty file during table migration, table snapshotting or adding files (if there is a same PR on upstream, get rid of this)
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK, DATA
#58 - Core, Spark: Fix migrate table in case of partitioned table with partition containing a special character (#7744)
Pull Request -
State: closed - Opened by puchengy about 1 year ago
Labels: SPARK, CORE, DATA
#57 - Allow snapshot/ migrate Hive table with partition value contains slash
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: CORE, DATA
#56 - Bump pinspark version
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: SPARK, BUILD
#55 - AWS: Fix Tests3RestSigner on OSX (#7742)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: AWS
#54 - Support Storage Partitioned Joins for Spark 3.2
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: SPARK, BUILD
#53 - Handle enums in parquet conversions
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET
#52 - [Iceberg thrift] support case insensitive id assignment for thrift backed Iceberg Parquet table (#28)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK
#51 - [Iceberg thrift] set IGNORE_PARQUET_FIELD_IDS for unpartitioned table importing (#29)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK
#50 - Handle thrift schema backed table reads in vectorized reader
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK
#49 - [Spark] Remove sorts from delete from queries reading just one partition and already sorted data.
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK, CORE
#48 - Allow ignoring field ids from parquet files in add files APIs
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK, CORE, DATA
#47 - Support thrift schema in Iceberg tables (#44)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK
#46 - Allow ignoring field ids from parquet files in add files APIs
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK, CORE, DATA
#45 - [Spark] Remove sorts from delete from queries reading just one partition and already sorted data.
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK, CORE
#44 - Support thrift schema in Iceberg tables
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK
#43 - Python legacy timestamp support for manifest matching
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PYTHON
#42 - [backport] Spark 3.2: Make manifest file names unique during imports (#6845)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK
#41 - [python_legacy] fix evaluator failure to match date column
Pull Request -
State: closed - Opened by puchengy over 1 year ago
#40 - backport https://github.com/apache/iceberg/pull/5126
Pull Request -
State: closed - Opened by zzhhhzz over 1 year ago
Labels: PARQUET
#37 - update iceberg-flink-runtime build
Pull Request -
State: closed - Opened by zzhhhzz over 1 year ago
Labels: BUILD
#36 - update iceberg-flink-runtime build so that the jar is compatible with…
Pull Request -
State: closed - Opened by zzhhhzz over 1 year ago
Labels: BUILD
#35 - [Backport] Spark 3.2: Discard filters that can be pushed down completely.
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: SPARK, API
#34 - [Python Legacy] Convert string to boolean if the binding variable is Boolean
Pull Request -
State: closed - Opened by pritampan over 1 year ago
Labels: PYTHON
#33 - [Python Legacy] Convert string to boolean if the binding variable is Boolean
Pull Request -
State: closed - Opened by pritampan over 1 year ago
- 1 comment
Labels: PYTHON, INFRA, DOCS, PARQUET, SPARK, BUILD, API, CORE, ARROW, ORC, HIVE, DATA, MR, NESSIE, FLINK, AWS, COMMON, PIG, ALIYUN
#32 - [backport] Build: Let revapi compare API compatibility against apache-iceberg-1.0.0 (#6053)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: INFRA, BUILD
#31 - [backport] Spark 3.2: Add prefix mismatch mode for deleting orphan files (#4652)
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK, API
#30 - [backport from 1.1] ICEBERG-4346: Better handling of Orphan files #4652
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PYTHON, INFRA, DOCS, PARQUET, BUILD, API, CORE, ARROW, ORC, HIVE, DATA, MR, NESSIE, FLINK, AWS, COMMON, PIG, ALIYUN
#29 - [Iceberg thrift] set IGNORE_PARQUET_FIELD_IDS for unpartitioned table importing
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: SPARK
#28 - [Iceberg thrift] support case insensitive id assignment for thrift backed Iceberg Parquet table
Pull Request -
State: closed - Opened by puchengy over 1 year ago
Labels: PARQUET, SPARK
#27 - Use PEP 440 compatible version specifiers
Pull Request -
State: closed - Opened by steverice over 1 year ago
Labels: PYTHON
#26 - Handle thrift schema backed table reads in vectorized reader
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: SPARK
#25 - [Spark] Remove sorts from delete from queries reading just one partition and already sorted data.
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
- 1 comment
Labels: SPARK, CORE
#24 - Allow ignoring field ids from parquet files in add files APIs
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: PARQUET, SPARK, CORE, DATA
#23 - Handle enums in parquet conversions
Pull Request -
State: closed - Opened by SinghAsDev over 1 year ago
Labels: PARQUET
#22 - [Backport] Python: Pin mypy (#6147)
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON
#21 - [python_legacy] support check partition exist api
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON
#20 - [backport from Pinterest branch][python_legacy] release python_legacy as py-iceberg (#5) (#8)
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON, INFRA
#19 - [backport from Pinterest branch][python legacy] get_partitions api support (#13)
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON
#18 - [backport from upstream] Python: BOTO_STS_CLIENT lazy initialization (#5930)
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON
#17 - Support thrift schema in Iceberg tables
Pull Request -
State: closed - Opened by SinghAsDev almost 2 years ago
Labels: PARQUET, SPARK
#16 - Support 2-level list and maps type in RemoveIds.
Pull Request -
State: closed - Opened by SinghAsDev almost 2 years ago
Labels: PARQUET
#15 - Support 2-level list and maps type in RemoveIds
Pull Request -
State: closed - Opened by SinghAsDev almost 2 years ago
Labels: PYTHON, INFRA, DOCS, PARQUET, SPARK, BUILD, API, CORE, ARROW, ORC, HIVE, DATA, MR, NESSIE, FLINK, AWS, COMMON, PIG, ALIYUN, GCP, DELL
#14 - [python_legacy] BOTO_STS_CLIENT lazy init
Pull Request -
State: closed - Opened by puchengy almost 2 years ago
Labels: PYTHON
#13 - [python legacy] get_partitions api support
Pull Request -
State: closed - Opened by puchengy about 2 years ago
Labels: PYTHON
#12 - Introduce metadata location selective suffix table property and use it for metadata location generation
Pull Request -
State: closed - Opened by puchengy about 2 years ago
Labels: CORE
#11 - Add regex replace support in remove orphan file action and procedure
Pull Request -
State: closed - Opened by SinghAsDev over 2 years ago
Labels: SPARK
#10 - Fix concurrent transactions overwriting commits by adding hive lock heartbeats.
Pull Request -
State: closed - Opened by SinghAsDev over 2 years ago
Labels: HIVE
#9 - [python legacy] use process pool instead of thread pool for files planning
Pull Request -
State: closed - Opened by puchengy over 2 years ago
Labels: PYTHON
#8 - release python_legacy as py-iceberg
Pull Request -
State: closed - Opened by puchengy over 2 years ago
Labels: PYTHON, INFRA
#7 - Core: Fix history timestamp for rollbacks (#4135)
Pull Request -
State: closed - Opened by puchengy over 2 years ago
Labels: CORE
#6 - Allow table defaults to be configured and/ or enforced at catalog level using catalog properties.
Pull Request -
State: closed - Opened by SinghAsDev over 2 years ago
Labels: CORE, HIVE, AWS
#5 - release python_legacy as py-iceberg
Pull Request -
State: closed - Opened by puchengy over 2 years ago
Labels: PYTHON, INFRA
#4 - [Pinterest Specific] Package pinterest-iceberg module into spark3.2 runtime package
Pull Request -
State: closed - Opened by puchengy over 2 years ago
Labels: PYTHON, INFRA, DOCS, PARQUET, SPARK, BUILD, API, CORE, HIVE, DATA, MR, FLINK, AWS
#3 - Try to sync to upstream
Pull Request -
State: closed - Opened by puchengy almost 3 years ago
Labels: INFRA, DOCS, PARQUET, SPARK, BUILD, API, CORE, ARROW, ORC, HIVE, DATA, MR, NESSIE
#2 - Fix Iceberg's parquet reader returning nulls incorrectly for parquet files written by writers that don't use list and element as names.
Pull Request -
State: closed - Opened by SinghAsDev almost 3 years ago
Labels: DOCS, PARQUET, SPARK
#1 - Release iceberg python module as py-iceberg
Pull Request -
State: closed - Opened by puchengy almost 3 years ago
Labels: PYTHON, INFRA