Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / moj-analytical-services/splink issues and pull requests

#1813 - Migrate tests - comparison term frequencies

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: term frequency, splink4, testing

#1812 - Comparison tf adjustments

Pull Request - State: closed - Opened by ADBond about 1 year ago - 4 comments
Labels: comparison levels, term frequency, splink4

#1810 - Check for ColumnExpression in TF adjustment column

Pull Request - State: closed - Opened by ADBond about 1 year ago - 4 comments
Labels: splink4

#1809 - Rename DateDiffAtThresholds class to DatediffAtThresholds

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1808 - Datediff level + comparison naming consistency

Issue - State: closed - Opened by ADBond about 1 year ago - 1 comment
Labels: splink4, maintenance

#1807 - test out cards on landing page

Pull Request - State: closed - Opened by RossKen about 1 year ago - 3 comments

#1806 - Cluster metrics - node degree + cluster centralisation

Pull Request - State: closed - Opened by ADBond about 1 year ago - 3 comments
Labels: cluster metrics

#1805 - Fix broken links

Pull Request - State: closed - Opened by RossKen about 1 year ago - 1 comment

#1804 - authors yml format fix

Pull Request - State: closed - Opened by RossKen about 1 year ago

#1803 - material by mkdocs upgrade

Pull Request - State: closed - Opened by RossKen about 1 year ago

#1802 - remove reference to deleted token

Pull Request - State: closed - Opened by RossKen about 1 year ago

#1801 - Incorrect URLs in README.md/linker.py

Issue - State: closed - Opened by KyleHaynes about 1 year ago - 6 comments

#1800 - Faster duckdb train u

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1798 - Remove brittleness of convergence test

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1797 - [FEAT] Parallelise u and EM estimating in duckdb

Issue - State: closed - Opened by RobinL about 1 year ago - 1 comment
Labels: enhancement

#1796 - Parallelise duckdb resulting in e.g. 2-4x speedup on 6 core machine

Pull Request - State: closed - Opened by RobinL about 1 year ago - 17 comments

#1794 - Ethics article for Splink blog

Issue - State: closed - Opened by zslade about 1 year ago

#1793 - Allow match weight threshold for `cluster_pairwise_predictions_at_threshold`

Issue - State: open - Opened by RossKen about 1 year ago
Labels: enhancement, clustering

#1792 - add cs awards to readme

Pull Request - State: closed - Opened by RossKen about 1 year ago - 2 comments

#1791 - Blog - Dec 2023

Pull Request - State: closed - Opened by RossKen about 1 year ago

#1790 - v3.9.10

Pull Request - State: closed - Opened by RossKen about 1 year ago

#1789 - Custom charts label for levels

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: comparison levels, splink4, charts

#1788 - Reinstate code lost in merge

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: splink4

#1787 - Merge master into Splink4 dev

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1786 - Postgres linker fails when reading columns from information_schema

Issue - State: closed - Opened by krixon about 1 year ago - 1 comment

#1785 - Postgres backend fails when creating UDFs due to SQLAlchemy ROLLBACKs

Issue - State: closed - Opened by krixon about 1 year ago - 2 comments

#1784 - Fix brl comp test

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: blocking, testing

#1783 - Composition tests migrated + `splink4_dev` updates

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: splink4, testing

#1782 - (Splink4) ColumnExpression

Pull Request - State: closed - Opened by RobinL about 1 year ago - 2 comments

#1780 - Type hinting and variable renaming (mypy conformance stage 1)

Pull Request - State: closed - Opened by ADBond about 1 year ago - 3 comments
Labels: type hints

#1779 - Add Mypy setup to `pyproject.toml`

Pull Request - State: closed - Opened by ADBond about 1 year ago - 1 comment
Labels: dev, type hints

#1778 - [BUG] Spark nested columns (i.e. structs inside tables) breaks in comparisons

Issue - State: closed - Opened by TinoSM about 1 year ago - 8 comments

#1777 - Comparison level creator composition

Pull Request - State: closed - Opened by ADBond about 1 year ago - 2 comments
Labels: comparison levels, splink4

#1776 - [FYI] KDTree blocker from dblink

Issue - State: open - Opened by NickCrews about 1 year ago - 1 comment
Labels: enhancement

#1775 - Remove unused code and improve the Athena Linker

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1774 - added argument for register_udfs_automatically

Pull Request - State: closed - Opened by JonathanLaidler about 1 year ago

#1773 - Improve speed of link only sample test

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1772 - Make notebook tests run faster

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1771 - (Splink 4) ArrayIntersectAtSizes

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1770 - (Splink 4) Comparisons for remaining distance functions

Pull Request - State: closed - Opened by RobinL about 1 year ago - 1 comment

#1769 - Pass SQL dialect information to `ComparisonLevel` object

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: comparison levels, splink4, dialects

#1768 - Custom comparison level and comparison

Pull Request - State: closed - Opened by ADBond about 1 year ago - 4 comments
Labels: comparison levels, splink4

#1767 - [DOCS] Some symbols not rendering properly in docs

Issue - State: closed - Opened by ADBond about 1 year ago - 1 comment
Labels: bug, documentation

#1765 - Comparison configure

Pull Request - State: closed - Opened by ADBond about 1 year ago - 1 comment
Labels: comparison levels, splink4

#1764 - [MAINT] Revamp the settings validation steps

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago - 1 comment

#1763 - Fixes to _compute_cluster_metrics

Pull Request - State: closed - Opened by zslade about 1 year ago - 1 comment

#1762 - [BUG] Incorrect results using lower/regex extract with tf adjustments

Issue - State: open - Opened by RobinL about 1 year ago - 1 comment

#1761 - [FEAT] Unlinkables chart for multiple datasets when linking

Issue - State: open - Opened by samnlindsay about 1 year ago
Labels: enhancement, good first issue, charts

#1759 - Reset time_series.json to match the master branch

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1758 - Percentage difference level

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1757 - Introduce a `ColumnTreeBuilder` to aid in the construction of our column ASTs

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago - 2 comments

#1756 - (Splink 4) Fix datediff level now super is removed

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1755 - Implement Singleton Pattern for Dialects and Refine Dialect Factory Methods

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago - 11 comments

#1754 - Cluster studio sample by density

Pull Request - State: closed - Opened by zslade about 1 year ago

#1753 - (Splink 4) Implement array intersect level

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1752 - [BUG] Delete cached tables before resetting the cache

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1751 - No super in comparison levels

Pull Request - State: closed - Opened by RobinL about 1 year ago - 3 comments

#1750 - Add expressions to input column

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago - 3 comments

#1749 - Fix uncommitted changes to distance levels

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1748 - Fix docs build

Pull Request - State: closed - Opened by ADBond about 1 year ago
Labels: documentation

#1745 - Docs build is failing

Issue - State: closed - Opened by ADBond about 1 year ago - 2 comments
Labels: bug, documentation, continuous integration

#1744 - UDFRegistration.registerJava() not whitelisted

Issue - State: closed - Opened by JonathanLaidler about 1 year ago - 4 comments

#1741 - Bump aiohttp from 3.8.5 to 3.8.6

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#1740 - [FEAT] Validate that input data sets are conformant when the Linker is created

Issue - State: open - Opened by alanakilleen about 1 year ago
Labels: enhancement

#1739 - (DO NOT MERGE) Try and put spark setup in setup timings

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1736 - [MAINT] Improve speed of tests

Pull Request - State: closed - Opened by RobinL about 1 year ago - 8 comments

#1735 - 3.9.9

Pull Request - State: closed - Opened by RossKen about 1 year ago - 2 comments

#1734 - problem creating sparklinker

Issue - State: open - Opened by yz0000 about 1 year ago - 2 comments

#1733 - fix: respect boto3_session when checking table existence from AthenaLinker

Pull Request - State: closed - Opened by finalgrrrl about 1 year ago - 2 comments

#1731 - Fix issue with `_source_dataset_col` and `_source_dataset_input_column`

Pull Request - State: closed - Opened by RobinL about 1 year ago - 1 comment

#1730 - Convert all InputColumn methods that take no arguments to properties

Pull Request - State: closed - Opened by RobinL about 1 year ago - 1 comment

#1729 - [MAINT] Better __repr__ for comparison levels

Issue - State: closed - Opened by RobinL about 1 year ago
Labels: good first issue

#1728 - Splink4: Additional string distance levels (damerau-lev, jaro and jaccard)

Pull Request - State: closed - Opened by RobinL about 1 year ago - 1 comment

#1727 - Splink4: Distance in km level

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1726 - Columns reversed level

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1725 - Splink4: How to enable comparison_levels with multiple input columns

Issue - State: open - Opened by RobinL about 1 year ago - 3 comments

#1724 - Splink4 - Make null level null

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1722 - Bump pyarrow from 13.0.0 to 14.0.1 in /binder

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#1721 - Splink4 ComparisonCreator ft. ExactMatch and LevenshteinAtThresholds

Pull Request - State: closed - Opened by ADBond about 1 year ago - 1 comment
Labels: comparison levels, splink4

#1720 - [FEAT] Cluster IDs based on node centrality

Issue - State: open - Opened by samnlindsay about 1 year ago
Labels: enhancement, clustering

#1719 - Fix InputColumn quoting for spark and improve code quality

Pull Request - State: closed - Opened by RobinL about 1 year ago

#1718 - Add `distance_threshold` check

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1717 - Implement new jaro_winkler_level (Splink4)

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1716 - threshold_match_weight in linker.predict() not checked when equal to 0

Issue - State: closed - Opened by medwar99 about 1 year ago - 2 comments
Labels: bug

#1715 - Remove duplicate input_columns code

Issue - State: open - Opened by ADBond about 1 year ago
Labels: refactoring, maintenance

#1714 - Migrate tests for Splink 4 (`ComparisonLevelCreator` and `ComparisonCreator` and related changes)

Pull Request - State: closed - Opened by ADBond about 1 year ago - 3 comments
Labels: splink4, testing

#1713 - add `@property` decorator to `sqlglot_name` method

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1712 - [BUG] `InputColumn` does not work properly with Spark columns that need escaping

Issue - State: closed - Opened by RobinL about 1 year ago - 2 comments

#1710 - Settings val updates

Pull Request - State: closed - Opened by ThomasHepworth about 1 year ago

#1709 - [FEAT] Sorting prediction DataFrame by match_weight/probability (or more)

Issue - State: closed - Opened by medwar99 about 1 year ago - 2 comments
Labels: enhancement

#1707 - [MAINT] Improve speed of test runs

Issue - State: closed - Opened by RobinL about 1 year ago - 4 comments
Labels: maintenance

#1706 - [DO NOT MERGE] Time tests

Pull Request - State: closed - Opened by RobinL about 1 year ago