Ecosyste.ms: Issues
An open API service that provides issue and pull request metadata for open source projects.
GitHub / moj-analytical-services/splink issues and pull requests
#1019 - Bug: Comparison.__deepcopy__() doesn't respect subclassing
Issue -
State: open - Opened by NickCrews about 2 years ago
Labels: nice to have
#1018 - feat: Support sqlglot versions >=5.1.0
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
#1011 - [FEAT] Support for embedding-based similarity functions
Issue -
State: open - Opened by OlivierBinette about 2 years ago
- 26 comments
Labels: enhancement, comparison levels
#1007 - Examples/tutorials for custom comparisons
Issue -
State: closed - Opened by samnlindsay about 2 years ago
- 2 comments
Labels: documentation
#1006 - fix: Safeguard against rounding/overflow errors in great_circle_distance_km_sql()
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 1 comment
#1004 - Use Ruff as a linter
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 4 comments
#1003 - Drop python 3.6 support
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 2 comments
#1001 - Create general function to profile clusters
Issue -
State: open - Opened by RossKen about 2 years ago
- 1 comment
Labels: enhancement, good first issue, profiling, clustering
#1000 - SQLite random SQL doesn't allow customised `unique_id_column_name`
Issue -
State: open - Opened by ADBond about 2 years ago
Labels: bug, good first issue, sqlite
#996 - Profiling upgrades/fixes
Issue -
State: open - Opened by samnlindsay about 2 years ago
Labels: good first issue, profiling
#995 - Robertwhiffin udf register fix
Pull Request -
State: closed - Opened by ThomasHepworth about 2 years ago
- 2 comments
#992 - feat: Support sqlglot >=5.1.0
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 4 comments
#985 - Awards and citation
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#981 - `add_l_or_r_to_identifier` now has case for type exp.Lambda
Pull Request -
State: closed - Opened by ThomasHepworth about 2 years ago
- 1 comment
#969 - Does the completeness chart works
Issue -
State: closed - Opened by RobinL about 2 years ago
Labels: profiling
#962 - (WIP) 961 ideas for improving caching
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 9 comments
#961 - Ideas for improving caching
Issue -
State: closed - Opened by RobinL about 2 years ago
- 5 comments
Labels: caching
#947 - Tf tables not being correctly referenced in 'estimate_probability_two_random_records_match'
Issue -
State: open - Opened by RobinL about 2 years ago
Labels: check if still an issue, term frequency
#946 - black and bump version to 3.5.1
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#943 - Update and lint docstring
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#942 - Bump jsonschema dependency to ensure Splink works in latest jupyterlab
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#935 - [object Object] in cluster studio & comparison viewer tables
Issue -
State: closed - Opened by ADBond about 2 years ago
Labels: bug, good first issue, graphs
#931 - [DOCS] Add data prep pre-requisites section to docs
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#930 - [DOCS] Add m estimation from pairwise (clerical) labels example
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#925 - Fix tests
Pull Request -
State: closed - Opened by ThomasHepworth about 2 years ago
- 1 comment
#922 - Ensure input tables are overwritten for real time linkage to prevent 'table already exists' errors
Pull Request -
State: closed - Opened by RobinL about 2 years ago
- 1 comment
#916 - `__splink__df_concat_with_tf` cache reused if two separate linkers in play
Issue -
State: closed - Opened by RobinL about 2 years ago
- 2 comments
Labels: bug
#910 - `RANDOM()` / `RAND()` backend compatibility
Issue -
State: closed - Opened by samnlindsay about 2 years ago
- 1 comment
Labels: bug, spark
#907 - Return settings dict from save_settings_to_json()
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 2 comments
#896 - docs: Fix link to settings_jsonschema.json
Pull Request -
State: closed - Opened by NickCrews about 2 years ago
- 1 comment
#885 - Comparison level logical composition
Issue -
State: closed - Opened by ADBond about 2 years ago
- 3 comments
Labels: enhancement, Interface/API improvement, comparison levels
#884 - Error if input dataframes already have a column named `source_dataset`
Issue -
State: open - Opened by RobinL about 2 years ago
- 1 comment
Labels: check if still an issue, validation
#882 - Toy Example
Issue -
State: closed - Opened by firmai about 2 years ago
- 5 comments
Labels: validation
#880 - InputColumn class is ignoring index
Issue -
State: closed - Opened by mamonu about 2 years ago
- 2 comments
Labels: documentation, good first issue, comparison levels
#879 - More detail on missing trained values in `linker.predict()`
Issue -
State: open - Opened by ADBond about 2 years ago
- 1 comment
Labels: enhancement, model training
#865 - Update pyproject.toml
Pull Request -
State: closed - Opened by mamonu over 2 years ago
- 4 comments
#852 - Unclear error if EM training blocking rule creates empty link table
Issue -
State: closed - Opened by ADBond over 2 years ago
- 1 comment
Labels: bug
#850 - github action for testing py3.6 compatibility
Pull Request -
State: closed - Opened by mamonu over 2 years ago
- 2 comments
#849 - [DOCS] Clarify best data for Splink
Pull Request -
State: closed - Opened by RobinL over 2 years ago
- 1 comment
#845 - `comparison_viewer_dashboard` breaks if `output_column_name` contains spaces
Issue -
State: closed - Opened by ADBond over 2 years ago
Labels: bug
#839 - Comparison viewer filters doesn't use level labels
Issue -
State: closed - Opened by ADBond over 2 years ago
- 1 comment
Labels: enhancement, Interface/API improvement, graphs
#825 - [FEAT] Add match probability to precision recall and roc
Pull Request -
State: closed - Opened by RobinL over 2 years ago
- 1 comment
#824 - [FIX] Fix overlapping bars problem in match weight and m and u values charts
Pull Request -
State: closed - Opened by RobinL over 2 years ago
- 1 comment
#810 - [FIX] Add preceding blocking rules to eliminate dupes in `find_matches_to_new_records`
Pull Request -
State: closed - Opened by RobinL over 2 years ago
- 1 comment
#808 - Missingness chart fails if tables don't have same columns
Issue -
State: closed - Opened by ADBond over 2 years ago
- 2 comments
Labels: bug
#807 - Add F1 score to ROC and precision/recall charts
Pull Request -
State: closed - Opened by NickCrews over 2 years ago
- 1 comment
#802 - `compare_two_records` needs to check whether tf tables exist
Issue -
State: open - Opened by RobinL over 2 years ago
- 5 comments
Labels: bug, check if still an issue
#801 - [FIX] Improve poor performance of linker.prediction_errors_from_labels_table in Spark
Pull Request -
State: closed - Opened by RobinL over 2 years ago
- 1 comment
#793 - Accuracy analysis from labels column assumes blocking rules have perfect recall
Issue -
State: open - Opened by RobinL over 2 years ago
Labels: model qa
#694 - decimal can only support precision up to 38
Issue -
State: closed - Opened by KalaniStanton over 2 years ago
- 4 comments
#680 - Arrays/structs break when loading pandas df into `duckdb`
Issue -
State: closed - Opened by ThomasHepworth over 2 years ago
- 2 comments
Labels: duckdb, check if still an issue
#666 - Unable to create `Comparison` for a function with a non-default schema
Issue -
State: closed - Opened by philip-hunt-kani over 2 years ago
- 6 comments
Labels: check if still an issue
#657 - `pd.NA` is treated as a string value when registered in a db.
Issue -
State: closed - Opened by ThomasHepworth over 2 years ago
- 5 comments
Labels: bug, duckdb, check if still an issue
#642 - Apply accessibility guidelines to splink documentation
Issue -
State: open - Opened by mamonu over 2 years ago
- 1 comment
Labels: documentation
#539 - Add density-based sampling to splink cluster studio
Issue -
State: closed - Opened by RobinL over 2 years ago
- 2 comments
Labels: enhancement, clustering
#430 - Add chart to show TF adjustments for specific values
Issue -
State: closed - Opened by samnlindsay almost 3 years ago
- 1 comment
Labels: enhancement, model training, term frequency
#402 - Allow `linker.train_m_from_deterministic_rule()`
Issue -
State: open - Opened by RobinL almost 3 years ago
- 1 comment
Labels: good first issue, model training
#251 - One-to-one matching
Issue -
State: open - Opened by lucasmalherbe about 3 years ago
- 6 comments
Labels: enhancement
#215 - Add default postcode comparison function
Issue -
State: closed - Opened by samnlindsay over 3 years ago
- 7 comments
Labels: comparison levels
#199 - Profiling of dates/quantities with a histogram
Issue -
State: open - Opened by samnlindsay almost 4 years ago
Labels: good first issue, profiling