Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / wcmc-its/ReCiter issues and pull requests

#100 - For each of an author’s aliases, modify initial query based on lexical rules

Issue - State: closed - Opened by michaelbales1 about 9 years ago - 5 comments
Labels: Phase: Information Retrieval, Phase: Preprocessing, On Hold

#99 - Read BoardCertificationsWCMC.xlsx from the database

Issue - State: closed - Opened by michaelbales1 about 9 years ago - 1 comment
Labels: Phase: Information Retrieval

#98 - Read DiscrepanciesYears.tab data from the database

Issue - State: closed - Opened by michaelbales1 about 9 years ago - 2 comments
Labels: Phase: Information Retrieval

#97 - Use citizenship and educational background to improve precision

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Will not fix

#96 - Decrease likelihood of cluster assignment when co-author name is common

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase One Clustering, On Hold

#95 - Output a human-readable explanation for why a publication is matched to an individual

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Error Analysis, Phase: Output, On Hold

#94 - Use authoritative identity sources to anticipate the need for looking up different derivations of an author's name

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase: Information Retrieval

#93 - Update ReCiter code so that aliases can be included as input

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 3 comments
Labels: Priority, Phase: Information Retrieval

#92 - Manage maiden name lookups

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase: Information Retrieval, On Hold

#91 - Investigate whether ideal similarity threshold is related to the number of records returned by "LastName, FirstInitial" searches

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Error Analysis, Phase: Output, Will not fix

#90 - Look up articles in PubMed by grant number and last name

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 5 comments
Labels: Phase: Information Retrieval, On Hold

#89 - Cross reference funding statement in PubMed against institutional records of grant funding to improve recall

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 11 comments
Labels: Phase Two Matching, Priority

#88 - Investigate why 20227150 is mapped to Ari Melnick (amm2014)

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Error Analysis, Phase: Output, On Hold

#87 - Manage special characters in PubMed data

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 4 comments
Labels: Priority, Phase: Information Retrieval

#86 - Investigate why we are mapping papers to a cluster when there's no apparent similarity

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Phase Two Matching, On Hold

#85 - Investigate why keyword similarity is zero for all of Arleen Rifkind's articles (and many other people as well)

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase Two Matching, On Hold

#84 - Match on vector of keywords taken from the article title, journal title, and keywords from MeSH major

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 3 comments
Labels: Phase One Clustering, Will not fix

#83 - If a candidate article is published in a journal and the cluster contains that journal, increase the score for a match

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 5 comments
Labels: Phase Two Matching

#82 - Improve score in cases where MeSH major terms match between cluster and target article

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 3 comments
Labels: Phase One Clustering

#81 - Investigate why dcm9006 is not matching to cluster #1

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Phase Two Matching, On Hold

#80 - Add PMIDs to output that are in gold standard but not retrieved by query

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Error Analysis, Phase: Output, In progress

#79 - Leverage departmental affiliation string matching for phase two matching

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 3 comments
Labels: Phase Two Matching, Priority

#78 - Use citizenship and educational background to improve recall

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 4 comments
Labels: Phase Two Matching, Priority

#77 - Use divisional organizational unit (from SAP) and practice location (from POPS) as a source for affiliation keyword

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase Two Matching, On Hold

#76 - Fix for why als7001 is outputting zero results

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Phase: Information Retrieval

#75 - Output the ReCiter performance summary as .csv file

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase: Output

#73 - Look up email separately in Scopus and PubMed at a formative state to find name variants

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase: Information Retrieval

#72 - Use keywords from grants

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase Two Matching, On Hold, Will not fix

#71 - Improve error analysis output

Issue - State: closed - Opened by paulalbert1 over 9 years ago
Labels: Error Analysis, Phase: Output

#70 - Manage special characters better

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Phase Two Matching, Phase One Clustering

#69 - Improve computation time for ReCiter for people with common names

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 2 comments
Labels: Phase Two Matching, Phase One Clustering, Phase: Information Retrieval, Phase: Preprocessing, Phase: Output

#68 - Remove `cmedina` from the list of 64 cwids.

Issue - State: closed - Opened by jl987-Jie over 9 years ago

#67 - Remove `cmedina` from the list of 64 cwids.

Issue - State: closed - Opened by jl987-Jie over 9 years ago - 1 comment

#66 - Assign separate similarity scores for target author and co-author affiliations

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 6 comments
Labels: Phase One Clustering, Priority

#65 - Add year of publication to CSV output

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Error Analysis

#64 - Update README.md to include the option of using git clone when installing

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment

#62 - Generate a CSV file that includes output for all targets

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Error Analysis, Priority

#61 - Limit CSV output to that of the current target author

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Error Analysis, Priority

#60 - For individuals with no/few papers, use default departmental-journal similarity score

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 3 comments
Labels: Phase Two Matching, On Hold

#59 - First name matching in phase one clustering

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 9 comments
Labels: Phase One Clustering, Priority

#58 - Use middle initial for Phase One clustering and Phase Two matching

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 1 comment
Labels: Phase Two Matching, Phase One Clustering, Priority

#57 - Use email as a conclusive indication of author identity when present

Issue - State: closed - Opened by paulalbert1 over 9 years ago - 7 comments
Labels: Phase Two Matching

#56 - Measure and report ReCiter running time for various numbers of articles

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Advanced Testing

#55 - Add instructions on running ReCiter against WCMC reference standard to wiki

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Priority

#54 - Add similarity score for each article to CSV output

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Priority

#53 - Determine completeness of affiliation records for co-authors, by year, in Medline

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase One Clustering, Will not fix

#52 - Leverage geographic location of author affiliation for phase one clustering

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase One Clustering, Will not fix

#51 - Leverage data on author academic rank to improve phase two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Phase Two Matching

#50 - Update ReCiter and installation instructions so that it can be run outside of WCMC

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Priority

#49 - Leverage known co-investigators on grants to improve phase two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 7 comments
Labels: Phase Two Matching

#48 - Leverage data on name variants to improve phase two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 7 comments
Labels: Phase Two Matching, Will not fix

#47 - Create authorAffiliationScoringStrategy

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 4 comments

#46 - Add primary and/or other department name(s) to list of topic keywords

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Phase Two Matching, Will not fix

#45 - Leverage data on board certifications to improve phase two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Phase Two Matching, Priority

#44 - Include Phase Two matching score in output

Issue - State: closed - Opened by paulalbert1 over 9 years ago
Labels: Phase Two Matching, Priority

#43 - Add author's known prior institutional affiliations to the database

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Phase: Information Retrieval

#42 - Use last name, first initial queries to download XML via eFetch for full-time faculty

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Priority

#41 - ReCiter is not storing the exact number of xml results returned by PubMed.

Issue - State: closed - Opened by jl987-Jie over 9 years ago - 3 comments
Labels: Bug, Phase: Information Retrieval

#40 - Year-based clustering and matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 7 comments
Labels: Phase Two Matching, Phase One Clustering, Priority

#39 - Identify exact target author's affiliations from Scopus XML

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Priority

#37 - Allow ReCiter to be used as a web service by Academic Staff Management System

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Web Interface, In progress

#36 - Integrate ReCiter with PubAdmin interface: accepts and rejects

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Web Interface, On Hold

#35 - Integrate ReCiter with PubAdmin interface: input seed publication

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Web Interface, On Hold

#34 - Create interface to run ReCiter

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Web Interface, On Hold

#33 - Prepare specs for front-end developer

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment

#32 - Run ReCiter against data from Columbia from ReCiter (Version 1) manuscript

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Advanced Testing

#31 - Create web interface for displaying ReCiter results

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: On Hold

#30 - Test performance of random forest classifier

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Advanced Testing

#29 - Report precision and recall at the article level to mirror 2014 ReCiter paper

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Advanced Testing

#28 - Assess runtime performance for common names like Y. Wang

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 3 comments
Labels: Advanced Testing

#26 - Test ReCiter performance for all faculty

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Advanced Testing

#25 - Phase two error analysis

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Advanced Testing

#23 - Design plan for handling cases where a person has more than a certain number of publications

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Phase: Information Retrieval

#22 - Use target author's known publications to populate first cluster

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 4 comments
Labels: Phase One Clustering, Priority

#21 - Use journal similarity for phase one and two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 8 comments
Labels: Phase Two Matching, Phase One Clustering, Priority

#20 - Leverage year of terminal degree

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase Two Matching

#19 - Use TF-IDF for year

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Phase Two Matching

#18 - Implement code that picks 0 to many clusters, depending on threshold

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Phase Two Matching

#17 - Implement stemming of terms used in phase one clustering

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 4 comments
Labels: Phase One Clustering

#16 - Normalize Unicode characters to Roman equivalents

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 5 comments
Labels: Phase One Clustering

#15 - Decide on data representation for author profiles used in the Phase Two matching

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Phase Two Matching

#14 - Review Michael's draft descriptions for JReCiter classes

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment

#13 - Explore how scores improve as asserted publications are used to select clusters rather than seed them

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Error Analysis, On Hold

#12 - Examine calculation for sensitivity/recall=zero

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Error Analysis

#11 - Enumerate and describe types of errors

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment
Labels: Error Analysis, Phase: Output

#10 - Run ReCiter locally and identify problems

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Error Analysis

#9 - Add MySQL db for Phase Two matching to repo

Issue - State: closed - Opened by michaelbales1 over 9 years ago
Labels: Phase Two Matching

#7 - Store output and scores in the database

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 8 comments
Labels: Phase Two Matching

#6 - Update data flow diagram and entity relationship diagram

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 1 comment

#5 - Add documentation on how to compile and run

Issue - State: closed - Opened by michaelbales1 over 9 years ago

#4 - Document steps required to run clustering locally

Issue - State: closed - Opened by michaelbales1 over 9 years ago

#3 - Move selected documentation from ReCiter wiki to GitHub

Issue - State: closed - Opened by michaelbales1 over 9 years ago

#1 - Update ReCiter clustering so that it can be run locally and produce readable output

Issue - State: closed - Opened by michaelbales1 over 9 years ago - 2 comments
Labels: Priority