Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / adbar/simplemma issues and pull requests

#136 - unable to load simplemma.simplemma

Issue - State: closed - Opened by FrogInDizzy 11 days ago - 2 comments

#135 - Feat/better dictionaries

Pull Request - State: open - Opened by juanjoDiaz 20 days ago

#134 - Drop support for Python 3.6 and 3.7

Pull Request - State: closed - Opened by Dunedan 25 days ago - 4 comments

#133 - Add a dictionary factory backed by MARISA-tries

Pull Request - State: open - Opened by Dunedan about 1 month ago - 17 comments

#132 - Returning all possible lemmas for a single word

Issue - State: open - Opened by zeeyado about 1 month ago - 1 comment

#131 - prepare version 1.0.0

Pull Request - State: closed - Opened by adbar about 1 month ago - 6 comments

#130 - maintenance: code linting

Pull Request - State: closed - Opened by adbar about 1 month ago - 1 comment

#129 - maintenance: explicitly deprecate langdetect submodule

Pull Request - State: closed - Opened by adbar about 1 month ago - 1 comment

#128 - use binary strings in dictionaries to save memory

Pull Request - State: closed - Opened by adbar about 1 month ago - 4 comments

#127 - training: do not remove words tackled by rules

Pull Request - State: closed - Opened by adbar about 1 month ago - 1 comment

#126 - Feat/simplify is known function

Pull Request - State: closed - Opened by juanjoDiaz about 2 months ago - 3 comments

#125 - Function `is_known()` not working as expected (missing word when a rule is active)

Issue - State: closed - Opened by adbar about 2 months ago - 4 comments
Labels: question

#124 - tests: update settings

Pull Request - State: closed - Opened by adbar 2 months ago - 1 comment

#123 - docs: switch README to markdown and update

Pull Request - State: closed - Opened by adbar 2 months ago - 1 comment

#122 - docs: add info on training data

Pull Request - State: closed - Opened by adbar 2 months ago - 1 comment

#121 - RST-Readme rendering broken on GitHub

Issue - State: closed - Opened by adbar 3 months ago

#120 - Coverage for training scripts?

Issue - State: closed - Opened by adbar 3 months ago - 1 comment

#119 - update setup

Pull Request - State: closed - Opened by adbar 3 months ago - 7 comments

#118 - Use custom dictionaries

Issue - State: open - Opened by 1over137 3 months ago - 3 comments
Labels: question

#117 - greedy decomposition not working on some german verbs

Issue - State: open - Opened by joprice 6 months ago - 1 comment
Labels: question

#116 - feat: better evaluation scripts

Pull Request - State: closed - Opened by juanjoDiaz 10 months ago - 1 comment

#115 - refactor: clean unused line

Pull Request - State: closed - Opened by juanjoDiaz 10 months ago - 1 comment

#114 - fix: proportion_in_target_languages not considering tokens present in…

Pull Request - State: closed - Opened by juanjoDiaz 10 months ago - 2 comments

#113 - Add README section on advanced usage via classes

Pull Request - State: closed - Opened by osma 11 months ago - 3 comments

#112 - in_target_language can count words twice and return ratios above 1.0

Issue - State: closed - Opened by osma 11 months ago - 5 comments
Labels: bug

#111 - simplemma.lang_detector import no longer working

Issue - State: closed - Opened by osma 11 months ago - 3 comments
Labels: bug, documentation

#110 - Plans for simplemma 1.0 release?

Issue - State: closed - Opened by osma 11 months ago - 10 comments
Labels: question

#109 - docs: add mkdocs page for documentation

Pull Request - State: closed - Opened by juanjoDiaz 12 months ago - 10 comments

#108 - feat: use cached object for legacy functions

Pull Request - State: closed - Opened by juanjoDiaz 12 months ago - 2 comments

#107 - Processing time difference between legacy functions and new classes

Issue - State: closed - Opened by adbar about 1 year ago - 3 comments
Labels: question

#106 - fix: issue with tokens that match prefixes

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#105 - update evaluation and add how-to

Pull Request - State: closed - Opened by adbar about 1 year ago - 1 comment

#104 - fix: broken build

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 2 comments

#103 - Document how simplemma's quality is assessed

Issue - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments
Labels: question

#102 - Document how dictionaries are created

Issue - State: closed - Opened by juanjoDiaz about 1 year ago - 8 comments
Labels: documentation

#101 - fix: bring back initial arg in lemmatize

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#100 - Setup: issue with packaging of dictionary data

Issue - State: closed - Opened by adbar about 1 year ago - 9 comments
Labels: question

#99 - Restore `initial` argument in lemmatize

Issue - State: closed - Opened by adbar about 1 year ago

#98 - Feat/better approach to greedy lookups

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 9 comments

#97 - Greedy option seems inconsistent

Issue - State: open - Opened by dysby about 1 year ago - 2 comments
Labels: question

#96 - Support for right-to-left decomposition

Issue - State: open - Opened by adbar about 1 year ago
Labels: enhancement

#95 - chore: remove useless logging

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 4 comments

#94 - Words that match more than one lemma

Issue - State: open - Opened by juanjoDiaz about 1 year ago - 5 comments
Labels: question

#93 - Inaccuracy related to capitalization

Issue - State: open - Opened by juanjoDiaz about 1 year ago - 1 comment
Labels: enhancement

#92 - refactor: prefix and rules tests to test the actual strategy and not …

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 5 comments

#91 - refactor: provide better strategies exports

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#90 - Test/add tests for new classes

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 2 comments

#89 - Chore/add docstring

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 5 comments

#88 - Ensure __slots__ used everywhere & refactor DictionaryFactory to be part of strategies

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#87 - chore: add requirements-dev.txt file for dev dependencies

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 2 comments

#86 - remove optional mypyc compilation

Pull Request - State: closed - Opened by adbar about 1 year ago - 4 comments

#85 - Refactor/move training files to own folder

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 8 comments

#84 - Use Protocol classes

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 8 comments

#83 - Returning lowercase token for non found tokens might be incorrect

Issue - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#82 - feat: make cache_max_size an int instead of an optional int

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#81 - feat: get list of supported languages dynamically

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 4 comments

#80 - feat: throw on unsupported languages

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#79 - feat: stricter input by making lang mandatory

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 3 comments

#78 - Refactor/better searches

Pull Request - State: closed - Opened by juanjoDiaz about 1 year ago - 22 comments

#77 - use regexes for prefix search

Pull Request - State: closed - Opened by adbar over 1 year ago

#76 - Does capitalization provide value?

Issue - State: closed - Opened by juanjoDiaz over 1 year ago - 1 comment
Labels: question

#75 - refactor: simplify _dehyphen

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 5 comments

#74 - refactor: simplify _decompose function

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments

#73 - refactor: minimize dictionary lookups in _greedy_search

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 1 comment

#72 - Investigate other data structures to store language data

Issue - State: open - Opened by adbar over 1 year ago - 11 comments
Labels: question

#71 - Add protocol classes for classes that can be extended by the user

Issue - State: closed - Opened by juanjoDiaz over 1 year ago - 4 comments
Labels: question

#70 - Simplify lemmatizing code

Pull Request - State: closed - Opened by adbar over 1 year ago - 2 comments

#69 - Better DE rules

Pull Request - State: closed - Opened by adbar over 1 year ago - 1 comment

#68 - update data based on new dict sort (#41)

Pull Request - State: closed - Opened by adbar over 1 year ago

#67 - rules: add generic substitution function

Pull Request - State: closed - Opened by adbar over 1 year ago - 1 comment

#66 - re-create langdetect() function and add warning

Pull Request - State: closed - Opened by adbar over 1 year ago - 3 comments

#65 - refactor: language detector improvements

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 19 comments

#64 - Ensure functional continuity before releasing v1

Issue - State: closed - Opened by adbar over 1 year ago - 5 comments
Labels: documentation

#63 - Create configurable Rules Engine

Issue - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments
Labels: enhancement

#62 - refactor: rename extensive to greedy

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 6 comments

#61 - refactor: split rules in multiple files

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments

#60 - More opportunistic language detection

Pull Request - State: closed - Opened by adbar over 1 year ago - 9 comments

#59 - refactor: rename langdetect to language_detector to be more clear and…

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 4 comments

#58 - performance: move rules to dictionary

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 5 comments

#57 - fix: leave DictionaryFactory to use only LRU

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago

#56 - Checking the behavior of dictionary_factory

Issue - State: closed - Opened by adbar over 1 year ago - 6 comments
Labels: question

#55 - Establish linting and quality tools

Issue - State: open - Opened by juanjoDiaz over 1 year ago - 1 comment
Labels: question

#54 - refactor: solve linting issues and simplify some methods complexity

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 7 comments

#53 - refactor: simple_tokenizer to return strings instead of matches

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 10 comments

#52 - refactor: simplify english rules

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 3 comments

#51 - Use require.txt for dependencies and dev dependencies

Issue - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments
Labels: question

#50 - feat: add possibility to customize token sampler

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 1 comment

#49 - Separate scripts used to create the dictionaries from actual src & tests

Issue - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments
Labels: enhancement

#48 - chore: include tests in quality checks

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 1 comment

#47 - feat: wrap DictionaryFactory in a class

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 5 comments

#46 - Add function for language data loading/unloading (#33)

Pull Request - State: closed - Opened by adbar over 1 year ago - 1 comment

#45 - Custom token sampling in prepare_text() for language identification

Issue - State: closed - Opened by adbar over 1 year ago - 3 comments
Labels: enhancement

#44 - Make tokenization configurable

Issue - State: closed - Opened by adbar over 1 year ago - 2 comments
Labels: enhancement

#43 - Replace lru_cache decorators by freely configurable alternative

Issue - State: closed - Opened by adbar over 1 year ago - 4 comments
Labels: enhancement

#42 - refactor: split files into modules

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 2 comments

#41 - Sort dict before pickling

Pull Request - State: closed - Opened by 1over137 over 1 year ago - 8 comments

#40 - Create separate eval directory and log results as CSV file

Pull Request - State: closed - Opened by 1over137 over 1 year ago - 6 comments

#39 - Clarify use of language codes in README

Pull Request - State: closed - Opened by osma over 1 year ago - 1 comment

#38 - Refactor/separate logic into modules

Pull Request - State: closed - Opened by juanjoDiaz over 1 year ago - 10 comments

#37 - Consider doing all development in branches and PRs

Issue - State: closed - Opened by osma over 1 year ago - 3 comments