Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / OpenPecha/Botok issues and pull requests

#109 - dialectpack for modern Tibetan

Issue - State: open - Opened by fxerhard 11 days ago

#108 - Requesting an update to the help documentation!

Issue - State: open - Opened by Tshor 8 months ago

#107 - Missing English words at the end of the text during sentence tokenization

Issue - State: open - Opened by BLKSerene about 1 year ago

#106 - Make error handling more robust when downloading dialect packs

Pull Request - State: open - Opened by BLKSerene about 1 year ago - 1 comment

#105 - Remove unnecessary print messages

Pull Request - State: closed - Opened by mikkokotila over 1 year ago

#104 - Splitting མངས་བས་ wrong?

Issue - State: open - Opened by lothelanor over 1 year ago

#103 - fix: create new release manually

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#102 - Update test.yml

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#101 - Update test.yml

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#100 - Revert "add normalization code"

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#99 - Create test.yml

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#98 - Update publish.yaml

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#97 - add normalization code

Pull Request - State: closed - Opened by eroux over 1 year ago

#96 - Update and rename publish.yaml to CI_CD.yaml

Pull Request - State: closed - Opened by 10zinten over 1 year ago

#95 - fix(resources): Create bo_punct_position.csv

Pull Request - State: closed - Opened by ngawangtrinley over 1 year ago - 1 comment

#94 - [Feature] Classify all PUNCTs into left and right

Issue - State: open - Opened by 10zinten over 1 year ago

#93 - Can we remove "Loading Trie... (1s.)" message

Issue - State: closed - Opened by mikkokotila over 1 year ago

#92 - `token.text_unaffixed` failed to add tsek

Issue - State: open - Opened by 10zinten over 2 years ago

#91 - Missing pos for PUNCT

Issue - State: open - Opened by 10zinten over 2 years ago

#90 - syllable component

Issue - State: open - Opened by kaldan007 over 2 years ago

#89 - syllable tokenizer request

Issue - State: open - Opened by ta4tsering over 2 years ago

#88 - importing a custom dictionary

Issue - State: open - Opened by eroux almost 3 years ago - 1 comment

#87 - issue with Python 3.9

Issue - State: open - Opened by eroux almost 3 years ago
Labels: bug

#86 - identifying weak syllables

Issue - State: open - Opened by eroux about 3 years ago - 1 comment

#85 - POS tags ? distinguishing some patterns

Issue - State: open - Opened by eroux about 3 years ago - 2 comments

#84 - fix(sent-tokenizer): normalised sentence is included in sentence tokens

Pull Request - State: closed - Opened by kaldan007 about 3 years ago

#82 - Unexpected skip

Pull Request - State: closed - Opened by kaldan007 over 3 years ago

#81 - Unexpected syl skip

Pull Request - State: closed - Opened by kaldan007 over 3 years ago

#80 - Unexpected skip of syllable while tokenizing.

Issue - State: open - Opened by kaldan007 over 3 years ago

#79 - Invalid index in merge rule silently produces uncalled for result.

Issue - State: open - Opened by kaldan007 over 3 years ago

#78 - Why VOWELS constant only has one vowel?

Issue - State: open - Opened by forest-jiang almost 4 years ago - 1 comment

#76 - Download of dialect packs fails on macOS when running CI

Issue - State: open - Opened by BLKSerene about 4 years ago - 1 comment

#75 - detect any language

Issue - State: open - Opened by ngawangtrinley over 4 years ago
Labels: enhancement

#74 - dict like `get` method for Token object

Issue - State: open - Opened by 10zinten over 4 years ago
Labels: enhancement

#73 - understanding custom pipelines

Issue - State: open - Opened by mikkokotila over 4 years ago - 3 comments

#72 - minimal instructions/docstring for Trie

Issue - State: closed - Opened by mikkokotila over 4 years ago - 1 comment

#71 - Directory based config

Pull Request - State: closed - Opened by 10zinten over 4 years ago - 2 comments

#70 - Multiprocessing tokenization

Pull Request - State: closed - Opened by 10zinten over 4 years ago - 5 comments

#69 - Check existence of the latest resource files before downloading

Issue - State: closed - Opened by BLKSerene over 4 years ago - 2 comments

#68 - bad segmentation

Issue - State: closed - Opened by drupchen about 5 years ago - 1 comment

#67 - AttributeError: 'NoneType' object has no attribute 'append'

Issue - State: closed - Opened by eroux about 5 years ago - 13 comments

#66 - batch process files

Issue - State: closed - Opened by drupchen about 5 years ago

#65 - Missing lemma for numbers

Issue - State: closed - Opened by 10zinten about 5 years ago - 3 comments

#64 - multi-threading

Issue - State: open - Opened by mikkokotila about 5 years ago - 6 comments

#63 - Github Actions for CI

Issue - State: closed - Opened by mikkokotila about 5 years ago - 3 comments

#62 - labels

Issue - State: open - Opened by mikkokotila about 5 years ago - 1 comment

#61 - statistics performance with tokenizer.list_word_types

Issue - State: open - Opened by mikkokotila about 5 years ago - 3 comments

#60 - from pybo to botok

Pull Request - State: closed - Opened by drupchen about 5 years ago - 1 comment

#59 - Path issue after frozen with PyInstaller on macOS

Issue - State: closed - Opened by BLKSerene about 5 years ago - 3 comments

#58 - Tokenizer improvement

Pull Request - State: closed - Opened by drupchen about 5 years ago - 2 comments

#57 - pybo 0.6.0 tokenizer failed for འིའོ

Issue - State: closed - Opened by 10zinten over 5 years ago - 5 comments

#56 - Huge memory cost when initializing the tokenizer

Issue - State: closed - Opened by BLKSerene over 5 years ago - 3 comments

#55 - Sentencize a list of tokens that have been manually tokenized by adding spaces

Issue - State: open - Opened by BLKSerene over 5 years ago - 1 comment

#54 - Failed to tokenize text with pybo 0.6.3

Issue - State: closed - Opened by BLKSerene over 5 years ago - 3 comments

#53 - update

Pull Request - State: closed - Opened by drupchen over 5 years ago

#52 - Missing character when updating from pybo 0.4.0 to pybo 0.6.0, BoTokenizer to WordTokenizer

Issue - State: closed - Opened by aninrusimha over 5 years ago - 4 comments

#51 - oops

Pull Request - State: closed - Opened by drupchen over 5 years ago - 1 comment

#50 - Add multiple words per entry

Pull Request - State: closed - Opened by drupchen over 5 years ago - 1 comment

#49 - test

Pull Request - State: closed - Opened by drupchen over 5 years ago - 1 comment

#48 - finding sentence limits

Issue - State: open - Opened by eroux over 5 years ago - 11 comments

#47 - Trie's handling of word list that contains both པར་(photo) and པར་(particle)

Issue - State: closed - Opened by evanyerburgh over 5 years ago - 1 comment

#46 - Update README.md

Pull Request - State: closed - Opened by evanyerburgh over 5 years ago

#45 - Unicode normalisation

Issue - State: closed - Opened by ngawangtrinley over 5 years ago - 4 comments

#44 - Add folia output to pybo

Issue - State: closed - Opened by ngawangtrinley over 5 years ago

#43 - Sentences and Paragraphs as Token attributes

Issue - State: open - Opened by drupchen over 5 years ago
Labels: enhancement

#42 - Warning issued after upgrading PyYAML to 5.1

Issue - State: closed - Opened by BLKSerene over 5 years ago - 1 comment

#41 - syllable boundary bug

Issue - State: closed - Opened by drupchen over 5 years ago

#40 - Oops ! on the wrong branch

Pull Request - State: closed - Opened by drupchen over 5 years ago - 1 comment

#39 - refactor parsing resource files to directory based

Pull Request - State: closed - Opened by 10zinten over 5 years ago - 1 comment

#38 - POS-tagging a list of tokens that have already been tokenized

Issue - State: closed - Opened by BLKSerene over 5 years ago - 6 comments

#37 - Sentence tokenization and detokenization

Issue - State: closed - Opened by BLKSerene over 5 years ago - 6 comments

#36 - Add readthedocs style documentation

Pull Request - State: closed - Opened by 10zinten almost 6 years ago - 1 comment

#35 - How to initialize the tokenizer without the POS tagging feature?

Issue - State: closed - Opened by BLKSerene almost 6 years ago - 3 comments

#34 - Cache and reuse temporary files to speed up initialization

Issue - State: closed - Opened by BLKSerene almost 6 years ago - 6 comments

#33 - Remove trailing whitespace in tokens

Issue - State: closed - Opened by BLKSerene almost 6 years ago - 4 comments

#32 - Bopipeline

Pull Request - State: closed - Opened by drupchen almost 6 years ago - 1 comment

#31 - What's the tagset used by pybo?

Issue - State: closed - Opened by BLKSerene almost 6 years ago - 2 comments

#30 - CQLMatcher can not match last token

Issue - State: closed - Opened by kevinhuangtw almost 6 years ago - 2 comments

#29 - change toadd_filenames and todel_filenames to a folder path

Issue - State: closed - Opened by drupchen almost 6 years ago - 1 comment

#28 - sanskrit entries don't seem to be inflected

Issue - State: closed - Opened by drupchen almost 6 years ago - 1 comment

#27 - How to add my own dictionary

Issue - State: closed - Opened by CrystalWLH almost 6 years ago - 7 comments

#26 - add: adjust rule for ལ་ལ་ལ་ལ་

Pull Request - State: closed - Opened by 10zinten about 6 years ago

#25 - additional affix combinations

Issue - State: closed - Opened by eroux about 6 years ago - 1 comment

#24 - using unicode data

Issue - State: closed - Opened by eroux about 6 years ago - 2 comments

#23 - The resources for the frequency is not in the package

Issue - State: closed - Opened by thubtenrigzin over 6 years ago - 1 comment

#22 - integrate tests in setup.py

Issue - State: closed - Opened by eroux over 6 years ago

#21 - Missing syllables and punctuation

Issue - State: closed - Opened by thubtenrigzin over 6 years ago - 3 comments

#20 - default value for Token#pos

Issue - State: closed - Opened by drupchen over 6 years ago - 1 comment

#19 - symbol considered as token content

Issue - State: closed - Opened by drupchen over 6 years ago - 1 comment

#18 - tokenizer gives IndexError

Issue - State: closed - Opened by mikkokotila over 6 years ago - 5 comments
Labels: bug

#17 - word2vec implementation in Tibetan

Issue - State: closed - Opened by mikkokotila over 6 years ago - 9 comments

#16 - colibri for gramm'n

Issue - State: closed - Opened by mikkokotila over 6 years ago - 3 comments

#15 - handling genitive case (and maybe other cases too)

Issue - State: closed - Opened by mikkokotila over 6 years ago - 5 comments

#14 - Travis, README.md, etc update

Pull Request - State: closed - Opened by mikkokotila over 6 years ago - 2 comments

#13 - tests failing because of LemmatizeTokens().lemmatize(tokens)

Issue - State: closed - Opened by mikkokotila over 6 years ago - 2 comments

#12 - suggestion for token conventions

Issue - State: closed - Opened by mikkokotila over 6 years ago - 2 comments

#11 - int and bool

Issue - State: closed - Opened by ngawangtrinley over 6 years ago

#10 - NONE error when trying to match int or bool token attributes

Issue - State: open - Opened by ngawangtrinley over 6 years ago - 4 comments
Labels: help wanted

#9 - yaml fails to import

Issue - State: closed - Opened by mikkokotila over 6 years ago - 2 comments

#8 - tokenizer fails

Issue - State: closed - Opened by mikkokotila over 6 years ago - 6 comments