Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / Helsinki-NLP/OpusFilter issues and pull requests
#75 - Opusfilter fails to compress data when it is downloaded via moses
Issue -
State: closed - Opened by thfrkielikone 5 months ago
- 3 comments
#74 - Cache behaviour
Issue -
State: closed - Opened by thfrkielikone 5 months ago
- 3 comments
#73 - Make some older libraries optional
Pull Request -
State: closed - Opened by svirpioj 5 months ago
#72 - Installing on fedora 40
Issue -
State: closed - Opened by thfrkielikone 6 months ago
- 5 comments
#71 - fix score method in SentenceEmbeddingFilter
Pull Request -
State: closed - Opened by svirpioj 8 months ago
#70 - SentenceEmbeddingFilter chunksize clashes with general chunksize
Issue -
State: closed - Opened by miau1 9 months ago
- 1 comment
Labels: bug
#69 - Issue with opus-fast-mosestokenizer dep for ARM-macs
Issue -
State: open - Opened by rggdmonk 9 months ago
- 3 comments
#68 - LMclassify always score 1
Issue -
State: closed - Opened by wuyangjian 10 months ago
- 3 comments
#67 - Add lingua-py support for language identification
Pull Request -
State: closed - Opened by svirpioj 11 months ago
#66 - Add support for fastspell for language identification
Issue -
State: open - Opened by marco-c about 1 year ago
#65 - Add lingua-py support for language identification
Pull Request -
State: closed - Opened by marco-c about 1 year ago
- 1 comment
#64 - Refactor autogen code
Pull Request -
State: closed - Opened by svirpioj about 1 year ago
Labels: enhancement
#63 - eflomal crashes during filtering
Issue -
State: open - Opened by yvesscherrer over 1 year ago
- 1 comment
#61 - Issue during installation
Issue -
State: closed - Opened by evramnarouz over 1 year ago
- 3 comments
Labels: installation
#60 - Add pyyaml to requirements
Issue -
State: closed - Opened by yvesscherrer over 1 year ago
- 1 comment
Labels: invalid
#59 - insufficient documentation
Issue -
State: closed - Opened by jairosg over 1 year ago
- 1 comment
Labels: documentation
#58 - Install eflomal from PyPI and use the new interface in WordAlignFilter
Pull Request -
State: closed - Opened by svirpioj over 1 year ago
#57 - switch to opus-fast-mosestokenizer
Pull Request -
State: closed - Opened by svirpioj over 1 year ago
#56 - Bump setuptools from 58.0.0 to 65.5.1
Pull Request -
State: closed - Opened by dependabot[bot] almost 2 years ago
- 1 comment
Labels: dependencies
#55 - build documentation with sphinx
Pull Request -
State: closed - Opened by svirpioj about 2 years ago
#54 - migrate docs to sphinx
Pull Request -
State: closed - Opened by BrightXiaoHan about 2 years ago
- 2 comments
#53 - Integration with MTData
Issue -
State: open - Opened by svirpioj about 2 years ago
Labels: enhancement
#52 - Better word alignment filter
Issue -
State: open - Opened by svirpioj about 2 years ago
- 1 comment
Labels: enhancement
#51 - Automatic configuration generation
Pull Request -
State: closed - Opened by svirpioj about 2 years ago
#50 - Improve handling whitespace in Jieba and MeCab tokenization
Pull Request -
State: closed - Opened by svirpioj over 2 years ago
#49 - feature: add parallel decorator for functions preprocess, score, and filter
Pull Request -
State: closed - Opened by BrightXiaoHan over 2 years ago
- 6 comments
#48 - fix jieba tokenize and detokenize funcs.
Pull Request -
State: closed - Opened by BrightXiaoHan over 2 years ago
- 2 comments
#47 - fix: missing the checker for param
Pull Request -
State: closed - Opened by BrightXiaoHan over 2 years ago
- 1 comment
Labels: bug
#46 - Process Killed
Issue -
State: closed - Opened by bayesrule over 2 years ago
- 2 comments
#45 - Add subword segmentation support
Pull Request -
State: closed - Opened by svirpioj over 2 years ago
Labels: enhancement
#44 - add SentenceEmbeddingFilter and ParallelNearestNeighbors model
Pull Request -
State: closed - Opened by svirpioj over 2 years ago
#43 - Add support for Japanese tokenization
Pull Request -
State: closed - Opened by svirpioj over 2 years ago
Labels: enhancement
#42 - add SimilarityFilter
Pull Request -
State: closed - Opened by svirpioj over 2 years ago
Labels: enhancement
#41 - Debug the configuration by export filtered corpus.
Issue -
State: closed - Opened by BrightXiaoHan almost 3 years ago
- 2 comments
Labels: question
#40 - allow per-language parameters for length filters
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
- 1 comment
Labels: enhancement
#39 - fix bug in classifier training and improve unit tests
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
Labels: bug
#38 - Specify different "unit" types in filters.
Issue -
State: closed - Opened by BrightXiaoHan almost 3 years ago
- 2 comments
Labels: enhancement
#37 - Version 2.3.0 breaks train_classifier function
Issue -
State: closed - Opened by wujameszj almost 3 years ago
- 1 comment
Labels: bug
#36 - add option to save scores in train_alignment
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
Labels: enhancement
#35 - add RepetitionFilter
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
Labels: enhancement
#34 - Is it possible to generate score file during training alignment model?
Issue -
State: closed - Opened by BrightXiaoHan almost 3 years ago
- 6 comments
Labels: enhancement
#33 - Add LMClassifierFilter
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
#32 - add MonolingualSentenceSplitter
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
#31 - Possible bug in word_alignment accept function
Issue -
State: closed - Opened by tomsbergmanis almost 3 years ago
- 5 comments
Labels: invalid
#30 - tokenizer ignored when creating align.priors
Issue -
State: closed - Opened by tomsbergmanis almost 3 years ago
- 2 comments
Labels: invalid
#29 - Add method-specific options for LanguageIDFilter
Pull Request -
State: closed - Opened by svirpioj almost 3 years ago
#28 - Use multicore to accelerate score, filter and tokenize processes.
Issue -
State: closed - Opened by BrightXiaoHan almost 3 years ago
- 5 comments
Labels: enhancement
#27 - add jieba tokenizer for Chinese
Pull Request -
State: closed - Opened by svirpioj about 3 years ago
- 1 comment
Labels: enhancement
#26 - opusfilter : command not found
Issue -
State: closed - Opened by Pkscode about 3 years ago
- 2 comments
Labels: installation
#25 - pandas<1.0.0 not supported in opusfilter>=2.0.0
Issue -
State: closed - Opened by svirpioj about 3 years ago
- 1 comment
Labels: bug
#24 - How to choose threshold for WordAlignFilter?
Issue -
State: closed - Opened by BrightXiaoHan about 3 years ago
#23 - add jieba tokenizer for Chinese corpus.
Pull Request -
State: closed - Opened by BrightXiaoHan about 3 years ago
- 5 comments
Labels: enhancement
#22 - Installation fails on Windows
Issue -
State: closed - Opened by aarnetalman about 3 years ago
- 1 comment
Labels: documentation
#21 - Installation using pip fails
Issue -
State: closed - Opened by aarnetalman about 3 years ago
- 5 comments
Labels: bug
#20 - Add support to fasttext for language detection
Pull Request -
State: closed - Opened by svirpioj over 3 years ago
#19 - Add suppress_prompts parameter for opus_read
Pull Request -
State: closed - Opened by radinplaid over 3 years ago
#18 - Add option to suppress download confirmation for "opus_read" (Issue #10)
Pull Request -
State: closed - Opened by radinplaid over 3 years ago
Labels: enhancement
#17 - add function for downloading a single file
Pull Request -
State: closed - Opened by svirpioj over 3 years ago
Labels: enhancement
#16 - restrict build-n-publish job to pushed tags
Pull Request -
State: closed - Opened by svirpioj over 3 years ago
#15 - fix build-n-publish job
Pull Request -
State: closed - Opened by svirpioj over 3 years ago
Labels: bug
#14 - Add support to fasttext for language detection (Develop branch)
Pull Request -
State: closed - Opened by kirianguiller over 3 years ago
- 4 comments
Labels: enhancement
#13 - Extended YAML configuration
Pull Request -
State: closed - Opened by svirpioj over 3 years ago
Labels: enhancement
#12 - Add support to fasttext for language detection
Pull Request -
State: closed - Opened by kirianguiller over 3 years ago
- 4 comments
Labels: enhancement
#11 - TypeError when processing ParaCrawl
Issue -
State: closed - Opened by lefterav over 3 years ago
- 1 comment
Labels: bug
#10 - Option to suppress download confirmation for "opus_read"
Issue -
State: closed - Opened by lefterav over 3 years ago
- 2 comments
Labels: enhancement
#9 - Tokenization behavior in WordAlignFilter
Issue -
State: closed - Opened by yvesscherrer over 3 years ago
- 4 comments
Labels: bug
#8 - Additional filter suggestion: remove lines with repeated content
Issue -
State: closed - Opened by yvesscherrer over 3 years ago
- 3 comments
Labels: enhancement
#7 - Language id filter comparison
Issue -
State: closed - Opened by yvesscherrer over 3 years ago
- 3 comments
Labels: enhancement
#6 - LanguageIDFilter filter error
Issue -
State: closed - Opened by virgulvirgul almost 4 years ago
- 2 comments
Labels: bug
#5 - use latest release if not provided
Pull Request -
State: closed - Opened by jbrry about 4 years ago
- 3 comments
Labels: enhancement
#4 - Option to keep blank lines
Issue -
State: closed - Opened by jbrry about 4 years ago
- 3 comments
Labels: enhancement
#3 - standardize_dataframe_scores receives empty data frame in classifier.py on nlingual-rebase branch
Issue -
State: closed - Opened by jbrry over 4 years ago
- 4 comments
Labels: bug
#2 - LM paths do not use output_directory
Issue -
State: closed - Opened by yvesscherrer over 4 years ago
- 2 comments
Labels: bug
#1 - Infinite scores from word alignment
Issue -
State: open - Opened by svirpioj almost 5 years ago
Labels: bug