Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / LanguageMachines/ucto issues and pull requests

#95 - Setting -m in container does not suppress punctuation-based sentence splitting

Issue - State: closed - Opened by pirolen 7 months ago - 4 comments
Labels: bug

#94 - add a batch option

Issue - State: closed - Opened by kosloot 7 months ago - 6 comments
Labels: enhancement, Testing...

#93 - Ucto fails on some UTF-8 characters in tei2folia generated FoLiA

Issue - State: closed - Opened by martinreynaert 7 months ago - 12 comments
Labels: bug

#92 - Implement (soft)hyphen handling in Ucto analogous to foliautils

Issue - State: open - Opened by kosloot about 1 year ago
Labels: enhancement, investigate

#91 - Develop a tokenizer for Premodern Slavic

Issue - State: open - Opened by pirolen about 1 year ago

#90 - Question: Concatenating word parts at soft hyphens

Issue - State: closed - Opened by pirolen over 1 year ago - 77 comments

#89 - Ucto aborts on FoLiA creation

Issue - State: closed - Opened by kosloot almost 2 years ago
Labels: bug

#88 - remove some deprecated options

Issue - State: open - Opened by kosloot about 2 years ago - 6 comments

#87 - Ucto with 'detectlanguages' : failure

Issue - State: closed - Opened by martinreynaert over 2 years ago - 3 comments

#86 - Language detection default for 'unknown' language

Issue - State: closed - Opened by martinreynaert over 2 years ago - 9 comments

#85 - IDs in UCTO in concert with tei2folia

Issue - State: closed - Opened by martinreynaert over 2 years ago - 3 comments

#84 - ucto sometimes misses out on the <t> for <p>

Issue - State: closed - Opened by martinreynaert about 3 years ago - 3 comments
Labels: Testing...

#82 - Tokenization of t-style element that has font_typeface Feature

Issue - State: closed - Opened by pirolen over 3 years ago - 19 comments

#81 - -T full option produces invalid FoLiA

Issue - State: closed - Opened by kosloot over 3 years ago - 1 comment
Labels: bug

#80 - is this correct handling of FoLiA paragraphs with embedded Part nodes?

Issue - State: closed - Opened by kosloot over 3 years ago - 4 comments
Labels: question

#79 - Byte-order mark followed by space or tab results in Folia error

Issue - State: closed - Opened by marijnschraagen almost 4 years ago - 7 comments
Labels: Testing...

#78 - Update debian package for v0.21

Issue - State: closed - Opened by proycon over 4 years ago
Labels: packaging, ready

#77 - ucto creates invalid folia

Issue - State: open - Opened by kosloot over 4 years ago - 2 comments
Labels: bug

#76 - passthru mode should not be combined with other language options

Issue - State: closed - Opened by kosloot almost 5 years ago
Labels: enhancement

#74 - interactive ucto seems broken

Issue - State: closed - Opened by kosloot about 5 years ago
Labels: bug

#73 - add a rule for ROMAN numbers

Issue - State: open - Opened by kosloot about 5 years ago
Labels: enhancement, question

#72 - problems with 'complex' file names using the -c option

Issue - State: closed - Opened by kosloot about 5 years ago - 3 comments
Labels: bug

#71 - ucto breaks on empty FoLiA comment

Issue - State: closed - Opened by proycon about 5 years ago - 1 comment
Labels: bug, low priority

#69 - No FoLiA outputfile created on MacOSX

Issue - State: closed - Opened by kosloot over 5 years ago - 2 comments
Labels: bug, investigate

#68 - Rerunning ucto on already tokenized FoLiA

Issue - State: closed - Opened by kosloot over 5 years ago - 6 comments
Labels: question

#67 - file iso639_3.foliaset is missing

Issue - State: closed - Opened by a-tsioh over 5 years ago - 2 comments

#66 - ucto should never create Words without an ID

Issue - State: closed - Opened by kosloot over 5 years ago - 4 comments
Labels: bug

#65 - Don't bail out on Paragraphs and Sentences in FoLiA, but always check for words

Issue - State: closed - Opened by kosloot over 5 years ago - 3 comments
Labels: bug, question

#64 - extra sentence wrongly added on strange folia input.

Issue - State: closed - Opened by kosloot over 5 years ago - 5 comments
Labels: bug

#63 - add rules for super/sub-script (formulae)

Issue - State: closed - Opened by kosloot over 5 years ago - 2 comments
Labels: enhancement, question

#62 - libtextcat works differently on MacOSX

Issue - State: closed - Opened by kosloot over 5 years ago
Labels: bug, investigate

#61 - installing issue on Mac OS

Issue - State: closed - Opened by alabrashJr almost 6 years ago - 2 comments

#60 - Document the existence of ucto FreeBSD package

Pull Request - State: closed - Opened by 0mp almost 6 years ago

#59 - Install target breaks: ln: /usr/local/share/ucto/textcat.cfg: Permission denied

Issue - State: closed - Opened by yurivict almost 6 years ago - 11 comments
Labels: duplicate

#58 - language detection should probably work on Sentence level.

Issue - State: closed - Opened by kosloot almost 6 years ago - 2 comments
Labels: enhancement

#57 - HTML Ampersand Character Codes

Issue - State: open - Opened by Irishx almost 6 years ago - 4 comments
Labels: enhancement

#56 - add tests for more languages

Issue - State: open - Opened by kosloot almost 6 years ago

#55 - double config files?

Issue - State: closed - Opened by Irishx almost 6 years ago - 2 comments

#54 - Latest ucto make check fails on Mac OS X

Issue - State: closed - Opened by proycon over 6 years ago - 2 comments
Labels: bug, PRIORITY

#53 - Extend the %MACRO% expansion in META-RULES to all rules

Issue - State: open - Opened by kosloot over 6 years ago - 2 comments
Labels: enhancement

#52 - Improve the include mechanism for uctodata files

Issue - State: open - Opened by kosloot over 6 years ago
Labels: enhancement

#51 - Tokenization of bracketed abbreviations is problematic

Issue - State: closed - Opened by kosloot over 6 years ago - 2 comments
Labels: Testing...

#50 - Ucto sentence splitting can cause FoLiA text redundancy errors

Issue - State: closed - Opened by proycon over 6 years ago - 4 comments
Labels: bug

#49 - Disable word tokenization

Issue - State: closed - Opened by emanjavacas over 6 years ago - 8 comments
Labels: Testing...

#48 - parsing very long integers takes exponential time

Issue - State: closed - Opened by kosloot over 6 years ago - 2 comments
Labels: Testing...

#47 - add possibility to add extra user-defined rules on startup

Issue - State: open - Opened by kosloot over 6 years ago
Labels: enhancement

#46 - Some edge cases for nld rules

Issue - State: closed - Opened by asharkinasuit over 6 years ago - 16 comments

#45 - cannot configure package

Issue - State: closed - Opened by emanjavacas over 6 years ago - 4 comments

#44 - enable alternative search paths for uctodata

Issue - State: closed - Opened by kosloot over 6 years ago - 7 comments
Labels: enhancement

#43 - turning off sentence detection fails

Issue - State: closed - Opened by Irishx over 6 years ago - 3 comments
Labels: question, investigate

#42 - Greek encoding

Issue - State: closed - Opened by JessedeDoes over 6 years ago - 6 comments
Labels: wrong package

#41 - Problem with labeled lists

Issue - State: closed - Opened by JessedeDoes over 6 years ago - 9 comments

#40 - assigning paragraphs to FoLiA structure elements, yes, no, maybe?

Issue - State: closed - Opened by kosloot almost 7 years ago - 1 comment
Labels: enhancement, question

#39 - Ucto attempts to double-append the same paragraph when processing tables

Issue - State: closed - Opened by proycon almost 7 years ago - 2 comments
Labels: bug, Testing...

#38 - ucto v0.9.9 does not include so-fix

Issue - State: closed - Opened by proycon almost 7 years ago
Labels: bug

#37 - Sentence detection breaks structure

Issue - State: closed - Opened by proycon almost 7 years ago - 3 comments
Labels: bug, PRIORITY, ready

#36 - Update debian package for ucto v0.14

Issue - State: closed - Opened by proycon almost 7 years ago - 9 comments
Labels: packaging, ready

#35 - TEXT VALIDATION ERROR (consistency)

Issue - State: closed - Opened by JessedeDoes almost 7 years ago - 18 comments
Labels: bug, wrong package

#34 - Difficulties with complex <t> contents

Issue - State: closed - Opened by JessedeDoes almost 7 years ago - 21 comments
Labels: bug

#33 - Release ucto v0.9.7?

Issue - State: closed - Opened by proycon almost 7 years ago - 1 comment

#32 - Retain --with-icu

Issue - State: closed - Opened by proycon about 7 years ago - 2 comments
Labels: Testing...

#31 - Add option for outputting text (t) on deepest level (w) only

Issue - State: closed - Opened by proycon about 7 years ago - 10 comments
Labels: enhancement, Testing...

#30 - Ucto does not use folia file extension to automatically assume folia input

Issue - State: closed - Opened by fkunneman about 7 years ago - 5 comments
Labels: enhancement, Testing...

#29 - Ucto fails on XML comments

Issue - State: closed - Opened by proycon about 7 years ago - 2 comments
Labels: bug

#28 - Utterance processing, don't do paragraph detection inside utterances

Issue - State: closed - Opened by proycon about 7 years ago - 4 comments
Labels: bug, PRIORITY

#27 - unsupported language 'eng'

Issue - State: closed - Opened by emanjavacas about 7 years ago - 2 comments

#26 - REGEXP support not available

Issue - State: closed - Opened by emanjavacas about 7 years ago - 3 comments

#25 - Ucto fails to tokenise certain folia input?

Issue - State: closed - Opened by proycon about 7 years ago - 5 comments
Labels: bug

#24 - Feature request: Rule type applied prior to whitespace tokenization, to allow protecting token sequences

Issue - State: closed - Opened by etashru over 7 years ago - 2 comments
Labels: enhancement, wontfix, low priority

#23 - Tokenize ALL FoLiA elements that carry text

Issue - State: closed - Opened by kosloot over 7 years ago - 1 comment

#22 - detectlanguages should detect languages in FoLiA

Issue - State: open - Opened by kosloot over 7 years ago
Labels: enhancement

#21 - Unable to load shared libraries

Issue - State: closed - Opened by limogin over 7 years ago - 3 comments

#20 - Ucto crashes on overly long word strings

Issue - State: closed - Opened by martinreynaert over 7 years ago - 5 comments
Labels: enhancement

#19 - ucto slow on very long lines?

Issue - State: closed - Opened by kosloot over 7 years ago - 2 comments
Labels: investigate

#18 - Solve ucto/uctodata upgrade conflicts for debian packaging!

Issue - State: closed - Opened by proycon over 7 years ago - 1 comment
Labels: bug, PRIORITY, packaging

#17 - Limit network transfers, add `ccache`

Pull Request - State: closed - Opened by sanmai-NL over 7 years ago - 4 comments

#16 - Date tagging can be improved

Issue - State: closed - Opened by sanmai-NL over 7 years ago - 7 comments

#15 - Check code quality during CI

Pull Request - State: closed - Opened by sanmai-NL over 7 years ago - 11 comments

#14 - Refactor `Setting::read`

Issue - State: closed - Opened by sanmai-NL over 7 years ago - 3 comments
Labels: enhancement, question

#13 - update debian package for v0.9.4

Issue - State: closed - Opened by proycon over 7 years ago
Labels: packaging, waiting

#12 - misplaced uctodata warning for tokconfig-generic configuration

Issue - State: closed - Opened by proycon almost 8 years ago - 2 comments
Labels: bug

#11 - segfault on an empty line when --passthru and -m options are used

Issue - State: closed - Opened by kosloot almost 8 years ago - 1 comment
Labels: bug

#10 - Decide on one encoding scheme for language

Issue - State: closed - Opened by kosloot almost 8 years ago - 1 comment

#9 - Update debian package for v0.9.3 release

Issue - State: closed - Opened by proycon about 8 years ago - 2 comments
Labels: packaging, waiting, ready

#8 - Problem with PUNCTUATION-MULTI-DOT, ucto hangs

Issue - State: closed - Opened by proycon about 8 years ago - 1 comment
Labels: bug

#7 - separate data from code

Issue - State: closed - Opened by kosloot over 8 years ago - 3 comments
Labels: enhancement, investigate

#6 - Multi label rules

Issue - State: closed - Opened by kosloot over 8 years ago - 1 comment
Labels: enhancement, low priority

#5 - combinations of words/numbers with abbreviations are incorrectly handled

Issue - State: closed - Opened by kosloot over 8 years ago - 1 comment
Labels: bug

#4 - Abbreviations at the end of sentence not handled correctly

Issue - State: closed - Opened by mhkuu over 8 years ago - 1 comment
Labels: bug, wontfix

#3 - Handling of abbreviations followed by punctuation goes awry

Issue - State: closed - Opened by mhkuu over 8 years ago - 1 comment
Labels: bug

#2 - Manual is outdated

Issue - State: closed - Opened by kosloot over 8 years ago - 6 comments
Labels: enhancement

#1 - Autoconf template has errors (with autoconf 2.69)

Issue - State: closed - Opened by sanmai-NL over 9 years ago - 5 comments