Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / diasks2/pragmatic_tokenizer issues and pull requests
#48 - Master dev/7 numbered lists
Pull Request -
State: closed - Opened by abrazzini almost 4 years ago
#47 - Master dev/2 strip tags
Pull Request -
State: closed - Opened by giovannelli almost 4 years ago
#46 - & symbol and URL's downcase
Pull Request -
State: closed - Opened by giovannelli almost 4 years ago
- 1 comment
#45 - Adding rules for tokenization of words with apostrophes in french
Pull Request -
State: closed - Opened by taha-yassine over 5 years ago
#44 - Replicated in Crystal
Issue -
State: open - Opened by watzon over 5 years ago
- 1 comment
#43 - Non-breaking spaces should STILL be spaces
Pull Request -
State: closed - Opened by wflanagan over 5 years ago
- 2 comments
#42 - downcase: false shouldn't mean upcase for contractions
Issue -
State: open - Opened by sheerun over 5 years ago
#41 - Contractions don't remove dots
Issue -
State: open - Opened by sheerun over 5 years ago
#40 - multiple slashes within a string not properly processed
Issue -
State: open - Opened by maia almost 6 years ago
#39 - speed improvements by optimisation of regular expressions
Pull Request -
State: closed - Opened by maia over 6 years ago
- 1 comment
#38 - lower memory usage by reducing object allocations
Pull Request -
State: closed - Opened by maia over 6 years ago
- 1 comment
#37 - NoMethodError (nil.length)
Issue -
State: closed - Opened by maia about 7 years ago
- 2 comments
#36 - fix deprecated warning for Ruby 2.4
Pull Request -
State: closed - Opened by mmacia over 7 years ago
- 3 comments
#35 - EMOJI_REGEX exception on JRuby
Issue -
State: open - Opened by Arvinje over 8 years ago
- 1 comment
#34 - stop words not replaceable
Issue -
State: closed - Opened by maia over 8 years ago
- 1 comment
Labels: duplicate
#33 - urls should not be downcased
Issue -
State: open - Opened by maia over 8 years ago
- 1 comment
Labels: bug, help wanted
#32 - long_word_split should not split emails, urls, twitter handles
Issue -
State: closed - Opened by maia over 8 years ago
- 1 comment
Labels: bug
#31 - stop words and filter languages
Issue -
State: closed - Opened by maia almost 9 years ago
- 2 comments
Labels: bug
#30 - unifying regex, using constants
Pull Request -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#29 - refactored PostProcessor
Pull Request -
State: closed - Opened by maia almost 9 years ago
- 5 comments
#28 - cleanup pre_processor.rb
Pull Request -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#27 - Speed
Issue -
State: closed - Opened by diasks2 almost 9 years ago
- 3 comments
Labels: enhancement
#26 - refactoring to style guide
Pull Request -
State: closed - Opened by maia almost 9 years ago
- 5 comments
#25 - Properly detect emoticons
Issue -
State: open - Opened by diasks2 almost 9 years ago
- 2 comments
Labels: enhancement, help wanted
#24 - characters test string
Issue -
State: closed - Opened by maia almost 9 years ago
- 2 comments
Labels: bug
#23 - mapping of similar characters (e.g. apostrophes)?
Issue -
State: open - Opened by maia almost 9 years ago
- 1 comment
Labels: enhancement
#22 - more specs
Issue -
State: closed - Opened by maia almost 9 years ago
- 2 comments
#21 - more specs
Issue -
State: closed - Opened by maia almost 9 years ago
- 2 comments
#20 - Identifying emojis by unicode ranges?
Issue -
State: closed - Opened by maia almost 9 years ago
- 4 comments
Labels: enhancement, question
#19 - Should all TLDs be whitelisted?
Issue -
State: open - Opened by diasks2 almost 9 years ago
- 1 comment
Labels: question
#18 - Definition of clean
Issue -
State: closed - Opened by diasks2 almost 9 years ago
- 2 comments
#17 - additional specs
Issue -
State: closed - Opened by maia almost 9 years ago
- 10 comments
#16 - splitting of words with # prefix at hyphen
Issue -
State: closed - Opened by maia almost 9 years ago
- 4 comments
#15 - classic_filter and non-acronyms
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#14 - single quotes return different result based on language setting
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#13 - remove_numbers should keep tokens that contain letters
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#12 - option :clean removes hashtags
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#11 - split long words
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#10 - three options for each kind of token
Issue -
State: closed - Opened by maia almost 9 years ago
- 5 comments
#9 - feature overlap with pragmatic_segmenter?
Issue -
State: open - Opened by maia almost 9 years ago
- 1 comment
Labels: question
#8 - Allow user to specify abbreviations and/or stop words to be used
Issue -
State: closed - Opened by diasks2 almost 9 years ago
- 1 comment
Labels: enhancement
#7 - slow loading time
Issue -
State: closed - Opened by maia almost 9 years ago
- 3 comments
Labels: enhancement, question
#6 - ActiveSupport::Multibyte::Chars causing NoMethodError
Issue -
State: closed - Opened by maia almost 9 years ago
- 5 comments
#5 - option to require only specific languages?
Issue -
State: open - Opened by maia almost 9 years ago
- 2 comments
Labels: enhancement
#4 - german contractions list
Issue -
State: closed - Opened by maia almost 9 years ago
- 2 comments
#3 - updated german abbreviations
Issue -
State: closed - Opened by maia almost 9 years ago
- 3 comments
#2 - options should (also) allow symbols
Issue -
State: closed - Opened by maia almost 9 years ago
- 1 comment
#1 - additional specs
Issue -
State: closed - Opened by maia almost 9 years ago
- 12 comments