Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / JuliaText/WordTokenizers.jl issues and pull requests

#64 - Unable to install WordTokenizers.jl

Issue - State: closed - Opened by ablaom 6 months ago - 2 comments

#63 - Optimize statistical unigram tokenizer `decode_forward`

Pull Request - State: open - Opened by aria42 over 2 years ago - 2 comments

#62 - Sentence Splitters: no sentence break in between two words with no punctuation

Pull Request - State: open - Opened by dhruvil410 over 3 years ago - 2 comments

#61 - Adding GPT2 Tokenizer for WordTokenizers' Pretrained tokenizers

Pull Request - State: open - Opened by shikhargoswami over 3 years ago - 1 comment

#59 - Interest in Improving Sentence Tokenization

Issue - State: open - Opened by TheCedarPrince over 3 years ago - 2 comments

#58 - Fix Typos and Indentation

Pull Request - State: closed - Opened by SambhawDrag over 3 years ago - 1 comment

#57 - Lowercasing each token in tokenize function

Issue - State: closed - Opened by shikhargoswami over 3 years ago - 3 comments

#56 - use a normal function in __init__ to initialize the data deps

Pull Request - State: closed - Opened by KristofferC about 4 years ago - 1 comment

#55 - InitError on julia 1.5

Issue - State: closed - Opened by chengchingwen about 4 years ago - 3 comments

#54 - Adopt ColPrac?

Pull Request - State: closed - Opened by oxinabox about 4 years ago - 1 comment

#53 - Update to version 0.5.5.

Pull Request - State: closed - Opened by Ayushk4 about 4 years ago - 1 comment

#52 - Release latest version

Issue - State: closed - Opened by tejasvaidhyadev about 4 years ago

#51 - Adding support for unigram sentencepiece model

Pull Request - State: closed - Opened by tejasvaidhyadev about 4 years ago - 14 comments

#50 - [WIP] Update README with JOSS Badge and Citation

Pull Request - State: open - Opened by Ayushk4 over 4 years ago

#49 - Install TagBot as a GitHub Action

Pull Request - State: closed - Opened by JuliaTagBot over 4 years ago

#48 - Update paper.md

Pull Request - State: closed - Opened by kthyng over 4 years ago

#47 - Update paper.bib

Pull Request - State: closed - Opened by kthyng over 4 years ago

#46 - Benchmark against Rust library

Issue - State: open - Opened by oxinabox over 4 years ago

#45 - Update paper based on JOSS review

Pull Request - State: closed - Opened by oxinabox over 4 years ago - 2 comments

#44 - Add statistical tokenization algorithms

Issue - State: closed - Opened by Ayushk4 over 4 years ago - 20 comments

#43 - Add installation guide to README

Pull Request - State: closed - Opened by Ayushk4 over 4 years ago - 1 comment

#42 - Change example setting tokenizer to TinySegmenter.jl's tokenizer

Pull Request - State: closed - Opened by Ayushk4 over 4 years ago - 1 comment

#41 - Fixing a number of typos in paper and readme

Pull Request - State: closed - Opened by leios over 4 years ago - 1 comment

#40 - Minor Fixes in JOSS paper

Pull Request - State: closed - Opened by Ayushk4 almost 5 years ago - 1 comment

#39 - very minor grammar fixes in README

Pull Request - State: closed - Opened by danielskatz almost 5 years ago - 1 comment

#38 - Sentence splitting of sentences without whitespace after period

Issue - State: open - Opened by oxinabox almost 5 years ago - 2 comments

#37 - Filtering the empty strings from substring array

Pull Request - State: open - Opened by RohitPingale almost 5 years ago - 4 comments

#36 - Add plot comparing speeds of tokenizers to JOSS paper.

Pull Request - State: closed - Opened by Ayushk4 about 5 years ago - 2 comments

#35 - Support and Contribution guidelines

Pull Request - State: closed - Opened by Ayushk4 about 5 years ago - 1 comment

#34 - JOSS paper update

Pull Request - State: closed - Opened by Ayushk4 about 5 years ago - 7 comments

#33 - Handle final periods

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 3 comments

#32 - split_sentences - handling spaces after "."

Issue - State: open - Opened by Ayushk4 over 5 years ago - 7 comments

#31 - Toktok fix patch

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 3 comments

#30 - Update for Julia-1.1

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 1 comment

#29 - Fix TokTok.jl

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 4 comments

#28 - Tokenize begins with full stop.

Issue - State: closed - Opened by haampie over 5 years ago - 1 comment

#27 - Julia 1.1

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 5 comments

#26 - Make a release

Issue - State: closed - Opened by oxinabox over 5 years ago

#25 - Fix inconsistency between tabs and spaces

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 2 comments

#24 - Fix sentence splitter: sentences ending with acronyms

Pull Request - State: closed - Opened by nickto over 5 years ago - 5 comments

#23 - appveyor badge fix

Pull Request - State: closed - Opened by aquatiko over 5 years ago - 1 comment

#22 - Fix indentation in nltk_word.jl

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 1 comment

#21 - Fix indentation.

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 2 comments

#20 - Minor doc fixes in fast.jl

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 2 comments

#18 - add toktok tokenizer

Pull Request - State: closed - Opened by aquatiko over 5 years ago - 17 comments

#17 - fix names

Pull Request - State: closed - Opened by aquatiko over 5 years ago - 2 comments

#16 - Minor Fix in docs

Pull Request - State: closed - Opened by rsdel2007 over 5 years ago - 1 comment

#15 - Add TokTok tokenizer

Issue - State: closed - Opened by oxinabox over 5 years ago - 7 comments
Labels: help wanted, good first issue

#14 - add reverse tokenizer

Pull Request - State: closed - Opened by aquatiko over 5 years ago - 13 comments

#13 - Add Tweet Tokenizer

Pull Request - State: closed - Opened by Ayushk4 over 5 years ago - 44 comments

#12 - Simple reversible tokenizer

Issue - State: closed - Opened by MikeInnes almost 6 years ago - 1 comment

#11 - `tokenize` API

Issue - State: open - Opened by MikeInnes almost 6 years ago - 9 comments

#10 - Make sed-based tokenisers 30x faster

Pull Request - State: closed - Opened by MikeInnes almost 6 years ago - 12 comments

#9 - Fix for 1.0

Pull Request - State: closed - Opened by AShedko almost 6 years ago - 4 comments

#8 - Fixes Julia 0.7 deprecation warnings

Pull Request - State: closed - Opened by Paethon about 6 years ago - 6 comments

#7 - Write paper for JOSS

Pull Request - State: closed - Opened by oxinabox about 6 years ago - 7 comments

#6 - Standardize spelling of "Tokenizer" with Z throughout the repo

Pull Request - State: closed - Opened by waldyrious about 6 years ago - 2 comments

#5 - [WIP] Port TokTok

Pull Request - State: closed - Opened by oxinabox about 6 years ago

#4 - 0.7 compat

Pull Request - State: closed - Opened by oxinabox about 6 years ago - 1 comment

#3 - Add a Twitter tokenizer

Issue - State: closed - Opened by oxinabox over 6 years ago - 8 comments

#2 - Minor docs change

Pull Request - State: closed - Opened by oxinabox over 6 years ago - 1 comment

#1 - Reexport RevTok.jl

Issue - State: closed - Opened by oxinabox over 6 years ago - 2 comments