Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / rth/vtext issues and pull requests

#86 - Bump scipy from 1.4.1 to 1.10.0 in /ci

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#85 - Bump numpy from 1.17.3 to 1.22.0 in /ci

Pull Request - State: open - Opened by dependabot[bot] over 2 years ago
Labels: dependencies

#84 - Update dependencies

Pull Request - State: open - Opened by rth over 3 years ago

#83 - Treebank word tokenizer from NLTK

Pull Request - State: open - Opened by rth over 4 years ago

#82 - Feature/kskip-ngram

Pull Request - State: open - Opened by joshlk over 4 years ago - 10 comments

#81 - Use approx create for tests

Pull Request - State: closed - Opened by rth over 4 years ago

#80 - Fine-tune tokenizers

Issue - State: open - Opened by rth over 4 years ago

#79 - Standardize language option

Issue - State: open - Opened by rth over 4 years ago

#78 - Add StopWordFilter

Pull Request - State: open - Opened by rth over 4 years ago - 3 comments

#77 - Fix clippy warnings

Pull Request - State: closed - Opened by rth over 4 years ago - 2 comments

#76 - Improve error handling

Pull Request - State: closed - Opened by rth over 4 years ago

#75 - Renamed `UnicodeSegmentTokenizer` to `UnicodeWordTokenizer`.

Pull Request - State: closed - Opened by rth over 4 years ago

#74 - Add CHANGELOG.md

Pull Request - State: closed - Opened by rth over 4 years ago

#73 - Add pickling support for Python tokenizers

Pull Request - State: closed - Opened by rth over 4 years ago

#72 - Rename UnicodeSegmentTokenizer to UnicodeWordTokenizer

Issue - State: closed - Opened by rth over 4 years ago - 1 comment

#71 - Sentence tokenizers benchmarks

Pull Request - State: closed - Opened by rth over 4 years ago

#70 - Punctuation sentence tokenizer

Pull Request - State: closed - Opened by joshlk over 4 years ago - 9 comments

#69 - Update to PyO3 0.10 and rust-numpy 0.9

Pull Request - State: closed - Opened by rth over 4 years ago

#68 - Update rust version used in CI

Pull Request - State: closed - Opened by rth over 4 years ago

#67 - Sentence tokenization using Unicode segmentation (Python package)

Pull Request - State: closed - Opened by joshlk over 4 years ago - 3 comments

#66 - Sentence tokenization using Unicode segmentation

Pull Request - State: closed - Opened by joshlk over 4 years ago - 3 comments

#65 - MAINT Build wheels for Python 3.8

Pull Request - State: closed - Opened by rth over 4 years ago

#64 - MAINT Update dependencies

Pull Request - State: closed - Opened by rth almost 5 years ago

#63 - Make to_ascii_lowercase optional

Issue - State: open - Opened by technic about 5 years ago - 4 comments

#62 - BLD Build for the wasm target

Pull Request - State: closed - Opened by rth about 5 years ago - 1 comment

#61 - MAINT Make rayon dependency optional

Pull Request - State: closed - Opened by rth about 5 years ago

#60 - PY Implement get_params methods for tokenizers

Pull Request - State: closed - Opened by rth about 5 years ago

#59 - MNT Update dependencies versions

Pull Request - State: closed - Opened by rth about 5 years ago

#58 - TST Use hypothesis in python tests

Pull Request - State: closed - Opened by rth about 5 years ago

#57 - API Set parameters with the builder pattern

Pull Request - State: closed - Opened by rth over 5 years ago

#56 - Update to PyO3 0.7

Pull Request - State: closed - Opened by rth over 5 years ago - 3 comments

#55 - Parallel CountVectorizer

Pull Request - State: closed - Opened by rth over 5 years ago

#54 - TST add float_cmp crate for tests

Pull Request - State: closed - Opened by jbowles over 5 years ago - 1 comment

#53 - Tokenizers dispatch in vectorizers

Pull Request - State: closed - Opened by rth over 5 years ago - 1 comment

#52 - General architecture feedback

Issue - State: open - Opened by rth over 5 years ago - 2 comments

#51 - Add sentence splitter

Issue - State: closed - Opened by rth over 5 years ago - 8 comments
Labels: new feature

#50 - Better support of configuration parameters in vectorizers

Issue - State: closed - Opened by rth over 5 years ago - 2 comments

#49 - ENH Improve CountVectorizer performance

Pull Request - State: closed - Opened by rth over 5 years ago

#48 - Add tokenizer trait

Pull Request - State: closed - Opened by rth over 5 years ago - 2 comments

#47 - Migrate to PyO3 0.6.0

Pull Request - State: closed - Opened by rth over 5 years ago

#46 - ENH Avoid copying tokens in tokenizers in Python

Issue - State: closed - Opened by rth over 5 years ago - 1 comment
Labels: python, tokenization, performance

#45 - Add CharacterTokenizer

Pull Request - State: closed - Opened by rth over 5 years ago

#44 - Relicense under Apache license 2.0

Pull Request - State: closed - Opened by rth over 5 years ago

#43 - Add Levenshtein Edit distance

Pull Request - State: closed - Opened by rth over 5 years ago

#42 - DOC Add function signatures

Pull Request - State: closed - Opened by rth over 5 years ago

#41 - ENH Jaro Winkler similarity

Pull Request - State: closed - Opened by rth over 5 years ago

#40 - Character n-grams

Issue - State: open - Opened by rth over 5 years ago - 4 comments
Labels: new feature

#39 - Add Jaro similarity

Pull Request - State: closed - Opened by rth over 5 years ago

#38 - Add Sørensen-Dice string similarity

Pull Request - State: closed - Opened by rth over 5 years ago

#37 - Update python readme and rename python package

Pull Request - State: closed - Opened by rth over 5 years ago

#36 - Add documentation

Pull Request - State: closed - Opened by rth over 5 years ago

#35 - Build release wheels with LTO

Issue - State: open - Opened by rth over 5 years ago
Labels: build / CI, performance

#34 - CI Deploy wheels

Pull Request - State: closed - Opened by rth over 5 years ago

#33 - Add Bling Fire benchmarks

Pull Request - State: closed - Opened by rth over 5 years ago

#32 - MAINT Improve crate structure

Pull Request - State: closed - Opened by rth over 5 years ago

#31 - Better unicode support in tokenization rules

Issue - State: open - Opened by rth over 5 years ago - 1 comment
Labels: tokenization

#30 - Improve french tokenizer

Pull Request - State: closed - Opened by rth over 5 years ago

#29 - Tokenization evaluation script

Pull Request - State: closed - Opened by rth over 5 years ago

#28 - Add VTextTokenizer

Pull Request - State: closed - Opened by rth over 5 years ago

#27 - MAINT Replace fasthash dependency with seahash

Pull Request - State: closed - Opened by rth over 5 years ago

#26 - Add Azure Pipelines CI

Pull Request - State: closed - Opened by rth over 5 years ago

#25 - Make estimators picklables

Issue - State: open - Opened by rth over 5 years ago
Labels: python

#24 - Rename package text-vectorize to vtext

Pull Request - State: closed - Opened by rth over 5 years ago

#23 - Use hashbrown instead of fnv Hasher

Pull Request - State: closed - Opened by rth over 5 years ago

#22 - Snowball stemmer

Pull Request - State: closed - Opened by rth over 5 years ago

#21 - NLP pipeline design

Issue - State: open - Opened by rth over 5 years ago - 10 comments
Labels: help wanted

#20 - Parallel HashingVectorizer

Pull Request - State: closed - Opened by rth over 5 years ago - 1 comment

#19 - Return iterator in tokenizers

Pull Request - State: closed - Opened by rth over 5 years ago

#18 - Add Regexp tokenizer

Pull Request - State: closed - Opened by rth over 5 years ago

#17 - Add unicode tokenizer

Pull Request - State: closed - Opened by rth over 5 years ago

#16 - Use setuptools-rust instead of pyo3-pack

Pull Request - State: closed - Opened by rth over 5 years ago

#15 - Fix CI on master

Pull Request - State: closed - Opened by rth almost 6 years ago

#14 - Add preliminary CountVectorizer python wrapper

Pull Request - State: closed - Opened by rth almost 6 years ago

#13 - Use sprs library to represent sparse arrays

Pull Request - State: closed - Opened by rth almost 6 years ago

#12 - Optimize summing of duplicate tokens

Pull Request - State: closed - Opened by rth almost 6 years ago

#11 - Migrate to rust 2018 edition

Pull Request - State: closed - Opened by rth almost 6 years ago

#10 - Support different hash functions in HashingVectorizer

Issue - State: closed - Opened by rth almost 6 years ago - 2 comments

#9 - Build Python wheels for Linux

Pull Request - State: closed - Opened by rth almost 6 years ago

#8 - Use tox for testing

Pull Request - State: closed - Opened by rth almost 6 years ago

#7 - Set up CI with Azure Pipelines

Pull Request - State: closed - Opened by rth almost 6 years ago

#6 - Python wrappers

Issue - State: closed - Opened by rth almost 6 years ago - 1 comment
Labels: python

#5 - Add CI on Linux

Pull Request - State: closed - Opened by rth about 6 years ago

#4 - Implement IDF transforms

Issue - State: open - Opened by rth about 6 years ago
Labels: new feature

#3 - Multi-OS Python wheels

Issue - State: closed - Opened by rth about 6 years ago
Labels: python

#2 - Word n-grams

Issue - State: open - Opened by rth about 6 years ago
Labels: new feature

#1 - Preliminary Python wrapper for HashingVectorizer

Pull Request - State: closed - Opened by rth about 6 years ago