rspeer/wordfreq issues and pull requests

#116 - Appreciation Thread

Issue - State: open - Opened by yorodm 9 days ago

#115 - [generic issue of support and well-wishing]

Issue - State: open - Opened by qilyn 11 days ago

#114 - Bump setuptools from 69.0.2 to 70.0.0

Pull Request - State: open - Opened by dependabot[bot] 3 months ago
Labels: dependencies

#113 - lemmatization?

Issue - State: closed - Opened by doctorcolossus 4 months ago - 1 comment

#112 - Data packs are not detected by PyInstaller

Issue - State: open - Opened by thelabcat 5 months ago - 1 comment

#111 - Bump ipython from 7.34.0 to 8.10.0

Pull Request - State: closed - Opened by dependabot[bot] 10 months ago - 1 comment
Labels: dependencies

#110 - Replace `pkg_resources` with `importlib.resources`

Pull Request - State: closed - Opened by xxyzz 12 months ago - 1 comment

#109 - Bump pygments from 2.13.0 to 2.15.0

Pull Request - State: closed - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#108 - Mix-up between Slovenian and Slovak in README.md

Issue - State: closed - Opened by tderflinger over 1 year ago - 1 comment

#107 - Rename fil language code to tl

Pull Request - State: closed - Opened by Vuizur over 1 year ago - 2 comments

#106 - Update Zenodo version to enable updated citation?

Issue - State: closed - Opened by scaperex almost 2 years ago - 1 comment

#105 - Add extras packages to `tool.poetry.dependencies` in pyproject.toml

Pull Request - State: closed - Opened by xxyzz about 2 years ago

#104 - newest version does not work?

Issue - State: closed - Opened by JINHXu about 2 years ago - 3 comments

#103 - pip install wordfreq[cjk] does not get extras

Issue - State: closed - Opened by KennyChenBasis about 2 years ago

#102 - Adding new language "Basque"

Issue - State: closed - Opened by Mikelhoya about 2 years ago - 3 comments

#101 - AttributeError: module 'regex' has no attribute 'Match'

Issue - State: closed - Opened by oseminck over 2 years ago - 1 comment

#100 - 📦 Unnecessary dependency on Mypy

Issue - State: closed - Opened by connorbrinton over 2 years ago - 1 comment

#99 - How to use Unidic for Japanese?

Issue - State: open - Opened by patarapolw over 2 years ago - 1 comment

#98 - Letter frequencies

Issue - State: closed - Opened by willwade over 2 years ago

#97 - Include license file in source distribution

Pull Request - State: closed - Opened by synapticarbors almost 3 years ago

#96 - Can the license file be packaged in as well?

Issue - State: closed - Opened by thewchan almost 3 years ago

#95 - Can't install wordfreq[cjk]. no matches found: wordfreq[cjk]

Issue - State: closed - Opened by Fraa-124 almost 3 years ago - 1 comment

#94 - English words containing this character "’" are not in the data base

Issue - State: closed - Opened by theRealProHacker over 3 years ago - 2 comments

#93 - Cannot import name top_n_list

Issue - State: open - Opened by yaoberh over 3 years ago

#92 - Use UNILEX ?

Issue - State: closed - Opened by hugolpz over 3 years ago - 2 comments

#91 - Version 2.5, incorporating OSCAR data

Pull Request - State: closed - Opened by rspeer over 3 years ago - 1 comment

#90 - Upload 2.4 and 2.4.1 to PyPI

Issue - State: closed - Opened by frankier over 3 years ago - 1 comment

#89 - Rework CJK dependencies and fix a tokenization bug

Pull Request - State: closed - Opened by rspeer over 3 years ago

#88 - work with langcodes 3.0, without language_data

Pull Request - State: closed - Opened by rspeer over 3 years ago

#87 - Frequency lists have high-ranked numbers

Issue - State: closed - Opened by Destaq over 3 years ago - 1 comment

#86 - Install fails

Issue - State: closed - Opened by lt20kmph almost 4 years ago - 2 comments

#85 - Cannot install on Windows 10, marisa-trie dependency error; plaintext data possible?

Issue - State: closed - Opened by andreskarjus almost 4 years ago - 1 comment

#84 - Update the "initial vowels" in French/Catalan

Pull Request - State: closed - Opened by Tahnan almost 4 years ago

#83 - Add åïö to our list of vowels

Pull Request - State: closed - Opened by jlowryduda almost 4 years ago

#82 - add Œ and œ to initial vowels

Pull Request - State: closed - Opened by LBeaudoux almost 4 years ago - 2 comments

#81 - Version 2.4 with updated data

Pull Request - State: closed - Opened by rspeer almost 4 years ago - 1 comment

#80 - Ensure consistent results around punctuation

Pull Request - State: closed - Opened by rspeer about 4 years ago

#79 - Inconsistent tokenization in Italian depending on the version of regex

Issue - State: closed - Opened by rspeer about 4 years ago - 1 comment

#78 - 'narrow no-break space' ("\u202f") is not recognized as a word boundary

Issue - State: open - Opened by LBeaudoux over 4 years ago

#77 - Fix regex's inconsistent word breaking around apostrophes

Pull Request - State: closed - Opened by rspeer over 4 years ago

#76 - so, can we training from private corpus?

Issue - State: closed - Opened by SeekPoint over 4 years ago - 1 comment

#75 - use langcodes 2.0 and deprecate 'match_cutoff'

Pull Request - State: closed - Opened by rspeer over 4 years ago

#74 - Fix code affected by a breaking change in msgpack 1.0

Pull Request - State: closed - Opened by Tahnan over 4 years ago

#73 - Add a mailmap

Pull Request - State: closed - Opened by rspeer almost 5 years ago

#72 - Suggestion: allow function for `minimum`

Issue - State: closed - Opened by gezakerecsenyi about 5 years ago - 1 comment

#71 - Fix a deprecation warning by using raw strings

Pull Request - State: closed - Opened by rspeer about 5 years ago

#70 - Fixes to scripts that accidentally run during tests

Pull Request - State: closed - Opened by rspeer over 5 years ago

#69 - Revert "Build with Pytest on Jenkins"

Pull Request - State: closed - Opened by moss over 5 years ago

#68 - Build with Pytest on Jenkins

Pull Request - State: closed - Opened by moss over 5 years ago

#67 - No `mecab-ipadic-utf8` on centos 7, how can I use wordfreq on Japanese in this case?

Issue - State: open - Opened by clairett over 5 years ago - 14 comments

#66 - Update msgpack parameter

Pull Request - State: closed - Opened by rspeer over 5 years ago - 1 comment

#65 - update msgpack parameter

Issue - State: closed - Opened by rspeer over 5 years ago

#64 - Allow a wider range of 'regex' versions

Pull Request - State: closed - Opened by rspeer almost 6 years ago

#63 - Regex version is incompatible with spaCy

Issue - State: closed - Opened by jlpeck almost 6 years ago - 2 comments

#62 - Update my name and the Zenodo citation

Pull Request - State: closed - Opened by rspeer almost 6 years ago

#61 - Argument to specify frequency source

Issue - State: open - Opened by glupyan about 6 years ago - 1 comment

#60 - Recognize "@" in gender-neutral word endings as part of the token

Pull Request - State: closed - Opened by rspeer about 6 years ago

#59 - Korean install fixes

Pull Request - State: closed - Opened by rspeer over 6 years ago

#58 - Round wordfreq output to 3 sig. figs, and update documentation

Pull Request - State: closed - Opened by rspeer over 6 years ago

#57 - Version 2.1

Pull Request - State: closed - Opened by rspeer over 6 years ago - 1 comment

#56 - Handle Japanese edge cases in `simple_tokenize`

Pull Request - State: closed - Opened by rspeer over 6 years ago - 1 comment

#55 - Version 2, with standalone text pre-processing

Pull Request - State: closed - Opened by rspeer over 6 years ago - 1 comment

#54 - Fix setup.py (version number and msgpack dependency)

Pull Request - State: closed - Opened by rspeer over 6 years ago

#53 - Updated setup.py

Pull Request - State: closed - Opened by ixxie over 6 years ago - 5 comments

#52 - Is there a way to use custom word lists?

Issue - State: closed - Opened by HatScripts over 6 years ago - 1 comment

#51 - Version 1.7: update tokenization, update Wikipedia data, add languages

Pull Request - State: closed - Opened by rspeer about 7 years ago - 2 comments

#50 - Tokenize by graphemes, not codepoints

Pull Request - State: closed - Opened by rspeer about 7 years ago

#49 - Use langcodes when tokenizing again

Pull Request - State: closed - Opened by rspeer over 7 years ago

#48 - Code review notes

Pull Request - State: closed - Opened by alin-luminoso over 7 years ago

#47 - All 1.6 changes

Pull Request - State: closed - Opened by rspeer over 7 years ago - 1 comment

#46 - Tokenize words such as "l'heure" the same way as "l'arc"

Pull Request - State: closed - Opened by rspeer almost 8 years ago

#45 - Describe how to cite wordfreq

Pull Request - State: closed - Opened by rspeer about 8 years ago

#44 - Allow MeCab to work in Japanese or Korean without the other

Pull Request - State: closed - Opened by rspeer about 8 years ago

#43 - Both Korean and Japanese dictionaries must be installed to use either

Issue - State: closed - Opened by alin-luminoso about 8 years ago - 1 comment

#42 - Look for MeCab dictionaries in various places besides this package

Pull Request - State: closed - Opened by rspeer about 8 years ago - 5 comments

#41 - Czech and Slovak

Issue - State: closed - Opened by rspeer about 8 years ago - 1 comment

#40 - Hungarian

Issue - State: closed - Opened by doublex about 8 years ago - 3 comments

#39 - Add Common Crawl data and more languages

Pull Request - State: closed - Opened by rspeer about 8 years ago

#38 - Tokenization in Korean, plus abjad languages

Pull Request - State: closed - Opened by rspeer about 8 years ago - 1 comment

#37 - Fix tokenization of SE Asian and South Asian scripts

Pull Request - State: closed - Opened by rspeer about 8 years ago - 1 comment

#36 - Inconsistent language-code strings lead to inconsistent normalization

Issue - State: closed - Opened by rspeer over 8 years ago - 1 comment

#35 - fix Arabic test, where 'lol' is no longer common

Pull Request - State: closed - Opened by rspeer over 8 years ago

#34 - wordfreq 1.4: some bigger wordlists, better use of language detection

Pull Request - State: closed - Opened by rspeer over 8 years ago - 3 comments

#33 - Restore a missing comma.

Pull Request - State: closed - Opened by alin-luminoso over 8 years ago

#32 - Leave Thai segments alone in the default regex

Pull Request - State: closed - Opened by rspeer over 8 years ago - 1 comment

#31 - Specify encoding when dealing with files

Pull Request - State: closed - Opened by slibs63 almost 9 years ago