Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / nltk/nltk issues and pull requests
#3243 - Questions about Copilot + Open Source Software Hierarchy
Issue -
State: closed - Opened by liaochris 6 months ago
#3242 - Formatted code with black=24.3.0 in codebase
Pull Request -
State: closed - Opened by jeslinpjames 6 months ago
- 2 comments
Labels: corpus, tagger, parsing, stem/lemma
#3241 - UTF-8 codec can't decode byte 0xe9 in position 122
Issue -
State: open - Opened by ikrammohamdi 7 months ago
#3240 - Add support for disabling the sorting and list creation for WordNet object relation methods
Pull Request -
State: open - Opened by bryant1410 7 months ago
- 14 comments
Labels: corpus
#3239 - stem accuracy
Issue -
State: closed - Opened by Moustafa1Rizk1 7 months ago
- 2 comments
#3238 - It would be nice to have a mapping from arpabet to IPA for the cmudict
Issue -
State: open - Opened by fcbond 7 months ago
- 3 comments
#3237 - ci: bump action versions
Pull Request -
State: closed - Opened by purificant 7 months ago
Labels: CI
#3236 - Best NLTK books
Issue -
State: open - Opened by StepHaze 7 months ago
#3235 - Reversed y labels in dispersion_plot
Issue -
State: open - Opened by kvmilos 7 months ago
- 2 comments
#3234 - i want to write python script i have italian text files that who i verify my word in italian dictionery please solve
Issue -
State: closed - Opened by imtiaz231 7 months ago
- 1 comment
#3233 - 10x Faster Levenshtein Distances
Pull Request -
State: open - Opened by ashvardanian 7 months ago
- 3 comments
Labels: metrics
#3232 - Fraction object creation fails with extra kwargs in bleu_score.py
Issue -
State: closed - Opened by destroy-lonely 8 months ago
- 2 comments
#3231 - Make WordNet's synset relations available from the lemmas
Pull Request -
State: closed - Opened by ekaf 8 months ago
- 2 comments
Labels: corpus
#3230 - Fix #3124- bug with PickleCorpusView raising UnicodeDecodeError
Pull Request -
State: closed - Opened by Ubadub 8 months ago
- 3 comments
Labels: corpus
#3229 - Add reference to entropy implementation used
Pull Request -
State: closed - Opened by mbauwens 8 months ago
- 3 comments
Labels: language-model
#3228 - module 'nltk' has no attribute 'data
Issue -
State: closed - Opened by peronc 9 months ago
- 2 comments
#3227 - A potential edge case for WordNetLemmatizer.lemmatize()
Issue -
State: closed - Opened by bowenyi-umich 9 months ago
- 1 comment
#3226 - import error with numpy 1.24.4
Issue -
State: closed - Opened by mcdominik 9 months ago
- 3 comments
#3225 - Avoid recursive suffix stripping in wordnet morphy
Pull Request -
State: closed - Opened by ekaf 9 months ago
- 3 comments
Labels: corpus, stem/lemma
#3224 - fix for word_tokenize() Failing to Split English Contractions When Followed by [\t\n\f\r]
Pull Request -
State: closed - Opened by Higgs32584 9 months ago
- 9 comments
Labels: tokenizer
#3222 - Implement vocabulary introduction for texttiling
Pull Request -
State: open - Opened by Syzygy2048 9 months ago
Labels: tokenizer
#3221 - add workaround for cache sometimes not being restored correctly on macos
Pull Request -
State: closed - Opened by purificant 9 months ago
Labels: CI
#3220 - Not able to download the NLTK data module (python as well as manual download)
Issue -
State: closed - Opened by subhra-ranjan-padhy 9 months ago
- 2 comments
#3219 - upgrade automated code checks, part 2
Pull Request -
State: closed - Opened by purificant 10 months ago
- 1 comment
Labels: corpus, tokenizer, tagger, parsing, stem/lemma, classifier, GUI, twitter, cluster, metrics, internals
#3218 - Silence verbose warnings in closure
Pull Request -
State: closed - Opened by ekaf 10 months ago
- 6 comments
#3217 - upgrade automated code checks
Pull Request -
State: closed - Opened by purificant 10 months ago
- 1 comment
Labels: classifier
#3216 - sunset python 3.7
Pull Request -
State: closed - Opened by purificant 10 months ago
Labels: CI
#3215 - quickfix syntax / typo
Pull Request -
State: closed - Opened by purificant 10 months ago
- 1 comment
Labels: metrics
#3214 - ci: update labeler to v5, change config file to new format
Pull Request -
State: closed - Opened by purificant 10 months ago
Labels: CI
#3213 - ci: update actions
Pull Request -
State: closed - Opened by purificant 10 months ago
Labels: CI
#3212 - Dispersion Plot was not populating in correct order on Y axis. I have corrected that order. Please use the below code in dispersion.py file.
Issue -
State: closed - Opened by DS3006 10 months ago
- 2 comments
#3211 - KneserNeyInterpolated has problem with OOV words during testing and perplexity is always inf
Issue -
State: open - Opened by nilinykh 10 months ago
- 7 comments
#3210 - `TreebankWordDetokenizer().detokenize()` introduces unexpected spaces before periods.
Issue -
State: open - Opened by Alnusjaponica 10 months ago
#3209 - Refactor LanguageModel class, adding split functionality and unit tests
Pull Request -
State: open - Opened by venkat1924 10 months ago
#3208 - Tokenizer punkt zip file sometimes does not unpackage
Issue -
State: open - Opened by ryonsteele 10 months ago
#3207 - fix: enable py 3.12 in ci and fix error in bleu calculation
Pull Request -
State: closed - Opened by k4black 11 months ago
- 17 comments
Labels: CI
#3206 - Bug in nltk.draw.dispersion_plot with nltk 3.8.1, matplotlib-base 3.8.0, matplotlib-inline 0.1.6 and numpy 1.26
Issue -
State: closed - Opened by m-d-grunnill 11 months ago
- 2 comments
#3205 - Prevent crash on BLEU if weights are np array
Pull Request -
State: closed - Opened by tomaarsen 11 months ago
#3204 - `corpus_bleu` function does not catch all the expections when calling `weights[0][0]`
Issue -
State: closed - Opened by zhaochenyang20 11 months ago
- 3 comments
#3203 - Make sure that we invoke all the intended regex patterns in ToktokTokenizer...
Pull Request -
State: closed - Opened by alexrudnick 11 months ago
- 3 comments
Labels: tokenizer
#3202 - ToktokTokenizer doesn't call one of the included replacement patterns and thus doesn't tokenize some punctuation, like opening guillemets
Issue -
State: closed - Opened by alexrudnick 11 months ago
- 1 comment
#3201 - fix broken link to the Coding Horror blog post in CONTRIBUTING.md
Pull Request -
State: closed - Opened by alexrudnick 11 months ago
- 1 comment
#3200 - Import of Trie fails in mwe.py
Issue -
State: closed - Opened by passionate-zebracorn 11 months ago
- 1 comment
#3199 - Fix dunning log likelihood ValueError
Pull Request -
State: closed - Opened by vivekkalyan 11 months ago
- 1 comment
Labels: tokenizer
#3198 - NLTK is considering "hi" and "hello" as a noun.
Issue -
State: closed - Opened by RishitAtwal 11 months ago
- 4 comments
#3197 - NLTK thinks `turn` is a noun when it shoud be a verb.
Issue -
State: closed - Opened by alf1e 11 months ago
- 1 comment
#3196 - Problems Running Examples Starting with Babelize
Issue -
State: closed - Opened by mdebellis 11 months ago
- 1 comment
#3195 - Add a function of splitting combined words.
Issue -
State: open - Opened by wxz 12 months ago
#3194 - Unable to download Stopwords and also unable to access stopwords zip file manually.
Issue -
State: closed - Opened by mdabdulrahman 12 months ago
- 2 comments
#3193 - Add support for a `sort` argument in WordNet methods
Issue -
State: closed - Opened by bryant1410 12 months ago
- 22 comments
Labels: enhancement
#3192 - Trouble with installation importing nltk
Issue -
State: closed - Opened by davidam 12 months ago
- 1 comment
#3191 - Potential Regex Denial of Service (ReDoS)
Issue -
State: open - Opened by ready-research almost 1 year ago
#3190 - minor fix for wordnet lemmatization pos param documentation
Pull Request -
State: closed - Opened by sharpblade4 about 1 year ago
- 1 comment
Labels: stem/lemma
#3189 - word_tokenize() Failed to Split English Contractions When Followed by [\t\n\f\r]
Issue -
State: closed - Opened by donglihe-hub about 1 year ago
- 3 comments
#3188 - Update Penn POS descriptions in chunkparser_app.py
Pull Request -
State: closed - Opened by nathanjmcdougall about 1 year ago
- 1 comment
Labels: GUI
#3187 - not download punkt
Issue -
State: open - Opened by NIRA02525 about 1 year ago
- 6 comments
#3186 - Missing English words in words()
Issue -
State: open - Opened by BaGRoS about 1 year ago
- 4 comments
#3185 - Download somehow blocked
Issue -
State: closed - Opened by sjkoelle about 1 year ago
- 1 comment
#3184 - In CoreNLPParser, how can I get output as different formats, e.g., 'wordsAndTags' or 'typedDependencies'
Issue -
State: open - Opened by Lopa07 about 1 year ago
#3183 - Refactoring
Pull Request -
State: closed - Opened by tosemml about 1 year ago
- 2 comments
Labels: corpus, classifier, metrics
#3182 - Formatargspec Warning in import line
Issue -
State: open - Opened by nvenkatcivil about 1 year ago
#3181 - edit_distance_align() in distance.py gives wrong alignment path when substitution_cost is greater than 2
Issue -
State: open - Opened by yzhaoinuw about 1 year ago
#3180 - Bug on edit distance align
Pull Request -
State: open - Opened by yzhaoinuw about 1 year ago
Labels: metrics
#3179 - Incorrect documentation in nltk.stem.lancaster.LancasterStemmer class
Issue -
State: open - Opened by Smeetp1234 about 1 year ago
#3178 - Lepor : A machine translation evaluation Metric.
Pull Request -
State: open - Opened by ulhaqi12 about 1 year ago
- 11 comments
Labels: enhancement, nice idea, translate
#3177 - I tried everything and still I get: [nltk_data] Error loading taggers: Package 'taggers' not found in [nltk_data] index
Issue -
State: closed - Opened by venturaEffect about 1 year ago
- 24 comments
Labels: installation
#3176 - Adding LEPOR - A machine translation evaluation metric.
Issue -
State: open - Opened by ulhaqi12 about 1 year ago
- 3 comments
#3175 - Update CITATION.cff metadata
Pull Request -
State: closed - Opened by cgobat about 1 year ago
- 5 comments
#3174 - Remove redundant function call (demo())
Pull Request -
State: closed - Opened by cootshk about 1 year ago
- 1 comment
#3173 - punkt model for Arabic
Issue -
State: open - Opened by abdollahpour over 1 year ago
#3172 - Align text.ConcordanceIndex.find_concordance()
Pull Request -
State: closed - Opened by BroMattMiller over 1 year ago
- 1 comment
#3171 - ImportError: mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')
Issue -
State: open - Opened by HuaYuXiao over 1 year ago
- 3 comments
#3170 - AttributeError: module 'numpy' has no attribute 'int'.
Issue -
State: open - Opened by Abonia1 over 1 year ago
- 3 comments
#3169 - AttributeError: module 'numpy' has no attribute 'int'.
Issue -
State: closed - Opened by Anandrajgit over 1 year ago
#3168 - '18. Averaged Perceptron Tagger' is free for commercial use?
Issue -
State: open - Opened by goseaplay over 1 year ago
#3167 - <urlopen error [Errno 99] Cannot assign requested address>
Issue -
State: open - Opened by sunddytwo over 1 year ago
#3166 - Dear Jan Strunk
Issue -
State: closed - Opened by hiDevman over 1 year ago
- 2 comments
#3165 - a lot of NLTK DATA does not express their license
Issue -
State: open - Opened by hiDevman over 1 year ago
#3164 - Can not download stopwords corpus from Ubuntu 22.04 + OpenSSL 3.0.2
Issue -
State: open - Opened by rafasimionato over 1 year ago
#3163 - Broken GitHub Actions' Cache
Issue -
State: closed - Opened by zakkie over 1 year ago
- 3 comments
#3162 - Use efficient ngrams implementation from python docs
Pull Request -
State: closed - Opened by rmalouf over 1 year ago
- 1 comment
#3161 - Fix race condition for directory creation
Pull Request -
State: closed - Opened by zakkie over 1 year ago
- 1 comment
#3160 - dispersion plot shows incorrect data when multiple words selected
Issue -
State: open - Opened by SamuelSilverio123 over 1 year ago
- 2 comments
#3159 - MissingCorpusError
Issue -
State: open - Opened by Priya8888 over 1 year ago
#3158 - Make sure dispersion plot spans full text length
Pull Request -
State: closed - Opened by dlukes over 1 year ago
- 2 comments
#3157 - Uncontrolable token len reducing in Cyrillic texts
Issue -
State: open - Opened by NovikovMS over 1 year ago
- 1 comment
#3156 - Solve breaking issue with CharTokenizer
Pull Request -
State: closed - Opened by tomaarsen over 1 year ago
Labels: bug, tokenizer
#3155 - Class 'CharTokenizer' is missing attribute '_string'
Issue -
State: closed - Opened by Wilscos over 1 year ago
#3154 - 3.8.1: sphinx warnings `reference target not found`
Issue -
State: open - Opened by kloczek over 1 year ago
- 1 comment
Labels: documentation, enhancement
#3153 - Update punkt.py
Pull Request -
State: open - Opened by VirgisM over 1 year ago
- 1 comment
Labels: tokenizer
#3152 - CategorizedMarkdownCorpusReader.sections() does not return final section of markdown
Issue -
State: open - Opened by nkuehnle over 1 year ago
- 1 comment
#3151 - Allow resizing the nltk.download() GUI columns
Pull Request -
State: closed - Opened by tomaarsen over 1 year ago
Labels: bug
#3150 - Fix order of words in y-axis of dispersion_plot
Pull Request -
State: closed - Opened by chbrandt over 1 year ago
- 2 comments
#3149 - TclError resizing download dialog table column
Issue -
State: closed - Opened by E-Paine over 1 year ago
#3148 - added the option to change the wordnet's language
Pull Request -
State: closed - Opened by TiMauzi over 1 year ago
- 1 comment
#3147 - Add Swahili Stopwords
Issue -
State: open - Opened by OmondiVincent over 1 year ago
#3146 - not able to download the nltk data in my macbook
Issue -
State: open - Opened by maadhur over 1 year ago
- 7 comments
#3145 - Set reachable depth for generate
Pull Request -
State: closed - Opened by ekaf over 1 year ago
- 3 comments
Labels: parsing
#3144 - using nltk.txt for downloading stopwords
Issue -
State: open - Opened by ngingihy over 1 year ago
- 3 comments
#3143 - Detection of names with NLTK in Spanish
Issue -
State: closed - Opened by cporrasn over 1 year ago
- 1 comment