Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / oscar-project/ungoliant issues and pull requests

#132 - [BUG] download malfunctioning

Issue - State: closed - Opened by kargaranamir 8 months ago - 1 comment
Labels: bug

#131 - [BUG] corrupt deflate stream

Issue - State: open - Opened by kargaranamir 8 months ago
Labels: bug

#130 - [BUG] UnexpectedEof While running Ungoliant Pipeline

Issue - State: closed - Opened by nattkorat 9 months ago - 3 comments
Labels: bug

#129 - Updated dependencies

Pull Request - State: open - Opened by pjox 12 months ago

#128 - chore(deps): bump rustix from 0.37.19 to 0.37.25

Pull Request - State: open - Opened by dependabot[bot] about 1 year ago
Labels: dependencies

#127 - chore(deps): bump webpki from 0.22.0 to 0.22.2

Pull Request - State: open - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#126 - chore(deps): bump sha2 from 0.9.9 to 0.10.8

Pull Request - State: open - Opened by dependabot[bot] about 1 year ago - 1 comment
Labels: dependencies

#125 - lint fixes

Pull Request - State: open - Opened by chris-ha458 about 1 year ago

#124 - Optional corpus checksum + have language folders rather than flat files

Pull Request - State: closed - Opened by Uinelj about 1 year ago - 1 comment

#123 - [Feature request] Add larger timeouts between 503 retries for CC download

Issue - State: open - Opened by Uinelj over 1 year ago
Labels: enhancement

#122 - chore(deps): bump rustls-webpki from 0.100.1 to 0.100.2

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#121 - chore(deps): bump sha2 from 0.9.9 to 0.10.7

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: dependencies

#120 - chore(deps): bump oscar-io from 0.2.4 to 0.4.0

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#119 - chore(deps): bump env_logger from 0.8.4 to 0.9.3

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#118 - chore(deps): bump sha-1 from 0.9.8 to 0.10.1

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#117 - chore(deps): bump log from 0.4.18 to 0.4.20

Pull Request - State: open - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#116 - Add Dockerfile

Pull Request - State: open - Opened by Uinelj over 1 year ago - 2 comments

#115 - ci: add cargo dist CI file and Cargo.toml configuration

Pull Request - State: closed - Opened by Uinelj over 1 year ago - 1 comment

#114 - Add splitting and compression at ungoliant runtime

Pull Request - State: closed - Opened by Uinelj over 1 year ago - 1 comment

#113 - chore(deps): bump tokio-util from 0.6.10 to 0.7.8

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#112 - chore(deps): bump tokio from 1.28.2 to 1.29.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#110 - WIP 3.0.0

Pull Request - State: open - Opened by Uinelj over 1 year ago - 1 comment

#109 - Change sha256 to sha384

Pull Request - State: closed - Opened by chris-ha458 over 1 year ago - 2 comments

#108 - [Feature request] Secure against length extension attacks

Issue - State: open - Opened by chris-ha458 over 1 year ago - 4 comments
Labels: enhancement

#107 - Add information about other langID models

Pull Request - State: closed - Opened by Uinelj over 1 year ago - 3 comments

#106 - [Feature request] Document how to set fasttext model

Issue - State: open - Opened by chris-ha458 over 1 year ago - 2 comments
Labels: enhancement

#105 - Download errors with Ungoliant

Pull Request - State: closed - Opened by pjox over 1 year ago - 1 comment

#104 - Fix spelling in readme

Pull Request - State: closed - Opened by Force1ess over 1 year ago - 2 comments

#103 - Fix language tags on NLLB model

Pull Request - State: closed - Opened by Uinelj over 1 year ago - 1 comment

#102 - chore(deps): bump csv from 1.2.0 to 1.2.2

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#101 - chore(deps): bump h2 from 0.3.15 to 0.3.17

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: dependencies

#100 - chore(deps): bump tokio-util from 0.6.10 to 0.7.7

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#99 - chore(deps): bump csv from 1.2.0 to 1.2.1

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#98 - chore(deps): bump futures-util from 0.3.26 to 0.3.28

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: dependencies

#97 - chore(deps): bump futures-core from 0.3.26 to 0.3.28

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 1 comment
Labels: dependencies

#96 - chore(deps): bump tokio from 1.25.0 to 1.26.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago - 2 comments
Labels: good first issue, dependencies

#95 - fix "No language tag found" on NLLB tags

Pull Request - State: closed - Opened by Uinelj over 1 year ago - 1 comment

#94 - [BUG] Error when downloading full CC snapshot

Issue - State: closed - Opened by ngan-nt almost 2 years ago - 4 comments
Labels: bug

#93 - [BUG] Deduplication with Ungoliant

Issue - State: open - Opened by Hammamwa47 almost 2 years ago - 1 comment
Labels: bug

#92 - Automatically add binaries on releases

Issue - State: open - Opened by Uinelj almost 2 years ago
Labels: enhancement, good first issue

#91 - Improve coverage

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#90 - fix tarpaulin CI

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#89 - chore(deps): bump tokio from 1.17.0 to 1.18.5

Pull Request - State: closed - Opened by dependabot[bot] almost 2 years ago - 1 comment
Labels: dependencies

#88 - Option to keep documents that can't be identified

Issue - State: open - Opened by Uinelj almost 2 years ago - 1 comment
Labels: enhancement

#87 - test: uncomment tests and add one more

Pull Request - State: closed - Opened by Uinelj almost 2 years ago

#86 - Move TLSH out of annotations

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#85 - Change `annotation` to `quality_warnings`

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#84 - chore(deps): bump bumpalo from 3.9.1 to 3.12.0

Pull Request - State: closed - Opened by dependabot[bot] almost 2 years ago - 2 comments
Labels: dependencies

#83 - Move IO out of Ungoliant

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#82 - refactor: remove old pipelines, old io code and old langtags

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#81 - tests: add tests for quality at a glance langtag mismatches

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#80 - Removal of custom domain blocklists from the CLI

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#79 - Blocklists checklist

Issue - State: closed - Opened by Uinelj almost 2 years ago

#78 - chore(deps): bump tokio from 1.17.0 to 1.18.4

Pull Request - State: closed - Opened by dependabot[bot] almost 2 years ago - 1 comment
Labels: dependencies

#77 - [BUG] Cannot install via cargo

Issue - State: closed - Opened by new5558 almost 2 years ago - 3 comments
Labels: bug

#76 - feat(blocklists): ability to use multiple blocklists

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#75 - Configuration file for `ungoliant pipeline`

Issue - State: open - Opened by Uinelj almost 2 years ago

#74 - Avoid creating Blocklist for each shard

Issue - State: closed - Opened by Uinelj almost 2 years ago - 1 comment
Labels: good first issue

#73 - Fix coverage

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 1 comment

#72 - KenLM based content detection

Pull Request - State: closed - Opened by Uinelj almost 2 years ago - 2 comments

#71 - Fix bug in MeanLength filter

Pull Request - State: closed - Opened by sadra-barikbin about 2 years ago - 1 comment

#70 - Bug in `MeanLength` filter

Issue - State: closed - Opened by sadra-barikbin about 2 years ago - 1 comment

#69 - Locality sensitive hashing annotation

Pull Request - State: closed - Opened by Uinelj about 2 years ago - 1 comment

#67 - ci: begin testing workflow with caching

Pull Request - State: closed - Opened by Uinelj about 2 years ago - 1 comment

#66 - Cache dependencies

Issue - State: open - Opened by Uinelj about 2 years ago - 1 comment

#65 - Validate/Fix rebuilding for OSCAR Doc

Pull Request - State: closed - Opened by Uinelj about 2 years ago - 3 comments

#64 - [BUG] No hard fail when blocklist path is invalid

Issue - State: open - Opened by Uinelj about 2 years ago
Labels: bug

#63 - Question about the -o <offset> option in download

Issue - State: closed - Opened by TristanThrush about 2 years ago - 1 comment

#62 - Ungoliant 2

Pull Request - State: closed - Opened by Uinelj about 2 years ago - 1 comment

#61 - Add integration testing for rebuild files with oscardoc pipeline

Issue - State: closed - Opened by Uinelj about 2 years ago - 1 comment

#60 - Automate release and deployment to crates.io

Issue - State: open - Opened by Uinelj about 2 years ago - 1 comment

#59 - Rename `master` branch to `main` and protect it

Issue - State: open - Opened by Uinelj about 2 years ago

#58 - Handle dependabot vulnerabilities

Issue - State: closed - Opened by Uinelj about 2 years ago

#57 - Dynamic language tag handling

Pull Request - State: closed - Opened by Uinelj about 2 years ago - 1 comment

#56 - [Feature request] Pipeline remove download file after process and extract single language

Issue - State: open - Opened by acul3 over 2 years ago - 3 comments
Labels: enhancement

#55 - Feature `std_rng` depends on `rand_hc` which is not an optional dependency

Issue - State: closed - Opened by DavidNemeskey over 2 years ago - 2 comments
Labels: bug

#54 - chore(deps): bump regex from 1.5.4 to 1.5.6

Pull Request - State: closed - Opened by dependabot[bot] over 2 years ago - 2 comments
Labels: dependencies

#53 - [BUG] Chavacano marked as "cbr" rather than cbk

Issue - State: open - Opened by Uinelj over 2 years ago
Labels: bug

#52 - Update download.rs Change BASE_URL to new address

Pull Request - State: closed - Opened by qhduan over 2 years ago - 2 comments

#51 - feat(avro): add simple avro writer

Pull Request - State: open - Opened by Uinelj over 2 years ago - 1 comment

#50 - [Question] Different multilingual identification methods

Issue - State: closed - Opened by codedecde over 2 years ago - 2 comments
Labels: enhancement

#49 - tests: add newline test

Pull Request - State: closed - Opened by Uinelj over 2 years ago - 1 comment

#48 - [BUG] ungoliant::io::reader::corpus] [<lang>] no text/meta file.

Issue - State: closed - Opened by kirianguiller over 2 years ago - 3 comments
Labels: bug

#47 - [Feature request] Controling the number of thread being used

Issue - State: open - Opened by kirianguiller over 2 years ago - 3 comments
Labels: enhancement

#46 - feat(blocklist): make blocklist optional and improve error messages

Pull Request - State: closed - Opened by Uinelj over 2 years ago - 1 comment

#45 - Fix Cargo.toml and bump version

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#44 - Fix Cargo.toml errors for crates.io publishing

Issue - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#43 - Revamp the error reporting

Issue - State: open - Opened by Uinelj almost 3 years ago - 1 comment
Labels: enhancement

#42 - fix: crash on absence of an empty rebuild folder

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#41 - [BUG] Pipeline command not working

Issue - State: closed - Opened by kirianguiller almost 3 years ago - 5 comments
Labels: bug

#40 - [BUG] Cargo install of Ungoliant not working

Issue - State: closed - Opened by kirianguiller almost 3 years ago - 2 comments
Labels: bug

#37 - Ungoliant v1.1.0

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#36 - Bug: Bad computation of Identification probability score

Issue - State: closed - Opened by Uinelj almost 3 years ago
Labels: bug

#35 - Pipeline operations order

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#34 - Improve operation order in pipeline

Issue - State: closed - Opened by Uinelj almost 3 years ago - 2 comments

#33 - Feature: Add retry option on downloader

Issue - State: open - Opened by Uinelj almost 3 years ago
Labels: enhancement

#31 - Noisy annotation

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#30 - feat(multilingual): add test function for multilinguality

Pull Request - State: closed - Opened by Uinelj almost 3 years ago - 1 comment

#29 - feat(header/footer): add short header/footer annotator

Pull Request - State: closed - Opened by Uinelj about 3 years ago - 1 comment

#28 - Feature: Zipflike validation on documents at character-level

Issue - State: closed - Opened by Uinelj about 3 years ago - 1 comment
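
Every entry above has the same shape: a `#number - title` line, a metadata line giving kind, state, author, age and an optional comment count, and an optional `Labels:` line. The sketch below shows one way to turn that text into structured records. The regular expressions are inferred only from the entries on this page, not from any documented Ecosyste.ms format, and the names `Entry` and `parse_listing` are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Patterns inferred from the listing on this page (an assumption, not a spec).
TITLE_RE = re.compile(r"^#(?P<number>\d+) - (?P<title>.+)$")
META_RE = re.compile(
    r"^(?P<kind>Issue|Pull Request) - State: (?P<state>\w+) - "
    r"Opened by (?P<author>\S+) (?P<age>.+? ago)"
    r"(?: - (?P<comments>\d+) comments?)?$"
)
LABELS_PREFIX = "Labels: "


@dataclass
class Entry:
    """One issue or pull request as shown in the listing (hypothetical type)."""
    number: int
    title: str
    kind: str = ""
    state: str = ""
    author: str = ""
    age: str = ""
    comments: int = 0  # the listing omits the count when there are no comments
    labels: List[str] = field(default_factory=list)


def parse_listing(text: str) -> List[Entry]:
    """Parse listing text into Entry records, skipping unrecognised lines."""
    entries: List[Entry] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        title = TITLE_RE.match(line)
        if title:
            # A title line starts a new entry.
            current = Entry(int(title["number"]), title["title"])
            entries.append(current)
            continue
        if current is None:
            continue  # page header lines before the first entry
        meta = META_RE.match(line)
        if meta:
            current.kind = meta["kind"]
            current.state = meta["state"]
            current.author = meta["author"]
            current.age = meta["age"]
            current.comments = int(meta["comments"] or 0)
        elif line.startswith(LABELS_PREFIX):
            current.labels = [
                label.strip() for label in line[len(LABELS_PREFIX):].split(",")
            ]
    return entries
```

With records in this form, questions such as "which bug-labelled issues are still open" reduce to a list comprehension over `parse_listing(text)`, filtering on `kind`, `state` and `labels`.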