Ecosyste.ms: Issues

An open API service that provides issue and pull request metadata for open source projects.

GitHub / oscar-project/ungoliant issues and pull requests

#27 - Feature: Multilingual documents

Issue - State: closed - Opened by Uinelj about 3 years ago

#25 - Feature: Header/Footer annotation

Issue - State: closed - Opened by Uinelj about 3 years ago

#24 - zipf law data generation for corpus validity checking

Pull Request - State: closed - Opened by Uinelj about 3 years ago - 1 comment

#23 - add word counter for zipf law evaluation

Pull Request - State: closed - Opened by Uinelj about 3 years ago - 1 comment

#22 - Revert "[WIP] Document-level generation & filtering"

Pull Request - State: closed - Opened by Uinelj about 3 years ago

#21 - [Feature request] Train a classifier to better classify languages

Issue - State: open - Opened by Muhtasham about 3 years ago - 2 comments
Labels: enhancement, help wanted, good first issue

#19 - [WIP] Document-level generation & filtering

Pull Request - State: closed - Opened by Uinelj about 3 years ago - 1 comment

#18 - [WIP] OSCAR rebuilding

Pull Request - State: closed - Opened by Uinelj about 3 years ago - 2 comments

#17 - Publish on crates.io

Issue - State: closed - Opened by Uinelj about 3 years ago - 1 comment

#16 - Dev update warc

Pull Request - State: closed - Opened by Uinelj over 3 years ago - 1 comment

#15 - feat(package): add ability to move rather than copy files

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#14 - Add json schema export on pipeline

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#13 - Ungoliant v1.0.0

Pull Request - State: closed - Opened by Uinelj over 3 years ago - 1 comment

#12 - feat(packaging): add sorting of files into proper lang folders

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#11 - feat(compress): add early compress feature

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#10 - feat(metadata): change format to jsonlines

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#9 - feat(split): add offline splitting

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#8 - Dev dedup offline

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#7 - [WIP] File splitting

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#6 - Metadata

Pull Request - State: closed - Opened by Uinelj over 3 years ago

#5 - Pipeline

Pull Request - State: closed - Opened by Uinelj over 3 years ago
Labels: enhancement

#4 - Feature: Failures handling

Issue - State: closed - Opened by Uinelj over 3 years ago - 1 comment
Labels: enhancement

#3 - Feature: Pipeline and Benchmarking

Issue - State: closed - Opened by Uinelj over 3 years ago
Labels: enhancement, good first issue

#2 - Dev failures

Pull Request - State: closed - Opened by Uinelj over 3 years ago