Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / thammegowda/mtdata issues and pull requests

#162 - Better error messages for better UX

Issue - State: open - Opened by SamuelLarkin 6 months ago

#161 - v0.4.2 (wmt24)

Pull Request - State: closed - Opened by thammegowda 8 months ago

#160 - Bump requests from 2.31.0 to 2.32.0

Pull Request - State: open - Opened by dependabot[bot] 8 months ago
Labels: dependencies

#159 - Depend on external lib for language standardization

Issue - State: open - Opened by AlexUmnov 9 months ago

#158 - typo: possibly met to say SGM and not TMX

Pull Request - State: closed - Opened by SamuelLarkin 9 months ago

#157 - Allow strict langpair ordering

Issue - State: open - Opened by erip 10 months ago - 1 comment

#156 - test

Pull Request - State: closed - Opened by mmpython111 10 months ago

#155 - branch_test

Pull Request - State: closed - Opened by sifan0067 10 months ago

#154 - Adding missing kab_DZ

Pull Request - State: open - Opened by BoFFire 12 months ago

#153 - Update Tatoeba corpus

Issue - State: open - Opened by jeanm about 1 year ago

#152 - Add TALPCo

Issue - State: open - Opened by kpu over 1 year ago
Labels: Dataset-add

#151 - Add Thai-English parallel corpus "scb-mt-en-th-2020"

Issue - State: open - Opened by kpu over 1 year ago
Labels: Dataset-add

#150 - Bump requests from 2.26.0 to 2.31.0

Pull Request - State: closed - Opened by dependabot[bot] over 1 year ago
Labels: dependencies

#149 - v0.4.1

Pull Request - State: closed - Opened by thammegowda almost 2 years ago

#148 - How to add in missing parts of datasets

Issue - State: closed - Opened by arvieFrydenlund almost 2 years ago - 4 comments
Labels: WMT

#147 - Index store bibkey and not the bibtext content

Issue - State: closed - Opened by thammegowda almost 2 years ago - 2 comments
Labels: enhancement

#146 - v0.4.0

Pull Request - State: closed - Opened by thammegowda almost 2 years ago

#145 - Add Flores 200

Pull Request - State: closed - Opened by ZenBel almost 2 years ago - 1 comment

#144 - Add NTREX-128

Issue - State: closed - Opened by thammegowda almost 2 years ago - 1 comment
Labels: Dataset-add

#143 - Dataset Add: JParaCrawl Chinese-Japanese

Pull Request - State: closed - Opened by BrightXiaoHan about 2 years ago - 1 comment

#142 - Add Samanantar datasets.

Issue - State: closed - Opened by BrightXiaoHan about 2 years ago - 3 comments

#141 - Faster downloads with multiple streams

Issue - State: open - Opened by thammegowda about 2 years ago
Labels: enhancement

#140 - Add support for monolingual data

Issue - State: closed - Opened by thammegowda about 2 years ago - 1 comment
Labels: enhancement

#139 - Add `echo` task

Issue - State: closed - Opened by thammegowda about 2 years ago - 1 comment
Labels: enhancement

#138 - Opus update + elrc datasets

Pull Request - State: closed - Opened by AlexUmnov about 2 years ago - 1 comment

#137 - No such file or directory: '..../mtdata/index/allenai_nllb.json'

Issue - State: closed - Opened by thammegowda about 2 years ago - 2 comments
Labels: bug

#136 - Travis build is broken

Issue - State: closed - Opened by thammegowda about 2 years ago - 1 comment

#135 - Update opus index and add new datasets to ELRC

Pull Request - State: closed - Opened by ZenBel about 2 years ago - 1 comment

#134 - AllenAi nllb dataset (excluding ccmatrix)

Pull Request - State: closed - Opened by AlexUmnov about 2 years ago - 1 comment

#133 - Add `allenai/nllb` dataset

Issue - State: closed - Opened by ZenBel about 2 years ago - 2 comments

#132 - Not all available `ELRC` datasets are downloaded from OPUS

Issue - State: closed - Opened by ZenBel about 2 years ago - 2 comments

#131 - Add ebible corpus

Issue - State: open - Opened by joelthe1 over 2 years ago
Labels: Dataset-add

#130 - CVE-2007-4559 Patch

Pull Request - State: closed - Opened by TrellixVulnTeam over 2 years ago

#129 - Is there a way to see the dataset size before starting the download

Issue - State: open - Opened by XapaJIaMnu over 2 years ago - 5 comments

#128 - Add MaCoCu corpora

Issue - State: open - Opened by ZJaume over 2 years ago
Labels: Dataset-add

#127 - Add mni-eng parallel data

Issue - State: open - Opened by kpu over 2 years ago
Labels: Dataset-add

#126 - Add gn-es parallel data

Issue - State: open - Opened by kpu over 2 years ago
Labels: Dataset-add

#125 - 0.3.8

Pull Request - State: closed - Opened by thammegowda over 2 years ago

#124 - Update ELRC-SHARE data

Pull Request - State: closed - Opened by thammegowda over 2 years ago

#123 - Update ELRC-SHARE data

Pull Request - State: closed - Opened by kpu over 2 years ago - 1 comment

#122 - Return non-zero on error

Issue - State: closed - Opened by kpu over 2 years ago - 1 comment

#121 - Add EU acts in Ukrainian

Issue - State: closed - Opened by thammegowda over 2 years ago - 1 comment
Labels: Dataset-add

#120 - [WIP] 0.3.7 development

Pull Request - State: closed - Opened by thammegowda over 2 years ago

#119 - AI4Bharath link is down

Issue - State: open - Opened by thammegowda over 2 years ago - 1 comment
Labels: broken-link

#117 - Fixed a bug in KECL JParaCrawl v3 extraction used in WMT22 en-ja translation task

Pull Request - State: closed - Opened by de9uch1 over 2 years ago - 2 comments

#116 - Cannot Download wmt21 en2zh test data

Issue - State: open - Opened by Pzzzzz5142 over 2 years ago - 5 comments

#115 - Update cache.py

Pull Request - State: closed - Opened by jgwinnup almost 3 years ago - 2 comments

#114 - Trying to use mtdata with python

Issue - State: closed - Opened by MathieuGrosso almost 3 years ago - 5 comments

#113 - Add ParaCrawl Ukranian bonus

Issue - State: closed - Opened by kpu almost 3 years ago

#112 - [WIP] v0.3.6

Pull Request - State: closed - Opened by thammegowda almost 3 years ago

#111 - ELRC updates: unban some fixed TMXes, add more Irish

Pull Request - State: closed - Opened by kpu almost 3 years ago

#110 - [WIP] v0.3.5

Pull Request - State: closed - Opened by thammegowda almost 3 years ago

#109 - CCMatrix?

Issue - State: closed - Opened by kpu almost 3 years ago - 1 comment

#107 - Parallel Corpora for 6 Indian Languages

Issue - State: open - Opened by kpu almost 3 years ago - 2 comments
Labels: Dataset-add

#106 - Lanfrica

Issue - State: open - Opened by kpu almost 3 years ago
Labels: Dataset-add, epic

#105 - added histogram

Pull Request - State: closed - Opened by sgowdaks almost 3 years ago

#104 - Add visualizations in search results

Issue - State: open - Opened by thammegowda almost 3 years ago

#103 - WIP 0.3.5

Pull Request - State: closed - Opened by thammegowda almost 3 years ago

#102 - Add Fon-French 2 and Daily Dialogues

Pull Request - State: closed - Opened by kpu almost 3 years ago

#100 - v0.3.4 WIP

Pull Request - State: closed - Opened by thammegowda about 3 years ago

#99 - ELRC update

Pull Request - State: closed - Opened by kpu about 3 years ago

#98 - Policy on BCP-47 in TMX files?

Issue - State: open - Opened by kpu about 3 years ago - 2 comments
Labels: question

#97 - ELRC-euipo_law-1-eng-fra hits 403 (forbidden)

Issue - State: closed - Opened by XapaJIaMnu about 3 years ago - 2 comments

#96 - Fix #95 by updating TMX names used on ELRC-SHARE for ELRC-swedish_wor…

Pull Request - State: closed - Opened by kpu about 3 years ago - 1 comment

#95 - ELRC-swedish_work_environment-1-eng-fra doesn't work

Issue - State: closed - Opened by XapaJIaMnu about 3 years ago - 1 comment

#94 - mtdata get ignores the language pair when the dataset has only one language pair

Issue - State: closed - Opened by XapaJIaMnu about 3 years ago - 1 comment
Labels: bug

#93 - Fix buggy matching of languages.

Pull Request - State: closed - Opened by XapaJIaMnu about 3 years ago

#92 - Non-fuzzy match mtdata is broken due to comparing `y1==y1`

Issue - State: closed - Opened by XapaJIaMnu about 3 years ago

#91 - mtdata list : filters

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: enhancement

#90 - Add wmt21 tests

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: Dataset-add

#89 - Add wmt21 ccaligned datasets

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: Dataset-add

#88 - Add ParIce dataset (en-is)

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: Dataset-add

#87 - Add wmt21 ha-en corpus

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: Dataset-add

#86 - Add wikititles v3

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: Dataset-add

#85 - v0.3.3

Pull Request - State: closed - Opened by thammegowda about 3 years ago

#84 - Add datasets listed by Stanford NMT

Issue - State: closed - Opened by thammegowda about 3 years ago - 4 comments
Labels: Dataset-add

#83 - 0.3.2 -

Pull Request - State: closed - Opened by thammegowda about 3 years ago

#82 - recipes.yml is not packed in pip package

Issue - State: closed - Opened by thammegowda about 3 years ago
Labels: bug

#81 - Anuvaad Parallel Corpus for Indian languages

Issue - State: closed - Opened by GokulNC about 3 years ago - 1 comment

#80 - Add parallel bible corpus

Issue - State: open - Opened by thammegowda over 3 years ago - 1 comment
Labels: invalid

#79 - v0.3.1

Pull Request - State: closed - Opened by thammegowda over 3 years ago

#78 - Reading from tarfiles without extracting them is slow

Issue - State: closed - Opened by thammegowda over 3 years ago

#77 - JW300 taken down from OPUS

Issue - State: open - Opened by kpu over 3 years ago - 3 comments

#76 - More datasets https://martinweisser.org/corpora_site/corpora2.html

Issue - State: open - Opened by thammegowda over 3 years ago
Labels: Dataset-add

#75 - Keep datasets compressed

Pull Request - State: closed - Opened by thammegowda over 3 years ago

#74 - BCP47 parsing : (language, script, country)

Pull Request - State: closed - Opened by thammegowda over 3 years ago

#73 - ParaCrawl 9 is out

Issue - State: closed - Opened by kpu over 3 years ago

#72 - [WIP] v0.3.0

Pull Request - State: closed - Opened by thammegowda over 3 years ago

#71 - Force utf-8 encoding (be explicit)

Issue - State: closed - Opened by thammegowda over 3 years ago

#70 - JW300 v1c

Issue - State: closed - Opened by kpu over 3 years ago - 4 comments

#69 - Licence info for datasets

Issue - State: open - Opened by thammegowda over 3 years ago - 1 comment

#68 - Support dataset caching on a server

Issue - State: open - Opened by thammegowda over 3 years ago - 4 comments
Labels: enhancement

#67 - Add CAMeL Arabic resources

Issue - State: open - Opened by kpu over 3 years ago - 1 comment
Labels: Dataset-add

#66 - Casmacat is down

Issue - State: closed - Opened by ZJaume over 3 years ago - 2 comments

#65 - Language code inacurrate for chinese languages

Issue - State: closed - Opened by kirianguiller over 3 years ago - 3 comments

#64 - The variable versions for one langauge is not avilable

Issue - State: closed - Opened by pluiez over 3 years ago - 5 comments

#63 - The wikititles is incomplete

Issue - State: closed - Opened by pluiez over 3 years ago - 2 comments