Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / bigscience-workshop/catalogue_data issues and pull requests
#67 - fix issue with config streamlit app
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#66 - add a streamlit app to show PII logs
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#65 - Multiprocessing with datasets in jsonl format
Pull Request -
State: open - Opened by HugoLaurencon over 2 years ago
#64 - Execute pii on the whole oscar dataset
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#63 - [WIP] add multiprocessing for pii
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#62 - Add streamlit viewer app
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
- 1 comment
#61 - fixed typo in clean.py
Pull Request -
State: closed - Opened by TevenLeScao over 2 years ago
- 2 comments
#60 - Making sure that things are sorted
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#59 - Concatenate ester dataset
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#58 - Generalise deduplicate pattern.
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#57 - new way to simplify dedup url
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
- 2 comments
#56 - Make new experiment concerning filtering
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 1 comment
#55 - Replace filter with map
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#54 - Fix vi sent tokenizer
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
- 1 comment
#53 - Fix stanza num proc dirty
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#52 - Fix stanza num proc
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 1 comment
#51 - remove whitespace before checking for emptyness
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
#50 - Generalise deduplication function
Pull Request -
State: open - Opened by thomasw21 over 2 years ago
#49 - add sentence splitter functions
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
- 2 comments
#48 - Update preprocessing key to use the new value from the google sheet
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#47 - Add documentation.
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#46 - Code doesn't need to run deduplication script
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 3 comments
#45 - Remove unecessary deduplication
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#44 - Add script to generate the columns for deduplication and short filter document
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#43 - change way to compute the size of the text
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
- 3 comments
#42 - Make scripts robust to meta format
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#41 - Add deduplication script
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#40 - Make substring stripper regex faster
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#39 - Fix to accurate logging
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#38 - Non-Wikipedia Wikis Dedup script
Pull Request -
State: closed - Opened by cakiki over 2 years ago
- 1 comment
#37 - Accurate size modification logging
Pull Request -
State: closed - Opened by TevenLeScao over 2 years ago
#36 - Add deduplication on url level
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 2 comments
#35 - Short document filter in byte
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#34 - Compile regex
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#33 - remove whitespace, numbers and punctuation before hashing
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
#32 - Remove short lines
Pull Request -
State: open - Opened by thomasw21 over 2 years ago
#31 - add more line filters
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
- 1 comment
#30 - Add substring remover mapper
Pull Request -
State: closed - Opened by cakiki over 2 years ago
- 1 comment
#29 - Let's save json when we need to
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#28 - Opentiti fix
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
- 2 comments
#27 - add "[if" and "<script" to list of excluded lines
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
#26 - Deduplication document
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#25 - Use MD5 to obtain persistent hash
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#24 - Test for wikis filters
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
- 1 comment
#23 - Remove excessive duplicates
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 2 comments
#22 - Curly fix
Pull Request -
State: closed - Opened by lvwerra over 2 years ago
#21 - Slurm script
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
#20 - Allow deduplication scripts to be added to the preprocessing script
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 1 comment
#19 - Add feature to see the modified examples by a map operation
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
- 2 comments
#18 - Allow no maps or filters
Pull Request -
State: closed - Opened by thomasw21 over 2 years ago
- 2 comments