Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / bigscience-workshop/metadata issues and pull requests
#198 - Prefix special tokens
Pull Request -
State: open - Opened by jordiclive over 1 year ago
#197 - feat: create a merged vld set of designated metadata
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#196 - Train june
Pull Request -
State: open - Opened by jordiclive over 1 year ago
#195 - feat: enable generation_length_text
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#194 - feat: add local special tokens for HTML
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#193 - fix: avoid the situation of inf loss * weight → nan
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#192 - Eval loop
Pull Request -
State: open - Opened by jordiclive over 1 year ago
#191 - Update tokenizer in evaluation
Pull Request -
State: closed - Opened by manandey over 1 year ago
#190 - Set logits of entity related special tokens to -infinity
Pull Request -
State: closed - Opened by manandey over 1 year ago
#189 - Configurable html sample rate, and without metada same context option.
Pull Request -
State: closed - Opened by jordiclive over 1 year ago
- 1 comment
#188 - Update metadata_utils.py
Pull Request -
State: open - Opened by jordiclive over 1 year ago
#187 - Update metadata_utils.py
Pull Request -
State: open - Opened by jordiclive over 1 year ago
#186 - fix: args and defaults
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#185 - feat: update eval script
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#184 - fix issue with embedding size too small
Pull Request -
State: closed - Opened by jordiclive over 1 year ago
#183 - feat: v2.yaml for the resampled training set
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#182 - Patch 3
Pull Request -
State: closed - Opened by tianjianjiang over 1 year ago
#181 - Add prompting baseline eval
Pull Request -
State: closed - Opened by ppommer over 1 year ago
#180 - Fix stuff and add new readme
Pull Request -
State: closed - Opened by cccntu over 1 year ago
#179 - Changes for eval
Pull Request -
State: closed - Opened by Muennighoff almost 2 years ago
#178 - Add loss plotting
Pull Request -
State: closed - Opened by Muennighoff almost 2 years ago
#177 - debug code (WIP)
Pull Request -
State: open - Opened by cccntu almost 2 years ago
#176 - Update train.py
Pull Request -
State: closed - Opened by Muennighoff almost 2 years ago
#175 - Fix mask bug
Pull Request -
State: closed - Opened by cccntu almost 2 years ago
#174 - Fix code quality tests
Pull Request -
State: closed - Opened by manandey almost 2 years ago
#173 - Add CM3 loss
Pull Request -
State: open - Opened by masoudjs almost 2 years ago
- 2 comments
#172 - test evaluation script
Pull Request -
State: closed - Opened by cccntu almost 2 years ago
#171 - evaluation script debugging
Pull Request -
State: closed - Opened by cccntu almost 2 years ago
#170 - ci: pin Python, Ubuntu, & GH Action versions
Pull Request -
State: closed - Opened by tianjianjiang almost 2 years ago
- 1 comment
#169 - Fix eval script
Pull Request -
State: closed - Opened by ppommer almost 2 years ago
- 2 comments
#168 - Minor updates in evaluation script
Pull Request -
State: closed - Opened by manandey about 2 years ago
#167 - fix: ms-timestamp conversion
Pull Request -
State: closed - Opened by tianjianjiang about 2 years ago
#166 - build: upgrade transformers for a consistent version of huggingface_hub
Pull Request -
State: closed - Opened by tianjianjiang about 2 years ago
Labels: bug
#165 - Add evaluation pipeline
Pull Request -
State: closed - Opened by ppommer about 2 years ago
- 4 comments
#164 - Updates
Pull Request -
State: closed - Opened by cccntu about 2 years ago
- 2 comments
#163 - Fix file list
Pull Request -
State: closed - Opened by cccntu about 2 years ago
#162 - Refactor a util function
Pull Request -
State: closed - Opened by cccntu over 2 years ago
#161 - add special tokens for entities
Pull Request -
State: closed - Opened by manandey over 2 years ago
- 1 comment
#160 - feat: log the average of the loss rather than the value of the main process
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#159 - fix: timestamp precision
Pull Request -
State: closed - Opened by tianjianjiang over 2 years ago
Labels: bug
#158 - Fix streaming mode
Pull Request -
State: closed - Opened by cccntu over 2 years ago
#157 - add new configs for entity_paragraph
Pull Request -
State: closed - Opened by manandey over 2 years ago
- 1 comment
#156 - new configuration for HTML tags
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#155 - Finalizing for big training
Pull Request -
State: closed - Opened by cccntu over 2 years ago
- 2 comments
#154 - Revert "Filter examples by `num_chars` to include in a batch (#137)"
Pull Request -
State: closed - Opened by manandey over 2 years ago
#153 - Add separate EntityParagraph processor
Pull Request -
State: closed - Opened by manandey over 2 years ago
#152 - Adapt code to use new data format
Pull Request -
State: closed - Opened by cccntu over 2 years ago
- 3 comments
Labels: #dataset
#151 - feat: clean up website desc.
Issue -
State: closed - Opened by tianjianjiang over 2 years ago
Labels: #dataset
#150 - feat: tag clean website desc., entity paragraph, and title
Pull Request -
State: closed - Opened by tianjianjiang over 2 years ago
- 1 comment
Labels: #dataset
#149 - feat: add paragraph-entity metadata
Issue -
State: closed - Opened by tianjianjiang over 2 years ago
#148 - feat: add title
Issue -
State: closed - Opened by tianjianjiang over 2 years ago
- 1 comment
Labels: #dataset
#147 - Post processing website desc
Pull Request -
State: closed - Opened by shanyas10 over 2 years ago
- 1 comment
#146 - add example that build a dataset
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#145 - feat: add paragraphs
Pull Request -
State: closed - Opened by tianjianjiang over 2 years ago
#144 - Entity at paragraph level
Pull Request -
State: closed - Opened by manandey over 2 years ago
Labels: #dataset
#143 - Additional changes to test the entities extraction
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#142 - [WIP] entities extraction tentative 2
Pull Request -
State: closed - Opened by SaulLu over 2 years ago
#141 - Create evaluation_utils.py
Pull Request -
State: closed - Opened by shanyas10 over 2 years ago
#140 - test gpt2-xl
Pull Request -
State: closed - Opened by cccntu over 2 years ago
- 3 comments
#139 - build: sync setup.py defined dependencies and fix broken ones
Pull Request -
State: closed - Opened by tianjianjiang over 2 years ago
- 1 comment
Labels: bug
#138 - Common simple eval function to calculate ppl
Issue -
State: open - Opened by shanyas10 almost 3 years ago
#137 - Filter examples by `num_chars` to include in a batch
Pull Request -
State: closed - Opened by manandey almost 3 years ago
- 7 comments
#136 - Fix accelerate not using multi-GPU
Pull Request -
State: closed - Opened by cccntu almost 3 years ago
- 1 comment
#135 - Update add_metadata.py
Pull Request -
State: closed - Opened by manandey almost 3 years ago
- 1 comment
#134 - `Title` preprocessor
Pull Request -
State: closed - Opened by manandey almost 3 years ago
#133 - Add `title` metadata processor
Pull Request -
State: closed - Opened by manandey almost 3 years ago
#132 - build: pin datasets to 1.17.0
Pull Request -
State: closed - Opened by tianjianjiang almost 3 years ago
Labels: bug
#131 - Add script to convert the dataset in compressed jsonlines files
Pull Request -
State: closed - Opened by SaulLu almost 3 years ago
#130 - build: bump nltk to 3.6.7 for security and performance
Pull Request -
State: closed - Opened by tianjianjiang almost 3 years ago
Labels: bug
#128 - feat: mark paragraphs by metadata-html #125
Pull Request -
State: closed - Opened by tianjianjiang almost 3 years ago
Labels: #paragraph_extraction
#127 - Add filters to `HtmlProcessor`
Pull Request -
State: closed - Opened by SaulLu almost 3 years ago
- 1 comment
#126 - Remove entity description
Pull Request -
State: closed - Opened by manandey almost 3 years ago
- 1 comment
#125 - feat: HTML scanner for text content & content sectioning elements → segment paragraphs
Issue -
State: closed - Opened by tianjianjiang almost 3 years ago
Labels: #paragraph_extraction
#124 - Create Dataset with metadata
Issue -
State: open - Opened by SaulLu almost 3 years ago
Labels: #dataset, Epic
#123 - Add fp16, multi-GPU training script (toy dataset)
Pull Request -
State: closed - Opened by cccntu almost 3 years ago
#110 - Which HTML tags should be used during training?
Issue -
State: closed - Opened by norakassner almost 3 years ago
- 1 comment
Labels: duplicate
#108 - Evaluation bias
Issue -
State: open - Opened by norakassner almost 3 years ago
- 1 comment
#103 - Add code to sampling multiple metadata
Pull Request -
State: closed - Opened by cccntu almost 3 years ago
- 1 comment
#100 - Discuss style evaluation for website description and data source with Anna
Issue -
State: open - Opened by norakassner almost 3 years ago
#99 - Evaluation toxicity for website description and data source
Issue -
State: open - Opened by norakassner almost 3 years ago
- 1 comment
#98 - data analysis: website description (quality and yield)
Issue -
State: open - Opened by norakassner almost 3 years ago
#97 - Start joint training
Issue -
State: open - Opened by norakassner almost 3 years ago
#96 - eval hyperparameters: occupied tokens
Issue -
State: open - Opened by norakassner almost 3 years ago
#95 - entity tagging speedup
Issue -
State: closed - Opened by norakassner almost 3 years ago
- 1 comment
#94 - estimate amount of data
Issue -
State: open - Opened by norakassner almost 3 years ago
#93 - eval hyperparameters: amount of metadata
Issue -
State: open - Opened by norakassner almost 3 years ago
#92 - method to sample global metadata
Issue -
State: open - Opened by norakassner almost 3 years ago
#91 - method to sample local metadata
Issue -
State: open - Opened by norakassner almost 3 years ago
#90 - explore hyperparameters:
Issue -
State: open - Opened by norakassner almost 3 years ago
#89 - simple zero-shot eval function: time stamps
Issue -
State: open - Opened by norakassner almost 3 years ago
#88 - simple zero-shot eval function: website description
Issue -
State: open - Opened by norakassner almost 3 years ago
- 1 comment
#87 - simple zero-shot eval function: datasource
Issue -
State: open - Opened by norakassner almost 3 years ago
#86 - simple zero-shot eval function: entity tags
Issue -
State: open - Opened by norakassner almost 3 years ago
#85 - simple zero-shot eval function: HTML tags
Issue -
State: open - Opened by norakassner almost 3 years ago
#84 - simple zero-shot eval function: generation length
Issue -
State: open - Opened by norakassner almost 3 years ago
- 1 comment
#83 - Handle the comment specific type not recognized by pyarrow
Pull Request -
State: closed - Opened by SaulLu almost 3 years ago
- 1 comment
#82 - Change torch version + make it optional
Pull Request -
State: closed - Opened by SaulLu almost 3 years ago
#81 - update: generation length and datasource
Pull Request -
State: closed - Opened by chkla almost 3 years ago
#80 - Update entity-tags preprocessing code to speed up the process
Pull Request -
State: closed - Opened by manandey almost 3 years ago