An open API service for providing issue and pull request metadata for open source projects.

GitHub / bigcode-project/bigcode-analysis issues and pull requests

#46 - Update README.md

Pull Request - State: closed - Opened by christiancopeland about 2 years ago

#45 - Update README.md

Pull Request - State: closed - Opened by christiancopeland about 2 years ago

#43 - download dataset from kaggle

Pull Request - State: open - Opened by xu3kev over 2 years ago - 1 comment

#42 - Pull Requests

Pull Request - State: open - Opened by loubnabnl over 2 years ago

#41 - kaggle dataset

Pull Request - State: open - Opened by loubnabnl over 2 years ago

#40 - Stackoverflow processing

Pull Request - State: open - Opened by loubnabnl over 2 years ago

#39 - [WIP] textbooks filtering

Pull Request - State: open - Opened by loubnabnl over 2 years ago

#38 - [WIP] code reviews dataset

Pull Request - State: open - Opened by loubnabnl over 2 years ago

#37 - Add pdf of Miro board of MozFest

Pull Request - State: closed - Opened by harm-devries over 2 years ago

#36 - Chinchilla analysis

Pull Request - State: closed - Opened by harm-devries almost 3 years ago

#35 - add scaling laws notebook

Pull Request - State: closed - Opened by lvwerra almost 3 years ago - 4 comments

#34 - Data inspection

Pull Request - State: closed - Opened by harm-devries about 3 years ago

#33 - add github issues analysis notebook

Pull Request - State: closed - Opened by loubnabnl about 3 years ago

#32 - Add unimax exploration notebook

Pull Request - State: closed - Opened by harm-devries about 3 years ago - 1 comment

#31 - Issues language identifier

Pull Request - State: closed - Opened by Muhtasham about 3 years ago

#30 - Minhash Improvement

Pull Request - State: closed - Opened by ChenghaoMou about 3 years ago - 1 comment

#29 - add kenlm experiment

Pull Request - State: closed - Opened by lvwerra over 3 years ago

#28 - update readmes of filtering methods

Pull Request - State: closed - Opened by loubnabnl over 3 years ago

#27 - add code preprocessing and comment to code notebook

Pull Request - State: closed - Opened by loubnabnl over 3 years ago

#26 - Email regex modified

Pull Request - State: closed - Opened by paulovn over 3 years ago

#25 - add PII detection pipeline and analysis notebooks

Pull Request - State: closed - Opened by loubnabnl over 3 years ago

#24 - Use detect-secrets to scan secrets (WIP)

Pull Request - State: closed - Opened by liyongsea over 3 years ago - 2 comments

#23 - Evaluate CodeGen on safe and all-license dataset

Issue - State: closed - Opened by harm-devries over 3 years ago - 3 comments
Labels: good first issue

#22 - MQA experiments on AWS SageMaker Lab

Pull Request - State: closed - Opened by ocramz over 3 years ago - 4 comments

#21 - requirements uses the right branch of transformers

Pull Request - State: closed - Opened by ocramz over 3 years ago

#20 - cannot import AttentionType from gpt2

Issue - State: closed - Opened by ocramz over 3 years ago

#19 - [Decontamination] Add readme and instructions to run substring decontamination

Issue - State: closed - Opened by RaymondLi0 over 3 years ago - 1 comment

#18 - update readme and requirements

Pull Request - State: closed - Opened by ChenghaoMou over 3 years ago

#17 - Reorganize data analysis folder and update readmess

Pull Request - State: closed - Opened by loubnabnl over 3 years ago

#16 - add subtsring decontamination

Pull Request - State: closed - Opened by RaymondLi0 over 3 years ago

#15 - github scraping speed limit

Issue - State: open - Opened by bigximik over 3 years ago

#14 - Add decontamination code

Pull Request - State: closed - Opened by ChenghaoMou over 3 years ago - 3 comments

#13 - Decontamination

Issue - State: closed - Opened by ChenghaoMou over 3 years ago - 9 comments

#12 - Broken link

Issue - State: closed - Opened by Sleepyhead01 over 3 years ago - 1 comment

#11 - Adding alternative minhash script

Pull Request - State: closed - Opened by ChenghaoMou over 3 years ago - 15 comments

#10 - [Near Deduplication] Tokenization

Issue - State: open - Opened by ChenghaoMou over 3 years ago - 2 comments

#9 - [Near Deduplication] Post processing

Issue - State: open - Opened by ChenghaoMou over 3 years ago

#8 - [Exact Substring Deduplication] Analysis

Issue - State: open - Opened by ChenghaoMou over 3 years ago - 1 comment

#7 - [Near Deduplication] Benchmark

Issue - State: open - Opened by ChenghaoMou over 3 years ago - 2 comments

#6 - Create CONTRIBUTING.md

Pull Request - State: closed - Opened by lvwerra over 3 years ago

#5 - Add filtering to the near deduplicated safe dataset

Issue - State: closed - Opened by loubnabnl over 3 years ago - 1 comment
Labels: data curation

#4 - Multi query experiments

Pull Request - State: closed - Opened by bigximik over 3 years ago

#3 - Reorganize bigcode-data-analysis repository

Issue - State: closed - Opened by loubnabnl over 3 years ago - 1 comment
Labels: documentation, enhancement

#2 - Rename model names on HF hub

Issue - State: closed - Opened by harm-devries over 3 years ago - 1 comment
Labels: documentation, enhancement

#1 - Upload github dataset with license column

Issue - State: closed - Opened by harm-devries over 3 years ago
Labels: enhancement, data curation