Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / bigcode-project/transformers issues and pull requests

#30 - add embed and residual dropout

Pull Request - State: closed - Opened by RaymondLi0 5 months ago

#29 - For visibility: conversion scripts from fast-llm

Pull Request - State: open - Opened by RaymondLi0 6 months ago

#28 - Starcoder2 model

Pull Request - State: open - Opened by jlamypoirier 6 months ago

#27 - log tensors

Pull Request - State: open - Opened by RaymondLi0 6 months ago

#26 - change KV splitting based on Megatron-LM

Pull Request - State: closed - Opened by suiyoubi 7 months ago

#25 - For visibility: Gqa megatron rope

Pull Request - State: open - Opened by RaymondLi0 7 months ago

#24 - Move megatron conversion script and add rope arguments

Pull Request - State: open - Opened by loubnabnl 8 months ago - 4 comments

#23 - Make modeling compatible with Nanotron + few optims

Pull Request - State: closed - Opened by NouamaneTazi 8 months ago - 3 comments

#22 - For visibility: conversion scripts for fast-llm

Pull Request - State: closed - Opened by RaymondLi0 9 months ago

#20 - Simplified kv caching

Pull Request - State: open - Opened by jlamypoirier about 1 year ago

#19 - Add flash attention

Pull Request - State: open - Opened by jlamypoirier about 1 year ago

#18 - Flash attention experiments

Pull Request - State: open - Opened by jlamypoirier about 1 year ago

#17 - Add back experimental features

Pull Request - State: closed - Opened by jlamypoirier about 1 year ago

#16 - Diff from Huggingface main

Pull Request - State: open - Opened by jlamypoirier about 1 year ago

#14 - Add gpu optimizations to base model

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#13 - More optimizations

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#11 - add test to ensure mqa and mha have the same behaviour

Pull Request - State: closed - Opened by minimario over 1 year ago

#10 - Upcasting, scaling, masking and fused kernels to match Megatron-LM

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#9 - Add santacoder model

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago - 1 comment

#8 - Megatron conversion script

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#7 - Fast inference

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#6 - Fork the model into GPTBigCode

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago - 1 comment

#5 - Fast inference

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago

#4 - Multi-query attention

Pull Request - State: closed - Opened by jlamypoirier over 1 year ago - 3 comments

#3 - Just to see the diff

Pull Request - State: open - Opened by Muennighoff over 1 year ago - 4 comments

#2 - add: 2 variants of multi query implementation; printing some details

Pull Request - State: closed - Opened by bigximik almost 2 years ago

#1 - Benchmark multi-query attention in HF transformers

Issue - State: closed - Opened by harm-devries almost 2 years ago - 1 comment
Labels: inference, architecture
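
Each entry above pairs a "#N - title" line with a metadata line of the form "Pull Request - State: ... - Opened by ... ago", optionally followed by a comment count. For anyone working with this listing as plain text, the following is a minimal sketch of parsing that metadata line. The `Entry` type and `parse_meta` function are hypothetical names chosen for illustration; they are not part of the Ecosyste.ms API, and the pattern only assumes the line shapes visible on this page.

```python
import re
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """Hypothetical container for one metadata line of the listing."""
    kind: str      # "Pull Request" or "Issue"
    state: str     # e.g. "open" or "closed"
    author: str    # username after "Opened by"
    age: str       # relative age as printed, e.g. "8 months ago"
    comments: int  # 0 when the line shows no comment count


# Pattern inferred from the lines on this page only (an assumption,
# not a documented format).
_META = re.compile(
    r"^(?P<kind>Pull Request|Issue) - State: (?P<state>\w+)"
    r" - Opened by (?P<author>\S+) (?P<age>.+? ago)"
    r"(?: - (?P<comments>\d+) comments?)?$"
)


def parse_meta(line: str) -> Optional[Entry]:
    """Return an Entry for a metadata line, or None if the line does not match."""
    m = _META.match(line.strip())
    if m is None:
        return None
    return Entry(
        kind=m["kind"],
        state=m["state"],
        author=m["author"],
        age=m["age"],
        comments=int(m["comments"] or 0),
    )
```

Lines that are not metadata lines, such as titles or the "Labels:" line, return None, so a caller can filter the whole page through `parse_meta` and keep only the matches.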