TransformerLensOrg/TransformerLens issues and pull requests

#566 - moved report to static section

Pull Request - State: closed - Opened by bryce13950 10 months ago

#565 - Revert "moved coverage report download (#564)"

Pull Request - State: closed - Opened by bryce13950 10 months ago

#564 - moved coverage report download

Pull Request - State: closed - Opened by bryce13950 10 months ago

#563 - Refactor components

Pull Request - State: closed - Opened by bryce13950 10 months ago

#562 - Ci full coverage

Pull Request - State: closed - Opened by bryce13950 10 months ago

#561 - Ci coverage location

Pull Request - State: closed - Opened by bryce13950 10 months ago

#560 - Resolve SAE CI Test failures

Pull Request - State: closed - Opened by bryce13950 10 months ago

#559 - reworked CI to publish code coverage report

Pull Request - State: closed - Opened by bryce13950 10 months ago

#558 - [Bug Report] Gemma-2b model fails when splitting QKV input

Issue - State: closed - Opened by Aaquib111 10 months ago - 2 comments

#557 - [Question] Difference in gpt-neo-125m weights loading with huggingface from_pretrained and HookTransformers.from_pretrained

Issue - State: closed - Opened by petezone 10 months ago - 1 comment

#556 - Bert demo ci

Pull Request - State: closed - Opened by bryce13950 10 months ago

#555 - removed duplicate rearrange block

Pull Request - State: closed - Opened by bryce13950 10 months ago

#554 - [Proposal] Save and Load subgraph as dict

Issue - State: closed - Opened by chris-aeviator 10 months ago - 7 comments

#553 - [Bug Report] Cannot load llama-3 8B with hf_model arg

Issue - State: closed - Opened by Butanium 10 months ago - 6 comments

#552 - Hf secret

Pull Request - State: closed - Opened by bryce13950 10 months ago

#551 - Fixed device being set to cpu:0 instead of cpu

Pull Request - State: closed - Opened by Butanium 10 months ago - 2 comments

#550 - Hf token auth

Pull Request - State: closed - Opened by bryce13950 10 months ago

#549 - Add support for Llama 3 (and Llama-2-70b-hf)

Pull Request - State: closed - Opened by joelburget 10 months ago - 5 comments

#548 - [Bug Report] Deployments no longer possible without huggingface access token

Issue - State: closed - Opened by bryce13950 10 months ago
Labels: bug, blocking

#547 - Llama 4bit v2 typing

Pull Request - State: closed - Opened by hannamw 10 months ago - 1 comment

#546 - Fixed Santa Coder demo

Pull Request - State: closed - Opened by bryce13950 10 months ago

#545 - Othello colab fix

Pull Request - State: closed - Opened by bryce13950 10 months ago

#544 - Demo no position fix

Pull Request - State: closed - Opened by bryce13950 10 months ago

#543 - [Bug Report] Grokking demo currently broken in Colab

Issue - State: open - Opened by bryce13950 10 months ago - 1 comment
Labels: bug, demo

#542 - revised demo testing to check all demos

Pull Request - State: open - Opened by bryce13950 10 months ago

#541 - locked attribution patching to 1.1.1

Pull Request - State: closed - Opened by bryce13950 10 months ago

#540 - [Bug Report] Test coverage missing on add_hook in hook_points

Issue - State: open - Opened by bryce13950 10 months ago - 2 comments
Labels: bug, testing

#539 - [Bug Report] Transformer_lens should be pinned to 1.1.1 for Attribution_Patching demo

Issue - State: closed - Opened by adamkarvonen 10 months ago - 1 comment

#538 - Standardize black line length to 100, in line with other project settings

Pull Request - State: closed - Opened by Chanlaw 10 months ago - 4 comments

#537 - Add Xavier and Kaiming Initializations

Pull Request - State: closed - Opened by Chanlaw 10 months ago - 3 comments

#536 - HookedSAETransformer

Pull Request - State: closed - Opened by ckkissane 10 months ago - 5 comments

#535 - [Proposal] Please add support for fine-tuned models

Issue - State: closed - Opened by jasonlim131 10 months ago - 3 comments

#534 - Bugfix: remove redundant assert checks

Pull Request - State: closed - Opened by tkukurin 11 months ago - 1 comment

#533 - updated docs to account for additional test suites

Pull Request - State: closed - Opened by bryce13950 11 months ago

#530 - [Bug Report] Demos utilizing PySvelte are not able to install

Issue - State: closed - Opened by vashchuko 11 months ago - 3 comments
Labels: demo

#529 - Update loading_from_pretrained.py

Pull Request - State: closed - Opened by jbloomAus 11 months ago - 2 comments

#528 - make tests pass mps

Pull Request - State: closed - Opened by jbloomAus 11 months ago

#524 - Remove FactoredMatrix.py<->utils.py circular dependency

Pull Request - State: open - Opened by ArthurConmy 11 months ago

#523 - [Bug Report] Residual Stack Not Adding Up

Issue - State: open - Opened by EitanGronich 11 months ago - 1 comment
Labels: documentation, demo

#522 - [Proposal] Include num_kv_heads in the Model Properties Table in the docs

Issue - State: closed - Opened by neelnanda-io 11 months ago - 2 comments

#521 - Add Mixtral

Pull Request - State: closed - Opened by collingray 11 months ago - 12 comments

#520 - Fix split_qkv_input for grouped query attention

Pull Request - State: closed - Opened by wesg52 11 months ago - 1 comment

#519 - [Bug Report] split_qkv_input issues with consistency and GQA

Issue - State: closed - Opened by wesg52 11 months ago - 1 comment

#518 - [Proposal] Update Attention parameter initialization to use Kaiming

Issue - State: open - Opened by justjhong 11 months ago
Labels: complexity-moderate

#517 - [Bug Report] Issues loading Llama-2

Issue - State: closed - Opened by davidquarel 12 months ago - 2 comments

#516 - chore: fixing type errors and enabling mypy

Pull Request - State: closed - Opened by chanind 12 months ago - 4 comments

#515 - In evals.IOIDataset, all entries in the dataset are the same.

Issue - State: open - Opened by tkwa 12 months ago
Labels: low-priority, complexity-simple

#510 - Speed up !pip install transformer-lens in colab

Pull Request - State: closed - Opened by pavanyellow 12 months ago - 3 comments

#509 - [Bug Report] Layer norm folding not properly implemented for BertBlock

Issue - State: open - Opened by soniajoseph about 1 year ago - 2 comments
Labels: bug, complexity-moderate

#506 - [Proposal] Add support for Qwen1.5 models

Issue - State: closed - Opened by andyrdt about 1 year ago - 1 comment

#505 - Refactor hook_points

Pull Request - State: closed - Opened by VasilGeorgiev39 about 1 year ago - 4 comments

#502 - [Question] How to migrate and use it on huggingface’s visual language model?

Issue - State: open - Opened by jiabao-wang about 1 year ago - 1 comment
Labels: complexity-high

#501 - [Draft] Support Flash Attention

Pull Request - State: open - Opened by cmathw about 1 year ago - 4 comments

#500 - [Question] Generation not possible with hooks?

Issue - State: closed - Opened by FergusFettes about 1 year ago - 1 comment

#497 - [Proposal] Speed up `!pip install transformer-lens` in colab

Issue - State: closed - Opened by ArthurConmy about 1 year ago - 4 comments

#496 - [Proposal] Add benchmarks for error introduced by HookedTransformer

Issue - State: closed - Opened by collingray about 1 year ago

#494 - Add Support for Yi-6B and Yi-34B

Pull Request - State: closed - Opened by collingray about 1 year ago

#493 - Construct causal mask on-the-fly

Pull Request - State: closed - Opened by andyrdt about 1 year ago - 8 comments

#492 - [Bug Report] Tiny stories models have longer n_ctx than they were trained with

Issue - State: open - Opened by nix-apollo about 1 year ago
Labels: good first issue, complexity-simple, implementation-inaccuracy

#488 - add optional arguments to make sure generate() works without tokenizer being defined

Pull Request - State: closed - Opened by JackCai1206 about 1 year ago - 1 comment

#486 - Loading of huggingface 4-bit quantized Llama

Pull Request - State: closed - Opened by coolvision about 1 year ago - 9 comments

#483 - [Bug Report] HookedTransformer.generate() with model.tokenizer unset gives pad_token_id error

Issue - State: open - Opened by JackCai1206 about 1 year ago - 2 comments
Labels: bug, low-priority, complexity-moderate

#482 - [Question] demo of 4bit quantized Llama -- what's next?

Issue - State: closed - Opened by coolvision about 1 year ago - 2 comments

#481 - [Bug Report] Demo for Tracr to TransformerLens is kind of broken

Issue - State: open - Opened by FlyingPumba about 1 year ago
Labels: bug, demo

#480 - [Question] load_state_dict: copy vs assign

Issue - State: closed - Opened by coolvision about 1 year ago - 3 comments

#479 - [Proposal] Memory efficient causal mask implementation

Issue - State: open - Opened by andyrdt about 1 year ago - 1 comment
Labels: enhancement, complexity-moderate

#474 - [Bug Report] Slack link is expired

Issue - State: closed - Opened by Aryan-Deshpande about 1 year ago - 2 comments

#473 - Make `tokenize_and_concatenate` work with more datasets

Pull Request - State: open - Opened by ArthurConmy about 1 year ago
Labels: enhancement

#472 - [Bug Report] `ActivationCache.apply_ln_to_stack` silently returns the resid stack for unsupported normalization types

Issue - State: closed - Opened by jmsdao about 1 year ago - 1 comment

#471 - [Proposal] Support Mixtral

Issue - State: closed - Opened by neelnanda-io about 1 year ago - 9 comments
Labels: enhancement, help wanted

#470 - [Bug Report] convert_llama_weights fails if I already quantized the weights to 4 bits

Issue - State: closed - Opened by abdurraheemali about 1 year ago - 2 comments

#468 - [Proposal] Add base parameter of rotary embedding in model configuration

Issue - State: closed - Opened by YuhengHuang42 about 1 year ago

#467 - [Bug Report] Attention masking is not used by model forward methods

Issue - State: closed - Opened by jmsdao about 1 year ago - 1 comment

#466 - (Draft) Add DLA function to utils

Pull Request - State: open - Opened by VasilGeorgiev39 about 1 year ago - 5 comments

#465 - Add pos_slice to run_with_cache

Pull Request - State: closed - Opened by VasilGeorgiev39 about 1 year ago - 3 comments

#464 - Draft PR: add Pyright static typing to hook_points.py

Pull Request - State: closed - Opened by starship006 about 1 year ago - 6 comments

#463 - Clean up project config

Pull Request - State: closed - Opened by alan-cooney about 1 year ago

#462 - [Proposal] Add support for Mamba

Issue - State: open - Opened by joker3212 about 1 year ago - 5 comments
Labels: complexity-high, model-request

#460 - [Question] weight decay for layernorms and biases?

Issue - State: closed - Opened by matthiasdellago about 1 year ago - 1 comment

#459 - [Question] I tried to load the bert model but got an error

Issue - State: closed - Opened by clclclaiggg about 1 year ago - 2 comments

#452 - Refactor components

Pull Request - State: closed - Opened by bryce13950 about 1 year ago - 3 comments

#446 - [Bug fix] GatedMLP not in docs. issue #264 [bug report]

Pull Request - State: closed - Opened by danlaudk about 1 year ago - 1 comment

#444 - Fix contributing docs typo

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#442 - Relax CUDA requirements

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#440 - add survey link

Pull Request - State: closed - Opened by jbloomAus over 1 year ago

#439 - [Bug Report] Load model to multiple devices

Issue - State: open - Opened by liuxin99 over 1 year ago - 7 comments
Labels: bug, multi-gpu

#437 - Update GitHub CD Actions

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#436 - Add docs hot reloading instructions for contributors

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#435 - Make unit & acceptance tests run in parallel

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#432 - Improve ActivationCache docs

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#431 - Move cspell conf to its own file

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#430 - Organise & fix README

Pull Request - State: closed - Opened by alan-cooney over 1 year ago - 2 comments

#429 - Fix Exploratory Analysis Demo

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#428 - Sync readme with docs

Pull Request - State: closed - Opened by alan-cooney over 1 year ago - 3 comments

#422 - Fix all Sphinx warnings

Pull Request - State: closed - Opened by alan-cooney over 1 year ago

#385 - [Bug Report] Pythia models / Rotary Embeddings don't match Huggingface.

Issue - State: open - Opened by UFO-101 over 1 year ago - 15 comments
Labels: complexity-high, implementation-inaccuracy

#375 - Fix to include ln_final.w in RMSNorm hook

Pull Request - State: closed - Opened by clarenceluo78 over 1 year ago - 5 comments
Labels: no-rebase

#327 - Reduce memory use when loading model

Pull Request - State: closed - Opened by slavachalnev over 1 year ago - 7 comments

#258 - [Proposal] Support BERT Model Series (and/or T5).

Issue - State: closed - Opened by jbloomAus almost 2 years ago - 16 comments

#207 - [Bug Report] Can't add hook to pretrained model: AssertionError: Cannot add hook blocks.0.hook_q_input if use_split_qkv_input is False

Issue - State: open - Opened by jbloomAus almost 2 years ago - 4 comments
Labels: bug, help wanted
