TransformerLensOrg/TransformerLens issues and pull requests

#666 - Fix attention result projection

Pull Request - State: closed - Opened by callummcdougall 7 months ago - 2 comments

#665 - [Proposal] Allow recent versions of beartype

Issue - State: open - Opened by jettjaniak 7 months ago - 6 comments
Labels: complexity-simple, tooling

#664 - [Question] Offline Error HookedTransformer.from_pretrained

Issue - State: closed - Opened by pbernabeup 7 months ago - 3 comments

#663 - Adding RMSNorm to apply_ln_to_stack

Pull Request - State: closed - Opened by gaabrielfranco 7 months ago - 1 comment

#662 - Add support for Qwen2 models

Pull Request - State: closed - Opened by g-w1 7 months ago - 3 comments

#661 - [Bug Report] Pythia output inconsistent across batch sizes when use_split_qkv_input=True

Issue - State: open - Opened by oliveradk 7 months ago
Labels: bug, complexity-high, implementation-inaccuracy

#660 - removed einsum causing error when use_atten_result is enabled

Pull Request - State: closed - Opened by oliveradk 7 months ago - 2 comments

#659 - [Bug Report] Attn Result hook not working

Issue - State: closed - Opened by oliveradk 7 months ago - 2 comments

#658 - docs: update Main_Demo.ipynb

Pull Request - State: closed - Opened by eltociear 7 months ago - 1 comment

#657 - [Bug Report] RMSNormPre in Transformer_lens is maybe different from Llama source code?

Issue - State: open - Opened by wangyifei0047 7 months ago - 1 comment
Labels: complexity-moderate, needs-investigation

#656 - Release 2.2

Pull Request - State: closed - Opened by bryce13950 7 months ago

#655 - Is it possible to use a locally downloaded model without accessing HF?

Issue - State: closed - Opened by ccp123456 7 months ago - 14 comments

#654 - Fix Out bias not being summed in attention component when using 4 bit precision

Pull Request - State: closed - Opened by FlyingPumba 7 months ago - 1 comment

#653 - [Question] loading Llama3-8B-instruct to HookedTransformer got a warning saying You are not using LayerNorm, so the writing weights can't be centered! Skipping!

Issue - State: closed - Opened by wangyifei0047 7 months ago - 1 comment

#652 - Mlp cleanup

Pull Request - State: closed - Opened by bryce13950 7 months ago

#651 - [Bug Report] Phi-3 Model does not load on Transformer Lens

Issue - State: closed - Opened by KanishkT123 8 months ago - 3 comments

#650 - Added support for Gemma-2

Pull Request - State: closed - Opened by neelnanda-io 8 months ago - 11 comments

#649 - Model baichuan

Pull Request - State: open - Opened by bryce13950 8 months ago

#648 - Fixed weight conversion

Pull Request - State: closed - Opened by bryce13950 8 months ago

#647 - Move out pretrained weight conversions

Pull Request - State: closed - Opened by richardkronick 8 months ago

#646 - Moved mixtral weights to another module

Pull Request - State: closed - Opened by bryce13950 8 months ago

#645 - Match Huggingface GPT2 implementation exactly

Pull Request - State: closed - Opened by joelburget 8 months ago - 2 comments

#644 - [Proposal] Documentation: Map the Act Names to the Transformer

Issue - State: open - Opened by JuVogt 8 months ago - 3 comments
Labels: documentation, complexity-moderate

#643 - Add tests for ActivationCache

Pull Request - State: closed - Opened by FlyingPumba 8 months ago - 5 comments

#642 - Steering vanilla GPT2 with SAE vectors based on transformerlens version of GPT2

Issue - State: closed - Opened by ianand 8 months ago - 3 comments

#641 - Match Huggingface MLP implementation exactly.

Pull Request - State: closed - Opened by joelburget 8 months ago - 2 comments

#640 - add better model properties table to docs

Pull Request - State: closed - Opened by mivanit 8 months ago - 1 comment

#639 - add tests for Attention

Pull Request - State: closed - Opened by anthonyduong9 8 months ago

#638 - Add tests for gated mlp

Pull Request - State: closed - Opened by anthonyduong9 8 months ago - 1 comment

#637 - Add comparing-to-huggingface.ipynb.

Pull Request - State: closed - Opened by joelburget 8 months ago

#636 - Fix typo in Main_Demo.ipynb

Pull Request - State: closed - Opened by ianand 8 months ago

#635 - Add skip_verbose_naming in add_hook to give an option for skipping the naming

Pull Request - State: closed - Opened by verlocks 8 months ago

#634 - Release 2.1

Pull Request - State: closed - Opened by bryce13950 8 months ago

#633 - Move out pretrained weight conversion functions

Pull Request - State: closed - Opened by richardkronick 8 months ago - 2 comments

#632 - Lock datasets version

Pull Request - State: closed - Opened by courtney-sims 8 months ago - 1 comment

#631 - [Proposal] Remove the overhead caused by full_hook.name = (hook.repr())?

Issue - State: closed - Opened by verlocks 8 months ago - 2 comments

#630 - Update README links to ARENA mech interp tutorials

Pull Request - State: closed - Opened by gileshd 8 months ago - 1 comment

#629 - NanoGPT Conversion did not handle case when there were no biases in model

Pull Request - State: open - Opened by dashstander 8 months ago - 1 comment

#628 - Refactor the utilities file into utilities folder

Pull Request - State: closed - Opened by starship006 8 months ago - 3 comments

#627 - Model config tests

Pull Request - State: open - Opened by curt-tigges 8 months ago - 1 comment

#626 - v2.0.1

Pull Request - State: closed - Opened by bryce13950 8 months ago

#625 - Fix demos pip install packages from unfound repos

Pull Request - State: closed - Opened by anthonyduong9 8 months ago - 1 comment

#624 - [docs] Update Mixtral repo reference

Pull Request - State: closed - Opened by joelburget 8 months ago

#623 - Unit tests loading from pretrained fill missing keys

Pull Request - State: closed - Opened by richardkronick 8 months ago - 1 comment

#622 - [Proposal] Add support for Baichuan1 and Baichuan2

Issue - State: open - Opened by StarrySeas1 8 months ago - 3 comments
Labels: complexity-moderate

#621 - [Bug Report] Getting `ModuleNotFoundError` when running `Grokking_Demo.ipynb`

Issue - State: closed - Opened by anthonyduong9 9 months ago - 2 comments

#620 - added news link

Pull Request - State: closed - Opened by bryce13950 9 months ago

#619 - Fix llama demos

Pull Request - State: closed - Opened by bryce13950 9 months ago

#618 - added release blog

Pull Request - State: closed - Opened by bryce13950 9 months ago

#617 - Add tests for hook point add hook

Pull Request - State: closed - Opened by anthonyduong9 9 months ago - 1 comment

#616 - fixed format

Pull Request - State: closed - Opened by bryce13950 9 months ago

#615 - [Bug Report] The output from HookedTransformer is not identical compared to Huggingface model for Llama 3

Issue - State: open - Opened by iamsimha 9 months ago - 12 comments
Labels: bug, complexity-high

#614 - v1.19

Pull Request - State: closed - Opened by bryce13950 9 months ago

#613 - moved enable hook functionality to separate functions and tested new functions

Pull Request - State: closed - Opened by bryce13950 9 months ago

#612 - [Proposal] Merge utils and utilities

Issue - State: closed - Opened by bryce13950 9 months ago - 1 comment
Labels: complexity-moderate, refactor

#611 - [Bug Report] Missing `.items()` in `HookedRootModule.hooks`

Issue - State: closed - Opened by dtch1997 9 months ago - 4 comments
Labels: bug, complexity-simple, needs-investigation

#610 - add n k v heads to model properties table

Pull Request - State: closed - Opened by anthonyduong9 9 months ago - 3 comments

#609 - More pytest fixtures

Pull Request - State: closed - Opened by bmillwood 9 months ago

#608 - 1.18 update

Pull Request - State: closed - Opened by bryce13950 9 months ago

#607 - (v3) Draft PR: add Pyright static typing to hook_points.py #590

Pull Request - State: closed - Opened by starship006 9 months ago - 3 comments

#606 - Add support for ai-forever/mGPT model

Pull Request - State: closed - Opened by SeuperHakkerJa 9 months ago - 1 comment

#605 - Encoder-Decoder (T5) support

Pull Request - State: closed - Opened by somvy 9 months ago - 6 comments

#604 - Support for T5 models

Pull Request - State: closed - Opened by somvy 9 months ago

#603 - Add support for model_name-less models.

Pull Request - State: open - Opened by ArthurConmy 9 months ago - 1 comment

#602 - Release 1.18

Pull Request - State: closed - Opened by bryce13950 9 months ago

#601 - Release 1.18

Pull Request - State: closed - Opened by bryce13950 9 months ago

#600 - removed Hooked SAE

Pull Request - State: closed - Opened by bryce13950 9 months ago

#599 - Make `FactoredMatrix` compatible with tensor-like arguments

Pull Request - State: open - Opened by JasonGross 9 months ago - 1 comment

#598 - Run GitHub CI on MacOS

Pull Request - State: open - Opened by bmillwood 9 months ago - 5 comments

#597 - allow user to force trust_remote_code=true via from_pretrained kwargs

Pull Request - State: closed - Opened by Butanium 9 months ago - 1 comment

#596 - Update Gemma to reflect upstream HF changes

Pull Request - State: closed - Opened by cmathw 9 months ago - 1 comment

#595 - [Feature Request] Add Stopping Criteria support

Issue - State: open - Opened by Butanium 9 months ago - 2 comments
Labels: enhancement, complexity-high

#594 - [Bug Report] Updates to Gemma

Issue - State: closed - Opened by cmathw 9 months ago

#593 - [Bug Report] First google search result for transformer lens is a 404

Issue - State: closed - Opened by FabienRoger 9 months ago - 3 comments

#592 - Setup for fine tuned Mistral model ?

Issue - State: closed - Opened by SiddhantOjha17 9 months ago - 8 comments
Labels: question

#591 - [Bug Report] TransformerLens's use of `einsum` leads to different training dynamics on TPUs

Issue - State: closed - Opened by jqhoogland 9 months ago - 9 comments
Labels: wontfix

#590 - (v2) Draft PR: add Pyright static typing to hook_points.py

Pull Request - State: closed - Opened by starship006 9 months ago - 11 comments

#589 - Interactive neuroscope ci

Pull Request - State: closed - Opened by bryce13950 9 months ago

#588 - [Proposal] Setup unit tests to cover model configurations

Issue - State: open - Opened by bryce13950 9 months ago - 1 comment
Labels: good first issue, testing

#587 - Mistral 7b v0.2

Pull Request - State: open - Opened by bryce13950 9 months ago

#586 - Revert "Add Mistral 7B v0.2 Instruct"

Pull Request - State: closed - Opened by bryce13950 9 months ago

#585 - Fix docs badge in README

Pull Request - State: closed - Opened by ArthurConmy 9 months ago

#584 - Add `.from_pretrained` to HookedSAE

Pull Request - State: closed - Opened by ArthurConmy 9 months ago - 2 comments

#583 - updated PR template to add a note about merging from different branches

Pull Request - State: closed - Opened by bryce13950 9 months ago

#582 - Release 2.0

Pull Request - State: closed - Opened by bryce13950 9 months ago

#581 - updated pull request template to account for new dev branch

Pull Request - State: closed - Opened by bryce13950 9 months ago

#580 - updated repo URL throughout the project

Pull Request - State: closed - Opened by bryce13950 9 months ago

#579 - Add Mistral 7B v0.2 Instruct

Pull Request - State: closed - Opened by fakerybakery 9 months ago - 4 comments

#578 - Fix Pos Slice Issue

Pull Request - State: closed - Opened by hannamw 9 months ago - 3 comments

#577 - unwrapped config

Pull Request - State: closed - Opened by bryce13950 9 months ago

#576 - Refactor integration tests

Pull Request - State: closed - Opened by bryce13950 9 months ago

#575 - [Bug Report] Can't run the 4-bit quantized Llama-2 demo

Issue - State: closed - Opened by atlaie 9 months ago - 10 comments
Labels: bug, complexity-moderate

#574 - [Bug Report] Bug in `get_caching_hooks` when `pos_slice=None`

Issue - State: closed - Opened by hannamw 9 months ago - 3 comments

#573 - Add support for Phi-3

Pull Request - State: closed - Opened by slash3g 9 months ago - 6 comments

#572 - Fix broken HookedSAETransformer demo links

Pull Request - State: closed - Opened by ckkissane 9 months ago - 2 comments

#571 - added convenience function for unwrapping config to replace commonly …

Pull Request - State: closed - Opened by bryce13950 9 months ago

#570 - [Bug Report] Mixtral generates nonsense

Issue - State: closed - Opened by joelburget 9 months ago - 42 comments

#569 - [Bug Report] Unable to Llama 3 70b on multigpu in 4bit

Issue - State: open - Opened by winglian 10 months ago - 8 comments
Labels: bug, complexity-high, multi-gpu

#568 - added debug step

Pull Request - State: closed - Opened by bryce13950 10 months ago

#567 - Othello ci

Pull Request - State: closed - Opened by bryce13950 10 months ago
