TransformerLensOrg/TransformerLens issues and pull requests

#855 - moved setup python

Pull Request - State: closed - Opened by bryce13950 4 days ago

#854 - Ci hf secret

Pull Request - State: closed - Opened by bryce13950 4 days ago

#853 - Ci hf token empty

Pull Request - State: closed - Opened by bryce13950 4 days ago

#852 - Visualize weight conversions

Pull Request - State: open - Opened by degenfabian 4 days ago

#851 - Visualize weight conversions

Pull Request - State: closed - Opened by degenfabian 4 days ago

#850 - updated artifacts version

Pull Request - State: closed - Opened by bryce13950 5 days ago

#849 - Upgrade transformers

Pull Request - State: closed - Opened by bryce13950 5 days ago

#848 - Release 2.13.0

Pull Request - State: closed - Opened by bryce13950 5 days ago

#847 - feat: streaming response for HookedTransformer.generate

Pull Request - State: open - Opened by hijohnnylin 9 days ago - 3 comments

#846 - [Bug Report] Prioritize Local hf_model.config for Qwen Models to Avoid Unnecessary Hugging Face API Calls

Issue - State: open - Opened by yhr-code 9 days ago
Labels: high-priority, complexity-moderate

#845 - Manually create repr for partial hooks

Pull Request - State: closed - Opened by danbraunai 11 days ago - 3 comments

#844 - [Compatibility Report] meta-llama/Llama-3.3-70B-Instruct does not work

Issue - State: open - Opened by thisnick 16 days ago - 2 comments
Labels: complexity-moderate, implementation-inaccuracy

#843 - added python 3.13 to CI

Pull Request - State: open - Opened by bryce13950 18 days ago

#842 - [Question] code error when context length increases

Issue - State: closed - Opened by gom168 19 days ago

#841 - Recent releases

Pull Request - State: closed - Opened by bryce13950 19 days ago

#840 - Upstream update

Pull Request - State: closed - Opened by bryce13950 19 days ago

#839 - Phi 4 docs fix

Pull Request - State: closed - Opened by bryce13950 19 days ago

#838 - Enables return of activation cache variables during generation

Pull Request - State: open - Opened by japhba 21 days ago

#837 - [Bug Report] Multi-GPU device ordinal issue

Issue - State: open - Opened by safiness 21 days ago
Labels: bug, complexity-high, multi-gpu

#836 - added changes needed for phi-4 docs

Pull Request - State: closed - Opened by bryce13950 21 days ago

#835 - Version 2.12

Pull Request - State: closed - Opened by bryce13950 21 days ago

#834 - Rotary factory

Pull Request - State: open - Opened by jonasrohw 24 days ago

#833 - Added model Phi 4

Pull Request - State: closed - Opened by jonasrohw 24 days ago

#832 - Extend support for T5 models

Pull Request - State: closed - Opened by degenfabian 25 days ago

#831 - updated lock command

Pull Request - State: closed - Opened by bryce13950 25 days ago

#830 - [Proposal] Type hint support for `self.model` in `ActivationCache`

Issue - State: open - Opened by Ja1Zhou 25 days ago - 1 comment
Labels: complexity-moderate

#829 - Extend Bert support

Pull Request - State: open - Opened by degenfabian about 1 month ago - 1 comment

#828 - [Bug Report] Logit attribution produces different values for Gemma2

Issue - State: closed - Opened by wj210 about 1 month ago - 1 comment

#827 - Release 2.11

Pull Request - State: closed - Opened by bryce13950 about 1 month ago

#826 - Feature llama 33

Pull Request - State: closed - Opened by bryce13950 about 1 month ago

#825 - Depen transformers

Pull Request - State: closed - Opened by bryce13950 about 1 month ago - 1 comment

#824 - Jaxtyping Dependency Update

Pull Request - State: open - Opened by bryce13950 about 1 month ago

#823 - Test match hugging face

Pull Request - State: closed - Opened by bryce13950 about 1 month ago

#822 - Updates torch to use the most recent version

Pull Request - State: closed - Opened by bryce13950 about 1 month ago

#821 - updated python requirements

Pull Request - State: closed - Opened by bryce13950 about 1 month ago

#820 - Add LLaVA support, modify generate function

Pull Request - State: closed - Opened by zazamrykh about 2 months ago - 12 comments

#819 - [Bug Report] Memory Leakage in GEMMA2 -2b

Issue - State: closed - Opened by stupid-learner about 2 months ago - 1 comment

#818 - Add LLaVA support and make HookedTransformer.generate() able to get embedding as inputs

Pull Request - State: closed - Opened by zazamrykh about 2 months ago - 4 comments

#817 - fixed corner param

Pull Request - State: closed - Opened by bryce13950 about 2 months ago

#816 - Added OLMo(E) v1

Pull Request - State: open - Opened by jonasrohw about 2 months ago

#815 - Set prepend_bos to false by default for Qwen models

Pull Request - State: closed - Opened by degenfabian about 2 months ago

#814 - Fix that padding_side always defaults to "right" when no value is explicitly passed

Pull Request - State: open - Opened by degenfabian about 2 months ago - 1 comment

#813 - Release 2.10

Pull Request - State: closed - Opened by bryce13950 about 2 months ago

#812 - Updated devcontainers to use python3.11

Pull Request - State: closed - Opened by jonasrohw about 2 months ago

#811 - Add support for Qwen_with_Questions

Pull Request - State: closed - Opened by degenfabian about 2 months ago

#810 - Throw error when using attn_in with grouped query attention

Pull Request - State: closed - Opened by degenfabian about 2 months ago

#809 - Added support for Qwen2.5

Pull Request - State: closed - Opened by israel-adewuyi 2 months ago

#808 - [Proposal] Add support for QwQ models

Issue - State: closed - Opened by bryce13950 2 months ago - 1 comment
Labels: complexity-moderate, model-request

#807 - Add a demo of collecting activations from a single location in the model.

Pull Request - State: closed - Opened by adamkarvonen 2 months ago - 1 comment

#806 - Set default_prepend_bos to False in Bloom model configuration

Pull Request - State: closed - Opened by degenfabian 2 months ago - 2 comments

#805 - [Question] What is stanford-gpt2-small-a?

Issue - State: closed - Opened by leo1oel 2 months ago - 1 comment

#804 - [Proposal] Compatibility for OLMo and OLMo2?

Issue - State: open - Opened by spaidataiga 2 months ago - 4 comments
Labels: complexity-moderate, model-request

#803 - Circular dependency resolution

Pull Request - State: closed - Opened by bryce13950 2 months ago

#802 - bumped python min version

Pull Request - State: closed - Opened by bryce13950 2 months ago

#801 - [Bug Report] Padding side inconsistency with Huggingface Transformers

Issue - State: open - Opened by spfrommer 2 months ago
Labels: bug, complexity-moderate, needs-investigation

#800 - [Bug Report] Load model problem

Issue - State: open - Opened by LiuJinzhe-Keepgoing 2 months ago - 4 comments
Labels: question

#799 - clarified arguments a bit for hook_points

Pull Request - State: closed - Opened by bryce13950 2 months ago

#798 - [Proposal] Remove instances where `model_args` and `model_kwargs` are provided

Issue - State: open - Opened by bryce13950 2 months ago
Labels: complexity-simple, breaking-change

#797 - Actions token access

Pull Request - State: closed - Opened by bryce13950 3 months ago

#796 - [Proposal] change FactoredMatrix.svd() so it doesn't prevent all instances of FactoredMatrix from being garbage collected

Issue - State: open - Opened by manulari 3 months ago - 1 comment
Labels: complexity-moderate

#795 - Raise exception when BERT is loaded with HookedTransformer instead of…

Pull Request - State: closed - Opened by degenfabian 3 months ago - 1 comment

#794 - Colab compatibility bug fixes

Pull Request - State: closed - Opened by degenfabian 3 months ago

#793 - Remove einsum usage in forward function of BertMLMHead

Pull Request - State: closed - Opened by degenfabian 3 months ago

#792 - Remove einsum usage in _get_w_in_matrix in SVDInterpreter

Pull Request - State: closed - Opened by degenfabian 3 months ago

#791 - Remove einsum usage in refactor_factored_attn_matrices in HookedTransformer

Pull Request - State: closed - Opened by degenfabian 3 months ago

#790 - Remove einsum in complex_attn_linear

Pull Request - State: closed - Opened by degenfabian 3 months ago - 2 comments

#789 - Remove einsum in compute_head_results in ActivationCache

Pull Request - State: closed - Opened by degenfabian 3 months ago

#788 - Remove einsum in logit_attrs in ActivationCache

Pull Request - State: closed - Opened by degenfabian 3 months ago

#787 - v2.9.1

Pull Request - State: closed - Opened by bryce13950 3 months ago

#786 - added typeguard dependency

Pull Request - State: closed - Opened by bryce13950 3 months ago

#785 - [Bug Report] Error on import due to missing `typeguard` dependency

Issue - State: closed - Opened by chanind 3 months ago - 1 comment

#784 - [Question] How to load a model in smaller precision?

Issue - State: open - Opened by MittelmanDaniel 3 months ago - 5 comments
Labels: needs-information

#783 - Remove einsum in forward pass in AbstractAttention

Pull Request - State: closed - Opened by degenfabian 3 months ago - 2 comments

#782 - Remove einsum in apply_causal_mask in abstract_attention.py

Pull Request - State: closed - Opened by degenfabian 3 months ago - 1 comment

#781 - Remove einsum usage from create_alibi_bias function

Pull Request - State: closed - Opened by degenfabian 3 months ago - 1 comment

#780 - Release 2.9

Pull Request - State: closed - Opened by bryce13950 3 months ago

#779 - Add model upload and load

Pull Request - State: open - Opened by mntss 3 months ago

#778 - [Bug Report] Global and Local Attn layer order of Gemma2 is wrong?

Issue - State: open - Opened by huangxt39 3 months ago
Labels: complexity-moderate, implementation-inaccuracy

#777 - Fix that if use_past_kv_cache is set to True models from the Bloom family produce weird outputs.

Pull Request - State: closed - Opened by degenfabian 3 months ago

#776 - [Bug Report] use_past_kv_cache yields weird outputs when used with Bloom model family

Issue - State: closed - Opened by degenfabian 3 months ago
Labels: complexity-moderate

#775 - Set prepend_bos to false by default for Bloom model family

Pull Request - State: closed - Opened by degenfabian 3 months ago - 6 comments

#774 - [Proposal] prepend_bos should by default be set to false for the Bloom model family

Issue - State: closed - Opened by degenfabian 3 months ago
Labels: complexity-moderate

#773 - [Question] Would it be possible to adopt TransformerLens on models with a different layernorm implementation?

Issue - State: open - Opened by Steven-Yiran 3 months ago - 2 comments
Labels: question, complexity-high

#772 - fix the bug that attention_mask and past_kv_cache cannot work together

Pull Request - State: closed - Opened by yzhhr 3 months ago

#771 - Would you like to support models from the Qwen 2.5 series?

Issue - State: closed - Opened by ArcherShirou 3 months ago - 1 comment

#770 - Restore consistency of hook_normalized between LayerNorm and RMSNorm

Pull Request - State: open - Opened by degenfabian 3 months ago - 1 comment

#769 - improve model properties table in docs

Pull Request - State: open - Opened by mivanit 3 months ago - 6 comments

#768 - Add support for `Mistral-Nemo-Base-2407` model (#751)

Pull Request - State: closed - Opened by frances720 4 months ago

#767 - v2.8.1

Pull Request - State: closed - Opened by bryce13950 4 months ago

#766 - [Question] Question title

Issue - State: closed - Opened by MittelmanDaniel 4 months ago - 4 comments

#765 - Logit comparator tool

Pull Request - State: closed - Opened by curt-tigges 4 months ago

#764 - Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series

Pull Request - State: closed - Opened by Hzfinfdu 4 months ago - 2 comments

#763 - Add support for NTK-by-Part Rotary Embedding & set correct rotary base for Llama-3.1 series

Pull Request - State: closed - Opened by Hzfinfdu 4 months ago

#762 - [Question] compatibility for 'Qwen/Qwen2.5-14B'

Issue - State: closed - Opened by hgftrdw45ud67is8o89 4 months ago - 1 comment
Labels: complexity-moderate, model-request

#761 - Add configurations for Llama 3.1 models(Llama-3.1-8B and Llama-3.1-70B)

Pull Request - State: closed - Opened by vatsalrathod16 4 months ago - 2 comments

#760 - Add py.typed for type hints

Pull Request - State: open - Opened by UFO-101 4 months ago - 2 comments

#759 - New issue template for reporting model compatibility

Pull Request - State: closed - Opened by bryce13950 4 months ago

#758 - added new block for recent diagram, and colab compatibility notebook

Pull Request - State: closed - Opened by bryce13950 4 months ago

#757 - Add warning and halt execution for incorrect T5 model usage

Pull Request - State: closed - Opened by vatsalrathod16 4 months ago

#756 - Release 2.8

Pull Request - State: closed - Opened by bryce13950 4 months ago
