linkedin/Liger-Kernel issues and pull requests

#561 - Update pyproject.toml

Pull Request - State: closed - Opened by shivam15s 6 days ago

#560 - add fuse reglu op with test

Pull Request - State: closed - Opened by YixinSong-e 7 days ago

#559 - add batch_norm op with test and benchmark

Pull Request - State: open - Opened by yanghailong-git 9 days ago - 1 comment

#558 - Support Granite 3.0 and 3.1 models

Pull Request - State: open - Opened by JamesKunstle 12 days ago - 1 comment

#557 - Support IBM Granite 3.(0, 1) models

Issue - State: open - Opened by JamesKunstle 12 days ago - 1 comment

#556 - add batch_norm op

Pull Request - State: closed - Opened by yanghailong-git 14 days ago

#555 - RMSNorm & SwiGLU activation recomputation

Issue - State: open - Opened by huyiwen 15 days ago - 2 comments

#554 - Fix DPO unit test fail and refactor

Pull Request - State: closed - Opened by Tcc0403 15 days ago - 5 comments

#553 - Grpo loss

Pull Request - State: open - Opened by kashif 15 days ago

#552 - Monkeypatch for Qwen2.5-VL

Pull Request - State: open - Opened by BenasdTW 16 days ago - 5 comments

#551 - [Verl] Add entropy loss to `cross_entropy_loss` and `fused_linear_cross_entropy_loss`

Pull Request - State: open - Opened by hongpeng-guo 18 days ago - 2 comments

#546 - [WIP] Fix convergence test for future transformers version

Pull Request - State: open - Opened by Tcc0403 18 days ago

#543 - The convergence test `test_mini_models_with_logits` is failing with the latest transformers

Issue - State: open - Opened by Tcc0403 20 days ago - 7 comments
Labels: bug

#542 - `revert_liger_kernel_to_xxx` can't revert LigerCrossEntropyLoss for transformers>=4.46.1

Issue - State: open - Opened by Tcc0403 20 days ago - 3 comments
Labels: bug

#541 - Format files

Pull Request - State: closed - Opened by austin362667 21 days ago

#540 - Add Flex Attention Monkey Patch for LLAMA

Pull Request - State: open - Opened by austin362667 22 days ago

#539 - Improve Hugging Face SFT Script

Pull Request - State: open - Opened by ParagEkbote 23 days ago - 5 comments

#538 - Issue while building from source on ROCM

Issue - State: open - Opened by agunapal 25 days ago - 3 comments

#537 - Support the new Solar architecture

Issue - State: open - Opened by arnavgarg1 25 days ago

#536 - add GitHub CI for Intel GPU

Pull Request - State: open - Opened by faaany 26 days ago - 6 comments

#535 - [tests] use a valid hexadecimal string instead of a placeholder

Pull Request - State: closed - Opened by faaany 26 days ago - 1 comment

#534 - Add Mkdocs related dependencies to setup.py

Pull Request - State: closed - Opened by hebiao064 26 days ago

#533 - Gradient checkpointing for `grad_weight` in LFCE

Issue - State: open - Opened by cassanof 27 days ago - 4 comments

#532 - Fix LlamaRotaryEmbedding Tests [#520]

Pull Request - State: closed - Opened by manojks1999 28 days ago - 1 comment

#531 - Remove extra print

Pull Request - State: closed - Opened by apaz-cli 29 days ago - 1 comment

#530 - Add argument `return_z_loss` to flce

Pull Request - State: closed - Opened by Tcc0403 29 days ago - 1 comment

#529 - Handle cache_position for transformers 4.47.0 and later (#528)

Pull Request - State: closed - Opened by BenasdTW 29 days ago - 3 comments

#528 - Qwen2-VL breaks with transformers version 4.47.0+: `TypeError: lce_forward() got an unexpected keyword argument 'cache_position'`

Issue - State: closed - Opened by BenasdTW 30 days ago - 1 comment

#527 - `return_z_loss` is not supported for `LigerFusedLinearCrossEntropyFunction` and `LigerFusedLinearCrossEntropyLoss`

Issue - State: closed - Opened by apaz-cli 30 days ago
Labels: good first issue

#526 - Fix HF `transformers` Breaking Changes

Pull Request - State: closed - Opened by austin362667 about 1 month ago - 1 comment

#525 - `LlamaRotaryEmbedding` Input Argument Is Inconsistent with Hugging Face

Issue - State: closed - Opened by austin362667 about 1 month ago

#524 - Add huggingface llava

Pull Request - State: open - Opened by jp1924 about 1 month ago - 20 comments

#523 - Fix Unit Test error brought by Transformer Breaking Changes

Pull Request - State: closed - Opened by hebiao064 about 1 month ago - 3 comments

#522 - [Tiny] Add QVQ to readme

Pull Request - State: closed - Opened by tyler-romero about 1 month ago

#521 - [DPO] add reference log-prob outputs in DPO

Pull Request - State: open - Opened by kashif about 1 month ago

#520 - NVIDIA CI failing due to transformers v4.48.0 refactor

Issue - State: closed - Opened by Tcc0403 about 1 month ago - 2 comments
Labels: bug, good first issue

#519 - Fix mean subtraction in layer norm kernels

Pull Request - State: open - Opened by nhamanasu about 1 month ago - 5 comments

#518 - For better numerical accuracy in LayerNorm

Issue - State: open - Opened by nhamanasu about 1 month ago - 2 comments

#517 - Memory Optimization with Liger Kernel Shows Limited Effect on larger Model (more than 7B)

Issue - State: open - Opened by dyyoungg about 1 month ago - 3 comments

#515 - IndexError: The shape of the mask [7387] at index 0 does not match the shape of the indexed tensor [1] at index 0

Issue - State: closed - Opened by 14H034160212 about 1 month ago - 2 comments
Labels: bug, good first issue

#514 - Any plans to add models from the llava series?

Issue - State: open - Opened by jp1924 about 1 month ago - 10 comments

#513 - Megatron Support

Issue - State: open - Opened by huyiwen about 1 month ago

#512 - `LigerFusedLinearCrossEntropyLoss` Causes Training Loss to Diverge After Reaching ~8

Issue - State: open - Opened by penghui-yang about 1 month ago - 6 comments

#511 - Refactor CrossEntropy and FusedLinearCrossEntropy

Pull Request - State: open - Opened by Tcc0403 about 1 month ago

#510 - Add `average_log_prob` args for cpo

Pull Request - State: open - Opened by Mecoli1219 about 1 month ago - 1 comment

#509 - Is Liger-Kernel significantly slower than torch based on benchmark?

Issue - State: closed - Opened by wa008 about 2 months ago - 1 comment

#508 - Set z_loss_1d=None when return_z_loss=False in cross_entropy_loss to avoid tl.store fail when triton_interpret=1(for tl.device_print etc.)

Pull Request - State: closed - Opened by wa008 about 2 months ago

#507 - result of LigerCrossEntropyLoss is always 0

Issue - State: closed - Opened by wa008 about 2 months ago - 6 comments

#506 - [CI] Add ROCm 6.3 CI

Pull Request - State: open - Opened by tjtanaa about 2 months ago - 4 comments

#505 - Consider support liger kernel for internlm model

Issue - State: open - Opened by 14H034160212 about 2 months ago

#504 - Add unit tests for shared prefix masked attention with `torch.FlexAttention`

Pull Request - State: open - Opened by austin362667 about 2 months ago - 2 comments

#503 - [ORPO] add nll_target for orpo nll loss

Pull Request - State: open - Opened by kashif about 2 months ago

#502 - Fix Dtype Mismatch in torch.addmm within ops/fused_linear_cross_entropy.py in AMP training.

Pull Request - State: closed - Opened by DandinPower about 2 months ago - 5 comments

#501 - Dtype Mismatch in `torch.addmm` within `ops/fused_linear_cross_entropy.py` in AMP training

Issue - State: closed - Opened by DandinPower about 2 months ago

#500 - Extending Liger-Kernel Optimizations to Encoder Models Like BER

Issue - State: open - Opened by pengzhangzhi about 2 months ago

#499 - [Model] DeepseekV2 Support

Pull Request - State: open - Opened by saurabhkoshatwar about 2 months ago - 1 comment

#498 - [tests] skip failed tests for xpu

Pull Request - State: closed - Opened by faaany about 2 months ago - 2 comments

#497 - annotate tl constexpr values

Pull Request - State: closed - Opened by winglian about 2 months ago

#496 - Fix/liger fused linear cross entropy function does not support reduction=none

Pull Request - State: closed - Opened by ryankert01 about 2 months ago - 1 comment

#495 - speed up kto loss and some refactor

Pull Request - State: closed - Opened by shivam15s about 2 months ago

#494 - speed up kto loss and some refactor

Pull Request - State: closed - Opened by shivam15s about 2 months ago

#493 - CPO & SimPO add label_smoothing

Pull Request - State: closed - Opened by Mecoli1219 about 2 months ago - 3 comments

#492 - Add `aux_outputs` for CPO and SimPO

Pull Request - State: open - Opened by Mecoli1219 about 2 months ago

#491 - Refactor chunked preference functions and distillation base class

Pull Request - State: open - Opened by shivam15s about 2 months ago

#490 - fix dpo tests: reduce tolerance and change default compute_nll_loss false

Pull Request - State: closed - Opened by shivam15s about 2 months ago

#489 - Revert "fix chosen_nll_loss in chunked losses (#486)"

Pull Request - State: closed - Opened by shivam15s about 2 months ago

#488 - LigerFusedLinearCrossEntropyFunction does not support reduction=None

Issue - State: closed - Opened by Xiang-cd 2 months ago - 1 comment
Labels: good first issue

#487 - error when run `sh run_qwen.sh`

Issue - State: open - Opened by CharlesJhonson 2 months ago - 3 comments
Labels: good first issue

#486 - fix chosen_nll_loss in chunked losses

Pull Request - State: closed - Opened by kashif 2 months ago

#485 - Create Docs for Liger-Kernel

Pull Request - State: closed - Opened by ParagEkbote 2 months ago - 8 comments

#484 - Fix Preference Loss and Refactor for Readability

Pull Request - State: closed - Opened by austin362667 2 months ago

#483 - Move the checkstyle to [Ruff](https://docs.astral.sh/ruff/)

Pull Request - State: closed - Opened by shivam15s 2 months ago - 1 comment

#482 - fix: correct typos in docstrings

Pull Request - State: closed - Opened by shivam15s 2 months ago

#481 - preference loss sign is inverted and leads to negative loss

Pull Request - State: closed - Opened by winglian 2 months ago - 1 comment

#480 - Fine-tuned qwen2.5-7b reported a backward error

Issue - State: closed - Opened by chenchen0611 2 months ago - 1 comment

#479 - [Transformer] fix ORPO loss for MOE models

Pull Request - State: closed - Opened by kashif 2 months ago

#478 - What's the difference with FlagGems

Issue - State: closed - Opened by CharlesJhonson 2 months ago

#477 - Fix Rope Compatibility with Cos/Sin Position Embedding for Batch Size > 1

Pull Request - State: closed - Opened by wizyoung 2 months ago - 1 comment

#476 - Potential Optimization for Preference Training with Prefix Sharing

Issue - State: open - Opened by austin362667 2 months ago

#475 - Add KTO Loss

Pull Request - State: closed - Opened by hebiao064 2 months ago - 2 comments

#474 - error when run kernel test

Issue - State: closed - Opened by CharlesJhonson 2 months ago - 3 comments

#473 - align post training loss at the center

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#472 - Add more post training in readme

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#471 - [CI] runtime pip install using uv

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#470 - modify ref_input in chunked_loss base class and fix tests

Pull Request - State: closed - Opened by shivam15s 2 months ago

#469 - Revert "Add ref_input parameter to support separate inputs for reference model"

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#468 - test: Add test for ref_input parameter in fused linear preference

Pull Request - State: open - Opened by xingyaoww 2 months ago

#467 - Add ref_input parameter to support separate inputs for reference model

Pull Request - State: closed - Opened by xingyaoww 2 months ago - 3 comments

#466 - Revert Workaround of Disabling QWEN2_VL in Convergence Tests

Pull Request - State: closed - Opened by austin362667 2 months ago

#465 - Add on-paper form of RoPE kernel

Pull Request - State: open - Opened by Comet0322 2 months ago - 1 comment

#464 - Fix Qwen2VL mrope for transformers 4.47.0

Pull Request - State: closed - Opened by li-plus 2 months ago - 5 comments

#463 - Disable Qwen2 VL test for with logits conv test

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#462 - Update pyproject.toml

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#461 - Qwen VL Convergence Test Fails for Transformers >= 4.47.0

Issue - State: closed - Opened by ByronHsu 2 months ago - 2 comments

#460 - Add dynamic dependency management for CUDA and ROCm

Pull Request - State: closed - Opened by hebiao064 2 months ago

#459 - Fix liger orpo trainer import error

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#458 - [0.5.0] from trl.trainer import ORPOTrainer ModuleNotFoundError: No module named 'trl'

Issue - State: closed - Opened by Fazziekey 2 months ago - 2 comments

#457 - add sponsorship and collab

Pull Request - State: closed - Opened by ByronHsu 2 months ago

#456 - Add HIP (ROCm) and Liger Kernel to env report

Pull Request - State: closed - Opened by Comet0322 2 months ago

#455 - version bump to 0.5.0

Pull Request - State: closed - Opened by shivam15s 2 months ago
