Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
GitHub / lucidrains/PaLM-rlhf-pytorch issues and pull requests
#59 - Is there any documentation to train this on my own data?
Issue -
State: open - Opened by gauravgandhi1315 4 months ago
#58 - How to use lora?
Issue -
State: open - Opened by xiaoguzai 4 months ago
#57 - Should critic's input be prompt only?
Issue -
State: open - Opened by ginward 7 months ago
#56 - Possible incorrect creation of Rotary Embeddings
Issue -
State: closed - Opened by AndyBarcia 8 months ago
- 1 comment
#55 - Update train.py
Pull Request -
State: closed - Opened by KnightBits 11 months ago
#54 - Flash Attention 2
Issue -
State: open - Opened by conceptofmind 12 months ago
#52 - implement an argument to directly set ff_inner_dim
Pull Request -
State: open - Opened by chris-ha458 about 1 year ago
- 3 comments
#51 - I looked at the llama source code and there is an intermediate layer
Issue -
State: open - Opened by wac81 about 1 year ago
#50 - Create ι£δΈͺ
Pull Request -
State: closed - Opened by userodk about 1 year ago
#49 - Model Name
Issue -
State: closed - Opened by conceptofmind about 1 year ago
- 3 comments
#48 - Is memory-efficient attention enabled by default if I don't use flash attn?
Issue -
State: open - Opened by wac81 about 1 year ago
- 3 comments
#47 - speed up with flash attn in A6000?
Issue -
State: closed - Opened by wac81 about 1 year ago
- 2 comments
#46 - norm.gamma not used during backprop
Issue -
State: closed - Opened by conceptofmind about 1 year ago
- 2 comments
#45 - I use other params with PaLM, but got an error
Issue -
State: closed - Opened by wac81 about 1 year ago
- 4 comments
#44 - Column and Row Parallel Linear for Apex Tensor Parallel
Issue -
State: closed - Opened by conceptofmind over 1 year ago
- 1 comment
#43 - Calculating the KL loss seems to have a mistake.
Issue -
State: closed - Opened by Nightbringers over 1 year ago
- 1 comment
#42 - Reason for using pooled critic embedding instead of the last embedding for value head
Issue -
State: closed - Opened by gblackout over 1 year ago
- 3 comments
#41 - Confusion about KL divergence calculation for human feedback policies
Issue -
State: closed - Opened by dwyzzy over 1 year ago
- 13 comments
#40 - Add PyTorch 2.0 Flash Attention
Pull Request -
State: closed - Opened by conceptofmind over 1 year ago
- 17 comments
#39 - mask raised error
Issue -
State: closed - Opened by gongel over 1 year ago
- 2 comments
#38 - KL divergence loss
Issue -
State: closed - Opened by taynoel84 over 1 year ago
- 1 comment
#37 - train your reward model issue
Issue -
State: open - Opened by wac81 over 1 year ago
- 1 comment
#36 - Cannot train the model using PyTorch version 2?
Issue -
State: closed - Opened by linhduongtuan over 1 year ago
- 1 comment
#35 - Value function
Issue -
State: open - Opened by tonylin52 over 1 year ago
#34 - test chat
Pull Request -
State: closed - Opened by strint over 1 year ago
#33 - Is it possible to train this ai using open-assistant or vice versa?
Issue -
State: closed - Opened by qwertystars over 1 year ago
- 1 comment
#32 - Can we exploit the AGI ability of ChatGPT?
Issue -
State: closed - Opened by youkpan over 1 year ago
#31 - Is this shift right for the action logits?
Issue -
State: closed - Opened by kisseternity over 1 year ago
- 4 comments
#30 - Do you need cuda for this?
Issue -
State: closed - Opened by beew over 1 year ago
- 1 comment
#29 - Are there some pictures that describe PaLM architecture?
Issue -
State: closed - Opened by guotong1988 over 1 year ago
- 1 comment
#28 - value function input
Issue -
State: closed - Opened by kkissmart over 1 year ago
- 1 comment
#26 - KL_div/ratio on policy
Issue -
State: closed - Opened by kkissmart over 1 year ago
#24 - Is it possible to replace PaLM with other huggingface pretrained language model?
Issue -
State: open - Opened by noanti over 1 year ago
- 2 comments
#23 - Is it possible to use OpenAI's ChatGPT to train this ChatGPT?
Issue -
State: open - Opened by Yonv1943 over 1 year ago
- 8 comments
#22 - The loss function of reward model.
Issue -
State: open - Opened by huzechuan over 1 year ago
- 2 comments
#21 - A few questions on training
Issue -
State: open - Opened by TheRealAakash over 1 year ago
- 3 comments
#20 - How to fine-tune and train on my own data?
Issue -
State: open - Opened by rbhatia46 over 1 year ago
#19 - Training the reward model
Issue -
State: closed - Opened by farhad-abdi over 1 year ago
- 8 comments
#18 - PaLM-rlhf-pytorch Roadmap
Issue -
State: closed - Opened by HappyPony over 1 year ago
- 4 comments
#17 - Help with computational power
Issue -
State: closed - Opened by byteunix over 1 year ago
- 4 comments
#16 - Is it possible to release a code based on jax?
Issue -
State: closed - Opened by sglucas over 1 year ago
- 7 comments
#15 - Simple Web Interface
Issue -
State: closed - Opened by conceptofmind over 1 year ago
- 2 comments
#14 - Why do the value calculations in generate and learn use different masks?
Issue -
State: closed - Opened by Nightbringers over 1 year ago
- 1 comment
#13 - Palm
Issue -
State: closed - Opened by Phob3tor over 1 year ago
#12 - Can we just replace PPO+RLHF with a preference model that's basically a transformer encoder + sigmoid model, trained with BCE? And during finetuning, perform reward maximization by just making the reward model predict 1s?
Issue -
State: closed - Opened by ssintelli over 1 year ago
- 5 comments
#11 - I'm dumb
Issue -
State: closed - Opened by cardonasMind over 1 year ago
- 1 comment
#10 - Bug fix: Correct function call in RewardModel->finetune_parameters
Pull Request -
State: closed - Opened by QasimWani over 1 year ago
- 2 comments
#9 - Can I train a model on my own data?
Issue -
State: closed - Opened by sveisa over 1 year ago
- 1 comment
#8 - Noob question: How can I use this model for inference?
Issue -
State: closed - Opened by PrasoonPratham over 1 year ago
- 1 comment
#7 - Update README.md
Pull Request -
State: closed - Opened by eltociear over 1 year ago
#6 - Encoder-Decoder
Issue -
State: closed - Opened by Bachstelze over 1 year ago
- 39 comments
#5 - GPU requirements
Issue -
State: closed - Opened by ejarkm over 1 year ago
- 3 comments
#4 - Unified reward function/model architecture for a wide range of tasks
Issue -
State: open - Opened by James4Ever0 over 1 year ago
- 2 comments
#3 - Add wandb logging
Pull Request -
State: open - Opened by ell-hol over 1 year ago
- 2 comments
#2 - Add HF's Accelerate
Pull Request -
State: closed - Opened by ell-hol over 1 year ago
- 4 comments
#1 - Easier (and faster) chunk and inplace under nograd
Pull Request -
State: closed - Opened by hypnopump over 1 year ago
- 1 comment