ayaka14732/llama-2-jax issues and pull requests

#29 - Generated tokens contain gibberish when giving multiple inputs, i.e. batch > 1

Issue - State: open - Opened by ksmdnl 8 months ago

#28 - Llama2 fine-tuned chat version

Issue - State: open - Opened by ksmdnl 8 months ago

#27 - Training From Scrach -- Redpajama

Issue - State: open - Opened by opooladz 11 months ago

#26 - Unable to shard llama 13B (and 70B) on v4-32 TPU

Issue - State: open - Opened by defdet 11 months ago

#25 - Allow for transfer learning

Pull Request - State: closed - Opened by defdet about 1 year ago

#24 - Update

Pull Request - State: closed - Opened by divyapatel4 about 1 year ago

#23 - correct the formula for k

Pull Request - State: closed - Opened by defdet about 1 year ago

#22 - Problems sharding Llama-70B on TPU v3-32

Issue - State: open - Opened by divyapatel4 about 1 year ago - 1 comment

#21 - Adapt the repo for Mistral. Also error in calculation of `d_k` in `determine_params.py`

Issue - State: closed - Opened by defdet about 1 year ago - 6 comments

#20 - DEADLINE_EXCEEDED when running train.py on GPU node

Issue - State: open - Opened by zigzagcai about 1 year ago

#18 - Improve generation speed and add benchmark for generation

Pull Request - State: open - Opened by ayaka14732 over 1 year ago

#17 - Fix training

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#16 - forward_llama() missing 1 required keyword-only argument: 'rotary_values'

Issue - State: closed - Opened by GluckLee over 1 year ago - 2 comments

#15 - Generation speed

Issue - State: open - Opened by sh0416 over 1 year ago - 10 comments

#14 - Got jax.errors.TracerIntegerConversionError when running generate.py

Issue - State: open - Opened by zhangzx-uiuc over 1 year ago - 2 comments

#13 - Implement left padding

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#12 - Implement KV cache

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#11 - Update

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#10 - train.py OOM on TPUv3-8

Issue - State: open - Opened by ethanhe42 over 1 year ago - 9 comments

#9 - HF LLaMA Flax

Issue - State: open - Opened by sanchit-gandhi over 1 year ago - 1 comment

#7 - 13B parameter model

Issue - State: open - Opened by aniquetahir over 1 year ago

#6 - Update

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#5 - Convert back to Hugging Face model

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#4 - Multihost training support

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#3 - Update

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#2 - Update to Llama 2

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago

#1 - Update to Llama 2

Pull Request - State: closed - Opened by ayaka14732 over 1 year ago
