Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / hamelsmu/llama-inference issues and pull requests
#17 - CTranslate2: support multiple GPUs (if run with mpirun) and Flash Attention 2.
Pull Request
State: open - Opened by ivanbaldo 7 months ago
#16 - OpenAI compatible servers benchmark based on the anyscale and exllama benchmarks.
Pull Request
State: open - Opened by ivanbaldo 7 months ago
#15 - Update exllama/bench.py from OpenAI 0.27.8 to 1.16.2.
Pull Request
State: open - Opened by ivanbaldo 7 months ago
#14 - Upstream renamed from mlc_chat to mlc_llm.
Pull Request
State: open - Opened by ivanbaldo 8 months ago
#13 - Add new candle-vllm Dockerfile with instructions to benchmark it.
Pull Request
State: open - Opened by ivanbaldo 9 months ago
#12 - Add PowerInfer benchmark.
Pull Request
State: open - Opened by ivanbaldo 10 months ago
#11 - Add mlc/Dockerfile with instructions inside it.
Pull Request
State: open - Opened by ivanbaldo 10 months ago
#10 - Add ctranslate/Dockerfile with instructions to use it.
Pull Request
State: open - Opened by ivanbaldo 10 months ago
#9 - hf/bench.py: need to specify bfloat16 otherwise it consumes twice as much memory in A10.
Pull Request
State: open - Opened by ivanbaldo 10 months ago
#8 - New hf/bench-bt_fa.py for testing Optimum BetterTransformer and Flash Attention.
Pull Request
State: closed - Opened by ivanbaldo 11 months ago - 1 comment
#7 - Add a Dockerfile for the /hf benchmarks with instructions to build and run them.
Pull Request
State: open - Opened by ivanbaldo 11 months ago
#6 - Fix hf/bench-gptq.py.
Pull Request
State: open - Opened by ivanbaldo 11 months ago
#5 - Add plot in summary notebook
Pull Request
State: open - Opened by emattia 12 months ago - 2 comments
#4 - Can you do a benchmark on what happens when you load the basic HF model with bfloat16 ?
Issue
State: open - Opened by mihaipora about 1 year ago
#3 - Add support for all Hugging Face Chat, Text models + OpenAI, Claude2, Cohere, Palm, Replicate models
Pull Request
State: open - Opened by ishaan-jaff over 1 year ago - 2 comments
#2 - Incorrect results for exllama
Issue
State: closed - Opened by arbi-dev over 1 year ago - 3 comments
#1 - How is it compared with the Deepspeed inference?
Issue
State: open - Opened by allanj over 1 year ago