Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / ericlbuehler/candle-vllm issues and pull requests
#88 - Custom benchmark with parameters
Pull Request -
State: closed - Opened by guoqingbao about 1 month ago
#87 - Fix Gemma-2 multiple eos/bos ids
Pull Request -
State: closed - Opened by guoqingbao about 1 month ago
#86 - Support softcapping (Gemma-2 models)
Pull Request -
State: closed - Opened by guoqingbao about 1 month ago
#85 - Restore previous bug fix
Pull Request -
State: closed - Opened by guoqingbao about 1 month ago
#84 - Add support for the Gemma 2 model
Pull Request -
State: closed - Opened by EricLBuehler about 1 month ago
- 5 comments
#83 - Apply clippy
Pull Request -
State: closed - Opened by EricLBuehler about 1 month ago
#82 - No crash when both hidden_act and hidden_activation are set for gemma models
Pull Request -
State: closed - Opened by guoqingbao about 2 months ago
#81 - Ask users to provide a huggingface token if no token is cached or passed to the program.
Pull Request -
State: closed - Opened by guoqingbao about 2 months ago
#80 - Fix bug for non-stream response
Pull Request -
State: closed - Opened by guoqingbao about 2 months ago
#79 - Add model support for gemma 9b
Issue -
State: open - Opened by sigridjineth about 2 months ago
- 16 comments
Labels: enhancement
#78 - Optimize quantized matmul in batch processing & update Q4K results
Pull Request -
State: closed - Opened by guoqingbao about 2 months ago
#77 - Support in-situ quantization
Pull Request -
State: closed - Opened by guoqingbao about 2 months ago
#76 - Running without huggingface token cache raises an error
Issue -
State: closed - Opened by sigridjineth about 2 months ago
- 2 comments
#75 - When using non-stream mode, the client is blocking.
Issue -
State: closed - Opened by wzzju about 2 months ago
- 2 comments
#74 - Parallel token sampling process & reset decoder after each generation
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#73 - Tweak sampling parameters & update batched generation results
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#72 - Fix bug for space token decoding & remove redundant code
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#71 - Fix bug for token decoding & remove token padding
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#70 - Applying the optimization options
Pull Request -
State: closed - Opened by kozistr 2 months ago
- 2 comments
#69 - Support streaming batched chat completion requests
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#68 - Update demo video
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#67 - LLaMa3.1 chat completion
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#66 - More elegant way for handling non-streaming finish signal.
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#65 - Fix bug for non-streaming generation.
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#64 - Fix typo & update ReadMe
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#63 - Switch streaming service to axum & standalone generation thread
Pull Request -
State: closed - Opened by guoqingbao 2 months ago
#62 - Using candle-vllm as crate in rust?
Issue -
State: open - Opened by gkvoelkl 3 months ago
- 1 comment
#61 - Server-side generation breaks down when the client closes the connection or stops the chat.
Issue -
State: closed - Opened by guoqingbao 3 months ago
- 2 comments
Labels: enhancement
#60 - Trim HF token
Pull Request -
State: closed - Opened by EricLBuehler 3 months ago
- 1 comment
#59 - model download failing from HF
Issue -
State: closed - Opened by Ranganaths 3 months ago
- 1 comment
#58 - Fix build
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#57 - Support Yi & StableLM models, change default maximum length of generated tokens for smooth chat.
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#56 - Fix corner case when block table too small
Pull Request -
State: closed - Opened by EricLBuehler 3 months ago
#55 - Fix mistral output repetition with F32 rope and penalty & temperature parameters
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#54 - Fix mistral model & more optional model-specific parameters.
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#53 - Support Phi2 and Mistral models, fix generation remainder, more sampling parameters, etc.
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#52 - Fix bug for previous removal of repeat_kv (when key_value_heads > 1 and < attention_heads)
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#51 - Qwen 2 model broken
Issue -
State: closed - Opened by EricLBuehler 3 months ago
- 3 comments
#50 - Higher precision for rope in Gemma model.
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#49 - Support Gemma model & remove repeat_kv (replaced with broadcast matmu…
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#48 - Error prompt for requested message exceeds model capacity
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#47 - LongRope support for Phi 3
Issue -
State: closed - Opened by EricLBuehler 3 months ago
- 2 comments
#46 - Support qwen2 model, optimize phi3 model, revise model loading strategy
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
- 2 comments
#45 - Unified pipeline for models & support phi3 model
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
#44 - Support chat serving for more models
Issue -
State: open - Opened by guoqingbao 3 months ago
- 7 comments
Labels: enhancement
#43 - Support stream response
Issue -
State: closed - Opened by guoqingbao 3 months ago
- 7 comments
Labels: enhancement
#42 - Support stream chat completion & optimization for decoding stage
Pull Request -
State: closed - Opened by guoqingbao 3 months ago
- 10 comments
#41 - Configurable kvcache & fix repeat chat history
Pull Request -
State: closed - Opened by guoqingbao 4 months ago
- 3 comments
#40 - Optional logprobs & fix llama eos/stop token
Pull Request -
State: closed - Opened by guoqingbao 4 months ago
- 4 comments
#39 - Dusting the project off
Pull Request -
State: closed - Opened by EricLBuehler 4 months ago
#38 - Correct generation with paged attention (fix kernel launch, kvcache, llama pipeline, etc.)
Pull Request -
State: closed - Opened by guoqingbao 4 months ago
- 1 comment
#37 - Fix pipeline generation (kernel launch, kernel compilation, rwlock, paged attention, etc.)
Pull Request -
State: closed - Opened by guoqingbao 4 months ago
- 2 comments
#36 - Doesn't compile
Issue -
State: closed - Opened by ivanbaldo 6 months ago
- 2 comments
#35 - candle-vllm build issue
Issue -
State: closed - Opened by tupleleap 7 months ago
- 1 comment
#34 - Support using arbitrary derivative models
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 5 comments
#33 - Support Mixtral-8x7B-v0.1
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 2 comments
Labels: enhancement, tracking
#32 - --repeat-last-n option not mentioned in the usage help
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 8 comments
Labels: triaged
#31 - Support running without the --hf-token parameter and using ~/.cache/huggingface/token instead
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 45 comments
Labels: enhancement, triaged
#30 - Fix model IDs
Pull Request -
State: closed - Opened by pcuenca 8 months ago
- 1 comment
#29 - Wrong URL for downloading models
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 5 comments
#28 - `paged_attention_v1` function
Issue -
State: closed - Opened by EricLBuehler 8 months ago
#27 - `rotary_embedding` function
Issue -
State: closed - Opened by EricLBuehler 8 months ago
#26 - [Request] Constrained Generation
Issue -
State: closed - Opened by scottwey 8 months ago
- 4 comments
Labels: enhancement
#25 - candle-flash-attn linking error with Red Hat based distributions
Issue -
State: closed - Opened by ivanbaldo 8 months ago
- 46 comments
Labels: bug, triaged
#24 - Use rotary embedding CUDA kernel
Pull Request -
State: closed - Opened by EricLBuehler 9 months ago
Labels: enhancement, tracking
#23 - `reshape_and_cache` function
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 1 comment
#22 - Pass tensor pointers
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 1 comment
#21 - `swap_blocks` function
Issue -
State: closed - Opened by EricLBuehler 10 months ago
#20 - `copy_blocks` function
Issue -
State: closed - Opened by EricLBuehler 10 months ago
Labels: tracking
#19 - Switch to a Rust-based `cudarc` backend
Pull Request -
State: closed - Opened by EricLBuehler 10 months ago
- 1 comment
Labels: enhancement, tracking
#18 - Barriers to further development
Issue -
State: closed - Opened by EricLBuehler 10 months ago
Labels: urgent, tracking
#17 - Add devcontainer
Pull Request -
State: closed - Opened by sigma-andex 10 months ago
- 3 comments
#16 - Integrate cxx
Pull Request -
State: closed - Opened by sigma-andex 10 months ago
- 1 comment
Labels: enhancement, tracking
#15 - Add working scheduler
Pull Request -
State: closed - Opened by EricLBuehler 10 months ago
Labels: enhancement
#14 - KV Cache and Scheduler tracking issue
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 1 comment
Labels: tracking
#13 - Add PagedAttention
Pull Request -
State: closed - Opened by EricLBuehler 10 months ago
Labels: enhancement, tracking
#12 - mistral error
Issue -
State: closed - Opened by lambdaofgod 10 months ago
- 4 comments
Labels: bug, triaged, urgent
#11 - PagedAttention tracking issue
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 4 comments
Labels: tracking
#10 - Pipeline batching tracking issue
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 3 comments
Labels: tracking
#9 - PagedAttention tracking issue
Issue -
State: closed - Opened by EricLBuehler 10 months ago
Labels: tracking
#8 - Mistral does not load safetensors
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 1 comment
Labels: bug, triaged
#7 - OpenAI API version
Issue -
State: closed - Opened by lambdaofgod 10 months ago
- 4 comments
Labels: triaged
#6 - KV Cache causes breakage
Issue -
State: closed - Opened by EricLBuehler 10 months ago
- 2 comments
Labels: bug
#5 - added flan-t5 example into test
Pull Request -
State: closed - Opened by bm777 11 months ago
- 3 comments
#4 - Can the architectural design be improved?
Issue -
State: closed - Opened by mokeyish 11 months ago
- 9 comments
Labels: enhancement
#3 - Batching and VLLM-style kv caching missing
Issue -
State: closed - Opened by michaelfeil 11 months ago
- 7 comments
Labels: enhancement
#2 - Support streaming of tokens
Issue -
State: closed - Opened by michaelfeil 11 months ago
- 1 comment
Labels: enhancement
#1 - Readme request
Issue -
State: closed - Opened by bm777 11 months ago
- 4 comments