Ecosyste.ms: Issues
An open API service providing issue and pull request metadata for open source projects.
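As a sketch of how such a listing could be retrieved programmatically, the snippet below queries the service over HTTP. The exact endpoint layout (`/api/v1/hosts/{host}/repositories/{full_name}/issues`) is an assumption inferred from the ecosyste.ms URL scheme, not confirmed documentation.

```python
import json
import urllib.request

# Assumed base URL for the ecosyste.ms issues service; the path
# segments below are a guess at the API layout, not official docs.
BASE = "https://issues.ecosyste.ms/api/v1"

def issues_url(host: str, full_name: str, page: int = 1) -> str:
    """Build the (assumed) listing URL for a repository's issues and PRs."""
    return f"{BASE}/hosts/{host}/repositories/{full_name}/issues?page={page}"

def fetch_issues(host: str, full_name: str, page: int = 1) -> list:
    """Fetch one page of issue/PR metadata as parsed JSON (requires network)."""
    with urllib.request.urlopen(issues_url(host, full_name, page)) as resp:
        return json.load(resp)

# A listing like the one below would correspond to something like:
#   fetch_issues("GitHub", "ggerganov/llama.cpp")
```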
GitHub / ggerganov/llama.cpp issues and pull requests
#9753 - nix: update flake.lock
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
Labels: nix
#9752 - ggml : add backend registry / device interfaces to BLAS backend
Pull Request -
State: closed - Opened by slaren about 1 month ago
Labels: testing, ggml
#9750 - Problem with using llava_surgery_v2.py
Issue -
State: open - Opened by ssykee about 1 month ago
Labels: bug-unconfirmed, high severity
#9748 - Feature Request: Anti-slop / fine tuning of a model output in realtime / on the fly for output quality enhancement.
Issue -
State: open - Opened by David-AU-github about 1 month ago
Labels: enhancement
#9747 - Single allocation of encode_async block with non-ARC capture in ggml-metal.m
Pull Request -
State: closed - Opened by ptsochantaris about 1 month ago
- 1 comment
#9745 - llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch
Pull Request -
State: open - Opened by ngxson about 1 month ago
- 6 comments
Labels: breaking change, android, examples, server
#9742 - sampling : add XTC sampler
Pull Request -
State: closed - Opened by MaggotHATE about 1 month ago
- 33 comments
Labels: testing, examples, server
#9738 - Feature Request: multimodal on android
Issue -
State: open - Opened by surajat17 about 1 month ago
- 2 comments
Labels: enhancement
#9737 - rerank : use [SEP] token instead of [BOS]
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
Labels: examples, devops, server
#9734 - vulkan : add GGML_VK_FORCE_HEAP_INDEX env var
Pull Request -
State: open - Opened by gyf304 about 1 month ago
Labels: Vulkan, ggml
#9733 - ggml: Add POOL2D OP for GPU ACC to the Vulkan backend in the MobileVLM model.
Pull Request -
State: closed - Opened by cyzero-kim about 1 month ago
- 5 comments
Labels: Vulkan, ggml
#9724 - Potential GPU Usage During CPU Inference (ngl=0)
Issue -
State: open - Opened by RakshitAralimatti about 1 month ago
- 5 comments
#9722 - Feature Request: SYCL CI online
Issue -
State: closed - Opened by airMeng about 1 month ago
- 9 comments
Labels: enhancement
#9721 - vulkan : add backend registry / device interfaces
Pull Request -
State: closed - Opened by slaren about 1 month ago
- 6 comments
Labels: Vulkan, ggml
#9717 - Update convert_llama_ggml_to_gguf.py
Pull Request -
State: closed - Opened by Ahmad986Ferdaws about 1 month ago
- 2 comments
Labels: python
#9713 - ggml : add metal backend registry / device
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
- 5 comments
Labels: script, testing, Nvidia GPU, nix, Vulkan, examples, python, devops, server, ggml, SYCL, Apple Metal, Kompute
#9708 - Bug: win-vulkan-x64 crashed since b3831
Issue -
State: open - Opened by cwt about 1 month ago
Labels: bug-unconfirmed, critical severity
#9707 - ggml-backend : add device and backend reg interfaces
Pull Request -
State: closed - Opened by slaren about 1 month ago
- 2 comments
Labels: script, testing, Nvidia GPU, Vulkan, devops, ggml, SYCL, Apple Metal, Kompute
#9706 - Feature Request: Unify GGML logging mechanism
Issue -
State: open - Opened by bandoti about 1 month ago
Labels: enhancement
#9705 - [SYCL] Add SYCL Backend registry, device and Event Interfaces
Pull Request -
State: closed - Opened by OuadiElfarouki about 1 month ago
- 2 comments
Labels: examples, ggml, SYCL
#9704 - examples : remove benchmark
Pull Request -
State: open - Opened by ggerganov about 1 month ago
Labels: examples
#9702 - added implementation of DRY sampler (post-refactor)
Pull Request -
State: closed - Opened by wwoodsTM about 1 month ago
- 37 comments
Labels: testing, examples, server
#9701 - Bug: llama 3.2 error: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)
Issue -
State: closed - Opened by guoriyue about 1 month ago
- 3 comments
Labels: bug-unconfirmed, critical severity
#9700 - Feature Request: Support FlashAttention-3
Issue -
State: open - Opened by hg0428 about 1 month ago
Labels: enhancement
#9698 - metal : reduce command encoding overhead
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
Labels: examples, ggml, Apple Metal
#9697 - ci : reduce severity of unused Pyright ignore comments
Pull Request -
State: closed - Opened by compilade about 1 month ago
Labels: examples, python, devops
#9696 - convert : handle tokenizer merges format from transformers 4.45
Pull Request -
State: open - Opened by compilade about 1 month ago
- 4 comments
Labels: bugfix, Review Complexity : Low, python
#9695 - Bug: quality decreases in embeddings models
Issue -
State: open - Opened by Maxon081102 about 1 month ago
- 2 comments
Labels: bug-unconfirmed, medium severity
#9694 - update transfomers version.
Pull Request -
State: closed - Opened by Vaibhavs10 about 1 month ago
Labels: examples, python, server
#9692 - Bug: cannot find tokenizer merges in model file
Issue -
State: closed - Opened by nd791899 about 1 month ago
- 11 comments
Labels: bug, high priority, high severity
#9691 - musa: enable docker workflow
Pull Request -
State: closed - Opened by yeahdongcn about 1 month ago
Labels: documentation, devops
#9690 - utf-8 fix for windows stdin
Pull Request -
State: closed - Opened by hasaranga about 1 month ago
#9687 - llama : first attempt to implement vision API (WIP)
Pull Request -
State: open - Opened by ngxson about 1 month ago
- 2 comments
Labels: examples, python
#9685 - musa: add docker image support
Pull Request -
State: closed - Opened by yeahdongcn about 1 month ago
- 1 comment
Labels: documentation, devops
#9684 - ggml : define missing HWCAP flags
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
Labels: ggml
#9683 - Use new model class for chameleon conversion
Pull Request -
State: closed - Opened by nopperl about 1 month ago
Labels: python
#9680 - nix: update flake.lock
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
Labels: nix
#9679 - `server`: cancel non-streamed requests w/ closed connection
Pull Request -
State: open - Opened by ochafik about 1 month ago
Labels: examples, python, server
#9678 - Bug: Can't Convert Meta's Chameleon-7B to GGUF (ERROR:hf-to-gguf:Model ChameleonForConditionalGeneration is not supported)
Issue -
State: closed - Opened by joseph777111 about 1 month ago
- 3 comments
Labels: bug-unconfirmed, medium severity
#9676 - Bug: `illegal hardware instruction` when running on M3 mac Sequoia installed with brew
Issue -
State: open - Opened by Ben-Epstein about 1 month ago
- 3 comments
Labels: bug-unconfirmed, high severity
#9675 - contrib : add Resources section
Pull Request -
State: closed - Opened by ggerganov about 1 month ago
#9674 - Bug: baby-llama fails
Issue -
State: open - Opened by sfadaei about 1 month ago
- 1 comment
Labels: bug-unconfirmed, stale, medium severity
#9673 - Bug: convert_hf_to_gguf.py - Converting HF model to GGUF giving error Missing tokenizer.model - Qwen2.5 based
Issue -
State: closed - Opened by Spacellary about 1 month ago
- 1 comment
Labels: bug-unconfirmed, high severity
#9672 - Update building for Android
Pull Request -
State: closed - Opened by amqdn about 1 month ago
- 26 comments
Labels: documentation, merge ready
#9671 - Bug: Initializing KV Cache Spikes Memory, Crashing on Android
Issue -
State: closed - Opened by amqdn about 1 month ago
- 4 comments
Labels: bug-unconfirmed, critical severity
#9668 - common: ensure token addition to batch does not exceed llama_batch size
Pull Request -
State: closed - Opened by matiaslin about 1 month ago
- 3 comments
Labels: build, testing, Vulkan, examples, python, devops, server, ggml, merge ready
#9667 - Bug: llama-parallel crashes when adding more tokens to llama_batch than context size
Issue -
State: closed - Opened by matiaslin about 1 month ago
Labels: bug-unconfirmed, low severity
#9666 - Bug: Issue building hipBLAS error: call to undeclared function '_mm256_dpbusd_epi32'
Issue -
State: open - Opened by Zhaeong about 1 month ago
Labels: bug-unconfirmed, stale, low severity
#9665 - Bug: nvidia-container-cli: requirement error: unsatisfied condition: cuda>=12.6, please update your driver to a newer version, or use an earlier cuda container: unknown.
Issue -
State: open - Opened by wencan about 1 month ago
- 1 comment
Labels: bug-unconfirmed, stale, medium severity
#9664 - Bug: Termux adreno 618 vulkan support
Issue -
State: open - Opened by akac97 about 1 month ago
Labels: bug-unconfirmed, critical severity
#9663 - Feature Request: Add Support for MllamaForConditionalGeneration to Convert Llama 3.2 Vision Models to GGUF Format
Issue -
State: open - Opened by manishkumart about 1 month ago
- 8 comments
Labels: enhancement
#9662 - Dev refactoring
Pull Request -
State: closed - Opened by ykhrustalev about 1 month ago
- 2 comments
Labels: build, ggml
#9661 - cmake : add option for common library
Pull Request -
State: closed - Opened by iboB about 1 month ago
Labels: build
#9659 - Introduce Graph Profiler
Pull Request -
State: open - Opened by max-krasnyansky about 1 month ago
- 2 comments
Labels: ggml
#9658 - sycl: initial cmake support of SYCL for AMD GPUs
Pull Request -
State: open - Opened by Alcpz about 1 month ago
- 3 comments
Labels: documentation, SYCL
#9657 - test-backend-ops : use flops for some performance tests
Pull Request -
State: closed - Opened by slaren about 1 month ago
- 1 comment
Labels: testing
#9656 - Error: llama_model_load: error loading model: failed to open ggml-bagel-2.8b-v0.2-q8_0.gguf
Issue -
State: closed - Opened by vineel96 about 1 month ago
- 4 comments
Labels: bug-unconfirmed, low severity
#9655 - Docs: Add akx/ollama-dl
Pull Request -
State: closed - Opened by akx about 1 month ago
#9652 - Bug: server crashes when embedding model is passed in the -m parameter
Issue -
State: open - Opened by mesibo about 1 month ago
Labels: bug-unconfirmed, stale, low severity
#9651 - Feature Request: sgemm.cpp : Q5_0 support
Issue -
State: open - Opened by Srihari-mcw about 1 month ago
- 3 comments
Labels: enhancement, stale
#9648 - [Draft] Tensor Parallel support to llama.cpp
Pull Request -
State: open - Opened by ClarkChin08 about 1 month ago
- 2 comments
Labels: ggml, SYCL
#9647 - Resurrect Graph & Op Profiler
Pull Request -
State: closed - Opened by max-krasnyansky about 1 month ago
- 5 comments
Labels: ggml
#9645 - Feature Request: Molmo 72B vision support
Issue -
State: open - Opened by Kreijstal about 1 month ago
- 7 comments
Labels: enhancement
#9644 - Bug: IQ3_M is significantly slower than IQ4_XS on AMD, is it expected?
Issue -
State: open - Opened by Nekotekina about 1 month ago
- 3 comments
Labels: bug-unconfirmed, low severity
#9643 - Llama-3.2 11B Vision Support
Issue -
State: open - Opened by yukiarimo about 1 month ago
- 31 comments
#9642 - Feature Request: Add support for LLaMA 3.2
Issue -
State: closed - Opened by ndavidson19 about 1 month ago
Labels: enhancement
#9641 - Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS
Pull Request -
State: closed - Opened by serhii-nakon about 1 month ago
- 6 comments
Labels: devops
#9640 - Bug: server (New UI) ChatML templates are wrong
Issue -
State: open - Opened by ivanstepanovftw about 1 month ago
- 2 comments
Labels: good first issue, server/webui, bug-unconfirmed, medium severity
#9639 - Tool call support (Llama 3.x, Functionary v3, Hermes 2 Pro, Mistral Nemo, generic) w/ lazy grammars & minimalist Jinja engine
Pull Request -
State: open - Opened by ochafik about 1 month ago
- 7 comments
Labels: script, testing, examples, python, server
#9638 - ci : fix docker build number and tag name
Pull Request -
State: closed - Opened by ngxson about 1 month ago
Labels: devops
#9637 - Add inverse chat template metadata
Pull Request -
State: open - Opened by CISC about 1 month ago
Labels: python
#9636 - Bug: Assertion '__n < this->size()' failed.
Issue -
State: open - Opened by Luke100000 about 1 month ago
Labels: bug-unconfirmed, stale, high severity
#9635 - server : add more env vars, improve gen-docs
Pull Request -
State: closed - Opened by ngxson about 1 month ago
Labels: examples, server
#9633 - Examples: Add text compression example.
Pull Request -
State: open - Opened by stduhpf about 1 month ago
- 3 comments
Labels: examples
#9632 - Bug: python: can't open file 'llama.cpp/convert.py': [Errno 2] No such file or directory
Issue -
State: open - Opened by AmosBunde about 1 month ago
- 1 comment
Labels: bug-unconfirmed, stale, low severity
#9631 - Update convert_hf_to_gguf.py
Pull Request -
State: closed - Opened by Ahmad986Ferdaws about 2 months ago
Labels: python
#9630 - Do llama.cpp support input_embeds?
Issue -
State: open - Opened by OswaldoBornemann about 2 months ago
- 3 comments
Labels: bug-unconfirmed, stale, low severity
#9629 - Bug: ggml_cuda_host_malloc: failed to allocate 1900,00 MiB of pinned memory: invalid argument
Issue -
State: closed - Opened by XZVB12 about 2 months ago
- 2 comments
Labels: bug-unconfirmed, low severity
#9628 - Bug: Failed to run qwen2-57b-a14b-instruct-fp16.
Issue -
State: open - Opened by tang-t21 about 2 months ago
- 3 comments
Labels: bug, good first issue, high severity
#9627 - [CANN]: Fix crash when running on multiple cann devices
Pull Request -
State: closed - Opened by Dou-Git about 2 months ago
- 2 comments
Labels: Ascend NPU
#9623 - Bug: [Hardware: ppc64le] On ppc64le llama.cpp only uses 1 thread by default and not half of all threads as it does on x86
Issue -
State: open - Opened by mgiessing about 2 months ago
Labels: bug-unconfirmed, stale, low severity
#9622 - ggml : add AVX512DQ requirement for AVX512 builds
Pull Request -
State: closed - Opened by EZForever about 2 months ago
#9619 - make sure params --split and --merge are not specified at same time in gguf-split
Pull Request -
State: open - Opened by kylo5aby about 2 months ago
- 2 comments
Labels: examples
#9618 - Bug: Got "error: bad arguments" when mergin multiple gguf to single one by using llama-gguf-split
Issue -
State: closed - Opened by zloss about 2 months ago
- 3 comments
Labels: bug-unconfirmed, low severity
#9616 - Add newline after chat example in llama-server
Pull Request -
State: closed - Opened by StrangeBytesDev about 2 months ago
Labels: examples, server
#9615 - threads: fix msvc build without openmp
Pull Request -
State: closed - Opened by max-krasnyansky about 2 months ago
- 2 comments
Labels: ggml
#9613 - Bug: Failed to load llama3.1 405b model
Issue -
State: closed - Opened by Nightmir about 2 months ago
- 2 comments
Labels: bug-unconfirmed, medium severity
#9612 - Bug: [SYCL] crash since b-3805
Issue -
State: closed - Opened by easyfab about 2 months ago
- 43 comments
Labels: bug-unconfirmed, critical severity
#9611 - merge main
Pull Request -
State: closed - Opened by Aliebc about 2 months ago
Labels: nix, examples, devops, server
#9610 - log : add CONT level for continuing previous log entry
Pull Request -
State: closed - Opened by ggerganov about 2 months ago
Labels: examples, ggml
#9609 - llama : keep track of all EOG tokens in the vocab
Pull Request -
State: closed - Opened by ggerganov about 2 months ago
- 1 comment
#9608 - Bug: `llama-server` web UI resets the text selection during inference on every token update
Issue -
State: open - Opened by mashdragon about 2 months ago
Labels: bug-unconfirmed, stale, low severity
#9607 - server : add --no-context-shift option
Pull Request -
State: closed - Opened by ngxson about 2 months ago
Labels: examples, python, server
#9606 - Bug: Qwen2.5-Coder variants do not properly stop in FIM mode
Issue -
State: closed - Opened by tristandruyen about 2 months ago
- 3 comments
Labels: bug-unconfirmed, medium severity
#9605 - sampling : avoid expensive softmax during greedy sampling
Pull Request -
State: closed - Opened by ggerganov about 2 months ago
Labels: testing, examples
#9604 - sampling : fix off-by-one in tail-free sampling
Pull Request -
State: closed - Opened by ggerganov about 2 months ago
- 6 comments
Labels: testing
#9603 - keep the minimum `min_keep` value to 1 in sampling
Pull Request -
State: open - Opened by kylo5aby about 2 months ago
- 1 comment
#9602 - llama : introduce anonymous namespace in llama.cpp
Pull Request -
State: open - Opened by danbev about 2 months ago
- 1 comment
#9601 - Feature Request: OpenVINO backend support request
Issue -
State: open - Opened by aropb about 2 months ago
- 2 comments
Labels: enhancement, stale
#9600 - Feature Request: Word Llama
Issue -
State: open - Opened by TalonBvV about 2 months ago
Labels: enhancement, stale