Ecosyste.ms: Issues

An open API service providing issue and pull request metadata for open source projects.

GitHub / ggerganov/llama.cpp issues and pull requests
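Each entry below follows a fixed plain-text schema: a line with the issue/PR number and title, a line with the type, state, author, and (optionally) a comment count, and an optional labels line. A minimal parsing sketch of one record — the field names and layout assumptions here are illustrative, not part of the Ecosyste.ms API:

```python
import re

# A sample record in the same shape as the entries below.
RECORD = """#9753 - nix: update flake.lock
Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: nix"""

def parse_record(text):
    """Parse one listing entry into a dict (keys are illustrative)."""
    lines = [l.strip() for l in text.strip().splitlines() if l.strip()]

    # Line 1: "#<number> - <title>"
    m = re.match(r"#(\d+) - (.*)", lines[0])
    number, title = int(m.group(1)), m.group(2)

    # Line 2: "<type> - State: <state> - Opened by <author> ... [- N comment(s)]"
    m2 = re.match(r"(Pull Request|Issue) - State: (\w+) - Opened by (\S+)", lines[1])
    kind, state, author = m2.group(1), m2.group(2), m2.group(3)
    cm = re.search(r"(\d+) comments?$", lines[1])
    comments = int(cm.group(1)) if cm else 0

    # Optional line 3: "Labels: a, b, c"
    labels = []
    for l in lines[2:]:
        if l.startswith("Labels:"):
            labels = [s.strip() for s in l[len("Labels:"):].split(",")]

    return {"number": number, "title": title, "kind": kind,
            "state": state, "author": author,
            "comments": comments, "labels": labels}

print(parse_record(RECORD))
```

Records without a labels line or comment count (e.g. plain issues with no activity) parse the same way, with `labels` empty and `comments` set to 0.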

#9753 - nix: update flake.lock

Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: nix

#9752 - ggml : add backend registry / device interfaces to BLAS backend

Pull Request - State: closed - Opened by slaren about 1 month ago
Labels: testing, ggml

#9750 - Problem with using llava_surgery_v2.py

Issue - State: open - Opened by ssykee about 1 month ago
Labels: bug-unconfirmed, high severity

#9747 - Single allocation of encode_async block with non-ARC capture in ggml-metal.m

Pull Request - State: closed - Opened by ptsochantaris about 1 month ago - 1 comment

#9745 - llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch

Pull Request - State: open - Opened by ngxson about 1 month ago - 6 comments
Labels: breaking change, android, examples, server

#9742 - sampling : add XTC sampler

Pull Request - State: closed - Opened by MaggotHATE about 1 month ago - 33 comments
Labels: testing, examples, server

#9738 - Feature Request: multimodal on android

Issue - State: open - Opened by surajat17 about 1 month ago - 2 comments
Labels: enhancement

#9737 - rerank : use [SEP] token instead of [BOS]

Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: examples, devops, server

#9734 - vulkan : add GGML_VK_FORCE_HEAP_INDEX env var

Pull Request - State: open - Opened by gyf304 about 1 month ago
Labels: Vulkan, ggml

#9733 - ggml: Add POOL2D OP for GPU ACC to the Vulkan backend in the MobileVLM model.

Pull Request - State: closed - Opened by cyzero-kim about 1 month ago - 5 comments
Labels: Vulkan, ggml

#9724 - Potential GPU Usage During CPU Inference (ngl=0)

Issue - State: open - Opened by RakshitAralimatti about 1 month ago - 5 comments

#9722 - Feature Request: SYCL CI online

Issue - State: closed - Opened by airMeng about 1 month ago - 9 comments
Labels: enhancement

#9721 - vulkan : add backend registry / device interfaces

Pull Request - State: closed - Opened by slaren about 1 month ago - 6 comments
Labels: Vulkan, ggml

#9717 - Update convert_llama_ggml_to_gguf.py

Pull Request - State: closed - Opened by Ahmad986Ferdaws about 1 month ago - 2 comments
Labels: python

#9713 - ggml : add metal backend registry / device

Pull Request - State: closed - Opened by ggerganov about 1 month ago - 5 comments
Labels: script, testing, Nvidia GPU, nix, Vulkan, examples, python, devops, server, ggml, SYCL, Apple Metal, Kompute

#9708 - Bug: win-vulkan-x64 crashed since b3831

Issue - State: open - Opened by cwt about 1 month ago
Labels: bug-unconfirmed, critical severity

#9707 - ggml-backend : add device and backend reg interfaces

Pull Request - State: closed - Opened by slaren about 1 month ago - 2 comments
Labels: script, testing, Nvidia GPU, Vulkan, devops, ggml, SYCL, Apple Metal, Kompute

#9706 - Feature Request: Unify GGML logging mechanism

Issue - State: open - Opened by bandoti about 1 month ago
Labels: enhancement

#9705 - [SYCL] Add SYCL Backend registry, device and Event Interfaces

Pull Request - State: closed - Opened by OuadiElfarouki about 1 month ago - 2 comments
Labels: examples, ggml, SYCL

#9704 - examples : remove benchmark

Pull Request - State: open - Opened by ggerganov about 1 month ago
Labels: examples

#9702 - added implementation of DRY sampler (post-refactor)

Pull Request - State: closed - Opened by wwoodsTM about 1 month ago - 37 comments
Labels: testing, examples, server

#9701 - Bug: llama 3.2 error: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)

Issue - State: closed - Opened by guoriyue about 1 month ago - 3 comments
Labels: bug-unconfirmed, critical severity

#9700 - Feature Request: Support FlashAttention-3

Issue - State: open - Opened by hg0428 about 1 month ago
Labels: enhancement

#9698 - metal : reduce command encoding overhead

Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: examples, ggml, Apple Metal

#9697 - ci : reduce severity of unused Pyright ignore comments

Pull Request - State: closed - Opened by compilade about 1 month ago
Labels: examples, python, devops

#9696 - convert : handle tokenizer merges format from transformers 4.45

Pull Request - State: open - Opened by compilade about 1 month ago - 4 comments
Labels: bugfix, Review Complexity : Low, python

#9695 - Bug: quality decreases in embeddings models

Issue - State: open - Opened by Maxon081102 about 1 month ago - 2 comments
Labels: bug-unconfirmed, medium severity

#9694 - update transformers version.

Pull Request - State: closed - Opened by Vaibhavs10 about 1 month ago
Labels: examples, python, server

#9692 - Bug: cannot find tokenizer merges in model file

Issue - State: closed - Opened by nd791899 about 1 month ago - 11 comments
Labels: bug, high priority, high severity

#9691 - musa: enable docker workflow

Pull Request - State: closed - Opened by yeahdongcn about 1 month ago
Labels: documentation, devops

#9690 - utf-8 fix for windows stdin

Pull Request - State: closed - Opened by hasaranga about 1 month ago

#9687 - llama : first attempt to implement vision API (WIP)

Pull Request - State: open - Opened by ngxson about 1 month ago - 2 comments
Labels: examples, python

#9685 - musa: add docker image support

Pull Request - State: closed - Opened by yeahdongcn about 1 month ago - 1 comment
Labels: documentation, devops

#9684 - ggml : define missing HWCAP flags

Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: ggml

#9683 - Use new model class for chameleon conversion

Pull Request - State: closed - Opened by nopperl about 1 month ago
Labels: python

#9680 - nix: update flake.lock

Pull Request - State: closed - Opened by ggerganov about 1 month ago
Labels: nix

#9679 - `server`: cancel non-streamed requests w/ closed connection

Pull Request - State: open - Opened by ochafik about 1 month ago
Labels: examples, python, server

#9678 - Bug: Can't Convert Meta's Chameleon-7B to GGUF (ERROR:hf-to-gguf:Model ChameleonForConditionalGeneration is not supported)

Issue - State: closed - Opened by joseph777111 about 1 month ago - 3 comments
Labels: bug-unconfirmed, medium severity

#9676 - Bug: `illegal hardware instruction` when running on M3 mac Sequoia installed with brew

Issue - State: open - Opened by Ben-Epstein about 1 month ago - 3 comments
Labels: bug-unconfirmed, high severity

#9675 - contrib : add Resources section

Pull Request - State: closed - Opened by ggerganov about 1 month ago

#9674 - Bug: baby-llama fails

Issue - State: open - Opened by sfadaei about 1 month ago - 1 comment
Labels: bug-unconfirmed, stale, medium severity

#9673 - Bug: convert_hf_to_gguf.py - Converting HF model to GGUF giving error Missing tokenizer.model - Qwen2.5 based

Issue - State: closed - Opened by Spacellary about 1 month ago - 1 comment
Labels: bug-unconfirmed, high severity

#9672 - Update building for Android

Pull Request - State: closed - Opened by amqdn about 1 month ago - 26 comments
Labels: documentation, merge ready

#9671 - Bug: Initializing KV Cache Spikes Memory, Crashing on Android

Issue - State: closed - Opened by amqdn about 1 month ago - 4 comments
Labels: bug-unconfirmed, critical severity

#9668 - common: ensure token addition to batch does not exceed llama_batch size

Pull Request - State: closed - Opened by matiaslin about 1 month ago - 3 comments
Labels: build, testing, Vulkan, examples, python, devops, server, ggml, merge ready

#9667 - Bug: llama-parallel crashes when adding more tokens to llama_batch than context size

Issue - State: closed - Opened by matiaslin about 1 month ago
Labels: bug-unconfirmed, low severity

#9666 - Bug: Issue building hipBLAS error: call to undeclared function '_mm256_dpbusd_epi32'

Issue - State: open - Opened by Zhaeong about 1 month ago
Labels: bug-unconfirmed, stale, low severity

#9664 - Bug: Termux adreno 618 vulkan support

Issue - State: open - Opened by akac97 about 1 month ago
Labels: bug-unconfirmed, critical severity

#9662 - Dev refactoring

Pull Request - State: closed - Opened by ykhrustalev about 1 month ago - 2 comments
Labels: build, ggml

#9661 - cmake : add option for common library

Pull Request - State: closed - Opened by iboB about 1 month ago
Labels: build

#9659 - Introduce Graph Profiler

Pull Request - State: open - Opened by max-krasnyansky about 1 month ago - 2 comments
Labels: ggml

#9658 - sycl: initial cmake support of SYCL for AMD GPUs

Pull Request - State: open - Opened by Alcpz about 1 month ago - 3 comments
Labels: documentation, SYCL

#9657 - test-backend-ops : use flops for some performance tests

Pull Request - State: closed - Opened by slaren about 1 month ago - 1 comment
Labels: testing

#9656 - Error: llama_model_load: error loading model: failed to open ggml-bagel-2.8b-v0.2-q8_0.gguf

Issue - State: closed - Opened by vineel96 about 1 month ago - 4 comments
Labels: bug-unconfirmed, low severity

#9655 - Docs: Add akx/ollama-dl

Pull Request - State: closed - Opened by akx about 1 month ago

#9652 - Bug: server crashes when embedding model is passed in the -m parameter

Issue - State: open - Opened by mesibo about 1 month ago
Labels: bug-unconfirmed, stale, low severity

#9651 - Feature Request: sgemm.cpp : Q5_0 support

Issue - State: open - Opened by Srihari-mcw about 1 month ago - 3 comments
Labels: enhancement, stale

#9648 - [Draft] Tensor Parallel support to llama.cpp

Pull Request - State: open - Opened by ClarkChin08 about 1 month ago - 2 comments
Labels: ggml, SYCL

#9647 - Resurrect Graph & Op Profiler

Pull Request - State: closed - Opened by max-krasnyansky about 1 month ago - 5 comments
Labels: ggml

#9645 - Feature Request: Molmo 72B vision support

Issue - State: open - Opened by Kreijstal about 1 month ago - 7 comments
Labels: enhancement

#9644 - Bug: IQ3_M is significantly slower than IQ4_XS on AMD, is it expected?

Issue - State: open - Opened by Nekotekina about 1 month ago - 3 comments
Labels: bug-unconfirmed, low severity

#9643 - Llama-3.2 11B Vision Support

Issue - State: open - Opened by yukiarimo about 1 month ago - 31 comments

#9642 - Feature Request: Add support for LLaMA 3.2

Issue - State: closed - Opened by ndavidson19 about 1 month ago
Labels: enhancement

#9641 - Fix Docker ROCM builds, use AMDGPU_TARGETS instead of GPU_TARGETS

Pull Request - State: closed - Opened by serhii-nakon about 1 month ago - 6 comments
Labels: devops

#9640 - Bug: server (New UI) ChatML templates are wrong

Issue - State: open - Opened by ivanstepanovftw about 1 month ago - 2 comments
Labels: good first issue, server/webui, bug-unconfirmed, medium severity

#9639 - Tool call support (Llama 3.x, Functionary v3, Hermes 2 Pro, Mistral Nemo, generic) w/ lazy grammars & minimalist Jinja engine

Pull Request - State: open - Opened by ochafik about 1 month ago - 7 comments
Labels: script, testing, examples, python, server

#9638 - ci : fix docker build number and tag name

Pull Request - State: closed - Opened by ngxson about 1 month ago
Labels: devops

#9637 - Add inverse chat template metadata

Pull Request - State: open - Opened by CISC about 1 month ago
Labels: python

#9636 - Bug: Assertion '__n < this->size()' failed.

Issue - State: open - Opened by Luke100000 about 1 month ago
Labels: bug-unconfirmed, stale, high severity

#9635 - server : add more env vars, improve gen-docs

Pull Request - State: closed - Opened by ngxson about 1 month ago
Labels: examples, server

#9633 - Examples: Add text compression example.

Pull Request - State: open - Opened by stduhpf about 1 month ago - 3 comments
Labels: examples

#9632 - Bug: python: can't open file 'llama.cpp/convert.py': [Errno 2] No such file or directory

Issue - State: open - Opened by AmosBunde about 1 month ago - 1 comment
Labels: bug-unconfirmed, stale, low severity

#9631 - Update convert_hf_to_gguf.py

Pull Request - State: closed - Opened by Ahmad986Ferdaws about 2 months ago
Labels: python

#9630 - Does llama.cpp support input_embeds?

Issue - State: open - Opened by OswaldoBornemann about 2 months ago - 3 comments
Labels: bug-unconfirmed, stale, low severity

#9629 - Bug: ggml_cuda_host_malloc: failed to allocate 1900,00 MiB of pinned memory: invalid argument

Issue - State: closed - Opened by XZVB12 about 2 months ago - 2 comments
Labels: bug-unconfirmed, low severity

#9628 - Bug: Failed to run qwen2-57b-a14b-instruct-fp16.

Issue - State: open - Opened by tang-t21 about 2 months ago - 3 comments
Labels: bug, good first issue, high severity

#9627 - [CANN]: Fix crash when running on multiple cann devices

Pull Request - State: closed - Opened by Dou-Git about 2 months ago - 2 comments
Labels: Ascend NPU

#9623 - Bug: [Hardware: ppc64le] On ppc64le llama.cpp only uses 1 thread by default and not half of all threads as it does on x86

Issue - State: open - Opened by mgiessing about 2 months ago
Labels: bug-unconfirmed, stale, low severity

#9622 - ggml : add AVX512DQ requirement for AVX512 builds

Pull Request - State: closed - Opened by EZForever about 2 months ago

#9619 - make sure params --split and --merge are not specified at same time in gguf-split

Pull Request - State: open - Opened by kylo5aby about 2 months ago - 2 comments
Labels: examples

#9618 - Bug: Got "error: bad arguments" when merging multiple gguf to single one by using llama-gguf-split

Issue - State: closed - Opened by zloss about 2 months ago - 3 comments
Labels: bug-unconfirmed, low severity

#9616 - Add newline after chat example in llama-server

Pull Request - State: closed - Opened by StrangeBytesDev about 2 months ago
Labels: examples, server

#9615 - threads: fix msvc build without openmp

Pull Request - State: closed - Opened by max-krasnyansky about 2 months ago - 2 comments
Labels: ggml

#9613 - Bug: Failed to load llama3.1 405b model

Issue - State: closed - Opened by Nightmir about 2 months ago - 2 comments
Labels: bug-unconfirmed, medium severity

#9612 - Bug: [SYCL] crash since b-3805

Issue - State: closed - Opened by easyfab about 2 months ago - 43 comments
Labels: bug-unconfirmed, critical severity

#9611 - merge main

Pull Request - State: closed - Opened by Aliebc about 2 months ago
Labels: nix, examples, devops, server

#9610 - log : add CONT level for continuing previous log entry

Pull Request - State: closed - Opened by ggerganov about 2 months ago
Labels: examples, ggml

#9609 - llama : keep track of all EOG tokens in the vocab

Pull Request - State: closed - Opened by ggerganov about 2 months ago - 1 comment

#9608 - Bug: `llama-server` web UI resets the text selection during inference on every token update

Issue - State: open - Opened by mashdragon about 2 months ago
Labels: bug-unconfirmed, stale, low severity

#9607 - server : add --no-context-shift option

Pull Request - State: closed - Opened by ngxson about 2 months ago
Labels: examples, python, server

#9606 - Bug: Qwen2.5-Coder variants do not properly stop in FIM mode

Issue - State: closed - Opened by tristandruyen about 2 months ago - 3 comments
Labels: bug-unconfirmed, medium severity

#9605 - sampling : avoid expensive softmax during greedy sampling

Pull Request - State: closed - Opened by ggerganov about 2 months ago
Labels: testing, examples

#9604 - sampling : fix off-by-one in tail-free sampling

Pull Request - State: closed - Opened by ggerganov about 2 months ago - 6 comments
Labels: testing

#9603 - keep the minimum `min_keep` value to 1 in sampling

Pull Request - State: open - Opened by kylo5aby about 2 months ago - 1 comment

#9602 - llama : introduce anonymous namespace in llama.cpp

Pull Request - State: open - Opened by danbev about 2 months ago - 1 comment

#9601 - Feature Request: OpenVINO backend support request

Issue - State: open - Opened by aropb about 2 months ago - 2 comments
Labels: enhancement, stale

#9600 - Feature Request: Word Llama

Issue - State: open - Opened by TalonBvV about 2 months ago
Labels: enhancement, stale