Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / turboderp/exllama issues and pull requests
#112 - Update model_compatibility.md
Pull Request -
State: open - Opened by eltociear over 1 year ago
- 1 comment
#111 - Why is there a huge lag between reading the prompt and starting to generate output?
Issue -
State: closed - Opened by ENjoyBlue2021 over 1 year ago
- 6 comments
#110 - Strange behavior with caching on 8K models
Issue -
State: closed - Opened by kaiokendev over 1 year ago
- 2 comments
#108 - not support lora with autogptq/peft?
Issue -
State: closed - Opened by laoda513 over 1 year ago
- 6 comments
#107 - openllama support
Issue -
State: closed - Opened by cnut1648 over 1 year ago
- 4 comments
#106 - More functions in webui, interface is more adapted to mobile
Pull Request -
State: open - Opened by CORRUPTOR2037 over 1 year ago
- 1 comment
#105 - Compiling issue on Sagemaker
Issue -
State: closed - Opened by buzzCraft over 1 year ago
- 7 comments
#104 - Adds the possibility to influence prediction with bias
Pull Request -
State: closed - Opened by paolorechia over 1 year ago
- 18 comments
#103 - Integrating with Guidance: adding a positive bias to certain tokens
Issue -
State: closed - Opened by paolorechia over 1 year ago
- 5 comments
#101 - Fixed: batching lead to faulty results, crashes and men wielding bananas.
Pull Request -
State: closed - Opened by aljungberg over 1 year ago
- 14 comments
#100 - ImportError: DLL load failed while importing exllama_ext: The specified module could not be found.
Issue -
State: closed - Opened by onexixi over 1 year ago
- 7 comments
#99 - TheBloke/robin-13B-v2-GPTQ - models keeps generating tokens
Issue -
State: closed - Opened by marcoripa96 over 1 year ago
- 2 comments
#98 - OOM even with multiple GPUs (4x 3090 @ 24GB)
Issue -
State: closed - Opened by nikshepsvn over 1 year ago
- 21 comments
#97 - Fix download_dataset and perplexity wrt to downloaded datasets on Windows
Pull Request -
State: closed - Opened by allenbenz over 1 year ago
#96 - Fix AttributeError: 'torch.device' object has no attribute 'startswith' when using gpu_peer_fix.
Pull Request -
State: closed - Opened by Panchovix over 1 year ago
#95 - 3-bit and 2-bit GPTQ support
Issue -
State: closed - Opened by TechnotechGit over 1 year ago
- 23 comments
#94 - About Llama checkpoint 4-bit
Issue -
State: closed - Opened by Iambestfeed over 1 year ago
- 3 comments
#93 - Custom this repo for another architecture like BLOOM, MPT, Falcon
Issue -
State: closed - Opened by Iambestfeed over 1 year ago
- 1 comment
#92 - Interesting method to extend a model's max context length.
Issue -
State: closed - Opened by allenbenz over 1 year ago
- 49 comments
#89 - Fix compiling in venv on Windows
Pull Request -
State: closed - Opened by EyeDeck over 1 year ago
- 5 comments
#88 - Benchmarks vs vLLM?
Issue -
State: closed - Opened by nikshepsvn over 1 year ago
- 6 comments
#87 - Request for server API script without sessions
Issue -
State: closed - Opened by CORRUPTOR2037 over 1 year ago
- 5 comments
#86 - elapsed can be 0 for prompt processing on windows
Pull Request -
State: closed - Opened by allenbenz over 1 year ago
- 1 comment
#85 - Support for models with 8-bit quants?
Issue -
State: closed - Opened by Panchovix over 1 year ago
- 3 comments
#84 - Minor import time output suppression for windows
Pull Request -
State: closed - Opened by allenbenz over 1 year ago
- 1 comment
#83 - Add option to run docker container as root user
Pull Request -
State: closed - Opened by nopperl over 1 year ago
- 2 comments
#82 - Add waitress to Dockerfile please
Issue -
State: closed - Opened by ghost over 1 year ago
- 1 comment
#81 - performance & quality drop (3x) when setting top_p = 1.0 vs. 0.99
Issue -
State: closed - Opened by matatonic over 1 year ago
- 4 comments
#79 - TypeError: 'type' object is not subscriptable
Issue -
State: closed - Opened by KPTK over 1 year ago
- 5 comments
#78 - Multimodal support
Issue -
State: closed - Opened by realsammyt over 1 year ago
- 9 comments
#77 - Problem with generation leading space.
Issue -
State: closed - Opened by Larryvrh over 1 year ago
#76 - "fatal error LNK1104: cannot open file 'python310.lib'" + Solution (Windows)
Issue -
State: closed - Opened by JLuke73 over 1 year ago
- 8 comments
#75 - Tesla P40 only using 70W underload
Issue -
State: closed - Opened by TimyIsCool over 1 year ago
- 15 comments
#74 - Support for llama models with >2048 context?
Issue -
State: closed - Opened by Panchovix over 1 year ago
- 1 comment
#73 - Using cache cause random behavior during generation
Issue -
State: closed - Opened by Larryvrh over 1 year ago
- 6 comments
#72 - Is is able to turning with exllama?
Issue -
State: closed - Opened by laoda513 over 1 year ago
- 21 comments
#71 - Fix some cublas hipification
Pull Request -
State: closed - Opened by ardfork over 1 year ago
#70 - OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
Issue -
State: closed - Opened by zero-thermo over 1 year ago
- 4 comments
#69 - Using QLoRA?
Issue -
State: closed - Opened by gameveloster over 1 year ago
- 1 comment
#68 - Added streaming langchain example.
Pull Request -
State: open - Opened by CoffeeVampir3 over 1 year ago
- 2 comments
#65 - Error running. ArgTypes. Ninja: Build stopped: subcommand failed
Issue -
State: closed - Opened by ckasimis over 1 year ago
- 5 comments
#64 - Support for StarCoder
Issue -
State: closed - Opened by bkutasi over 1 year ago
- 1 comment
#63 - Possible to add a pip package?
Issue -
State: closed - Opened by CoffeeVampir3 over 1 year ago
- 2 comments
#62 - KeyError when loading GPTQ Model
Issue -
State: closed - Opened by mambug over 1 year ago
- 2 comments
#59 - Batch generation support
Issue -
State: closed - Opened by ri938 over 1 year ago
- 2 comments
#58 - Add option to run docker container as root user (fixes #57)
Pull Request -
State: closed - Opened by nopperl over 1 year ago
- 5 comments
#57 - Docker and ownership permissions
Issue -
State: closed - Opened by chrisbward over 1 year ago
- 5 comments
#56 - Add flask inference example
Pull Request -
State: closed - Opened by Kerushii over 1 year ago
- 2 comments
#55 - Lora support
Issue -
State: open - Opened by alain40 over 1 year ago
- 18 comments
#54 - SqueezeLLM Support?
Issue -
State: closed - Opened by nikshepsvn over 1 year ago
- 1 comment
#53 - New API endpoint
Pull Request -
State: closed - Opened by jisungk2 over 1 year ago
- 5 comments
#52 - ExLlamaDeviceMap's layers offload to CPU?
Issue -
State: closed - Opened by tiendung over 1 year ago
- 1 comment
#51 - Correct years from 2024 to 2023
Pull Request -
State: closed - Opened by tiendung over 1 year ago
- 2 comments
#50 - API for batched input?
Issue -
State: closed - Opened by 0x1997 over 1 year ago
- 8 comments
#49 - how to get correct model type?
Issue -
State: closed - Opened by lx0126z over 1 year ago
- 5 comments
#48 - Feature Request: length_penalty support
Issue -
State: closed - Opened by Qubitium over 1 year ago
- 3 comments
#47 - Very poor output quality
Issue -
State: open - Opened by calebmor460 over 1 year ago
- 55 comments
#46 - Landmark Attention support
Issue -
State: closed - Opened by grimulkan over 1 year ago
- 17 comments
#45 - Perplexity refactor
Pull Request -
State: closed - Opened by lhl over 1 year ago
#44 - "ValueError: Found group index but no groupsize. What do?"
Issue -
State: closed - Opened by dvoidus over 1 year ago
- 4 comments
#43 - Add docker support
Pull Request -
State: closed - Opened by nopperl over 1 year ago
- 4 comments
#42 - make pascal compile
Pull Request -
State: closed - Opened by Ph0rk0z over 1 year ago
#41 - Perplexity Data Format/Testing Data Question
Issue -
State: closed - Opened by lhl over 1 year ago
- 20 comments
#40 - RuntimeError: CUDA error: an illegal memory access was encountered
Issue -
State: open - Opened by TianqiYe over 1 year ago
- 11 comments
#38 - 65B working on multi-gpu
Issue -
State: closed - Opened by ortegaalfredo over 1 year ago
- 1 comment
#37 - Streaming API
Issue -
State: open - Opened by bkutasi over 1 year ago
- 5 comments
#36 - Improve Windows compatibility
Pull Request -
State: closed - Opened by EyeDeck over 1 year ago
- 2 comments
#35 - Pure C++ core instead of Python
Issue -
State: closed - Opened by gotzmann over 1 year ago
- 1 comment
#33 - Can't compile on Windows
Issue -
State: closed - Opened by Panchovix over 1 year ago
- 13 comments
#32 - 2 x RTX A5000 performance
Issue -
State: closed - Opened by alain40 over 1 year ago
- 14 comments
#30 - Typo in model.py
Issue -
State: closed - Opened by g0morra over 1 year ago
#29 - Performance degradation
Issue -
State: open - Opened by dvoidus over 1 year ago
- 20 comments
#28 - --host for running webui across the network
Pull Request -
State: closed - Opened by disarmyouwitha over 1 year ago
#27 - will it work with Nvidia P40 24GB on Linux?
Issue -
State: open - Opened by waan1 over 1 year ago
- 29 comments
#26 - WebUI Multi-bot
Issue -
State: closed - Opened by Fairfax-Mooresby over 1 year ago
- 3 comments
#25 - Get error when compiling.
Issue -
State: closed - Opened by Cortega13 over 1 year ago
- 2 comments
#23 - Fix reuse
Pull Request -
State: closed - Opened by osmarks over 1 year ago
- 1 comment
#21 - TransformerEngine FP8 support
Issue -
State: closed - Opened by SinanAkkoyun over 1 year ago
- 4 comments
#20 - Kernel wouldn't compile in my conda env
Issue -
State: closed - Opened by Ph0rk0z over 1 year ago
- 8 comments
#19 - the inference speed of GPTQ 4bit quantized model
Issue -
State: closed - Opened by pineking over 1 year ago
- 25 comments
#17 - Are you able to help?
Issue -
State: closed - Opened by NO-ob over 1 year ago
- 4 comments
#15 - Gradio error: "Not implemented yet"
Issue -
State: closed - Opened by mmealman over 1 year ago
- 2 comments
#14 - Question - possible to run starcoder with exllama?
Issue -
State: closed - Opened by tpfwrz over 1 year ago
- 8 comments
#13 - ExLlama API spec / discussion
Issue -
State: closed - Opened by nikshepsvn over 1 year ago
- 6 comments
#12 - Error when trying to run Wizard-Vicuna-13B-Uncensored-GPTQ
Issue -
State: closed - Opened by nikshepsvn over 1 year ago
- 8 comments
#11 - Pushing working code to master
Pull Request -
State: closed - Opened by disarmyouwitha over 1 year ago
- 1 comment
#10 - Splitting model on multiple GPUs produces RuntimeError
Issue -
State: closed - Opened by h3ss over 1 year ago
- 19 comments
#8 - Turn into Python module, hack in transformers support
Pull Request -
State: closed - Opened by 0cc4m over 1 year ago
- 4 comments
#7 - Add ROCm support
Pull Request -
State: closed - Opened by ardfork over 1 year ago
- 75 comments
#6 - RTX 3060 12GB Benchmarking
Issue -
State: closed - Opened by 1aienthusiast over 1 year ago
- 6 comments
#5 - Working with TheBloke/WizardLM-30B-Uncensored-GPTQ
Issue -
State: closed - Opened by gabriel-peracio over 1 year ago
- 4 comments
#3 - Multi-GPU
Issue -
State: closed - Opened by Fairfax-Mooresby over 1 year ago
- 6 comments
#2 - Cuda 12.1 - Fails to Build Here
Issue -
State: closed - Opened by ilikenwf over 1 year ago
- 29 comments
#1 - Crashing with act order and no act order since latest changes.
Issue -
State: closed - Opened by disarmyouwitha over 1 year ago
- 3 comments