Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / SJTU-IPADS/PowerInfer issues and pull requests

#236 - Where are the weights of the sparse predictor?

Issue - State: open - Opened by Yues007 11 days ago - 1 comment
Labels: question

#235 - Which version of the falcon-40b model was used in the llama.cpp reference in the demo?

Issue - State: open - Opened by wuooo339 27 days ago
Labels: question

#234 - How to run an OPT model with PowerInfer?

Issue - State: open - Opened by wuooo339 28 days ago - 2 comments
Labels: question

#232 - Problem running the model

Issue - State: closed - Opened by Yues007 about 1 month ago
Labels: question

#231 - fix a bug when calculating `neuron_cap` before invoking the solver

Pull Request - State: open - Opened by KiritoHugh about 1 month ago

#229 - Error: the provided PTX was compiled with an unsupported toolchain

Issue - State: open - Opened by jiangzizi 3 months ago
Labels: bug-unconfirmed

#228 - About the use of the OPT model

Issue - State: open - Opened by bobzhang208 3 months ago
Labels: question

#227 - Add a new model in PowerInfer-2

Issue - State: open - Opened by Francis235 3 months ago - 1 comment
Labels: question

#226 - Qualcomm chip support

Issue - State: open - Opened by Francis235 3 months ago
Labels: question

#225 - Question about the perplexity

Issue - State: open - Opened by eljrte 3 months ago
Labels: question

#224 - How are the attention block weights allocated?

Issue - State: open - Opened by Yues007 4 months ago - 2 comments
Labels: question

#223 - How can I obtain the weight files for the OPT model?

Issue - State: open - Opened by a1bc2def6g 4 months ago - 2 comments
Labels: question

#222 - What does "co-activation" mean in Section 4.3 of the PowerInfer-2 paper?

Issue - State: closed - Opened by exhyy 4 months ago
Labels: question

#221 - Question about the video demo in the README

Issue - State: closed - Opened by lyeXzot 4 months ago
Labels: question

#220 - Measuring the overhead of the predictor

Issue - State: open - Opened by guanchenl 4 months ago
Labels: question

#219 - Help! Want a toy example to run matmul with q40 weights using a CUDA kernel

Issue - State: open - Opened by Eutenacity 4 months ago
Labels: question

#218 - CUDA toolkit version?

Issue - State: open - Opened by shujiehan 5 months ago - 1 comment
Labels: question

#216 - Am I doing something wrong?

Issue - State: open - Opened by RealMrCactus 5 months ago - 1 comment
Labels: question

#213 - Some questions about Fig. 4

Issue - State: open - Opened by rhmaaa 6 months ago - 5 comments
Labels: question

#196 - Supported quantization types

Issue - State: closed - Opened by deleteeeee 7 months ago - 1 comment
Labels: question

#194 - Source for v2 (mobile inference engine)

Issue - State: open - Opened by peeteeman 7 months ago - 8 comments
Labels: question

#189 - ReluFalcon 40B produces invalid output on llama.cpp

Issue - State: closed - Opened by Zctoylm0927 8 months ago - 4 comments
Labels: question

#183 - Are there any plans to support Llama 3 70B?

Issue - State: open - Opened by xiasw81 9 months ago - 1 comment
Labels: enhancement

#154 - A question about the Neuron-aware Operator

Issue - State: closed - Opened by YuMJie 11 months ago - 3 comments
Labels: question

#108 - CUDA error 13 at /home/PowerInfer/ggml-cuda.cu:9619: invalid device symbol

Issue - State: open - Opened by zilunzhang about 1 year ago - 2 comments
Labels: bug-unconfirmed

#102 - Can 01-ai's Yi model series be adapted? The model architecture looks the same as LLaMA's

Issue - State: closed - Opened by felixstander about 1 year ago - 1 comment
Labels: question

#99 - Fix generation error under INT4 quantization and batched prompting

Pull Request - State: closed - Opened by hodlen about 1 year ago

#98 - Further optimisation of hybrid inference

Issue - State: open - Opened by hodlen about 1 year ago
Labels: tracker

#97 - Optimize CUDA sparse operator with Tensor Core

Issue - State: open - Opened by hodlen about 1 year ago
Labels: enhancement

#96 - Kernel fusion to reduce communication overhead

Issue - State: open - Opened by hodlen about 1 year ago
Labels: enhancement

#95 - Reclaim memory from offloaded model weights

Issue - State: open - Opened by hodlen about 1 year ago - 1 comment
Labels: enhancement

#94 - How to convert llama family model to powerinfer.gguf?

Issue - State: closed - Opened by Mokuroh0924 about 1 year ago - 1 comment
Labels: question

#93 - Meta: Wider model support for PowerInfer

Issue - State: open - Opened by hodlen about 1 year ago - 10 comments
Labels: tracker

#92 - Meta: Implementing hybrid inference across key desktop platforms

Issue - State: open - Opened by hodlen about 1 year ago
Labels: tracker

#90 - Update issue templates of PowerInfer

Pull Request - State: closed - Opened by hodlen about 1 year ago

#89 - Add our Kanban to README.md

Pull Request - State: closed - Opened by hodlen about 1 year ago

#88 - macOS/Metal inference support

Issue - State: open - Opened by hodlen about 1 year ago
Labels: tracker

#87 - WSL + CUDA issues

Issue - State: open - Opened by hodlen about 1 year ago
Labels: tracker

#86 - Windows CPU/GPU support

Issue - State: closed - Opened by hodlen about 1 year ago - 2 comments
Labels: tracker

#85 - Fix offloading / VRAM budget bugs

Issue - State: open - Opened by hodlen about 1 year ago - 2 comments
Labels: tracker

#84 - How are the original weights and predictor weights generated?

Issue - State: open - Opened by sunnyregion about 1 year ago - 2 comments
Labels: question

#83 - Can we make it run on other models?

Issue - State: open - Opened by YLSnowy about 1 year ago - 6 comments
Labels: question

#82 - Converting GGUF Models and Support for Smaller Models

Issue - State: open - Opened by nndnnv about 1 year ago - 1 comment
Labels: enhancement

#81 - Didn't use GPU

Issue - State: closed - Opened by yuxx0218 about 1 year ago - 4 comments

#80 - cmake -S . -B build -DLLAMA_CUBLAS=ON

Issue - State: open - Opened by hungptit123 about 1 year ago - 1 comment
Labels: bug-unconfirmed

#79 - I don't know how to program

Issue - State: closed - Opened by dyt06 about 1 year ago - 2 comments
Labels: help wanted

#78 - pip install -r requirements reports ./gguf-py not installable

Issue - State: closed - Opened by jqliu42 about 1 year ago - 3 comments
Labels: help wanted

#77 - When I enable the GPU split, the inference result is unacceptable

Issue - State: closed - Opened by Gengchunsheng about 1 year ago - 6 comments
Labels: bug

#76 - Convert HF models with sparse threshold specified

Pull Request - State: closed - Opened by Szy0127 about 1 year ago - 1 comment

#74 - Seems not to support long prompts well

Issue - State: open - Opened by swankong about 1 year ago - 3 comments
Labels: question

#73 - Add Windows CPU/GPU CMake support

Pull Request - State: closed - Opened by bobozi-cmd about 1 year ago - 7 comments

#72 - Update README.md

Pull Request - State: closed - Opened by YixinSong-e about 1 year ago

#71 - Add news

Pull Request - State: closed - Opened by YixinSong-e about 1 year ago

#70 - Question about adaptation for servers with consumer-grade GPUs

Issue - State: open - Opened by hua-bang about 1 year ago - 2 comments
Labels: question

#69 - Question about adaptation for servers with consumer-grade GPUs

Issue - State: closed - Opened by hua-bang about 1 year ago
Labels: enhancement

#68 - Add demo link to README.md

Pull Request - State: closed - Opened by hodlen about 1 year ago

#67 - Are you interested in supporting DeepSeek?

Issue - State: closed - Opened by homosapien-lcy about 1 year ago - 3 comments

#66 - Is it possible in the future to run Mixtral 8x7B?

Issue - State: open - Opened by zotona about 1 year ago - 3 comments
Labels: enhancement

#65 - [HELP WANTED] Is InternLM supported?

Issue - State: closed - Opened by vansin about 1 year ago - 1 comment
Labels: enhancement

#64 - How to integrate with LangChain?

Issue - State: open - Opened by tigerinus about 1 year ago - 1 comment
Labels: enhancement

#63 - CUDA error 1 in ggml-cuda.cu:8332: invalid argument, and then segmentation fault

Issue - State: open - Opened by 3dluvr about 1 year ago - 3 comments
Labels: bug

#62 - Fix VRAM capacity assertion bug

Pull Request - State: closed - Opened by hodlen about 1 year ago

#59 - GitHub

Issue - State: closed - Opened by maxrubelvai about 1 year ago - 2 comments
Labels: invalid

#58 - Windows Visual Studio build failure

Issue - State: open - Opened by ChenXiaoTemp about 1 year ago - 3 comments
Labels: bug

#57 - Accuracy comparison

Issue - State: closed - Opened by FL77N about 1 year ago - 1 comment
Labels: question

#56 - How to convert a Chinese Llama-2 HF-format .bin into PowerInfer format?

Issue - State: closed - Opened by Chenhuaqi6 about 1 year ago - 3 comments
Labels: question

#55 - No module named powerinfer, cannot split GPU

Issue - State: closed - Opened by Gengchunsheng about 1 year ago - 6 comments
Labels: bug

#54 - Want to deploy my own model

Issue - State: closed - Opened by tanklandry about 1 year ago - 1 comment
Labels: question

#53 - Server cannot run

Issue - State: closed - Opened by Gengchunsheng about 1 year ago - 3 comments
Labels: bug

#52 - In-depth Analysis of Memory Management for Enhanced Performance on Consumer-grade GPUs

Issue - State: open - Opened by yihong1120 about 1 year ago - 1 comment
Labels: enhancement

#51 - Chat model

Issue - State: closed - Opened by yzc111 about 1 year ago - 2 comments

#50 - Testing vs Ollama Mistral gives the same speed results on Llama-2 7B

Issue - State: open - Opened by jtoy about 1 year ago - 9 comments

#48 - Small code change - IFs to mapping

Pull Request - State: closed - Opened by 3x0dv5 about 1 year ago - 1 comment

#47 - Bitcoin

Issue - State: closed - Opened by Thato2009 about 1 year ago

#46 - no CUDA-capable device is detected

Issue - State: open - Opened by jasonmhead about 1 year ago - 4 comments

#45 - Add this line in README

Pull Request - State: closed - Opened by samehpalas about 1 year ago - 1 comment

#44 - Add more FAQs

Pull Request - State: closed - Opened by YixinSong-e about 1 year ago

#43 - Correct misleading description about offloading in README

Pull Request - State: closed - Opened by hodlen about 1 year ago

#42 - llama.cpp:3107: vram_allocated_bytes < vram_capacity

Issue - State: open - Opened by theodorDiaconu about 1 year ago - 12 comments
Labels: bug

#41 - Add more details on README evaluation

Pull Request - State: closed - Opened by hodlen about 1 year ago

#40 - Jetson Orin + RTX A6000

Issue - State: open - Opened by Gengchunsheng about 1 year ago - 2 comments
Labels: help wanted

#39 - Combined with LLM in a flash

Issue - State: closed - Opened by qwopqwop200 about 1 year ago - 4 comments
Labels: enhancement

#38 - vram-budget doesn't work well.

Issue - State: open - Opened by YixinSong-e about 1 year ago
Labels: bug

#37 - Will a Docker image be provided?

Issue - State: closed - Opened by lychee-2724540853 about 1 year ago
Labels: enhancement

#36 - [HELP WANTED] Is Qwen supported?

Issue - State: open - Opened by xxm1668 about 1 year ago - 8 comments
Labels: help wanted

#35 - Length

Issue - State: closed - Opened by cyzhh about 1 year ago - 2 comments
Labels: enhancement

#33 - Are there any results from running PowerInfer on an A100?

Issue - State: open - Opened by jayfeather9 about 1 year ago - 1 comment
Labels: enhancement

#32 - From meta-llama/Llama-2-13b-hf to SparseLLM/ReluLLaMA-13B

Issue - State: closed - Opened by Vincent131499 about 1 year ago - 3 comments
Labels: enhancement

#31 - Why did performance drop a lot?

Issue - State: closed - Opened by lucasjinreal about 1 year ago - 6 comments