Ecosyste.ms: Issues

An open API service for providing issue and pull request metadata for open source projects.

GitHub / microsoft/DeepSpeed-MII issues and pull requests

#321 - Add RTD

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#319 - Is there gonna be metrics endpoint exposed?

Issue - State: open - Opened by flexwang 11 months ago - 1 comment

#318 - What is the recommended way of bringing up mii as a service

Issue - State: open - Opened by flexwang 11 months ago - 1 comment

#317 - Adding OpenAI Compatible RESTful API

Pull Request - State: closed - Opened by PawanOsman 11 months ago - 19 comments

#316 - openai compatible api

Issue - State: closed - Opened by dongxiaolong 11 months ago
Labels: enhancement, good first issue

#315 - Use smaller model for unit tests

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#314 - Why is the throughput of mii lower than that of vllm in actual measurements?

Issue - State: closed - Opened by pangr 11 months ago - 6 comments

#313 - Questions about token throughput about dynamic splitfuse

Issue - State: closed - Opened by ChuanhongLi 11 months ago - 1 comment

#311 - readable token streaming support

Pull Request - State: closed - Opened by jeffra 11 months ago - 3 comments

#310 - Is multiple gpu supported with non-persistent pipeline

Issue - State: closed - Opened by yaliqin 11 months ago - 1 comment

#309 - Illegal memory access error when inferring input of length 100K

Issue - State: open - Opened by frankxyy 11 months ago - 4 comments

#308 - can not test with restful_api

Issue - State: open - Opened by irasin 11 months ago - 14 comments

#307 - Where to get log of server?

Issue - State: open - Opened by frankxyy 11 months ago - 1 comment
Labels: enhancement

#306 - Support for token streaming

Issue - State: open - Opened by Archmilio 11 months ago

#305 - Unable to load relatively large opt models (opt-6.7b opt-30b)

Issue - State: open - Opened by MeloYang05 11 months ago - 5 comments

#304 - Server launching error for model Yi-6B-200K-Llamafied

Issue - State: closed - Opened by frankxyy 11 months ago

#303 - unable to build model pipeline

Issue - State: open - Opened by sumitsahaykoantek 11 months ago - 4 comments

#301 - The result is irrelevant

Issue - State: closed - Opened by pangr 11 months ago

#299 - [Serving Stability] one request crashed, other requests can not be posted

Issue - State: closed - Opened by frankxyy 11 months ago - 3 comments

#298 - How to get num_of_new_tokens while calling client.generate()

Issue - State: closed - Opened by frankxyy 11 months ago - 3 comments

#297 - tp > 1 inference is very slow

Issue - State: open - Opened by easonfzw 11 months ago - 2 comments

#296 - Add safetensors support

Pull Request - State: closed - Opened by jihnenglin 11 months ago - 1 comment

#295 - Unify input/output types

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#294 - Update RESTful API

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#293 - how to set stop tokens?

Issue - State: open - Opened by PawanOsman 11 months ago - 3 comments

#292 - cannot send top_p temperature parameters through client.generate api calling

Issue - State: closed - Opened by frankxyy 11 months ago - 3 comments

#291 - Serving error when input of large length is sent

Issue - State: open - Opened by frankxyy 11 months ago - 4 comments

#289 - TypeError: cannot unpack non-iterable Response object

Issue - State: closed - Opened by flexwang 11 months ago - 7 comments

#288 - Non deterministic generation result from the same prompt

Issue - State: closed - Opened by flexwang 11 months ago - 3 comments

#287 - Why generation_config.json is a requirement for start server?

Issue - State: closed - Opened by flexwang 11 months ago - 6 comments

#286 - Is beam search supported?

Issue - State: open - Opened by flexwang 11 months ago - 2 comments

#285 - Question around DSStateManagerConfig.max_ragged_batch_size

Issue - State: closed - Opened by flexwang 11 months ago - 3 comments

#284 - Compatibility with DS Inference KV-cache flexibility PR

Pull Request - State: closed - Opened by cmikeh2 11 months ago

#283 - How to select specific gpu index when using tensor parallel?

Issue - State: closed - Opened by frankxyy 11 months ago - 2 comments

#282 - Server turns into broken state if queried with very long prompt

Issue - State: open - Opened by ttim 11 months ago - 1 comment

#281 - Streaming api seems broken

Issue - State: open - Opened by ttim 11 months ago - 2 comments

#280 - Add more generate() kwargs

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#279 - prevent load_with_sys_mem when using stable diffusion

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#278 - Recompute when the deadlock is detected

Pull Request - State: closed - Opened by tohtana 11 months ago

#277 - Installed CUDA version 11.7 does not match torch version

Issue - State: closed - Opened by frankxyy 11 months ago - 3 comments

#276 - Update precommit formatting and yapf to match DeepSpeed

Pull Request - State: closed - Opened by loadams 11 months ago

#275 - Enable multi-prompt input for persistent deployments

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#274 - Mitigate the risk of deadlock

Pull Request - State: closed - Opened by tohtana 11 months ago

#273 - Unable to load ragged_device_ops op due to no compute capabilities remaining after filtering

Issue - State: open - Opened by rogerbock 11 months ago - 10 comments
Labels: enhancement

#272 - `FileNotFoundError: No such file or directory: pytorch_model.bin` while loading a HF repository

Issue - State: open - Opened by jihnenglin 11 months ago - 1 comment
Labels: good first issue

#271 - [FastGen] Hot-swappable LoRA adapters?

Issue - State: open - Opened by corbt 11 months ago - 1 comment

#270 - Support AsyncPipeline for RESTful API

Issue - State: closed - Opened by toilaluan 11 months ago - 5 comments

#269 - Reorganize code structure, fix client import bug

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#268 - Expose top-p, top-k, and temperature to generate APIs

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#267 - set device on inference pipeline only if setter available

Pull Request - State: closed - Opened by gauravrajguru 11 months ago - 1 comment

#266 - Serve Fails while Pipeline is working

Issue - State: closed - Opened by aliozts 11 months ago - 4 comments

#265 - terminate_server only release memory on one gpu when using tensor_parallel

Issue - State: closed - Opened by baojunliu 11 months ago - 2 comments

#264 - diffuser model load using model and path params

Pull Request - State: closed - Opened by gauravrajguru 11 months ago

#263 - DeepSpeed MII Serve error on V100

Issue - State: closed - Opened by amazingkmy 11 months ago - 4 comments

#262 - Add MII v0.1 unit tests

Pull Request - State: closed - Opened by mrwyattii 11 months ago

#261 - Add ability to configure temperature, top P, top K, number of beams

Issue - State: closed - Opened by ttim 11 months ago - 2 comments

#260 - Provide async api in MII client

Issue - State: open - Opened by ttim 11 months ago - 2 comments

#259 - Fix typo in README.md

Pull Request - State: closed - Opened by eltociear 11 months ago

#258 - Server crashes whilst trying to spin up Mistral

Issue - State: open - Opened by harryjulian 11 months ago - 8 comments

#257 - Time to First Token almost same as vllm for large prompts

Issue - State: closed - Opened by idealover 11 months ago - 2 comments

#255 - Quantization Support for Fastgen?

Issue - State: open - Opened by aliozts 11 months ago - 4 comments

#254 - [FEATURE] Speculative Decoding

Issue - State: open - Opened by casper-hansen 11 months ago

#253 - Issues with llama 2 model example

Issue - State: open - Opened by ttim 11 months ago - 1 comment

#252 - MII v0.1.0 release

Pull Request - State: closed - Opened by tohtana 11 months ago

#251 - Fail to compile when kicking off the example

Issue - State: open - Opened by mozizhao 11 months ago

#250 - DeepSpeed bug multi-gpu in single node

Issue - State: open - Opened by muhammad-asn 11 months ago - 1 comment

#248 - Improved the code quality to ease future maintenance

Pull Request - State: closed - Opened by blackmambaza 12 months ago - 1 comment

#247 - Loadams/update yapf

Pull Request - State: closed - Opened by loadams 12 months ago

#246 - Update version.txt after 0.0.8 release

Pull Request - State: closed - Opened by loadams 12 months ago

#245 - Update autoPR creation in release script

Pull Request - State: closed - Opened by loadams 12 months ago

#244 - Fixes for AML metatensor loading

Pull Request - State: closed - Opened by mrwyattii almost 1 year ago

#243 - Multi node or remote machine inference doesn't work without "--force_multi" parameter

Issue - State: open - Opened by sarathkondeti about 1 year ago - 9 comments
Labels: bug

#242 - Does MII include batch-inference optimizations?

Issue - State: open - Opened by fr-ashikaumagiliya about 1 year ago

#241 - Is deepspeed-mii meant to run in a notebook?

Issue - State: closed - Opened by qrdlgit about 1 year ago - 3 comments

#240 - Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support

Pull Request - State: closed - Opened by ringohoffman about 1 year ago - 9 comments

#239 - Add PyPI release workflow

Pull Request - State: closed - Opened by mrwyattii about 1 year ago

#238 - Re-enable non persistent test cases

Pull Request - State: closed - Opened by mrwyattii about 1 year ago

#237 - DeepSpeed and Zero

Issue - State: open - Opened by UncleFB about 1 year ago - 2 comments

#236 - RuntimeError: This event loop is already running when running with fastapi

Issue - State: open - Opened by tulika612 about 1 year ago - 2 comments

#235 - How to load my local model

Issue - State: open - Opened by UncleFB about 1 year ago - 12 comments

#234 - use deploy_rank to allocate gpus

Pull Request - State: open - Opened by tulika612 about 1 year ago

#233 - Cache HF API results

Pull Request - State: closed - Opened by mrwyattii about 1 year ago

#232 - NON_PERSISTENT Deployment Multi-GPU Doesn't Work

Issue - State: closed - Opened by infosechoudini about 1 year ago - 1 comment

#231 - Error loading dolly model locally

Issue - State: closed - Opened by ethanenguyen about 1 year ago - 2 comments

#230 - Always got BadStatusLine Error when using http call inference service

Issue - State: open - Opened by qyhdt about 1 year ago - 1 comment

#229 - Update Stable Diffusion to match latest DeepSpeed-Inference

Pull Request - State: closed - Opened by mrwyattii about 1 year ago

#228 - Auto-generate host files

Pull Request - State: closed - Opened by mrwyattii about 1 year ago

#227 - test non-persistent deployments

Pull Request - State: closed - Opened by mrwyattii about 1 year ago - 1 comment

#226 - Error reasoning about llama2-7b-hf model using MII-Public

Issue - State: open - Opened by ly19970621 about 1 year ago - 4 comments

#225 - Support for HF Conversational Pipeline in DS MII

Pull Request - State: closed - Opened by srsaggam about 1 year ago

#224 - Issue loading larger models such as Llama-2 70B for serving

Issue - State: open - Opened by arjunv2489 about 1 year ago - 3 comments

#223 - Multi model refactor

Pull Request - State: open - Opened by TosinSeg about 1 year ago

#222 - waiting for server to start...

Issue - State: open - Opened by yunll about 1 year ago - 8 comments