Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / microsoft/DeepSpeedExamples issues and pull requests
#934 - No module named 'transformers.deepspeed'
Issue -
State: open - Opened by TianyuJIAA 27 days ago
- 1 comment
#933 - Fixed mistake in readme
Pull Request -
State: open - Opened by SCheekati about 1 month ago
#932 - Does DeepSpeed's Pipeline-Parallelism optimizer supports skip connections?
Issue -
State: open - Opened by RoyMahlab about 1 month ago
#931 - [cifar ds training]: Set cuda device during initialization of distributed backend.
Pull Request -
State: open - Opened by jagadish-amd about 1 month ago
- 2 comments
#930 - Enable reward model offloading option
Pull Request -
State: open - Opened by kfertakis about 2 months ago
- 2 comments
#929 - Deepspeed-Domino
Pull Request -
State: open - Opened by zhangsmallshark 2 months ago
- 1 comment
#928 - After using steps 1, 2, and 3, the test reply content only replies Assistant: </s>.
Issue -
State: closed - Opened by jianmomo 2 months ago
#927 - Remove the fixed `eot_token` mechanism for SFT
Pull Request -
State: open - Opened by Xingfu-Yi 2 months ago
- 1 comment
#925 - Update requirements for opencv-python CVE
Pull Request -
State: closed - Opened by loadams 3 months ago
#924 - AttributeError: 'DeepSpeedEngine' object has no attribute 'model',
Issue -
State: open - Opened by lovychen 3 months ago
- 1 comment
#923 - How to calculate training efficiency ,i.e tokens/sec of step 1 fine tuning of llama2 model ?
Issue -
State: open - Opened by sowmya04101998 3 months ago
#922 - Actor loss nan and Resizing model embedding
Issue -
State: open - Opened by ouyanmei 3 months ago
- 1 comment
#921 - DeepNVMe ZeRO-inf Tutorial
Pull Request -
State: closed - Opened by jomayeri 3 months ago
#920 - FileNotFoundError: [Errno 2] No such file or directory: 'numactl'
Issue -
State: open - Opened by zhiwentian 3 months ago
- 4 comments
#919 - DeepNVMe README.md add xref
Pull Request -
State: closed - Opened by stas00 3 months ago
#916 - Update README.md
Pull Request -
State: closed - Opened by keshavkowshik 3 months ago
#915 - step2 without any response for a long time
Issue -
State: open - Opened by asfadfaf 3 months ago
#914 - DeepNVMe example scripts
Pull Request -
State: closed - Opened by tjruwase 3 months ago
#913 - Add openai client to deepspeedometer
Pull Request -
State: closed - Opened by delock 4 months ago
- 2 comments
#912 - Different zero stage the training memory compute
Issue -
State: open - Opened by Arcmoon-Hu 4 months ago
#911 - nvcc fatal : Unsupported gpu architecture 'compute_86' and nvcc fatal : Value 'c++17' is not defined for option 'std'
Issue -
State: closed - Opened by Xccanxin 4 months ago
- 1 comment
#910 - How to start deepspeed automatically?
Issue -
State: closed - Opened by qwerfdsadad 5 months ago
- 2 comments
#909 - Consult the first phase.
Issue -
State: closed - Opened by csxrzhang 5 months ago
- 2 comments
#908 - an error with gradient checkpointing in DeepspeedChat step 3
Issue -
State: open - Opened by wangyuwen1999 5 months ago
#907 - Error when using a Qwen model as the Actor Model in step 3 of RLHF on a single machine with multiple GPUs
Issue -
State: open - Opened by Dakai798 5 months ago
- 1 comment
#906 - DeepSpeed-Chat step-1 hanging for a long time
Issue -
State: open - Opened by lemon-little 5 months ago
#905 - Enable cpu/xpu support for the benchmarking suite
Pull Request -
State: closed - Opened by louie-tsai 6 months ago
- 8 comments
#904 - CPU OOM when inferencing Llama3-70B-Chinese-Chat
Issue -
State: open - Opened by GORGEOUSLCX 6 months ago
#903 - cannot pickle 'Stream' object
Issue -
State: open - Opened by teis-e 6 months ago
#902 - can not run the test-gpt.sh because of assertionError
Issue -
State: open - Opened by leachee99 6 months ago
#901 - Does FastGen support long-text and sequence-parallel inference?
Issue -
State: open - Opened by AceCoder0 6 months ago
#900 - Add --client-only arg to mii benchmark
Pull Request -
State: closed - Opened by delock 7 months ago
#899 - Refactored LLM benchmark code
Pull Request -
State: closed - Opened by mrwyattii 7 months ago
#898 - fix bug with queue.empty not being reliable
Pull Request -
State: closed - Opened by mrwyattii 7 months ago
#897 - Update tokens_per_sec calculation to work w/ stream and non-stream cases
Pull Request -
State: closed - Opened by lekurile 7 months ago
#896 - run-example.sh fails with urllib3.exceptions.ProtocolError: Response ended prematurely
Issue -
State: closed - Opened by awan-10 7 months ago
- 11 comments
#895 - updating tokens per second to include the token count of generated tokens.
Pull Request -
State: closed - Opened by guptha23 7 months ago
#894 - [Error] AutoTune: `connect to host localhost port 22: Connection refused`
Issue -
State: open - Opened by wqw547243068 7 months ago
#893 - How to use deepspeed for multi-node and multi-card task in slurm cluster
Issue -
State: open - Opened by dshwei 7 months ago
#892 - Does Zero-Inference support TP?
Issue -
State: open - Opened by preminstrel 7 months ago
- 11 comments
#891 - extend max_prompt_length and input text for 128k evaluation
Pull Request -
State: closed - Opened by HeyangQin 7 months ago
#890 - Deepspeed support finetune extra model with lora ?
Issue -
State: open - Opened by wanghongqu 7 months ago
- 1 comment
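#889 - Python environment paths differ across machines; after launch, deepspeed cannot find the Python environment on the other machines. How can this be resolved?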
Issue -
State: closed - Opened by liqwertyu 7 months ago
#888 - when calculating actor loss, why the mask is "action_mask[:, start: ] "
Issue -
State: closed - Opened by fancghit 8 months ago
#887 - The actor constantly generates ['</s>'] or ['<|endoftext|></s>'] after 200 steps in RLHF with hybrid engine disabled
Issue -
State: open - Opened by mousewu 8 months ago
- 1 comment
#886 - About multiple-thread attention computation on CPU using zero-inference example.
Issue -
State: open - Opened by luckyq 8 months ago
#885 - Suggested GPU to run the demo code of step2_reward_model_finetuning (DeepSpeed-Chat)
Issue -
State: open - Opened by wenbozhangjs 8 months ago
#884 - [REQUEST] More fine-grained distributed strategies for RLHF training
Issue -
State: open - Opened by youshaox 8 months ago
#883 - The reward value did not increase.
Issue -
State: open - Opened by Sun-Shiqi 8 months ago
- 1 comment
#882 - Fix response check in call_aml function
Pull Request -
State: closed - Opened by HeyangQin 8 months ago
#881 - Update throughput-latency plot script
Pull Request -
State: closed - Opened by lekurile 8 months ago
#880 - [Inference Benchmark] set `num_requests` based on `num_clients`
Pull Request -
State: closed - Opened by mrwyattii 8 months ago
#879 - Confusion about Deepspeed Inference
Issue -
State: open - Opened by ZekaiGalaxy 8 months ago
- 1 comment
#878 - `AttributeError: readonly attribute` while trying to run training/HelloDeepSpeed
Issue -
State: open - Opened by htjain 8 months ago
#876 - [inference benchmark] update AML kwargs to match vLLM kwargs
Pull Request -
State: closed - Opened by mrwyattii 8 months ago
#875 - Improve robustness of inference AML benchmark
Pull Request -
State: closed - Opened by HeyangQin 8 months ago
#874 - Fix AML benchmark E2E measurement
Pull Request -
State: closed - Opened by mrwyattii 8 months ago
#873 - Add LoRA optimization to the SD training example
Pull Request -
State: open - Opened by PareesaMS 9 months ago
#872 - Replace deprecated transformers.deepspeed module
Pull Request -
State: open - Opened by HollowMan6 9 months ago
#871 - Xiaoxia/fp v1
Pull Request -
State: closed - Opened by xiaoxiawu-microsoft 9 months ago
#870 - Remove AML key from args dict when saving results
Pull Request -
State: closed - Opened by lekurile 9 months ago
#869 - Inference Benchmark: Catch AML error response
Pull Request -
State: closed - Opened by mrwyattii 9 months ago
#868 - Update Inference Benchmarking Scripts - Support AML
Pull Request -
State: closed - Opened by lekurile 9 months ago
- 1 comment
#867 - [Bug] DeepSpeed Inference Does not Work with LLaMA (Latest version)
Issue -
State: open - Opened by allanj 9 months ago
- 3 comments