Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / deepjavalibrary/djl-serving issues and pull requests
#1260 - Update trtllm toolkit path
Pull Request -
State: closed - Opened by rohithkrn about 1 year ago
#1259 - [TRT partition] add realtime stream reader for the conversion script
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1258 - [TRTLLM] always setting request output length
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1257 - MME - deviceId while creating workers
Pull Request -
State: closed - Opened by sindhuvahinis about 1 year ago
#1256 - [TRTLLM] add trtllm with no deps
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1255 - [TRTLLM] use tensorrt wheel
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1254 - install trtllm toolkit
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1253 - [python] Fixes build error
Pull Request -
State: closed - Opened by frankfliu about 1 year ago
#1252 - Inf2 properties refactoring using pydantic
Pull Request -
State: closed - Opened by sindhuvahinis about 1 year ago
#1251 - [serving] Adds token latency metric
Pull Request -
State: closed - Opened by frankfliu about 1 year ago
#1250 - [feat] Add inf2 2.15 sdk and handler to 0.24.0 dlc
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1249 - [python] Buffer tokens for rolling batch
Pull Request -
State: closed - Opened by frankfliu about 1 year ago
#1248 - [TRTLLM] some clean up on trtllm handler
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1247 - add trtllm cuda-compat
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1246 - [DeepSpeed DLC] separate container build with multi-layers
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1245 - remove unused components
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1244 - removing ai template installation in deepspeed container
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1240 - New PR for tensorrt llm
Pull Request -
State: closed - Opened by ydm-amazon about 1 year ago
- 3 comments
#1236 - Issue with serving modes section (documentation)
Issue -
State: closed - Opened by segundovolante about 1 year ago
- 4 comments
Labels: bug
#1235 - Add trt-llm engine build step during model initialization
Pull Request -
State: closed - Opened by rohithkrn about 1 year ago
- 3 comments
#1230 - [SageMaker Galactus developer experience] model load integration to DJL serving
Pull Request -
State: closed - Opened by haNa-meister about 1 year ago
- 3 comments
#1229 - [fix] gpt2 neuron support handler and ci
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1227 - [neuronx] bump to 2.15 for tnx container and scripts
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
- 2 comments
#1222 - Cleans tensorParallelDegree with MultiDevice
Pull Request -
State: closed - Opened by zachgk about 1 year ago
- 2 comments
#1220 - Update mpirun options
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
- 1 comment
#1218 - [TRTLLM][SAMPLE] add trtllm rough rolling batcher
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1216 - Do warmup in multiple requests
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1214 - Ability to transform model outputs in DJL Serving
Issue -
State: closed - Opened by rachitchauhan43 about 1 year ago
- 4 comments
Labels: enhancement
#1212 - switch to torchrun as default
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1206 - [NeuronX] add attention mask porting from optimum-neuron
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1203 - Setting default datatype for deepspeed handlers
Pull Request -
State: closed - Opened by sindhuvahinis about 1 year ago
#1194 - [fix] update context estimate interface
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1193 - [python] Do not set default value for truncate
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1190 - CI Test
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1189 - [0.24.0] Fix lmi_dist garbage output issue
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1188 - djl lmi images with vllm and hf quantizaton support
Issue -
State: closed - Opened by Nagarajj about 1 year ago
- 1 comment
Labels: bug
#1187 - Fix lmi_dist garbage output issue
Pull Request -
State: open - Opened by xyang16 about 1 year ago
#1186 - [INF2] allow neuron to load split model directly
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1185 - Adding INF2 (transformers-neuronx) compilation latencies to SageMaker Health Metrics
Pull Request -
State: open - Opened by Lokiiiiii about 1 year ago
- 2 comments
#1184 - Add context length estimate for Neuron handler
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1183 - Add CI performance test for deepspeed smoothquant.
Pull Request -
State: closed - Opened by chen3933 about 1 year ago
#1182 - Fix max tensor_parallel_degree
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1181 - [bug fix] add entrypoint camel case recovery
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1180 - Update mmap version in `deepspeed.Dockerfile`
Pull Request -
State: closed - Opened by maaquib about 1 year ago
#1179 - Add aiccl support
Pull Request -
State: open - Opened by maaquib about 1 year ago
#1178 - [bugfix] parsing waiting steps to integer
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1177 - [CI] change xgen to standard llama model
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1176 - [LMI][Handler] add more model support coverage
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1175 - [fix] update tp for dynamic llama2 test back to 4
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1174 - Fix flash_attn import issue
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1173 - rolling batch does not work
Issue -
State: closed - Opened by prgawade about 1 year ago
- 2 comments
Labels: bug
#1172 - Faster in-memory weight transfer for transformers-neuronx
Pull Request -
State: closed - Opened by Lokiiiiii about 1 year ago
- 1 comment
#1171 - Adding llama2 w/ SmoothQuant ci test
Pull Request -
State: closed - Opened by maaquib about 1 year ago
#1170 - [Docker] free disk space for docker build
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1169 - Update java dependencies
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1168 - [INF2] add neuron batch size default and support rolling batch configs
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1167 - [feat] freeze deepspeed version for release
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1166 - Enable adapters preview in llm_integration test
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1165 - [Handler] disable flash attention as default as of now
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1164 - [fix] add fast loading to partition test
Pull Request -
State: closed - Opened by tosterberg about 1 year ago
#1163 - [serving] Cancel request if client disconnect
Pull Request -
State: open - Opened by frankfliu about 1 year ago
#1162 - installing official vLLM into container
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1161 - Update vllm wheel name
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1160 - Adds versions as labels in dockerfiles
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1159 - When doing smoothquant calibration, pass tokenizer through in deepspe…
Pull Request -
State: closed - Opened by davidthomas426 about 1 year ago
#1158 - [Handler] disable circular import
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1157 - Clarify error message with unsupported quantization algorithm, since …
Pull Request -
State: closed - Opened by davidthomas426 about 1 year ago
#1156 - Add error message for quantization when using checkpoint loading.
Pull Request -
State: closed - Opened by chen3933 about 1 year ago
#1155 - [IB] remove empty lines
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1154 - [0.24.0 branch] Release 0.24.0 changes
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1153 - Assert local lora models in the handler
Pull Request -
State: closed - Opened by rohithkrn about 1 year ago
#1152 - Add feature flag for adapters
Pull Request -
State: closed - Opened by zachgk about 1 year ago
- 1 comment
#1151 - Instance Benchmark Rev2
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1150 - [serving] Allow model_id point to djl model zoo
Pull Request -
State: closed - Opened by frankfliu about 1 year ago
#1149 - Instant benchmark
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1148 - Support adapters by properties
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1147 - Block remote adapter url and handler override
Pull Request -
State: closed - Opened by zachgk about 1 year ago
#1146 - Give a version of seq scheduler
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
#1145 - [INF2][CI] switch the model to pythia
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
- 1 comment
#1144 - [fix] version_fix
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
#1143 - [CI] allow inf2 instance to sleep longer
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1142 - [CI][Neuron] add extra timeout time for gpt neox
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1141 - [WIP][FasterTransformer] use python 3.10.0 and upgrade pytorch
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
- 2 comments
#1140 - Update vllm_rolling_batch.py
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1139 - Adding smoothquant integ tests
Pull Request -
State: closed - Opened by maaquib about 1 year ago
#1138 - [feat] Modify deepspeed handler to support smoothQuant.
Pull Request -
State: closed - Opened by chen3933 about 1 year ago
- 3 comments
#1137 - [fix] Gptq dependency
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
- 4 comments
#1136 - [vLLM][Handler] add quantization option for vLLM
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1135 - [python] Make rolling batch output not escape unicode characters
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1134 - [INF2][Handler] remove type conversion in Neuron
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1133 - Revert flash_attn v2 version back to 2.0.1
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
#1132 - [fix] fix hf transformer handler dependency
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
- 2 comments
#1131 - [fix] Fix falcon in seq_scheduler
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
#1130 - [0.22.1][DeepSpeed] make deepspeed run on cpu runner
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1129 - [fix] falcon test model failure in unittest
Pull Request -
State: closed - Opened by KexinFeng about 1 year ago
#1128 - [Backport][0.22.1][INF2] remove header installation
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1127 - [Backport][0.23.0] remove INF2 header installation
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
#1126 - [INF2] remove neuron settings on cache hit for the folder
Pull Request -
State: closed - Opened by lanking520 about 1 year ago
- 1 comment
#1125 - Add rolling batch gptq integration test
Pull Request -
State: closed - Opened by xyang16 about 1 year ago
- 2 comments
#1124 - [Handler] bump up vllm version and fix some bugs
Pull Request -
State: closed - Opened by lanking520 about 1 year ago