Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / google/jetstream-pytorch issues and pull requests
#187 - Add model warmup and jax compilation cache flags
Pull Request -
State: open - Opened by vivianrwu about 2 months ago
#186 - Fix too many positional arguments lint error
Pull Request -
State: closed - Opened by FanhaiLu1 about 2 months ago
#185 - [Feature Request] Per request sampling params
Issue -
State: open - Opened by qihqi about 2 months ago
- 1 comment
#184 - Switch to NP from Jax to improve attention manager performance
Pull Request -
State: closed - Opened by FanhaiLu1 about 2 months ago
- 1 comment
#183 - Make sure the server does not crash if the input is too long
Issue -
State: open - Opened by qihqi 2 months ago
#182 - [RFC] Formalizing commandline arguments.
Issue -
State: open - Opened by qihqi 2 months ago
#181 - Add offline perf ci
Pull Request -
State: closed - Opened by qihqi 2 months ago
- 6 comments
#180 - Support End To End PagedAttention in JetStream
Pull Request -
State: closed - Opened by FanhaiLu1 2 months ago
#179 - Pa decode checkin 1
Pull Request -
State: closed - Opened by FanhaiLu1 2 months ago
#178 - Update README for new CLI
Pull Request -
State: closed - Opened by qihqi 3 months ago
#177 - Update Jetstream, add optional sampler args.
Pull Request -
State: closed - Opened by qihqi 3 months ago
#176 - Add gemma support in better cli
Pull Request -
State: closed - Opened by qihqi 3 months ago
#175 - Use kwargs to simplify the call sites a bit
Pull Request -
State: closed - Opened by yixinshi 3 months ago
#174 - Add mixtral support to new CLI
Pull Request -
State: closed - Opened by qihqi 3 months ago
#173 - Issues with prefill & generate
Issue -
State: open - Opened by qihqi 3 months ago
#172 - Fix the performance regression with ragged attention on for llama2 7b.
Pull Request -
State: closed - Opened by wang2yn84 3 months ago
- 2 comments
#171 - Replace repeat kv with proper GQA handling.
Pull Request -
State: closed - Opened by wang2yn84 3 months ago
- 3 comments
#170 - fix ray engine crashes on multihost
Pull Request -
State: closed - Opened by sixiang-google 3 months ago
#169 - Error Running `run_ray_serve_interleave` with Llama3 8B
Issue -
State: open - Opened by ryanaoleary 3 months ago
#168 - Add a script to measure speed of basic ops
Pull Request -
State: closed - Opened by qihqi 3 months ago
#167 - Add page attention manager and kvcache manager
Pull Request -
State: closed - Opened by FanhaiLu1 3 months ago
#166 - Add page attention manager and kvcache manager
Pull Request -
State: closed - Opened by FanhaiLu1 3 months ago
#165 - Fix TPU head resource name for v4 and v5e
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#164 - Fix Ray engine crash on multihost
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#163 - Fixed exhausted bug between head and workers
Pull Request -
State: closed - Opened by FanhaiLu1 4 months ago
#162 - Handle v5e-8 in run_ray_serve_interleave
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#161 - Update Ray version in Dockerfile and add v5 configs
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#160 - Add newest llama-3 benchmarks
Pull Request -
State: closed - Opened by qihqi 4 months ago
#159 - V5e8 ray
Pull Request -
State: closed - Opened by FanhaiLu1 4 months ago
#158 - Return np instead of jax array for prefill result tokens
Pull Request -
State: closed - Opened by FanhaiLu1 4 months ago
#157 - Correct typo enbedding -> embedding
Pull Request -
State: closed - Opened by tengomucho 4 months ago
- 1 comment
#156 - commit act quant for conditional ffn
Pull Request -
State: open - Opened by qihqi 4 months ago
#155 - Stacked cache mixtral.
Pull Request -
State: closed - Opened by wang2yn84 4 months ago
#154 - Stacked cache for MLPerf
Pull Request -
State: closed - Opened by wang2yn84 4 months ago
#153 - Add mlperf benchmark for offline for mixtral
Pull Request -
State: closed - Opened by qihqi 4 months ago
- 2 comments
#152 - Set accumulate type to bf16 in activation quant
Pull Request -
State: closed - Opened by lsy323 4 months ago
- 1 comment
#151 - Optimize cache update.
Pull Request -
State: closed - Opened by wang2yn84 4 months ago
- 7 comments
#150 - Ray engine crashes on multihost when fetching Jax.array from prefill_ray
Issue -
State: closed - Opened by richardsliu 4 months ago
- 1 comment
#149 - Fix blockwise sharding
Pull Request -
State: open - Opened by lsy323 4 months ago
#148 - Add mlperf benchmark scripts in-tree.
Pull Request -
State: closed - Opened by qihqi 4 months ago
#147 - Make Ray engine and worker process prefill returning first token
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#146 - Jetstream + RayServe deployment for interleave mode
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#145 - Set JAX_PLATFORMS to "tpu, cpu" for ray worker
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#144 - Fix exception in ray_worker
Pull Request -
State: closed - Opened by richardsliu 4 months ago
#143 - Make prefilling return first token for loadgen integration
Pull Request -
State: closed - Opened by sixiang-google 4 months ago
- 1 comment
#142 - Add server tests
Pull Request -
State: closed - Opened by bvrockwell 4 months ago
- 1 comment
#141 - Update benchmark command in README.md
Pull Request -
State: closed - Opened by bhavya01 5 months ago
#140 - add enable jax profiler to run_server
Pull Request -
State: closed - Opened by bvrockwell 5 months ago
#139 - Update README.md to state the limitation of accessing GCS when conver…
Pull Request -
State: closed - Opened by wang2yn84 5 months ago
#138 - Minor fixes to README
Pull Request -
State: closed - Opened by wang2yn84 5 months ago
#137 - Empty response returned for prompt responses when using run_server_with_ray.py and batch_size > 1
Issue -
State: open - Opened by richardsliu 5 months ago
- 2 comments
#136 - Add layer id in scope for each TransformerBlock layer
Pull Request -
State: closed - Opened by FanhaiLu1 5 months ago
#135 - Checkpoint conversion script breaks for meta-llama/llama-2-7b on HF
Issue -
State: open - Opened by vivianrwu 5 months ago
#134 - prototyping better UX
Pull Request -
State: closed - Opened by qihqi 5 months ago
- 2 comments
#133 - Add left aligned cache support.
Pull Request -
State: closed - Opened by wang2yn84 5 months ago
#132 - fix mixtral quantization scaler axis when dimension > 2
Pull Request -
State: closed - Opened by sixiang-google 5 months ago
#131 - Add test for Mixtral model.
Pull Request -
State: closed - Opened by wang2yn84 5 months ago
#130 - make sure GPU works
Pull Request -
State: closed - Opened by qihqi 5 months ago
#129 - Update README.md
Pull Request -
State: closed - Opened by bhavya01 5 months ago
#128 - Update README.md
Pull Request -
State: closed - Opened by qihqi 5 months ago
#127 - Update submodules, prepare for leasing v0.2.4
Pull Request -
State: closed - Opened by qihqi 5 months ago
- 1 comment
#126 - Add lock in prefill and generate to prevent starvation
Pull Request -
State: closed - Opened by FanhaiLu1 5 months ago
- 1 comment
#125 - Update summary.md
Pull Request -
State: closed - Opened by qihqi 5 months ago
- 1 comment
#124 - Remove JSON config mangling for Gemma ckpt
Pull Request -
State: closed - Opened by lsy323 5 months ago
- 1 comment
#123 - Add different token sampling algorithms to decoder.
Pull Request -
State: closed - Opened by bvrockwell 5 months ago
- 1 comment
#122 - add script to isntall for GPU
Pull Request -
State: closed - Opened by qihqi 5 months ago
- 2 comments
#121 - Fix convert_checkpoint.py for hf and gemma
Pull Request -
State: closed - Opened by qihqi 5 months ago
#120 - Mixtral enablement.
Pull Request -
State: closed - Opened by wang2yn84 5 months ago
- 1 comment
#119 - Add guide on adding HF ckpt conversion support
Pull Request -
State: closed - Opened by lsy323 5 months ago
#118 - Support HF LLaMA ckpt conversion
Pull Request -
State: closed - Opened by lsy323 5 months ago
#117 - Integrate disaggregated serving with JetStream
Pull Request -
State: closed - Opened by FanhaiLu1 5 months ago
#116 - Fix conversion bug
Pull Request -
State: closed - Opened by yeandy 5 months ago
#115 - Bug in model conversion script
Issue -
State: closed - Opened by yeandy 5 months ago
- 2 comments
#114 - Add for readme interleave multiple host with ray
Pull Request -
State: closed - Opened by FanhaiLu1 5 months ago
- 1 comment
#113 - Metrics bug: server_lib should be config_lib
Pull Request -
State: closed - Opened by Bslabe123 6 months ago
#112 - Enable jax profiler server in run with ray
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
#111 - Jetstream: 8128c8a -> v0.2.2
Pull Request -
State: closed - Opened by Bslabe123 6 months ago
#110 - Release JetStream v0.2.2
Pull Request -
State: closed - Opened by JoeZijunZhou 6 months ago
#109 - Add run_server with ray for interleave serving
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
#108 - Update Jetstream commit id
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
#107 - Return Tuple(interleaveEngList, prefillEngineList, decodeEngineList) in create ray engine
Issue -
State: open - Opened by FanhaiLu1 6 months ago
#106 - Ray Disaggregated Serving MVP
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
- 2 comments
#105 - Add activation quantization support to per-channel quantized linear layers
Pull Request -
State: closed - Opened by lsy323 6 months ago
#104 - Fix convert script cannot generate bf16 weights
Pull Request -
State: closed - Opened by lsy323 6 months ago
#103 - Update run_interactive.py with finer control of profiler.
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
#102 - Update run_server.py. metrics_server_config is not supported in JetStream[8128c8a] yet
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
- 2 comments
#101 - Add support for Llama3-70b
Pull Request -
State: closed - Opened by bhavya01 6 months ago
- 3 comments
#100 - Fix ray conflict changes
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
- 2 comments
#99 - Pass metrics client config through to Jetstream
Pull Request -
State: closed - Opened by Bslabe123 6 months ago
- 1 comment
#98 - Fix gemma model, enable_weight_quantization is available through quant_config.
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
- 1 comment
#97 - Update README.md, the quantize flag is no longer available, quantize_type assumes the role of the original flag.
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
- 1 comment
#96 - Fix flax and ray dependencies
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
#95 - Fixes tests. Can now run on CPU by default.
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
- 4 comments
#94 - Add regression test to detect service broken and performance degradation
Issue -
State: open - Opened by FanhaiLu1 6 months ago
- 2 comments
#93 - Integrates ragged attention to JetStream Pytorch
Pull Request -
State: closed - Opened by wang2yn84 6 months ago
#92 - Move flags in scripts to a common function
Pull Request -
State: closed - Opened by lsy323 6 months ago
#91 - Update README.md
Pull Request -
State: closed - Opened by qihqi 6 months ago
#90 - Leverage tokens_utils to process result tokens
Pull Request -
State: closed - Opened by FanhaiLu1 6 months ago
#89 - Move deps to git submodule
Pull Request -
State: closed - Opened by qihqi 6 months ago
#88 - Update version of jetstream; misc fixes
Pull Request -
State: closed - Opened by qihqi 6 months ago