Ecosyste.ms: Issues
An open API service for providing issue and pull request metadata for open source projects.
GitHub / tensorflow/mesh issues and pull requests
#396 - Error while importing Meshtensorflow
Issue -
State: closed - Opened by billygrahamram 11 months ago
#395 - Migrate references and remove legacy target tpu:tpu_estimator.
Pull Request -
State: closed - Opened by copybara-service[bot] about 1 year ago
#394 - Update attention.py
Pull Request -
State: open - Opened by sjw8793 about 1 year ago
- 1 comment
#393 - Optimizer momentums not properly populated training model with DTensors
Issue -
State: closed - Opened by pentney about 1 year ago
- 1 comment
#392 - AttributeError: module 'tensorflow.python.framework.ops' has no attribute 'register_tensor_conversion_function'
Issue -
State: closed - Opened by Xnhyacinth about 1 year ago
- 4 comments
#391 - Does load-balanced loss help the loss converge?
Issue -
State: open - Opened by mathfinder over 1 year ago
#389 - Move `convert_to_tensor`, `convert_to_tensor_v1`, `convert_to_tensor_v1_with_dispatch`, `convert_to_tensor_v2_with_dispatch`, and `convert_to_tensor_v2` into `tensor_conversion_registry`.
Pull Request -
State: open - Opened by copybara-service[bot] over 1 year ago
#388 - feat(ci): enable `pip` caching in CI
Pull Request -
State: closed - Opened by SauravMaheshkar over 1 year ago
- 1 comment
#387 - Remove legacy references from `ops.py`.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 2 years ago
#386 - Remove legacy references from `ops.py`.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 2 years ago
#385 - Fix docstring typos
Pull Request -
State: closed - Opened by copybara-service[bot] about 2 years ago
- 1 comment
#384 - Enable multi-file inference
Pull Request -
State: closed - Opened by copybara-service[bot] about 2 years ago
- 1 comment
#383 - When running BERT on GPU: Resource exhausted: failed to allocate memory
Issue -
State: open - Opened by Currycurrycurry about 2 years ago
- 1 comment
#382 - Internal change
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
- 1 comment
#381 - Make mesh_tensorflow's call of `get_replicated_var_handle` backward-compatible with tf <= 2.8.0. Fixes https://github.com/google-research/text-to-text-transfer-transformer/issues/1020.
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
#380 - bump version number to release updated PyPI package that includes last year enhancements
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
#379 - Getting "NanLossDuringTrainingError: NaN loss during training."
Issue -
State: open - Opened by dhruval-p over 2 years ago
#378 - mask_1_flat and mask_2_flat applied to gates twice?
Issue -
State: open - Opened by marhlder over 2 years ago
#377 - Explicitly import estimator from tensorflow as a separate import instead of accessing it via tf.estimator and depend on the tensorflow estimator target.
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
#376 - Remove unused comments related to Python 2 compatibility.
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
#375 - Make TPU variable name deterministic.
Pull Request -
State: closed - Opened by copybara-service[bot] over 2 years ago
#374 - Adding a new Gradient Estimator for Routing using REINFORCE with a leave-one-out baseline.
Pull Request -
State: open - Opened by copybara-service[bot] over 2 years ago
#373 - #HyperPrompt Part 2 of HyperPrompt implementation: the actual computation of HyperPrompt inside self-attention layer.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 3 years ago
#372 - Use math.gcd instead of fractions.gcd; the latter is deprecated since Python 3.5 and removed in 3.9.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 3 years ago
#371 - Split out optimizer call for internal purposes.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 3 years ago
#370 - fix typo in logging statement.
Pull Request -
State: closed - Opened by copybara-service[bot] almost 3 years ago
#369 - About the mixture of expert model
Issue -
State: open - Opened by fym0503 almost 3 years ago
#368 - Mesh-tf model conversion to onnx?
Issue -
State: open - Opened by b-analyst about 3 years ago
- 2 comments
#367 - Minor comment fix to refer to the correct argument name.
Pull Request -
State: open - Opened by copybara-service[bot] about 3 years ago
Labels: cla: yes
#366 - Make sure gates are not normalized for n=1 for top_n routing
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
- 3 comments
Labels: cla: no
#365 - Fix some example code in readme for einsum operation
Pull Request -
State: open - Opened by baragona about 3 years ago
- 2 comments
Labels: cla: yes
#364 - How to freeze embedding layers
Issue -
State: open - Opened by lintangsutawika about 3 years ago
#363 - Add a link to the Primer paper
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
- 4 comments
Labels: cla: no
#362 - Beam search
Issue -
State: open - Opened by antonio-mastropaolo about 3 years ago
#361 - Output raw model outputs during eval
Pull Request -
State: open - Opened by craffel about 3 years ago
Labels: cla: yes
#360 - Add utility to save score predictions to TFRecords for scoring large datasets.
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
Labels: cla: yes
#359 - Save scores lazily.
Pull Request -
State: open - Opened by copybara-service[bot] about 3 years ago
Labels: cla: yes
#358 - Remove unnecessary name and cwise in squared relu.
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
Labels: cla: yes
#357 - Expert Attention Fixes:
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
- 3 comments
Labels: cla: no
#356 - Squared ReLU from Primer paper.
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
Labels: cla: yes
#355 - Internal
Pull Request -
State: closed - Opened by copybara-service[bot] about 3 years ago
- 18 comments
Labels: cla: no
#354 - Remove dataset checkpoint policy override now that b/181765832 is resolved.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#353 - Add more extensive top-2 logging.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#352 - Ability to add Custom Tensorflow Hooks
Issue -
State: open - Opened by trisongz over 3 years ago
#351 - Only add z_loss to losses if during training.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#350 - Expert Attention Fixes:
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#349 - Fix bug in shared_kv attention for autoregressive decoding.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 2 comments
Labels: cla: no
#348 - Change second d_model_split dim's size to be the output shape, instead of input shape. This allows it to work for layers where the input size is different than the output size.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#347 - heterogeneous mixture of experts layer
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 5 comments
Labels: cla: no
#346 - Add more options to Experts Attention. These options remove 1/3 of the all2all communication costs:
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 2 comments
Labels: cla: no
#345 - Update mesh tensorflow to use device assignments to map logical to physical processor numbers on N-D Meshes. Currently only enabled when logical cores per replica is set to 1.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 5 comments
Labels: cla: no
#344 - Add in Z-loss to all routing algorithms.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#343 - Minor changes to make Experts Attention work.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 6 comments
Labels: cla: no
#342 - MODE models with heterogeneous expert width
Pull Request -
State: open - Opened by copybara-service[bot] over 3 years ago
- 1 comment
Labels: cla: no
#341 - Add top-n routing, which generalizes top-2 routing. Improves model quality for larger capacity factors (e.g. 2.0+).
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#340 - Add z_loss on all attention logits. This does not change model quality and can effectively decrease the attention logits by an order of magnitude.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#339 - Using the soft loss dtype instead of hardcoding bfloat16.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#338 - Fix casting for NTLB.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#337 - Switch logging to warn to not fail when using deterministic dataset checkpointing.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#336 - Next gen fish optimizations for MeshTF.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 4 comments
Labels: cla: no
#335 - Add z-loss to the top_2_gating method.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#334 - Add z-loss to the top_2_gating method.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#333 - Option to add a unique suffix to eval subdirectories. Allows to easily have many different eval jobs going with a single training job.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 1 comment
Labels: cla: no
#332 - Add option to stochastically use the non-top expert during training.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#331 - Allow token embeddings to be used for routing decisions.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#330 - [MOE-transformer] How do you build static graph of MOE-Model?
Issue -
State: open - Opened by imyzx2017 over 3 years ago
#329 - Option to use mtf.Print to log which tokens are sent to which experts when run on CPU.
Pull Request -
State: open - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#328 - How to use tf.contrib.opt.ScipyOptimizerInterface or tfp.optimizer.lbfgs_minimize with MeshTF ?
Issue -
State: open - Opened by harshil-patel-code over 3 years ago
#327 - Make directory if it doesn't exist.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#326 - Splitting tokens when routing
Pull Request -
State: open - Opened by copybara-service[bot] over 3 years ago
- 2 comments
Labels: cla: no
#325 - Log expert_gating once it has been masked by the importance tensor to be sure no padded probabilities are being logged.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 10 comments
Labels: cla: no
#324 - How to assign values to specific slice of a data block on a specific GPU?
Issue -
State: open - Opened by harshil-patel-code over 3 years ago
#323 - Unique variable names for ParallelLayer
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#322 - Modified the `eval_model` function in mesh_tensorflow/transformer/utils.py to accept Summary protos in addition to tag-to-scalar dicts.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 6 comments
Labels: cla: no
#321 - Add original AI2 version of c4 v3.0.1, ND3 deduplicated with param = 0.8, and LM1B, Wiki40B, and lm_first_len512 versions of original AI2 C4 and ND3 deduped AI2 C4 for evaluation.
Pull Request -
State: open - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#320 - Add `cast` preprocessor and add tasks for inference prompts for deduplication project.
Pull Request -
State: open - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#319 - Use %g instead of %f for printing in mesh_tensorflow/transformer/utils.py.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 5 comments
Labels: cla: no
#318 - performing the opposite of mtf.lowering
Issue -
State: open - Opened by DavidPeleg6 over 3 years ago
- 1 comment
#317 - Rolls back a change that broke several clients.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 4 comments
Labels: cla: no
#316 - Minor fix to make sure printing does not crash if a filter_fn is used.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#315 - Internal only change : )
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#314 - Explicitly pass named-arg to mtf.dropout
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#313 - Fix ALBERT arXiv URL
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#312 - [MTF] Minor usability change in get_inputs_from_file for accidentally empty files.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#311 - Add in z_loss for router softmax for switch layer.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#310 - try to create gin related flags and pass if the flags are created.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#309 - Allow for not scaling certain parameters updates by its norm in Adafactor. Also add a parameter to allow for changing the Adafactor decay rate.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#308 - no public changes
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#307 - Add flexible checkpoint loading option to allow for loading checkpoints
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#303 - Add new routing method where each expert chooses which tokens it wants. A token can be chosen multiple times across different experts.
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
- 3 comments
Labels: cla: no
#302 - Operation to linearly anneal dropout rate between start_step and end_step
Pull Request -
State: closed - Opened by copybara-service[bot] over 3 years ago
Labels: cla: yes
#291 - Add loss functions for multiple-target objectives for distillation.
Pull Request -
State: open - Opened by copybara-service[bot] almost 4 years ago
- 2 comments
Labels: cla: no
#290 - Use multiple target objectives for distillation. Also see cl/356382304
Pull Request -
State: open - Opened by copybara-service[bot] almost 4 years ago
- 2 comments
Labels: cla: no
#289 - Change get_replicated_var_handle to accept resource tensors instead of variables
Pull Request -
State: closed - Opened by copybara-service[bot] almost 4 years ago
Labels: cla: yes
#283 - internal
Pull Request -
State: open - Opened by copybara-service[bot] almost 4 years ago
- 1 comment
Labels: cla: no
#281 - Decode Unicode strings in inference mode.
Pull Request -
State: open - Opened by copybara-service[bot] almost 4 years ago
- 1 comment
Labels: cla: no
#278 - the `model_executor.py` example is broken
Issue -
State: closed - Opened by XMaster96 almost 4 years ago
#259 - Fixing model export breakage.
Pull Request -
State: open - Opened by copybara-service[bot] almost 4 years ago
- 1 comment
Labels: cla: no
#235 - Debug in mesh Tensorflow
Issue -
State: open - Opened by patrickvonplaten about 4 years ago
- 3 comments
#181 - Future of this project?
Issue -
State: open - Opened by Mistobaan about 4 years ago
- 2 comments