lucidrains/audiolm-pytorch issues and pull requests

#286 - no positional info used for SemanticTransformer

Issue - State: open - Opened by drwangxian 5 days ago - 3 comments

#285 - big change identified in Attention module

Issue - State: open - Opened by drwangxian 7 days ago - 2 comments

#284 - Can the tokenizer and synthesizer be made accessible separately?

Issue - State: open - Opened by 6cubed 12 days ago - 1 comment

#283 - Install fails due to fairseq dependency issue

Issue - State: closed - Opened by jbwallace123 about 2 months ago - 1 comment

#282 - Description of wandb logs

Issue - State: open - Opened by anuragkumar95 2 months ago

#281 - CUDA error: an illegal memory access was encountered

Issue - State: open - Opened by bulieme 2 months ago

#280 - Does SoundStream support streaming inference? If so, can you provide the relevant code?

Issue - State: open - Opened by dengcunqin 2 months ago

#279 - SoundStream training hangs using accelerate launch and the use_finite_scalar_quantizer=True setting

Issue - State: closed - Opened by ThomasLWang 3 months ago - 23 comments

#278 - How to use multiple GPUs for training and inference?

Issue - State: open - Opened by Oyiyi 5 months ago

#277 - what's your loss rate?

Issue - State: open - Opened by Oyiyi 5 months ago

#276 - Trying to overfit SoundStream

Issue - State: open - Opened by hishammadcor 6 months ago

#275 - Why is Encodec only encoding 1 frame?

Issue - State: closed - Opened by sivannavis 8 months ago - 1 comment

#274 - checkpoint

Issue - State: open - Opened by why414 10 months ago

#273 - Classifier for detecting synthetic speech

Issue - State: open - Opened by Ashigarg123 10 months ago

#272 - Model cascade training

Issue - State: open - Opened by a897456 11 months ago

#271 - Audiolm as an embedder model?

Issue - State: open - Opened by Darel13712 11 months ago

#270 - Soundstream training using birdsongs. Any guidance appreciated!

Issue - State: open - Opened by haydensflee 12 months ago

#269 - About get_embeds function

Issue - State: open - Opened by jihoojung0106 12 months ago - 1 comment

#268 - Why not use the output of Attention in Transformer?

Issue - State: closed - Opened by jihoojung0106 almost 1 year ago

#267 - AssertionError: File Not Found: data/hyp.scratch.yaml

Issue - State: closed - Opened by zrshello about 1 year ago - 1 comment

#266 - skip the eos when adding offset to avoid overlapping

Pull Request - State: closed - Opened by biendltb about 1 year ago - 5 comments

#265 - fix wrong tensor assignment of the output of attention

Pull Request - State: closed - Opened by biendltb about 1 year ago - 1 comment

#264 - Training dataset

Issue - State: open - Opened by hahust191806 about 1 year ago

#263 - Missing softmax after Linear layer

Issue - State: closed - Opened by biendltb about 1 year ago - 1 comment

#262 - Cannot retrieve dependency version for gateloop-transformer>=0.5.2, possible regression?

Issue - State: closed - Opened by afreemanio about 1 year ago - 1 comment

#261 - Removal of the last token id from fine_token_ids in FineTransformerWrapper.forward()

Issue - State: closed - Opened by biendltb about 1 year ago - 1 comment

#260 - Fix #259

Pull Request - State: closed - Opened by orrp about 1 year ago - 1 comment

#259 - `data_max_length_seconds` causes typecheck error in `CoarseTransformerTrainer`

Issue - State: closed - Opened by orrp about 1 year ago

#258 - `use_wandb_tracking` was not stored in most Trainers when it is `False`

Pull Request - State: closed - Opened by orrp about 1 year ago - 1 comment

#257 - Added wandb tracking to SemanticTransformerTrainer, CoarseTransformerTrainer, and FineTransformerTrainer

Pull Request - State: closed - Opened by LukasNel about 1 year ago - 1 comment

#256 - Soundstream discriminator clip_grad_norm - some params are not clipped.

Issue - State: closed - Opened by avihu111 about 1 year ago - 3 comments

#255 - Gradient Issue when Finetuning

Issue - State: closed - Opened by tysonjordan about 1 year ago

#254 - Error in exporting soundstream to onnx

Issue - State: open - Opened by kalradivyanshu about 1 year ago - 14 comments

#253 - Only noise as a result

Issue - State: open - Opened by mpastewski about 1 year ago - 4 comments

#251 - Update RVQ projection layers during training

Pull Request - State: closed - Opened by ilya16 about 1 year ago - 1 comment

#249 - Question: Random semantic embedding in SemanticTransformer?

Issue - State: open - Opened by stg1205 about 1 year ago - 1 comment

#248 - bugfix - swap codec variable for coarse wrapper

Pull Request - State: closed - Opened by rgxb2807 about 1 year ago - 1 comment

#247 - Thank you very much for your work. But when I train the SoundStream model, why does it need a pre-trained Encodec, and why does it then raise an error?

Issue - State: closed - Opened by DingWeiPeng about 1 year ago

#246 - IndexError Using Encodec and setting return_coarse_generated_wave=True

Issue - State: closed - Opened by rgxb2807 about 1 year ago - 5 comments

#244 - Fixed typo in README.md

Pull Request - State: closed - Opened by y4umeng about 1 year ago

#243 - Bugfix - Fixing validation dataset variable on FineTransformerTrainer

Pull Request - State: closed - Opened by rgxb2807 over 1 year ago - 1 comment

#242 - Question: Any way to specify validation dataset for SemanticTransformer, CoarseTransformer and FineTransformer?

Issue - State: closed - Opened by rgxb2807 over 1 year ago - 2 comments

#241 - Question: Checkpoint of the model

Issue - State: open - Opened by fernandals over 1 year ago - 1 comment

#240 - Question: Are there any workarounds for using DeepSpeed for multi-GPU training?

Issue - State: open - Opened by rgxb2807 over 1 year ago - 4 comments

#239 - Question: How to load PyTorch format as HubertWithKmeans?

Issue - State: open - Opened by Selectorrr over 1 year ago

#238 - Question on discrepancy between original data and reconstructed data sizes

Issue - State: open - Opened by tysonjordan over 1 year ago - 1 comment

#236 - Bug in generation when generating with Encodec

Issue - State: closed - Opened by FrancescoVV over 1 year ago - 7 comments

#235 - Having trouble generating semantic tokens using the demo code

Issue - State: closed - Opened by dwangF0 over 1 year ago - 2 comments

#234 - multi-gpu training not working with accelerate

Issue - State: closed - Opened by FrancescoVV over 1 year ago - 13 comments

#233 - Does VALL-E follow the same semantic/coarse hierarchical structure as AudioLM?

Issue - State: closed - Opened by williamluer over 1 year ago - 3 comments

#231 - Likely beartype package breakage

Issue - State: closed - Opened by rsxdalv over 1 year ago - 1 comment

#230 - Question about 'attention bias not supported for flash attention'

Issue - State: open - Opened by amitaie over 1 year ago - 2 comments

#229 - Error when running 3rd cell in demo ipynb

Issue - State: closed - Opened by uday18git over 1 year ago

#228 - Code Refactoring

Pull Request - State: closed - Opened by tosemml over 1 year ago - 1 comment

#227 - pretrained soundstream weights?

Issue - State: open - Opened by muazhuda over 1 year ago - 1 comment

#226 - Dependency error

Issue - State: closed - Opened by amrzv over 1 year ago - 4 comments

#225 - bandwidth params do not work!

Issue - State: closed - Opened by wotulong over 1 year ago - 3 comments

#224 - Inconsistent samples for multiple targets in SoundDataset

Issue - State: closed - Opened by ilya16 over 1 year ago - 2 comments

#223 - Average validation loss across grad_accum_every

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#222 - Dataloader save

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 6 comments

#221 - Soundstream Training Goes From Great to Horrible

Issue - State: open - Opened by adamfils over 1 year ago - 11 comments

#219 - Hi. If I want to train a text-to-Chinese-speech AudioLM, what confuses me is that the pretrained HuBERT model is English-based. Does that affect my Chinese training, or do I have to retrain HuBERT on my own large Chinese dataset? Much appreciated! @lucidrains

Issue - State: closed - Opened by hyhzl over 1 year ago - 4 comments

#218 - Is resampling needed when using EnCodec?

Issue - State: open - Opened by m-pana over 1 year ago - 7 comments

#216 - yolov5

Issue - State: open - Opened by fangwei888 over 1 year ago - 1 comment

#215 - AssertionError: only one Trainer can be instantiated at a time for training

Issue - State: open - Opened by tiansiyuan over 1 year ago - 2 comments

#214 - Separate transformer and trainer checkpoint load logic

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#213 - How to run the inference?

Issue - State: closed - Opened by LianaN over 1 year ago - 6 comments

#212 - TypeError: cannot unpack non-iterable NoneType object

Issue - State: closed - Opened by tiansiyuan over 1 year ago - 16 comments

#211 - When running the example code, I get an error where the trainer says to be instantiated twice:

Issue - State: closed - Opened by LukasNel over 1 year ago - 6 comments

#210 - accelerate's wait_for_everyone hangs on the final step of training coarse/fine transformer

Issue - State: closed - Opened by LWprogramming over 1 year ago - 4 comments

#209 - Accelerate failing on multi-gpu rng synchronization

Issue - State: closed - Opened by LWprogramming over 1 year ago - 15 comments

#208 - Question about length of data in training \ generating

Issue - State: closed - Opened by amitaie over 1 year ago

#207 - Implement accelerate support for semantic/coarse/fine transformers

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 2 comments

#206 - Change torch.no_grad() to torch.inference_mode()

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#205 - Max length fix

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#204 - Questions about training Soundstream: poor intelligibility and gradients explosion after 10k steps. (sr=16k, B=96)

Issue - State: open - Opened by Makiyuyuko over 1 year ago - 1 comment

#203 - OpenBLAS/OpenMP Loop error message

Issue - State: closed - Opened by LWprogramming over 1 year ago - 8 comments

#201 - generation form of the inference

Issue - State: closed - Opened by Hit1ron over 1 year ago - 4 comments

#200 - Eos handling

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 3 comments

#199 - Audio generation failing at FineTransformer

Issue - State: closed - Opened by LWprogramming over 1 year ago - 15 comments

#198 - pin sklearn exactly to 0.24.0

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 7 comments

#197 - 'SemanticTransformerWrapper' object has no attribute 'embed_text'

Issue - State: closed - Opened by jinyuli over 1 year ago - 4 comments

#196 - A problem with EncodecWrapper()

Issue - State: closed - Opened by Leezp99 over 1 year ago - 3 comments

#195 - Bug fix in encodec.py

Pull Request - State: closed - Opened by yang1fan2 over 1 year ago - 1 comment

#194 - `MultiScaleDiscriminator` differs from paper

Issue - State: closed - Opened by haydenshively over 1 year ago - 3 comments

#193 - Improve ComplexConv2d FSDP compatibility

Pull Request - State: closed - Opened by haydenshively over 1 year ago - 3 comments

#192 - Question about the generate

Issue - State: open - Opened by asr-pub over 1 year ago

#191 - Poor audio quality

Issue - State: closed - Opened by cpdu over 1 year ago - 1 comment

#190 - a small question about the loss function

Issue - State: closed - Opened by PB20000090 over 1 year ago - 1 comment

#189 - Fix off-by-one error in train step update

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#188 - Fix path type

Pull Request - State: closed - Opened by LWprogramming over 1 year ago

#187 - Load from correct self.steps to resume training

Pull Request - State: closed - Opened by LWprogramming over 1 year ago - 1 comment

#186 - question about the semantic process

Issue - State: closed - Opened by asr-pub over 1 year ago - 3 comments

#185 - Loss about CoarseTransformerWrapper

Issue - State: closed - Opened by asr-pub over 1 year ago - 2 comments

#184 - Something wrong when I use the “soundstream” repo

Issue - State: open - Opened by wangyuxuan11 over 1 year ago - 1 comment

#183 - training data

Issue - State: open - Opened by linlongrd over 1 year ago

#182 - can not install audiolm-pytorch

Issue - State: closed - Opened by linlongrd over 1 year ago - 4 comments

#181 - /path/to/audio/files

Issue - State: open - Opened by linlongrd over 1 year ago

#179 - more demo needed

Issue - State: open - Opened by fire-keeper almost 2 years ago

#177 - Use a pretrained model as a discriminator (and for feature maps)

Issue - State: closed - Opened by turian almost 2 years ago - 5 comments
