facebookresearch/av_hubert issues and pull requests

#121 - Enquiry about Visual Hubert pre-training procedures, hyper-parameter settings and Visual Hubert model parameters.

Issue - State: open - Opened by yiwang454 6 months ago

#120 - decode text with temporal or duration information

Issue - State: open - Opened by Neil-Shah 6 months ago

#119 - Can I use this work to generate/predict new audio frame in real time?

Issue - State: open - Opened by felixshing 7 months ago

#118 - 1. Differences between "extract_finetune" and "extract_features" and 2. extract discrete unit feature

Issue - State: open - Opened by hyunbin70 7 months ago

#117 - How to extract embeddings from a specific layer

Issue - State: open - Opened by shakeel608 7 months ago

#116 - To load the previous model when doing double finetuning,

Issue - State: open - Opened by Peter-SungwooCho 8 months ago

#115 - Reproducing base_noise_pt_noise_ft_30h.pt

Issue - State: open - Opened by nobel861017 8 months ago

#114 - Unable to Locate mix_babble.py for LRS3 Audio Noise Preparation

Issue - State: closed - Opened by cyzhung 10 months ago

#113 - The counts of correct predictions for both masked and unmasked tokens are considerably low.

Issue - State: open - Opened by soloistzy 11 months ago

#112 - How to train model on Urdu dataset

Issue - State: open - Opened by ronit450 12 months ago - 1 comment

#111 - Inference for AVSR

Issue - State: closed - Opened by AAISSJ 12 months ago

#110 - Too many CPU resources for fine-tuning

Issue - State: open - Opened by sausage-333 about 1 year ago - 2 comments

#109 - a recipe and checkpoint for CTC decoding using CMUDict

Issue - State: open - Opened by hyunbin70 about 1 year ago - 1 comment

#108 - Errors about loading a pretrained model

Issue - State: open - Opened by wuliting-wlt about 1 year ago

#107 - best_checkpoint_metric is accuracy even for finetuning?

Issue - State: closed - Opened by PussyCat0700 about 1 year ago - 2 comments

#106 - How to load a pre-trained AVHuBERT? (problems after following the instructions)

Issue - State: open - Opened by CCTN-BCI about 1 year ago - 6 comments

#105 - Update README.md

Pull Request - State: closed - Opened by SUGE2016 about 1 year ago - 1 comment
Labels: CLA Signed

#104 - Problem Extracting Audio-Only Embeddings using AV-HuBERT

Issue - State: closed - Opened by david-gimeno about 1 year ago - 1 comment

#103 - I'm trying to pretrain a model with another language

Issue - State: open - Opened by sympathize123 about 1 year ago - 1 comment

#102 - Errors with distributed training on a single node

Issue - State: open - Opened by roudimit about 1 year ago - 1 comment

#101 - Concatenation of features from multiple videos

Issue - State: closed - Opened by snapfinger about 1 year ago - 1 comment

#100 - AssertionError

Issue - State: open - Opened by mabaisen over 1 year ago - 1 comment

#99 - Running AVSR model in colab causes kernel to restart

Issue - State: open - Opened by nastia-lado over 1 year ago - 3 comments

#98 - sentences_.empty() when setting up data directory

Issue - State: closed - Opened by nastia-lado over 1 year ago

#97 - Decoding with AV-HuBERT in colab

Issue - State: closed - Opened by nastia-lado over 1 year ago

#96 - HuBERT Pre-training for the Second Iteration without Previous Checkpoints?

Issue - State: closed - Opened by jojonki over 1 year ago - 1 comment

#95 - AV-huBERT as backbone for a slightly different task

Issue - State: open - Opened by gadese-vooban over 1 year ago - 3 comments

#94 - How can I fine-tune the pretrained model on my own dataset like video-emotion8

Issue - State: open - Opened by JinChow over 1 year ago

#93 - Batch normalization with ResNet encoder on 0-padded videos

Issue - State: closed - Opened by roudimit over 1 year ago - 2 comments

#92 - How to adapt or train AV-HuBERT for other languages?

Issue - State: open - Opened by cooelf over 1 year ago - 1 comment

#91 - Error loading AVSR model

Issue - State: open - Opened by llamasrock almost 2 years ago - 3 comments

#90 - Minor modifications to support MuAViC trained models

Pull Request - State: closed - Opened by Anwarvic almost 2 years ago
Labels: CLA Signed

#89 - Request for Base model pre-trained on multi-lingual data

Issue - State: open - Opened by xiabingquan almost 2 years ago - 1 comment

#88 - How to decode without any label files

Issue - State: closed - Opened by xinluyu1 almost 2 years ago - 2 comments

#87 - Release of clustering models

Issue - State: open - Opened by mtran14 almost 2 years ago - 3 comments

#86 - A problem of training a new model

Issue - State: open - Opened by TanYuChen1 almost 2 years ago - 2 comments

#85 - Extraction of features with AV HuBERT

Issue - State: open - Opened by shakeel608 almost 2 years ago - 14 comments

#84 - Cannot register duplicate model (av_hubert)

Issue - State: open - Opened by shakeel608 almost 2 years ago - 1 comment

#83 - Format of pretrain data in the LRS3 dataset

Issue - State: open - Opened by hungnv21292 about 2 years ago - 2 comments

#82 - Value expected for ${layer}-th transformer layer of a trained AV-HuBERT model saved at ${ckpt_path}

Issue - State: open - Opened by JeetShah25 about 2 years ago - 1 comment

#81 - What's the difference between A/MFCC→A and A/MFCC→AV in the paper?

Issue - State: closed - Opened by PussyCat0700 about 2 years ago - 1 comment

#80 - How to finetune on AVSR setting?

Issue - State: closed - Opened by PussyCat0700 about 2 years ago - 3 comments

#79 - Fixing Colab

Issue - State: open - Opened by hideosnes about 2 years ago - 1 comment

#78 - non-deterministic results when decoding with noises

Issue - State: closed - Opened by joyolee about 2 years ago - 5 comments

#77 - Error in step 2 of preprocessing, what values to put in ${rank} and ${nshard}

Issue - State: closed - Opened by JeetShah25 about 2 years ago - 5 comments

#76 - How to train a LM used for decoding

Issue - State: open - Opened by li563042811 about 2 years ago - 2 comments

#75 - issue during 1st iteration of pretraining

Issue - State: open - Opened by sungheedong about 2 years ago - 1 comment

#74 - Update hubert_dataset.py

Pull Request - State: open - Opened by minkyu119 about 2 years ago
Labels: CLA Signed

#73 - Update hubert_dataset.py

Pull Request - State: closed - Opened by minkyu119 about 2 years ago - 1 comment

#72 - Question about the use of CMUDict in CTC finetuning

Issue - State: closed - Opened by PussyCat0700 about 2 years ago - 3 comments

#71 - preparation-cnn_face_detector

Issue - State: open - Opened by sungheedong over 2 years ago - 1 comment

#70 - ImportError: cannot import name 'metrics' from 'fairseq' (unknown location)

Issue - State: closed - Opened by PussyCat0700 over 2 years ago - 6 comments

#69 - infer_s2s.py: Load dataset (possibly sharded) ???

Issue - State: open - Opened by david-gimeno over 2 years ago - 2 comments

#68 - How to only perform test?

Issue - State: closed - Opened by BDHU over 2 years ago - 2 comments

#67 - FileNotFoundError

Issue - State: closed - Opened by BDHU over 2 years ago - 2 comments

#66 - Question on result of pretrain 433h and finetune 30h on LRS3

Issue - State: open - Opened by li563042811 over 2 years ago - 5 comments

#65 - How to align and fuse acoustic and visual features

Issue - State: open - Opened by mysxs over 2 years ago - 2 comments

#64 - Checkpoints of finetuned AVSR models without VoxCeleb2 data?

Issue - State: closed - Opened by ALIVE321 over 2 years ago - 4 comments

#63 - Finetuning Models for Visual Speech Recognition

Issue - State: open - Opened by david-gimeno over 2 years ago - 9 comments

#62 - How to get the ${nshard} value and ${rank} value? Are they random numbers?

Issue - State: open - Opened by SE-Nickjackson over 2 years ago - 3 comments

#61 - pretrain log issue

Issue - State: open - Opened by li563042811 over 2 years ago - 5 comments

#60 - How to find the English files in VoxCeleb2

Issue - State: closed - Opened by joyolee over 2 years ago - 2 comments

#59 - Can you open source extracted facial landmark?

Issue - State: closed - Opened by qszhum over 2 years ago - 2 comments

#58 - 'task.input_modality' is not used in the pre-training

Issue - State: closed - Opened by qszhum over 2 years ago - 2 comments

#57 - problem with the skvideo library during the data preparation stage

Issue - State: closed - Opened by qszhum over 2 years ago - 2 comments

#56 - Generating adversarial examples for av_hubert

Issue - State: closed - Opened by ashwath98 over 2 years ago - 2 comments

#55 - model register problem with multiple pre-trained models

Issue - State: closed - Opened by chrisole over 2 years ago - 4 comments

#54 - Cannot make inferences from videos over 30 seconds with Colab example

Issue - State: open - Opened by Zepplin18 over 2 years ago - 2 comments

#53 - Using an untrained model, equivalent to the pre-trained model

Issue - State: closed - Opened by miraodasilva over 2 years ago - 2 comments

#52 - Question about freeze-finetuning-updates

Issue - State: closed - Opened by YadiraRoCa over 2 years ago - 3 comments

#51 - fairseq

Issue - State: closed - Opened by mysxs over 2 years ago - 7 comments

#50 - Cython error during pre-training

Issue - State: open - Opened by Aaryan369 over 2 years ago - 1 comment

#49 - Audio/Video data augmentation

Issue - State: closed - Opened by YUCHEN005 over 2 years ago - 2 comments

#48 - Error while training a new model

Issue - State: closed - Opened by Aaryan369 over 2 years ago - 2 comments

#47 - Removing unnecessary flags and updating help section in lrs3_manifest.py

Pull Request - State: open - Opened by Aaryan369 almost 3 years ago
Labels: CLA Signed

#46 - [Easy Question] Sequence length

Issue - State: closed - Opened by JuanFMontesinos almost 3 years ago - 2 comments

#45 - Incorrect test results

Issue - State: closed - Opened by mysxs almost 3 years ago - 20 comments

#44 - Convert AV-HuBERT Model into ONNX Format

Issue - State: open - Opened by xuan97916 almost 3 years ago - 1 comment

#43 - decode issue

Issue - State: open - Opened by EvAlex01 almost 3 years ago - 1 comment

#42 - No dictionary error when inference provided finetuned model

Issue - State: open - Opened by Kuzhuahu almost 3 years ago - 3 comments

#41 - LRS3 433h pretrain configuration

Issue - State: open - Opened by jxzhanggg almost 3 years ago - 3 comments

#40 - How to fine-tune with my own dataset

Issue - State: closed - Opened by YadiraRoCa almost 3 years ago - 22 comments

#39 - Pseudo-labels of self-training

Issue - State: closed - Opened by joyolee almost 3 years ago - 1 comment

#38 - How to extract audio-visual features?

Issue - State: open - Opened by zuujhyt almost 3 years ago - 2 comments

#37 - How to download data from https://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html link in Linux terminal?

Issue - State: open - Opened by muxiddin19 almost 3 years ago - 3 comments

#36 - ValueError: Cannot register duplicate model (av_hubert)

Issue - State: open - Opened by muxiddin19 almost 3 years ago - 6 comments

#35 - OOM when finetuning using multi-GPUs

Issue - State: open - Opened by xuan97916 almost 3 years ago - 2 comments

#34 - CTC decoding script

Issue - State: closed - Opened by joyolee almost 3 years ago - 1 comment

#33 - Finetuning parameter mismatch between paper and configs

Issue - State: closed - Opened by timolohrenz almost 3 years ago - 3 comments

#32 - Questions about pre-training an AV-HuBERT model.

Issue - State: closed - Opened by jc-hou almost 3 years ago - 2 comments

#31 - LRS3 data

Issue - State: closed - Opened by zuujhyt almost 3 years ago - 2 comments

#30 - ValueError: need at least one array to stack

Issue - State: closed - Opened by YUCHEN005 almost 3 years ago - 3 comments

#29 - reproducibility of the first iter.

Issue - State: open - Opened by jxzhanggg almost 3 years ago - 2 comments

#28 - pip install --editable ./

Issue - State: open - Opened by Cristian-Fioravanti almost 3 years ago - 7 comments
