Full Code of facebookresearch/fairseq for AI

main 3d262bb25690 cached

1626 files

9.2 MB

2.5M tokens

9202 symbols

Repository: facebookresearch/fairseq
Branch: main
Commit: 3d262bb25690
Files: 1626
Total size: 9.2 MB

Directory structure:
gitextract_ldfkme3g/

├── .github/
│   ├── CODEOWNERS
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.md
│   │   ├── documentation.md
│   │   ├── feature_request.md
│   │   └── how-to-question.md
│   ├── ISSUE_TEMPLATE.md
│   ├── PULL_REQUEST_TEMPLATE.md
│   ├── stale.yml
│   └── workflows/
│       ├── build.yml
│       ├── depreview.yml
│       └── release.yml
├── .gitignore
├── .gitmodules
├── .pre-commit-config.yaml
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── MANIFEST.in
├── README.md
├── RELEASE.md
├── docs/
│   ├── Makefile
│   ├── command_line_tools.rst
│   ├── conf.py
│   ├── criterions.rst
│   ├── data.rst
│   ├── docutils.conf
│   ├── getting_started.rst
│   ├── hydra_integration.md
│   ├── index.rst
│   ├── lr_scheduler.rst
│   ├── make.bat
│   ├── models.rst
│   ├── modules.rst
│   ├── optim.rst
│   ├── overview.rst
│   ├── tasks.rst
│   ├── tutorial_classifying_names.rst
│   └── tutorial_simple_lstm.rst
├── examples/
│   ├── .gitignore
│   ├── MMPT/
│   │   ├── .gitignore
│   │   ├── CONFIG.md
│   │   ├── DATASET.md
│   │   ├── README.md
│   │   ├── endtask.md
│   │   ├── locallaunch.py
│   │   ├── mmpt/
│   │   │   ├── __init__.py
│   │   │   ├── datasets/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── fairseqmmdataset.py
│   │   │   │   └── mmdataset.py
│   │   │   ├── evaluators/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── evaluator.py
│   │   │   │   ├── metric.py
│   │   │   │   └── predictor.py
│   │   │   ├── losses/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── fairseqmmloss.py
│   │   │   │   ├── loss.py
│   │   │   │   └── nce.py
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── fairseqmmmodel.py
│   │   │   │   ├── mmfusion.py
│   │   │   │   ├── mmfusionnlg.py
│   │   │   │   └── transformermodel.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── mm.py
│   │   │   │   ├── retri.py
│   │   │   │   └── vectorpool.py
│   │   │   ├── processors/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── dedupprocessor.py
│   │   │   │   ├── dsprocessor.py
│   │   │   │   ├── how2processor.py
│   │   │   │   ├── how2retriprocessor.py
│   │   │   │   ├── models/
│   │   │   │   │   └── s3dg.py
│   │   │   │   └── processor.py
│   │   │   ├── tasks/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── fairseqmmtask.py
│   │   │   │   ├── milncetask.py
│   │   │   │   ├── retritask.py
│   │   │   │   ├── task.py
│   │   │   │   └── vlmtask.py
│   │   │   └── utils/
│   │   │       ├── __init__.py
│   │   │       ├── load_config.py
│   │   │       └── shardedtensor.py
│   │   ├── mmpt_cli/
│   │   │   ├── localjob.py
│   │   │   └── predict.py
│   │   ├── pretraining.md
│   │   ├── projects/
│   │   │   ├── mfmmlm.yaml
│   │   │   ├── mtm/
│   │   │   │   ├── mmfusionmtm.yaml
│   │   │   │   ├── vlm/
│   │   │   │   │   ├── coin.yaml
│   │   │   │   │   ├── crosstask.yaml
│   │   │   │   │   ├── how2.yaml
│   │   │   │   │   ├── test_coin.yaml
│   │   │   │   │   ├── test_crosstask.yaml
│   │   │   │   │   ├── test_crosstask_zs.yaml
│   │   │   │   │   ├── test_vtt.yaml
│   │   │   │   │   ├── test_vttqa.yaml
│   │   │   │   │   ├── test_youcook.yaml
│   │   │   │   │   ├── test_youcookcap.yaml
│   │   │   │   │   ├── vtt.yaml
│   │   │   │   │   ├── vttqa.yaml
│   │   │   │   │   ├── youcook.yaml
│   │   │   │   │   └── youcookcap.yaml
│   │   │   │   └── vlm.yaml
│   │   │   ├── retri/
│   │   │   │   ├── videoclip/
│   │   │   │   │   ├── coin_videoclip.yaml
│   │   │   │   │   ├── crosstask_videoclip.yaml
│   │   │   │   │   ├── how2.yaml
│   │   │   │   │   ├── test_coin_videoclip.yaml
│   │   │   │   │   ├── test_coin_zs.yaml
│   │   │   │   │   ├── test_crosstask_videoclip.yaml
│   │   │   │   │   ├── test_crosstask_zs_videoclip.yaml
│   │   │   │   │   ├── test_didemo_zs.yaml
│   │   │   │   │   ├── test_vtt_videoclip.yaml
│   │   │   │   │   ├── test_vtt_zs.yaml
│   │   │   │   │   ├── test_vttqa_videoclip.yaml
│   │   │   │   │   ├── test_vttqa_zs.yaml
│   │   │   │   │   ├── test_youcook_videoclip.yaml
│   │   │   │   │   ├── test_youcook_zs.yaml
│   │   │   │   │   ├── vtt_videoclip.yaml
│   │   │   │   │   ├── vttqa_videoclip.yaml
│   │   │   │   │   └── youcook_videoclip.yaml
│   │   │   │   ├── videoclip.yaml
│   │   │   │   └── videoretri.yaml
│   │   │   └── task/
│   │   │       ├── coin.yaml
│   │   │       ├── coin_videoclip.yaml
│   │   │       ├── crosstask.yaml
│   │   │       ├── crosstask_videoclip.yaml
│   │   │       ├── default.yaml
│   │   │       ├── ft.yaml
│   │   │       ├── how2.yaml
│   │   │       ├── test.yaml
│   │   │       ├── test_coin.yaml
│   │   │       ├── test_coin_videoclip.yaml
│   │   │       ├── test_coin_zs.yaml
│   │   │       ├── test_crosstask.yaml
│   │   │       ├── test_crosstask_videoclip.yaml
│   │   │       ├── test_crosstask_zs.yaml
│   │   │       ├── test_crosstask_zs_videoclip.yaml
│   │   │       ├── test_didemo_zs.yaml
│   │   │       ├── test_vtt.yaml
│   │   │       ├── test_vtt_videoclip.yaml
│   │   │       ├── test_vtt_zs.yaml
│   │   │       ├── test_vttqa.yaml
│   │   │       ├── test_vttqa_videoclip.yaml
│   │   │       ├── test_vttqa_zs.yaml
│   │   │       ├── test_youcook.yaml
│   │   │       ├── test_youcook_videoclip.yaml
│   │   │       ├── test_youcook_zs.yaml
│   │   │       ├── test_youcookcap.yaml
│   │   │       ├── vtt.yaml
│   │   │       ├── vtt_videoclip.yaml
│   │   │       ├── vttqa.yaml
│   │   │       ├── vttqa_videoclip.yaml
│   │   │       ├── youcook.yaml
│   │   │       ├── youcook_videoclip.yaml
│   │   │       └── youcookcap.yaml
│   │   ├── scripts/
│   │   │   ├── text_token_extractor/
│   │   │   │   ├── configs/
│   │   │   │   │   └── bert-base-uncased.yaml
│   │   │   │   └── pretokenization.py
│   │   │   └── video_feature_extractor/
│   │   │       ├── extract.py
│   │   │       ├── how2/
│   │   │       │   └── s3d.sh
│   │   │       ├── model.py
│   │   │       ├── pathbuilder.py
│   │   │       ├── preprocessing.py
│   │   │       ├── random_sequence_shuffler.py
│   │   │       ├── shard_feature.py
│   │   │       └── videoreader.py
│   │   └── setup.py
│   ├── __init__.py
│   ├── adaptive_span/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── adagrad_with_grad_clip.py
│   │   ├── adaptive_span_attention.py
│   │   ├── adaptive_span_loss.py
│   │   ├── adaptive_span_model.py
│   │   └── adaptive_span_model_wrapper.py
│   ├── attention_head_selection/
│   │   ├── README.md
│   │   └── src/
│   │       ├── __init__.py
│   │       ├── data/
│   │       │   ├── __init__.py
│   │       │   └── speech_to_text_dataset_with_domain.py
│   │       ├── loss/
│   │       │   ├── __init__.py
│   │       │   └── attention_head_selection.py
│   │       ├── models/
│   │       │   ├── __init__.py
│   │       │   ├── head_selection_s2t_transformer.py
│   │       │   └── head_selection_transformer.py
│   │       ├── modules/
│   │       │   ├── __init__.py
│   │       │   ├── attn_head_selector.py
│   │       │   ├── head_selection_transformer_layer.py
│   │       │   ├── multihead_attention_selection.py
│   │       │   └── multihead_functional.py
│   │       └── speech_to_text_head_selection.py
│   ├── audio_nlp/
│   │   └── nlu/
│   │       ├── README.md
│   │       ├── configs/
│   │       │   └── nlu_finetuning.yaml
│   │       ├── create_dict_stop.sh
│   │       └── generate_manifests.py
│   ├── backtranslation/
│   │   ├── README.md
│   │   ├── deduplicate_lines.py
│   │   ├── extract_bt_data.py
│   │   ├── prepare-de-monolingual.sh
│   │   ├── prepare-wmt18en2de.sh
│   │   ├── sacrebleu.sh
│   │   └── tokenized_bleu.sh
│   ├── bart/
│   │   ├── README.glue.md
│   │   ├── README.md
│   │   ├── README.summarization.md
│   │   └── summarize.py
│   ├── byte_level_bpe/
│   │   ├── README.md
│   │   ├── get_bitext.py
│   │   ├── get_data.sh
│   │   └── gru_transformer.py
│   ├── camembert/
│   │   └── README.md
│   ├── constrained_decoding/
│   │   ├── README.md
│   │   ├── normalize.py
│   │   └── tok.py
│   ├── conv_seq2seq/
│   │   └── README.md
│   ├── criss/
│   │   ├── README.md
│   │   ├── download_and_preprocess_flores_test.sh
│   │   ├── download_and_preprocess_tatoeba.sh
│   │   ├── mining/
│   │   │   ├── mine.py
│   │   │   └── mine_example.sh
│   │   ├── save_encoder.py
│   │   ├── sentence_retrieval/
│   │   │   ├── encoder_analysis.py
│   │   │   └── sentence_retrieval_tatoeba.sh
│   │   └── unsupervised_mt/
│   │       └── eval.sh
│   ├── cross_lingual_language_model/
│   │   └── README.md
│   ├── data2vec/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── audio/
│   │   │   │   ├── classification/
│   │   │   │   │   ├── base_classification.yaml
│   │   │   │   │   └── run_config/
│   │   │   │   │       ├── slurm_1.yaml
│   │   │   │   │       ├── slurm_1g.yaml
│   │   │   │   │       └── slurm_2.yaml
│   │   │   │   └── pretraining/
│   │   │   │       ├── audioset.yaml
│   │   │   │       ├── base_librispeech.yaml
│   │   │   │       └── run_config/
│   │   │   │           ├── local.yaml
│   │   │   │           ├── slurm_1.yaml
│   │   │   │           ├── slurm_1_aws.yaml
│   │   │   │           ├── slurm_2.yaml
│   │   │   │           ├── slurm_2_aws.yaml
│   │   │   │           ├── slurm_3.yaml
│   │   │   │           ├── slurm_4.yaml
│   │   │   │           ├── slurm_4_aws.yaml
│   │   │   │           ├── slurm_6_aws.yaml
│   │   │   │           └── slurm_8_aws.yaml
│   │   │   ├── text/
│   │   │   │   └── pretraining/
│   │   │   │       ├── base.yaml
│   │   │   │       └── run_config/
│   │   │   │           ├── local.yaml
│   │   │   │           ├── slurm_1_aws.yaml
│   │   │   │           ├── slurm_2.yaml
│   │   │   │           ├── slurm_2_aws.yaml
│   │   │   │           ├── slurm_3.yaml
│   │   │   │           ├── slurm_4.yaml
│   │   │   │           ├── slurm_4_aws.yaml
│   │   │   │           └── slurm_8_aws.yaml
│   │   │   ├── v2/
│   │   │   │   ├── base_audio_only_task.yaml
│   │   │   │   ├── base_images_only_task.yaml
│   │   │   │   ├── base_text_only_task.yaml
│   │   │   │   ├── huge_images14_only_task.yaml
│   │   │   │   ├── huge_images_only_task.yaml
│   │   │   │   ├── large_audio_only_task.yaml
│   │   │   │   ├── large_images_only_task.yaml
│   │   │   │   ├── large_text_only_task.yaml
│   │   │   │   ├── large_text_only_task_pgrp_1M.yaml
│   │   │   │   ├── run_config/
│   │   │   │   │   ├── local.yaml
│   │   │   │   │   ├── slurm_1.yaml
│   │   │   │   │   ├── slurm_1_aws.yaml
│   │   │   │   │   ├── slurm_2.yaml
│   │   │   │   │   ├── slurm_2_aws.yaml
│   │   │   │   │   ├── slurm_3.yaml
│   │   │   │   │   ├── slurm_4.yaml
│   │   │   │   │   ├── slurm_4_aws.yaml
│   │   │   │   │   ├── slurm_6_aws.yaml
│   │   │   │   │   ├── slurm_8.yaml
│   │   │   │   │   └── slurm_8_aws.yaml
│   │   │   │   └── text_finetuning/
│   │   │   │       ├── cola.yaml
│   │   │   │       ├── mnli.yaml
│   │   │   │       ├── mrpc.yaml
│   │   │   │       ├── qnli.yaml
│   │   │   │       ├── qqp.yaml
│   │   │   │       ├── rte.yaml
│   │   │   │       ├── run_config/
│   │   │   │       │   └── local.yaml
│   │   │   │       ├── sst_2.yaml
│   │   │   │       └── sts_b.yaml
│   │   │   └── vision/
│   │   │       ├── finetuning/
│   │   │       │   ├── imagenet.yaml
│   │   │       │   ├── mae_imagenet_clean.yaml
│   │   │       │   ├── mae_imagenet_huge_clean.yaml
│   │   │       │   ├── mae_imagenet_large_clean.yaml
│   │   │       │   └── run_config/
│   │   │       │       ├── local.yaml
│   │   │       │       ├── slurm_1.yaml
│   │   │       │       ├── slurm_1_aws.yaml
│   │   │       │       ├── slurm_2.yaml
│   │   │       │       ├── slurm_2_aws.yaml
│   │   │       │       ├── slurm_3.yaml
│   │   │       │       ├── slurm_4.yaml
│   │   │       │       ├── slurm_4_aws.yaml
│   │   │       │       ├── slurm_6_aws.yaml
│   │   │       │       └── slurm_8_aws.yaml
│   │   │       └── pretraining/
│   │   │           ├── base_imagenet.yaml
│   │   │           ├── base_imagenet_d2v1.yaml
│   │   │           ├── base_mae_imagenet.yaml
│   │   │           └── run_config/
│   │   │               ├── local.yaml
│   │   │               ├── slurm_1.yaml
│   │   │               ├── slurm_1_aws.yaml
│   │   │               ├── slurm_2.yaml
│   │   │               ├── slurm_2_aws.yaml
│   │   │               ├── slurm_3.yaml
│   │   │               ├── slurm_4.yaml
│   │   │               ├── slurm_4_aws.yaml
│   │   │               ├── slurm_6_aws.yaml
│   │   │               └── slurm_8_aws.yaml
│   │   ├── data/
│   │   │   ├── __init__.py
│   │   │   ├── add_class_target_dataset.py
│   │   │   ├── image_dataset.py
│   │   │   ├── mae_finetuning_image_dataset.py
│   │   │   ├── mae_image_dataset.py
│   │   │   ├── modality.py
│   │   │   └── path_dataset.py
│   │   ├── fb_convert_beit_cp.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── audio_classification.py
│   │   │   ├── data2vec2.py
│   │   │   ├── data2vec_audio.py
│   │   │   ├── data2vec_image_classification.py
│   │   │   ├── data2vec_text.py
│   │   │   ├── data2vec_text_classification.py
│   │   │   ├── data2vec_vision.py
│   │   │   ├── mae.py
│   │   │   ├── mae_image_classification.py
│   │   │   ├── modalities/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── audio.py
│   │   │   │   ├── base.py
│   │   │   │   ├── images.py
│   │   │   │   ├── modules.py
│   │   │   │   └── text.py
│   │   │   └── utils.py
│   │   ├── scripts/
│   │   │   ├── convert_audioset_labels.py
│   │   │   ├── multi/
│   │   │   │   ├── finetune_all_fair_aws_local_lr.sh
│   │   │   │   ├── finetune_all_fair_aws_local_lr_nodep.sh
│   │   │   │   └── finetune_all_fair_local_lr.sh
│   │   │   └── text/
│   │   │       ├── finetune_all_char_fair_aws_local_lr.sh
│   │   │       ├── finetune_all_fair.sh
│   │   │       ├── finetune_all_fair_aws.sh
│   │   │       ├── finetune_all_fair_aws_local_lr.sh
│   │   │       ├── finetune_all_fair_aws_lr.sh
│   │   │       ├── finetune_all_fair_local_lr.sh
│   │   │       ├── finetune_all_fair_nodep.sh
│   │   │       ├── finetune_all_fair_nodep_aws.sh
│   │   │       ├── finetune_all_fair_nodep_aws_local_lr.sh
│   │   │       ├── finetune_all_fair_nodep_aws_lr.sh
│   │   │       ├── finetune_all_fair_nodep_aws_lr_nopos.sh
│   │   │       ├── finetune_all_large_fair_aws_local_lr.sh
│   │   │       ├── finetune_all_large_fair_local_lr.sh
│   │   │       ├── finetune_all_large_fair_nodep_aws_local_lr.sh
│   │   │       ├── finetune_sst2_qnli_sweep_fair_nodep.sh
│   │   │       ├── glue.py
│   │   │       ├── glue_lr.py
│   │   │       ├── unprocess_data.py
│   │   │       └── valids.py
│   │   └── tasks/
│   │       ├── __init__.py
│   │       ├── audio_classification.py
│   │       ├── image_classification.py
│   │       ├── image_pretraining.py
│   │       ├── mae_image_classification.py
│   │       ├── mae_image_pretraining.py
│   │       └── multimodal.py
│   ├── discriminative_reranking_nmt/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   └── deen.yaml
│   │   ├── criterions/
│   │   │   ├── __init__.py
│   │   │   └── discriminative_reranking_criterion.py
│   │   ├── drnmt_rerank.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   └── discriminative_reranking_model.py
│   │   ├── scripts/
│   │   │   └── prep_data.py
│   │   └── tasks/
│   │       ├── __init__.py
│   │       └── discriminative_reranking_task.py
│   ├── emotion_conversion/
│   │   ├── README.md
│   │   ├── emotion_models/
│   │   │   ├── __init__.py
│   │   │   ├── duration_predictor.py
│   │   │   ├── duration_predictor.yaml
│   │   │   ├── pitch_predictor.py
│   │   │   ├── pitch_predictor.yaml
│   │   │   └── utils.py
│   │   ├── fairseq_models/
│   │   │   └── __init__.py
│   │   ├── preprocess/
│   │   │   ├── __init__.py
│   │   │   ├── build_hifigan_manifest.py
│   │   │   ├── build_translation_manifests.py
│   │   │   ├── create_core_manifest.py
│   │   │   ├── extract_f0.py
│   │   │   ├── process_km.py
│   │   │   ├── split_emov_km_tsv_by_uttid.py
│   │   │   ├── split_km.py
│   │   │   └── split_km_tsv.py
│   │   ├── requirements.txt
│   │   └── synthesize.py
│   ├── fast_noisy_channel/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── noisy_channel_beam_search.py
│   │   ├── noisy_channel_sequence_generator.py
│   │   └── noisy_channel_translation.py
│   ├── flores101/
│   │   └── README.md
│   ├── fully_sharded_data_parallel/
│   │   └── README.md
│   ├── gottbert/
│   │   └── README.md
│   ├── hubert/
│   │   ├── README.md
│   │   ├── config/
│   │   │   ├── decode/
│   │   │   │   ├── ax_sweep/
│   │   │   │   │   ├── ngram.yaml
│   │   │   │   │   └── transformer.yaml
│   │   │   │   ├── infer_fsqlm.yaml
│   │   │   │   ├── infer_kenlm.yaml
│   │   │   │   ├── infer_viterbi.yaml
│   │   │   │   └── run/
│   │   │   │       ├── submitit_slurm.yaml
│   │   │   │       └── submitit_slurm_8gpu.yaml
│   │   │   ├── finetune/
│   │   │   │   ├── base_10h.yaml
│   │   │   │   ├── ckpt/
│   │   │   │   │   └── it1.yaml
│   │   │   │   ├── lm/
│   │   │   │   │   └── ls_4gram.yaml
│   │   │   │   └── run/
│   │   │   │       └── submitit_reg.yaml
│   │   │   └── pretrain/
│   │   │       ├── data/
│   │   │       │   ├── iter1.yaml
│   │   │       │   └── iter2.yaml
│   │   │       ├── hubert_base_librispeech.yaml
│   │   │       ├── hubert_large_librivox.yaml
│   │   │       ├── hubert_xlarge_librivox.yaml
│   │   │       └── run/
│   │   │           └── submitit_reg.yaml
│   │   ├── measure_teacher_quality.py
│   │   ├── simple_kmeans/
│   │   │   ├── README.md
│   │   │   ├── dump_hubert_feature.py
│   │   │   ├── dump_hubert_feature_s2t.py
│   │   │   ├── dump_km_label.py
│   │   │   ├── dump_mfcc_feature.py
│   │   │   ├── dump_w2v2_feature.py
│   │   │   ├── feature_utils.py
│   │   │   └── learn_kmeans.py
│   │   ├── tests/
│   │   │   ├── 6313-76958-0021.flac
│   │   │   ├── sample.base.L9.km500.km
│   │   │   ├── sample.base.L9.len
│   │   │   ├── sample.base.L9.npy
│   │   │   ├── sample.large.L20.len
│   │   │   ├── sample.large.L20.npy
│   │   │   ├── sample.large.hypo.word
│   │   │   ├── sample.xlarge.L30.len
│   │   │   ├── sample.xlarge.L30.npy
│   │   │   ├── sample.xlarge.hypo.word
│   │   │   ├── test_feature_and_unit.sh
│   │   │   └── test_finetuned_asr.sh
│   │   └── update_ckpt.py
│   ├── joint_alignment_translation/
│   │   ├── README.md
│   │   └── prepare-wmt18en2de_no_norm_no_escape_no_agressive.sh
│   ├── language_model/
│   │   ├── README.adaptive_inputs.md
│   │   ├── README.conv.md
│   │   ├── README.md
│   │   └── prepare-wikitext-103.sh
│   ├── laser/
│   │   ├── README.md
│   │   └── laser_src/
│   │       ├── __init__.py
│   │       ├── laser_lstm.py
│   │       ├── laser_task.py
│   │       ├── laser_transformer.py
│   │       └── multitask_data_utils.py
│   ├── latent_depth/
│   │   ├── README.md
│   │   └── latent_depth_src/
│   │       ├── __init__.py
│   │       ├── loss/
│   │       │   ├── __init__.py
│   │       │   └── latent_depth.py
│   │       ├── models/
│   │       │   ├── __init__.py
│   │       │   ├── latent_multilingual_transformer.py
│   │       │   └── latent_transformer.py
│   │       ├── modules/
│   │       │   ├── __init__.py
│   │       │   └── latent_layers.py
│   │       └── multilingual_translation_latent_depth.py
│   ├── layerdrop/
│   │   └── README.md
│   ├── linformer/
│   │   ├── README.md
│   │   └── linformer_src/
│   │       ├── __init__.py
│   │       ├── models/
│   │       │   ├── __init__.py
│   │       │   └── linformer_roberta.py
│   │       └── modules/
│   │           ├── __init__.py
│   │           ├── linformer_sentence_encoder.py
│   │           ├── linformer_sentence_encoder_layer.py
│   │           └── multihead_linear_attention.py
│   ├── m2m_100/
│   │   ├── README.md
│   │   ├── install_dependecies.sh
│   │   ├── process_data/
│   │   │   ├── clean_histogram.py
│   │   │   ├── dedup_data.py
│   │   │   └── remove_too_much_punc.py
│   │   ├── tok.sh
│   │   └── tokenizers/
│   │       ├── README.md
│   │       ├── seg_ja.sh
│   │       ├── seg_ko.sh
│   │       ├── thirdparty/
│   │       │   └── .gitignore
│   │       ├── tokenize_indic.py
│   │       ├── tokenize_thai.py
│   │       ├── tokenize_zh.py
│   │       └── tokenizer_ar.sh
│   ├── mbart/
│   │   └── README.md
│   ├── megatron_11b/
│   │   ├── README.md
│   │   └── detok.py
│   ├── mms/
│   │   ├── MODEL_CARD.md
│   │   ├── README.md
│   │   ├── asr/
│   │   │   ├── config/
│   │   │   │   └── infer_common.yaml
│   │   │   ├── infer/
│   │   │   │   ├── example_infer_adapter.sh
│   │   │   │   └── mms_infer.py
│   │   │   └── tutorial/
│   │   │       └── MMS_ASR_Inference_Colab.ipynb
│   │   ├── data_prep/
│   │   │   ├── README.md
│   │   │   ├── align_and_segment.py
│   │   │   ├── align_utils.py
│   │   │   ├── norm_config.py
│   │   │   ├── punctuations.lst
│   │   │   └── text_normalization.py
│   │   ├── lid/
│   │   │   ├── infer.py
│   │   │   └── tutorial/
│   │   │       └── MMS_LID_Inference_Colab.ipynb
│   │   ├── lid_rerank/
│   │   │   ├── README.md
│   │   │   ├── cer_langs.txt
│   │   │   ├── mala/
│   │   │   │   └── infer.py
│   │   │   ├── mms/
│   │   │   │   ├── make_parallel_single_runs.py
│   │   │   │   ├── merge_by_lang.py
│   │   │   │   ├── prep_wav_list.py
│   │   │   │   ├── run_single_lang.py
│   │   │   │   └── split_by_lang.py
│   │   │   ├── mms-zs/
│   │   │   │   ├── falign.py
│   │   │   │   ├── lib.py
│   │   │   │   └── uromanize.py
│   │   │   ├── nllb/
│   │   │   │   └── infer.py
│   │   │   ├── requirements.txt
│   │   │   ├── rerank/
│   │   │   │   ├── rerank.py
│   │   │   │   └── tune_coefficients.py
│   │   │   └── whisper/
│   │   │       ├── infer_asr.py
│   │   │       ├── infer_lid.py
│   │   │       └── lid_mapping.txt
│   │   ├── misc/
│   │   │   └── get_sample_size.py
│   │   ├── tts/
│   │   │   ├── infer.py
│   │   │   └── tutorial/
│   │   │       └── MMS_TTS_Inference_Colab.ipynb
│   │   └── zero_shot/
│   │       └── README.md
│   ├── moe_lm/
│   │   ├── README.md
│   │   ├── data_card.md
│   │   └── model_card.md
│   ├── mr_hubert/
│   │   ├── README.md
│   │   ├── config/
│   │   │   ├── decode/
│   │   │   │   ├── infer.yaml
│   │   │   │   ├── infer_lm.yaml
│   │   │   │   └── run/
│   │   │   │       ├── submitit_slurm.yaml
│   │   │   │       └── submitit_slurm_8gpu.yaml
│   │   │   ├── finetune/
│   │   │   │   ├── base_100h.yaml
│   │   │   │   ├── base_100h_large.yaml
│   │   │   │   ├── base_10h.yaml
│   │   │   │   ├── base_10h_large.yaml
│   │   │   │   ├── base_1h.yaml
│   │   │   │   └── base_1h_large.yaml
│   │   │   └── pretrain/
│   │   │       ├── mrhubert_base_librispeech.yaml
│   │   │       ├── mrhubert_large_librilight.yaml
│   │   │       └── run/
│   │   │           └── submitit_reg.yaml
│   │   ├── decode.sh
│   │   ├── finetune.sh
│   │   └── train.sh
│   ├── multilingual/
│   │   ├── ML50_langs.txt
│   │   ├── README.md
│   │   ├── data_scripts/
│   │   │   ├── README.md
│   │   │   ├── binarize.py
│   │   │   ├── check_iswlt_test_data.py
│   │   │   ├── check_self_overlaps.py
│   │   │   ├── check_valid_test_overlaps.py
│   │   │   ├── dedup_all.py
│   │   │   ├── download_ML50_v1.sh
│   │   │   ├── download_af_xh.sh
│   │   │   ├── download_flores_data.sh
│   │   │   ├── download_iitb.sh
│   │   │   ├── download_iwslt_and_extract.sh
│   │   │   ├── download_lotus.sh
│   │   │   ├── download_ted_and_extract.py
│   │   │   ├── download_wat19_my.sh
│   │   │   ├── download_wmt19_and_before.py
│   │   │   ├── download_wmt20.sh
│   │   │   ├── preprocess_ML50_v1.sh
│   │   │   ├── remove_valid_test_in_train.py
│   │   │   ├── requirement.txt
│   │   │   └── utils/
│   │   │       ├── dedup.py
│   │   │       ├── fasttext_multi_filter.py
│   │   │       └── strip_sgm.sh
│   │   ├── finetune_multilingual_model.sh
│   │   ├── multilingual_fairseq_gen.sh
│   │   └── train_multilingual_model.sh
│   ├── noisychannel/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── rerank.py
│   │   ├── rerank_generate.py
│   │   ├── rerank_options.py
│   │   ├── rerank_score_bw.py
│   │   ├── rerank_score_lm.py
│   │   ├── rerank_tune.py
│   │   └── rerank_utils.py
│   ├── nonautoregressive_translation/
│   │   ├── README.md
│   │   └── scripts.md
│   ├── normformer/
│   │   ├── README.md
│   │   └── train_lm.sh
│   ├── operators/
│   │   ├── alignment_train_cpu.cpp
│   │   ├── alignment_train_cuda.cpp
│   │   ├── alignment_train_cuda.h
│   │   ├── alignment_train_kernel.cu
│   │   └── utils.h
│   ├── paraphraser/
│   │   ├── README.md
│   │   └── paraphrase.py
│   ├── pay_less_attention_paper/
│   │   └── README.md
│   ├── pointer_generator/
│   │   ├── README.md
│   │   ├── README.xsum.md
│   │   ├── pointer_generator_src/
│   │   │   ├── __init__.py
│   │   │   └── transformer_pg.py
│   │   ├── postprocess.py
│   │   └── preprocess.py
│   ├── quant_noise/
│   │   ├── README.md
│   │   └── transformer_quantization_config.yaml
│   ├── roberta/
│   │   ├── README.custom_classification.md
│   │   ├── README.glue.md
│   │   ├── README.md
│   │   ├── README.pretraining.md
│   │   ├── README.race.md
│   │   ├── commonsense_qa/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── commonsense_qa_task.py
│   │   │   └── download_cqa_data.sh
│   │   ├── config/
│   │   │   ├── finetuning/
│   │   │   │   ├── cola.yaml
│   │   │   │   ├── mnli.yaml
│   │   │   │   ├── mrpc.yaml
│   │   │   │   ├── qnli.yaml
│   │   │   │   ├── qqp.yaml
│   │   │   │   ├── rte.yaml
│   │   │   │   ├── run_config/
│   │   │   │   │   ├── local.yaml
│   │   │   │   │   ├── slurm_1g.yaml
│   │   │   │   │   └── slurm_1g_aws.yaml
│   │   │   │   ├── sst_2.yaml
│   │   │   │   └── sts_b.yaml
│   │   │   └── pretraining/
│   │   │       ├── base.yaml
│   │   │       └── run_config/
│   │   │           ├── local.yaml
│   │   │           ├── slurm_2.yaml
│   │   │           ├── slurm_2_aws.yaml
│   │   │           ├── slurm_3.yaml
│   │   │           └── slurm_4.yaml
│   │   ├── fb_multilingual/
│   │   │   └── README.multilingual.pretraining.md
│   │   ├── multiprocessing_bpe_encoder.py
│   │   ├── preprocess_GLUE_tasks.sh
│   │   ├── preprocess_RACE.py
│   │   ├── preprocess_RACE.sh
│   │   └── wsc/
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── wsc_criterion.py
│   │       ├── wsc_task.py
│   │       └── wsc_utils.py
│   ├── rxf/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   └── rxf_src/
│   │       ├── __init__.py
│   │       ├── label_smoothed_cross_entropy_r3f.py
│   │       └── sentence_prediction_r3f.py
│   ├── scaling_nmt/
│   │   └── README.md
│   ├── shuffled_word_order/
│   │   ├── README.finetuning.md
│   │   └── README.md
│   ├── simultaneous_translation/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── docs/
│   │   │   ├── ende-mma.md
│   │   │   └── enja-waitk.md
│   │   ├── eval/
│   │   │   └── agents/
│   │   │       └── simul_t2t_enja.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── convtransformer_simul_trans.py
│   │   │   └── transformer_monotonic_attention.py
│   │   ├── modules/
│   │   │   ├── __init__.py
│   │   │   ├── fixed_pre_decision.py
│   │   │   ├── monotonic_multihead_attention.py
│   │   │   └── monotonic_transformer_layer.py
│   │   ├── tests/
│   │   │   ├── test_alignment_train.py
│   │   │   └── test_text_models.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── functions.py
│   │       ├── monotonic_attention.py
│   │       └── p_choose_strategy.py
│   ├── speech_recognition/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── criterions/
│   │   │   ├── ASG_loss.py
│   │   │   ├── __init__.py
│   │   │   └── cross_entropy_acc.py
│   │   ├── data/
│   │   │   ├── __init__.py
│   │   │   ├── asr_dataset.py
│   │   │   ├── collaters.py
│   │   │   ├── data_utils.py
│   │   │   └── replabels.py
│   │   ├── datasets/
│   │   │   ├── asr_prep_json.py
│   │   │   └── prepare-librispeech.sh
│   │   ├── infer.py
│   │   ├── kaldi/
│   │   │   ├── __init__.py
│   │   │   ├── add-self-loop-simple.cc
│   │   │   ├── config/
│   │   │   │   └── kaldi_initializer.yaml
│   │   │   ├── kaldi_decoder.py
│   │   │   └── kaldi_initializer.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── vggtransformer.py
│   │   │   └── w2l_conv_glu_enc.py
│   │   ├── new/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── conf/
│   │   │   │   ├── hydra/
│   │   │   │   │   └── sweeper/
│   │   │   │   │       ├── ax.yaml
│   │   │   │   │       └── ax_sil.yaml
│   │   │   │   ├── infer.yaml
│   │   │   │   └── run_config/
│   │   │   │       ├── fb_slurm_1.yaml
│   │   │   │       └── fb_slurm_2g.yaml
│   │   │   ├── decoders/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── base_decoder.py
│   │   │   │   ├── decoder.py
│   │   │   │   ├── decoder_config.py
│   │   │   │   ├── flashlight_decoder.py
│   │   │   │   └── viterbi_decoder.py
│   │   │   └── infer.py
│   │   ├── tasks/
│   │   │   ├── __init__.py
│   │   │   └── speech_recognition.py
│   │   ├── utils/
│   │   │   └── wer_utils.py
│   │   └── w2l_decoder.py
│   ├── speech_synthesis/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── data_utils.py
│   │   ├── docs/
│   │   │   ├── common_voice_example.md
│   │   │   ├── ljspeech_example.md
│   │   │   └── vctk_example.md
│   │   ├── evaluation/
│   │   │   ├── __init__.py
│   │   │   ├── eval_asr.py
│   │   │   ├── eval_f0.py
│   │   │   ├── eval_sp.py
│   │   │   └── get_eval_manifest.py
│   │   ├── generate_waveform.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   ├── denoise_and_vad_audio.py
│   │   │   ├── denoiser/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── demucs.py
│   │   │   │   ├── pretrained.py
│   │   │   │   ├── resample.py
│   │   │   │   └── utils.py
│   │   │   ├── get_common_voice_audio_manifest.py
│   │   │   ├── get_feature_manifest.py
│   │   │   ├── get_ljspeech_audio_manifest.py
│   │   │   ├── get_speaker_embedding.py
│   │   │   ├── get_vctk_audio_manifest.py
│   │   │   ├── speaker_embedder/
│   │   │   │   └── __init__.py
│   │   │   └── vad/
│   │   │       └── __init__.py
│   │   └── utils.py
│   ├── speech_text_joint_to_text/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── configs/
│   │   │   └── mustc_noise.list
│   │   ├── criterions/
│   │   │   ├── __init__.py
│   │   │   ├── multi_modality_compound.py
│   │   │   ├── multi_modality_cross_entropy.py
│   │   │   └── text_guide_cross_entropy_acc.py
│   │   ├── data/
│   │   │   └── pair_denoising_dataset.py
│   │   ├── docs/
│   │   │   ├── ende-mustc.md
│   │   │   ├── iwslt2021.md
│   │   │   └── pre-training.md
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── joint_speech_text_pretrain_transformer.py
│   │   │   ├── s2t_dualinputtransformer.py
│   │   │   ├── s2t_dualinputwavtransformer.py
│   │   │   └── s2t_dualinputxmtransformer.py
│   │   ├── scripts/
│   │   │   ├── convert_model.py
│   │   │   └── g2p_encode.py
│   │   └── tasks/
│   │       ├── __init__.py
│   │       ├── pair_denoising.py
│   │       ├── speech_text_denoise_pretrain.py
│   │       └── speech_text_joint.py
│   ├── speech_to_speech/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── asr_bleu/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── asr_model_cfgs.json
│   │   │   ├── compute_asr_bleu.py
│   │   │   ├── requirements.txt
│   │   │   └── utils.py
│   │   ├── benchmarking/
│   │   │   ├── README.md
│   │   │   ├── configs/
│   │   │   │   ├── 2StageS2ST.yaml
│   │   │   │   ├── 3StageS2ST.yaml
│   │   │   │   ├── DirectS2U.yaml
│   │   │   │   └── S2T.yaml
│   │   │   ├── core.py
│   │   │   ├── data_utils.py
│   │   │   └── get_metrics.py
│   │   ├── docs/
│   │   │   ├── data_augmentation.md
│   │   │   ├── direct_s2st_discrete_units.md
│   │   │   ├── enhanced_direct_s2st_discrete_units.md
│   │   │   └── textless_s2st_real_data.md
│   │   ├── generate_waveform_from_code.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   ├── data_utils.py
│   │   │   ├── prep_s2spect_data.py
│   │   │   ├── prep_s2ut_data.py
│   │   │   ├── prep_sn_data.py
│   │   │   └── prep_sn_output_data.py
│   │   └── unity/
│   │       ├── __init__.py
│   │       ├── sequence_generator.py
│   │       └── sequence_generator_multi_decoder.py
│   ├── speech_to_text/
│   │   ├── README.md
│   │   ├── data_utils.py
│   │   ├── docs/
│   │   │   ├── covost_example.md
│   │   │   ├── librispeech_example.md
│   │   │   ├── mtedx_example.md
│   │   │   ├── mustc_example.md
│   │   │   └── simulst_mustc_example.md
│   │   ├── prep_covost_data.py
│   │   ├── prep_librispeech_data.py
│   │   ├── prep_mtedx_data.py
│   │   ├── prep_mustc_data.py
│   │   ├── seg_mustc_data.py
│   │   └── simultaneous_translation/
│   │       └── agents/
│   │           └── fairseq_simul_st_agent.py
│   ├── stories/
│   │   └── README.md
│   ├── textless_nlp/
│   │   ├── dgslm/
│   │   │   ├── README.md
│   │   │   ├── create_code_file.py
│   │   │   ├── dgslm_utils.py
│   │   │   ├── hubert_fisher/
│   │   │   │   └── README.md
│   │   │   ├── sample_speech_dlm.py
│   │   │   └── vocoder_hifigan/
│   │   │       ├── README.md
│   │   │       └── generate_stereo_waveform.py
│   │   ├── gslm/
│   │   │   ├── README.md
│   │   │   ├── metrics/
│   │   │   │   ├── README.md
│   │   │   │   ├── abx_metrics/
│   │   │   │   │   ├── README.md
│   │   │   │   │   └── dump_abx_feats.py
│   │   │   │   └── asr_metrics/
│   │   │   │       ├── README.md
│   │   │   │       ├── continuation_eval.py
│   │   │   │       ├── misc/
│   │   │   │       │   ├── bleu_utils.py
│   │   │   │       │   ├── cut_as.py
│   │   │   │       │   └── dict.ltr.txt
│   │   │   │       ├── ppx.py
│   │   │   │       └── self_auto_bleu.py
│   │   │   ├── speech2unit/
│   │   │   │   ├── README.md
│   │   │   │   ├── __init__.py
│   │   │   │   ├── clustering/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── cluster_kmeans.py
│   │   │   │   │   ├── dump_feats.py
│   │   │   │   │   ├── quantize_with_kmeans.py
│   │   │   │   │   └── utils.py
│   │   │   │   └── pretrained/
│   │   │   │       ├── cpc_feature_reader.py
│   │   │   │       ├── hubert_feature_reader.py
│   │   │   │       ├── logmel_feature_reader.py
│   │   │   │       ├── utils.py
│   │   │   │       └── w2v2_feature_reader.py
│   │   │   ├── tools/
│   │   │   │   ├── README.md
│   │   │   │   └── resynthesize_speech.py
│   │   │   ├── ulm/
│   │   │   │   ├── README.md
│   │   │   │   └── sample.py
│   │   │   └── unit2speech/
│   │   │       ├── README.md
│   │   │       ├── convert_to_16k.py
│   │   │       ├── glow.py
│   │   │       ├── multiproc.py
│   │   │       ├── synthesize_audio_from_units.py
│   │   │       ├── tacotron2/
│   │   │       │   ├── __init__.py
│   │   │       │   ├── audio_processing.py
│   │   │       │   ├── cleaners.py
│   │   │       │   ├── cmudict.py
│   │   │       │   ├── layers.py
│   │   │       │   ├── model.py
│   │   │       │   ├── numbers.py
│   │   │       │   ├── stft.py
│   │   │       │   ├── symbols.py
│   │   │       │   ├── text.py
│   │   │       │   ├── utils.py
│   │   │       │   └── waveglow_denoiser.py
│   │   │       ├── tts_data.py
│   │   │       └── utils.py
│   │   ├── pgslm/
│   │   │   ├── README.md
│   │   │   ├── data_utils.py
│   │   │   ├── eval/
│   │   │   │   ├── __init__.py
│   │   │   │   └── cont_metrics.py
│   │   │   ├── generate_waveform.py
│   │   │   ├── inference_dataset.py
│   │   │   ├── naive_decoder.py
│   │   │   ├── prepare_dataset.py
│   │   │   ├── preprocess_f0.py
│   │   │   ├── quantize_f0.py
│   │   │   ├── sample/
│   │   │   │   ├── __init__.py
│   │   │   │   └── sample.py
│   │   │   ├── scripts/
│   │   │   │   ├── join_units_manifest.py
│   │   │   │   ├── prepare_data.sh
│   │   │   │   └── prepare_f0_quantization.sh
│   │   │   └── truncated_laplace.py
│   │   └── speech-resynth/
│   │       └── README.md
│   ├── translation/
│   │   ├── README.md
│   │   ├── prepare-iwslt14.sh
│   │   ├── prepare-iwslt17-multilingual.sh
│   │   ├── prepare-wmt14en2de.sh
│   │   └── prepare-wmt14en2fr.sh
│   ├── translation_moe/
│   │   ├── README.md
│   │   ├── score.py
│   │   └── translation_moe_src/
│   │       ├── __init__.py
│   │       ├── logsumexp_moe.py
│   │       ├── mean_pool_gating_network.py
│   │       └── translation_moe.py
│   ├── truncated_bptt/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── transformer_xl_model.py
│   │   └── truncated_bptt_lm_task.py
│   ├── unsupervised_quality_estimation/
│   │   ├── README.md
│   │   ├── aggregate_scores.py
│   │   ├── meteor.py
│   │   └── repeat_lines.py
│   ├── wav2vec/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── finetuning/
│   │   │   │   ├── base_100h.yaml
│   │   │   │   ├── base_10h.yaml
│   │   │   │   ├── base_10m.yaml
│   │   │   │   ├── base_1h.yaml
│   │   │   │   ├── base_960h.yaml
│   │   │   │   ├── run_config/
│   │   │   │   │   ├── slurm_1.yaml
│   │   │   │   │   ├── slurm_16.yaml
│   │   │   │   │   ├── slurm_1_aws.yaml
│   │   │   │   │   ├── slurm_1_old.yaml
│   │   │   │   │   ├── slurm_2.yaml
│   │   │   │   │   ├── slurm_2_aws.yaml
│   │   │   │   │   ├── slurm_2g.yaml
│   │   │   │   │   ├── slurm_3.yaml
│   │   │   │   │   ├── slurm_4g.yaml
│   │   │   │   │   ├── slurm_4g_aws.yaml
│   │   │   │   │   └── slurm_8.yaml
│   │   │   │   ├── vox_100h.yaml
│   │   │   │   ├── vox_100h_2.yaml
│   │   │   │   ├── vox_100h_2_aws.yaml
│   │   │   │   ├── vox_100h_3.yaml
│   │   │   │   ├── vox_10h.yaml
│   │   │   │   ├── vox_10h_2.yaml
│   │   │   │   ├── vox_10h_2_aws.yaml
│   │   │   │   ├── vox_10h_aws.yaml
│   │   │   │   ├── vox_10h_aws_v100.yaml
│   │   │   │   ├── vox_10m.yaml
│   │   │   │   ├── vox_10m_2.yaml
│   │   │   │   ├── vox_10m_2_aws.yaml
│   │   │   │   ├── vox_10m_3.yaml
│   │   │   │   ├── vox_1h.yaml
│   │   │   │   ├── vox_1h_2.yaml
│   │   │   │   ├── vox_1h_2_aws.yaml
│   │   │   │   ├── vox_1h_3.yaml
│   │   │   │   ├── vox_1h_4.yaml
│   │   │   │   ├── vox_1h_aws.yaml
│   │   │   │   ├── vox_960h.yaml
│   │   │   │   ├── vox_960h_2.yaml
│   │   │   │   ├── vox_960h_2_aws.yaml
│   │   │   │   └── vox_960h_3.yaml
│   │   │   └── pretraining/
│   │   │       ├── wav2vec2_base_librispeech.yaml
│   │   │       ├── wav2vec2_conformer_base_librispeech.yaml
│   │   │       ├── wav2vec2_conformer_large_librivox.yaml
│   │   │       ├── wav2vec2_large_librivox.yaml
│   │   │       ├── wav2vec2_large_librivox_tpu-pod.yaml
│   │   │       └── wav2vec2_large_librivox_tpu.yaml
│   │   ├── libri_labels.py
│   │   ├── scripts/
│   │   │   └── binarize_manifest.sh
│   │   ├── unsupervised/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── config/
│   │   │   │   ├── finetuning/
│   │   │   │   │   └── w2v_finetune.yaml
│   │   │   │   ├── gan/
│   │   │   │   │   ├── w2vu.yaml
│   │   │   │   │   └── w2vu2.yaml
│   │   │   │   ├── generate/
│   │   │   │   │   └── viterbi.yaml
│   │   │   │   ├── timit_matched/
│   │   │   │   │   ├── test.uid
│   │   │   │   │   ├── train.uid
│   │   │   │   │   ├── train_text.uid
│   │   │   │   │   └── valid.uid
│   │   │   │   └── timit_unmatched/
│   │   │   │       ├── test.uid
│   │   │   │       ├── train.uid
│   │   │   │       ├── train_text.uid
│   │   │   │       └── valid.uid
│   │   │   ├── data/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── extracted_features_dataset.py
│   │   │   │   └── random_input_dataset.py
│   │   │   ├── kaldi_self_train/
│   │   │   │   ├── README.md
│   │   │   │   └── st/
│   │   │   │       ├── cmd.sh
│   │   │   │       ├── decode_phone.sh
│   │   │   │       ├── decode_word_step1.sh
│   │   │   │       ├── decode_word_step2.sh
│   │   │   │       ├── local/
│   │   │   │       │   ├── copy_aligned_text.py
│   │   │   │       │   ├── decode.sh
│   │   │   │       │   ├── prepare_data_from_w2v.py
│   │   │   │       │   ├── prepare_lang.sh
│   │   │   │       │   ├── prepare_lang_word.sh
│   │   │   │       │   ├── prepare_lm.sh
│   │   │   │       │   ├── score.sh
│   │   │   │       │   ├── show_wer.sh
│   │   │   │       │   ├── train_subset_lgbeam.sh
│   │   │   │       │   ├── unsup_select.py
│   │   │   │       │   ├── unsup_select_decode.sh
│   │   │   │       │   └── unsup_select_decode_word.sh
│   │   │   │       ├── path.sh
│   │   │   │       ├── steps_gan/
│   │   │   │       │   ├── train_deltas.sh
│   │   │   │       │   ├── train_lda_mllt.sh
│   │   │   │       │   └── train_sat.sh
│   │   │   │       └── train.sh
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   └── wav2vec_u.py
│   │   │   ├── scripts/
│   │   │   │   ├── apply_pca.py
│   │   │   │   ├── copy_labels.py
│   │   │   │   ├── filter_lexicon.py
│   │   │   │   ├── filter_tsv.py
│   │   │   │   ├── g2p_wrd_to_phn.py
│   │   │   │   ├── ltr_to_wrd.py
│   │   │   │   ├── mean_pool.py
│   │   │   │   ├── merge_clusters.py
│   │   │   │   ├── normalize_and_filter_text.py
│   │   │   │   ├── normalize_text.py
│   │   │   │   ├── pca.py
│   │   │   │   ├── phonemize_with_sil.py
│   │   │   │   ├── prepare_audio.sh
│   │   │   │   ├── prepare_audio_v2.sh
│   │   │   │   ├── prepare_text.sh
│   │   │   │   ├── prepare_timit.sh
│   │   │   │   ├── remove_silence.py
│   │   │   │   ├── vads.py
│   │   │   │   ├── wav2vec_apply_cluster_faiss.py
│   │   │   │   ├── wav2vec_cluster_faiss.py
│   │   │   │   ├── wav2vec_extract_features.py
│   │   │   │   ├── wer.py
│   │   │   │   └── wrd_to_ltr.py
│   │   │   ├── tasks/
│   │   │   │   ├── __init__.py
│   │   │   │   └── unpaired_audio_text.py
│   │   │   └── w2vu_generate.py
│   │   ├── vq-wav2vec_featurize.py
│   │   ├── wav2vec_featurize.py
│   │   ├── wav2vec_manifest.py
│   │   └── xlsr/
│   │       ├── README.md
│   │       ├── config/
│   │       │   └── finetune.yaml
│   │       └── scripts/
│   │           ├── eval_speaker_clf_task.py
│   │           └── gen_audio_embedding.py
│   ├── wmt19/
│   │   └── README.md
│   ├── wmt20/
│   │   └── README.md
│   ├── wmt21/
│   │   ├── README.md
│   │   ├── eval.sh
│   │   └── scripts/
│   │       ├── normalize-punctuation.perl
│   │       └── replace-unicode-punctuation.perl
│   ├── womens_bios/
│   │   ├── README.md
│   │   └── query_occupations_from_wikidata.py
│   ├── xformers/
│   │   └── README.md
│   ├── xglm/
│   │   ├── README.md
│   │   ├── XStoryCloze.md
│   │   └── model_card.md
│   ├── xlmr/
│   │   └── README.md
│   └── xmod/
│       ├── README.md
│       └── preprocess_nli.py
├── fairseq/
│   ├── __init__.py
│   ├── benchmark/
│   │   ├── __init__.py
│   │   ├── benchmark_multihead_attention.py
│   │   ├── dummy_dataset.py
│   │   ├── dummy_lm.py
│   │   ├── dummy_masked_lm.py
│   │   ├── dummy_model.py
│   │   └── dummy_mt.py
│   ├── binarizer.py
│   ├── checkpoint_utils.py
│   ├── clib/
│   │   ├── cuda/
│   │   │   ├── ngram_repeat_block_cuda.cpp
│   │   │   └── ngram_repeat_block_cuda_kernel.cu
│   │   ├── libbase/
│   │   │   └── balanced_assignment.cpp
│   │   ├── libbleu/
│   │   │   ├── libbleu.cpp
│   │   │   └── module.cpp
│   │   ├── libnat/
│   │   │   └── edit_dist.cpp
│   │   └── libnat_cuda/
│   │       ├── binding.cpp
│   │       ├── edit_dist.cu
│   │       └── edit_dist.h
│   ├── config/
│   │   ├── __init__.py
│   │   ├── config.yaml
│   │   ├── fb_run_config/
│   │   │   └── slurm.yaml
│   │   └── model/
│   │       ├── transformer_lm/
│   │       │   ├── transformer_lm_baevski_gbw.yaml
│   │       │   ├── transformer_lm_baevski_wiki103.yaml
│   │       │   ├── transformer_lm_big.yaml
│   │       │   ├── transformer_lm_gbw.yaml
│   │       │   ├── transformer_lm_gpt.yaml
│   │       │   ├── transformer_lm_gpt2_big.yaml
│   │       │   ├── transformer_lm_gpt2_medium.yaml
│   │       │   ├── transformer_lm_gpt2_small.yaml
│   │       │   └── transformer_lm_wiki103.yaml
│   │       ├── wav2vec/
│   │       │   └── vq_wav2vec_gumbel.yaml
│   │       └── wav2vec2/
│   │           ├── wav2vec2_base.yaml
│   │           └── wav2vec2_large.yaml
│   ├── criterions/
│   │   ├── __init__.py
│   │   ├── adaptive_loss.py
│   │   ├── composite_loss.py
│   │   ├── cross_entropy.py
│   │   ├── ctc.py
│   │   ├── fairseq_criterion.py
│   │   ├── fastspeech2_loss.py
│   │   ├── hubert_criterion.py
│   │   ├── label_smoothed_cross_entropy.py
│   │   ├── label_smoothed_cross_entropy_latency_augmented.py
│   │   ├── label_smoothed_cross_entropy_with_alignment.py
│   │   ├── label_smoothed_cross_entropy_with_ctc.py
│   │   ├── label_smoothed_cross_entropy_with_rdrop.py
│   │   ├── legacy_masked_lm.py
│   │   ├── masked_lm.py
│   │   ├── model_criterion.py
│   │   ├── nat_loss.py
│   │   ├── sentence_prediction.py
│   │   ├── sentence_prediction_adapters.py
│   │   ├── sentence_ranking.py
│   │   ├── speech_dlm_criterion.py
│   │   ├── speech_to_speech_criterion.py
│   │   ├── speech_ulm_criterion.py
│   │   ├── tacotron2_loss.py
│   │   └── wav2vec_criterion.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── add_class_target_dataset.py
│   │   ├── add_target_dataset.py
│   │   ├── append_token_dataset.py
│   │   ├── audio/
│   │   │   ├── __init__.py
│   │   │   ├── audio_utils.py
│   │   │   ├── data_cfg.py
│   │   │   ├── dataset_transforms/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── concataugment.py
│   │   │   │   └── noisyoverlapaugment.py
│   │   │   ├── feature_transforms/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── delta_deltas.py
│   │   │   │   ├── global_cmvn.py
│   │   │   │   ├── specaugment.py
│   │   │   │   └── utterance_cmvn.py
│   │   │   ├── frm_text_to_speech_dataset.py
│   │   │   ├── hubert_dataset.py
│   │   │   ├── multi_modality_dataset.py
│   │   │   ├── raw_audio_dataset.py
│   │   │   ├── speech_to_speech_dataset.py
│   │   │   ├── speech_to_text_dataset.py
│   │   │   ├── speech_to_text_joint_dataset.py
│   │   │   ├── text_to_speech_dataset.py
│   │   │   └── waveform_transforms/
│   │   │       ├── __init__.py
│   │   │       └── noiseaugment.py
│   │   ├── backtranslation_dataset.py
│   │   ├── base_wrapper_dataset.py
│   │   ├── bucket_pad_length_dataset.py
│   │   ├── codedataset.py
│   │   ├── colorize_dataset.py
│   │   ├── concat_dataset.py
│   │   ├── concat_sentences_dataset.py
│   │   ├── data_utils.py
│   │   ├── data_utils_fast.pyx
│   │   ├── denoising_dataset.py
│   │   ├── dictionary.py
│   │   ├── encoders/
│   │   │   ├── __init__.py
│   │   │   ├── byte_bpe.py
│   │   │   ├── byte_utils.py
│   │   │   ├── bytes.py
│   │   │   ├── characters.py
│   │   │   ├── fastbpe.py
│   │   │   ├── gpt2_bpe.py
│   │   │   ├── gpt2_bpe_utils.py
│   │   │   ├── hf_bert_bpe.py
│   │   │   ├── hf_byte_bpe.py
│   │   │   ├── moses_tokenizer.py
│   │   │   ├── nltk_tokenizer.py
│   │   │   ├── sentencepiece_bpe.py
│   │   │   ├── space_tokenizer.py
│   │   │   ├── subword_nmt_bpe.py
│   │   │   └── utils.py
│   │   ├── fairseq_dataset.py
│   │   ├── fasta_dataset.py
│   │   ├── huffman/
│   │   │   ├── __init__.py
│   │   │   ├── huffman_coder.py
│   │   │   └── huffman_mmap_indexed_dataset.py
│   │   ├── id_dataset.py
│   │   ├── indexed_dataset.py
│   │   ├── iterators.py
│   │   ├── language_pair_dataset.py
│   │   ├── legacy/
│   │   │   ├── __init__.py
│   │   │   ├── block_pair_dataset.py
│   │   │   ├── masked_lm_dataset.py
│   │   │   └── masked_lm_dictionary.py
│   │   ├── list_dataset.py
│   │   ├── lm_context_window_dataset.py
│   │   ├── lru_cache_dataset.py
│   │   ├── mask_tokens_dataset.py
│   │   ├── monolingual_dataset.py
│   │   ├── multi_corpus_dataset.py
│   │   ├── multi_corpus_sampled_dataset.py
│   │   ├── multilingual/
│   │   │   ├── __init__.py
│   │   │   ├── multilingual_data_manager.py
│   │   │   ├── multilingual_utils.py
│   │   │   ├── sampled_multi_dataset.py
│   │   │   ├── sampled_multi_epoch_dataset.py
│   │   │   └── sampling_method.py
│   │   ├── nested_dictionary_dataset.py
│   │   ├── noising.py
│   │   ├── num_samples_dataset.py
│   │   ├── numel_dataset.py
│   │   ├── offset_tokens_dataset.py
│   │   ├── pad_dataset.py
│   │   ├── padding_mask_dataset.py
│   │   ├── plasma_utils.py
│   │   ├── prepend_dataset.py
│   │   ├── prepend_token_dataset.py
│   │   ├── raw_label_dataset.py
│   │   ├── replace_dataset.py
│   │   ├── resampling_dataset.py
│   │   ├── roll_dataset.py
│   │   ├── round_robin_zip_datasets.py
│   │   ├── shorten_dataset.py
│   │   ├── sort_dataset.py
│   │   ├── span_mask_tokens_dataset.py
│   │   ├── speech_dlm_dataset.py
│   │   ├── strip_token_dataset.py
│   │   ├── subsample_dataset.py
│   │   ├── text_compressor.py
│   │   ├── token_block_dataset.py
│   │   ├── token_block_utils_fast.pyx
│   │   ├── transform_eos_concat_langpair_dataset.py
│   │   ├── transform_eos_dataset.py
│   │   └── transform_eos_lang_pair_dataset.py
│   ├── dataclass/
│   │   ├── __init__.py
│   │   ├── configs.py
│   │   ├── constants.py
│   │   ├── initialize.py
│   │   └── utils.py
│   ├── distributed/
│   │   ├── __init__.py
│   │   ├── distributed_timeout_wrapper.py
│   │   ├── fully_sharded_data_parallel.py
│   │   ├── legacy_distributed_data_parallel.py
│   │   ├── module_proxy_wrapper.py
│   │   ├── tpu_distributed_data_parallel.py
│   │   └── utils.py
│   ├── file_chunker_utils.py
│   ├── file_io.py
│   ├── file_utils.py
│   ├── hub_utils.py
│   ├── incremental_decoding_utils.py
│   ├── iterative_refinement_generator.py
│   ├── logging/
│   │   ├── __init__.py
│   │   ├── meters.py
│   │   ├── metrics.py
│   │   └── progress_bar.py
│   ├── model_parallel/
│   │   ├── __init__.py
│   │   ├── criterions/
│   │   │   ├── __init__.py
│   │   │   └── vocab_parallel_cross_entropy.py
│   │   ├── megatron_trainer.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── pipeline_parallel_transformer/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── layers.py
│   │   │   │   └── model.py
│   │   │   ├── roberta/
│   │   │   │   ├── __init__.py
│   │   │   │   └── model.py
│   │   │   ├── transformer.py
│   │   │   └── transformer_lm.py
│   │   └── modules/
│   │       ├── __init__.py
│   │       ├── multihead_attention.py
│   │       └── transformer_layer.py
│   ├── models/
│   │   ├── __init__.py
│   │   ├── bart/
│   │   │   ├── __init__.py
│   │   │   ├── hub_interface.py
│   │   │   └── model.py
│   │   ├── composite_encoder.py
│   │   ├── distributed_fairseq_model.py
│   │   ├── ema/
│   │   │   ├── __init__.py
│   │   │   └── ema.py
│   │   ├── fairseq_decoder.py
│   │   ├── fairseq_encoder.py
│   │   ├── fairseq_incremental_decoder.py
│   │   ├── fairseq_model.py
│   │   ├── fconv.py
│   │   ├── fconv_lm.py
│   │   ├── fconv_self_att.py
│   │   ├── hubert/
│   │   │   ├── __init__.py
│   │   │   ├── hubert.py
│   │   │   └── hubert_asr.py
│   │   ├── huggingface/
│   │   │   ├── __init__.py
│   │   │   └── hf_gpt2.py
│   │   ├── lightconv.py
│   │   ├── lightconv_lm.py
│   │   ├── lstm.py
│   │   ├── lstm_lm.py
│   │   ├── masked_lm.py
│   │   ├── model_utils.py
│   │   ├── multilingual_transformer.py
│   │   ├── multires_hubert/
│   │   │   ├── __init__.py
│   │   │   ├── multires_hubert.py
│   │   │   └── multires_hubert_asr.py
│   │   ├── nat/
│   │   │   ├── __init__.py
│   │   │   ├── cmlm_transformer.py
│   │   │   ├── fairseq_nat_model.py
│   │   │   ├── insertion_transformer.py
│   │   │   ├── iterative_nonautoregressive_transformer.py
│   │   │   ├── levenshtein_transformer.py
│   │   │   ├── levenshtein_utils.py
│   │   │   ├── nat_crf_transformer.py
│   │   │   ├── nonautoregressive_ensembles.py
│   │   │   └── nonautoregressive_transformer.py
│   │   ├── roberta/
│   │   │   ├── __init__.py
│   │   │   ├── alignment_utils.py
│   │   │   ├── enc_dec.py
│   │   │   ├── hub_interface.py
│   │   │   ├── model.py
│   │   │   ├── model_camembert.py
│   │   │   ├── model_gottbert.py
│   │   │   └── model_xlmr.py
│   │   ├── speech_dlm/
│   │   │   ├── __init__.py
│   │   │   ├── hub_interface.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── speech_dlm_decoder.py
│   │   │   │   └── speech_dlm_decoder_layer.py
│   │   │   ├── sequence_generator/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── multichannel_search.py
│   │   │   │   └── multichannel_sequence_generator.py
│   │   │   └── speech_dlm.py
│   │   ├── speech_to_speech/
│   │   │   ├── __init__.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── ctc_decoder.py
│   │   │   │   ├── stacked_embedding.py
│   │   │   │   ├── transformer_decoder_aug.py
│   │   │   │   └── transformer_encoder.py
│   │   │   ├── s2s_conformer.py
│   │   │   ├── s2s_conformer_translatotron2.py
│   │   │   ├── s2s_conformer_unity.py
│   │   │   └── s2s_transformer.py
│   │   ├── speech_to_text/
│   │   │   ├── __init__.py
│   │   │   ├── berard.py
│   │   │   ├── convtransformer.py
│   │   │   ├── hub_interface.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── augmented_memory_attention.py
│   │   │   │   ├── convolution.py
│   │   │   │   └── emformer.py
│   │   │   ├── multi_modality_model.py
│   │   │   ├── s2t_conformer.py
│   │   │   ├── s2t_transformer.py
│   │   │   ├── s2t_wav_transformer.py
│   │   │   ├── utils.py
│   │   │   ├── xm_transformer.py
│   │   │   └── xm_transformer_unity.py
│   │   ├── text_to_speech/
│   │   │   ├── __init__.py
│   │   │   ├── codehifigan.py
│   │   │   ├── fastspeech2.py
│   │   │   ├── hifigan.py
│   │   │   ├── hub_interface.py
│   │   │   ├── tacotron2.py
│   │   │   ├── tts_transformer.py
│   │   │   └── vocoder.py
│   │   ├── transformer/
│   │   │   ├── __init__.py
│   │   │   ├── transformer_base.py
│   │   │   ├── transformer_config.py
│   │   │   ├── transformer_decoder.py
│   │   │   ├── transformer_decoder_aug.py
│   │   │   ├── transformer_encoder.py
│   │   │   └── transformer_legacy.py
│   │   ├── transformer_align.py
│   │   ├── transformer_from_pretrained_xlm.py
│   │   ├── transformer_lm.py
│   │   ├── transformer_ulm.py
│   │   ├── wav2vec/
│   │   │   ├── __init__.py
│   │   │   ├── utils.py
│   │   │   ├── wav2vec.py
│   │   │   ├── wav2vec2.py
│   │   │   ├── wav2vec2_asr.py
│   │   │   ├── wav2vec2_classification.py
│   │   │   └── wav2vec2_laser.py
│   │   └── xmod/
│   │       ├── __init__.py
│   │       ├── hub_interface.py
│   │       ├── model.py
│   │       └── transformer_layer_xmod.py
│   ├── modules/
│   │   ├── __init__.py
│   │   ├── adaptive_input.py
│   │   ├── adaptive_softmax.py
│   │   ├── base_layer.py
│   │   ├── beamable_mm.py
│   │   ├── character_token_embedder.py
│   │   ├── checkpoint_activations.py
│   │   ├── conformer_layer.py
│   │   ├── conv_tbc.py
│   │   ├── cross_entropy.py
│   │   ├── cuda_utils.cu
│   │   ├── downsampled_multihead_attention.py
│   │   ├── dynamic_convolution.py
│   │   ├── dynamic_crf_layer.py
│   │   ├── dynamicconv_layer/
│   │   │   ├── __init__.py
│   │   │   ├── cuda_function_gen.py
│   │   │   ├── dynamicconv_cuda.cpp
│   │   │   ├── dynamicconv_cuda.cuh
│   │   │   ├── dynamicconv_cuda_kernel.cu
│   │   │   ├── dynamicconv_layer.py
│   │   │   ├── dynamiconv_cpu.cpp
│   │   │   └── setup.py
│   │   ├── ema_module.py
│   │   ├── espnet_multihead_attention.py
│   │   ├── fairseq_dropout.py
│   │   ├── fp32_batch_norm.py
│   │   ├── fp32_group_norm.py
│   │   ├── fp32_instance_norm.py
│   │   ├── gelu.py
│   │   ├── grad_multiply.py
│   │   ├── gumbel_vector_quantizer.py
│   │   ├── kmeans_attention.py
│   │   ├── kmeans_vector_quantizer.py
│   │   ├── layer_drop.py
│   │   ├── layer_norm.py
│   │   ├── learned_positional_embedding.py
│   │   ├── lightconv_layer/
│   │   │   ├── __init__.py
│   │   │   ├── cuda_function_gen.py
│   │   │   ├── lightconv_cuda.cpp
│   │   │   ├── lightconv_cuda.cuh
│   │   │   ├── lightconv_cuda_kernel.cu
│   │   │   ├── lightconv_layer.py
│   │   │   └── setup.py
│   │   ├── lightweight_convolution.py
│   │   ├── linearized_convolution.py
│   │   ├── location_attention.py
│   │   ├── lstm_cell_with_zoneout.py
│   │   ├── multihead_attention.py
│   │   ├── positional_embedding.py
│   │   ├── positional_encoding.py
│   │   ├── quant_noise.py
│   │   ├── quantization/
│   │   │   ├── __init__.py
│   │   │   ├── pq/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── em.py
│   │   │   │   ├── modules/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── qconv.py
│   │   │   │   │   ├── qemb.py
│   │   │   │   │   └── qlinear.py
│   │   │   │   ├── pq.py
│   │   │   │   └── utils.py
│   │   │   ├── quantization_options.py
│   │   │   └── scalar/
│   │   │       ├── __init__.py
│   │   │       ├── modules/
│   │   │       │   ├── __init__.py
│   │   │       │   ├── qact.py
│   │   │       │   ├── qconv.py
│   │   │       │   ├── qemb.py
│   │   │       │   └── qlinear.py
│   │   │       ├── ops.py
│   │   │       └── utils.py
│   │   ├── rotary_positional_embedding.py
│   │   ├── same_pad.py
│   │   ├── scalar_bias.py
│   │   ├── sinusoidal_positional_embedding.py
│   │   ├── sparse_multihead_attention.py
│   │   ├── sparse_transformer_sentence_encoder.py
│   │   ├── sparse_transformer_sentence_encoder_layer.py
│   │   ├── transformer_layer.py
│   │   ├── transformer_layer_aug.py
│   │   ├── transformer_sentence_encoder.py
│   │   ├── transformer_sentence_encoder_layer.py
│   │   ├── transpose_last.py
│   │   ├── unfold.py
│   │   └── vggblock.py
│   ├── nan_detector.py
│   ├── ngram_repeat_block.py
│   ├── optim/
│   │   ├── __init__.py
│   │   ├── adadelta.py
│   │   ├── adafactor.py
│   │   ├── adagrad.py
│   │   ├── adam.py
│   │   ├── adamax.py
│   │   ├── amp_optimizer.py
│   │   ├── bmuf.py
│   │   ├── composite.py
│   │   ├── cpu_adam.py
│   │   ├── dynamic_loss_scaler.py
│   │   ├── fairseq_optimizer.py
│   │   ├── fp16_optimizer.py
│   │   ├── fused_adam.py
│   │   ├── fused_lamb.py
│   │   ├── lr_scheduler/
│   │   │   ├── __init__.py
│   │   │   ├── cosine_lr_scheduler.py
│   │   │   ├── fairseq_lr_scheduler.py
│   │   │   ├── fixed_schedule.py
│   │   │   ├── inverse_square_root_schedule.py
│   │   │   ├── manual_lr_scheduler.py
│   │   │   ├── pass_through.py
│   │   │   ├── polynomial_decay_schedule.py
│   │   │   ├── reduce_lr_on_plateau.py
│   │   │   ├── step_lr_scheduler.py
│   │   │   ├── tri_stage_lr_scheduler.py
│   │   │   └── triangular_lr_scheduler.py
│   │   ├── nag.py
│   │   ├── sgd.py
│   │   └── shard.py
│   ├── options.py
│   ├── pdb.py
│   ├── quantization_utils.py
│   ├── registry.py
│   ├── scoring/
│   │   ├── __init__.py
│   │   ├── bertscore.py
│   │   ├── bleu.py
│   │   ├── chrf.py
│   │   ├── meteor.py
│   │   ├── tokenizer.py
│   │   └── wer.py
│   ├── search.py
│   ├── sequence_generator.py
│   ├── sequence_scorer.py
│   ├── speech_generator.py
│   ├── tasks/
│   │   ├── __init__.py
│   │   ├── audio_classification.py
│   │   ├── audio_finetuning.py
│   │   ├── audio_pretraining.py
│   │   ├── cross_lingual_lm.py
│   │   ├── denoising.py
│   │   ├── fairseq_task.py
│   │   ├── frm_text_to_speech.py
│   │   ├── hubert_pretraining.py
│   │   ├── language_modeling.py
│   │   ├── legacy_masked_lm.py
│   │   ├── masked_lm.py
│   │   ├── multilingual_denoising.py
│   │   ├── multilingual_language_modeling.py
│   │   ├── multilingual_masked_lm.py
│   │   ├── multilingual_translation.py
│   │   ├── multires_hubert_pretraining.py
│   │   ├── nlu_finetuning.py
│   │   ├── online_backtranslation.py
│   │   ├── semisupervised_translation.py
│   │   ├── sentence_prediction.py
│   │   ├── sentence_prediction_adapters.py
│   │   ├── sentence_ranking.py
│   │   ├── simultaneous_translation.py
│   │   ├── span_masked_lm.py
│   │   ├── speech_dlm_task.py
│   │   ├── speech_to_speech.py
│   │   ├── speech_to_text.py
│   │   ├── speech_ulm_task.py
│   │   ├── text_to_speech.py
│   │   ├── translation.py
│   │   ├── translation_from_pretrained_bart.py
│   │   ├── translation_from_pretrained_xlm.py
│   │   ├── translation_lev.py
│   │   └── translation_multi_simple_epoch.py
│   ├── token_generation_constraints.py
│   ├── tokenizer.py
│   ├── trainer.py
│   ├── utils.py
│   └── version.txt
├── fairseq_cli/
│   ├── __init__.py
│   ├── eval_lm.py
│   ├── generate.py
│   ├── hydra_train.py
│   ├── hydra_validate.py
│   ├── interactive.py
│   ├── preprocess.py
│   ├── score.py
│   ├── train.py
│   └── validate.py
├── hubconf.py
├── hydra_plugins/
│   └── dependency_submitit_launcher/
│       ├── hydra_plugins/
│       │   └── dependency_submitit_launcher/
│       │       ├── __init__.py
│       │       ├── config.py
│       │       └── launcher.py
│       └── setup.py
├── pyproject.toml
├── release_utils.py
├── scripts/
│   ├── __init__.py
│   ├── average_checkpoints.py
│   ├── build_sym_alignment.py
│   ├── check_installation.py
│   ├── compare_namespaces.py
│   ├── compound_split_bleu.sh
│   ├── constraints/
│   │   ├── extract.py
│   │   └── validate.py
│   ├── convert_dictionary.lua
│   ├── convert_model.lua
│   ├── count_docs.py
│   ├── read_binarized.py
│   ├── rm_pt.py
│   ├── sacrebleu.sh
│   ├── shard_docs.py
│   ├── split_train_valid_docs.py
│   ├── spm_decode.py
│   ├── spm_encode.py
│   ├── spm_train.py
│   └── test_fsdp.sh
├── setup.cfg
├── setup.py
├── tests/
│   ├── __init__.py
│   ├── distributed/
│   │   ├── __init__.py
│   │   ├── test_bmuf.py
│   │   ├── test_distributed_timeout_wrapper.py
│   │   ├── test_module_proxy_wrapper.py
│   │   ├── test_utils.py
│   │   └── utils.py
│   ├── gpu/
│   │   ├── __init__.py
│   │   ├── test_binaries_gpu.py
│   │   ├── test_ema_gpu.py
│   │   └── transformer_quantization_config.yaml
│   ├── speech/
│   │   ├── __init__.py
│   │   ├── test_convtransformer_simul_trans.py
│   │   ├── test_dual_input_wav_transformer.py
│   │   ├── test_dualinput_s2t_transformer.py
│   │   ├── test_fastspeech2.py
│   │   ├── test_s2s_transformer.py
│   │   ├── test_s2t_conformer.py
│   │   ├── test_s2t_transformer.py
│   │   ├── test_tts_transformer.py
│   │   ├── test_wav2vec2.py
│   │   └── test_xm_transformer.py
│   ├── speech_recognition/
│   │   ├── __init__.py
│   │   ├── asr_test_base.py
│   │   ├── test_collaters.py
│   │   ├── test_cross_entropy.py
│   │   ├── test_data_utils.py
│   │   └── test_vggtransformer.py
│   ├── tasks/
│   │   ├── test_denoising.py
│   │   ├── test_masked_lm.py
│   │   ├── test_multilingual_denoising.py
│   │   └── test_span_masked_lm.py
│   ├── test_activation_checkpointing.py
│   ├── test_amp_optimizer.py
│   ├── test_average_checkpoints.py
│   ├── test_backtranslation_dataset.py
│   ├── test_binaries.py
│   ├── test_binarizer.py
│   ├── test_character_token_embedder.py
│   ├── test_checkpoint_utils.py
│   ├── test_checkpoint_utils_for_task_level_attributes.py
│   ├── test_concat_dataset.py
│   ├── test_constraints.py
│   ├── test_convtbc.py
│   ├── test_data_utils.py
│   ├── test_dataclass_utils.py
│   ├── test_dataset.py
│   ├── test_dictionary.py
│   ├── test_ema.py
│   ├── test_espnet_multihead_attention.py
│   ├── test_export.py
│   ├── test_file_chunker_utils.py
│   ├── test_file_io.py
│   ├── test_fp16_optimizer.py
│   ├── test_hf_hub.py
│   ├── test_huffman.py
│   ├── test_inference_dropout.py
│   ├── test_iopath.py
│   ├── test_iterators.py
│   ├── test_label_smoothing.py
│   ├── test_lm_context_window.py
│   ├── test_lstm_jitable.py
│   ├── test_memory_efficient_fp16.py
│   ├── test_metrics.py
│   ├── test_multi_corpus_dataset.py
│   ├── test_multi_corpus_sampled_dataset.py
│   ├── test_multihead_attention.py
│   ├── test_noising.py
│   ├── test_online_backtranslation.py
│   ├── test_plasma_utils.py
│   ├── test_positional_encoding.py
│   ├── test_reproducibility.py
│   ├── test_resampling_dataset.py
│   ├── test_roberta.py
│   ├── test_rotary_positional_embedding.py
│   ├── test_sequence_generator.py
│   ├── test_sequence_scorer.py
│   ├── test_sparse_multihead_attention.py
│   ├── test_token_block_dataset.py
│   ├── test_train.py
│   ├── test_transformer.py
│   ├── test_utils.py
│   ├── test_valid_subset_checks.py
│   └── utils.py
└── train.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/CODEOWNERS
================================================
# Setting up CODEOWNERS for UST related codebase
# Documentation for open sourced models relevant to UST
examples/speech_to_text     @kahne @sravyapopuri388 @jmp84
examples/speech_to_speech   @an918tw @sravyapopuri388 @jmp84
examples/speech_synthesis   @kahne @jmp84
examples/simultaneous_translation   @kahne @jmp84
examples/speech_text_joint_to_text  @yuntang @jmp84

# Speech related models relevant to UST
fairseq/models/speech_to_speech @sravyapopuri388 @jmp84
fairseq/models/speech_to_text   @kahne @sravyapopuri388 @jmp84
fairseq/models/text_to_speech   @kahne @jmp84

# CONFORMER IMPLEMENTATION
fairseq/modules/conformer_layer.py @sravyapopuri388 @jmp84
fairseq/modules/espnet_multihead_attention.py @sravyapopuri388 @jmp84
fairseq/modules/rotary_positional_embedding.py @sravyapopuri388 @jmp84
fairseq/modules/positional_encoding.py @sravyapopuri388 @jmp84

# Machine Translation/NLLB
fairseq/tasks/translation.py @gwenzek


================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: 🐛 Bug Report
about: Submit a bug report to help us improve
labels: 'bug, needs triage'
---

## 🐛 Bug

<!-- A clear and concise description of what the bug is. -->

### To Reproduce

Steps to reproduce the behavior (**always include the command you ran**):

1. Run cmd '....'
2. See error

<!-- If you have a code sample, error messages, stack traces, please provide it here as well -->


#### Code sample
<!-- Ideally attach a minimal code sample to reproduce the described issue.
Minimal means having the shortest code but still preserving the bug. -->

### Expected behavior

<!-- A clear and concise description of what you expected to happen. -->

### Environment

 - fairseq Version (e.g., 1.0 or main):
 - PyTorch Version (e.g., 1.0)
 - OS (e.g., Linux):
 - How you installed fairseq (`pip`, source):
 - Build command you used (if compiling from source):
 - Python version:
 - CUDA/cuDNN version:
 - GPU models and configuration:
 - Any other relevant information:

### Additional context

<!-- Add any other context about the problem here. -->


================================================
FILE: .github/ISSUE_TEMPLATE/documentation.md
================================================
---
name: 📚 Documentation/Typos
about: Report an issue related to documentation or a typo
labels: 'documentation, needs triage'
---

## 📚 Documentation

For typos and doc fixes, please go ahead and:

1. Create an issue.
2. Fix the typo.
3. Submit a PR.

Thanks!


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: 🚀 Feature Request
about: Submit a proposal/request for a new feature
labels: 'enhancement, help wanted, needs triage'
---

## 🚀 Feature Request
<!-- A clear and concise description of the feature proposal -->

### Motivation

<!-- Please outline the motivation for the proposal. Is your feature request related to a problem? e.g., I'm always frustrated when [...]. If this is related to another GitHub issue, please link here too -->

### Pitch

<!-- A clear and concise description of what you want to happen. -->

### Alternatives

<!-- A clear and concise description of any alternative solutions or features you've considered, if any. -->

### Additional context

<!-- Add any other context or screenshots about the feature request here. -->


================================================
FILE: .github/ISSUE_TEMPLATE/how-to-question.md
================================================
---
name: ❓ Questions/Help
about: If you have questions, please first search existing issues and docs
labels: 'question, needs triage'
---

## ❓ Questions and Help

### Before asking:
1. search the issues.
2. search the docs.

<!-- If you still can't find what you need: -->

#### What is your question?

#### Code

<!-- Please paste a code snippet if your question requires it! -->

#### What have you tried?

#### What's your environment?

 - fairseq Version (e.g., 1.0 or main):
 - PyTorch Version (e.g., 1.0)
 - OS (e.g., Linux):
 - How you installed fairseq (`pip`, source):
 - Build command you used (if compiling from source):
 - Python version:
 - CUDA/cuDNN version:
 - GPU models and configuration:
 - Any other relevant information:


================================================
FILE: .github/ISSUE_TEMPLATE.md
================================================
## 👉 [Please follow one of these issue templates](https://github.com/pytorch/fairseq/issues/new/choose) 👈

Note: to keep the backlog clean and actionable, issues may be immediately closed if they do not follow one of the above issue templates.


================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
# Before submitting

- [ ] Was this discussed/approved via a GitHub issue? (no need for typos, doc improvements)
- [ ] Did you read the [contributor guideline](https://github.com/pytorch/fairseq/blob/main/CONTRIBUTING.md)?
- [ ] Did you make sure to update the docs?
- [ ] Did you write any new necessary tests?

## What does this PR do?
Fixes # (issue).

## PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

## Did you have fun?
Make sure you had fun coding 🙃


================================================
FILE: .github/stale.yml
================================================
# Configuration for probot-stale - https://github.com/probot/stale
# Mostly copied from github.com/facebook/react/blob/master/.github/stale.yml
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 90
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 7
# Issues with these labels will never be considered stale
exemptLabels:
  - bug
# Label to use when marking an issue as stale
staleLabel: stale
issues:
  # Comment to post when marking an issue as stale.
  markComment: >
    This issue has been automatically marked as stale.
    **If this issue is still affecting you, please leave any comment** (for example, "bump"), and we'll keep it open.
    We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!
  # Comment to post when closing a stale issue.
  closeComment: >
    Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!
pulls:
  # Comment to post when marking a pull request as stale.
  markComment: >
    This pull request has been automatically marked as stale.
    **If this pull request is still relevant, please leave any comment** (for example, "bump"), and we'll keep it open.
    We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated.
  # Comment to post when closing a stale pull request.
  closeComment: >
    Closing this pull request after a prolonged period of inactivity. If this issue is still present in the latest release, please ask for this pull request to be reopened. Thank you!



================================================
FILE: .github/workflows/build.yml
================================================
name: build

on:
  # Trigger the workflow on push to main or any pull request
  push:
    branches:
      - main
  pull_request:

jobs:
  build:

    strategy:
      max-parallel: 4
      matrix:
        platform: [ubuntu-latest, macos-latest]
        python-version: [3.8, 3.9]

    runs-on: ${{ matrix.platform }}

    steps:
    - uses: actions/checkout@v2

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      with:
        python-version: ${{ matrix.python-version }}

    - name: Conditionally install pytorch
      if: matrix.platform == 'windows-latest'
      run: pip3 install torch -f https://download.pytorch.org/whl/torch_stable.html

    - name: Install locally
      run: |
        python -m pip install --upgrade pip
        git submodule update --init --recursive
        python -m pip install .

    - name: Check installation
      working-directory: /tmp
      run: python $GITHUB_WORKSPACE/scripts/check_installation.py

    - name: Install optional test requirements
      run: |
        python -m pip install '.[dev,docs]'
        python -m pip install iopath transformers pyarrow
        python -m pip install git+https://github.com/facebookresearch/fairscale.git@main
        python -m pip install pygit2 pgzip
        
    - name: Install xformers for macOS
      if: matrix.platform == 'macos-latest'
      run: |
        brew install llvm libomp
        CC=/usr/local/opt/llvm/bin/clang CXX=clang++ pip install git+https://github.com/facebookresearch/xformers.git@main

    - name: Install xformers for non-MacOS
      if: matrix.platform != 'macos-latest'
      run: |
        python -m pip install --progress-bar off git+https://github.com/facebookresearch/xformers.git@main

    - name: Lint with black
      run: black --check --diff .

    - name: Lint with flake8
      run: |
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics

    - name: Build doc
      run: make singlehtml
      working-directory: docs/

    - name: Run tests
      # When installing in non-editable mode, the .so files will be generated in 'site-packages/fairseq'.
      # But by default, pytest import machinery will load local fairseq, and won't see the .so.
      # Use --import-mode=append to favor the 'site-packages/fairseq'.
      # https://docs.pytest.org/en/7.1.x/explanation/pythonpath.html
      run: pytest --import-mode=append -vvv tests/



================================================
FILE: .github/workflows/depreview.yml
================================================
name: 'Dependency Review'
on: [pull_request]

permissions:
  contents: read

jobs:
  dependency-review:
    runs-on: ubuntu-latest
    steps:
     - name: 'Checkout Repository'
       uses: actions/checkout@v4
     - name: Dependency Review
       uses: actions/dependency-review-action@v4


================================================
FILE: .github/workflows/release.yml
================================================
name: Fairseq Release

on:
  workflow_dispatch:
    inputs:
      name:
        description: 'Release Type'
        default: 'patch'
        required: true

jobs:

  get_next_version:
    runs-on: ubuntu-latest
    steps:
      - name: checkout-repo-content
        uses: actions/checkout@v2

      - name: setup-python
        uses: actions/setup-python@v2
        with:
          python-version: 3.8

      - name: get next version and tag
        id: get-next-version-and-tag
        run: |
          output=$(python3 release_utils.py --release-type ${{ github.event.inputs.name }}) 
          echo $output
          new_version=$(echo $output | awk '{print $1}')
          new_tag=$(echo $output | awk '{print $2}')
          echo "new version is $new_version"
          echo "new tag is $new_tag"
          echo "version=$new_version" >> "$GITHUB_OUTPUT"
          echo "tag=$new_tag" >> "$GITHUB_OUTPUT"
          echo "branch_name=$new_version-release" >> "$GITHUB_OUTPUT"
          echo "NEW_TAG=$new_tag" >> $GITHUB_ENV
          echo "NEW_BRANCH=$new_version-release" >> $GITHUB_ENV


      # update the version number in version.txt
      - name: update version
        id: update-version
        run : |
          echo "current folder = $PWD"
          echo "current branch = $(git branch --show-current)"
          output=$(python3 release_utils.py --release-type ${{ github.event.inputs.name }} --update-version)

      - name: add and commit
        uses: EndBug/add-and-commit@v9
        with:
          author_name: ${{ secrets.AUTHOR_NAME }}
          author_email: ${{ secrets.AUTHOR_EMAIL }}

          # TODO: change this to main once shipit is disabled.
          new_branch: '${{ env.NEW_BRANCH }}'
          default_author: github_actor
          message: '${{ env.NEW_TAG }} release'
          pathspec_error_handling: exitAtEnd

          # Arguments for the git pull command. Use NO-PULL to avoid the action pulling at all.
          # pull: 'NO-PULL'
          tag: '${{ env.NEW_TAG }}'

    outputs:
      new_version: ${{ steps.get-next-version-and-tag.outputs.version }}
      new_tag: ${{ steps.get-next-version-and-tag.outputs.tag }}
      branch_name: ${{ steps.get-next-version-and-tag.outputs.branch_name }}

  create_sdist:
    runs-on: ubuntu-latest
    name: Create Source Distribution
    needs: get_next_version
    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ needs.get_next_version.outputs.branch_name }}

      - name: Install Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'

      - name: Upgrade pip
        run: |
          python3 -m pip install --upgrade pip

      - name: Create Source Distribution
        run: |
          python3 -m pip install setuptools wheel twine torch
          python3 setup.py sdist
 
      - uses: actions/upload-artifact@v2
        with:
          path: dist/*.tar.gz

  build_wheels:
    name: Build wheels on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}
    needs: get_next_version
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]

    steps:
      - uses: actions/checkout@v3
        with:
          ref: ${{ needs.get_next_version.outputs.branch_name }}

      - name: Install Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.8'

      - name: Upgrade pip
        run: |
          python3 -m pip install --upgrade pip

      - name: Install cibuildwheel
        run: |
          python3 -m pip install cibuildwheel

      - name: Build wheels for CPython
        run: |
          python3 -m cibuildwheel --output-dir dist
        env:
          CIBW_BUILD: "cp38-*64"
          CIBW_MANYLINUX_X86_64_IMAGE: manylinux1
          CIBW_BEFORE_BUILD: git submodule update --init --recursive && pip install .
          # Install system library
          CIBW_BEFORE_BUILD_LINUX: (yum install -y libffi-devel || apt-get install -y libffi-devel || apk add --update --no-cache libffi-devel || true) && (yum install -y libc6 || apt-get install -y libc6 || apk add --update --no-cache libc6 || true)
          CIBW_ENVIRONMENT: "PIP_ONLY_BINARY=numpy"
          CIBW_SKIP: "*musllinux*"

      - uses: actions/upload-artifact@v2
        with:
          path: dist

  upload:
    name: Upload to PyPI and create release
    runs-on: ubuntu-latest
    needs: [build_wheels, create_sdist, get_next_version]
    steps:
      - uses: actions/download-artifact@v2
        with:
          name: artifact
          path: dist

      # build the PyPI package and upload it
      - name: upload
        env:
          TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
          TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
        run: |
          pip install setuptools wheel twine
          python3 -m twine upload --repository pypi dist/*

      # create the release on github
      - name: create release on github
        uses: ncipollo/release-action@v1
        with:
          tag: '${{ needs.get_next_version.outputs.new_tag }}'


================================================
FILE: .gitignore
================================================
# JetBrains PyCharm IDE
.idea/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# macOS dir files
.DS_Store

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Checkpoints
checkpoints

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

# Generated files
/fairseq/temporal_convolution_tbc
/fairseq/modules/*_layer/*_forward.cu
/fairseq/modules/*_layer/*_backward.cu
/fairseq/version.py

# data
data-bin/

# reranking
/examples/reranking/rerank_data

# Cython-generated C++ source files
/fairseq/data/data_utils_fast.cpp
/fairseq/data/token_block_utils_fast.cpp

# VSCODE
.vscode/ftp-sync.json
.vscode/settings.json

# Experimental Folder
experimental/*

# Weights and Biases logs
wandb/

# Hydra artifacts
nohup.out
multirun
outputs


================================================
FILE: .gitmodules
================================================
[submodule "fairseq/model_parallel/megatron"]
    path = fairseq/model_parallel/megatron
    url = https://github.com/ngoyal2707/Megatron-LM
    branch = fairseq


================================================
FILE: .pre-commit-config.yaml
================================================
exclude: 'build|stubs'

default_language_version:
    python: python3

repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.1.0
    hooks:
    -   id: trailing-whitespace
    -   id: check-ast
    -   id: check-merge-conflict
    -   id: no-commit-to-branch
        args: ['--branch=master']
    -   id: check-added-large-files
        args: ['--maxkb=500']
    -   id: end-of-file-fixer

-   repo: https://github.com/ambv/black
    rev: 22.3.0
    hooks:
    - id: black
      language_version: python3.8

-   repo: https://gitlab.com/pycqa/flake8
    rev: 3.9.2
    hooks:
    -   id: flake8
        args: [
            # only error for syntax errors and undefined names
            "--select=E9,F63,F7,F82",
        ]

-   repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
    -   id: isort
        exclude: README.md
        additional_dependencies: [toml]
        args: ["--profile", "black"]


================================================
FILE: CODE_OF_CONDUCT.md
================================================
# Code of Conduct

## Our Pledge

In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to make participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.

## Our Standards

Examples of behavior that contributes to creating a positive environment
include:

* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members

Examples of unacceptable behavior by participants include:

* The use of sexualized language or imagery and unwelcome sexual attention or
  advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
  address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
  professional setting

## Our Responsibilities

Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.

Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.

## Scope

This Code of Conduct applies within all project spaces, and it also applies when
an individual is representing the project or its community in public spaces.
Examples of representing a project or community include using an official
project e-mail address, posting via an official social media account, or acting
as an appointed representative at an online or offline event. Representation of
a project may be further defined and clarified by project maintainers.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at <conduct@pytorch.org>. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html

[homepage]: https://www.contributor-covenant.org

For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq



================================================
FILE: CONTRIBUTING.md
================================================
# Contributing to Facebook AI Research Sequence-to-Sequence Toolkit (fairseq)
We want to make contributing to this project as easy and transparent as
possible.

## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and create your branch from `main`.
2. If you've added code that should be tested, add tests.
3. If you've changed APIs, update the documentation.
4. Ensure the test suite passes.
5. Make sure your code lints.
6. If you haven't already, complete the Contributor License Agreement ("CLA").

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: <https://code.facebook.com/cla>

## Issues
We use GitHub issues to track public bugs. Please ensure your description is
clear and has sufficient instructions to be able to reproduce the issue.

## License
By contributing to Facebook AI Research Sequence-to-Sequence Toolkit (fairseq),
you agree that your contributions will be licensed under the LICENSE file in
the root directory of this source tree.

## Pre-commit hooks
In order to ensure your code lints, there are pre-commit hooks configured in the repository which you can install.
After installation, they will automatically run each time you commit.
An abbreviated guide is given below; for more information, refer to [the official pre-commit documentation](https://pre-commit.com/).

### Installation
```
pip install pre-commit
pre-commit install
```

### Usage
Just commit your changes:
```
git commit -m "My informative commit message"
```

If a hook fails, you will get feedback like the following:
```
[INFO] Initializing environment for https://github.com/PyCQA/flake8.
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
[INFO] Installing environment for https://github.com/PyCQA/flake8.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook
Fixing examples/nllb/modeling/wmt15_benchmark/eval_langs2.sh
Fix End of Files.........................................................Failed
- hook id: end-of-file-fixer
- exit code: 1
- files were modified by this hook
Fixing examples/few_shot/scripts/schedule_jobs_few_shot.py
flake8...................................................................Passed
```

Certain hooks modify your files to comply.
To include these modifications, you will need to add them (i.e. `git add ...`) and commit again.

If all is well, you should see something like:
```
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
flake8...................................................................Passed
[gshard-fix-ci 8698644e1] Fix lint, add pre-commit hooks
 10 files changed, 148 insertions(+), 110 deletions(-)
 create mode 100644 .flake8
 create mode 100644 .pre-commit-config.yaml
 rename examples/nllb/modeling/wmt15_benchmark/{eval_langs2.py => eval_langs2.sh} (99%)
```


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) Facebook, Inc. and its affiliates.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: MANIFEST.in
================================================
include fairseq/version.txt


================================================
FILE: README.md
================================================
<p align="center">
  <img src="docs/fairseq_logo.png" width="150">
  <br />
  <br />
  <a href="https://opensource.fb.com/support-ukraine"><img alt="Support Ukraine" src="https://img.shields.io/badge/Support-Ukraine-FFD500?style=flat&labelColor=005BBB" /></a>
  <a href="https://github.com/pytorch/fairseq/blob/main/LICENSE"><img alt="MIT License" src="https://img.shields.io/badge/license-MIT-blue.svg" /></a>
  <a href="https://github.com/pytorch/fairseq/releases"><img alt="Latest Release" src="https://img.shields.io/github/release/pytorch/fairseq.svg" /></a>
  <a href="https://github.com/pytorch/fairseq/actions?query=workflow:build"><img alt="Build Status" src="https://github.com/pytorch/fairseq/workflows/build/badge.svg" /></a>
  <a href="https://fairseq.readthedocs.io/en/latest/?badge=latest"><img alt="Documentation Status" src="https://readthedocs.org/projects/fairseq/badge/?version=latest" /></a>
  <a href="https://app.circleci.com/pipelines/github/facebookresearch/fairseq/"><img alt="CircleCI Status" src="https://circleci.com/gh/facebookresearch/fairseq.svg?style=shield" /></a>
</p>

--------------------------------------------------------------------------------

Fairseq(-py) is a sequence modeling toolkit that allows researchers and
developers to train custom models for translation, summarization, language
modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

<details><summary>List of implemented papers</summary><p>

* **Convolutional Neural Networks (CNN)**
  + [Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)](examples/language_model/conv_lm/README.md)
  + [Convolutional Sequence to Sequence Learning (Gehring et al., 2017)](examples/conv_seq2seq/README.md)
  + [Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)](https://github.com/pytorch/fairseq/tree/classic_seqlevel)
  + [Hierarchical Neural Story Generation (Fan et al., 2018)](examples/stories/README.md)
  + [wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)](examples/wav2vec/README.md)
* **LightConv and DynamicConv models**
  + [Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)](examples/pay_less_attention_paper/README.md)
* **Long Short-Term Memory (LSTM) networks**
  + Effective Approaches to Attention-based Neural Machine Translation (Luong et al., 2015)
* **Transformer (self-attention) networks**
  + Attention Is All You Need (Vaswani et al., 2017)
  + [Scaling Neural Machine Translation (Ott et al., 2018)](examples/scaling_nmt/README.md)
  + [Understanding Back-Translation at Scale (Edunov et al., 2018)](examples/backtranslation/README.md)
  + [Adaptive Input Representations for Neural Language Modeling (Baevski and Auli, 2018)](examples/language_model/README.adaptive_inputs.md)
  + [Lexically constrained decoding with dynamic beam allocation (Post & Vilar, 2018)](examples/constrained_decoding/README.md)
  + [Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (Dai et al., 2019)](examples/truncated_bptt/README.md)
  + [Adaptive Attention Span in Transformers (Sukhbaatar et al., 2019)](examples/adaptive_span/README.md)
  + [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](examples/translation_moe/README.md)
  + [RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)](examples/roberta/README.md)
  + [Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)](examples/wmt19/README.md)
  + [Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)](examples/joint_alignment_translation/README.md)
  + [Multilingual Denoising Pre-training for Neural Machine Translation (Liu et al., 2020)](examples/mbart/README.md)
  + [Neural Machine Translation with Byte-Level Subwords (Wang et al., 2020)](examples/byte_level_bpe/README.md)
  + [Unsupervised Quality Estimation for Neural Machine Translation (Fomicheva et al., 2020)](examples/unsupervised_quality_estimation/README.md)
  + [wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Baevski et al., 2020)](examples/wav2vec/README.md)
  + [Generating Medical Reports from Patient-Doctor Conversations Using Sequence-to-Sequence Models (Enarvi et al., 2020)](examples/pointer_generator/README.md)
  + [Linformer: Self-Attention with Linear Complexity (Wang et al., 2020)](examples/linformer/README.md)
  + [Cross-lingual Retrieval for Iterative Self-Supervised Training (Tran et al., 2020)](examples/criss/README.md)
  + [Deep Transformers with Latent Depth (Li et al., 2020)](examples/latent_depth/README.md)
  + [Unsupervised Cross-lingual Representation Learning for Speech Recognition (Conneau et al., 2020)](https://arxiv.org/abs/2006.13979)
  + [Self-training and Pre-training are Complementary for Speech Recognition (Xu et al., 2020)](https://arxiv.org/abs/2010.11430)
  + [Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training (Hsu, et al., 2021)](https://arxiv.org/abs/2104.01027)
  + [Unsupervised Speech Recognition (Baevski, et al., 2021)](https://arxiv.org/abs/2105.11084)
  + [Simple and Effective Zero-shot Cross-lingual Phoneme Recognition (Xu et al., 2021)](https://arxiv.org/abs/2109.11680)
  + [VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding (Xu et al., 2021)](https://arxiv.org/pdf/2109.14084.pdf)
  + [VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding (Xu et al., 2021)](https://aclanthology.org/2021.findings-acl.370.pdf)
  + [NormFormer: Improved Transformer Pretraining with Extra Normalization (Shleifer et al., 2021)](examples/normformer/README.md)
* **Non-autoregressive Transformers**
  + Non-Autoregressive Neural Machine Translation (Gu et al., 2017)
  + Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement (Lee et al. 2018)
  + Insertion Transformer: Flexible Sequence Generation via Insertion Operations (Stern et al. 2019)
  + Mask-Predict: Parallel Decoding of Conditional Masked Language Models (Ghazvininejad et al., 2019)
  + [Levenshtein Transformer (Gu et al., 2019)](examples/nonautoregressive_translation/README.md)
* **Finetuning**
  + [Better Fine-Tuning by Reducing Representational Collapse (Aghajanyan et al. 2020)](examples/rxf/README.md)

</p></details>

### What's New:
* May 2023 [Released models for Scaling Speech Technology to 1,000+ Languages (Pratap, et al., 2023)](examples/mms/README.md)
* June 2022 [Released code for wav2vec-U 2.0 from Towards End-to-end Unsupervised Speech Recognition (Liu, et al., 2022)](examples/wav2vec/unsupervised/README.md)
* May 2022 [Integration with xFormers](https://github.com/facebookresearch/xformers)
* December 2021 [Released Direct speech-to-speech translation code](examples/speech_to_speech/README.md)
* October 2021 [Released VideoCLIP and VLM models](examples/MMPT/README.md)
* October 2021 [Released multilingual finetuned XLSR-53 model](examples/wav2vec/README.md)
* September 2021 [`master` branch renamed to `main`](https://github.com/github/renaming).
* July 2021 [Released DrNMT code](examples/discriminative_reranking_nmt/README.md)
* July 2021 [Released Robust wav2vec 2.0 model](examples/wav2vec/README.md)
* June 2021 [Released XLMR-XL and XLMR-XXL models](examples/xlmr/README.md)
* May 2021 [Released Unsupervised Speech Recognition code](examples/wav2vec/unsupervised/README.md)
* March 2021 [Added full parameter and optimizer state sharding + CPU offloading](examples/fully_sharded_data_parallel/README.md)
* February 2021 [Added LASER training code](examples/laser/README.md)
* December 2020: [Added Adaptive Attention Span code](examples/adaptive_span/README.md)
* December 2020: [GottBERT model and code released](examples/gottbert/README.md)
* November 2020: Adopted the [Hydra](https://github.com/facebookresearch/hydra) configuration framework
  * [see documentation explaining how to use it for new and existing projects](docs/hydra_integration.md)
* November 2020: [fairseq 0.10.0 released](https://github.com/pytorch/fairseq/releases/tag/v0.10.0)
* October 2020: [Added R3F/R4F (Better Fine-Tuning) code](examples/rxf/README.md)
* October 2020: [Deep Transformer with Latent Depth code released](examples/latent_depth/README.md)
* October 2020: [Added CRISS models and code](examples/criss/README.md)

<details><summary>Previous updates</summary><p>

* September 2020: [Added Linformer code](examples/linformer/README.md)
* September 2020: [Added pointer-generator networks](examples/pointer_generator/README.md)
* August 2020: [Added lexically constrained decoding](examples/constrained_decoding/README.md)
* August 2020: [wav2vec2 models and code released](examples/wav2vec/README.md)
* July 2020: [Unsupervised Quality Estimation code released](examples/unsupervised_quality_estimation/README.md)
* May 2020: [Follow fairseq on Twitter](https://twitter.com/fairseq)
* April 2020: [Monotonic Multihead Attention code released](examples/simultaneous_translation/README.md)
* April 2020: [Quant-Noise code released](examples/quant_noise/README.md)
* April 2020: [Initial model parallel support and 11B parameters unidirectional LM released](examples/megatron_11b/README.md)
* March 2020: [Byte-level BPE code released](examples/byte_level_bpe/README.md)
* February 2020: [mBART model and code released](examples/mbart/README.md)
* February 2020: [Added tutorial for back-translation](https://github.com/pytorch/fairseq/tree/main/examples/backtranslation#training-your-own-model-wmt18-english-german)
* December 2019: [fairseq 0.9.0 released](https://github.com/pytorch/fairseq/releases/tag/v0.9.0)
* November 2019: [VizSeq released (a visual analysis toolkit for evaluating fairseq models)](https://facebookresearch.github.io/vizseq/docs/getting_started/fairseq_example)
* November 2019: [CamemBERT model and code released](examples/camembert/README.md)
* November 2019: [BART model and code released](examples/bart/README.md)
* November 2019: [XLM-R models and code released](examples/xlmr/README.md)
* September 2019: [Nonautoregressive translation code released](examples/nonautoregressive_translation/README.md)
* August 2019: [WMT'19 models released](examples/wmt19/README.md)
* July 2019: fairseq relicensed under MIT license
* July 2019: [RoBERTa models and code released](examples/roberta/README.md)
* June 2019: [wav2vec models and code released](examples/wav2vec/README.md)

</p></details>

### Features:

* multi-GPU training on one machine or across multiple machines (data and model parallel)
* fast generation on both CPU and GPU with multiple search algorithms implemented:
  + beam search
  + Diverse Beam Search ([Vijayakumar et al., 2016](https://arxiv.org/abs/1610.02424))
  + sampling (unconstrained, top-k and top-p/nucleus)
  + [lexically constrained decoding](examples/constrained_decoding/README.md) (Post & Vilar, 2018)
* [gradient accumulation](https://fairseq.readthedocs.io/en/latest/getting_started.html#large-mini-batch-training-with-delayed-updates) enables training with large mini-batches even on a single GPU
* [mixed precision training](https://fairseq.readthedocs.io/en/latest/getting_started.html#training-with-half-precision-floating-point-fp16) (trains faster with less GPU memory on [NVIDIA tensor cores](https://developer.nvidia.com/tensor-cores))
* [extensible](https://fairseq.readthedocs.io/en/latest/overview.html): easily register new models, criterions, tasks, optimizers and learning rate schedulers
* [flexible configuration](docs/hydra_integration.md) based on [Hydra](https://github.com/facebookresearch/hydra) allowing a combination of code, command-line and file based configuration
* [full parameter and optimizer state sharding](examples/fully_sharded_data_parallel/README.md)
* [offloading parameters to CPU](examples/fully_sharded_data_parallel/README.md)
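
The sampling strategies listed above differ only in which tokens stay eligible at each decoding step. As a rough illustration (a standalone sketch, not fairseq's implementation; the function names `top_k_filter` and `top_p_filter` are hypothetical), top-k keeps a fixed number of the most probable tokens, while top-p (nucleus) keeps the smallest set of most probable tokens whose cumulative probability reaches `p`:

```python
def top_k_filter(probs, k):
    """Keep the k most probable token ids and renormalize their probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = order[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}


def top_p_filter(probs, p):
    """Keep the smallest set of most probable token ids whose mass reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, mass = [], 0.0
    for i in order:
        keep.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return {i: probs[i] / mass for i in keep}


# Toy next-token distribution over a four-token vocabulary.
probs = [0.5, 0.3, 0.15, 0.05]
top_k_ids = sorted(top_k_filter(probs, 2))
top_p_ids = sorted(top_p_filter(probs, 0.9))
print(top_k_ids, top_p_ids)
```

Sampling then draws the next token from the renormalized distribution instead of the full one.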

We also provide [pre-trained models for translation and language modeling](#pre-trained-models-and-examples)
with a convenient `torch.hub` interface:

``` python
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'
```

See the PyTorch Hub tutorials for [translation](https://pytorch.org/hub/pytorch_fairseq_translation/)
and [RoBERTa](https://pytorch.org/hub/pytorch_fairseq_roberta/) for more examples.

# Requirements and Installation

* [PyTorch](http://pytorch.org/) version >= 1.10.0
* Python version >= 3.8
* For training new models, you'll also need an NVIDIA GPU and [NCCL](https://github.com/NVIDIA/nccl)
* **To install fairseq** and develop locally:

``` bash
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on macOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

# to install the latest stable release (0.10.x)
# pip install fairseq
```

* **For faster training** install NVIDIA's [apex](https://github.com/NVIDIA/apex) library:

``` bash
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
```

* **For large datasets** install [PyArrow](https://arrow.apache.org/docs/python/install.html#using-pip): `pip install pyarrow`
* If you use Docker, make sure to increase the shared memory size with either `--ipc=host` or `--shm-size`
 as a command-line option to `nvidia-docker run`.

# Getting Started

The [full documentation](https://fairseq.readthedocs.io/) contains instructions
for getting started, training new models and extending fairseq with new model
types and tasks.

# Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below,
as well as example training and evaluation commands.

* [Translation](examples/translation/README.md): convolutional and transformer models are available
* [Language Modeling](examples/language_model/README.md): convolutional and transformer models are available

We also have more detailed READMEs to reproduce results from specific papers:

* [XLS-R: Self-supervised Cross-lingual Speech Representation Learning at Scale (Babu et al., 2021)](examples/wav2vec/xlsr/README.md)
* [Cross-lingual Retrieval for Iterative Self-Supervised Training (Tran et al., 2020)](examples/criss/README.md)
* [wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Baevski et al., 2020)](examples/wav2vec/README.md)
* [Unsupervised Quality Estimation for Neural Machine Translation (Fomicheva et al., 2020)](examples/unsupervised_quality_estimation/README.md)
* [Training with Quantization Noise for Extreme Model Compression ({Fan*, Stock*} et al., 2020)](examples/quant_noise/README.md)
* [Neural Machine Translation with Byte-Level Subwords (Wang et al., 2020)](examples/byte_level_bpe/README.md)
* [Multilingual Denoising Pre-training for Neural Machine Translation (Liu et al., 2020)](examples/mbart/README.md)
* [Reducing Transformer Depth on Demand with Structured Dropout (Fan et al., 2019)](examples/layerdrop/README.md)
* [Jointly Learning to Align and Translate with Transformer Models (Garg et al., 2019)](examples/joint_alignment_translation/README.md)
* [Levenshtein Transformer (Gu et al., 2019)](examples/nonautoregressive_translation/README.md)
* [Facebook FAIR's WMT19 News Translation Task Submission (Ng et al., 2019)](examples/wmt19/README.md)
* [RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)](examples/roberta/README.md)
* [wav2vec: Unsupervised Pre-training for Speech Recognition (Schneider et al., 2019)](examples/wav2vec/README.md)
* [Mixture Models for Diverse Machine Translation: Tricks of the Trade (Shen et al., 2019)](examples/translation_moe/README.md)
* [Pay Less Attention with Lightweight and Dynamic Convolutions (Wu et al., 2019)](examples/pay_less_attention_paper/README.md)
* [Understanding Back-Translation at Scale (Edunov et al., 2018)](examples/backtranslation/README.md)
* [Classical Structured Prediction Losses for Sequence to Sequence Learning (Edunov et al., 2018)](https://github.com/pytorch/fairseq/tree/classic_seqlevel)
* [Hierarchical Neural Story Generation (Fan et al., 2018)](examples/stories/README.md)
* [Scaling Neural Machine Translation (Ott et al., 2018)](examples/scaling_nmt/README.md)
* [Convolutional Sequence to Sequence Learning (Gehring et al., 2017)](examples/conv_seq2seq/README.md)
* [Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)](examples/language_model/README.conv.md)

# Join the fairseq community

* Twitter: https://twitter.com/fairseq
* Facebook page: https://www.facebook.com/groups/fairseq.users
* Google group: https://groups.google.com/forum/#!forum/fairseq-users

# License

fairseq(-py) is MIT-licensed.
The license applies to the pre-trained models as well.

# Citation

Please cite as:

``` bibtex
@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
```


================================================
FILE: RELEASE.md
================================================
# Creating a New Release

In order to create a new release:

1. Navigate to the [Fairseq Workflows](https://github.com/facebookresearch/fairseq/actions) and find the one named _Fairseq Release_. 

2. Under _Run Workflow_ choose the branch `main` and for _Release Type_ enter either `major`, `minor`, or `patch`.  

3. A branch named `$new_version-release` will be created where the `version.txt` file is updated. Merge those changes into `main`.

4. Make sure that a [new PyPI package](https://pypi.org/project/fairseq/) has been uploaded.

5. Make sure that a [new GitHub release](https://github.com/facebookresearch/fairseq/releases) has been created.
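
The _Release Type_ chosen in step 2 decides which part of the version number changes. The sketch below shows the usual semantic-versioning rule; it assumes that `version.txt` holds a plain `X.Y.Z` string and is not taken from the release workflow itself (`bump_version` is a hypothetical helper):

```python
def bump_version(version: str, release_type: str) -> str:
    """Return the next version for a 'major', 'minor' or 'patch' release."""
    major, minor, patch = (int(part) for part in version.strip().split("."))
    if release_type == "major":
        return f"{major + 1}.0.0"
    if release_type == "minor":
        return f"{major}.{minor + 1}.0"
    if release_type == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown release type: {release_type!r}")


# Example with an illustrative starting version.
next_minor = bump_version("0.12.2", "minor")
next_patch = bump_version("0.12.2", "patch")
print(next_minor, next_patch)
```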


================================================
FILE: docs/Makefile
================================================
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line.
SPHINXOPTS    =
SPHINXBUILD   = python -msphinx
SPHINXPROJ    = fairseq
SOURCEDIR     = .
BUILDDIR      = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

================================================
FILE: docs/command_line_tools.rst
================================================
.. _Command-line Tools:

Command-line Tools
==================

Fairseq provides several command-line tools for training and evaluating models:

- :ref:`fairseq-preprocess`: Data pre-processing: build vocabularies and binarize training data
- :ref:`fairseq-train`: Train a new model on one or multiple GPUs
- :ref:`fairseq-generate`: Translate pre-processed data with a trained model
- :ref:`fairseq-interactive`: Translate raw text with a trained model
- :ref:`fairseq-score`: BLEU scoring of generated translations against reference translations
- :ref:`fairseq-eval-lm`: Language model evaluation


.. _fairseq-preprocess:

fairseq-preprocess
~~~~~~~~~~~~~~~~~~
.. automodule:: fairseq_cli.preprocess

    .. argparse::
        :module: fairseq.options
        :func: get_preprocessing_parser
        :prog: fairseq-preprocess


.. _fairseq-train:

fairseq-train
~~~~~~~~~~~~~
.. automodule:: fairseq_cli.train

    .. argparse::
        :module: fairseq.options
        :func: get_training_parser
        :prog: fairseq-train


.. _fairseq-generate:

fairseq-generate
~~~~~~~~~~~~~~~~
.. automodule:: fairseq_cli.generate

    .. argparse::
        :module: fairseq.options
        :func: get_generation_parser
        :prog: fairseq-generate


.. _fairseq-interactive:

fairseq-interactive
~~~~~~~~~~~~~~~~~~~
.. automodule:: fairseq_cli.interactive

    .. argparse::
        :module: fairseq.options
        :func: get_interactive_generation_parser
        :prog: fairseq-interactive


.. _fairseq-score:

fairseq-score
~~~~~~~~~~~~~
.. automodule:: fairseq_cli.score

    .. argparse::
        :module: fairseq_cli.score
        :func: get_parser
        :prog: fairseq-score


.. _fairseq-eval-lm:

fairseq-eval-lm
~~~~~~~~~~~~~~~
.. automodule:: fairseq_cli.eval_lm

    .. argparse::
        :module: fairseq.options
        :func: get_eval_lm_parser
        :prog: fairseq-eval-lm


================================================
FILE: docs/conf.py
================================================
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
#
# fairseq documentation build configuration file, created by
# sphinx-quickstart on Fri Aug 17 21:45:30 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.

import os
import sys
from fairseq import __version__


# source code directory, relative to this file, for sphinx-autobuild
sys.path.insert(0, os.path.abspath(".."))

source_suffix = [".rst"]

# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
#
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
    "sphinx.ext.autodoc",
    "sphinx.ext.intersphinx",
    "sphinx.ext.viewcode",
    "sphinx.ext.napoleon",
    "sphinxarg.ext",
]

# Add any paths that contain templates here, relative to this directory.
templates_path = ["_templates"]

# The master toctree document.
master_doc = "index"

# General information about the project.
project = "fairseq"
copyright = "Facebook AI Research (FAIR)"
author = "Facebook AI Research (FAIR)"

github_doc_root = "https://github.com/pytorch/fairseq/tree/main/docs/"

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = __version__
# The full version, including alpha/beta/rc tags.
release = __version__

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
#
# This is also used if you do content translation via gettext catalogs.
# Usually you set "language" from the command line for these cases.
language = None

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# These patterns also affect html_static_path and html_extra_path.
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
highlight_language = "python"

# If true, `todo` and `todoList` produce output, else they produce nothing.
todo_include_todos = False


# -- Options for HTML output ----------------------------------------------

html_theme = "classic"

# Example configuration for intersphinx: refer to the Python standard library.
intersphinx_mapping = {
    "numpy": ("http://docs.scipy.org/doc/numpy/", None),
    "python": ("https://docs.python.org/", None),
    "torch": ("https://pytorch.org/docs/master/", None),
}


================================================
FILE: docs/criterions.rst
================================================
.. role:: hidden
    :class: hidden-section

.. _Criterions:

Criterions
==========

Criterions compute the loss function given the model and batch, roughly::

  loss = criterion(model, batch)

.. automodule:: fairseq.criterions
    :members:

.. autoclass:: fairseq.criterions.FairseqCriterion
    :members:
    :undoc-members:

.. autoclass:: fairseq.criterions.adaptive_loss.AdaptiveLoss
    :members:
    :undoc-members:
.. autoclass:: fairseq.criterions.composite_loss.CompositeLoss
    :members:
    :undoc-members:
.. autoclass:: fairseq.criterions.cross_entropy.CrossEntropyCriterion
    :members:
    :undoc-members:
.. autoclass:: fairseq.criterions.label_smoothed_cross_entropy.LabelSmoothedCrossEntropyCriterion
    :members:
    :undoc-members:


================================================
FILE: docs/data.rst
================================================
.. role:: hidden
    :class: hidden-section

.. module:: fairseq.data

Data Loading and Utilities
==========================

.. _datasets:

Datasets
--------

**Datasets** define the data format and provide helpers for creating
mini-batches.

.. autoclass:: fairseq.data.FairseqDataset
    :members:
.. autoclass:: fairseq.data.LanguagePairDataset
    :members:
.. autoclass:: fairseq.data.MonolingualDataset
    :members:

**Helper Datasets**

These datasets wrap other :class:`fairseq.data.FairseqDataset` instances and
provide additional functionality:

.. autoclass:: fairseq.data.BacktranslationDataset
    :members:
.. autoclass:: fairseq.data.ConcatDataset
    :members:
.. autoclass:: fairseq.data.ResamplingDataset
    :members:
.. autoclass:: fairseq.data.RoundRobinZipDatasets
    :members:
.. autoclass:: fairseq.data.TransformEosDataset
    :members:


Dictionary
----------

.. autoclass:: fairseq.data.Dictionary
    :members:


Iterators
---------

.. autoclass:: fairseq.data.CountingIterator
    :members:
.. autoclass:: fairseq.data.EpochBatchIterator
    :members:
.. autoclass:: fairseq.data.GroupedIterator
    :members:
.. autoclass:: fairseq.data.ShardedIterator
    :members:


================================================
FILE: docs/docutils.conf
================================================
[writers]
option-limit=0


================================================
FILE: docs/getting_started.rst
================================================
Evaluating Pre-trained Models
=============================

First, download a pre-trained model along with its vocabularies:

.. code-block:: console

    > curl https://dl.fbaipublicfiles.com/fairseq/models/wmt14.v2.en-fr.fconv-py.tar.bz2 | tar xvjf -

This model uses a `Byte Pair Encoding (BPE)
vocabulary <https://arxiv.org/abs/1508.07909>`__, so we'll have to apply
the encoding to the source text before it can be translated. This can be
done with the
`apply\_bpe.py <https://github.com/rsennrich/subword-nmt/blob/master/subword_nmt/apply_bpe.py>`__
script using the ``wmt14.en-fr.fconv-py/bpecodes`` file. ``@@`` is
used as a continuation marker and the original text can be easily
recovered with e.g. ``sed 's/@@ //g'`` or by passing the ``--remove-bpe``
flag to :ref:`fairseq-generate`. Prior to BPE, input text needs to be tokenized
using ``tokenizer.perl`` from
`mosesdecoder <https://github.com/moses-smt/mosesdecoder>`__.

Let's use :ref:`fairseq-interactive` to generate translations interactively.
Here, we use a beam size of 5 and preprocess the input with the Moses
tokenizer and the given Byte-Pair Encoding vocabulary. It will automatically
remove the BPE continuation markers and detokenize the output.

.. code-block:: console

    > MODEL_DIR=wmt14.en-fr.fconv-py
    > fairseq-interactive \
        --path $MODEL_DIR/model.pt $MODEL_DIR \
        --beam 5 --source-lang en --target-lang fr \
        --tokenizer moses \
        --bpe subword_nmt --bpe-codes $MODEL_DIR/bpecodes
    | loading model(s) from wmt14.en-fr.fconv-py/model.pt
    | [en] dictionary: 44206 types
    | [fr] dictionary: 44463 types
    | Type the input sentence and press return:
    Why is it rare to discover new marine mammal species?
    S-0     Why is it rare to discover new marine mam@@ mal species ?
    H-0     -0.0643349438905716     Pourquoi est-il rare de découvrir de nouvelles espèces de mammifères marins?
    P-0     -0.0763 -0.1849 -0.0956 -0.0946 -0.0735 -0.1150 -0.1301 -0.0042 -0.0321 -0.0171 -0.0052 -0.0062 -0.0015

This generation script produces three types of outputs: a line prefixed
with *S* shows the source sentence after preprocessing (here, tokenization
and BPE); *H* is the hypothesis along with an average log-likelihood; and
*P* is the positional score per token position, including the
end-of-sentence marker which is omitted from the text.

Other types of output lines you might see are *D*, the detokenized hypothesis;
*T*, the reference target; *A*, alignment info; and *E*, the history of
generation steps.
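
The tagged lines are easy to post-process. Below is a minimal sketch (not part
of fairseq; the helper name ``parse_generation_output`` is hypothetical) that
groups the *S*, *H* and *P* lines by sentence id. It assumes the fields are
tab-separated, which the rendered sample above does not show explicitly:

.. code-block:: python

    def parse_generation_output(lines):
        """Group S/H/P lines by sentence id; all other lines are ignored."""
        results = {}
        for line in lines:
            fields = line.rstrip("\n").split("\t")
            tag, _, idx = fields[0].partition("-")
            if tag not in {"S", "H", "P"} or not idx.isdigit():
                continue  # log lines, echoed input, other tags
            if len(fields) < (3 if tag == "H" else 2):
                continue  # malformed line
            entry = results.setdefault(int(idx), {})
            if tag == "S":
                entry["source"] = fields[1]
            elif tag == "H":
                entry["score"] = float(fields[1])
                entry["hypothesis"] = fields[2]
            else:
                entry["positional_scores"] = [float(x) for x in fields[1].split()]
        return results


    # Illustrative input, not real model output.
    sample = [
        "| loading model(s) from model.pt",
        "S-0\tHello world",
        "H-0\t-0.5\tHallo Welt",
        "P-0\t-0.4 -0.6",
    ]
    parsed = parse_generation_output(sample)
    print(parsed[0]["hypothesis"])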

See the `README <https://github.com/pytorch/fairseq#pre-trained-models>`__ for a
full list of pre-trained models available.

Training a New Model
====================

The following tutorial is for machine translation. For an example of how
to use Fairseq for other tasks, such as :ref:`language modeling`, please see the
``examples/`` directory.

Data Pre-processing
-------------------

Fairseq contains example pre-processing scripts for several translation
datasets: IWSLT 2014 (German-English), WMT 2014 (English-French) and WMT
2014 (English-German). To pre-process and binarize the IWSLT dataset:

.. code-block:: console

    > cd examples/translation/
    > bash prepare-iwslt14.sh
    > cd ../..
    > TEXT=examples/translation/iwslt14.tokenized.de-en
    > fairseq-preprocess --source-lang de --target-lang en \
        --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
        --destdir data-bin/iwslt14.tokenized.de-en

This will write binarized data that can be used for model training to
``data-bin/iwslt14.tokenized.de-en``.

Training
--------

Use :ref:`fairseq-train` to train a new model. Here are a few example settings
that work well for the IWSLT 2014 dataset:

.. code-block:: console

    > mkdir -p checkpoints/fconv
    > CUDA_VISIBLE_DEVICES=0 fairseq-train data-bin/iwslt14.tokenized.de-en \
        --optimizer nag --lr 0.25 --clip-norm 0.1 --dropout 0.2 --max-tokens 4000 \
        --arch fconv_iwslt_de_en --save-dir checkpoints/fconv

By default, :ref:`fairseq-train` will use all available GPUs on your machine. Use the
``CUDA_VISIBLE_DEVICES`` environment variable to select specific GPUs and/or to
change the number of GPU devices that will be used.

Also note that the batch size is specified in terms of the maximum
number of tokens per batch (``--max-tokens``). You may need to use a
smaller value depending on the available GPU memory on your system.
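
To make the token-based batch size concrete, here is a simplified sketch of
batching under a token budget. It is an illustration only, not fairseq's
batching code: the helper ``batch_by_max_tokens`` is hypothetical, and it
assumes that a batch costs its longest sentence times the number of sentences,
because shorter sentences are padded:

.. code-block:: python

    def batch_by_max_tokens(lengths, max_tokens):
        """Greedily group sentence indices so that
        (longest sentence) * (number of sentences) <= max_tokens."""
        batches, current, longest = [], [], 0
        # Sorting by length keeps padding within each batch small.
        for idx in sorted(range(len(lengths)), key=lambda i: lengths[i]):
            new_longest = max(longest, lengths[idx])
            if current and new_longest * (len(current) + 1) > max_tokens:
                batches.append(current)
                current, new_longest = [], lengths[idx]
            current.append(idx)
            longest = new_longest
        if current:
            batches.append(current)
        return batches


    # Five sentences with the given token counts and a budget of 16 tokens.
    batches = batch_by_max_tokens([5, 3, 8, 2, 7], max_tokens=16)
    print(batches)

In this sketch a sentence longer than the budget still ends up in a batch of
its own; lowering ``max_tokens`` yields more, smaller batches.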

Generation
----------

Once your model is trained, you can generate translations using
:ref:`fairseq-generate` **(for binarized data)** or
:ref:`fairseq-interactive` **(for raw text)**:

.. code-block:: console

    > fairseq-generate data-bin/iwslt14.tokenized.de-en \
        --path checkpoints/fconv/checkpoint_best.pt \
        --batch-size 128 --beam 5
    | [de] dictionary: 35475 types
    | [en] dictionary: 24739 types
    | data-bin/iwslt14.tokenized.de-en test 6750 examples
    | model fconv
    | loaded checkpoint checkpoints/fconv/checkpoint_best.pt
    S-721   danke .
    T-721   thank you .
    ...

To generate translations with only a CPU, use the ``--cpu`` flag. BPE
continuation markers can be removed with the ``--remove-bpe`` flag.

Advanced Training Options
=========================

Large mini-batch training with delayed updates
----------------------------------------------

The ``--update-freq`` option can be used to accumulate gradients from
multiple mini-batches and delay updating, creating a larger effective
batch size. Delayed updates can also improve training speed by reducing
inter-GPU communication costs and by saving idle time caused by variance
in workload across GPUs. See `Ott et al.
(2018) <https://arxiv.org/abs/1806.00187>`__ for more details.
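
The idea behind delayed updates can be shown on a toy problem. The sketch below
is plain Python and unrelated to fairseq's trainer (``grad_sum`` and
``delayed_update`` are hypothetical names). It accumulates gradients over two
mini-batches before applying a single update, loosely analogous to
``--update-freq 2``, and compares the result with one update on the combined
batch:

.. code-block:: python

    def grad_sum(w, batch):
        """Sum over the batch of d/dw (w * x - y) ** 2."""
        return sum(2 * (w * x - y) * x for x, y in batch)


    def delayed_update(w, mini_batches, lr):
        """Accumulate gradients over all mini-batches, then apply one update."""
        accumulated, num_examples = 0.0, 0
        for batch in mini_batches:
            accumulated += grad_sum(w, batch)
            num_examples += len(batch)
        return w - lr * accumulated / num_examples


    data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
    # Two mini-batches followed by a single parameter update.
    w_delayed = delayed_update(0.0, [data[:2], data[2:]], lr=0.01)
    # One update computed on the full batch.
    w_large = delayed_update(0.0, [data], lr=0.01)
    print(abs(w_delayed - w_large) < 1e-9)

Both paths give the same parameter value up to floating-point rounding, which
is why accumulation acts like a larger batch.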

To train on a single GPU with an effective batch size that is equivalent
to training on 8 GPUs:

.. code-block:: console

    > CUDA_VISIBLE_DEVICES=0 fairseq-train --update-freq 8 (...)

Training with half precision floating point (FP16)
--------------------------------------------------

.. note::

    FP16 training requires a Volta (or newer) GPU and CUDA 9.1 or greater.

Recent GPUs enable efficient half precision floating point computation,
e.g., using `Nvidia Tensor Cores
<https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html>`__.
Fairseq supports FP16 training with the ``--fp16`` flag:

.. code-block:: console

    > fairseq-train --fp16 (...)

Distributed training
--------------------

Distributed training in fairseq is implemented on top of ``torch.distributed``.
The easiest way to launch jobs is with the `torch.distributed.launch
<https://pytorch.org/docs/stable/distributed.html#launch-utility>`__ tool.

For example, to train a large English-German Transformer model on 2 nodes each
with 8 GPUs (in total 16 GPUs), run the following command on each node,
replacing ``node_rank=0`` with ``node_rank=1`` on the second node and making
sure to update ``--master_addr`` to the IP address of the first node:

.. code-block:: console

    > python -m torch.distributed.launch --nproc_per_node=8 \
        --nnodes=2 --node_rank=0 --master_addr="192.168.1.1" \
        --master_port=12345 \
        $(which fairseq-train) data-bin/wmt16_en_de_bpe32k \
        --arch transformer_vaswani_wmt_en_de_big --share-all-embeddings \
        --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
        --lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 \
        --lr 0.0005 \
        --dropout 0.3 --weight-decay 0.0 --criterion label_smoothed_cross_entropy --label-smoothing 0.1 \
        --max-tokens 3584 \
        --max-epoch 70 \
        --fp16

On SLURM clusters, fairseq will automatically detect the number of nodes and
GPUs, but a port number must be provided:

.. code-block:: console

    > salloc --gpus=16 --nodes 2 (...)
    > srun fairseq-train --distributed-port 12345 (...)


.. warning::

    PyTorch Distributed features used in fairseq are intended for internal
    communication only. They are not built for use in untrusted environments or
    networks.

    For performance reasons, none of the PyTorch Distributed primitives include
    any authorization protocol and will send messages unencrypted. They accept
    connections from anywhere, and execute the workload sent without performing
    any checks. Therefore, if you run a distributed fairseq job on your network,
    anybody with access to the network can execute arbitrary code with the
    privileges of the user running the job.

Sharding very large datasets
----------------------------

It can be challenging to train over very large datasets, particularly if your
machine does not have much system RAM. Most tasks in fairseq support training
over "sharded" datasets, in which the original dataset has been preprocessed
into non-overlapping chunks (or "shards").

For example, instead of preprocessing all your data into a single "data-bin"
directory, you can split the data and create "data-bin1", "data-bin2", etc.
Then you can adapt your training command like so:

.. code-block:: console

    > fairseq-train data-bin1:data-bin2:data-bin3 (...)

Training will now iterate over each shard, one by one, with each shard
corresponding to an "epoch", thus reducing system memory usage.
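
As a rough sketch of the shard-per-epoch behaviour described above, a
colon-separated data argument can be mapped to one shard per epoch. The helper
``shard_for_epoch`` is hypothetical, and wrapping around to the first shard
once every shard has been used is an assumption of this sketch:

.. code-block:: python

    def shard_for_epoch(data_arg, epoch):
        """Return the shard directory used for a 1-based epoch number."""
        shards = data_arg.split(":")
        return shards[(epoch - 1) % len(shards)]


    data_arg = "data-bin1:data-bin2:data-bin3"
    schedule = [shard_for_epoch(data_arg, epoch) for epoch in range(1, 5)]
    print(schedule)

Only one shard needs to be held in memory at a time, which is where the memory
saving comes from.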


================================================
FILE: docs/hydra_integration.md
================================================
## Hydra

[Hydra](https://github.com/facebookresearch/hydra) is an open-source Python
framework that simplifies the development of research and other complex
applications. The key feature is the ability to dynamically create a
hierarchical configuration by composition and override it through config files
and the command line. The name Hydra comes from its ability to run multiple
similar jobs - much like a Hydra with multiple heads.

## Motivation

Until recently, all components in fairseq were configured through a shared
`args` namespace that was created at application startup. Components declared
their own `add_args` method to update the argparse parser, hoping that the names
would not clash with arguments from other components. While this model works for
smaller applications, as fairseq grew and became integrated into other
applications, this became problematic. In order to determine how to configure
each component, one needed to a) examine what args were added by this component,
and b) read the code to figure out what shared arguments it is using that were
added in other places. Reproducing models involved sharing commands that often
contained dozens of command line switches.

The model described above is still supported by fairseq for backward
compatibility, but will be deprecated some time in the future.

New components in fairseq should now create a dataclass that encapsulates all
parameters required to configure this component. The dataclass is registered
along with the component, and fairseq takes care of constructing and providing
this configuration object to the component's constructor. Note that sharing
parameters can optionally still work, but one has to explicitly point to the
"source of truth" (see inheritance example below). These changes make components
in fairseq more independent and re-usable by other applications: all that is
needed to create a component is to initialize its dataclass and overwrite some
of the defaults.

While configuring fairseq through command line (using either the legacy argparse
based or the new Hydra based entry points) is still fully supported, you can now
take advantage of configuring fairseq completely or piece-by-piece through
hierarchical YAML configuration files. These files can also be shipped as
examples that others can use to run an identically configured job.

Additionally, Hydra has a rich and growing [library of
plugins](https://github.com/facebookresearch/hydra/tree/master/plugins) that
provide functionality such as hyperparameter sweeping (including using bayesian
optimization through the [Ax](https://github.com/facebook/Ax) library), job
launching across various platforms, and more.

## Creating or migrating components

In general, each new (or updated) component should provide a companion
[dataclass](https://www.python.org/dev/peps/pep-0557/). These dataclasses are
typically located in the same file as the component and are passed as arguments
to the `register_*()` functions. Top-level configs that should be present in
every fairseq application are placed in the
[global](fairseq/dataclass/configs.py) config file and added to the
`FairseqConfig` object.

Each dataclass is a plain-old-data object, similar to a `NamedTuple`. These
classes are decorated with a `@dataclass` decorator, and typically inherit from
`FairseqDataclass` (which adds some functionality for backward compatibility).
Each field must have a type, and generally has metadata (such as a help string)
and a default value. Only primitive types or other config objects are allowed as
data types for each field.

#### Example:

```python
from dataclasses import dataclass, field
from fairseq.dataclass import FairseqDataclass

@dataclass
class InteractiveConfig(FairseqDataclass):
    buffer_size: int = field(
        default=0,
        metadata={
            "help": "read this many sentences into a buffer before processing them"
        },
    )
    input: str = field(
        default="-",
        metadata={"help": "file to read from; use - for stdin"},
    )
```
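
Because the help text and defaults live on the dataclass itself, they can be
inspected with the standard library. The following is a minimal sketch, not
fairseq code: it uses a plain `@dataclass` as a stand-in for `FairseqDataclass`
(an assumption, so that the snippet runs without fairseq installed), and
`describe` is a hypothetical helper introduced only for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class InteractiveConfig:  # stand-in: the real class inherits from FairseqDataclass
    buffer_size: int = field(
        default=0,
        metadata={
            "help": "read this many sentences into a buffer before processing them"
        },
    )
    input: str = field(
        default="-",
        metadata={"help": "file to read from; use - for stdin"},
    )


def describe(config_cls):
    """Return (name, default, help) for every field (hypothetical helper)."""
    return [
        (f.name, f.default, f.metadata.get("help", "")) for f in fields(config_cls)
    ]


for name, default, help_text in describe(InteractiveConfig):
    print(f"{name} (default={default!r}): {help_text}")

# Overriding a default is just a keyword argument; other fields keep theirs.
cfg = InteractiveConfig(buffer_size=16)
```

This is the sense in which a component can be created by initializing its
dataclass and overwriting only some of the defaults.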

### Inheriting values

Some components require sharing a value. For example, a learning rate scheduler
and an optimizer may both need to know the initial learning rate value. One can
declare a field that, by default, will inherit its value from another config
node in the same hierarchy:

```python
from typing import List

from omegaconf import II

@dataclass
class FairseqAdamConfig(FairseqDataclass):
    ...
    lr: List[float] = II("optimization.lr")
    ...
```

`II("optimization.lr")` is syntactic sugar for `"${optimization.lr}"`, which is
the value one can use in a YAML config file or through command line to achieve
the same effect. Note that this assumes that there is an "optimization" config
object in the root config and it has a field called "lr".
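
To make the lookup concrete, here is a toy resolver for `"${...}"` references
written against plain nested dictionaries. It is a sketch under stated
assumptions, not how OmegaConf is implemented: `lookup` and `resolve` are
hypothetical helpers, and only values that consist entirely of a single
reference are handled.

```python
import re

# Matches a value that is exactly one reference, e.g. "${optimization.lr}".
_INTERPOLATION = re.compile(r"^\$\{([^}]+)\}$")


def lookup(root, dotted_key):
    """Walk a nested dict using a dotted path such as 'optimization.lr'."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(root, value):
    """Resolve '${a.b}' against the root config; return other values unchanged."""
    if isinstance(value, str):
        match = _INTERPOLATION.match(value)
        if match:
            # The referenced value may itself be a reference, so resolve again.
            return resolve(root, lookup(root, match.group(1)))
    return value


config = {
    "optimization": {"lr": [0.25]},
    "optimizer": {"lr": "${optimization.lr}"},
}

print(resolve(config, config["optimizer"]["lr"]))
```

If the root config had no "optimization" node, or that node had no "lr" field,
the lookup in this sketch would fail with a `KeyError`, which mirrors the
assumption stated above.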

### Tasks and Models

Creating Tasks and Models works the same as before, except that legacy
implementations now inherit from `LegacyFairseq*` base classes, while new
components inherit from `FairseqTask` and `FairseqModel` and provide a dataclass
to the `register_*()` functions.

#### Task example:

```python
@dataclass
class LanguageModelingConfig(FairseqDataclass):
    data: Optional[str] = field(
        default=None, metadata={"help": "path to data directory"}
    )
    ...

@register_task("language_modeling", dataclass=LanguageModelingConfig)
class LanguageModelingTask(FairseqTask):
    ...
    @classmethod
    def setup_task(cls, cfg: LanguageModelingConfig):
        ...
```

#### Model example:

```python
@dataclass
class TransformerLanguageModelConfig(FairseqDataclass):
    activation_fn: ChoiceEnum(utils.get_available_activation_fns()) = field(
        default="relu", metadata={"help": "activation function to use"}
    )
    dropout: float = field(default=0.1, metadata={"help": "dropout probability"})
    ...

@register_model("transformer_lm", dataclass=TransformerLanguageModelConfig)
class TransformerLanguageModel(FairseqLanguageModel):
    ...
    @classmethod
    def build_model(cls, cfg: TransformerLanguageModelConfig, task: FairseqTask):
        ...
```

### Other components

Other components work as before, but they now take their configuration dataclass
as the only constructor argument:

```python
@dataclass
class MosesTokenizerConfig(FairseqDataclass):
    source_lang: str = field(default="en", metadata={"help": "source language"})
    ...

@register_tokenizer("moses", dataclass=MosesTokenizerConfig)
class MosesTokenizer(object):
    def __init__(self, cfg: MosesTokenizerConfig):
        ...
```
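
The pattern of pairing a component with its dataclass at registration time can
be sketched in a few lines of plain Python. Everything below is a toy stand-in
and an assumption for illustration: `TOKENIZER_REGISTRY`, this
`register_tokenizer`, and `build_tokenizer` are hypothetical and do not
reproduce fairseq's actual registry code.

```python
import dataclasses

# Toy registry: maps a component name to (component class, config dataclass).
TOKENIZER_REGISTRY = {}


def register_tokenizer(name, dataclass=None):
    """Hypothetical stand-in for a register_*() decorator."""

    def wrapper(cls):
        if name in TOKENIZER_REGISTRY:
            raise ValueError(f"cannot register duplicate tokenizer ({name})")
        TOKENIZER_REGISTRY[name] = (cls, dataclass)
        return cls

    return wrapper


@dataclasses.dataclass
class MosesTokenizerConfig:  # stand-in: the real class inherits from FairseqDataclass
    source_lang: str = dataclasses.field(
        default="en", metadata={"help": "source language"}
    )


@register_tokenizer("moses", dataclass=MosesTokenizerConfig)
class MosesTokenizer:
    def __init__(self, cfg: MosesTokenizerConfig):
        # The config object is the only constructor argument.
        self.source_lang = cfg.source_lang


def build_tokenizer(name, **overrides):
    """Look up a component, build its config from defaults plus overrides, construct it."""
    cls, config_cls = TOKENIZER_REGISTRY[name]
    return cls(config_cls(**overrides))


tokenizer = build_tokenizer("moses", source_lang="de")
print(tokenizer.source_lang)
```

Keeping the dataclass next to the class in the registry is what lets the
framework, rather than the caller, construct the configuration object.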

Note that if you are adding a new registry for a new set of components, you need
to add it to the `FairseqConfig` object in `fairseq/dataclass/configs.py`:

```python
@dataclass
class FairseqConfig(object):
    ...
    my_new_registry: Any = None
```

## Training with `fairseq-hydra-train`

To fully take advantage of configuration flexibility offered by Hydra, you may
want to train new models using the `fairseq-hydra-train` entry point. Legacy CLI
tools such as `fairseq-train` will remain supported for the foreseeable future
but will be deprecated eventually.

On startup, Hydra will create a configuration object that contains a hierarchy
of all the necessary dataclasses populated with their default values in the
code. The default values are overwritten by values found in YAML files in
`fairseq/config` directory (which currently sets minimal defaults) and then
further overwritten by values provided through command line arguments.
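
The order of precedence described above (defaults in code, then YAML files,
then command-line arguments) can be illustrated with plain dictionaries. This
is a simplified sketch, not Hydra's implementation: `DatasetConfig`,
`RootConfig`, `deep_merge`, and `parse_override` are hypothetical names, the
YAML layer is represented by an already-parsed dictionary, and override values
are assumed to be integers for brevity.

```python
import copy
from dataclasses import asdict, dataclass, field


@dataclass
class DatasetConfig:  # hypothetical, trimmed-down config node
    batch_size: int = 8
    num_workers: int = 1


@dataclass
class RootConfig:  # hypothetical root holding a single child node
    dataset: DatasetConfig = field(default_factory=DatasetConfig)


def deep_merge(base, override):
    """Return a copy of `base` with `override` layered on top, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def parse_override(text):
    """Turn 'dataset.batch_size=2' into {'dataset': {'batch_size': 2}}."""
    dotted_key, raw_value = text.split("=", 1)
    nested = int(raw_value)  # assumption: integer values only
    for part in reversed(dotted_key.split(".")):
        nested = {part: nested}
    return nested


defaults = asdict(RootConfig())                      # 1. defaults from the code
yaml_values = {"dataset": {"num_workers": 4}}        # 2. values a YAML file might supply
cli_values = parse_override("dataset.batch_size=2")  # 3. command-line override

config = deep_merge(deep_merge(defaults, yaml_values), cli_values)
print(config)
```

Each layer only needs to mention the fields it changes; everything else falls
through from the layer beneath it.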

Some of the most common use cases are shown below:

### 1. Override default values through command line:

```shell script
$ fairseq-hydra-train \
    distributed_training.distributed_world_size=1 \
    dataset.batch_size=2 \
    task.data=data-bin \
    model=transformer_lm/transformer_lm_gpt \
    task=language_modeling \
    optimization.max_update=5000
```

Note that along with explicitly providing values for parameters such as
`dataset.batch_size`, this also tells Hydra to overlay configuration found in
`fairseq/config/model/transformer_lm/transformer_lm_gpt.yaml` over the default
values in the dataclass. If you want to train a model without specifying a
particular architecture you can simply specify `model=transformer_lm`. This only
works for migrated tasks and models.

### 2. Replace bundled configs with an external config:

```shell script
$ fairseq-hydra-train \
    --config-dir /path/to/external/configs \
    --config-name wiki103
```

where `/path/to/external/configs/wiki103.yaml` contains:

```yaml
# @package _group_

model:
  _name: transformer_lm
distributed_training:
  distributed_world_size: 1
dataset:
  batch_size: 2
task:
  _name: language_modeling
  data: /path/to/data
  add_bos_token: false
  max_target_positions: 1024
optimization:
  max_update: 50000
  lr: [ 0.25 ]
criterion: cross_entropy
optimizer: adam
lr_scheduler:
  _name: cosine
```

Note that the bundled configs from the `fairseq/config` directory are not used
here; however, the defaults from each dataclass still apply (unless overwritten
by your external config).

Additionally you can choose to break up your configs by creating a directory
structure in the same location as your main config file, with the names of the
top-level fields (such as "model", "dataset", etc), and placing config files
with meaningful names that would populate that specific section of your
top-level config file (for example, you might have
`model/small_transformer_lm.yaml`, `model/big_transformer_lm.yaml`, etc). You
can then specify the correct configuration via command line, defaults in the
main config, or even launch all of them as a sweep (see Hydra documentation on
how to do this).
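
As a rough illustration of that layout, the sketch below builds such a
directory in a temporary location and lists which options each sub-directory
offers. It only inspects the file system and does not call Hydra;
`list_group_options` is a hypothetical helper, and the file contents are
placeholders.

```python
import tempfile
from pathlib import Path


def list_group_options(config_dir):
    """Map each sub-directory to the sorted names of the YAML files inside it."""
    groups = {}
    for group_dir in sorted(p for p in Path(config_dir).iterdir() if p.is_dir()):
        groups[group_dir.name] = sorted(f.stem for f in group_dir.glob("*.yaml"))
    return groups


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Main config file next to one directory per top-level field.
    (root / "wiki103.yaml").write_text("# main config (placeholder)\n")
    (root / "model").mkdir()
    for name in ("small_transformer_lm", "big_transformer_lm"):
        (root / "model" / f"{name}.yaml").write_text("# placeholder\n")
    options = list_group_options(root)

print(options)
```

With this layout, the file names under `model/` are the values one would pass
as `model=...` on the command line.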

### 3. Add an external config directory to Hydra search path:

This allows combining default configuration (including using any bundled config
files), while specifying your own config files for some parts of the
configuration.

```shell script
$ fairseq-hydra-train \
    distributed_training.distributed_world_size=1 \
    dataset.batch_size=2 \
    task.data=/path/to/data/ \
    model=transformer_lm/2_layers \
    task=language_modeling \
    optimization.max_update=5000 \
    --config-dir /path/to/external/configs
```

where `/path/to/external/configs` has the following structure:
```
.
+-- model
|   +-- transformer_lm
|   |   +-- 2_layers.yaml
```

and `2_layers.yaml` contains a copy of `transformer_lm_gpt.yaml` but with
`decoder_layers` set to 2. You can add other configs to configure other
components as well.


================================================
FILE: docs/index.rst
================================================
.. fairseq documentation master file, created by
   sphinx-quickstart on Fri Aug 17 21:45:30 2018.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

:github_url: https://github.com/pytorch/fairseq


fairseq documentation
=====================

Fairseq is a sequence modeling toolkit written in `PyTorch
<http://pytorch.org/>`_ that allows researchers and developers to
train custom models for translation, summarization, language modeling and other
text generation tasks.

.. toctree::
    :maxdepth: 1
    :caption: Getting Started

    getting_started
    command_line_tools

.. toctree::
    :maxdepth: 1
    :caption: Extending Fairseq

    overview
    tutorial_simple_lstm
    tutorial_classifying_names

.. toctree::
    :maxdepth: 2
    :caption: Library Reference

    tasks
    models
    criterions
    optim
    lr_scheduler
    data
    modules


Indices and tables
==================

* :ref:`genindex`
* :ref:`search`


================================================
FILE: docs/lr_scheduler.rst
================================================
.. role:: hidden
    :class: hidden-section

.. _Learning Rate Schedulers:

Learning Rate Schedulers
========================

Learning Rate Schedulers update the learning rate over the course of training.
Learning rates can be updated after each update via :func:`step_update` or at
epoch boundaries via :func:`step`.

.. automodule:: fairseq.optim.lr_scheduler
    :members:

.. autoclass:: fairseq.optim.lr_scheduler.FairseqLRScheduler
    :members:
    :undoc-members:

.. autoclass:: fairseq.optim.lr_scheduler.cosine_lr_scheduler.CosineSchedule
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.lr_scheduler.fixed_schedule.FixedSchedule
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.lr_scheduler.inverse_square_root_schedule.InverseSquareRootSchedule
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.lr_scheduler.reduce_lr_on_plateau.ReduceLROnPlateau
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.lr_scheduler.triangular_lr_scheduler.TriangularSchedule
    :members:
    :undoc-members:


================================================
FILE: docs/make.bat
================================================
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=python -msphinx
)
set SOURCEDIR=.
set BUILDDIR=_build
set SPHINXPROJ=fairseq

if "%1" == "" goto help

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The Sphinx module was not found. Make sure you have Sphinx installed,
	echo.then set the SPHINXBUILD environment variable to point to the full
	echo.path of the 'sphinx-build' executable. Alternatively you may add the
	echo.Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.http://sphinx-doc.org/
	exit /b 1
)

%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS%

:end
popd


================================================
FILE: docs/models.rst
================================================
.. role:: hidden
    :class: hidden-section

.. module:: fairseq.models

.. _Models:

Models
======

A Model defines the neural network's ``forward()`` method and encapsulates all
of the learnable parameters in the network. Each model also provides a set of
named *architectures* that define the precise network configuration (e.g.,
embedding dimension, number of layers, etc.).

Both the model type and architecture are selected via the ``--arch``
command-line argument. Once selected, a model may expose additional command-line
arguments for further configuration.

.. note::

    All fairseq Models extend :class:`BaseFairseqModel`, which in turn extends
    :class:`torch.nn.Module`. Thus any fairseq Model can be used as a
    stand-alone Module in other PyTorch code.


Convolutional Neural Networks (CNN)
-----------------------------------

.. module:: fairseq.models.fconv
.. autoclass:: fairseq.models.fconv.FConvModel
    :members:
.. autoclass:: fairseq.models.fconv.FConvEncoder
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.fconv.FConvDecoder
    :members:


Long Short-Term Memory (LSTM) networks
--------------------------------------

.. module:: fairseq.models.lstm
.. autoclass:: fairseq.models.lstm.LSTMModel
    :members:
.. autoclass:: fairseq.models.lstm.LSTMEncoder
    :members:
.. autoclass:: fairseq.models.lstm.LSTMDecoder
    :members:


Transformer (self-attention) networks
-------------------------------------

.. module:: fairseq.models.transformer
.. autoclass:: fairseq.models.transformer.TransformerModel
    :members:
.. autoclass:: fairseq.models.transformer.TransformerEncoder
    :members:
.. autoclass:: fairseq.models.transformer.TransformerEncoderLayer
    :members:
.. autoclass:: fairseq.models.transformer.TransformerDecoder
    :members:
.. autoclass:: fairseq.models.transformer.TransformerDecoderLayer
    :members:


Adding new models
-----------------

.. currentmodule:: fairseq.models
.. autofunction:: fairseq.models.register_model
.. autofunction:: fairseq.models.register_model_architecture
.. autoclass:: fairseq.models.BaseFairseqModel
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.FairseqEncoderDecoderModel
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.FairseqEncoderModel
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.FairseqLanguageModel
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.FairseqMultiModel
    :members:
    :undoc-members:
.. autoclass:: fairseq.models.FairseqEncoder
    :members:
.. autoclass:: fairseq.models.CompositeEncoder
    :members:
.. autoclass:: fairseq.models.FairseqDecoder
    :members:


.. _Incremental decoding:

Incremental decoding
--------------------

.. autoclass:: fairseq.models.FairseqIncrementalDecoder
    :members:
    :undoc-members:


================================================
FILE: docs/modules.rst
================================================
Modules
=======

Fairseq provides several stand-alone :class:`torch.nn.Module` classes that may
be helpful when implementing a new :class:`~fairseq.models.BaseFairseqModel`.

.. automodule:: fairseq.modules
    :members:
    :undoc-members:


================================================
FILE: docs/optim.rst
================================================
.. role:: hidden
    :class: hidden-section

.. _optimizers:

Optimizers
==========

Optimizers update the Model parameters based on the gradients.

.. automodule:: fairseq.optim
    :members:

.. autoclass:: fairseq.optim.FairseqOptimizer
    :members:
    :undoc-members:

.. autoclass:: fairseq.optim.adadelta.Adadelta
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.adagrad.Adagrad
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.adafactor.FairseqAdafactor
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.adam.FairseqAdam
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.fp16_optimizer.FP16Optimizer
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.nag.FairseqNAG
    :members:
    :undoc-members:
.. autoclass:: fairseq.optim.sgd.SGD
    :members:
    :undoc-members:


================================================
FILE: docs/overview.rst
================================================
Overview
========

Fairseq can be extended through user-supplied `plug-ins
<https://en.wikipedia.org/wiki/Plug-in_(computing)>`_. We support five kinds of
plug-ins:

- :ref:`Models` define the neural network architecture and encapsulate all of the
  learnable parameters.
- :ref:`Criterions` compute the loss function given the model outputs and targets.
- :ref:`Tasks` store dictionaries and provide helpers for loading/iterating over
  Datasets, initializing the Model/Criterion and calculating the loss.
- :ref:`Optimizers` update the Model parameters based on the gradients.
- :ref:`Learning Rate Schedulers` update the learning rate over the course of
  training.

**Training Flow**

Given a ``model``, ``criterion``, ``task``, ``optimizer`` and ``lr_scheduler``,
fairseq implements the following high-level training flow::

  for epoch in range(num_epochs):
      itr = task.get_batch_iterator(task.dataset('train'))
      for num_updates, batch in enumerate(itr):
          task.train_step(batch, model, criterion, optimizer)
          average_and_clip_gradients()
          optimizer.step()
          lr_scheduler.step_update(num_updates)
      lr_scheduler.step(epoch)

where the default implementation for ``task.train_step`` is roughly::

  def train_step(self, batch, model, criterion, optimizer, **unused):
      loss = criterion(model, batch)
      optimizer.backward(loss)
      return loss

**Registering new plug-ins**

New plug-ins are *registered* through a set of ``@register`` function
decorators, for example::

  @register_model('my_lstm')
  class MyLSTM(FairseqEncoderDecoderModel):
      (...)

Once registered, new plug-ins can be used with the existing :ref:`Command-line
Tools`. See the Tutorial sections for more detailed walkthroughs of how to add
new plug-ins.

**Loading plug-ins from another directory**

New plug-ins can be defined in a custom module stored on the user's system. To
import the module and make the plug-in available to *fairseq*, the command line
supports the ``--user-dir`` flag, which specifies a custom location for
additional modules to load into *fairseq*.

For example, assuming this directory tree::

  /home/user/my-module/
  └── __init__.py
  
with ``__init__.py``::

  from fairseq.models import register_model_architecture
  from fairseq.models.transformer import transformer_vaswani_wmt_en_de_big

  @register_model_architecture('transformer', 'my_transformer')
  def transformer_mmt_big(args):
      transformer_vaswani_wmt_en_de_big(args)

it is possible to invoke the :ref:`fairseq-train` script with the new architecture as follows::

  fairseq-train ... --user-dir /home/user/my-module -a my_transformer --task translation


================================================
FILE: docs/tasks.rst
================================================
.. role:: hidden
    :class: hidden-section

.. module:: fairseq.tasks

.. _Tasks:

Tasks
=====

Tasks store dictionaries and provide helpers for loading/iterating over
Datasets, initializing the Model/Criterion and calculating the loss.

Tasks can be selected via the ``--task`` command-line argument. Once selected, a
task may expose additional command-line arguments for further configuration.

Example usage::

    # setup the task (e.g., load dictionaries)
    task = fairseq.tasks.setup_task(args)

    # build model and criterion
    model = task.build_model(args)
    criterion = task.build_criterion(args)

    # load datasets
    task.load_dataset('train')
    task.load_dataset('valid')

    # iterate over mini-batches of data
    batch_itr = task.get_batch_iterator(
        task.dataset('train'), max_tokens=4096,
    )
    for batch in batch_itr:
        # compute the loss
        loss, sample_size, logging_output = task.get_loss(
            model, criterion, batch,
        )
        loss.backward()


Translation
-----------

.. autoclass:: fairseq.tasks.translation.TranslationTask

.. _language modeling:

Language Modeling
-----------------

.. autoclass:: fairseq.tasks.language_modeling.LanguageModelingTask


Adding new tasks
----------------

.. autofunction:: fairseq.tasks.register_task
.. autoclass:: fairseq.tasks.FairseqTask
    :members:
    :undoc-members:


================================================
FILE: docs/tutorial_classifying_names.rst
================================================
Tutorial: Classifying Names with a Character-Level RNN
======================================================

In this tutorial we will extend fairseq to support *classification* tasks. In
particular we will re-implement the PyTorch tutorial for `Classifying Names with
a Character-Level RNN <https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html>`_
in fairseq. It is recommended to quickly skim that tutorial before beginning
this one.

This tutorial covers:

1. **Preprocessing the data** to create dictionaries.
2. **Registering a new Model** that encodes an input sentence with a simple RNN
   and predicts the output label.
3. **Registering a new Task** that loads our dictionaries and dataset.
4. **Training the Model** using the existing command-line tools.
5. **Writing an evaluation script** that imports fairseq and allows us to
   interactively evaluate our model on new inputs.


1. Preprocessing the data
-------------------------

The original tutorial provides raw data, but we'll work with a modified version
of the data that is already tokenized into characters and split into separate
train, valid and test sets.

Download and extract the data from here:
`tutorial_names.tar.gz <https://dl.fbaipublicfiles.com/fairseq/data/tutorial_names.tar.gz>`_

Once extracted, let's preprocess the data using the :ref:`fairseq-preprocess`
command-line tool to create the dictionaries. While this tool is primarily
intended for sequence-to-sequence problems, we're able to reuse it here by
treating the label as a "target" sequence of length 1. We'll also output the
preprocessed files in "raw" format using the ``--dataset-impl`` option to
enhance readability:

.. code-block:: console

  > fairseq-preprocess \
    --trainpref names/train --validpref names/valid --testpref names/test \
    --source-lang input --target-lang label \
    --destdir names-bin --dataset-impl raw

After running the above command you should see a new directory,
:file:`names-bin/`, containing the dictionaries for *inputs* and *labels*.


2. Registering a new Model
--------------------------

Next we'll register a new model in fairseq that will encode an input sentence
with a simple RNN and predict the output label. Compared to the original PyTorch
tutorial, our version will also work with batches of data and GPU Tensors.

First let's copy the simple RNN module implemented in the `PyTorch tutorial
<https://pytorch.org/tutorials/intermediate/char_rnn_classification_tutorial.html#creating-the-network>`_.
Create a new file named :file:`fairseq/models/rnn_classifier.py` with the
following contents::

    import torch
    import torch.nn as nn

    class RNN(nn.Module):

        def __init__(self, input_size, hidden_size, output_size):
            super(RNN, self).__init__()

            self.hidden_size = hidden_size

            self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
            self.i2o = nn.Linear(input_size + hidden_size, output_size)
            self.softmax = nn.LogSoftmax(dim=1)

        def forward(self, input, hidden):
            combined = torch.cat((input, hidden), 1)
            hidden = self.i2h(combined)
            output = self.i2o(combined)
            output = self.softmax(output)
            return output, hidden

        def initHidden(self):
            return torch.zeros(1, self.hidden_size)

We must also *register* this model with fairseq using the
:func:`~fairseq.models.register_model` function decorator. Once the model is
registered we'll be able to use it with the existing :ref:`Command-line Tools`.

All registered models must implement the :class:`~fairseq.models.BaseFairseqModel`
interface, so we'll create a small wrapper class in the same file and register
it in fairseq with the name ``'rnn_classifier'``::

    from fairseq.models import BaseFairseqModel, register_model

    # Note: the register_model "decorator" should immediately precede the
    # definition of the Model class.

    @register_model('rnn_classifier')
    class FairseqRNNClassifier(BaseFairseqModel):

        @staticmethod
        def add_args(parser):
            # Models can override this method to add new command-line arguments.
            # Here we'll add a new command-line argument to configure the
            # dimensionality of the hidden state.
            parser.add_argument(
                '--hidden-dim', type=int, metavar='N',
                help='dimensionality of the hidden state',
            )

        @classmethod
        def build_model(cls, args, task):
            # Fairseq initializes models by calling the ``build_model()``
            # function. This provides more flexibility, since the returned model
            # instance can be of a different type than the one that was called.
            # In this case we'll just return a FairseqRNNClassifier instance.

            # Initialize our RNN module
            rnn = RNN(
                # We'll define the Task in the next section, but for now just
                # notice that the task holds the dictionaries for the "source"
                # (i.e., the input sentence) and "target" (i.e., the label).
                input_size=len(task.source_dictionary),
                hidden_size=args.hidden_dim,
                output_size=len(task.target_dictionary),
            )

            # Return the wrapped version of the module
            return FairseqRNNClassifier(
                rnn=rnn,
                input_vocab=task.source_dictionary,
            )

        def __init__(self, rnn, input_vocab):
            super(FairseqRNNClassifier, self).__init__()

            self.rnn = rnn
            self.input_vocab = input_vocab

            # The RNN module in the tutorial expects one-hot inputs, so we can
            # precompute the identity matrix to help convert from indices to
            # one-hot vectors. We register it as a buffer so that it is moved to
            # the GPU when ``cuda()`` is called.
            self.register_buffer('one_hot_inputs', torch.eye(len(input_vocab)))

        def forward(self, src_tokens, src_lengths):
            # The inputs to the ``forward()`` function are determined by the
            # Task, and in particular the ``'net_input'`` key in each
            # mini-batch. We'll define the Task in the next section, but for
            # now just know that *src_tokens* has shape `(batch, src_len)` and
            # *src_lengths* has shape `(batch)`.
            bsz, max_src_len = src_tokens.size()

            # Initialize the RNN hidden state. Compared to the original PyTorch
            # tutorial we'll also handle batched inputs and work on the GPU.
            hidden = self.rnn.initHidden()
            hidden = hidden.repeat(bsz, 1)  # expand for batched inputs
            hidden = hidden.to(src_tokens.device)  # move to GPU

            for i in range(max_src_len):
                # WARNING: The inputs have padding, so we should mask those
                # elements here so that padding doesn't affect the results.
                # This is left as an exercise for the reader. The padding symbol
                # is given by ``self.input_vocab.pad()`` and the unpadded length
                # of each input is given by *src_lengths*.

                # One-hot encode a batch of input characters.
                input = self.one_hot_inputs[src_tokens[:, i].long()]

                # Feed the input to our RNN.
                output, hidden = self.rnn(input, hidden)

            # Return the final output state for making a prediction
            return output

Finally let's define a *named architecture* with the configuration for our
model. This is done with the :func:`~fairseq.models.register_model_architecture`
function decorator. Thereafter this named architecture can be used with the
``--arch`` command-line argument, e.g., ``--arch pytorch_tutorial_rnn``::

    from fairseq.models import register_model_architecture

    # The first argument to ``register_model_architecture()`` should be the name
    # of the model we registered above (i.e., 'rnn_classifier'). The function we
    # register here should take a single argument *args* and modify it in-place
    # to match the desired architecture.

    @register_model_architecture('rnn_classifier', 'pytorch_tutorial_rnn')
    def pytorch_tutorial_rnn(args):
        # We use ``getattr()`` to prioritize arguments that are explicitly given
        # on the command-line, so that the defaults defined below are only used
        # when no other value has been specified.
        args.hidden_dim = getattr(args, 'hidden_dim', 128)


3. Registering a new Task
-------------------------

Now we'll register a new :class:`~fairseq.tasks.FairseqTask` that will load our
dictionaries and dataset. Tasks can also control how the data is batched into
mini-batches, but in this tutorial we'll reuse the batching provided by
:class:`fairseq.data.LanguagePairDataset`.

Create a new file named :file:`fairseq/tasks/simple_classification.py` with the
following contents::

  import os
  import torch

  from fairseq.data import Dictionary, LanguagePairDataset
  from fairseq.tasks import LegacyFairseqTask, register_task


  @register_task('simple_classification')
  class SimpleClassificationTask(LegacyFairseqTask):

      @staticmethod
      def add_args(parser):
          # Add some command-line arguments for specifying where the data is
          # located and the maximum supported input length.
          parser.add_argument('data', metavar='FILE',
                              help='file prefix for data')
          parser.add_argument('--max-positions', default=1024, type=int,
                              help='max input length')

      @classmethod
      def setup_task(cls, args, **kwargs):
          # Here we can perform any setup required for the task. This may include
          # loading Dictionaries, initializing shared Embedding layers, etc.
          # In this case we'll just load the Dictionaries.
          input_vocab = Dictionary.load(os.path.join(args.data, 'dict.input.txt'))
          label_vocab = Dictionary.load(os.path.join(args.data, 'dict.label.txt'))
          print('| [input] dictionary: {} types'.format(len(input_vocab)))
          print('| [label] dictionary: {} types'.format(len(label_vocab)))

          return SimpleClassificationTask(args, input_vocab, label_vocab)

      def __init__(self, args, input_vocab, label_vocab):
          super().__init__(args)
          self.input_vocab = input_vocab
          self.label_vocab = label_vocab

      def load_dataset(self, split, **kwargs):
          """Load a given dataset split (e.g., train, valid, test)."""

          prefix = os.path.join(self.args.data, '{}.input-label'.format(split))

          # Read input sentences.
          sentences, lengths = [], []
          with open(prefix + '.input', encoding='utf-8') as file:
              for line in file:
                  sentence = line.strip()

                  # Tokenize the sentence, splitting on spaces
                  tokens = self.input_vocab.encode_line(
                      sentence, add_if_not_exist=False,
                  )

                  sentences.append(tokens)
                  lengths.append(tokens.numel())

          # Read labels.
          labels = []
          with open(prefix + '.label', encoding='utf-8') as file:
              for line in file:
                  label = line.strip()
                  labels.append(
                      # Convert label to a numeric ID.
                      torch.LongTensor([self.label_vocab.add_symbol(label)])
                  )

          assert len(sentences) == len(labels)
          print('| {} {} {} examples'.format(self.args.data, split, len(sentences)))

          # We reuse LanguagePairDataset since classification can be modeled as a
          # sequence-to-sequence task where the target sequence has length 1.
          self.datasets[split] = LanguagePairDataset(
              src=sentences,
              src_sizes=lengths,
              src_dict=self.input_vocab,
              tgt=labels,
              tgt_sizes=torch.ones(len(labels)),  # targets have length 1
              tgt_dict=self.label_vocab,
              left_pad_source=False,
              # Since our target is a single class label, there's no need for
              # teacher forcing. If we set this to ``True`` then our Model's
              # ``forward()`` method would receive an additional argument called
              # *prev_output_tokens* that would contain a shifted version of the
              # target sequence.
              input_feeding=False,
          )

      def max_positions(self):
          """Return the max input length allowed by the task."""
          # The source should be less than *args.max_positions* and the "target"
          # has max length 1.
          return (self.args.max_positions, 1)

      @property
      def source_dictionary(self):
          """Return the source :class:`~fairseq.data.Dictionary`."""
          return self.input_vocab

      @property
      def target_dictionary(self):
          """Return the target :class:`~fairseq.data.Dictionary`."""
          return self.label_vocab

      # We could override this method if we wanted more control over how batches
      # are constructed, but it's not necessary for this tutorial since we can
      # reuse the batching provided by LanguagePairDataset.
      #
      # def get_batch_iterator(
      #     self, dataset, max_tokens=None, max_sentences=None, max_positions=None,
      #     ignore_invalid_inputs=False, required_batch_size_multiple=1,
      #     seed=1, num_shards=1, shard_id=0, num_workers=0, epoch=1,
      #     data_buffer_size=0, disable_iterator_cache=False,
      # ):
      #     (...)


4. Training the Model
---------------------

Now we're ready to train the model. We can use the existing :ref:`fairseq-train`
command-line tool for this, making sure to specify our new Task (``--task
simple_classification``) and Model architecture (``--arch
pytorch_tutorial_rnn``):

.. note::

  You can also configure the dimensionality of the hidden state by passing the
  ``--hidden-dim`` argument to :ref:`fairseq-train`.

.. code-block:: console

  > fairseq-train names-bin \
    --task simple_classification \
    --arch pytorch_tutorial_rnn \
    --optimizer adam --lr 0.001 --lr-shrink 0.5 \
    --max-tokens 1000
  (...)
  | epoch 027 | loss 1.200 | ppl 2.30 | wps 15728 | ups 119.4 | wpb 116 | bsz 116 | num_updates 3726 | lr 1.5625e-05 | gnorm 1.290 | clip 0% | oom 0 | wall 32 | train_wall 21
  | epoch 027 | valid on 'valid' subset | valid_loss 1.41304 | valid_ppl 2.66 | num_updates 3726 | best 1.41208
  | done training in 31.6 seconds

The model files should appear in the :file:`checkpoints/` directory.
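
The ``ppl`` column in the log output above is consistent with the ``loss``
column being measured in base 2, i.e., ``ppl = 2 ** loss``. A quick check
against the logged numbers (copied from the output above; ``perplexity`` is a
helper defined only for this example):

.. code-block:: python

  def perplexity(loss_base2):
      # Perplexity for a loss measured in bits (base-2 logarithm).
      return 2 ** loss_base2


  print(round(perplexity(1.200), 2))    # training loss
  print(round(perplexity(1.41304), 2))  # validation loss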


5. Writing an evaluation script
-------------------------------

Finally we can write a short script to evaluate our model on new inputs. Create
a new file named :file:`eval_classifier.py` with the following contents::

  from fairseq import checkpoint_utils, data, options, tasks

  # Parse command-line arguments for generation
  parser = options.get_generation_parser(default_task='simple_classification')
  args = options.parse_args_and_arch(parser)

  # Setup task
  task = tasks.setup_task(args)

  # Load model
  print('| loading model from {}'.format(args.path))
  models, _model_args = checkpoint_utils.load_model_ensemble([args.path], task=task)
  model = models[0]

  while True:
      sentence = input('\nInput: ')

      # Tokenize into characters
      chars = ' '.join(list(sentence.strip()))
      tokens = task.source_dictionary.encode_line(
          chars, add_if_not_exist=False,
      )

      # Build mini-batch to feed to the model
      batch = data.language_pair_dataset.collate(
          samples=[{'id': -1, 'source': tokens}],  # bsz = 1
          pad_idx=task.source_dictionary.pad(),
          eos_idx=task.source_dictionary.eos(),
          left_pad_source=False,
          input_feeding=False,
      )

      # Feed batch to the model and get predictions
      preds = model(**batch['net_input'])

      # Print top 3 predictions and their log-probabilities
      top_scores, top_labels = preds[0].topk(k=3)
      for score, label_idx in zip(top_scores, top_labels):
          label_name = task.target_dictionary.string([label_idx])
          print('({:.2f})\t{}'.format(score, label_name))

Now we can evaluate our model interactively. Note that we have included the
original data path (:file:`names-bin/`) so that the dictionaries can be loaded:

.. code-block:: console

  > python eval_classifier.py names-bin --path checkpoints/checkpoint_best.pt
  | [input] dictionary: 64 types
  | [label] dictionary: 24 types
  | loading model from checkpoints/checkpoint_best.pt

  Input: Satoshi
  (-0.61) Japanese
  (-1.20) Arabic
  (-2.86) Italian

  Input: Sinbad
  (-0.30) Arabic
  (-1.76) English
  (-4.08) Russian
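
The scores printed above are log-probabilities, so exponentiating them gives
the probability assigned to each label. The following sketch reproduces the
top-3 selection in plain Python; the raw scores in ``logits`` are made up for
illustration and ``log_softmax`` is a hypothetical helper, not part of the
evaluation script:

.. code-block:: python

  import math


  def log_softmax(logits):
      # Subtract the maximum first for numerical stability.
      m = max(logits.values())
      log_z = m + math.log(sum(math.exp(v - m) for v in logits.values()))
      return {label: v - log_z for label, v in logits.items()}


  # Hypothetical unnormalized scores for four labels.
  logits = {'Japanese': 2.0, 'Arabic': 1.4, 'Italian': -0.3, 'Russian': -1.5}
  log_probs = log_softmax(logits)

  # Keep the three highest-scoring labels, best first.
  top3 = sorted(log_probs.items(), key=lambda kv: kv[1], reverse=True)[:3]
  for label, score in top3:
      print('({:.2f})\t{}'.format(score, label))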


================================================
FILE: docs/tutorial_simple_lstm.rst
================================================
Tutorial: Simple LSTM
=====================

In this tutorial we will extend fairseq by adding a new
:class:`~fairseq.models.FairseqEncoderDecoderModel` that encodes a source
sentence with an LSTM and then passes the final hidden state to a second LSTM
that decodes the target sentence (without attention).

This tutorial covers:

1. **Writing an Encoder and Decoder** to encode/decode the source/target
   sentence, respectively.
2. **Registering a new Model** so that it can be used with the existing
   :ref:`Command-line tools`.
3. **Training the Model** using the existing command-line tools.
4. **Making generation faster** by modifying the Decoder to use
   :ref:`Incremental decoding`.


1. Building an Encoder and Decoder
----------------------------------

In this section we'll define a simple LSTM Encoder and Decoder. All Encoders
should implement the :class:`~fairseq.models.FairseqEncoder` interface and
Decoders should implement the :class:`~fairseq.models.FairseqDecoder` interface.
These interfaces themselves extend :class:`torch.nn.Module`, so FairseqEncoders
and FairseqDecoders can be written and used in the same ways as ordinary PyTorch
Modules.


Encoder
~~~~~~~

Our Encoder will embed the tokens in the source sentence, feed them to a
:class:`torch.nn.LSTM` and return the final hidden state. To create our encoder
save the following in a new file named :file:`fairseq/models/simple_lstm.py`::

  import torch.nn as nn
  from fairseq import utils
  from fairseq.models import FairseqEncoder

  class SimpleLSTMEncoder(FairseqEncoder):

      def __init__(
          self, args, dictionary, embed_dim=128, hidden_dim=128, dropout=0.1,
      ):
          super().__init__(dictionary)
          self.args = args

          # Our encoder will embed the inputs before feeding them to the LSTM.
          self.embed_tokens = nn.Embedding(
              num_embeddings=len(dictionary),
              embedding_dim=embed_dim,
              padding_idx=dictionary.pad(),
          )
          self.dropout = nn.Dropout(p=dropout)

          # We'll use a single-layer, unidirectional LSTM for simplicity.
          self.lstm = nn.LSTM(
              input_size=embed_dim,
              hidden_size=hidden_dim,
              num_layers=1,
              bidirectional=False,
              batch_first=True,
          )

      def forward(self, src_tokens, src_lengths):
          # The inputs to the ``forward()`` function are determined by the
          # Task, and in particular the ``'net_input'`` key in each
          # mini-batch. We discuss Tasks in the next tutorial, but for now just
          # know that *src_tokens* has shape `(batch, src_len)` and *src_lengths*
          # has shape `(batch)`.

          # Note that the source is typically padded on the left. This can be
          # configured by adding the `--left-pad-source "False"` command-line
          # argument, but here we'll make the Encoder handle either kind of
          # padding by converting everything to be right-padded.
          if self.args.left_pad_source:
              # Convert left-padding to right-padding.
              src_tokens = utils.convert_padding_direction(
                  src_tokens,
                  padding_idx=self.dictionary.pad(),
                  left_to_right=True
              )

          # Embed the source.
          x = self.embed_tokens(src_tokens)

          # Apply dropout.
          x = self.dropout(x)

          # Pack the sequence into a PackedSequence object to feed to the LSTM.
          # Recent PyTorch versions expect the lengths to be a CPU tensor.
          x = nn.utils.rnn.pack_padded_sequence(
              x, src_lengths.cpu(), batch_first=True,
          )

          # Get the output from the LSTM.
          _outputs, (final_hidden, _final_cell) = self.lstm(x)

          # Return the Encoder's output. This can be any object and will be
          # passed directly to the Decoder.
          return {
              # this will have shape `(bsz, hidden_dim)`
              'final_hidden': final_hidden.squeeze(0),
          }

      # Encoders are required to implement this method so that we can rearrange
      # the order of the batch elements during inference (e.g., beam search).
      def reorder_encoder_out(self, encoder_out, new_order):
          """
          Reorder encoder output according to `new_order`.

          Args:
              encoder_out: output from the ``forward()`` method
              new_order (LongTensor): desired order

          Returns:
              `encoder_out` rearranged according to `new_order`
          """
          final_hidden = encoder_out['final_hidden']
          return {
              'final_hidden': final_hidden.index_select(0, new_order),
          }
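
To see what converting left-padding to right-padding means, consider this
plain-Python sketch. It only illustrates the idea on nested lists;
``left_to_right_padding`` is a hypothetical helper and the padding index ``1``
is an assumption made for the example:

.. code-block:: python

  PAD = 1  # assumed padding index for this example


  def left_to_right_padding(batch, pad=PAD):
      """Move leading padding symbols of each row to the end of the row."""
      converted = []
      for row in batch:
          # Count padding symbols at the start of the row.
          num_pad = 0
          while num_pad < len(row) and row[num_pad] == pad:
              num_pad += 1
          converted.append(row[num_pad:] + [pad] * num_pad)
      return converted


  left_padded = [[1, 1, 7, 8, 2], [4, 5, 6, 9, 2]]
  print(left_to_right_padding(left_padded))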


Decoder
~~~~~~~

Our Decoder will predict the next word, conditioned on the Encoder's final
hidden state and an embedded representation of the previous target word -- which
is sometimes called *teacher forcing*. More specifically, we'll use a
:class:`torch.nn.LSTM` to produce a sequence of hidden states that we'll project
to the size of the output vocabulary to predict each target word.

::

  import torch
  from fairseq.models import FairseqDecoder

  class SimpleLSTMDecoder(FairseqDecoder):

      def __init__(
          self, dictionary, encoder_hidden_dim=128, embed_dim=128, hidden_dim=128,
          dropout=0.1,
      ):
          super().__init__(dictionary)

          # Our decoder will embed the inputs before feeding them to the LSTM.
          self.embed_tokens = nn.Embedding(
              num_embeddings=len(dictionary),
              embedding_dim=embed_dim,
              padding_idx=dictionary.pad(),
          )
          self.dropout = nn.Dropout(p=dropout)

          # We'll use a single-layer, unidirectional LSTM for simplicity.
          self.lstm = nn.LSTM(
              # For the first layer we'll concatenate the Encoder's final hidden
              # state with the embedded target tokens.
              input_size=encoder_hidden_dim + embed_dim,
              hidden_size=hidden_dim,
              num_layers=1,
              bidirectional=False,
          )

          # Define the output projection.
          self.output_projection = nn.Linear(hidden_dim, len(dictionary))

      # During training Decoders are expected to take the entire target sequence
      # (shifted right by one position) and produce logits over the vocabulary.
      # The *prev_output_tokens* tensor begins with the end-of-sentence symbol,
      # ``dictionary.eos()``, followed by the target sequence.
      def forward(self, prev_output_tokens, encoder_out):
          """
          Args:
              prev_output_tokens (LongTensor): previous decoder outputs of shape
                  `(batch, tgt_len)`, for teacher forcing
              encoder_out (dict): output from the Encoder's ``forward()``
                  method, containing the ``'final_hidden'`` state

          Returns:
              tuple:
                  - the decoder's output logits of shape
                    `(batch, tgt_len, vocab)`
                  - ``None`` in place of attention weights, since this
                    decoder does not use attention
          """
          bsz, tgt_len = prev_output_tokens.size()

          # Extract the final hidden state from the Encoder.
          final_encoder_hidden = encoder_out['final_hidden']

          # Embed the target sequence, which has been shifted right by one
          # position and now starts with the end-of-sentence symbol.
          x = self.embed_tokens(prev_output_tokens)

          # Apply dropout.
          x = self.dropout(x)

          # Concatenate the Encoder's final hidden state to *every* embedded
          # target token.
          x = torch.cat(
              [x, final_encoder_hidden.unsqueeze(1).expand(bsz, tgt_len, -1)],
              dim=2,
          )

          # Using PackedSequence objects in the Decoder is harder than in the
          # Encoder, since the targets are not sorted in descending length order,
          # which is a requirement of ``pack_padded_sequence()``. Instead we'll
          # feed nn.LSTM directly.
          initial_state = (
              final_encoder_hidden.unsqueeze(0),  # hidden
              torch.zeros_like(final_encoder_hidden).unsqueeze(0),  # cell
          )
          output, _ = self.lstm(
              x.transpose(0, 1),  # convert to shape `(tgt_len, bsz, dim)`
              initial_state,
          )
          x = output.transpose(0, 1)  # convert to shape `(bsz, tgt_len, hidden)`

          # Project the outputs to the size of the vocabulary.
          x = self.output_projection(x)

          # Return the logits and ``None`` for the attention weights
          return x, None
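
The relationship between the target sequence and *prev_output_tokens* can be
sketched in plain Python. ``shift_right`` is a hypothetical helper and the
end-of-sentence index ``2`` is an assumption made for the example; in practice
the shifted tensor arrives as part of the mini-batch:

.. code-block:: python

  EOS = 2  # assumed end-of-sentence index for this example


  def shift_right(target, eos=EOS):
      """Build the decoder input by moving the final EOS to the front."""
      assert target[-1] == eos, 'target is expected to end with EOS'
      return [eos] + target[:-1]


  target = [11, 12, 13, EOS]
  prev_output_tokens = shift_right(target)
  # At step t the decoder reads prev_output_tokens[t] and predicts target[t].
  for step, (inp, out) in enumerate(zip(prev_output_tokens, target)):
      print(step, inp, '->', out)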


2. Registering the Model
------------------------

Now that we've defined our Encoder and Decoder we must *register* our model with
fairseq using the :func:`~fairseq.models.register_model` function decorator.
Once the model is registered we'll be able to use it with the existing
:ref:`Command-line Tools`.

All registered models must implement the
:class:`~fairseq.models.BaseFairseqModel` interface. For sequence-to-sequence
models (i.e., any model with a single Encoder and Decoder), we can instead
implement the :class:`~fairseq.models.FairseqEncoderDecoderModel` interface.

Create a small wrapper class in the same file and register it in fairseq with
the name ``'simple_lstm'``::

  from fairseq.models import FairseqEncoderDecoderModel, register_model

  # Note: the register_model "decorator" should immediately precede the
  # definition of the Model class.

  @register_model('simple_lstm')
  class SimpleLSTMModel(FairseqEncoderDecoderModel):

      @staticmethod
      def add_args(parser):
          # Models can override this method to add new command-line arguments.
          # Here we'll add some new command-line arguments to configure dropout
          # and the dimensionality of the embeddings and hidden states.
          parser.add_argument(
              '--encoder-embed-dim', type=int, metavar='N',
              help='dimensionality of the encoder embeddings',
          )
          parser.add_argument(
              '--encoder-hidden-dim', type=int, metavar='N',
              help='dimensionality of the encoder hidden state',
          )
          parser.add_argument(
              '--encoder-dropout', type=float, default=0.1,
              help='encoder dropout probability',
          )
          parser.add_argument(
              '--decoder-embed-dim', type=int, metavar='N',
              help='dimensionality of the decoder embeddings',
          )
          parser.add_argument(
              '--decoder-hidden-dim', type=int, metavar='N',
              help='dimensionality of the decoder hidden state',
          )
          parser.add_argument(
              '--decoder-dropout', type=float, default=0.1,
              help='decoder dropout probability',
          )

      @classmethod
      def build_model(cls, args, task):
          # Fairseq initializes models by calling the ``build_model()``
          # function. This provides more flexibility, since the returned model
          # instance can be of a different type than the one that was called.
          # In this case we'll just return a SimpleLSTMModel instance.

          # Initialize our Encoder and Decoder.
          encoder = SimpleLSTMEncoder(
              args=args,
              dictionary=task.source_dictionary,
              embed_dim=args.encoder_embed_dim,
              hidden_dim=args.encoder_hidden_dim,
              dropout=args.encoder_dropout,
          )
          decoder = SimpleLSTMDecoder(
              dictionary=task.target_dictionary,
              encoder_hidden_dim=args.encoder_hidden_dim,
              embed_dim=args.decoder_embed_dim,
              hidden_dim=args.decoder_hidden_dim,
              dropout=args.decoder_dropout,
          )
          model = SimpleLSTMModel(encoder, decoder)

          # Print the model architecture.
          print(model)

          return model

      # We could override the ``forward()`` if we wanted more control over how
      # the encoder and decoder interact, but it's not necessary for this
      # tutorial since we can inherit the default implementation provided by
      # the FairseqEncoderDecoderModel base class, which looks like:
      #
      # def forward(self, src_tokens, src_lengths, prev_output_tokens):
      #     encoder_out = self.encoder(src_tokens, src_lengths)
      #     decoder_out = self.decoder(prev_output_tokens, encoder_out)
      #     return decoder_out

Finally let's define a *named architecture* with the configuration for our
model. This is done with the :func:`~fairseq.models.register_model_architecture`
function decorator. Thereafter this named architecture can be used with the
``--arch`` command-line argument, e.g., ``--arch tutorial_simple_lstm``::

  from fairseq.models import register_model_architecture

  # The first argument to ``register_model_architecture()`` should be the name
  # of the model we registered above (i.e., 'simple_lstm'). The function we
  # register here should take a single argument *args* and modify it in-place
  # to match the desired architecture.

  @register_model_architecture('simple_lstm', 'tutorial_simple_lstm')
  def tutorial_simple_lstm(args):
      # We use ``getattr()`` to prioritize arguments that are explicitly given
      # on the command-line, so that the defaults defined below are only used
      # when no other value has been specified.
      args.encoder_embed_dim = getattr(args, 'encoder_embed_dim', 256)
      args.encoder_hidden_dim = getattr(args, 'encoder_hidden_dim', 256)
      args.decoder_embed_dim = getattr(args, 'decoder_embed_dim', 256)
      args.decoder_hidden_dim = getattr(args, 'decoder_hidden_dim', 256)


3. Training the Model
---------------------

Now we're ready to train the model. We can use the existing :ref:`fairseq-train`
command-line tool for this, making sure to specify our new Model architecture
(``--arch tutorial_simple_lstm``).

.. note::

  Make sure you've already preprocessed the data from the IWSLT example in the
  :file:`examples/translation/` directory.

.. code-block:: console

  > fairseq-train data-bin/iwslt14.tokenized.de-en \
    --arch tutorial_simple_lstm \
    --encoder-dropout 0.2 --decoder-dropout 0.2 \
    --optimizer adam --lr 0.005 --lr-shrink 0.5 \
    --max-tokens 12000
  (...)
  | epoch 052 | loss 4.027 | ppl 16.30 | wps 420805 | ups 39.7 | wpb 9841 | bsz 400 | num_updates 20852 | lr 1.95313e-05 | gnorm 0.218 | clip 0% | oom 0 | wall 529 | train_wall 396
  | epoch 052 | valid on 'valid' subset | valid_loss 4.74989 | valid_ppl 26.91 | num_updates 20852 | best 4.74954

The model files should appear in the :file:`checkpoints/` directory. While this
model architecture is not very good, we can use the :ref:`fairseq-generate` script to
generate translations and compute our BLEU score over the test set:

.. code-block:: console

  > fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --beam 5 \
    --remove-bpe
  (...)
  | Translated 6750 sentences (153132 tokens) in 17.3s (389.12 sentences/s, 8827.68 tokens/s)
  | Generate test with beam=5: BLEU4 = 8.18, 38.8/12.1/4.7/2.0 (BP=1.000, ratio=1.066, syslen=139865, reflen=131146)


4. Making generation faster
---------------------------

While autoregressive generation from sequence-to-sequence models is inherently
slow, our implementation above is especially slow because it recomputes the
entire sequence of Decoder hidden states for every output token (i.e., it is
``O(n^2)``). We can make this significantly faster by instead caching the
previous hidden states.

In fairseq this is called :ref:`Incremental decoding`. Incremental decoding is a
special mode at inference time where the Model only receives a single timestep
of input corresponding to the immediately previous output token (for teacher
forcing) and must produce the next output incrementally. Thus the model must
cache any long-term state that is needed about the sequence, e.g., hidden
states, convolutional states, etc.
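
The saving can be illustrated with a toy recurrence in plain Python. This is
only a sketch: ``cell`` is a made-up stand-in for one LSTM step, and the counter
records how many times it runs when the prefix is recomputed at every step
versus when the previous state is cached:

.. code-block:: python

  calls = {'full': 0, 'incremental': 0}


  def cell(state, token, mode):
      # Stand-in for one recurrent step; counts how often it is invoked.
      calls[mode] += 1
      return (state * 31 + token) % 1000003


  def decode_full(tokens):
      # Recompute the whole prefix for every output position: O(n^2) steps.
      outputs = []
      for t in range(1, len(tokens) + 1):
          state = 0
          for token in tokens[:t]:
              state = cell(state, token, 'full')
          outputs.append(state)
      return outputs


  def decode_incremental(tokens):
      # Keep the previous state and advance it by one step: O(n) steps.
      outputs, state = [], 0
      for token in tokens:
          state = cell(state, token, 'incremental')
          outputs.append(state)
      return outputs


  tokens = list(range(1, 11))
  full_outputs = decode_full(tokens)
  incremental_outputs = decode_incremental(tokens)
  print(full_outputs == incremental_outputs, calls)

Both functions produce the same outputs, but for 10 tokens the full version
runs the cell 55 times while the cached version runs it 10 times.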

To implement incremental decoding we will modify our model to implement the
:class:`~fairseq.models.FairseqIncrementalDecoder` interface. Compared to the
standard :class:`~fairseq.models.FairseqDecoder` interface, the incremental
decoder interface allows ``forward()`` methods to take an extra keyword argument
(*incremental_state*) that can be used to cache state across time-steps.

Let's replace our ``SimpleLSTMDecoder`` with an incremental one::

  import torch
  from fairseq.models import FairseqIncrementalDecoder

  class SimpleLSTMDecoder(FairseqIncrementalDecoder):

      def __init__(
          self, dictionary, encoder_hidden_dim=128, embed_dim=128, hidden_dim=128,
          dropout=0.1,
      ):
          # This remains the same as before.
          super().__init__(dictionary)
          self.embed_tokens = nn.Embedding(
              num_embeddings=len(dictionary),
              embedding_dim=embed_dim,
              padding_idx=dictionary.pad(),
          )
          self.dropout = nn.Dropout(p=dropout)
          self.lstm = nn.LSTM(
              input_size=encoder_hidden_dim + embed_dim,
              hidden_size=hidden_dim,
              num_layers=1,
              bidirectional=False,
          )
          self.output_projection = nn.Linear(hidden_dim, len(dictionary))

      # We now take an additional kwarg (*incremental_state*) for caching the
      # previous hidden and cell states.
      def forward(self, prev_output_tokens, encoder_out, incremental_state=None):
          if incremental_state is not None:
              # If the *incremental_state* argument is not ``None`` then we are
              # in incremental inference mode. While *prev_output_tokens* will
              # still contain the entire decoded prefix, we will only use the
              # last step and assume that the rest of the state is cached.
              prev_output_tokens = prev_output_tokens[:, -1:]

          # This remains the same as before.
          bsz, tgt_len = prev_output_tokens.size()
          final_encoder_hidden = encoder_out['final_hidden']
          x = self.embed_tokens(prev_output_tokens)
          x = self.dropout(x)
          x = torch.cat(
              [x, final_encoder_hidden.unsqueeze(1).expand(bsz, tgt_len, -1)],
              dim=2,
          )

          # We will now check the cache and load the cached previous hidden and
          # cell states, if they exist, otherwise we will initialize them from
          # the Encoder's final hidden state (as before). We will use the
          # ``utils.get_incremental_state()`` and
          # ``utils.set_incremental_state()`` helpers.
          initial_state = utils.get_incremental_state(
              self, incremental_state, 'prev_state',
          )
          if initial_state is None:
              # first time initialization, same as the original version
              initial_state = (
                  final_encoder_hidden.unsqueeze(0),  # hidden
                  torch.zeros_like(final_encoder_hidden).unsqueeze(0),  # cell
              )

          # Run one step of our LSTM.
          output, latest_state = self.lstm(x.transpose(0, 1), initial_state)

          # Update the cache with the latest hidden and cell states.
          utils.set_incremental_state(
              self, incremental_state, 'prev_state', latest_state,
          )

          # This remains the same as before
          x = output.transpose(0, 1)
          x = self.output_projection(x)
          return x, None

      # The ``FairseqIncrementalDecoder`` interface also requires implementing a
      # ``reorder_incremental_state()`` method, which is used during beam search
      # to select and reorder the incremental state.
      def reorder_incremental_state(self, incremental_state, new_order):
          # Load the cached state.
          prev_state = utils.get_incremental_state(
              self, incremental_state, 'prev_state',
          )

          # Reorder batches according to *new_order*.
          reordered_state = (
              prev_state[0].index_select(1, new_order),  # hidden
              prev_state[1].index_select(1, new_order),  # cell
          )

          # Update the cached state.
          utils.set_incremental_state(
              self, incremental_state, 'prev_state', reordered_state,
          )
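
What ``index_select`` does with *new_order* can be shown on a plain list. In
this sketch ``reorder`` is a hypothetical helper and the cached states are
strings purely for readability; during beam search the same hypothesis may be
selected several times while others are dropped:

.. code-block:: python

  def reorder(cached_states, new_order):
      """Pick cached per-hypothesis states in the order given by new_order."""
      return [cached_states[i] for i in new_order]


  # One cached state per hypothesis in the beam.
  cached_states = ['state_a', 'state_b', 'state_c']
  # Hypothesis 0 survives twice, hypothesis 2 once, hypothesis 1 is dropped.
  new_order = [0, 0, 2]
  print(reorder(cached_states, new_order))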

Finally, we can rerun generation and observe the speedup (from 17.3s to 5.5s in
the runs below, roughly 3x, with an unchanged BLEU score):

.. code-block:: console

  # Before

  > fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --beam 5 \
    --remove-bpe
  (...)
  | Translated 6750 sentences (153132 tokens) in 17.3s (389.12 sentences/s, 8827.68 tokens/s)
  | Generate test with beam=5: BLEU4 = 8.18, 38.8/12.1/4.7/2.0 (BP=1.000, ratio=1.066, syslen=139865, reflen=131146)

  # After

  > fairseq-generate data-bin/iwslt14.tokenized.de-en \
    --path checkpoints/checkpoint_best.pt \
    --beam 5 \
    --remove-bpe
  (...)
  | Translated 6750 sentences (153132 tokens) in 5.5s (1225.54 sentences/s, 27802.94 tokens/s)
  | Generate test with beam=5: BLEU4 = 8.18, 38.8/12.1/4.7/2.0 (BP=1.000, ratio=1.066, syslen=139865, reflen=131146)


================================================
FILE: examples/.gitignore
================================================
!*/*.sh
!*/*.md


================================================
FILE: examples/MMPT/.gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
#  Usually these files are written by a python script from a template
#  before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
#   However, in case of collaboration, if having platform-specific dependencies or dependencies
#   having no cross-platform support, pipenv may install dependencies that don't work, or not
#   install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
runs
data
pretrained_models
projects/mmfusion_*
log_test
third-party
python_log
slurm_snapshot_code
lightning_logs
demos


================================================
FILE: examples/MMPT/CONFIG.md
================================================
### Config Files Explained

Take `projects/mfmmlm.yaml` as an example, which runs pretraining using a masked frame model (MFM) and a masked language model (MLM) on a single BERT:

```yaml
project_dir: mfmmlm # specify the project dir for this baseline.
run_task:
  - how2.yaml # run pretraining on how2 when launching `projects/taskmfmmlm.yaml`
  - [vtt.yaml, vttcap.yaml, vttqa.yaml, youcook.yaml, youcookcap.yaml, crosstask.yaml, coin.yaml] # run fine-tuning tasks.
base_dir: task # a global template folder to specify each training task. 
task_group:
  pretrain: # section for pretraining. Most baselines differ in this section.
    task_list:
      - how2.yaml # reconfig `projects/task/how2.yaml`
    dataset:
      aligner: MFMMLMAligner # overwrite the aligner for MFMMLM training task.
    model:
      model_cls: MMFusionMFMMLM # overwrite the model, which constructs negative examples for MFM on-the-fly.
    loss:
      loss_cls: MFMMLM # overwrite the loss as MFMMLM, which combines MFM and MLM together.
    fairseq: # all fairseq args can be specified under this name.
      dataset:
        batch_size: 128
  finetune: # section for fine-tuning tasks, we don't need to change anything here mostly since we want to see how pretraining can contribute to finetuning.
    task_list: # specify the list of downstream tasks, e.g., copy `projects/task/vtt.yaml` to `projects/mfmmlm`.
      - vtt.yaml
      - vttqa.yaml
      - youcook.yaml
      - youcookcap.yaml
      - crosstask.yaml
      - coin.yaml
  test: # section for testing.
    task_list:
      - test_vtt.yaml
      - test_vttqa.yaml
      - test_youcook.yaml
      - test_youcookcap.yaml
      - test_crosstask.yaml
      - test_crosstask_zs.yaml
      - test_coin.yaml
```
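
Each section under `task_group` is merged on top of the base task configs it lists; `locallaunch.py` does this with `OmegaConf.merge`. The following is a minimal standard-library sketch of that override behavior on plain dicts, assuming nested keys are merged recursively and values from the override win. The `deep_merge` helper and the sample values in `base` are hypothetical and only illustrate the idea.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where values in `override` replace or extend `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # both sides are sections: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # scalars, lists and new keys are taken from the override.
            merged[key] = copy.deepcopy(value)
    return merged


# hypothetical stand-in for the base config `projects/task/how2.yaml`.
base = {
    "dataset": {"aligner": "SomeBaseAligner", "split": "train"},
    "fairseq": {"dataset": {"batch_size": 256}},
}
# the `pretrain` section of the global config, minus `task_list`.
override = {
    "dataset": {"aligner": "MFMMLMAligner"},
    "model": {"model_cls": "MMFusionMFMMLM"},
    "loss": {"loss_cls": "MFMMLM"},
    "fairseq": {"dataset": {"batch_size": 128}},
}

config = deep_merge(base, override)
print(config["dataset"])
print(config["fairseq"]["dataset"]["batch_size"])
```

Keys that the override does not mention (here `split`) are kept from the base config, which is why a baseline only needs to list what it changes.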


================================================
FILE: examples/MMPT/DATASET.md
================================================
# Dataset

We understand that video data are challenging to download and process. For videos, we provide our preprocessing scripts under `scripts/video_feature_extractor` (heavily adapted from `https://github.com/antoine77340/video_feature_extractor`); for text, we provide pre-tokenizing scripts under `scripts/text_token_extractor`.

### S3D Feature Extraction
We use pre-trained [S3D](https://github.com/antoine77340/S3D_HowTo100M) for video feature extraction. Please place the models as `pretrained_models/s3d_dict.npy` and `pretrained_models/s3d_howto100m.pth`.

We implement a `PathBuilder` to automatically track video ids, source video paths to their feature locations (you may need `conda install -c anaconda pandas`). Decoding may need `pip install ffmpeg-python`.

### Howto100M
[Howto100M](https://www.di.ens.fr/willow/research/howto100m/) is a large-scale video pre-training dataset. You may download the videos yourself and run our preprocessing scripts. 

Our preprocessing differs from existing papers in several key ways: (1) we use `raw_caption.json` instead of `caption.json` to have pure self-supervision on text (`caption.json` has manual removal of stop words); (2) we remove partially duplicated texts that were originally designed for real-time readability (see `mmpt/processors/dedupprocessor.py`); (3) we then shard video/text features using `ShardedTensor` in `mmpt/utils/shardedtensor.py` for fast loading during training (faster than `h5py`).

#### Steps
##### video
To extract video features: edit and run `bash scripts/video_feature_extractor/how2/s3d.sh`. (Consider running this on multiple machines; by default, we store features in fp16 to save space and for faster training.)

Split available video ids as `data/how2/how2_s3d_train.lst` and `data/how2/how2_s3d_val.lst`.
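
A minimal sketch of the split step, assuming each `.lst` file holds one video id per line and that a small random held-out fraction is acceptable for validation. The helper names, the 1% ratio and the seed are illustrative assumptions, not part of the toolkit.

```python
import os
import random
import tempfile


def split_video_ids(video_ids, val_ratio=0.01, seed=0):
    """Shuffle ids deterministically and hold out a fraction for validation."""
    ids = sorted(set(video_ids))
    random.Random(seed).shuffle(ids)
    num_val = max(1, int(len(ids) * val_ratio))
    return ids[num_val:], ids[:num_val]


def write_lst(path, video_ids):
    """Write one video id per line (assumed `.lst` layout)."""
    with open(path, "w") as fw:
        for video_id in video_ids:
            fw.write(video_id + "\n")


# hypothetical ids; in practice these are the videos with extracted features.
all_ids = ["vid{:04d}".format(i) for i in range(200)]
train_ids, val_ids = split_video_ids(all_ids)

# written to a temp dir here; the real targets live under `data/how2/`.
out_dir = tempfile.mkdtemp()
write_lst(os.path.join(out_dir, "how2_s3d_train.lst"), train_ids)
write_lst(os.path.join(out_dir, "how2_s3d_val.lst"), val_ids)
print(len(train_ids), len(val_ids))
```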

Lastly, pack video features into `ShardedTensor` using `python scripts/video_feature_extractor/shard_feature.py`.

##### text
Clean captions using `python -m mmpt.processors.dedupprocessor`.

Tokenize the deduplicated captions `data/how2/raw_caption_dedup.pkl` into sharded numpy arrays:  
```
python scripts/text_token_extractor/pretokenization.py scripts/text_token_extractor/configs/bert-base-uncased.yaml
```

### Youcook, MSRVTT etc.
We use the versions of Youcook and MSRVTT that come with Howto100M and MILNCE. Please download the data to `data/youcook` and `data/msrvtt` accordingly; you can also check `projects/task/youcook.yaml`, `projects/task/vtt.yaml`, etc. for details. 
We extract features for Youcook and MSRVTT in the same way as the first step of Howto100M, but we read text from the meta data directly and perform on-the-fly tokenization.



================================================
FILE: examples/MMPT/README.md
================================================
# VideoCLIP and VLM

You have just found this toolkit for multimodal video understanding! It contains implementations of two recent multi-modal video understanding papers, [VideoCLIP](https://arxiv.org/pdf/2109.14084.pdf) (EMNLP, 2021) and [VLM](https://aclanthology.org/2021.findings-acl.370.pdf) (ACL Findings, 2021), along with high-performance toolkits that are typically lacking in existing codebases. The toolkit is designed to contain generic performance-tuned components that can potentially be adapted to other frameworks (we initially use fairseq). 

VideoCLIP is a contrastive learning model for zero-shot transfer to retrieval/classification/sequence labeling style tasks.

<img src="videoclip.png" width="350" class="center">

VLM is a masked language model style pre-training using only one encoder with masked modality model (MMM) for retrieval/generation/sequence labeling style tasks.

<img src="vlm.png" width="350" class="center">

### News
[Oct. 2021] Initial release of implementation for the following papers:  
[VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding](https://arxiv.org/pdf/2109.14084.pdf) (Xu et al., EMNLP 2021)  
[VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding](https://aclanthology.org/2021.findings-acl.370.pdf) (Xu et al., ACL Findings 2021)  


### Installation
We aim to minimize the dependency of this repo on other packages.  
We use fairseq as the main trainer (the models and datasets do not depend on fairseq; we plan to support other trainers in the future):  
```
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install -e .  # also optionally follow fairseq README for apex installation for fp16 training.
export MKL_THREADING_LAYER=GNU  # fairseq may need this for numpy.
```

Then install this toolkit:
```
cd examples/MMPT  # MMPT can be in any folder, not necessarily under fairseq/examples.
pip install -e .
```

The code is developed under Python=3.8.8, Pytorch=1.8, cuda=11.0 with fairseq=1.0.0a0+af0389f and tested under Python=3.8.8 pytorch=1.9 cuda=11.0 fairseq=1.0.0a0+8e7bc73 during code release.
Most models require `transformers==3.4` for API compatibility: `pip install transformers==3.4`. 
In addition, some downstream tasks may need `conda install pandas`.  


### Usage
#### Download Checkpoints
We use pre-trained [S3D](https://github.com/antoine77340/S3D_HowTo100M) for video feature extraction. Please place the models as `pretrained_models/s3d_dict.npy` and `pretrained_models/s3d_howto100m.pth`.

Download VideoCLIP checkpoint `https://dl.fbaipublicfiles.com/MMPT/retri/videoclip/checkpoint_best.pt` to `runs/retri/videoclip` or VLM checkpoint `https://dl.fbaipublicfiles.com/MMPT/mtm/vlm/checkpoint_best.pt` to `runs/mtm/vlm`.

#### Demo of Inference
Run `python locallaunch.py projects/retri/videoclip.yaml --dryrun` to get all `.yaml`s for VideoCLIP.

```python
import torch

from mmpt.models import MMPTModel


model, tokenizer, aligner = MMPTModel.from_pretrained(
    "projects/retri/videoclip/how2.yaml")

model.eval()


# B, T, FPS, H, W, C (VideoCLIP is trained on 30 fps of s3d)
video_frames = torch.randn(1, 2, 30, 224, 224, 3)
caps, cmasks = aligner._build_text_seq(
    tokenizer("some text", add_special_tokens=False)["input_ids"]
)

caps, cmasks = caps[None, :], cmasks[None, :]  # bsz=1

with torch.no_grad():
    output = model(video_frames, caps, cmasks, return_score=True)
print(output["score"])  # dot-product
```

#### Data Preparation
See [dataset](DATASET.md) for each dataset.

#### Global Config for Training Pipeline
We organize a global config file for a training/testing pipeline under projects (see a detailed [explanation](CONFIG.md)). For example, VideoCLIP is in `projects/retri/videoclip.yaml` and VLM is in `projects/mtm/vlm.yaml`.

We wrap all cmds into `locallaunch.py` and `mmpt_cli/localjob.py`. You can check concrete cmds by `--dryrun` and then drop it for actual run.  

First, running `python locallaunch.py projects/retri/videoclip.yaml --dryrun` will generate the configs for pre-training, zero-shot evaluation, fine-tuning and testing of VideoCLIP under `projects/retri/videoclip`.  

Then each process (either training or evaluation) is configured by a concrete config file (we save all complex arguments into the concrete config file for reproducibility, including fairseq args). For example, to run zero-shot evaluation, fine-tuning and testing on youcook:
```
python locallaunch.py projects/retri/videoclip/test_youcook_zs.yaml --jobtype local_predict  # zero-shot evaluation.
python locallaunch.py projects/retri/videoclip/youcook_videoclip.yaml --jobtype local_single --dryrun  # fine-tuning: use --dryrun to check cmds and drop it to make an actual run; local_small will run on two gpus (as in paper).
python locallaunch.py projects/retri/videoclip/test_youcook_videoclip.yaml --jobtype local_predict  # testing on fine-tuned model.
```

Pretraining can be run as:  
```
python locallaunch.py projects/retri/videoclip/how2.yaml --jobtype local_single --dryrun # check the cmds, then drop --dryrun; the paper was run with local_big on 8 gpus.
```
You may need to change `--jobtype` and check or extend `LocalJob` in `mmpt_cli/localjob.py` for multi-gpu/multi-node pre-training.
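
As a rough illustration of how `--jobtype` selects a launcher: `locallaunch.py` takes the part of the job type before the first underscore as the key into its job table, so `local_single` and `local_predict` both map to `LocalJob`. The sketch below mirrors that lookup; `resolve_job_key` and `JOB_TABLE` are hypothetical names, and the table holds placeholder strings rather than the real classes.

```python
# placeholder table; the real one in `locallaunch.py` maps keys to job classes.
JOB_TABLE = {"local": "LocalJob"}


def resolve_job_key(cli_job_type, config_task_type=None, default="local"):
    """Pick the launcher key from --jobtype, else the config, else a default."""
    # an empty --jobtype falls back to the task_type stored in the config.
    job_type = cli_job_type or config_task_type
    if not job_type:
        return default
    # "local_single" -> "local"; the suffix is interpreted by the job class.
    return job_type.split("_")[0]


for job_type in ["local_single", "local_predict", ""]:
    key = resolve_job_key(job_type)
    print(repr(job_type), "->", key, "->", JOB_TABLE[key])
```

Supporting a new launcher would then amount to adding a new key to the table and a class that understands the suffixes after the underscore.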

The detailed instructions of pretraining and fine-tuning can be found at [pretraining instruction](pretraining.md) and [finetuning instruction](endtask.md).


### Development
Several components of this toolkit can be re-used for future research (and also our ongoing research).

#### Framework Wrapper
We currently only support fairseq, but most components can easily be fit into other frameworks like huggingface. This repo is a `--user-dir` of fairseq with fairseq wrappers. For example, `mmpt/tasks` includes a `FairseqMMTask`, which manages `mmpt/datasets` with `FairseqDataset`, `mmpt/models` with `FairseqModel`, and `mmpt/losses` with `FairseqCriterion`.  

#### Processors
**Multi**modal research introduces the complexity of modality alignment, from different input sources to losses. Inspired by [MMF](https://github.com/facebookresearch/mmf), this toolkit leverages `mmpt/processors` to handle the various needs of data preprocessing and loading, **alleviating** the need for multiple `torch.utils.data.Dataset` classes (which can be tricky for ablation studies).  
Processors can also be decoupled from `torch.utils.data.Dataset` for offline preprocessing instead of on-the-fly data preprocessing.

We decouple a `mmpt.MMDataset` into 4 types of processors: `MetaProcessor`, `VideoProcessor`, `TextProcessor` and `Aligner`. They can be configured in the `dataset` field of a config file (e.g., see `projects/task/how2.yaml`).  
`MetaProcessor` is used to load the meta data about a dataset, e.g., all video_ids of the how2 dataset.  
`VideoProcessor` is used to load the video features about a dataset. For example, S3D features for each second of a video.  
`TextProcessor` is used to load the text (feature). For example, BERT pre-tokenized text clips for how2 dataset (with `start`s, `end`s of timestamps and `cap` for `token_ids`).  
`Aligner` is the core class for different baselines that prepares the training data. For example, sampling a clip, masking tokens for MLM, etc.
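
The flow through the four processors can be sketched with toy stand-ins. The call order follows `MMDataset.__getitem__` in `mmpt/datasets/mmdataset.py`; the toy classes, their data and every output key other than `idx` are made up for illustration.

```python
class ToyMetaProcessor:
    """Lists the examples of a split and maps an index to (video_id, text_id)."""

    split = "train"

    def __init__(self, video_ids):
        self.video_ids = video_ids

    def __len__(self):
        return len(self.video_ids)

    def __getitem__(self, idx):
        video_id = self.video_ids[idx]
        return video_id, video_id


class ToyVideoProcessor:
    """Returns per-second features; here 3 seconds of 4-dim zeros."""

    def __call__(self, video_id):
        return [[0.0] * 4 for _ in range(3)]


class ToyTextProcessor:
    """Returns pre-tokenized clips with timestamps."""

    def __call__(self, text_id):
        return {"start": [0.0], "end": [3.0], "cap": [[7, 8, 9]]}


class ToyAligner:
    """Pairs the first text clip with the video seconds it covers."""

    def __call__(self, video_id, video_feature, text_feature):
        start = int(text_feature["start"][0])
        end = int(text_feature["end"][0])
        return {
            "video_id": video_id,
            "vfeats": video_feature[start:end],
            "caps": text_feature["cap"][0],
        }


def get_example(idx, meta, video, text, aligner):
    # same call order as MMDataset.__getitem__.
    video_id, text_id = meta[idx]
    output = aligner(video_id, video(video_id), text(text_id))
    output["idx"] = idx
    return output


example = get_example(
    1, ToyMetaProcessor(["v0", "v1"]), ToyVideoProcessor(),
    ToyTextProcessor(), ToyAligner())
print(example["video_id"], len(example["vfeats"]), example["caps"])
```

Because only the aligner decides how video and text are combined, swapping the aligner is enough to turn the same loaded features into a different training objective.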

#### Performance-tuned Components
To speed up pre-training, this toolkit uses sharded features stored in memory-mapped numpy arrays, backed by `ShardedTensor` in `mmpt/utils/shardedtensor.py` (adopted from the MARGE paper). This reduces the IO load for multi-GPU training, since it avoids loading all features of a video into memory each time, and `ShardedTensor` ensures features are stored in contiguous disk space for near random access. This is used for both How2 video features and texts in `mmpt/processors/how2processor.py`.
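
The idea of keeping variable-length items in one contiguous buffer with an offset index can be sketched with the standard library alone. This is not the `ShardedTensor` API: `ToyPackedStore` is a hypothetical class, it keeps everything in memory, and it leaves out the on-disk shards and numpy memory mapping that the real class relies on.

```python
from array import array


class ToyPackedStore:
    """Pack variable-length float rows into one flat buffer plus offsets."""

    def __init__(self, rows):
        self.data = array("f")  # one contiguous float32 buffer.
        self.starts = [0]       # starts[i] is where item i begins.
        for row in rows:
            self.data.extend(row)
            self.starts.append(len(self.data))

    def __len__(self):
        return len(self.starts) - 1

    def __getitem__(self, idx):
        # one contiguous slice per item; no per-item file and no full scan.
        return self.data[self.starts[idx]:self.starts[idx + 1]]


# three "videos" with 2, 3 and 1 feature values respectively.
store = ToyPackedStore([[1.0, 2.0], [3.0, 4.0, 5.0], [6.0]])
print(len(store), store[1].tolist())
```

With a memory-mapped buffer in place of the in-memory array, the same offset lookup would read only the bytes of the requested item.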


### Citation
If this codebase is useful for your work, please cite the following papers:

```BibTeX
@inproceedings{xu-etal-2021-videoclip,
    title = "{VideoCLIP}: Contrastive Pre-training for Zero-shot Video-Text Understanding",
    author = "Xu, Hu  and
      Ghosh, Gargi  and
      Huang, Po-Yao  and
      Okhonko, Dmytro  and
      Aghajanyan, Armen  and
      Metze, Florian  and
      Zettlemoyer, Luke  and
      Feichtenhofer, Christoph",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
}

@inproceedings{xu-etal-2021-vlm,
    title = "{VLM}: Task-agnostic Video-Language Model Pre-training for Video Understanding",
    author = "Xu, Hu  and
      Ghosh, Gargi  and
      Huang, Po-Yao  and
      Arora, Prahal  and
      Aminzadeh, Masoumeh  and
      Feichtenhofer, Christoph  and
      Metze, Florian  and
      Zettlemoyer, Luke",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.370",
    doi = "10.18653/v1/2021.findings-acl.370",
    pages = "4227--4239",
}
```

### Bug Reports
This repo is in its initial stage; bug reports are welcome at huxu@fb.com

### Copyright
The majority of Multimodal Pre-training (MMPT) is licensed under CC-BY-NC, however portions of the project are available under separate license terms: Evaluation Codes/Models: Howto100M and HuggingFace Transformers are licensed under the Apache2.0 license; COIN and NLG-eval are licensed under the MIT license; CrossTask is licensed under the BSD-3; DiDeMo is licensed under the BSD-2 license.


================================================
FILE: examples/MMPT/endtask.md
================================================
# Zero-shot Transfer and Finetuning

(If you are new to the ideas of `mmpt.processors`, see [README](README.md) first.)
All finetuning datasets (specifically `processors`) are defined in `mmpt.processors.dsprocessor`.
Given the complexity of different types of finetuning tasks, each task may have its own meta/video/text/aligner processors and `mmpt/evaluators/{Predictor,Metric}`.

### Tasks

Currently, we support 5 end datasets: `MSRVTT`, `Youcook`, `COIN`, `Crosstask` and `DiDeMo` with the following tasks:  
text-video retrieval: `MSRVTT`, `Youcook`, `DiDeMo`;   
video captioning: `Youcook`;  
Video Question and Answering: `MSRVTT-QA`.  

To add your own dataset, you can specify the corresponding processors and config them in the `dataset` field of a config file, such as `projects/task/vtt.yaml`.

### Zero-shot Transfer (no Training)
Zero-shot transfer runs the pre-trained model (e.g., VideoCLIP) directly on testing data. Configs matching the pattern `projects/task/*_zs_*.yaml` are dedicated to zero-shot transfer.

### Fine-tuning

The training of a downstream task is similar to pretraining, except that you may need to specify the `restore_file` in `fairseq.checkpoint` and reset the optimizers; see `projects/task/ft.yaml`, which is included by `projects/task/vtt.yaml`.

We typically do finetuning on 2 gpus (`local_small`).

### Testing
For each finetuning dataset, you may need to specify a testing config, similar to `projects/task/test_vtt.yaml`.  

We define `mmpt.evaluators.Predictor` for different types of prediction. For example, `MSRVTT` and `Youcook` are video-retrieval tasks and are expected to use `RetrievalPredictor`. You may need to define your own type of predictor and specify it in the `predictor` field of a testing config.
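
The `predictor` field is resolved by class name: `Evaluator` in `mmpt/evaluators/evaluator.py` looks the name up with `getattr` on the predictor module and instantiates the class with the config. A minimal sketch of that lookup follows; `MyNewPredictor`, `build_by_name` and the namespace standing in for the module are hypothetical.

```python
import types


class RetrievalPredictor:
    def __init__(self, config):
        self.config = config


class MyNewPredictor(RetrievalPredictor):
    """Hypothetical predictor for a new task type."""


# stand-in for the module that holds all predictor classes.
predictor_module = types.SimpleNamespace(
    RetrievalPredictor=RetrievalPredictor,
    MyNewPredictor=MyNewPredictor,
)


def build_by_name(module, class_name, config):
    """Look up `class_name` in `module` and instantiate it with `config`."""
    if class_name is None:
        raise ValueError("no class name given in the config.")
    return getattr(module, class_name)(config)


# a plain dict stands in for the testing config here.
testing_config = {"predictor": "MyNewPredictor"}
predictor = build_by_name(
    predictor_module, testing_config["predictor"], testing_config)
print(type(predictor).__name__)
```

A new predictor therefore only has to be importable from the predictor module under the name used in the config.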

Each task may also have its own metric for evaluation. This can be created in `mmpt.evaluators.Metric` and specified in the `metric` field of a testing config.
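
As an example of what such a metric computes, the recall@K and median-rank numbers reported for retrieval can be restated in plain Python. This is a sketch, not the toolkit's `RetrievalMetric` (which works on numpy arrays); it assumes a square score matrix where the correct candidate for query `i` sits at column `i` and that there are no tied scores.

```python
import statistics


def retrieval_metrics(scores):
    """scores[i][j]: similarity of query i and candidate j; the match is j == i."""
    ranks = []
    for i, row in enumerate(scores):
        # 0-based rank: number of candidates scored strictly above the match.
        ranks.append(sum(1 for score in row if score > row[i]))
    num_queries = len(ranks)
    return {
        "R1": sum(rank == 0 for rank in ranks) / num_queries,
        "R5": sum(rank < 5 for rank in ranks) / num_queries,
        "R10": sum(rank < 10 for rank in ranks) / num_queries,
        "MR": statistics.median(ranks) + 1,  # median rank, 1-based.
    }


# toy scores: queries 0 and 2 rank their match first, query 1 ranks it last.
scores = [
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.4, 0.7],
]
metrics = retrieval_metrics(scores)
print(metrics)
```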

Launching a test is as simple as launching training; just specify the path of a testing config:
```
python locallaunch.py projects/mfmmlm/test_vtt.yaml
```
Testing will be launched locally by default since prediction is computationally less expensive.

### Third-party Libraries
We list the following finetuning tasks that require third-party libraries.

Youcook captioning: `https://github.com/Maluuba/nlg-eval`  

CrossTask: `https://github.com/DmZhukov/CrossTask`'s `dp` under `third-party/CrossTask` (`python setup.py build_ext --inplace`)


================================================
FILE: examples/MMPT/locallaunch.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import argparse
import os

from omegaconf import OmegaConf

from mmpt.utils import recursive_config, overwrite_dir
from mmpt_cli.localjob import LocalJob


class JobLauncher(object):
    JOB_CONFIG = {
        "local": LocalJob,
    }

    def __init__(self, yaml_file):
        self.yaml_file = yaml_file
        job_key = "local"

        if yaml_file.endswith(".yaml"):
            config = recursive_config(yaml_file)
            if config.task_type is not None:
                job_key = config.task_type.split("_")[0]
        else:
            raise ValueError("unknown extension of job file:", yaml_file)
        self.job_key = job_key

    def __call__(self, job_type=None, dryrun=False):
        if job_type is not None:
            self.job_key = job_type.split("_")[0]
        print("[JobLauncher] job_key", self.job_key)
        job = JobLauncher.JOB_CONFIG[self.job_key](
            self.yaml_file, job_type=job_type, dryrun=dryrun)
        return job.submit()


class Pipeline(object):
    """a job that loads yaml config."""

    def __init__(self, fn):
        """
        load a yaml config of a job and save generated configs as yaml for each task.
        return: a list of files to run as specified by `run_task`.
        """
        if fn.endswith(".py"):
            # a python command.
            self.backend = "python"
            self.run_yamls = [fn]
            return

        job_config = recursive_config(fn)
        if job_config.base_dir is None:  # single file job config.
            self.run_yamls = [fn]
            return

        self.project_dir = os.path.join("projects", job_config.project_dir)
        self.run_dir = os.path.join("runs", job_config.project_dir)

        # default so that __len__/__getitem__ work when `run_task` is absent.
        self.run_yamls = []
        if job_config.run_task is not None:
            run_yamls = []
            for stage in job_config.run_task:
                # each stage can have multiple tasks running in parallel.
                if OmegaConf.is_list(stage):
                    stage_yamls = []
                    for task_file in stage:
                        stage_yamls.append(
                            os.path.join(self.project_dir, task_file))
                    run_yamls.append(stage_yamls)
                else:
                    run_yamls.append(os.path.join(self.project_dir, stage))
            self.run_yamls = run_yamls
        configs_to_save = self._overwrite_task(job_config)
        self._save_configs(configs_to_save)

    def __getitem__(self, idx):
        yaml_files = self.run_yamls[idx]
        if isinstance(yaml_files, list):
            return [JobLauncher(yaml_file) for yaml_file in yaml_files]
        return [JobLauncher(yaml_files)]

    def __len__(self):
        return len(self.run_yamls)

    def _save_configs(self, configs_to_save: dict):
        # save
        os.makedirs(self.project_dir, exist_ok=True)
        for config_file in configs_to_save:
            config = configs_to_save[config_file]
            print("saving", config_file)
            OmegaConf.save(config=config, f=config_file)

    def _overwrite_task(self, job_config):
        configs_to_save = {}
        self.base_project_dir = os.path.join("projects", job_config.base_dir)
        self.base_run_dir = os.path.join("runs", job_config.base_dir)

        for config_sets in job_config.task_group:
            overwrite_config = job_config.task_group[config_sets]
            # we don't want this added to a final config.
            task_list = overwrite_config.pop("task_list", None)
            if task_list is None or len(task_list) == 0:
                # nothing to generate for this group; skip it instead of
                # iterating over None below.
                print(
                    "[warning]",
                    config_sets,
                    "has no task_list specified.")
                continue
            for config_file in task_list:
                config_file_path = os.path.join(
                    self.base_project_dir, config_file)
                config = recursive_config(config_file_path)
                # overwrite it.
                if overwrite_config:
                    config = OmegaConf.merge(config, overwrite_config)
                overwrite_dir(config, self.run_dir, basedir=self.base_run_dir)
                save_file_path = os.path.join(self.project_dir, config_file)
                configs_to_save[save_file_path] = config
        return configs_to_save


def main(args):
    job_type = args.jobtype if args.jobtype else None
    # parse multiple pipelines.
    pipelines = [Pipeline(fn) for fn in args.yamls.split(",")]

    for pipe_id, pipeline in enumerate(pipelines):
        if not hasattr(pipeline, "project_dir"):
            for job in pipeline[0]:
                job(job_type=job_type, dryrun=args.dryrun)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("yamls", type=str)
    parser.add_argument(
        "--dryrun",
        action="store_true",
        help="run config and prepare to submit without launch the job.",
    )
    parser.add_argument(
        "--jobtype", type=str, default="",
        help="force to run jobs as specified.")
    args = parser.parse_args()
    main(args)


================================================
FILE: examples/MMPT/mmpt/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
try:
    # fairseq user dir
    from .datasets import FairseqMMDataset
    from .losses import FairseqCriterion
    from .models import FairseqMMModel
    from .tasks import FairseqMMTask
except ImportError:
    pass


================================================
FILE: examples/MMPT/mmpt/datasets/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
from .mmdataset import *

try:
    from .fairseqmmdataset import *
except ImportError:
    pass


================================================
FILE: examples/MMPT/mmpt/datasets/fairseqmmdataset.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
"""
TODO (huxu): fairseq wrapper class for all dataset you defined: mostly MMDataset.
"""

from collections import OrderedDict

from torch.utils.data import Dataset
from torch.utils.data.dataloader import default_collate
from fairseq.data import FairseqDataset, data_utils


class FairseqMMDataset(FairseqDataset):
    """
    A wrapper class for MMDataset for fairseq.
    """

    def __init__(self, mmdataset):
        if not isinstance(mmdataset, Dataset):
            raise TypeError("mmdataset must be of type `torch.utils.data.Dataset`.")
        self.mmdataset = mmdataset
        # default epoch so that __getitem__ works before set_epoch is called.
        self.epoch = 0

    def set_epoch(self, epoch, **unused):
        super().set_epoch(epoch)
        self.epoch = epoch

    def __getitem__(self, idx):
        with data_utils.numpy_seed(43211, self.epoch, idx):
            return self.mmdataset[idx]

    def __len__(self):
        return len(self.mmdataset)

    def collater(self, samples):
        if hasattr(self.mmdataset, "collator"):
            return self.mmdataset.collator(samples)
        if len(samples) == 0:
            return {}
        if isinstance(samples[0], dict):
            batch = OrderedDict()
            for key in samples[0]:
                if samples[0][key] is not None:
                    batch[key] = default_collate([sample[key] for sample in samples])
            return batch
        else:
            return default_collate(samples)

    def size(self, index):
        """dummy implementation: we don't use --max-tokens"""
        return 1

    def num_tokens(self, index):
        """dummy implementation: we don't use --max-tokens"""
        return 1


================================================
FILE: examples/MMPT/mmpt/datasets/mmdataset.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import torch

from collections import OrderedDict

from torch.utils.data import Dataset
from torch.utils.data.dataloader import default_collate

from ..utils import set_seed


class MMDataset(Dataset):
    """
    A generic multi-modal dataset.
        Args:
            `meta_processor`: a meta processor,
                handling loading meta data and return video_id and text_id.
            `video_processor`: a video processor,
                handling e.g., decoding, loading .np files.
            `text_processor`: a text processor,
                handling e.g., tokenization.
            `align_processor`: an aligner that combines the video and
                text features into one training example.
    """

    def __init__(
        self,
        meta_processor,
        video_processor,
        text_processor,
        align_processor,
    ):
        self.split = meta_processor.split
        self.meta_processor = meta_processor
        self.video_processor = video_processor
        self.text_processor = text_processor
        self.align_processor = align_processor

    def __len__(self):
        return len(self.meta_processor)

    def __getitem__(self, idx):
        if self.split == "test":
            set_seed(idx)
        video_id, text_id = self.meta_processor[idx]
        video_feature = self.video_processor(video_id)
        text_feature = self.text_processor(text_id)
        output = self.align_processor(video_id, video_feature, text_feature)
        # TODO (huxu): the following is for debug purpose.
        output.update({"idx": idx})
        return output

    def collater(self, samples):
        """This collator is deprecated.
        set self.collator = MMDataset.collater.
        see collator in FairseqMMDataset.
        """

        if len(samples) == 0:
            return {}
        if isinstance(samples[0], dict):
            batch = OrderedDict()
            for key in samples[0]:
                if samples[0][key] is not None:
                    batch[key] = default_collate(
                        [sample[key] for sample in samples])
                # if torch.is_tensor(batch[key]):
                #    print(key, batch[key].size())
                # else:
                #    print(key, len(batch[key]))
            return batch
        else:
            return default_collate(samples)

    def print_example(self, output):
        print("[one example]", output["video_id"])
        if (
            hasattr(self.align_processor, "subsampling")
            and self.align_processor.subsampling is not None
            and self.align_processor.subsampling > 1
        ):
            for key in output:
                if torch.is_tensor(output[key]):
                    output[key] = output[key][0]

        # search tokenizer to translate ids back.
        tokenizer = None
        if hasattr(self.text_processor, "tokenizer"):
            tokenizer = self.text_processor.tokenizer
        elif hasattr(self.align_processor, "tokenizer"):
            tokenizer = self.align_processor.tokenizer
        if tokenizer is not None:
            caps = output["caps"].tolist()
            if isinstance(caps[0], list):
                caps = caps[0]
            print("caps", tokenizer.decode(caps))
            print("caps", tokenizer.convert_ids_to_tokens(caps))

        for key, value in output.items():
            if torch.is_tensor(value):
                if len(value.size()) >= 3:  # attention_mask.
                    print(key, value.size())
                    print(key, "first", value[0, :, :])
                    print(key, "last", value[-1, :, :])
                else:
                    print(key, value)
        print("[end of one example]")


================================================
FILE: examples/MMPT/mmpt/evaluators/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
from .metric import *
from .evaluator import *


# experimental.
try:
    from .expmetric import *
except ImportError:
    pass


================================================
FILE: examples/MMPT/mmpt/evaluators/evaluator.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os
import glob
import numpy as np

from . import metric as metric_path
from . import predictor as predictor_path


class Evaluator(object):
    """
    perform evaluation on a single (downstream) task.
    make this both offline and online.
    TODO(huxu) saving evaluation results.
    """

    def __init__(self, config, eval_dataloader=None):
        if config.metric is None:
            raise ValueError("config.metric is", config.metric)
        metric_cls = getattr(metric_path, config.metric)
        self.metric = metric_cls(config)
        if config.predictor is None:
            raise ValueError("config.predictor is", config.predictor)
        predictor_cls = getattr(predictor_path, config.predictor)
        self.predictor = predictor_cls(config)
        self.eval_dataloader = eval_dataloader

    def __call__(self):
        try:
            print(self.predictor.pred_dir)
            for pred_file in glob.glob(
                    self.predictor.pred_dir + "/*_merged.npy"):
                outputs = np.load(pred_file)
                results = self.metric.compute_metrics(outputs)
                self.metric.print_computed_metrics(results)

            outputs = np.load(os.path.join(
                    self.predictor.pred_dir, "merged.npy"))
            results = self.metric.compute_metrics(outputs)
            return {"results": results, "metric": self.metric}
        except FileNotFoundError:
            print("\n[missing]", self.predictor.pred_dir)
            return {}

    def evaluate(self, model, eval_dataloader=None, output_file="merged"):
        if eval_dataloader is None:
            eval_dataloader = self.eval_dataloader
        outputs = self.predictor.predict_loop(
            model, eval_dataloader, output_file)
        results = self.metric.compute_metrics(**outputs)
        return results


================================================
FILE: examples/MMPT/mmpt/evaluators/metric.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import json


class Metric(object):
    def __init__(self, config, metric_names):
        self.metric_names = metric_names

    def best_metric(self, metric):
        return metric[self.metric_names[0]]

    def save_metrics(self, fn, metrics):
        with open(fn, "w") as fw:
            # json.dump takes the object first and the file handle second.
            json.dump(metrics, fw)

    def print_computed_metrics(self, metrics):
        raise NotImplementedError


class RetrievalMetric(Metric):
    """
    this is modified from `howto100m/metrics.py`.
    History of changes:
    refactor as a class.
    add metric_key in __init__
    """

    def __init__(self, config, metric_names=["R1", "R5", "R10", "MR"]):
        super().__init__(config, metric_names)
        self.error = False  # TODO(huxu): add to config to print error.

    def compute_metrics(self, outputs, texts, **kwargs):
        x = outputs
        sx = np.sort(-x, axis=1)
        d = np.diag(-x)
        d = d[:, np.newaxis]
        ind = sx - d
        ind = np.where(ind == 0)
        ind = ind[1]
        metrics = {}
        metrics["R1"] = float(np.sum(ind == 0)) / len(ind)
        metrics["R5"] = float(np.sum(ind < 5)) / len(ind)
        metrics["R10"] = float(np.sum(ind < 10)) / len(ind)
        metrics["MR"] = np.median(ind) + 1

        max_idx = np.argmax(outputs, axis=1)
        if self.error:
            # collect (query text, text of the top-1 retrieved item) for the first 20 examples.
            error = []
            for ex_idx in range(20):
                error.append((texts[ex_idx], texts[max_idx[ex_idx]]))
            metrics["error"] = error
        return metrics

    def print_computed_metrics(self, metrics):
        r1 = metrics["R1"]
        r5 = metrics["R5"]
        r10 = metrics["R10"]
        mr = metrics["MR"]
        print(
            "R@1: {:.4f} - R@5: {:.4f} - R@10: {:.4f} - Median R: {}".format(
                r1, r5, r10, mr
            )
        )
        if "error" in metrics:
            print(metrics["error"])
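

# Illustrative sketch, not part of the original module: how the rank of the
# matching item can be read off a square similarity matrix, which is what
# `RetrievalMetric.compute_metrics` above relies on. The helper name
# `_retrieval_ranks_sketch` is hypothetical. It assumes the correct match for
# row i sits in column i and counts only strictly higher scores, so results
# may differ from the code above when scores are tied.
def _retrieval_ranks_sketch(sim):
    import numpy as np  # local import so the sketch runs on its own

    sim = np.asarray(sim, dtype=np.float64)
    diag = np.diag(sim)[:, np.newaxis]
    # 0-based rank of the match = number of candidates that score strictly higher.
    return (sim > diag).sum(axis=1)
# Under these assumptions, recall@1 is np.mean(ranks == 0) and the median rank
# is np.median(ranks) + 1, mirroring the "R1" and "MR" entries above.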


class DiDeMoMetric(Metric):
    """
    History of changes:
    python 2.x to python 3.x.
    merge utils.py into eval to save one file.
    reference: https://github.com/LisaAnne/LocalizingMoments/blob/master/utils/eval.py
    Code to evaluate your results on the DiDeMo dataset.
    """
    def __init__(self, config, metric_names=["rank1", "rank5", "miou"]):
        super().__init__(config, metric_names)

    def compute_metrics(self, outputs, targets, **kwargs):
        assert len(outputs) == len(targets)
        rank1, rank5, miou = self._eval_predictions(outputs, targets)
        metrics = {
            "rank1": rank1,
            "rank5": rank5,
            "miou": miou
        }
        return metrics

    def print_computed_metrics(self, metrics):
        rank1 = metrics["rank1"]
        rank5 = metrics["rank5"]
        miou = metrics["miou"]
        # print("Average rank@1: %f" % rank1)
        # print("Average rank@5: %f" % rank5)
        # print("Average iou: %f" % miou)

        print(
            "Average rank@1: {:.4f} Average rank@5: {:.4f} Average iou: {:.4f}".format(
                rank1, rank5, miou
            )
        )

    def _iou(self, pred, gt):
        intersection = max(0, min(pred[1], gt[1]) + 1 - max(pred[0], gt[0]))
        union = max(pred[1], gt[1]) + 1 - min(pred[0], gt[0])
        return float(intersection)/union

    def _rank(self, pred, gt):
        return pred.index(tuple(gt)) + 1

    def _eval_predictions(self, segments, data):
        '''
        Inputs:
        segments: For each item in the ground truth data, rank possible video segments given the description and video.
            In DiDeMo, there are 21 possible moments extracted for each video, so the list of video segments will be of length 21.
            The first video segment should be the video segment that best corresponds to the text query.
            There are 4180 sentences in the validation data, so when evaluating a model on the val dataset,
            segments should be a list of length 4180, and each item in segments should be a list of length 21.
        data: ground truth data
        '''
        average_ranks = []
        average_iou = []
        for s, d in zip(segments, data):
            pred = s[0]
            ious = [self._iou(pred, t) for t in d['times']]
            average_iou.append(np.mean(np.sort(ious)[-3:]))
            ranks = [self._rank(s, t) for t in d['times'] if tuple(t) in s]  # skip ground-truth (s, e) pairs that are missing from the prediction.
            average_ranks.append(np.mean(np.sort(ranks)[:3]))
        rank1 = np.sum(np.array(average_ranks) <= 1)/float(len(average_ranks))
        rank5 = np.sum(np.array(average_ranks) <= 5)/float(len(average_ranks))
        miou = np.mean(average_iou)

        # print("Average rank@1: %f" % rank1)
        # print("Average rank@5: %f" % rank5)
        # print("Average iou: %f" % miou)
        return rank1, rank5, miou
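

# Illustrative sketch, not part of the original module: a worked example of the
# inclusive-endpoint IoU used by `DiDeMoMetric._iou` above. Segments are
# (start, end) chunk indices with both ends included, so (0, 2) covers three
# chunks. The "union" is the span from the earliest start to the latest end,
# so for disjoint segments it also counts the gap between them. The function
# name `_inclusive_iou_sketch` is hypothetical.
def _inclusive_iou_sketch(pred, gt):
    intersection = max(0, min(pred[1], gt[1]) + 1 - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) + 1 - min(pred[0], gt[0])
    return intersection / union
# For example, (0, 2) against (1, 3) shares chunks 1 and 2 out of a span of
# four chunks.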


class NLGMetric(Metric):
    def __init__(
        self,
        config,
        metric_names=[
            "Bleu_1", "Bleu_2", "Bleu_3", "Bleu_4",
            "METEOR", "ROUGE_L", "CIDEr"
        ]
    ):
        super().__init__(config, metric_names)
        # please install NLGEval from `https://github.com/Maluuba/nlg-eval`
        from nlgeval import NLGEval
        self.nlg = NLGEval()

    def compute_metrics(self, outputs, targets, **kwargs):
        return self.nlg.compute_metrics(
            hyp_list=outputs, ref_list=targets)

    def print_computed_metrics(self, metrics):
        Bleu_1 = metrics["Bleu_1"]
        Bleu_2 = metrics["Bleu_2"]
        Bleu_3 = metrics["Bleu_3"]
        Bleu_4 = metrics["Bleu_4"]
        METEOR = metrics["METEOR"]
        ROUGE_L = metrics["ROUGE_L"]
        CIDEr = metrics["CIDEr"]

        print(
            "Bleu_1: {:.4f} - Bleu_2: {:.4f} - Bleu_3: {:.4f} - Bleu_4: {:.4f} - METEOR: {:.4f} - ROUGE_L: {:.4f} - CIDEr: {:.4f}".format(
                Bleu_1, Bleu_2, Bleu_3, Bleu_4, METEOR, ROUGE_L, CIDEr
            )
        )


class QAMetric(Metric):
    def __init__(
        self,
        config,
        metric_names=["acc"]
    ):
        super().__init__(config, metric_names)

    def compute_metrics(self, outputs, targets, **kwargs):
        from sklearn.metrics import accuracy_score
        return {"acc": accuracy_score(targets, outputs)}

    def print_computed_metrics(self, metrics):
        print("acc: {:.4f}".format(metrics["acc"]))


class COINActionSegmentationMetric(Metric):
    """
    COIN dataset listed 3 repos for Action Segmentation.
    Action Sets, NeuralNetwork-Viterbi, TCFPN-ISBA.
    The first and second are the same.
    https://github.com/alexanderrichard/action-sets/blob/master/eval.py

    Future reference for the third:
    `https://github.com/Zephyr-D/TCFPN-ISBA/blob/master/utils/metrics.py`
    """
    def __init__(self, config, metric_name=["frame_acc"]):
        super().__init__(config, metric_name)

    def compute_metrics(self, outputs, targets):
        n_errors = sum(outputs != targets)
        n_frames = len(targets)
        return {"frame_acc": 1.0 - float(n_errors) / n_frames}

    def print_computed_metrics(self, metrics):
        fa = metrics["frame_acc"]
        print("frame accuracy:", fa)


class CrossTaskMetric(Metric):
    def __init__(self, config, metric_names=["recall"]):
        super().__init__(config, metric_names)

    def compute_metrics(self, outputs, targets, **kwargs):
        """refactored from line 166:
        https://github.com/DmZhukov/CrossTask/blob/master/train.py"""

        recalls = self._get_recalls(Y_true=targets, Y_pred=outputs)
        results = {}
        for task, rec in recalls.items():
            results[str(task)] = rec

        avg_recall = np.mean(list(recalls.values()))
        results["recall"] = avg_recall
        return results

    def print_computed_metrics(self, metrics):
        print('Recall: {0:0.3f}'.format(metrics["recall"]))
        for task in metrics:
            if task != "recall":
                print('Task {0}. Recall = {1:0.3f}'.format(
                    task, metrics[task]))

    def _get_recalls(self, Y_true, Y_pred):
        """refactored from
        https://github.com/DmZhukov/CrossTask/blob/master/train.py"""

        step_match = {task: 0 for task in Y_true.keys()}
        step_total = {task: 0 for task in Y_true.keys()}
        for task, ys_true in Y_true.items():
            ys_pred = Y_pred[task]
            for vid in set(ys_pred.keys()).intersection(set(ys_true.keys())):
                y_true = ys_true[vid]
                y_pred = ys_pred[vid]
                step_total[task] += (y_true.sum(axis=0) > 0).sum()
                step_match[task] += (y_true*y_pred).sum()
        recalls = {
            task: step_match[task] / n for task, n in step_total.items()}
        return recalls


class ActionRecognitionMetric(Metric):
    def __init__(
        self,
        config,
        metric_names=["acc", "acc_splits", "r1_splits", "r5_splits", "r10_splits"]
    ):
        super().__init__(config, metric_names)

    def compute_metrics(self, outputs, targets, splits, **kwargs):
        all_video_embd = outputs
        labels = targets
        split1, split2, split3 = splits
        accs = []
        r1s = []
        r5s = []
        r10s = []
        for split in range(3):
            if split == 0:
                s = split1
            elif split == 1:
                s = split2
            else:
                s = split3

            X_pred = all_video_embd[np.where(s == 2)[0]]
            label_test = labels[np.where(s == 2)[0]]
            logits = X_pred
            X_pred = np.argmax(X_pred, axis=1)
            acc = np.sum(X_pred == label_test) / float(len(X_pred))
            accs.append(acc)
            # compute recall.
            sorted_pred = (-logits).argsort(axis=-1)
            label_test_sp = label_test.reshape(-1, 1)

            r1 = np.mean((sorted_pred[:, :1] == label_test_sp).sum(axis=1), axis=0)
            r5 = np.mean((sorted_pred[:, :5] == label_test_sp).sum(axis=1), axis=0)
            r10 = np.mean((sorted_pred[:, :10] == label_test_sp).sum(axis=1), axis=0)
            r1s.append(r1)
            r5s.append(r5)
            r10s.append(r10)

        return {"acc": accs[0], "acc_splits": accs, "r1_splits": r1s, "r5_splits": r5s, "r10_splits": r10s}

    def print_computed_metrics(self, metrics):
        for split, acc in enumerate(metrics["acc_splits"]):
            print("Top 1 accuracy on split {}: {}; r1 {}; r5 {}; r10 {}".format(
                split + 1, acc,
                metrics["r1_splits"][split],
                metrics["r5_splits"][split],
                metrics["r10_splits"][split],
                )
            )


================================================
FILE: examples/MMPT/mmpt/evaluators/predictor.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os
import random
import json
import numpy as np
import torch
import pickle
import math

from tqdm import tqdm


class Predictor(object):
    """this base class is used to save predictions to disk
        (and is called by an evaluator later).
        Predictor has minimal support for single-GPU prediction.
    """
    def __init__(self, config):
        self.pred_dir = None  # on-the-fly eval does not save the results.
        if hasattr(config, "eval") and config.eval is not None:
            self.pred_dir = config.eval.save_path
            os.makedirs(self.pred_dir, exist_ok=True)

    def __call__(self, outputs):
        """extract the prediction and save it."""
        raise NotImplementedError

    def predict_loop(self, model, eval_dataloader, output_file=None):
        """on-the-fly prediction on a single gpu."""
        self.full_scores = []
        model.eval()
        model = model.to(0)
        with torch.no_grad():
            for data in eval_dataloader:
                data = self.to_ctx(data)
                outputs = model(**data)
                outputs.update(data)
                self(outputs)
        return self.finalize(output_file)

    def finalize(self, output_file):
        pass

    def to_ctx(self, data, ctx=0, dtype=None):
        if isinstance(data, dict):
            for key in data:
                if torch.is_tensor(data[key]):
                    if dtype is not None and data[key].dtype == torch.float32:
                        data[key] = data[key].to(dtype)
                    data[key] = data[key].to(ctx)
            return data
        else:
            raise ValueError("non-dict type of batch is not supported yet.")


class NLGPredictor(Predictor):
    """Predicting Text from MMFusion models."""
    # TODO: make a context.
    def __init__(self, config):
        super().__init__(config)
        from transformers import AutoTokenizer

        self.tokenizer = AutoTokenizer.from_pretrained(
            config.dataset.bert_name,
            bos_token="[CLS]", eos_token="[SEP]")
        self.bos_token_id = self.tokenizer.bos_token_id
        self.eos_token_id = self.tokenizer.eos_token_id

    def predict_loop(self, model, eval_dataloader, output_file=None):
        """TODO: refactor base classes."""
        ctx = 0
        outputs = {"outputs": [], "targets": [[]]}
        model.eval()
        model = model.to(ctx)
        with torch.no_grad():
            for data in tqdm(eval_dataloader):
                data = self.to_ctx(data, ctx)
                self(data, model, outputs)
        return self.finalize(outputs, output_file)

    def __call__(self, data, model, outputs):
        data.update({
            "bos_token_id": self.bos_token_id,
            "eos_token_id": self.eos_token_id
        })

        output = model.generate(**data)
        assert len(output) == len(data["ref"])
        for idx, _output in enumerate(output):
            generated_text = self.tokenizer.decode(
                _output, skip_special_tokens=True)
            if generated_text == "":
                generated_text = "none"
            outputs["outputs"].append(generated_text)
            outputs["targets"][0].append(data["ref"][idx])
            if random.random() < 0.001:
                print("_output", _output)
                print("generated_text", generated_text)
                print("ref", data["ref"][idx])

    def finalize(self, outputs, output_file=None):
        if output_file is not None:
            with open(os.path.join(
                    self.pred_dir, output_file + ".json"), "w") as fw:
                json.dump(outputs, fw, indent=4)
        return outputs


class RetrievalPredictor(Predictor):
    """collects `pooled_video` and `pooled_text` and scores every text against every video."""
    def __init__(self, config):
        super().__init__(config)
        from transformers import AutoTokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(
            config.dataset.bert_name)

    def predict_loop(
        self,
        model,
        eval_dataloader,
        output_file="retrieval.npy"
    ):
        """on-the-fly prediction on a single gpu."""
        full_scores = []
        texts = []
        model.eval()
        model = model.cuda()
        with torch.no_grad():
            for data in eval_dataloader:
                # convert to dict.
                if not isinstance(data, dict):
                    data = {
                        "caps": data[0],
                        "cmasks": data[1],
                        "vfeats": data[2],
                        "vmasks": data[3],
                        "video_id": data[4]
                    }
                data = self.to_ctx(data)
                outputs = model(**data)
                outputs.update(data)
                self(outputs, full_scores)
                for _cap in data["caps"]:
                    texts.append(
                        self.tokenizer.decode(_cap, skip_special_tokens=True)
                    )

        return self.finalize(full_scores, texts, output_file)

    def __call__(self, sample, full_scores):
        scores = self._get_pooled_outputs(sample)
        self._append_scores(scores, full_scores)

    def finalize(self, full_scores, texts, output_file=None):
        outputs = self._aggregate_scores(full_scores)
        if output_file is not None:
            np.save(os.path.join(self.pred_dir, output_file + ".npy"), outputs)
        return {"outputs": outputs, "texts": texts}

    def _get_pooled_outputs(self, outputs):
        if "pooled_video" in outputs:
            return outputs["pooled_video"], outputs["pooled_text"]
        else:
            raise ValueError("unknown format of outputs.")

    def _append_scores(self, scores, full_scores):
        assert len(scores) == 2
        if len(full_scores) == 0:
            full_scores.append([])
            full_scores.append([])
        full_scores[0].append(scores[0].cpu().detach().numpy())
        full_scores[1].append(scores[1].cpu().detach().numpy())

    def _aggregate_scores(self, scores):
        assert len(scores) == 2
        video_hidden = np.concatenate(scores[0], axis=0)
        text_hidden = np.concatenate(scores[1], axis=0)
        # clear up.
        self.full_scores = []
        return np.matmul(text_hidden, video_hidden.T)


class QAPredictor(Predictor):
    """scores 5 candidate answers against each video and keeps the argmax."""
    def __init__(self, config):
        super().__init__(config)
        # predictor maintains scores and aggregates them.

    def predict_loop(self, model, eval_dataloader, output_file="qa.npy"):
        """on-the-fly prediction on a single gpu."""
        self.full_scores = []
        model.eval()
        model = model.cuda()
        with torch.no_grad():
            for data in eval_dataloader:
                # reshape ans and dup video 5 times.
                v_len = data["vfeats"].size(1)
                hidden_size = data["vfeats"].size(2)
                data["vfeats"] = data["vfeats"].unsqueeze(1).repeat(1, 5, 1, 1).view(-1, v_len, hidden_size)
                data["vmasks"] = data["vmasks"].unsqueeze(1).repeat(1, 5, 1).view(-1, v_len)

                t_len = data["caps"].size(-1)
                data["caps"] = data["caps"].view(-1, t_len)
                data["cmasks"] = data["cmasks"].view(-1, t_len)

                data = self.to_ctx(data)
                outputs = model(**data)
                outputs.update(data)
                self(outputs)
        return self.finalize(output_file)

    def __call__(self, sample):
        hidden_size = sample["pooled_video"].size(-1)
        pooled_video = sample["pooled_video"].view(-1, 5, hidden_size)
        pooled_text = sample["pooled_text"].view(-1, 5, hidden_size)
        scores = torch.bmm(pooled_video, pooled_text.transpose(2, 1))
        scores = scores.argmax(-1)
        self._append_scores(scores[:, 0], sample["answers"], self.full_scores)

    def finalize(self, output_file=None):
        outputs, targets = self._aggregate_scores(self.full_scores)
        if output_file is not None:
            np.save(os.path.join(self.pred_dir, output_file + ".npy"), outputs)
        return {"outputs": outputs, "targets": targets}

    def _append_scores(self, scores, answers, full_scores):
        if len(full_scores) == 0:
            full_scores.append([])
            full_scores.append([])
        full_scores[0].append(scores.cpu().detach().numpy())
        full_scores[1].append(answers.cpu().detach().numpy())

    def _aggregate_scores(self, scores):
        assert len(scores) == 2
        outputs = np.concatenate(scores[0], axis=0)
        targets = np.concatenate(scores[1], axis=0)
        # clear up.
        self.full_scores = []
        return outputs, targets


class CrossTaskPredictor(Predictor):
    """
    CrossTaskPredictor needs to compute the average of logits
    for overlapped sliding-window.
    """
    def __init__(self, config):
        super().__init__(config)
        self.lsm = torch.nn.LogSoftmax(dim=1)
        self.max_video_len = config.dataset.max_video_len
        self.sliding_window = config.dataset.sliding_window
        self.sliding_window_size = config.dataset.sliding_window_size
        self.annotation_path = config.dataset.annotation_path

    def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
        """refactored from line 144:
        https://github.com/DmZhukov/CrossTask/blob/master/train.py
        """
        ctx = 0
        model.eval()
        model = model.to(ctx)
        # this is not a loss but just compute neg_log_prob.
        Y_pred = {}
        Y_true = {}
        with torch.no_grad():
            for batch in eval_dataloader:
                self(batch, model, Y_pred, Y_true)
        return self.finalize(Y_pred, Y_true, output_file)

    def __call__(self, sample, model, Y_pred, Y_true):
        # please install dp from `https://github.com/DmZhukov/CrossTask`
        from dp import dp
        vid, task = sample['video_id'][0], sample['task'][0]
        sample = self.to_ctx(sample)
        # compute the average logits over sliding windows.
        output = model(**sample)
        batch_logits = output["logits"].cpu()

        video_len = sample["video_len"][0]

        # the following version is slow.
        logits = torch.zeros((video_len, batch_logits.size(1)))
        logits_counts = torch.zeros((video_len, 1), dtype=torch.long)
        # use the same loop as aligner to recover.
        batch_logit_idx = 0
        for window_start in range(0, video_len, self.sliding_window):
            video_end = min(video_len - window_start, self.sliding_window_size)
            logits[window_start: window_start + video_end] += batch_logits[
                batch_logit_idx: batch_logit_idx + video_end]
            batch_logit_idx += video_end
            logits_counts[window_start: window_start + video_end] += torch.ones((video_end, 1), dtype=torch.long)

            if (video_len - window_start) <= self.sliding_window_size:
                break

        logits /= logits_counts
        assert logits.size() == (video_len, batch_logits.size(1)), "{}, {}".format(logits.size(), video_len)

        O = self.lsm(logits)
        y = np.zeros(O.size(), dtype=np.float32)
        dp(y, -O.detach().cpu().numpy())
        if task not in Y_pred:
            Y_pred[task] = {}
        Y_pred[task][vid] = y
        annot_path = os.path.join(
            self.annotation_path, task+'_'+vid+'.csv')
        if os.path.exists(annot_path):
            if task not in Y_true:
                Y_true[task] = {}
            Y_true[task][vid] = self._read_assignment(
                *y.shape, annot_path)

    def finalize(self, Y_pred, Y_true, output_file=None):
        if output_file is not None:
            with open(
                    os.path.join(self.pred_dir, output_file + ".pkl"),
                    "wb") as fw:
                pickle.dump(
                    {"Y_pred": Y_pred, "Y_true": Y_true}, fw,
                    protocol=pickle.HIGHEST_PROTOCOL)
        return {"outputs": Y_pred, "targets": Y_true}

    def _read_assignment(self, T, K, path):
        """
        refactored from https://github.com/DmZhukov/CrossTask/blob/master/data.py
        How to interpret constraints on the loss that is going to be minimized:
        lambd is a big number;
        self.lambd * C is a big number for all valid positions (csv stores invalids)

        def forward(self, O, Y, C):
            return (Y*(self.lambd * C - self.lsm(O))).mean(dim=0).sum()

        This will load the csv file and fill-in the step col from start to end rows.
        """

        Y = np.zeros([T, K], dtype=np.uint8)
        with open(path, 'r') as f:
            for line in f:
                step, start, end = line.strip().split(',')
                start = int(math.floor(float(start)))
                end = int(math.ceil(float(end)))
                step = int(step) - 1
                Y[start:end, step] = 1
        return Y
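

# Illustrative sketch, not part of the original module: the same
# "step,start,end" parsing as `_read_assignment` above, but fed from an
# in-memory list of lines so it can be tried without an annotation file.
# `_assignment_from_lines_sketch` is a hypothetical name. It assumes steps are
# 1-based in the csv and that start/end are given in frame units.
def _assignment_from_lines_sketch(lines, T, K):
    import math
    import numpy as np

    Y = np.zeros([T, K], dtype=np.uint8)
    for line in lines:
        step, start, end = line.strip().split(',')
        start = int(math.floor(float(start)))  # round the start down
        end = int(math.ceil(float(end)))       # round the end up
        Y[start:end, int(step) - 1] = 1        # mark the step column active
    return Y
# For example, the line "1,0.5,2.2" marks rows 0..2 of column 0.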


class COINPredictor(Predictor):
    """
    COINPredictor is similar to CrossTask on sliding windows.
    """
    def __init__(self, config):
        super().__init__(config)
        self.max_video_len = config.dataset.max_video_len
        self.sliding_window = config.dataset.sliding_window
        self.sliding_window_size = config.dataset.sliding_window_size

    def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
        """refactored from line 144:
        https://github.com/DmZhukov/CrossTask/blob/master/train.py
        """
        ctx = 0
        model.eval()
        model = model.to(ctx)
        # this is not a loss but just compute neg_log_prob.
        Y_pred = []
        Y_true = []
        with torch.no_grad():
            for batch in eval_dataloader:
                self(batch, model, Y_pred, Y_true)
        return self.finalize(Y_pred, Y_true, output_file)

    def __call__(self, sample, model, Y_pred, Y_true):
        sample = self.to_ctx(sample)
        # compute the average logits over sliding windows.
        output = model(**sample)
        logits = self._merge_windows(sample, output)
        Y_pred.append(logits.argmax(dim=1))
        Y_true.append(sample["video_targets"].squeeze(0).cpu())

    def _merge_windows(self, sample, output):
        targets = sample["targets"].reshape(-1).cpu()
        valid_mask = targets != -100
        targets = targets[valid_mask]
        batch_logits = output["logits"].cpu()
        batch_logits = batch_logits.reshape(-1, batch_logits.size(-1))
        batch_logits = batch_logits[valid_mask]

        video_len = sample["video_len"][0]

        # the following version is slow.
        logits = torch.zeros((video_len, batch_logits.size(1)))
        logits_counts = torch.zeros((video_len, 1), dtype=torch.long)
        # use the same loop as aligner to recover.
        batch_logit_idx = 0
        for window_start in range(0, video_len, self.sliding_window):
            video_end = min(video_len - window_start, self.sliding_window_size)
            logits[window_start: window_start + video_end] += batch_logits[
                batch_logit_idx: batch_logit_idx + video_end]
            batch_logit_idx += video_end
            logits_counts[window_start: window_start + video_end] += torch.ones((video_end, 1), dtype=torch.long)
            if (video_len - window_start) <= self.sliding_window_size:
                break
        logits /= logits_counts
        assert logits.size() == (video_len, batch_logits.size(1)), "{}, {}".format(logits.size(), video_len)
        return logits

    def finalize(self, Y_pred, Y_true, output_file=None):
        Y_pred = torch.cat(Y_pred, dim=0).numpy()
        Y_true = torch.cat(Y_true, dim=0).numpy()
        assert len(Y_pred) == len(Y_true)

        error_mask = Y_pred != Y_true
        print("sample error", Y_pred[error_mask][:10], Y_true[error_mask][:10])
        print("sample error", Y_pred[error_mask][10:20], Y_true[error_mask][10:20])

        if output_file is not None:
            with open(
                    os.path.join(self.pred_dir, output_file + ".pkl"),
                    "wb") as fw:
                pickle.dump(
                    {"Y_pred": Y_pred, "Y_true": Y_true}, fw,
                    protocol=pickle.HIGHEST_PROTOCOL)
        return {"outputs": Y_pred, "targets": Y_true}
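

# Illustrative sketch, not part of the original module: the overlap averaging
# done by `_merge_windows` above, rewritten with NumPy on toy inputs.
# `_average_windows_sketch`, `stride` and `window_size` are hypothetical names
# standing in for `sliding_window` and `sliding_window_size`. It assumes the
# per-window logits were concatenated in window order and that
# stride <= window_size, so every frame is covered at least once.
def _average_windows_sketch(window_logits, video_len, stride, window_size):
    import numpy as np

    window_logits = np.asarray(window_logits, dtype=np.float64)
    summed = np.zeros((video_len, window_logits.shape[1]))
    counts = np.zeros((video_len, 1))
    offset = 0
    for start in range(0, video_len, stride):
        length = min(video_len - start, window_size)
        summed[start:start + length] += window_logits[offset:offset + length]
        counts[start:start + length] += 1
        offset += length
        if video_len - start <= window_size:
            break  # this window already reached the end of the video
    return summed / counts
# Frames covered by two windows receive the mean of both windows' logits;
# frames covered once keep their single value.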


class COINZSPredictor(COINPredictor):
    """
    COINZSPredictor for COIN zero-shot prediction.
    """

    def __init__(self, config):
        super().__init__(config)
        self.dataset_config = config.dataset

    def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
        """refactored from line 144:
        https://github.com/DmZhukov/CrossTask/blob/master/train.py
        """
        ctx = 0
        model.eval()
        model = model.to(ctx)

        with torch.no_grad():
            outputs = eval_dataloader.dataset.meta_processor.meta_text_labels(
                self.dataset_config)
            outputs = self.to_ctx(outputs, ctx)
            label_hidden_states = model.forward_text(**outputs).cpu()
            label_sim = label_hidden_states @ label_hidden_states.t()
            num_labels = label_sim.size(0)
            eye_mask = ~torch.eye(num_labels, dtype=torch.bool)
            label_sim = label_sim.masked_select(eye_mask).view(num_labels, num_labels - 1)
            lbd = label_sim.max()

        # this is not a loss but just compute neg_log_prob.
        Y_pred = []
        Y_true = []
        with torch.no_grad():
            for batch in eval_dataloader:
                self(batch, label_hidden_states, model, lbd, Y_pred, Y_true)
        return self.finalize(Y_pred, Y_true, output_file)

    def reshape_subsample(self, sample):
        for key in sample:
            if torch.is_tensor(sample[key]):
                sample[key] = self.flat_subsample(sample[key])
        return sample

    def flat_subsample(self, tensor):
        if len(tensor.size()) > 1 and tensor.size(0) == 1:
            tensor = tensor.squeeze(0)
        return tensor

    def __call__(self, sample, label_hidden_states, model, lbd, Y_pred, Y_true):
        sample = self.reshape_subsample(sample)
        sample = self.to_ctx(sample)
        # compute the average logits over sliding windows.
        sample["output_hidden_states"] = True
        video_outputs = model.forward_video(**sample).cpu()
        output = {"logits": video_outputs[:, 1:sample["vmasks"].size(1)+1] @ label_hidden_states.t()}
        logits = self._merge_windows(sample, output)
        # logic of zero-shot for sequence labeling.
        logits_argmax = logits.argmax(dim=1) + 1  # 0 is "O" label.
        logits_max = logits.max(dim=1)[0]

        pred = torch.zeros_like(logits_argmax)
        label_select = logits_max > lbd  # 73 or 74
        pred[label_select] = logits_argmax[label_select]

        Y_pred.append(pred)
        Y_true.append(sample["video_targets"].squeeze(0).cpu())

    def finalize(self, Y_pred, Y_true, output_file=None):
        Y_pred = torch.cat(Y_pred, dim=0).numpy()
        Y_true = torch.cat(Y_true, dim=0).numpy()
        assert len(Y_pred) == len(Y_true)

        error_mask = Y_pred != Y_true
        print("sample error", Y_pred[error_mask][:10], Y_true[error_mask][:10])
        print("sample error", Y_pred[error_mask][10:20], Y_true[error_mask][10:20])

        if output_file is not None:
            with open(
                    os.path.join(self.pred_dir, output_file + ".pkl"),
                    "wb") as fw:
                pickle.dump(
                    {"Y_pred": Y_pred, "Y_true": Y_true}, fw,
                    protocol=pickle.HIGHEST_PROTOCOL)
        return {"outputs": Y_pred, "targets": Y_true}


class DiDeMoPredictor(Predictor):
    """reference: https://github.com/LisaAnne/LocalizingMoments/blob/master/utils/eval.py
    https://github.com/LisaAnne/LocalizingMoments/blob/master/utils/data_processing.py
    """
    def __init__(self, config):
        super().__init__(config)
        # load targets.
        with open(config.dataset.test_path) as data_file:
            self.test_data = json.load(data_file)

    def predict_loop(self, model, eval_dataloader, output_file="didemo.npy"):
        """on-the-fly prediction on a single gpu.
        TODO: two solutions here.
        """
        import itertools
        # 21 candidate segments: 6 single chunks plus 15 (start, end) pairs.
        self.possible_segments = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
        for i in itertools.combinations(range(6), 2):
            self.possible_segments.append(i)
        # pick segments from a video.
        self.full_scores = []
        model.eval()
        model = model.cuda()
        with torch.no_grad():
            for data in eval_dataloader:
                # TODO special forwarding logic here.
                data = self.to_ctx(data)
                data["output_hidden_states"] = True
                hidden_video = model.forward_video(**data)
                data["output_hidden_states"] = False
                pooled_text = model.forward_text(**data)
                outputs = {
                    "hidden_video": hidden_video,
                    "pooled_text": pooled_text
                }
                outputs.update(data)
                self(outputs)
        return self.finalize(output_file)

    def __call__(self, sample):
        # TODO: make an index select from self.possible_segments.
        hidden_video = sample["hidden_video"]
        pooled_text = sample["pooled_text"]
        vmasks = sample["vmasks"]
        # probably maintain valid results here.

        hidden_video = hidden_video[:, 1:-1, :]
        # probably maintain valid results here.
        pooled_video = []
        for s, e in self.possible_segments:
            pooled_video.append(
                torch.mean(
                    hidden_video[:, int(s*5):int((e+1)*5), :],
                    dim=1, keepdim=True)
            )
        pooled_video = torch.cat(pooled_video, dim=1)
        scores = torch.bmm(
            pooled_video, pooled_text.unsqueeze(-1)).squeeze(-1).cpu()

        ranks = scores.argsort(dim=-1, descending=True)

        for batch_idx, rank in enumerate(ranks):
            rank_of_moment = []
            for m_idx, moment in enumerate(rank):
                s, e = self.possible_segments[moment.item()]
                if torch.any(
                    vmasks[batch_idx, int(s*5):int((e+1)*5)]
                ):
                    rank_of_moment.append((s, e))
            self.full_scores.append(rank_of_moment)

    def finalize(self, output_file=None):
        outputs = self._aggregate_scores(self.full_scores)
        if output_file is not None:
            np.save(os.path.join(self.pred_dir, output_file + ".npy"), outputs)
        return {"outputs": outputs, "targets": self.test_data}

    def _aggregate_scores(self, scores):
        self.full_scores = []
        return scores
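
The candidate-moment enumeration in `predict_loop` and the slicing in `__call__` above can be sketched on their own. This is a minimal illustration, assuming (as the slice `int(s*5):int((e+1)*5)` suggests) that each of the 6 chunks spans 5 feature steps. The helper names `build_segments` and `segment_to_slice` are hypothetical and not part of MMPT.

```python
import itertools


def build_segments(num_chunks=6):
    """Enumerate single-chunk moments plus every (start, end) pair with start < end."""
    segments = [(i, i) for i in range(num_chunks)]
    segments.extend(itertools.combinations(range(num_chunks), 2))
    return segments


def segment_to_slice(segment, steps_per_chunk=5):
    """Map an inclusive (start, end) chunk pair to a half-open feature-step range."""
    start, end = segment
    return start * steps_per_chunk, (end + 1) * steps_per_chunk


segments = build_segments()
print(len(segments))              # 6 single chunks + 15 pairs
print(segment_to_slice((1, 3)))   # covers chunks 1 through 3 inclusive
```

Each returned range can be used directly as `hidden_video[:, lo:hi, :]` before mean pooling.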


================================================
FILE: examples/MMPT/mmpt/losses/__init__.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
from .loss import *
from .nce import *

try:
    from .fairseqmmloss import *
except ImportError:
    pass

try:
    from .expnce import *
except ImportError:
    pass


================================================
FILE: examples/MMPT/mmpt/losses/fairseqmmloss.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""
TODO (huxu): a general fairseq criterion for all your pre-defined losses.
"""

from fairseq.criterions import FairseqCriterion, register_criterion
from fairseq.logging import metrics


@register_criterion("mmloss")
class MMCriterion(FairseqCriterion):
    def __init__(self, task):
        super().__init__(task)
        # TODO (huxu): wrap forward call of loss_fn and eval_fn into task.
        self.mmtask = task.mmtask

    def forward(self, model, sample):
        """Compute the loss for the given sample.
        Returns a tuple with three elements:
        1) the loss
        2) the sample size, which is used as the denominator for the gradient
        3) logging outputs to display while training
        """
        outputs = self.mmtask(model, sample)

        loss, loss_scalar, max_len, batch_size, sample_size = (
            outputs["loss"],
            outputs["loss_scalar"],
            outputs["max_len"],
            outputs["batch_size"],
            outputs["sample_size"],
        )

        logging_output = {
            "loss": loss_scalar,
            "ntokens": max_len * batch_size,  # dummy report.
            "nsentences": batch_size,  # dummy report.
            "sample_size": sample_size,
        }

        return loss, 1, logging_output

    @staticmethod
    def reduce_metrics(logging_outputs) -> None:
        """Aggregate logging outputs from data parallel training.

        Since we use NCE, the actual batch size is 1 per GPU,
        so we take the mean over workers.
        """
        loss_sum = sum(log.get("loss", 0.0) for log in logging_outputs)
        sample_size = sum(log.get("sample_size", 0) for log in logging_outputs)
        metrics.log_scalar("loss", loss_sum / sample_size, round=3)

    @staticmethod
    def logging_outputs_can_be_summed() -> bool:
        """
        Whether the logging outputs returned by `forward` can be summed
        across workers prior to calling `reduce_metrics`. Setting this
        to True will improve distributed training speed.
        """
        return True
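
The aggregation performed by `reduce_metrics` above can be tried without fairseq. The following is a rough sketch only: `aggregate_loss` is a hypothetical helper, and the zero guard is an addition for this example that the criterion itself does not have.

```python
def aggregate_loss(logging_outputs):
    """Sum per-worker losses and sample sizes, then return the mean loss."""
    loss_sum = sum(log.get("loss", 0.0) for log in logging_outputs)
    sample_size = sum(log.get("sample_size", 0) for log in logging_outputs)
    # Guard added for this sketch only; the criterion above divides directly.
    if sample_size == 0:
        return 0.0
    return loss_sum / sample_size


# Two workers, each reporting one sample, as in the "batch size 1 per GPU" case.
logs = [
    {"loss": 2.0, "sample_size": 1},
    {"loss": 4.0, "sample_size": 1},
]
print(aggregate_loss(logs))
```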


================================================
FILE: examples/MMPT/mmpt/losses/loss.py
================================================
# Copyright (c) Facebook, Inc. All Rights Reserved

import torch

from torch import nn


class Loss(object):
    def __call__(self, *args, **kwargs):
        raise NotImplementedError


# Dummy Loss for testing.
class DummyLoss(Loss):
    def __init__(self):
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits, targets)


class DummyK400Loss(Loss):
    """dummy k400 loss for MViT."""
    def __init__(self):
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(
            logits, torch.randint(0, 400, (logits.size(0),), device=logits.device))


class CrossEntropy(Loss):
    def __init__(self):
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))


class ArgmaxCrossEntropy(Loss):
    def __init__(self):
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits, targets.argmax(dim=1))


class BCE(Loss):
    def __init__(self):
        self.loss = nn.BCEWithLogitsLoss()

    def __call__(self, logits, targets, **kwargs):
        targets = targets.squeeze(0)
        return self.loss(logits, targets)


class NLGLoss(Loss):
    def __init__(self):
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, logits, text_label, **kwargs):
        targets = text_label[text_label != -100]
        return self.loss(logits, targets)


class MSE(Loss):
    def __init__(self):
        self.loss = nn.MSELoss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits, targets)


class L1(Loss):
    def __init__(self):
        self.loss = nn.L1Loss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits, targets)


class SmoothL1(Loss):
    def __init__(self):
        self.loss = nn.SmoothL1Loss()

    def __call__(self, logits, targets, **kwargs):
        return self.loss(logits, targets)


================================================
FILE: examples/MMPT/mmpt/losses/nce.py
================================================
# Copyright (c) Facebook, Inc. and its affiliates.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""
softmax-based NCE loss, used by this project.
"""

import torch

from torch import nn

from .loss import Loss


class NCE(Loss):
    def __init__(self):
        # TODO (huxu): define temperature.
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, align_scores, **kargs):
        # Note: we reuse the same shape as the cls head in BERT (batch_size, 2),
        # but NCE only needs one logit,
        # so we drop the second (negative) logit column.
        align_scores = align_scores[:, :1]
        # duplicate negative examples
        batch_size = align_scores.size(0) // 2
        pos_scores = align_scores[:batch_size]
        neg_scores = align_scores[batch_size:].view(1, batch_size).repeat(
            batch_size, 1)
        scores = torch.cat([pos_scores, neg_scores], dim=1)
        return self.loss(
            scores,
            torch.zeros(
                (batch_size,),
                dtype=torch.long,
                device=align_scores.device),
        )
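
To make the score layout used by `NCE.__call__` above concrete, here is a dependency-free sketch of the same computation. It assumes the positives and negatives have already been split into two lists. `softmax_nce` is a hypothetical name, and plain Python lists stand in for tensors.

```python
import math


def softmax_nce(pos_scores, neg_scores):
    """Mean cross-entropy where each positive competes against all negatives.

    Row i has logits [pos_scores[i], neg_scores[0], ..., neg_scores[-1]]
    and the target is always index 0 (the positive).
    """
    losses = []
    for pos in pos_scores:
        logits = [pos] + list(neg_scores)
        # Numerically stable log-sum-exp over the row.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_norm - pos)
    return sum(losses) / len(losses)


# One positive and one negative with equal scores: the loss is log(2).
print(softmax_nce([0.0], [0.0]))
```

Raising the positive score relative to the shared negatives lowers the loss, which is the behaviour the target index 0 encodes.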


class T2VContraLoss(Loss):
    """NCE for MM joint space, with softmax on text2video matrix."""
    def __init__(self):
        # TODO (huxu): define temperature.
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, pooled_video, pooled_text, **kargs):
        batch_size = pooled_video.size(0)
        logits = torch.mm(pooled_text, pooled_video.transpose(1, 0))
        targets = torch.arange(
            batch_size,
            dtype=torch.long,
            device=pooled_video.device)
        return self.loss(logits, targets)
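
The diagonal-target construction in `T2VContraLoss` above can be illustrated without torch. This is a sketch under the assumption that row i of each input is the embedding of the i-th paired example. `t2v_contrastive` is a hypothetical name, and nested lists replace tensors.

```python
import math


def t2v_contrastive(pooled_video, pooled_text):
    """Mean cross-entropy over rows of the text-to-video similarity matrix.

    Row i holds the dot products of text i with every video; the matching
    video sits on the diagonal, so the target for row i is index i.
    """
    losses = []
    for i, text in enumerate(pooled_text):
        logits = [sum(t * v for t, v in zip(text, video)) for video in pooled_video]
        # Numerically stable log-sum-exp over the row.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_norm - logits[i])
    return sum(losses) / len(losses)


videos = [[1.0, 0.0], [0.0, 1.0]]
aligned_text = [[5.0, 0.0], [0.0, 5.0]]
swapped_text = [[0.0, 5.0], [5.0, 0.0]]
print(t2v_contrastive(videos, aligned_text) < t2v_contrastive(videos, swapped_text))
```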


class V2TContraLoss(Loss):
    """NCE for MM joint space, with softmax on video2text matrix."""

    def __init__(self):
        # TODO (huxu): define temperature.
        self.loss = nn.CrossEntropyLoss()

    def __call__(self, pooled_video, pooled_text, **kargs):
        batch_size = pooled_video.size(0)
        logits = torch.mm(pooled_video, pooled_

│   │   │   │   ├── mrpc.yaml
│   │   │   │   ├── qnli.yaml
│   │   │   │   ├── qqp.yaml
│   │   │   │   ├── rte.yaml
│   │   │   │   ├── run_config/
│   │   │   │   │   ├── local.yaml
│   │   │   │   │   ├── slurm_1g.yaml
│   │   │   │   │   └── slurm_1g_aws.yaml
│   │   │   │   ├── sst_2.yaml
│   │   │   │   └── sts_b.yaml
│   │   │   └── pretraining/
│   │   │       ├── base.yaml
│   │   │       └── run_config/
│   │   │           ├── local.yaml
│   │   │           ├── slurm_2.yaml
│   │   │           ├── slurm_2_aws.yaml
│   │   │           ├── slurm_3.yaml
│   │   │           └── slurm_4.yaml
│   │   ├── fb_multilingual/
│   │   │   └── README.multilingual.pretraining.md
│   │   ├── multiprocessing_bpe_encoder.py
│   │   ├── preprocess_GLUE_tasks.sh
│   │   ├── preprocess_RACE.py
│   │   ├── preprocess_RACE.sh
│   │   └── wsc/
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── wsc_criterion.py
│   │       ├── wsc_task.py
│   │       └── wsc_utils.py
│   ├── rxf/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   └── rxf_src/
│   │       ├── __init__.py
│   │       ├── label_smoothed_cross_entropy_r3f.py
│   │       └── sentence_prediction_r3f.py
│   ├── scaling_nmt/
│   │   └── README.md
│   ├── shuffled_word_order/
│   │   ├── README.finetuning.md
│   │   └── README.md
│   ├── simultaneous_translation/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── docs/
│   │   │   ├── ende-mma.md
│   │   │   └── enja-waitk.md
│   │   ├── eval/
│   │   │   └── agents/
│   │   │       └── simul_t2t_enja.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── convtransformer_simul_trans.py
│   │   │   └── transformer_monotonic_attention.py
│   │   ├── modules/
│   │   │   ├── __init__.py
│   │   │   ├── fixed_pre_decision.py
│   │   │   ├── monotonic_multihead_attention.py
│   │   │   └── monotonic_transformer_layer.py
│   │   ├── tests/
│   │   │   ├── test_alignment_train.py
│   │   │   └── test_text_models.py
│   │   └── utils/
│   │       ├── __init__.py
│   │       ├── functions.py
│   │       ├── monotonic_attention.py
│   │       └── p_choose_strategy.py
│   ├── speech_recognition/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── criterions/
│   │   │   ├── ASG_loss.py
│   │   │   ├── __init__.py
│   │   │   └── cross_entropy_acc.py
│   │   ├── data/
│   │   │   ├── __init__.py
│   │   │   ├── asr_dataset.py
│   │   │   ├── collaters.py
│   │   │   ├── data_utils.py
│   │   │   └── replabels.py
│   │   ├── datasets/
│   │   │   ├── asr_prep_json.py
│   │   │   └── prepare-librispeech.sh
│   │   ├── infer.py
│   │   ├── kaldi/
│   │   │   ├── __init__.py
│   │   │   ├── add-self-loop-simple.cc
│   │   │   ├── config/
│   │   │   │   └── kaldi_initializer.yaml
│   │   │   ├── kaldi_decoder.py
│   │   │   └── kaldi_initializer.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── vggtransformer.py
│   │   │   └── w2l_conv_glu_enc.py
│   │   ├── new/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── conf/
│   │   │   │   ├── hydra/
│   │   │   │   │   └── sweeper/
│   │   │   │   │       ├── ax.yaml
│   │   │   │   │       └── ax_sil.yaml
│   │   │   │   ├── infer.yaml
│   │   │   │   └── run_config/
│   │   │   │       ├── fb_slurm_1.yaml
│   │   │   │       └── fb_slurm_2g.yaml
│   │   │   ├── decoders/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── base_decoder.py
│   │   │   │   ├── decoder.py
│   │   │   │   ├── decoder_config.py
│   │   │   │   ├── flashlight_decoder.py
│   │   │   │   └── viterbi_decoder.py
│   │   │   └── infer.py
│   │   ├── tasks/
│   │   │   ├── __init__.py
│   │   │   └── speech_recognition.py
│   │   ├── utils/
│   │   │   └── wer_utils.py
│   │   └── w2l_decoder.py
│   ├── speech_synthesis/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── data_utils.py
│   │   ├── docs/
│   │   │   ├── common_voice_example.md
│   │   │   ├── ljspeech_example.md
│   │   │   └── vctk_example.md
│   │   ├── evaluation/
│   │   │   ├── __init__.py
│   │   │   ├── eval_asr.py
│   │   │   ├── eval_f0.py
│   │   │   ├── eval_sp.py
│   │   │   └── get_eval_manifest.py
│   │   ├── generate_waveform.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   ├── denoise_and_vad_audio.py
│   │   │   ├── denoiser/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── demucs.py
│   │   │   │   ├── pretrained.py
│   │   │   │   ├── resample.py
│   │   │   │   └── utils.py
│   │   │   ├── get_common_voice_audio_manifest.py
│   │   │   ├── get_feature_manifest.py
│   │   │   ├── get_ljspeech_audio_manifest.py
│   │   │   ├── get_speaker_embedding.py
│   │   │   ├── get_vctk_audio_manifest.py
│   │   │   ├── speaker_embedder/
│   │   │   │   └── __init__.py
│   │   │   └── vad/
│   │   │       └── __init__.py
│   │   └── utils.py
│   ├── speech_text_joint_to_text/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── configs/
│   │   │   └── mustc_noise.list
│   │   ├── criterions/
│   │   │   ├── __init__.py
│   │   │   ├── multi_modality_compound.py
│   │   │   ├── multi_modality_cross_entropy.py
│   │   │   └── text_guide_cross_entropy_acc.py
│   │   ├── data/
│   │   │   └── pair_denoising_dataset.py
│   │   ├── docs/
│   │   │   ├── ende-mustc.md
│   │   │   ├── iwslt2021.md
│   │   │   └── pre-training.md
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── joint_speech_text_pretrain_transformer.py
│   │   │   ├── s2t_dualinputtransformer.py
│   │   │   ├── s2t_dualinputwavtransformer.py
│   │   │   └── s2t_dualinputxmtransformer.py
│   │   ├── scripts/
│   │   │   ├── convert_model.py
│   │   │   └── g2p_encode.py
│   │   └── tasks/
│   │       ├── __init__.py
│   │       ├── pair_denoising.py
│   │       ├── speech_text_denoise_pretrain.py
│   │       └── speech_text_joint.py
│   ├── speech_to_speech/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── asr_bleu/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── asr_model_cfgs.json
│   │   │   ├── compute_asr_bleu.py
│   │   │   ├── requirements.txt
│   │   │   └── utils.py
│   │   ├── benchmarking/
│   │   │   ├── README.md
│   │   │   ├── configs/
│   │   │   │   ├── 2StageS2ST.yaml
│   │   │   │   ├── 3StageS2ST.yaml
│   │   │   │   ├── DirectS2U.yaml
│   │   │   │   └── S2T.yaml
│   │   │   ├── core.py
│   │   │   ├── data_utils.py
│   │   │   └── get_metrics.py
│   │   ├── docs/
│   │   │   ├── data_augmentation.md
│   │   │   ├── direct_s2st_discrete_units.md
│   │   │   ├── enhanced_direct_s2st_discrete_units.md
│   │   │   └── textless_s2st_real_data.md
│   │   ├── generate_waveform_from_code.py
│   │   ├── preprocessing/
│   │   │   ├── __init__.py
│   │   │   ├── data_utils.py
│   │   │   ├── prep_s2spect_data.py
│   │   │   ├── prep_s2ut_data.py
│   │   │   ├── prep_sn_data.py
│   │   │   └── prep_sn_output_data.py
│   │   └── unity/
│   │       ├── __init__.py
│   │       ├── sequence_generator.py
│   │       └── sequence_generator_multi_decoder.py
│   ├── speech_to_text/
│   │   ├── README.md
│   │   ├── data_utils.py
│   │   ├── docs/
│   │   │   ├── covost_example.md
│   │   │   ├── librispeech_example.md
│   │   │   ├── mtedx_example.md
│   │   │   ├── mustc_example.md
│   │   │   └── simulst_mustc_example.md
│   │   ├── prep_covost_data.py
│   │   ├── prep_librispeech_data.py
│   │   ├── prep_mtedx_data.py
│   │   ├── prep_mustc_data.py
│   │   ├── seg_mustc_data.py
│   │   └── simultaneous_translation/
│   │       └── agents/
│   │           └── fairseq_simul_st_agent.py
│   ├── stories/
│   │   └── README.md
│   ├── textless_nlp/
│   │   ├── dgslm/
│   │   │   ├── README.md
│   │   │   ├── create_code_file.py
│   │   │   ├── dgslm_utils.py
│   │   │   ├── hubert_fisher/
│   │   │   │   └── README.md
│   │   │   ├── sample_speech_dlm.py
│   │   │   └── vocoder_hifigan/
│   │   │       ├── README.md
│   │   │       └── generate_stereo_waveform.py
│   │   ├── gslm/
│   │   │   ├── README.md
│   │   │   ├── metrics/
│   │   │   │   ├── README.md
│   │   │   │   ├── abx_metrics/
│   │   │   │   │   ├── README.md
│   │   │   │   │   └── dump_abx_feats.py
│   │   │   │   └── asr_metrics/
│   │   │   │       ├── README.md
│   │   │   │       ├── continuation_eval.py
│   │   │   │       ├── misc/
│   │   │   │       │   ├── bleu_utils.py
│   │   │   │       │   ├── cut_as.py
│   │   │   │       │   └── dict.ltr.txt
│   │   │   │       ├── ppx.py
│   │   │   │       └── self_auto_bleu.py
│   │   │   ├── speech2unit/
│   │   │   │   ├── README.md
│   │   │   │   ├── __init__.py
│   │   │   │   ├── clustering/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── cluster_kmeans.py
│   │   │   │   │   ├── dump_feats.py
│   │   │   │   │   ├── quantize_with_kmeans.py
│   │   │   │   │   └── utils.py
│   │   │   │   └── pretrained/
│   │   │   │       ├── cpc_feature_reader.py
│   │   │   │       ├── hubert_feature_reader.py
│   │   │   │       ├── logmel_feature_reader.py
│   │   │   │       ├── utils.py
│   │   │   │       └── w2v2_feature_reader.py
│   │   │   ├── tools/
│   │   │   │   ├── README.md
│   │   │   │   └── resynthesize_speech.py
│   │   │   ├── ulm/
│   │   │   │   ├── README.md
│   │   │   │   └── sample.py
│   │   │   └── unit2speech/
│   │   │       ├── README.md
│   │   │       ├── convert_to_16k.py
│   │   │       ├── glow.py
│   │   │       ├── multiproc.py
│   │   │       ├── synthesize_audio_from_units.py
│   │   │       ├── tacotron2/
│   │   │       │   ├── __init__.py
│   │   │       │   ├── audio_processing.py
│   │   │       │   ├── cleaners.py
│   │   │       │   ├── cmudict.py
│   │   │       │   ├── layers.py
│   │   │       │   ├── model.py
│   │   │       │   ├── numbers.py
│   │   │       │   ├── stft.py
│   │   │       │   ├── symbols.py
│   │   │       │   ├── text.py
│   │   │       │   ├── utils.py
│   │   │       │   └── waveglow_denoiser.py
│   │   │       ├── tts_data.py
│   │   │       └── utils.py
│   │   ├── pgslm/
│   │   │   ├── README.md
│   │   │   ├── data_utils.py
│   │   │   ├── eval/
│   │   │   │   ├── __init__.py
│   │   │   │   └── cont_metrics.py
│   │   │   ├── generate_waveform.py
│   │   │   ├── inference_dataset.py
│   │   │   ├── naive_decoder.py
│   │   │   ├── prepare_dataset.py
│   │   │   ├── preprocess_f0.py
│   │   │   ├── quantize_f0.py
│   │   │   ├── sample/
│   │   │   │   ├── __init__.py
│   │   │   │   └── sample.py
│   │   │   ├── scripts/
│   │   │   │   ├── join_units_manifest.py
│   │   │   │   ├── prepare_data.sh
│   │   │   │   └── prepare_f0_quantization.sh
│   │   │   └── truncated_laplace.py
│   │   └── speech-resynth/
│   │       └── README.md
│   ├── translation/
│   │   ├── README.md
│   │   ├── prepare-iwslt14.sh
│   │   ├── prepare-iwslt17-multilingual.sh
│   │   ├── prepare-wmt14en2de.sh
│   │   └── prepare-wmt14en2fr.sh
│   ├── translation_moe/
│   │   ├── README.md
│   │   ├── score.py
│   │   └── translation_moe_src/
│   │       ├── __init__.py
│   │       ├── logsumexp_moe.py
│   │       ├── mean_pool_gating_network.py
│   │       └── translation_moe.py
│   ├── truncated_bptt/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── transformer_xl_model.py
│   │   └── truncated_bptt_lm_task.py
│   ├── unsupervised_quality_estimation/
│   │   ├── README.md
│   │   ├── aggregate_scores.py
│   │   ├── meteor.py
│   │   └── repeat_lines.py
│   ├── wav2vec/
│   │   ├── README.md
│   │   ├── __init__.py
│   │   ├── config/
│   │   │   ├── finetuning/
│   │   │   │   ├── base_100h.yaml
│   │   │   │   ├── base_10h.yaml
│   │   │   │   ├── base_10m.yaml
│   │   │   │   ├── base_1h.yaml
│   │   │   │   ├── base_960h.yaml
│   │   │   │   ├── run_config/
│   │   │   │   │   ├── slurm_1.yaml
│   │   │   │   │   ├── slurm_16.yaml
│   │   │   │   │   ├── slurm_1_aws.yaml
│   │   │   │   │   ├── slurm_1_old.yaml
│   │   │   │   │   ├── slurm_2.yaml
│   │   │   │   │   ├── slurm_2_aws.yaml
│   │   │   │   │   ├── slurm_2g.yaml
│   │   │   │   │   ├── slurm_3.yaml
│   │   │   │   │   ├── slurm_4g.yaml
│   │   │   │   │   ├── slurm_4g_aws.yaml
│   │   │   │   │   └── slurm_8.yaml
│   │   │   │   ├── vox_100h.yaml
│   │   │   │   ├── vox_100h_2.yaml
│   │   │   │   ├── vox_100h_2_aws.yaml
│   │   │   │   ├── vox_100h_3.yaml
│   │   │   │   ├── vox_10h.yaml
│   │   │   │   ├── vox_10h_2.yaml
│   │   │   │   ├── vox_10h_2_aws.yaml
│   │   │   │   ├── vox_10h_aws.yaml
│   │   │   │   ├── vox_10h_aws_v100.yaml
│   │   │   │   ├── vox_10m.yaml
│   │   │   │   ├── vox_10m_2.yaml
│   │   │   │   ├── vox_10m_2_aws.yaml
│   │   │   │   ├── vox_10m_3.yaml
│   │   │   │   ├── vox_1h.yaml
│   │   │   │   ├── vox_1h_2.yaml
│   │   │   │   ├── vox_1h_2_aws.yaml
│   │   │   │   ├── vox_1h_3.yaml
│   │   │   │   ├── vox_1h_4.yaml
│   │   │   │   ├── vox_1h_aws.yaml
│   │   │   │   ├── vox_960h.yaml
│   │   │   │   ├── vox_960h_2.yaml
│   │   │   │   ├── vox_960h_2_aws.yaml
│   │   │   │   └── vox_960h_3.yaml
│   │   │   └── pretraining/
│   │   │       ├── wav2vec2_base_librispeech.yaml
│   │   │       ├── wav2vec2_conformer_base_librispeech.yaml
│   │   │       ├── wav2vec2_conformer_large_librivox.yaml
│   │   │       ├── wav2vec2_large_librivox.yaml
│   │   │       ├── wav2vec2_large_librivox_tpu-pod.yaml
│   │   │       └── wav2vec2_large_librivox_tpu.yaml
│   │   ├── libri_labels.py
│   │   ├── scripts/
│   │   │   └── binarize_manifest.sh
│   │   ├── unsupervised/
│   │   │   ├── README.md
│   │   │   ├── __init__.py
│   │   │   ├── config/
│   │   │   │   ├── finetuning/
│   │   │   │   │   └── w2v_finetune.yaml
│   │   │   │   ├── gan/
│   │   │   │   │   ├── w2vu.yaml
│   │   │   │   │   └── w2vu2.yaml
│   │   │   │   ├── generate/
│   │   │   │   │   └── viterbi.yaml
│   │   │   │   ├── timit_matched/
│   │   │   │   │   ├── test.uid
│   │   │   │   │   ├── train.uid
│   │   │   │   │   ├── train_text.uid
│   │   │   │   │   └── valid.uid
│   │   │   │   └── timit_unmatched/
│   │   │   │       ├── test.uid
│   │   │   │       ├── train.uid
│   │   │   │       ├── train_text.uid
│   │   │   │       └── valid.uid
│   │   │   ├── data/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── extracted_features_dataset.py
│   │   │   │   └── random_input_dataset.py
│   │   │   ├── kaldi_self_train/
│   │   │   │   ├── README.md
│   │   │   │   └── st/
│   │   │   │       ├── cmd.sh
│   │   │   │       ├── decode_phone.sh
│   │   │   │       ├── decode_word_step1.sh
│   │   │   │       ├── decode_word_step2.sh
│   │   │   │       ├── local/
│   │   │   │       │   ├── copy_aligned_text.py
│   │   │   │       │   ├── decode.sh
│   │   │   │       │   ├── prepare_data_from_w2v.py
│   │   │   │       │   ├── prepare_lang.sh
│   │   │   │       │   ├── prepare_lang_word.sh
│   │   │   │       │   ├── prepare_lm.sh
│   │   │   │       │   ├── score.sh
│   │   │   │       │   ├── show_wer.sh
│   │   │   │       │   ├── train_subset_lgbeam.sh
│   │   │   │       │   ├── unsup_select.py
│   │   │   │       │   ├── unsup_select_decode.sh
│   │   │   │       │   └── unsup_select_decode_word.sh
│   │   │   │       ├── path.sh
│   │   │   │       ├── steps_gan/
│   │   │   │       │   ├── train_deltas.sh
│   │   │   │       │   ├── train_lda_mllt.sh
│   │   │   │       │   └── train_sat.sh
│   │   │   │       └── train.sh
│   │   │   ├── models/
│   │   │   │   ├── __init__.py
│   │   │   │   └── wav2vec_u.py
│   │   │   ├── scripts/
│   │   │   │   ├── apply_pca.py
│   │   │   │   ├── copy_labels.py
│   │   │   │   ├── filter_lexicon.py
│   │   │   │   ├── filter_tsv.py
│   │   │   │   ├── g2p_wrd_to_phn.py
│   │   │   │   ├── ltr_to_wrd.py
│   │   │   │   ├── mean_pool.py
│   │   │   │   ├── merge_clusters.py
│   │   │   │   ├── normalize_and_filter_text.py
│   │   │   │   ├── normalize_text.py
│   │   │   │   ├── pca.py
│   │   │   │   ├── phonemize_with_sil.py
│   │   │   │   ├── prepare_audio.sh
│   │   │   │   ├── prepare_audio_v2.sh
│   │   │   │   ├── prepare_text.sh
│   │   │   │   ├── prepare_timit.sh
│   │   │   │   ├── remove_silence.py
│   │   │   │   ├── vads.py
│   │   │   │   ├── wav2vec_apply_cluster_faiss.py
│   │   │   │   ├── wav2vec_cluster_faiss.py
│   │   │   │   ├── wav2vec_extract_features.py
│   │   │   │   ├── wer.py
│   │   │   │   └── wrd_to_ltr.py
│   │   │   ├── tasks/
│   │   │   │   ├── __init__.py
│   │   │   │   └── unpaired_audio_text.py
│   │   │   └── w2vu_generate.py
│   │   ├── vq-wav2vec_featurize.py
│   │   ├── wav2vec_featurize.py
│   │   ├── wav2vec_manifest.py
│   │   └── xlsr/
│   │       ├── README.md
│   │       ├── config/
│   │       │   └── finetune.yaml
│   │       └── scripts/
│   │           ├── eval_speaker_clf_task.py
│   │           └── gen_audio_embedding.py
│   ├── wmt19/
│   │   └── README.md
│   ├── wmt20/
│   │   └── README.md
│   ├── wmt21/
│   │   ├── README.md
│   │   ├── eval.sh
│   │   └── scripts/
│   │       ├── normalize-punctuation.perl
│   │       └── replace-unicode-punctuation.perl
│   ├── womens_bios/
│   │   ├── README.md
│   │   └── query_occupations_from_wikidata.py
│   ├── xformers/
│   │   └── README.md
│   ├── xglm/
│   │   ├── README.md
│   │   ├── XStoryCloze.md
│   │   └── model_card.md
│   ├── xlmr/
│   │   └── README.md
│   └── xmod/
│       ├── README.md
│       └── preprocess_nli.py
├── fairseq/
│   ├── __init__.py
│   ├── benchmark/
│   │   ├── __init__.py
│   │   ├── benchmark_multihead_attention.py
│   │   ├── dummy_dataset.py
│   │   ├── dummy_lm.py
│   │   ├── dummy_masked_lm.py
│   │   ├── dummy_model.py
│   │   └── dummy_mt.py
│   ├── binarizer.py
│   ├── checkpoint_utils.py
│   ├── clib/
│   │   ├── cuda/
│   │   │   ├── ngram_repeat_block_cuda.cpp
│   │   │   └── ngram_repeat_block_cuda_kernel.cu
│   │   ├── libbase/
│   │   │   └── balanced_assignment.cpp
│   │   ├── libbleu/
│   │   │   ├── libbleu.cpp
│   │   │   └── module.cpp
│   │   ├── libnat/
│   │   │   └── edit_dist.cpp
│   │   └── libnat_cuda/
│   │       ├── binding.cpp
│   │       ├── edit_dist.cu
│   │       └── edit_dist.h
│   ├── config/
│   │   ├── __init__.py
│   │   ├── config.yaml
│   │   ├── fb_run_config/
│   │   │   └── slurm.yaml
│   │   └── model/
│   │       ├── transformer_lm/
│   │       │   ├── transformer_lm_baevski_gbw.yaml
│   │       │   ├── transformer_lm_baevski_wiki103.yaml
│   │       │   ├── transformer_lm_big.yaml
│   │       │   ├── transformer_lm_gbw.yaml
│   │       │   ├── transformer_lm_gpt.yaml
│   │       │   ├── transformer_lm_gpt2_big.yaml
│   │       │   ├── transformer_lm_gpt2_medium.yaml
│   │       │   ├── transformer_lm_gpt2_small.yaml
│   │       │   └── transformer_lm_wiki103.yaml
│   │       ├── wav2vec/
│   │       │   └── vq_wav2vec_gumbel.yaml
│   │       └── wav2vec2/
│   │           ├── wav2vec2_base.yaml
│   │           └── wav2vec2_large.yaml
│   ├── criterions/
│   │   ├── __init__.py
│   │   ├── adaptive_loss.py
│   │   ├── composite_loss.py
│   │   ├── cross_entropy.py
│   │   ├── ctc.py
│   │   ├── fairseq_criterion.py
│   │   ├── fastspeech2_loss.py
│   │   ├── hubert_criterion.py
│   │   ├── label_smoothed_cross_entropy.py
│   │   ├── label_smoothed_cross_entropy_latency_augmented.py
│   │   ├── label_smoothed_cross_entropy_with_alignment.py
│   │   ├── label_smoothed_cross_entropy_with_ctc.py
│   │   ├── label_smoothed_cross_entropy_with_rdrop.py
│   │   ├── legacy_masked_lm.py
│   │   ├── masked_lm.py
│   │   ├── model_criterion.py
│   │   ├── nat_loss.py
│   │   ├── sentence_prediction.py
│   │   ├── sentence_prediction_adapters.py
│   │   ├── sentence_ranking.py
│   │   ├── speech_dlm_criterion.py
│   │   ├── speech_to_speech_criterion.py
│   │   ├── speech_ulm_criterion.py
│   │   ├── tacotron2_loss.py
│   │   └── wav2vec_criterion.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── add_class_target_dataset.py
│   │   ├── add_target_dataset.py
│   │   ├── append_token_dataset.py
│   │   ├── audio/
│   │   │   ├── __init__.py
│   │   │   ├── audio_utils.py
│   │   │   ├── data_cfg.py
│   │   │   ├── dataset_transforms/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── concataugment.py
│   │   │   │   └── noisyoverlapaugment.py
│   │   │   ├── feature_transforms/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── delta_deltas.py
│   │   │   │   ├── global_cmvn.py
│   │   │   │   ├── specaugment.py
│   │   │   │   └── utterance_cmvn.py
│   │   │   ├── frm_text_to_speech_dataset.py
│   │   │   ├── hubert_dataset.py
│   │   │   ├── multi_modality_dataset.py
│   │   │   ├── raw_audio_dataset.py
│   │   │   ├── speech_to_speech_dataset.py
│   │   │   ├── speech_to_text_dataset.py
│   │   │   ├── speech_to_text_joint_dataset.py
│   │   │   ├── text_to_speech_dataset.py
│   │   │   └── waveform_transforms/
│   │   │       ├── __init__.py
│   │   │       └── noiseaugment.py
│   │   ├── backtranslation_dataset.py
│   │   ├── base_wrapper_dataset.py
│   │   ├── bucket_pad_length_dataset.py
│   │   ├── codedataset.py
│   │   ├── colorize_dataset.py
│   │   ├── concat_dataset.py
│   │   ├── concat_sentences_dataset.py
│   │   ├── data_utils.py
│   │   ├── data_utils_fast.pyx
│   │   ├── denoising_dataset.py
│   │   ├── dictionary.py
│   │   ├── encoders/
│   │   │   ├── __init__.py
│   │   │   ├── byte_bpe.py
│   │   │   ├── byte_utils.py
│   │   │   ├── bytes.py
│   │   │   ├── characters.py
│   │   │   ├── fastbpe.py
│   │   │   ├── gpt2_bpe.py
│   │   │   ├── gpt2_bpe_utils.py
│   │   │   ├── hf_bert_bpe.py
│   │   │   ├── hf_byte_bpe.py
│   │   │   ├── moses_tokenizer.py
│   │   │   ├── nltk_tokenizer.py
│   │   │   ├── sentencepiece_bpe.py
│   │   │   ├── space_tokenizer.py
│   │   │   ├── subword_nmt_bpe.py
│   │   │   └── utils.py
│   │   ├── fairseq_dataset.py
│   │   ├── fasta_dataset.py
│   │   ├── huffman/
│   │   │   ├── __init__.py
│   │   │   ├── huffman_coder.py
│   │   │   └── huffman_mmap_indexed_dataset.py
│   │   ├── id_dataset.py
│   │   ├── indexed_dataset.py
│   │   ├── iterators.py
│   │   ├── language_pair_dataset.py
│   │   ├── legacy/
│   │   │   ├── __init__.py
│   │   │   ├── block_pair_dataset.py
│   │   │   ├── masked_lm_dataset.py
│   │   │   └── masked_lm_dictionary.py
│   │   ├── list_dataset.py
│   │   ├── lm_context_window_dataset.py
│   │   ├── lru_cache_dataset.py
│   │   ├── mask_tokens_dataset.py
│   │   ├── monolingual_dataset.py
│   │   ├── multi_corpus_dataset.py
│   │   ├── multi_corpus_sampled_dataset.py
│   │   ├── multilingual/
│   │   │   ├── __init__.py
│   │   │   ├── multilingual_data_manager.py
│   │   │   ├── multilingual_utils.py
│   │   │   ├── sampled_multi_dataset.py
│   │   │   ├── sampled_multi_epoch_dataset.py
│   │   │   └── sampling_method.py
│   │   ├── nested_dictionary_dataset.py
│   │   ├── noising.py
│   │   ├── num_samples_dataset.py
│   │   ├── numel_dataset.py
│   │   ├── offset_tokens_dataset.py
│   │   ├── pad_dataset.py
│   │   ├── padding_mask_dataset.py
│   │   ├── plasma_utils.py
│   │   ├── prepend_dataset.py
│   │   ├── prepend_token_dataset.py
│   │   ├── raw_label_dataset.py
│   │   ├── replace_dataset.py
│   │   ├── resampling_dataset.py
│   │   ├── roll_dataset.py
│   │   ├── round_robin_zip_datasets.py
│   │   ├── shorten_dataset.py
│   │   ├── sort_dataset.py
│   │   ├── span_mask_tokens_dataset.py
│   │   ├── speech_dlm_dataset.py
│   │   ├── strip_token_dataset.py
│   │   ├── subsample_dataset.py
│   │   ├── text_compressor.py
│   │   ├── token_block_dataset.py
│   │   ├── token_block_utils_fast.pyx
│   │   ├── transform_eos_concat_langpair_dataset.py
│   │   ├── transform_eos_dataset.py
│   │   └── transform_eos_lang_pair_dataset.py
│   ├── dataclass/
│   │   ├── __init__.py
│   │   ├── configs.py
│   │   ├── constants.py
│   │   ├── initialize.py
│   │   └── utils.py
│   ├── distributed/
│   │   ├── __init__.py
│   │   ├── distributed_timeout_wrapper.py
│   │   ├── fully_sharded_data_parallel.py
│   │   ├── legacy_distributed_data_parallel.py
│   │   ├── module_proxy_wrapper.py
│   │   ├── tpu_distributed_data_parallel.py
│   │   └── utils.py
│   ├── file_chunker_utils.py
│   ├── file_io.py
│   ├── file_utils.py
│   ├── hub_utils.py
│   ├── incremental_decoding_utils.py
│   ├── iterative_refinement_generator.py
│   ├── logging/
│   │   ├── __init__.py
│   │   ├── meters.py
│   │   ├── metrics.py
│   │   └── progress_bar.py
│   ├── model_parallel/
│   │   ├── __init__.py
│   │   ├── criterions/
│   │   │   ├── __init__.py
│   │   │   └── vocab_parallel_cross_entropy.py
│   │   ├── megatron_trainer.py
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   ├── pipeline_parallel_transformer/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── layers.py
│   │   │   │   └── model.py
│   │   │   ├── roberta/
│   │   │   │   ├── __init__.py
│   │   │   │   └── model.py
│   │   │   ├── transformer.py
│   │   │   └── transformer_lm.py
│   │   └── modules/
│   │       ├── __init__.py
│   │       ├── multihead_attention.py
│   │       └── transformer_layer.py
│   ├── models/
│   │   ├── __init__.py
│   │   ├── bart/
│   │   │   ├── __init__.py
│   │   │   ├── hub_interface.py
│   │   │   └── model.py
│   │   ├── composite_encoder.py
│   │   ├── distributed_fairseq_model.py
│   │   ├── ema/
│   │   │   ├── __init__.py
│   │   │   └── ema.py
│   │   ├── fairseq_decoder.py
│   │   ├── fairseq_encoder.py
│   │   ├── fairseq_incremental_decoder.py
│   │   ├── fairseq_model.py
│   │   ├── fconv.py
│   │   ├── fconv_lm.py
│   │   ├── fconv_self_att.py
│   │   ├── hubert/
│   │   │   ├── __init__.py
│   │   │   ├── hubert.py
│   │   │   └── hubert_asr.py
│   │   ├── huggingface/
│   │   │   ├── __init__.py
│   │   │   └── hf_gpt2.py
│   │   ├── lightconv.py
│   │   ├── lightconv_lm.py
│   │   ├── lstm.py
│   │   ├── lstm_lm.py
│   │   ├── masked_lm.py
│   │   ├── model_utils.py
│   │   ├── multilingual_transformer.py
│   │   ├── multires_hubert/
│   │   │   ├── __init__.py
│   │   │   ├── multires_hubert.py
│   │   │   └── multires_hubert_asr.py
│   │   ├── nat/
│   │   │   ├── __init__.py
│   │   │   ├── cmlm_transformer.py
│   │   │   ├── fairseq_nat_model.py
│   │   │   ├── insertion_transformer.py
│   │   │   ├── iterative_nonautoregressive_transformer.py
│   │   │   ├── levenshtein_transformer.py
│   │   │   ├── levenshtein_utils.py
│   │   │   ├── nat_crf_transformer.py
│   │   │   ├── nonautoregressive_ensembles.py
│   │   │   └── nonautoregressive_transformer.py
│   │   ├── roberta/
│   │   │   ├── __init__.py
│   │   │   ├── alignment_utils.py
│   │   │   ├── enc_dec.py
│   │   │   ├── hub_interface.py
│   │   │   ├── model.py
│   │   │   ├── model_camembert.py
│   │   │   ├── model_gottbert.py
│   │   │   └── model_xlmr.py
│   │   ├── speech_dlm/
│   │   │   ├── __init__.py
│   │   │   ├── hub_interface.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── speech_dlm_decoder.py
│   │   │   │   └── speech_dlm_decoder_layer.py
│   │   │   ├── sequence_generator/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── multichannel_search.py
│   │   │   │   └── multichannel_sequence_generator.py
│   │   │   └── speech_dlm.py
│   │   ├── speech_to_speech/
│   │   │   ├── __init__.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── ctc_decoder.py
│   │   │   │   ├── stacked_embedding.py
│   │   │   │   ├── transformer_decoder_aug.py
│   │   │   │   └── transformer_encoder.py
│   │   │   ├── s2s_conformer.py
│   │   │   ├── s2s_conformer_translatotron2.py
│   │   │   ├── s2s_conformer_unity.py
│   │   │   └── s2s_transformer.py
│   │   ├── speech_to_text/
│   │   │   ├── __init__.py
│   │   │   ├── berard.py
│   │   │   ├── convtransformer.py
│   │   │   ├── hub_interface.py
│   │   │   ├── modules/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── augmented_memory_attention.py
│   │   │   │   ├── convolution.py
│   │   │   │   └── emformer.py
│   │   │   ├── multi_modality_model.py
│   │   │   ├── s2t_conformer.py
│   │   │   ├── s2t_transformer.py
│   │   │   ├── s2t_wav_transformer.py
│   │   │   ├── utils.py
│   │   │   ├── xm_transformer.py
│   │   │   └── xm_transformer_unity.py
│   │   ├── text_to_speech/
│   │   │   ├── __init__.py
│   │   │   ├── codehifigan.py
│   │   │   ├── fastspeech2.py
│   │   │   ├── hifigan.py
│   │   │   ├── hub_interface.py
│   │   │   ├── tacotron2.py
│   │   │   ├── tts_transformer.py
│   │   │   └── vocoder.py
│   │   ├── transformer/
│   │   │   ├── __init__.py
│   │   │   ├── transformer_base.py
│   │   │   ├── transformer_config.py
│   │   │   ├── transformer_decoder.py
│   │   │   ├── transformer_decoder_aug.py
│   │   │   ├── transformer_encoder.py
│   │   │   └── transformer_legacy.py
│   │   ├── transformer_align.py
│   │   ├── transformer_from_pretrained_xlm.py
│   │   ├── transformer_lm.py
│   │   ├── transformer_ulm.py
│   │   ├── wav2vec/
│   │   │   ├── __init__.py
│   │   │   ├── utils.py
│   │   │   ├── wav2vec.py
│   │   │   ├── wav2vec2.py
│   │   │   ├── wav2vec2_asr.py
│   │   │   ├── wav2vec2_classification.py
│   │   │   └── wav2vec2_laser.py
│   │   └── xmod/
│   │       ├── __init__.py
│   │       ├── hub_interface.py
│   │       ├── model.py
│   │       └── transformer_layer_xmod.py
│   ├── modules/
│   │   ├── __init__.py
│   │   ├── adaptive_input.py
│   │   ├── adaptive_softmax.py
│   │   ├── base_layer.py
│   │   ├── beamable_mm.py
│   │   ├── character_token_embedder.py
│   │   ├── checkpoint_activations.py
│   │   ├── conformer_layer.py
│   │   ├── conv_tbc.py
│   │   ├── cross_entropy.py
│   │   ├── cuda_utils.cu
│   │   ├── downsampled_multihead_attention.py
│   │   ├── dynamic_convolution.py
│   │   ├── dynamic_crf_layer.py
│   │   ├── dynamicconv_layer/
│   │   │   ├── __init__.py
│   │   │   ├── cuda_function_gen.py
│   │   │   ├── dynamicconv_cuda.cpp
│   │   │   ├── dynamicconv_cuda.cuh
│   │   │   ├── dynamicconv_cuda_kernel.cu
│   │   │   ├── dynamicconv_layer.py
│   │   │   ├── dynamiconv_cpu.cpp
│   │   │   └── setup.py
│   │   ├── ema_module.py
│   │   ├── espnet_multihead_attention.py
│   │   ├── fairseq_dropout.py
│   │   ├── fp32_batch_norm.py
│   │   ├── fp32_group_norm.py
│   │   ├── fp32_instance_norm.py
│   │   ├── gelu.py
│   │   ├── grad_multiply.py
│   │   ├── gumbel_vector_quantizer.py
│   │   ├── kmeans_attention.py
│   │   ├── kmeans_vector_quantizer.py
│   │   ├── layer_drop.py
│   │   ├── layer_norm.py
│   │   ├── learned_positional_embedding.py
│   │   ├── lightconv_layer/
│   │   │   ├── __init__.py
│   │   │   ├── cuda_function_gen.py
│   │   │   ├── lightconv_cuda.cpp
│   │   │   ├── lightconv_cuda.cuh
│   │   │   ├── lightconv_cuda_kernel.cu
│   │   │   ├── lightconv_layer.py
│   │   │   └── setup.py
│   │   ├── lightweight_convolution.py
│   │   ├── linearized_convolution.py
│   │   ├── location_attention.py
│   │   ├── lstm_cell_with_zoneout.py
│   │   ├── multihead_attention.py
│   │   ├── positional_embedding.py
│   │   ├── positional_encoding.py
│   │   ├── quant_noise.py
│   │   ├── quantization/
│   │   │   ├── __init__.py
│   │   │   ├── pq/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── em.py
│   │   │   │   ├── modules/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── qconv.py
│   │   │   │   │   ├── qemb.py
│   │   │   │   │   └── qlinear.py
│   │   │   │   ├── pq.py
│   │   │   │   └── utils.py
│   │   │   ├── quantization_options.py
│   │   │   └── scalar/
│   │   │       ├── __init__.py
│   │   │       ├── modules/
│   │   │       │   ├── __init__.py
│   │   │       │   ├── qact.py
│   │   │       │   ├── qconv.py
│   │   │       │   ├── qemb.py
│   │   │       │   └── qlinear.py
│   │   │       ├── ops.py
│   │   │       └── utils.py
│   │   ├── rotary_positional_embedding.py
│   │   ├── same_pad.py
│   │   ├── scalar_bias.py
│   │   ├── sinusoidal_positional_embedding.py
│   │   ├── sparse_multihead_attention.py
│   │   ├── sparse_transformer_sentence_encoder.py
│   │   ├── sparse_transformer_sentence_encoder_layer.py
│   │   ├── transformer_layer.py
│   │   ├── transformer_layer_aug.py
│   │   ├── transformer_sentence_encoder.py
│   │   ├── transformer_sentence_encoder_layer.py
│   │   ├── transpose_last.py
│   │   ├── unfold.py
│   │   └── vggblock.py
│   ├── nan_detector.py
│   ├── ngram_repeat_block.py
│   ├── optim/
│   │   ├── __init__.py
│   │   ├── adadelta.py
│   │   ├── adafactor.py
│   │   ├── adagrad.py
│   │   ├── adam.py
│   │   ├── adamax.py
│   │   ├── amp_optimizer.py
│   │   ├── bmuf.py
│   │   ├── composite.py
│   │   ├── cpu_adam.py
│   │   ├── dynamic_loss_scaler.py
│   │   ├── fairseq_optimizer.py
│   │   ├── fp16_optimizer.py
│   │   ├── fused_adam.py
│   │   ├── fused_lamb.py
│   │   ├── lr_scheduler/
│   │   │   ├── __init__.py
│   │   │   ├── cosine_lr_scheduler.py
│   │   │   ├── fairseq_lr_scheduler.py
│   │   │   ├── fixed_schedule.py
│   │   │   ├── inverse_square_root_schedule.py
│   │   │   ├── manual_lr_scheduler.py
│   │   │   ├── pass_through.py
│   │   │   ├── polynomial_decay_schedule.py
│   │   │   ├── reduce_lr_on_plateau.py
│   │   │   ├── step_lr_scheduler.py
│   │   │   ├── tri_stage_lr_scheduler.py
│   │   │   └── triangular_lr_scheduler.py
│   │   ├── nag.py
│   │   ├── sgd.py
│   │   └── shard.py
│   ├── options.py
│   ├── pdb.py
│   ├── quantization_utils.py
│   ├── registry.py
│   ├── scoring/
│   │   ├── __init__.py
│   │   ├── bertscore.py
│   │   ├── bleu.py
│   │   ├── chrf.py
│   │   ├── meteor.py
│   │   ├── tokenizer.py
│   │   └── wer.py
│   ├── search.py
│   ├── sequence_generator.py
│   ├── sequence_scorer.py
│   ├── speech_generator.py
│   ├── tasks/
│   │   ├── __init__.py
│   │   ├── audio_classification.py
│   │   ├── audio_finetuning.py
│   │   ├── audio_pretraining.py
│   │   ├── cross_lingual_lm.py
│   │   ├── denoising.py
│   │   ├── fairseq_task.py
│   │   ├── frm_text_to_speech.py
│   │   ├── hubert_pretraining.py
│   │   ├── language_modeling.py
│   │   ├── legacy_masked_lm.py
│   │   ├── masked_lm.py
│   │   ├── multilingual_denoising.py
│   │   ├── multilingual_language_modeling.py
│   │   ├── multilingual_masked_lm.py
│   │   ├── multilingual_translation.py
│   │   ├── multires_hubert_pretraining.py
│   │   ├── nlu_finetuning.py
│   │   ├── online_backtranslation.py
│   │   ├── semisupervised_translation.py
│   │   ├── sentence_prediction.py
│   │   ├── sentence_prediction_adapters.py
│   │   ├── sentence_ranking.py
│   │   ├── simultaneous_translation.py
│   │   ├── span_masked_lm.py
│   │   ├── speech_dlm_task.py
│   │   ├── speech_to_speech.py
│   │   ├── speech_to_text.py
│   │   ├── speech_ulm_task.py
│   │   ├── text_to_speech.py
│   │   ├── translation.py
│   │   ├── translation_from_pretrained_bart.py
│   │   ├── translation_from_pretrained_xlm.py
│   │   ├── translation_lev.py
│   │   └── translation_multi_simple_epoch.py
│   ├── token_generation_constraints.py
│   ├── tokenizer.py
│   ├── trainer.py
│   ├── utils.py
│   └── version.txt
├── fairseq_cli/
│   ├── __init__.py
│   ├── eval_lm.py
│   ├── generate.py
│   ├── hydra_train.py
│   ├── hydra_validate.py
│   ├── interactive.py
│   ├── preprocess.py
│   ├── score.py
│   ├── train.py
│   └── validate.py
├── hubconf.py
├── hydra_plugins/
│   └── dependency_submitit_launcher/
│       ├── hydra_plugins/
│       │   └── dependency_submitit_launcher/
│       │       ├── __init__.py
│       │       ├── config.py
│       │       └── launcher.py
│       └── setup.py
├── pyproject.toml
├── release_utils.py
├── scripts/
│   ├── __init__.py
│   ├── average_checkpoints.py
│   ├── build_sym_alignment.py
│   ├── check_installation.py
│   ├── compare_namespaces.py
│   ├── compound_split_bleu.sh
│   ├── constraints/
│   │   ├── extract.py
│   │   └── validate.py
│   ├── convert_dictionary.lua
│   ├── convert_model.lua
│   ├── count_docs.py
│   ├── read_binarized.py
│   ├── rm_pt.py
│   ├── sacrebleu.sh
│   ├── shard_docs.py
│   ├── split_train_valid_docs.py
│   ├── spm_decode.py
│   ├── spm_encode.py
│   ├── spm_train.py
│   └── test_fsdp.sh
├── setup.cfg
├── setup.py
├── tests/
│   ├── __init__.py
│   ├── distributed/
│   │   ├── __init__.py
│   │   ├── test_bmuf.py
│   │   ├── test_distributed_timeout_wrapper.py
│   │   ├── test_module_proxy_wrapper.py
│   │   ├── test_utils.py
│   │   └── utils.py
│   ├── gpu/
│   │   ├── __init__.py
│   │   ├── test_binaries_gpu.py
│   │   ├── test_ema_gpu.py
│   │   └── transformer_quantization_config.yaml
│   ├── speech/
│   │   ├── __init__.py
│   │   ├── test_convtransformer_simul_trans.py
│   │   ├── test_dual_input_wav_transformer.py
│   │   ├── test_dualinput_s2t_transformer.py
│   │   ├── test_fastspeech2.py
│   │   ├── test_s2s_transformer.py
│   │   ├── test_s2t_conformer.py
│   │   ├── test_s2t_transformer.py
│   │   ├── test_tts_transformer.py
│   │   ├── test_wav2vec2.py
│   │   └── test_xm_transformer.py
│   ├── speech_recognition/
│   │   ├── __init__.py
│   │   ├── asr_test_base.py
│   │   ├── test_collaters.py
│   │   ├── test_cross_entropy.py
│   │   ├── test_data_utils.py
│   │   └── test_vggtransformer.py
│   ├── tasks/
│   │   ├── test_denoising.py
│   │   ├── test_masked_lm.py
│   │   ├── test_multilingual_denoising.py
│   │   └── test_span_masked_lm.py
│   ├── test_activation_checkpointing.py
│   ├── test_amp_optimizer.py
│   ├── test_average_checkpoints.py
│   ├── test_backtranslation_dataset.py
│   ├── test_binaries.py
│   ├── test_binarizer.py
│   ├── test_character_token_embedder.py
│   ├── test_checkpoint_utils.py
│   ├── test_checkpoint_utils_for_task_level_attributes.py
│   ├── test_concat_dataset.py
│   ├── test_constraints.py
│   ├── test_convtbc.py
│   ├── test_data_utils.py
│   ├── test_dataclass_utils.py
│   ├── test_dataset.py
│   ├── test_dictionary.py
│   ├── test_ema.py
│   ├── test_espnet_multihead_attention.py
│   ├── test_export.py
│   ├── test_file_chunker_utils.py
│   ├── test_file_io.py
│   ├── test_fp16_optimizer.py
│   ├── test_hf_hub.py
│   ├── test_huffman.py
│   ├── test_inference_dropout.py
│   ├── test_iopath.py
│   ├── test_iterators.py
│   ├── test_label_smoothing.py
│   ├── test_lm_context_window.py
│   ├── test_lstm_jitable.py
│   ├── test_memory_efficient_fp16.py
│   ├── test_metrics.py
│   ├── test_multi_corpus_dataset.py
│   ├── test_multi_corpus_sampled_dataset.py
│   ├── test_multihead_attention.py
│   ├── test_noising.py
│   ├── test_online_backtranslation.py
│   ├── test_plasma_utils.py
│   ├── test_positional_encoding.py
│   ├── test_reproducibility.py
│   ├── test_resampling_dataset.py
│   ├── test_roberta.py
│   ├── test_rotary_positional_embedding.py
│   ├── test_sequence_generator.py
│   ├── test_sequence_scorer.py
│   ├── test_sparse_multihead_attention.py
│   ├── test_token_block_dataset.py
│   ├── test_train.py
│   ├── test_transformer.py
│   ├── test_utils.py
│   ├── test_valid_subset_checks.py
│   └── utils.py
└── train.py

SYMBOL INDEX (9202 symbols across 872 files)

FILE: examples/MMPT/locallaunch.py
  class JobLauncher (line 14) | class JobLauncher(object):
    method __init__ (line 19) | def __init__(self, yaml_file):
    method __call__ (line 31) | def __call__(self, job_type=None, dryrun=False):
  class Pipeline (line 40) | class Pipeline(object):
    method __init__ (line 43) | def __init__(self, fn):
    method __getitem__ (line 78) | def __getitem__(self, idx):
    method __len__ (line 84) | def __len__(self):
    method _save_configs (line 87) | def _save_configs(self, configs_to_save: dict):
    method _overwrite_task (line 95) | def _overwrite_task(self, job_config):
  function main (line 125) | def main(args):

FILE: examples/MMPT/mmpt/datasets/fairseqmmdataset.py
  class FairseqMMDataset (line 16) | class FairseqMMDataset(FairseqDataset):
    method __init__ (line 21) | def __init__(self, mmdataset):
    method set_epoch (line 26) | def set_epoch(self, epoch, **unused):
    method __getitem__ (line 30) | def __getitem__(self, idx):
    method __len__ (line 34) | def __len__(self):
    method collater (line 37) | def collater(self, samples):
    method size (line 51) | def size(self, index):
    method num_tokens (line 55) | def num_tokens(self, index):

FILE: examples/MMPT/mmpt/datasets/mmdataset.py
  class MMDataset (line 16) | class MMDataset(Dataset):
    method __init__ (line 30) | def __init__(
    method __len__ (line 43) | def __len__(self):
    method __getitem__ (line 46) | def __getitem__(self, idx):
    method collater (line 57) | def collater(self, samples):
    method print_example (line 79) | def print_example(self, output):

FILE: examples/MMPT/mmpt/evaluators/evaluator.py
  class Evaluator (line 13) | class Evaluator(object):
    method __init__ (line 20) | def __init__(self, config, eval_dataloader=None):
    method __call__ (line 31) | def __call__(self):
    method evaluate (line 48) | def evaluate(self, model, eval_dataloader=None, output_file="merged"):

FILE: examples/MMPT/mmpt/evaluators/metric.py
  class Metric (line 10) | class Metric(object):
    method __init__ (line 11) | def __init__(self, config, metric_names):
    method best_metric (line 14) | def best_metric(self, metric):
    method save_metrics (line 17) | def save_metrics(self, fn, metrics):
    method print_computed_metrics (line 21) | def print_computed_metrics(self, metrics):
  class RetrievalMetric (line 25) | class RetrievalMetric(Metric):
    method __init__ (line 33) | def __init__(self, config, metric_names=["R1", "R5", "R10", "MR"]):
    method compute_metrics (line 37) | def compute_metrics(self, outputs, texts, **kwargs):
    method print_computed_metrics (line 60) | def print_computed_metrics(self, metrics):
  class DiDeMoMetric (line 74) | class DiDeMoMetric(Metric):
    method __init__ (line 82) | def __init__(self, config, metric_names=["rank1", "rank5", "miou"]):
    method compute_metrics (line 85) | def compute_metrics(self, outputs, targets, **kwargs):
    method print_computed_metrics (line 95) | def print_computed_metrics(self, metrics):
    method _iou (line 109) | def _iou(self, pred, gt):
    method _rank (line 114) | def _rank(self, pred, gt):
    method _eval_predictions (line 117) | def _eval_predictions(self, segments, data):
  class NLGMetric (line 145) | class NLGMetric(Metric):
    method __init__ (line 146) | def __init__(
    method compute_metrics (line 159) | def compute_metrics(self, outputs, targets, **kwargs):
    method print_computed_metrics (line 163) | def print_computed_metrics(self, metrics):
  class QAMetric (line 179) | class QAMetric(Metric):
    method __init__ (line 180) | def __init__(
    method compute_metrics (line 187) | def compute_metrics(self, outputs, targets, **kwargs):
    method print_computed_metrics (line 191) | def print_computed_metrics(self, metrics):
  class COINActionSegmentationMetric (line 195) | class COINActionSegmentationMetric(Metric):
    method __init__ (line 205) | def __init__(self, config, metric_name=["frame_acc"]):
    method compute_metrics (line 208) | def compute_metrics(self, outputs, targets):
    method print_computed_metrics (line 215) | def print_computed_metrics(self, metrics):
  class CrossTaskMetric (line 220) | class CrossTaskMetric(Metric):
    method __init__ (line 221) | def __init__(self, config, metric_names=["recall"]):
    method compute_metrics (line 224) | def compute_metrics(self, outputs, targets, **kwargs):
    method print_computed_metrics (line 237) | def print_computed_metrics(self, metrics):
    method _get_recalls (line 244) | def _get_recalls(self, Y_true, Y_pred):
  class ActionRecognitionMetric (line 262) | class ActionRecognitionMetric(Metric):
    method __init__ (line 263) | def __init__(
    method compute_metrics (line 270) | def compute_metrics(self, outputs, targets, splits, **kwargs):
    method print_computed_metrics (line 305) | def print_computed_metrics(self, metrics):

FILE: examples/MMPT/mmpt/evaluators/predictor.py
  class Predictor (line 16) | class Predictor(object):
    method __init__ (line 21) | def __init__(self, config):
    method __call__ (line 27) | def __call__(self, outputs):
    method predict_loop (line 31) | def predict_loop(self, model, eval_dataloader, output_file=None):
    method finalize (line 44) | def finalize(self, output_file):
    method to_ctx (line 47) | def to_ctx(self, data, ctx=0, dtype=None):
  class NLGPredictor (line 59) | class NLGPredictor(Predictor):
    method __init__ (line 62) | def __init__(self, config):
    method predict_loop (line 72) | def predict_loop(self, model, eval_dataloader, output_file=None):
    method __call__ (line 84) | def __call__(self, data, model, outputs):
    method finalize (line 104) | def finalize(self, outputs, output_file=None):
  class RetrievalPredictor (line 112) | class RetrievalPredictor(Predictor):
    method __init__ (line 114) | def __init__(self, config):
    method predict_loop (line 120) | def predict_loop(
    method __call__ (line 153) | def __call__(self, sample, full_scores):
    method finalize (line 157) | def finalize(self, full_scores, texts, output_file=None):
    method _get_pooled_outputs (line 163) | def _get_pooled_outputs(self, outputs):
    method _append_scores (line 169) | def _append_scores(self, scores, full_scores):
    method _aggregate_scores (line 177) | def _aggregate_scores(self, scores):
  class QAPredictor (line 186) | class QAPredictor(Predictor):
    method __init__ (line 188) | def __init__(self, config):
    method predict_loop (line 192) | def predict_loop(self, model, eval_dataloader, output_file="qa.npy"):
    method __call__ (line 215) | def __call__(self, sample):
    method finalize (line 223) | def finalize(self, output_file=None):
    method _append_scores (line 229) | def _append_scores(self, scores, answers, full_scores):
    method _aggregate_scores (line 236) | def _aggregate_scores(self, scores):
  class CrossTaskPredictor (line 245) | class CrossTaskPredictor(Predictor):
    method __init__ (line 250) | def __init__(self, config):
    method predict_loop (line 258) | def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
    method __call__ (line 273) | def __call__(self, sample, model, Y_pred, Y_true):
    method finalize (line 316) | def finalize(self, Y_pred, Y_true, output_file=None):
    method _read_assignment (line 326) | def _read_assignment(self, T, K, path):
  class COINPredictor (line 350) | class COINPredictor(Predictor):
    method __init__ (line 354) | def __init__(self, config):
    method predict_loop (line 360) | def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
    method __call__ (line 375) | def __call__(self, sample, model, Y_pred, Y_true):
    method _merge_windows (line 383) | def _merge_windows(self, sample, output):
    method finalize (line 410) | def finalize(self, Y_pred, Y_true, output_file=None):
  class COINZSPredictor (line 429) | class COINZSPredictor(COINPredictor):
    method __init__ (line 434) | def __init__(self, config):
    method predict_loop (line 438) | def predict_loop(self, model, eval_dataloader, output_file="result.pkl"):
    method reshape_subsample (line 465) | def reshape_subsample(self, sample):
    method flat_subsample (line 471) | def flat_subsample(self, tensor):
    method __call__ (line 476) | def __call__(self, sample, label_hidden_states, model, lbd, Y_pred, Y_...
    method finalize (line 495) | def finalize(self, Y_pred, Y_true, output_file=None):
  class DiDeMoPredictor (line 514) | class DiDeMoPredictor(Predictor):
    method __init__ (line 518) | def __init__(self, config):
    method predict_loop (line 524) | def predict_loop(self, model, eval_dataloader, output_file="didemo.npy"):
    method __call__ (line 555) | def __call__(self, sample):
    method finalize (line 587) | def finalize(self, output_file=None):
    method _aggregate_scores (line 593) | def _aggregate_scores(self, scores):

FILE: examples/MMPT/mmpt/losses/fairseqmmloss.py
  class MMCriterion (line 15) | class MMCriterion(FairseqCriterion):
    method __init__ (line 16) | def __init__(self, task):
    method forward (line 21) | def forward(self, model, sample):
    method reduce_metrics (line 48) | def reduce_metrics(logging_outputs) -> None:
    method logging_outputs_can_be_summed (line 57) | def logging_outputs_can_be_summed() -> bool:

FILE: examples/MMPT/mmpt/losses/loss.py
  class Loss (line 8) | class Loss(object):
    method __call__ (line 9) | def __call__(self, *args, **kwargs):
  class DummyLoss (line 14) | class DummyLoss(Loss):
    method __init__ (line 15) | def __init__(self):
    method __call__ (line 18) | def __call__(self, logits, targets, **kwargs):
  class DummyK400Loss (line 22) | class DummyK400Loss(Loss):
    method __init__ (line 24) | def __init__(self):
    method __call__ (line 27) | def __call__(self, logits, targets, **kwargs):
  class CrossEntropy (line 32) | class CrossEntropy(Loss):
    method __init__ (line 33) | def __init__(self):
    method __call__ (line 36) | def __call__(self, logits, targets, **kwargs):
  class ArgmaxCrossEntropy (line 40) | class ArgmaxCrossEntropy(Loss):
    method __init__ (line 41) | def __init__(self):
    method __call__ (line 44) | def __call__(self, logits, targets, **kwargs):
  class BCE (line 48) | class BCE(Loss):
    method __init__ (line 49) | def __init__(self):
    method __call__ (line 52) | def __call__(self, logits, targets, **kwargs):
  class NLGLoss (line 57) | class NLGLoss(Loss):
    method __init__ (line 58) | def __init__(self):
    method __call__ (line 61) | def __call__(self, logits, text_label, **kwargs):
  class MSE (line 66) | class MSE(Loss):
    method __init__ (line 67) | def __init__(self):
    method __call__ (line 70) | def __call__(self, logits, targets, **kwargs):
  class L1 (line 74) | class L1(Loss):
    method __init__ (line 75) | def __init__(self):
    method __call__ (line 78) | def __call__(self, logits, targets, **kwargs):
  class SmoothL1 (line 82) | class SmoothL1(Loss):
    method __init__ (line 83) | def __init__(self):
    method __call__ (line 86) | def __call__(self, logits, targets, **kwargs):

FILE: examples/MMPT/mmpt/losses/nce.py
  class NCE (line 17) | class NCE(Loss):
    method __init__ (line 18) | def __init__(self):
    method __call__ (line 22) | def __call__(self, align_scores, **kargs):
  class T2VContraLoss (line 42) | class T2VContraLoss(Loss):
    method __init__ (line 45) | def __init__(self):
    method __call__ (line 49) | def __call__(self, pooled_video, pooled_text, **kargs):
  class V2TContraLoss (line 59) | class V2TContraLoss(Loss):
    method __init__ (line 62) | def __init__(self):
    method __call__ (line 66) | def __call__(self, pooled_video, pooled_text, **kargs):
  class MMContraLoss (line 76) | class MMContraLoss(Loss):
    method __init__ (line 77) | def __init__(self):
    method __call__ (line 80) | def __call__(self, pooled_video, pooled_text, **kwargs):
  class MTM (line 93) | class MTM(Loss):
    method __init__ (line 96) | def __init__(self):
    method __call__ (line 99) | def __call__(
  class MFMMLM (line 129) | class MFMMLM(Loss):
    method __init__ (line 132) | def __init__(self):
    method __call__ (line 135) | def __call__(

FILE: examples/MMPT/mmpt/models/fairseqmmmodel.py
  class FairseqMMModel (line 14) | class FairseqMMModel(BaseFairseqModel):
    method build_model (line 18) | def build_model(cls, args, task):
    method __init__ (line 21) | def __init__(self, mmmodel):
    method forward (line 25) | def forward(self, *args, **kwargs):
    method upgrade_state_dict_named (line 28) | def upgrade_state_dict_named(self, state_dict, name):
  function mmarch (line 50) | def mmarch(args):

FILE: examples/MMPT/mmpt/models/mmfusion.py
  class MMPTModel (line 31) | class MMPTModel(nn.Module):
    method from_pretrained (line 35) | def from_pretrained(cls, config, checkpoint="checkpoint_best.pt"):
    method __init__ (line 60) | def __init__(self, config, model, video_encoder, **kwargs):
    method forward (line 66) | def forward(self, video_frames, caps, cmasks, return_score=False):
  class MMFusion (line 92) | class MMFusion(nn.Module):
    method __init__ (line 96) | def __init__(self, config, **kwargs):
    method forward (line 137) | def forward(
    method _mm_on_the_fly (line 149) | def _mm_on_the_fly(
    method _mm_attention_mask (line 180) | def _mm_attention_mask(self, cmasks, vmasks):
    method _make_iso_mask (line 212) | def _make_iso_mask(self, batch_size, cmasks, vmasks):
    method _pooling_vt_layer (line 258) | def _pooling_vt_layer(
  class MMFusionMFMMLM (line 297) | class MMFusionMFMMLM(MMFusion):
    method forward (line 299) | def forward(
  class MMFusionMTM (line 351) | class MMFusionMTM(MMFusionMFMMLM):
    method __init__ (line 352) | def __init__(self, config, **kwargs):
  class MMFusionShare (line 366) | class MMFusionShare(MMFusion):
    method forward (line 370) | def forward(
    method forward_video (line 398) | def forward_video(
    method forward_text (line 455) | def forward_text(
  class MMFusionSeparate (line 514) | class MMFusionSeparate(MMFusionShare):
    method forward_video (line 515) | def forward_video(
    method forward_text (line 572) | def forward_text(
  class MMFusionJoint (line 624) | class MMFusionJoint(MMFusion):
    method forward (line 627) | def forward(
  class MMFusionActionSegmentation (line 663) | class MMFusionActionSegmentation(MMFusion):
    method forward (line 667) | def forward(
  class MMFusionActionLocalization (line 705) | class MMFusionActionLocalization(MMFusion):
    method __init__ (line 708) | def __init__(self, config, **kwargs):
    method forward (line 716) | def forward(
  class MMFusionSeparateActionSegmentation (line 791) | class MMFusionSeparateActionSegmentation(MMFusionSeparate):
    method forward (line 793) | def forward(
  class MMFusionSeparateActionLocalization (line 817) | class MMFusionSeparateActionLocalization(MMFusionSeparate):
    method __init__ (line 818) | def __init__(self, config, **kwargs):
    method forward (line 826) | def forward(
  class MMFusionShareActionLocalization (line 873) | class MMFusionShareActionLocalization(MMFusionShare):
    method __init__ (line 874) | def __init__(self, config, **kwargs):
    method forward (line 882) | def forward(

FILE: examples/MMPT/mmpt/models/mmfusionnlg.py
  class MMFusionNLG (line 43) | class MMFusionNLG(MMFusion):
    method __init__ (line 44) | def __init__(self, config, **kwargs):
    method forward (line 57) | def forward(
    method generate (line 82) | def generate(
  class MMBertForNLG (line 113) | class MMBertForNLG(BertPreTrainedModel):
    method __init__ (line 114) | def __init__(self, config):
    method get_output_embeddings (line 124) | def get_output_embeddings(self):
    method forward (line 127) | def forward(
    method prepare_inputs_for_generation (line 192) | def prepare_inputs_for_generation(
    method generate (line 220) | def generate(
    method _generate_beam_search (line 639) | def _generate_beam_search(
    method _generate_no_beam_search (line 899) | def _generate_no_beam_search(

FILE: examples/MMPT/mmpt/models/transformermodel.py
  class MMBertForJoint (line 36) | class MMBertForJoint(BertPreTrainedModel):
    method __init__ (line 39) | def __init__(self, config):
    method forward (line 45) | def forward(
  class MMBertForTokenClassification (line 83) | class MMBertForTokenClassification(BertPreTrainedModel):
    method __init__ (line 87) | def __init__(self, config):
    method forward (line 96) | def forward(
  class MMBertForEncoder (line 136) | class MMBertForEncoder(BertPreTrainedModel):
    method __init__ (line 138) | def __init__(self, config):
    method forward (line 144) | def forward(
  class MMBertForMFMMLM (line 181) | class MMBertForMFMMLM(BertPreTrainedModel):
    method __init__ (line 183) | def __init__(self, config):
    method get_output_embeddings (line 191) | def get_output_embeddings(self):
    method forward (line 194) | def forward(
  class BertMFMMLMPredictionHead (line 282) | class BertMFMMLMPredictionHead(nn.Module):
    method __init__ (line 283) | def __init__(self, config):
    method forward (line 297) | def forward(
  class MFMMLMHead (line 325) | class MFMMLMHead(nn.Module):
    method __init__ (line 326) | def __init__(self, config):
    method forward (line 330) | def forward(
  class MMBertForMTM (line 346) | class MMBertForMTM(MMBertForMFMMLM):
    method __init__ (line 347) | def __init__(self, config):
  class BertMTMPredictionHead (line 356) | class BertMTMPredictionHead(nn.Module):
    method __init__ (line 357) | def __init__(self, config):
    method forward (line 363) | def forward(
  class MTMHead (line 406) | class MTMHead(nn.Module):
    method __init__ (line 407) | def __init__(self, config):
    method forward (line 411) | def forward(
  class MMBertModel (line 427) | class MMBertModel(BertModel):
    method __init__ (line 430) | def __init__(self, config, add_pooling_layer=True):
    method forward (line 437) | def forward(
    method get_extended_attention_mask (line 631) | def get_extended_attention_mask(self, attention_mask, input_shape, dev...
  class MultiLayerAttentionMaskBertEncoder (line 670) | class MultiLayerAttentionMaskBertEncoder(BertEncoder):
    method forward (line 674) | def forward(

FILE: examples/MMPT/mmpt/modules/mm.py
  class VideoTokenMLP (line 32) | class VideoTokenMLP(nn.Module):
    method __init__ (line 33) | def __init__(self, config):
    method forward (line 41) | def forward(self, hidden_states):
  class MMBertEmbeddings (line 49) | class MMBertEmbeddings(BertEmbeddings):
    method __init__ (line 50) | def __init__(self, config):
    method forward (line 60) | def forward(
  class AlignHead (line 136) | class AlignHead(nn.Module):
    method __init__ (line 139) | def __init__(self, config):
    method forward (line 143) | def forward(self, dropout_pooled_output):

FILE: examples/MMPT/mmpt/modules/retri.py
  class VectorRetriever (line 20) | class VectorRetriever(object):
    method __init__ (line 27) | def __init__(self, hidden_size, cent, db_type, examples_per_cent_to_tr...
    method make_direct_maps (line 45) | def make_direct_maps(self):
    method __len__ (line 48) | def __len__(self):
    method save (line 51) | def save(self, out_dir):
    method load (line 65) | def load(self, out_dir):
    method add (line 72) | def add(self, hidden_states, video_ids, last=False):
    method finalize_training (line 95) | def finalize_training(self):
    method search (line 107) | def search(
    method search_by_video_ids (line 139) | def search_by_video_ids(
  class VectorRetrieverDM (line 187) | class VectorRetrieverDM(VectorRetriever):
    method __init__ (line 195) | def __init__(
    method make_direct_maps (line 206) | def make_direct_maps(self):
    method search (line 210) | def search(
    method search_by_video_ids (line 244) | def search_by_video_ids(
  class MMVectorRetriever (line 291) | class MMVectorRetriever(VectorRetrieverDM):
    method __init__ (line 297) | def __init__(self, hidden_size, cent, db_type, examples_per_cent_to_tr...
    method __len__ (line 307) | def __len__(self):
    method make_direct_maps (line 311) | def make_direct_maps(self):
    method save (line 315) | def save(self, out_dir):
    method load (line 334) | def load(self, out_dir):
    method add (line 345) | def add(self, hidden_states, video_ids):
    method get_clips_by_video_id (line 376) | def get_clips_by_video_id(self, video_id):
    method search (line 383) | def search(

FILE: examples/MMPT/mmpt/modules/vectorpool.py
  class VectorPool (line 12) | class VectorPool(object):
    method __init__ (line 17) | def __init__(self, config):
    method __call__ (line 23) | def __call__(self, sample, **kwargs):
    method build_retriver (line 26) | def build_retriver(
    method __repr__ (line 40) | def __repr__(self):
  class VideoVectorPool (line 49) | class VideoVectorPool(VectorPool):
    method __init__ (line 53) | def __init__(self, config):
    method __call__ (line 57) | def __call__(self, sample, subsampling, **kwargs):
  class DistributedVectorPool (line 78) | class DistributedVectorPool(VectorPool):
    method __init__ (line 82) | def __init__(self, config):
    method build_retriver (line 91) | def build_retriver(
    method load (line 122) | def load(self, local_rank):
    method save (line 137) | def save(self):
  class DistributedVideoVectorPool (line 164) | class DistributedVideoVectorPool(DistributedVectorPool):
    method __call__ (line 168) | def __call__(self, sample, subsampling, **kwargs):
  class TextClipVectorPool (line 189) | class TextClipVectorPool(VectorPool):
    method __init__ (line 190) | def __init__(self, config):
    method __call__ (line 197) | def __call__(self, sample, **kwargs):
  class MMClipVectorPool (line 212) | class MMClipVectorPool(VectorPool):
    method __init__ (line 216) | def __init__(self, out_dir):
    method __call__ (line 221) | def __call__(self, sample, **kwargs):

FILE: examples/MMPT/mmpt/processors/dedupprocessor.py
  class CaptionDedupProcessor (line 14) | class CaptionDedupProcessor(object):
    method __init__ (line 38) | def __init__(self, pkl_file):
    method __call__ (line 49) | def __call__(self):
    method single (line 58) | def single(self, video_id):
    method finalize (line 74) | def finalize(self, tgt_fn):
    method save_stat (line 78) | def save_stat(self, video_id, caption):
    method print_stat (line 106) | def print_stat(self):
    method _dedup (line 119) | def _dedup(self, caption):
  function convert_to_pickle (line 221) | def convert_to_pickle(src_fn, tgt_fn):

FILE: examples/MMPT/mmpt/processors/dsprocessor.py
  class DSAligner (line 31) | class DSAligner(Aligner):
    method __call__ (line 36) | def __call__(self, video_id, video_feature, text_feature, wps=0.7):
  class NLGTextProcessor (line 66) | class NLGTextProcessor(TextProcessor):
    method __call__ (line 70) | def __call__(self, text_id):
  class DSNLGAligner (line 74) | class DSNLGAligner(DSAligner):
    method __init__ (line 76) | def __init__(self, config):
    method __call__ (line 89) | def __call__(self, video_id, video_feature, text_feature):
  class MSRVTTMetaProcessor (line 120) | class MSRVTTMetaProcessor(MetaProcessor):
    method __init__ (line 125) | def __init__(self, config):
    method __len__ (line 146) | def __len__(self):
    method __getitem__ (line 149) | def __getitem__(self, idx):
  class MSRVTTTextProcessor (line 160) | class MSRVTTTextProcessor(TextProcessor):
    method __init__ (line 166) | def __init__(self, config):
    method __call__ (line 176) | def __call__(self, text_id):
  class MSRVTTNLGTextProcessor (line 186) | class MSRVTTNLGTextProcessor(MSRVTTTextProcessor):
    method __call__ (line 188) | def __call__(self, text_id):
  class MSRVTTQAMetaProcessor (line 198) | class MSRVTTQAMetaProcessor(MetaProcessor):
    method __init__ (line 204) | def __init__(self, config):
    method __len__ (line 221) | def __len__(self):
    method __getitem__ (line 224) | def __getitem__(self, idx):
  class MSRVTTQATextProcessor (line 228) | class MSRVTTQATextProcessor(TextProcessor):
    method __call__ (line 233) | def __call__(self, text_ans):
  class MSRVTTQAAligner (line 240) | class MSRVTTQAAligner(DSAligner):
    method __call__ (line 246) | def __call__(self, video_id, video_feature, text_feature, wps=0.7):
  class YoucookMetaProcessor (line 266) | class YoucookMetaProcessor(MetaProcessor):
    method __init__ (line 279) | def __init__(self, config):
    method __getitem__ (line 312) | def __getitem__(self, idx):
  class YoucookVideoProcessor (line 330) | class YoucookVideoProcessor(VideoProcessor):
    method __call__ (line 333) | def __call__(self, video_fn):
  class YoucookNLGMetaProcessor (line 339) | class YoucookNLGMetaProcessor(MetaProcessor):
    method __init__ (line 344) | def __init__(self, config):
    method __getitem__ (line 372) | def __getitem__(self, idx):
  class CrossTaskMetaProcessor (line 378) | class CrossTaskMetaProcessor(MetaProcessor):
    method __init__ (line 379) | def __init__(self, config):
    method __getitem__ (line 436) | def __getitem__(self, idx):
    method __len__ (line 443) | def __len__(self):
    method _random_split (line 446) | def _random_split(self, task_vids, test_tasks, n_train):
    method _get_vids (line 459) | def _get_vids(self, path, vfeat_dir, annotation_path):
    method _read_task_info (line 484) | def _read_task_info(self, path):
    method _get_A (line 506) | def _get_A(self, task_steps, share="words"):
  class CrossTaskVideoProcessor (line 546) | class CrossTaskVideoProcessor(VideoProcessor):
    method __call__ (line 547) | def __call__(self, video_fn):
  class CrossTaskTextProcessor (line 554) | class CrossTaskTextProcessor(TextProcessor):
    method __call__ (line 555) | def __call__(self, text_id):
  class CrossTaskAligner (line 565) | class CrossTaskAligner(Aligner):
    method __init__ (line 569) | def __init__(self, config):
    method __call__ (line 575) | def __call__(self, video_id, video_feature, text_feature):
    method _read_assignment (line 635) | def _read_assignment(self, T, K, path):
  class MetaTextBinarizer (line 661) | class MetaTextBinarizer(Aligner):
    method __call__ (line 662) | def __call__(self, text_feature):
  class COINActionSegmentationMetaProcessor (line 676) | class COINActionSegmentationMetaProcessor(MetaProcessor):
    method __init__ (line 683) | def __init__(self, config):
    method meta_text_labels (line 718) | def meta_text_labels(self, config):
    method __getitem__ (line 736) | def __getitem__(self, idx):
  class COINActionSegmentationTextProcessor (line 740) | class COINActionSegmentationTextProcessor(TextProcessor):
    method __call__ (line 741) | def __call__(self, text_label):
  class COINActionSegmentationAligner (line 745) | class COINActionSegmentationAligner(Aligner):
    method __init__ (line 746) | def __init__(self, config):
    method __call__ (line 751) | def __call__(self, video_id, video_feature, text_feature):
  class DiDeMoMetaProcessor (line 808) | class DiDeMoMetaProcessor(MetaProcessor):
    method __init__ (line 812) | def __init__(self, config):
    method __len__ (line 825) | def __len__(self):
    method __getitem__ (line 828) | def __getitem__(self, idx):
  class DiDeMoTextProcessor (line 832) | class DiDeMoTextProcessor(TextProcessor):
    method __call__ (line 837) | def __call__(self, text):
  class DiDeMoAligner (line 841) | class DiDeMoAligner(DSAligner):
    method __call__ (line 846) | def __call__(self, video_id, video_feature, text_feature):

FILE: examples/MMPT/mmpt/processors/how2processor.py
  class How2MetaProcessor (line 39) | class How2MetaProcessor(MetaProcessor):
    method __init__ (line 40) | def __init__(self, config):
    method __getitem__ (line 46) | def __getitem__(self, idx):
  class ShardedHow2MetaProcessor (line 51) | class ShardedHow2MetaProcessor(How2MetaProcessor):
    method __init__ (line 52) | def __init__(self, config):
    method _init_shard (line 58) | def _init_shard(self):
    method __getitem__ (line 80) | def __getitem__(self, idx):
  class ShardedVideoProcessor (line 87) | class ShardedVideoProcessor(Processor):
    method __init__ (line 92) | def __init__(self, config):
    method __call__ (line 96) | def __call__(self, video_id):
  class ShardedTextProcessor (line 119) | class ShardedTextProcessor(Processor):
    method __init__ (line 120) | def __init__(self, config):
    method __call__ (line 124) | def __call__(self, video_id):
  class FixedLenAligner (line 147) | class FixedLenAligner(Aligner):
    method __init__ (line 165) | def __init__(self, config):
    method _get_text_maxlen (line 182) | def _get_text_maxlen(self):
    method __call__ (line 186) | def __call__(self, video_id, video_feature, text_feature):
    method sampling (line 216) | def sampling(
  class VariedLenAligner (line 269) | class VariedLenAligner(FixedLenAligner):
    method __init__ (line 270) | def __init__(self, config):
    method _get_text_maxlen (line 275) | def _get_text_maxlen(self):
  class StartClipAligner (line 279) | class StartClipAligner(VariedLenAligner):
    method sampling (line 280) | def sampling(
  class OverlappedAligner (line 292) | class OverlappedAligner(VariedLenAligner):
    method __init__ (line 295) | def __init__(self, config):
    method _get_video_maxlen (line 302) | def _get_video_maxlen(self):
    method sampling (line 306) | def sampling(
  class MFMMLMAligner (line 361) | class MFMMLMAligner(FixedLenAligner):
    method __init__ (line 366) | def __init__(self, config):
    method __call__ (line 385) | def __call__(self, video_id, video_feature, text_feature):
    method sampling (line 413) | def sampling(
  class FrameMaskingProcessor (line 452) | class FrameMaskingProcessor(Processor):
    method __init__ (line 453) | def __init__(self, config):
    method __call__ (line 458) | def __call__(self, vmasks, modality_masking=None, vfeats=None):
  class TextGenerationProcessor (line 487) | class TextGenerationProcessor(Processor):
    method __init__ (line 488) | def __init__(self, tokenizer):
    method __call__ (line 492) | def __call__(self, inputs):
  class TextMaskingProcessor (line 507) | class TextMaskingProcessor(Processor):
    method __init__ (line 508) | def __init__(self, config):
    method __call__ (line 522) | def __call__(
    method mask_input (line 602) | def mask_input(self, inputs, special_tokens_mask=None):
    method get_special_tokens_mask (line 636) | def get_special_tokens_mask(
  class TextClipSamplingProcessor (line 664) | class TextClipSamplingProcessor(Processor):
    method __init__ (line 665) | def __init__(self, max_text_len, keep_prob=1.0):
    method __call__ (line 670) | def __call__(
  class VideoClipSamplingProcessor (line 738) | class VideoClipSamplingProcessor(Processor):
    method __call__ (line 739) | def __call__(self, video_len, max_video_len, center):
  class How2MILNCEAligner (line 762) | class How2MILNCEAligner(FixedLenAligner):
    method __init__ (line 765) | def __init__(self, config):
    method sampling (line 773) | def sampling(
    method _get_video (line 807) | def _get_video(self, video_feature, start, end):
    method _get_text (line 812) | def _get_text(self, cap):
    method _find_nearest_candidates (line 831) | def _find_nearest_candidates(self, caption, ind):
  class PKLJSONStrTextProcessor (line 853) | class PKLJSONStrTextProcessor(TextProcessor):
    method __init__ (line 859) | def __init__(self, config, max_clip_text_len=96):
    method __call__ (line 870) | def __call__(self, video_id):

FILE: examples/MMPT/mmpt/processors/how2retriprocessor.py
  class ShardedHow2VideoRetriMetaProcessor (line 15) | class ShardedHow2VideoRetriMetaProcessor(ShardedHow2MetaProcessor):
    method __init__ (line 16) | def __init__(self, config):
    method __len__ (line 24) | def __len__(self):
    method set_candidates (line 27) | def set_candidates(self, cands):
    method __getitem__ (line 33) | def __getitem__(self, idx):
  class ShardedVideoRetriVideoProcessor (line 43) | class ShardedVideoRetriVideoProcessor(ShardedVideoProcessor):
    method __call__ (line 47) | def __call__(self, sharded_video_idxs):
  class ShardedVideoRetriTextProcessor (line 56) | class ShardedVideoRetriTextProcessor(ShardedTextProcessor):
    method __call__ (line 60) | def __call__(self, sharded_video_idxs):
  class VideoRetriAligner (line 69) | class VideoRetriAligner(VariedLenAligner):
    method __call__ (line 71) | def __call__(self, sharded_video_idxs, video_features, text_features):
  class VideoRetriOverlappedAligner (line 86) | class VideoRetriOverlappedAligner(OverlappedAligner):
    method __call__ (line 88) | def __call__(self, sharded_video_idxs, video_features, text_features):

FILE: examples/MMPT/mmpt/processors/models/s3dg.py
  class InceptionBlock (line 30) | class InceptionBlock(nn.Module):
    method __init__ (line 31) | def __init__(
    method forward (line 64) | def forward(self, input):
  class SelfGating (line 82) | class SelfGating(nn.Module):
    method __init__ (line 83) | def __init__(self, input_dim):
    method forward (line 87) | def forward(self, input_tensor):
  class STConv3D (line 96) | class STConv3D(nn.Module):
    method __init__ (line 97) | def __init__(
    method forward (line 149) | def forward(self, input):
  class MaxPool3dTFPadding (line 156) | class MaxPool3dTFPadding(th.nn.Module):
    method __init__ (line 157) | def __init__(self, kernel_size, stride=None, padding="SAME"):
    method _get_padding_shape (line 165) | def _get_padding_shape(self, filter_shape, stride):
    method forward (line 183) | def forward(self, inp):
  class Sentence_Embedding (line 189) | class Sentence_Embedding(nn.Module):
    method __init__ (line 190) | def __init__(
    method _zero_pad_tensor_token (line 209) | def _zero_pad_tensor_token(self, tensor, size):
    method _split_text (line 216) | def _split_text(self, sentence):
    method _words_to_token (line 220) | def _words_to_token(self, words):
    method _words_to_ids (line 230) | def _words_to_ids(self, x):
    method forward (line 234) | def forward(self, x):
  class S3D (line 243) | class S3D(nn.Module):
    method __init__ (line 244) | def __init__(self, dict_path, num_classes=512, gating=True, space_to_d...
    method _space_to_depth (line 301) | def _space_to_depth(self, input):
    method forward (line 310) | def forward(self, inputs):

FILE: examples/MMPT/mmpt/processors/processor.py
  class Processor (line 8) | class Processor(object):
    method __call__ (line 13) | def __call__(self, **kwargs):
  class MetaProcessor (line 17) | class MetaProcessor(Processor):
    method __init__ (line 24) | def __init__(self, config):
    method __len__ (line 27) | def __len__(self):
    method __getitem__ (line 30) | def __getitem__(self, idx):
    method _get_split_path (line 33) | def _get_split_path(self, config):
  class TextProcessor (line 44) | class TextProcessor(Processor):
    method __init__ (line 53) | def __init__(self, config):
    method __call__ (line 61) | def __call__(self, text_id):
  class VideoProcessor (line 66) | class VideoProcessor(Processor):
    method __init__ (line 71) | def __init__(self, config):
    method __call__ (line 74) | def __call__(self, video_fn):
  class Aligner (line 83) | class Aligner(object):
    method __init__ (line 87) | def __init__(self, config):
    method __call__ (line 101) | def __call__(self, video_id, video_feature, text_feature):
    method _build_video_seq (line 104) | def _build_video_seq(self, video_feature, video_clips=None):
    method _build_text_seq (line 138) | def _build_text_seq(self, text_feature, text_clip_indexs=None):
    method batch_post_processing (line 165) | def batch_post_processing(self, batch, video_feature):
  class MMAttentionMask2DProcessor (line 169) | class MMAttentionMask2DProcessor(Processor):
    method __call__ (line 173) | def __call__(self, vmask, cmask, mtype):
    method _build_mm_mask (line 181) | def _build_mm_mask(self, vmask, cmask):
    method _build_videogeneration_mask (line 185) | def _build_videogeneration_mask(self, vmask, cmask):
    method _build_textgeneration_mask (line 233) | def _build_textgeneration_mask(self, vmask, cmask):

FILE: examples/MMPT/mmpt/tasks/fairseqmmtask.py
  class FairseqMMTask (line 20) | class FairseqMMTask(LegacyFairseqTask):
    method add_args (line 22) | def add_args(parser):
    method setup_task (line 32) | def setup_task(cls, args, **kwargs):
    method __init__ (line 35) | def __init__(self, args):
    method load_dataset (line 43) | def load_dataset(self, split, **kwargs):
    method get_batch_iterator (line 54) | def get_batch_iterator(
    method source_dictionary (line 99) | def source_dictionary(self):
    method target_dictionary (line 103) | def target_dictionary(self):

FILE: examples/MMPT/mmpt/tasks/milncetask.py
  class MILNCETask (line 11) | class MILNCETask(Task):
    method reshape_subsample (line 12) | def reshape_subsample(self, sample):

FILE: examples/MMPT/mmpt/tasks/retritask.py
  class RetriTask (line 28) | class RetriTask(Task):
    method reshape_subsample (line 31) | def reshape_subsample(self, sample):
    method flat_subsample (line 37) | def flat_subsample(self, tensor):
    method build_dataloader (line 42) | def build_dataloader(self):
    method retrive_candidates (line 73) | def retrive_candidates(self, epoch, dataloader=None):
  class VideoRetriTask (line 105) | class VideoRetriTask(RetriTask):
    method reshape_subsample (line 108) | def reshape_subsample(self, sample):
    method flat_subsample (line 119) | def flat_subsample(self, tensor):
    method _retri_predict (line 124) | def _retri_predict(self, epoch, dataloader):
    method _retri_sync (line 139) | def _retri_sync(self, epoch, out_dir):
  class VideoPredictor (line 154) | class VideoPredictor(Predictor):
    method __init__ (line 155) | def __init__(self, config):
    method predict_loop (line 159) | def predict_loop(
    method __call__ (line 174) | def __call__(self, sample, model, **kwargs):
    method finalize (line 195) | def finalize(self):
  class VideoRetriPredictor (line 202) | class VideoRetriPredictor(Predictor):
    method __init__ (line 208) | def __init__(self, config):
    method predict_loop (line 215) | def predict_loop(
    method finalize (line 247) | def finalize(self, batched_videos, epoch):

FILE: examples/MMPT/mmpt/tasks/task.py
  class Task (line 14) | class Task(object):
    method config_task (line 20) | def config_task(cls, config):
    method __init__ (line 32) | def __init__(self, config):
    method build_dataset (line 42) | def build_dataset(self):
    method build_model (line 96) | def build_model(self, checkpoint=None):
    method load_checkpoint (line 104) | def load_checkpoint(self, checkpoint):
    method _trim_state_dict (line 115) | def _trim_state_dict(self, state_dict):
    method build_loss (line 133) | def build_loss(self):
    method flat_subsample (line 139) | def flat_subsample(self, tensor):
    method reshape_subsample (line 150) | def reshape_subsample(self, sample):
    method __call__ (line 161) | def __call__(self, model, sample):
    method build_dataloader (line 182) | def build_dataloader(self):

FILE: examples/MMPT/mmpt/tasks/vlmtask.py
  class VLMTask (line 10) | class VLMTask(Task):
    method flat_subsample (line 17) | def flat_subsample(self, tensor):

FILE: examples/MMPT/mmpt/utils/__init__.py
  function set_seed (line 13) | def set_seed(seed=43211):
  function get_world_size (line 23) | def get_world_size():
  function get_local_rank (line 31) | def get_local_rank():
  function print_on_rank0 (line 36) | def print_on_rank0(func):
  class RetriMeter (line 42) | class RetriMeter(object):
    method __init__ (line 46) | def __init__(self, freq=1024):
    method __call__ (line 52) | def __call__(self, data):
    method __repr__ (line 66) | def __repr__(self):

FILE: examples/MMPT/mmpt/utils/load_config.py
  function load_config (line 10) | def load_config(args=None, config_file=None, overwrite_fairseq=False):
  function recursive_config (line 55) | def recursive_config(config_path):
  function suffix_rundir (line 66) | def suffix_rundir(save_dir, run_dir):
  function overwrite_dir (line 76) | def overwrite_dir(config, replace, basedir):

FILE: examples/MMPT/mmpt/utils/shardedtensor.py
  class ShardedTensor (line 10) | class ShardedTensor(object):
    method __init__ (line 11) | def __init__(self, data, starts):
    method from_list (line 20) | def from_list(xs):
    method __getitem__ (line 29) | def __getitem__(self, i):
    method __len__ (line 32) | def __len__(self):
    method lengths (line 35) | def lengths(self):
    method save (line 38) | def save(self, path):
    method load (line 43) | def load(path, mmap_mode=None):

FILE: examples/MMPT/mmpt_cli/localjob.py
  class BaseJob (line 10) | class BaseJob(object):
    method __init__ (line 11) | def __init__(self, yaml_file, dryrun=False):
    method submit (line 16) | def submit(self, **kwargs):
    method _normalize_cmd (line 19) | def _normalize_cmd(self, cmd_list):
  class LocalJob (line 26) | class LocalJob(BaseJob):
    method __init__ (line 49) | def __init__(self, yaml_file, job_type=None, dryrun=False):
    method submit (line 61) | def submit(self):
  class JobStatus (line 91) | class JobStatus(object):
    method __init__ (line 92) | def __init__(self, job_id):
    method __repr__ (line 95) | def __repr__(self):
    method __str__ (line 98) | def __str__(self):
    method done (line 101) | def done(self):
    method running (line 104) | def running(self):
    method result (line 107) | def result(self):
    method stderr (line 113) | def stderr(self):
    method stdout (line 116) | def stdout(self):

FILE: examples/MMPT/mmpt_cli/predict.py
  function get_dataloader (line 22) | def get_dataloader(config):
  function main (line 53) | def main(args):

FILE: examples/MMPT/scripts/text_token_extractor/pretokenization.py
  class TokenizerDataset (line 16) | class TokenizerDataset(Dataset):
    method __init__ (line 17) | def __init__(self, config):
    method __getitem__ (line 21) | def __getitem__(self, idx):
    method __len__ (line 25) | def __len__(self):
  function numpify (line 29) | def numpify(shard_idx, video_ids, captions, target_dir, split, prefix, m...
  function sharding (line 57) | def sharding(config, out_file):
  function tokenize (line 76) | def tokenize(config, out_file):
  function main (line 91) | def main(args):

FILE: examples/MMPT/scripts/video_feature_extractor/model.py
  class GlobalAvgPool (line 8) | class GlobalAvgPool(nn.Module):
    method __init__ (line 9) | def __init__(self):
    method forward (line 12) | def forward(self, x):
  function get_model (line 16) | def get_model(args):

FILE: examples/MMPT/scripts/video_feature_extractor/pathbuilder.py
  class PathBuilder (line 17) | class PathBuilder(object):
    method build (line 19) | def build(cls, video_dirs, feature_dir, ext, shards=0, split=None):

FILE: examples/MMPT/scripts/video_feature_extractor/preprocessing.py
  class Normalize (line 6) | class Normalize(object):
    method __init__ (line 8) | def __init__(self, mean, std):
    method __call__ (line 12) | def __call__(self, tensor):
  class Preprocessing (line 16) | class Preprocessing(object):
    method __init__ (line 18) | def __init__(self, type):
    method _zero_pad (line 27) | def _zero_pad(self, tensor, size):
    method __call__ (line 35) | def __call__(self, tensor):

FILE: examples/MMPT/scripts/video_feature_extractor/random_sequence_shuffler.py
  class RandomSequenceSampler (line 8) | class RandomSequenceSampler(Sampler):
    method __init__ (line 10) | def __init__(self, n_sample, seq_len):
    method _pad_ind (line 14) | def _pad_ind(self, ind):
    method __iter__ (line 19) | def __iter__(self):
    method __len__ (line 28) | def __len__(self):

FILE: examples/MMPT/scripts/video_feature_extractor/shard_feature.py
  class Shard (line 12) | class Shard(object):
    method __init__ (line 13) | def __init__(
    method __call__ (line 31) | def __call__(self, split="train"):

FILE: examples/MMPT/scripts/video_feature_extractor/videoreader.py
  class VideoLoader (line 14) | class VideoLoader(Dataset):
    method __init__ (line 16) | def __init__(
    method __len__ (line 38) | def __len__(self):
    method _get_video_dim (line 41) | def _get_video_dim(self, video_path):
    method _get_video_info (line 49) | def _get_video_info(self, video_path):
    method _get_output_dim (line 55) | def _get_output_dim(self, h, w):
    method __getitem__ (line 63) | def __getitem__(self, idx):
    method _decode (line 68) | def _decode(self, output_file, video_path):
    method _run (line 101) | def _run(self, cmd, output_file):
  class VideoVerifier (line 113) | class VideoVerifier(VideoLoader):
    method __getitem__ (line 114) | def __getitem__(self, idx):
  class VideoCompressor (line 123) | class VideoCompressor(VideoLoader):
    method __init__ (line 124) | def __init__(
    method _run (line 145) | def _run(self, cmd, output_file):
  class VideoDownloader (line 154) | class VideoDownloader(VideoCompressor):
    method __getitem__ (line 156) | def __getitem__(self, idx):
  class AvKeyframeVideoCompressor (line 170) | class AvKeyframeVideoCompressor(VideoLoader):
    method __init__ (line 174) | def __init__(
    method _get_video_dim (line 187) | def _get_video_dim(self, video_fn):
    method _get_output_dim (line 195) | def _get_output_dim(self, height, width):
    method __getitem__ (line 204) | def __getitem__(self, idx):

FILE: examples/adaptive_span/adagrad_with_grad_clip.py
  class FairseqAdagradWithGradClip (line 12) | class FairseqAdagradWithGradClip(LegacyFairseqOptimizer):
    method __init__ (line 13) | def __init__(self, args, params):
    method add_args (line 18) | def add_args(parser):
    method optimizer_config (line 28) | def optimizer_config(self):
    method supports_flat_params (line 42) | def supports_flat_params(self):
  function _clip_grad (line 46) | def _clip_grad(clr, grad, group_grad_clip):
  class AdagradWithGradClip (line 54) | class AdagradWithGradClip(Adagrad):
    method __init__ (line 57) | def __init__(
    method step (line 77) | def step(self, closure=None):

FILE: examples/adaptive_span/adaptive_span_attention.py
  class AdaptiveMask (line 12) | class AdaptiveMask(nn.Module):
    method __init__ (line 24) | def __init__(self, max_size, ramp_size, init_val=0, shape=(1,)):
    method forward (line 32) | def forward(self, x):
    method get_current_max_size (line 42) | def get_current_max_size(self, include_ramp=True):
    method get_current_avg_size (line 49) | def get_current_avg_size(self, include_ramp=True):
    method clamp_param (line 58) | def clamp_param(self):
  class AdaptiveSpan (line 63) | class AdaptiveSpan(nn.Module):
    method __init__ (line 75) | def __init__(
    method forward (line 102) | def forward(self, attn, normalize=True):
    method get_trim_len (line 116) | def get_trim_len(self):
    method trim_memory (line 124) | def trim_memory(self, query, key, value, key_pe):
    method get_cache_size (line 142) | def get_cache_size(self):
    method get_loss (line 149) | def get_loss(self):
    method get_current_max_span (line 153) | def get_current_max_span(self):
    method get_current_avg_span (line 156) | def get_current_avg_span(self):
    method clamp_param (line 159) | def clamp_param(self):

FILE: examples/adaptive_span/adaptive_span_loss.py
  class AdaptiveSpanCriterionConfig (line 19) | class AdaptiveSpanCriterionConfig(FairseqDataclass):
  class AdaptiveSpanCriterion (line 24) | class AdaptiveSpanCriterion(CrossEntropyCriterion):
    method __init__ (line 25) | def __init__(self, task, sentence_avg):
    method forward (line 28) | def forward(self, model, sample, reduce=True):
    method compute_loss (line 58) | def compute_loss(self, model, net_output, sample, reduce=True):
    method reduce_metrics (line 66) | def reduce_metrics(logging_outputs) -> None:
    method logging_outputs_can_be_summed (line 101) | def logging_outputs_can_be_summed() -> bool:

FILE: examples/adaptive_span/adaptive_span_model.py
  function _skew (line 21) | def _skew(X, pad_value):
  function _unskew (line 32) | def _unskew(X):
  class SeqAttention (line 44) | class SeqAttention(nn.Module):
    method __init__ (line 50) | def __init__(self, d_model, n_head, attn_span, dropout, adapt_span_lay...
    method forward (line 62) | def forward(self, query, key, value, key_pe):
    method get_cache_size (line 90) | def get_cache_size(self):
  class MultiHeadSeqAttention (line 94) | class MultiHeadSeqAttention(nn.Module):
    method __init__ (line 95) | def __init__(self, d_model, n_head, **kargs):
    method head_reshape (line 110) | def head_reshape(self, x):
    method forward (line 118) | def forward(self, query, key, value, key_pe):
  class FeedForwardLayer (line 139) | class FeedForwardLayer(nn.Module):
    method __init__ (line 140) | def __init__(self, d_model, d_inner, dropout, **kargs):
    method forward (line 148) | def forward(self, h):
  class TransformerSeqLayer (line 155) | class TransformerSeqLayer(nn.Module):
    method __init__ (line 156) | def __init__(self, d_model, **kargs):
    method forward (line 163) | def forward(self, h, h_cache, key_pe):
    method get_cache_size (line 176) | def get_cache_size(self):
  class TransformerSeq (line 180) | class TransformerSeq(nn.Module):
    method __init__ (line 181) | def __init__(
    method forward (line 218) | def forward(self, x, h_cache, target=None):
    method get_aux_loss (line 245) | def get_aux_loss(self):
    method get_current_max_span (line 251) | def get_current_max_span(self):
    method get_current_avg_span (line 259) | def get_current_avg_span(self):

FILE: examples/adaptive_span/adaptive_span_model_wrapper.py
  class AdaptiveSpanSmallConfig (line 24) | class AdaptiveSpanSmallConfig(FairseqDataclass):
  class AdaptiveSpanTransformer (line 41) | class AdaptiveSpanTransformer(FairseqLanguageModel):
    method build_model (line 43) | def build_model(cls, cfg: AdaptiveSpanSmallConfig, task):
    method get_aux_loss (line 46) | def get_aux_loss(self):
    method get_current_max_span (line 49) | def get_current_max_span(self):
    method get_current_avg_span (line 52) | def get_current_avg_span(self):
  class AdaptiveSpanDecoder (line 56) | class AdaptiveSpanDecoder(FairseqIncrementalDecoder):
    method __init__ (line 57) | def __init__(self, cfg, task):
    method forward (line 81) | def forward(
    method max_positions (line 104) | def max_positions(self):
    method init_hid_cache (line 107) | def init_hid_cache(self, batch_sz):
    method get_aux_loss (line 121) | def get_aux_loss(self):
    method get_current_max_span (line 124) | def get_current_max_span(self):
    method get_current_avg_span (line 127) | def get_current_avg_span(self):
    method reorder_incremental_state (line 130) | def reorder_incremental_state(

FILE: examples/attention_head_selection/src/data/speech_to_text_dataset_with_domain.py
  class SpeechToTextDatasetItemWithDomain (line 29) | class SpeechToTextDatasetItemWithDomain(SpeechToTextDatasetItem):
  class SpeechToTextDatasetWithDomain (line 35) | class SpeechToTextDatasetWithDomain(SpeechToTextDataset):
    method __init__ (line 37) | def __init__(
    method __getitem__ (line 73) | def __getitem__(self, index: int) -> SpeechToTextDatasetItemWithDomain:
    method collater (line 86) | def collater(
  class SpeechToTextDatasetCreatorWithDomain (line 105) | class SpeechToTextDatasetCreatorWithDomain(SpeechToTextDatasetCreator):
    method _from_list (line 112) | def _from_list(
    method _load_samples_from_tsv (line 159) | def _load_samples_from_tsv(
    method _from_tsv (line 183) | def _from_tsv(
    method from_tsv (line 208) | def from_tsv(

FILE: examples/attention_head_selection/src/loss/attention_head_selection.py
  class HeadSelectionLoss (line 12) | class HeadSelectionLoss(_Loss):
    method __init__ (line 14) | def __init__(self, args):
    method forward (line 19) | def forward(self, head_samples, sample_sizes, prior=0.5, eps=1e-7):

FILE: examples/attention_head_selection/src/models/head_selection_s2t_transformer.py
  class HeadSelectionS2TTransformerModel (line 31) | class HeadSelectionS2TTransformerModel(S2TTransformerModel):
    method __init__ (line 35) | def __init__(self, encoder, decoder):
    method add_args (line 39) | def add_args(parser):
    method build_encoder (line 80) | def build_encoder(cls, args):
    method build_decoder (line 99) | def build_decoder(cls, args, task, embed_tokens):
  class HeadSelectionS2TTransformerEncoder (line 106) | class HeadSelectionS2TTransformerEncoder(S2TTransformerEncoder):
    method __init__ (line 108) | def __init__(self, args):
    method set_task_ids (line 122) | def set_task_ids(self, task_ids):
    method _forward (line 125) | def _forward(self, src_tokens, src_lengths, return_all_hiddens=False):
  class HeadSelectionTransformerDecoderScriptable (line 130) | class HeadSelectionTransformerDecoderScriptable(HeadSelectionTransformer...
    method extract_features (line 131) | def extract_features(
  function base_architecture (line 153) | def base_architecture(args):
  function head_selection_s2t_transformer_s (line 164) | def head_selection_s2t_transformer_s(args):

FILE: examples/attention_head_selection/src/models/head_selection_transformer.py
  class HeadSelectionTransformerModel (line 25) | class HeadSelectionTransformerModel(TransformerModel):
    method __init__ (line 26) | def __init__(self, args, encoder, decoder):
    method add_args (line 30) | def add_args(parser):
    method build_encoder (line 71) | def build_encoder(cls, args, src_dict, embed_tokens):
    method build_decoder (line 80) | def build_decoder(cls, args, tgt_dict, embed_tokens):
  class HeadSelectionTransformerEncoder (line 89) | class HeadSelectionTransformerEncoder(TransformerEncoder):
    method __init__ (line 91) | def __init__(self, args, dictionary, embed_tokens):
    method set_task_ids (line 111) | def set_task_ids(self, task_ids):
    method build_encoder_layer (line 114) | def build_encoder_layer(self, args, layer_idx=None):
    method forward (line 121) | def forward(
  class HeadSelectionTransformerDecoder (line 132) | class HeadSelectionTransformerDecoder(TransformerDecoder):
    method __init__ (line 134) | def __init__(
    method set_task_ids (line 177) | def set_task_ids(self, task_ids):
    method build_head_selection_decoder_layer (line 180) | def build_head_selection_decoder_layer(self, args, no_encoder_attn=Fal...
    method forward (line 189) | def forward(

FILE: examples/attention_head_selection/src/modules/attn_head_selector.py
  class AttnHeadSelector (line 9) | class AttnHeadSelector(nn.Module):
    method __init__ (line 13) | def __init__(
    method gumbel_sample (line 36) | def gumbel_sample(self, logits, tau=1.0):
    method subset_select (line 43) | def subset_select(self, y_soft, topk, dim=-1):
    method group_selet (line 48) | def group_selet(self, y_soft, topk, dim=-1):
    method head_select (line 57) | def head_select(self, task_ids=None):
    method forward (line 77) | def forward(self, layer_idx):

FILE: examples/attention_head_selection/src/modules/head_selection_transformer_layer.py
  class HeadSelectionTransformerEncoderLayer (line 11) | class HeadSelectionTransformerEncoderLayer(TransformerEncoderLayer):
    method __init__ (line 13) | def __init__(self, args, layer_idx, attn_head_selector=None):
    method build_self_attention_selection (line 20) | def build_self_attention_selection(self, embed_dim, args, attn_head_se...
  class HeadSelectionTransformerDecoderLayer (line 34) | class HeadSelectionTransformerDecoderLayer(TransformerDecoderLayer):
    method __init__ (line 36) | def __init__(
    method build_self_attention_selection (line 61) | def build_self_attention_selection(
    method build_encoder_attention_selection (line 79) | def build_encoder_attention_selection(self, embed_dim, args, enc_attn_...

FILE: examples/attention_head_selection/src/modules/multihead_attention_selection.py
  class MultiheadAttentionSelection (line 17) | class MultiheadAttentionSelection(MultiheadAttention):
    method __init__ (line 19) | def __init__(
    method forward (line 71) | def forward(

FILE: examples/attention_head_selection/src/modules/multihead_functional.py
  function _scaled_dot_product_attention (line 19) | def _scaled_dot_product_attention(
  function _in_projection (line 59) | def _in_projection(
  function multi_head_attention_forward (line 73) | def multi_head_attention_forward(

FILE: examples/attention_head_selection/src/speech_to_text_head_selection.py
  class SpeechToTextHeadSelectionTask (line 16) | class SpeechToTextHeadSelectionTask(SpeechToTextTask):
    method add_args (line 19) | def add_args(cls, parser):
    method __init__ (line 34) | def __init__(self, args, tgt_dict):
    method map_task_to_id (line 43) | def map_task_to_id(self, train_subset):
    method load_dataset (line 65) | def load_dataset(self, split, epoch=1, combine=False, **kwargs):
    method build_model (line 85) | def build_model(self, args):
    method get_sample_sizes (line 90) | def get_sample_sizes(self, sample, task_ids, num_tasks):
    method train_step (line 102) | def train_step(
    method valid_step (line 149) | def valid_step(self, sample, model, criterion):
    method inference_step (line 164) | def inference_step(

FILE: examples/audio_nlp/nlu/generate_manifests.py
  function get_insl_frame (line 5) | def get_insl_frame(parse):
  function sequencify_utterance (line 22) | def sequencify_utterance(utterance):
  function generate_fairseq_manifests (line 30) | def generate_fairseq_manifests(manifest, output_path, audio_root=None):
  function main (line 64) | def main(args):

FILE: examples/backtranslation/deduplicate_lines.py
  function get_hashes_and_lines (line 14) | def get_hashes_and_lines(raw_line):
  function main (line 19) | def main():

FILE: examples/backtranslation/extract_bt_data.py
  function main (line 13) | def main():

FILE: examples/bart/summarize.py
  function generate (line 15) | def generate(bart, infile, outfile="bart_hypo.txt", bsz=32, n_obs=None, ...
  function main (line 43) | def main():

FILE: examples/byte_level_bpe/get_bitext.py
  function _convert_xml (line 26) | def _convert_xml(in_path: str, out_path: str):
  function _convert_train (line 37) | def _convert_train(in_path: str, out_path: str):
  function _get_bytes (line 46) | def _get_bytes(in_path: str, out_path: str):
  function _get_chars (line 52) | def _get_chars(in_path: str, out_path: str):
  function pretokenize (line 58) | def pretokenize(in_path: str, out_path: str, src: str, tgt: str):
  function _convert_to_bchar (line 80) | def _convert_to_bchar(in_path_prefix: str, src: str, tgt: str, out_path:...
  function _get_bpe (line 88) | def _get_bpe(in_path: str, model_prefix: str, vocab_size: int):
  function _apply_bbpe (line 101) | def _apply_bbpe(model_path: str, in_path: str, out_path: str):
  function _apply_bpe (line 110) | def _apply_bpe(model_path: str, in_path: str, out_path: str):
  function _concat_files (line 119) | def _concat_files(in_paths: List[str], out_path: str):
  function preprocess_iwslt17 (line 127) | def preprocess_iwslt17(
  function main (line 213) | def main():

FILE: examples/byte_level_bpe/gru_transformer.py
  class GRUTransformerModel (line 18) | class GRUTransformerModel(TransformerModel):
    method build_encoder (line 20) | def build_encoder(cls, args, src_dict, embed_tokens):
  class GRUTransformerEncoder (line 24) | class GRUTransformerEncoder(TransformerEncoder):
    method __init__ (line 25) | def __init__(self, args, dictionary, embed_tokens):
    method forward_embedding (line 34) | def forward_embedding(self, src_tokens):
  function gru_transformer_base_architecture (line 53) | def gru_transformer_base_architecture(args):
  function gru_transformer_big (line 98) | def gru_transformer_big(args):

FILE: examples/constrained_decoding/normalize.py
  function main (line 13) | def main(args):

FILE: examples/constrained_decoding/tok.py
  function main (line 13) | def main(args):

FILE: examples/criss/mining/mine.py
  function call (line 22) | def call(cmd):
  function get_batches (line 27) | def get_batches(directory, lang, prefix="all_avg_pool"):
  function load_batch (line 39) | def load_batch(emb_file, dim):
  function knnGPU_sharded (line 47) | def knnGPU_sharded(x_batches_f, y_batches_f, dim, k, direction="x2y"):
  function score (line 93) | def score(sim, fwd_mean, bwd_mean, margin):
  function score_candidates (line 97) | def score_candidates(
  function load_text (line 109) | def load_text(files):

FILE: examples/criss/save_encoder.py
  function get_avg_pool (line 17) | def get_avg_pool(
  function main (line 43) | def main(args):
  function cli_main (line 200) | def cli_main():

FILE: examples/criss/sentence_retrieval/encoder_analysis.py
  function compute_dist (line 15) | def compute_dist(source_embs, target_embs, k=5, return_sim_mat=False):
  function load_embeddings (line 35) | def load_embeddings(directory, LANGS):
  function compute_accuracy (line 58) | def compute_accuracy(directory, LANGS):

FILE: examples/data2vec/data/add_class_target_dataset.py
  class AddClassTargetDataset (line 11) | class AddClassTargetDataset(BaseWrapperDataset):
    method __init__ (line 12) | def __init__(
    method __getitem__ (line 33) | def __getitem__(self, index):
    method collater (line 51) | def collater(self, samples):

FILE: examples/data2vec/data/image_dataset.py
  class ImageDataset (line 24) | class ImageDataset(FairseqDataset, VisionDataset):
    method __init__ (line 25) | def __init__(
    method __getitem__ (line 71) | def __getitem__(self, index):
    method __len__ (line 92) | def __len__(self):
    method collater (line 95) | def collater(self, samples):
    method num_tokens (line 113) | def num_tokens(self, index):
    method size (line 116) | def size(self, index):
    method ordered_indices (line 119) | def ordered_indices(self):

FILE: examples/data2vec/data/mae_finetuning_image_dataset.py
  function build_transform (line 27) | def build_transform(is_train, input_size, color_jitter, aa, reprob, remo...
  class MaeFinetuningImageDataset (line 66) | class MaeFinetuningImageDataset(FairseqDataset):
    method __init__ (line 67) | def __init__(
    method __getitem__ (line 96) | def __getitem__(self, index):
    method __len__ (line 100) | def __len__(self):
    method collater (line 103) | def collater(self, samples):
    method num_tokens (line 121) | def num_tokens(self, index):
    method size (line 124) | def size(self, index):
    method ordered_indices (line 127) | def ordered_indices(self):

FILE: examples/data2vec/data/mae_image_dataset.py
  function load (line 29) | def load(path, loader, cache):
  function caching_loader (line 56) | def caching_loader(cache_root: str, loader):
  class RandomResizedCropAndInterpolationWithTwoPic (line 70) | class RandomResizedCropAndInterpolationWithTwoPic:
    method __init__ (line 85) | def __init__(
    method _pil_interp (line 123) | def _pil_interp(self, method):
    method get_params (line 137) | def get_params(img, scale, ratio):
    method __call__ (line 179) | def __call__(self, img):
  class MaeImageDataset (line 204) | class MaeImageDataset(FairseqDataset):
    method __init__ (line 205) | def __init__(
    method __getitem__ (line 328) | def __getitem__(self, index):
    method __len__ (line 374) | def __len__(self):
    method collater (line 377) | def collater(self, samples):
    method num_tokens (line 400) | def num_tokens(self, index):
    method size (line 403) | def size(self, index):
    method sizes (line 407) | def sizes(self):
    method ordered_indices (line 410) | def ordered_indices(self):

FILE: examples/data2vec/data/modality.py
  class Modality (line 11) | class Modality(Enum):

FILE: examples/data2vec/data/path_dataset.py
  class PathDataset (line 15) | class PathDataset(VisionDataset):
    method __init__ (line 16) | def __init__(
    method __len__ (line 49) | def __len__(self) -> int:
    method __getitem__ (line 52) | def __getitem__(self, idx) -> Tuple[np.ndarray, np.ndarray]:

FILE: examples/data2vec/fb_convert_beit_cp.py
  function get_parser (line 23) | def get_parser():
  function update_checkpoint (line 37) | def update_checkpoint(model_dict, prefix, is_nested):
  function main (line 81) | def main():

FILE: examples/data2vec/models/audio_classification.py
  class AudioClassificationConfig (line 33) | class AudioClassificationConfig(FairseqDataclass):
  class AudioClassificationModel (line 190) | class AudioClassificationModel(BaseFairseqModel):
    method __init__ (line 191) | def __init__(self, cfg: AudioClassificationConfig, num_classes):
    method upgrade_state_dict_named (line 310) | def upgrade_state_dict_named(self, state_dict, name):
    method build_model (line 315) | def build_model(cls, cfg: AudioClassificationConfig, task: FairseqTask):
    method load_model_weights (line 322) | def load_model_weights(self, state, model, cfg):
    method set_num_updates (line 355) | def set_num_updates(self, num_updates):
    method compute_gain (line 360) | def compute_gain(self, sound, fs=16_000, min_db=-80.0, mode="A_weighti...
    method compute_gain_torch (line 405) | def compute_gain_torch(self, sound, fs=16_000, min_db=-80.0, mode="A_w...
    method forward (line 458) | def forward(self, source, padding_mask, label=None, **kwargs):

FILE: examples/data2vec/models/data2vec2.py
  class D2vModalitiesConfig (line 56) | class D2vModalitiesConfig(FairseqDataclass):
  class Data2VecMultiConfig (line 63) | class Data2VecMultiConfig(FairseqDataclass):
  class Data2VecMultiModel (line 151) | class Data2VecMultiModel(BaseFairseqModel):
    method make_modality_encoder (line 152) | def make_modality_encoder(
    method __init__ (line 183) | def __init__(self, cfg: Data2VecMultiConfig, modalities, skip_ema=Fals...
    method _init_weights (line 272) | def _init_weights(self, m):
    method make_ema_teacher (line 292) | def make_ema_teacher(self, ema_decay):
    method make_target_model (line 308) | def make_target_model(self):
    method set_num_updates (line 332) | def set_num_updates(self, num_updates):
    method state_dict (line 358) | def state_dict(self, destination=None, prefix="", keep_vars=False):
    method _load_from_state_dict (line 366) | def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
    method build_model (line 378) | def build_model(cls, cfg: Data2VecMultiConfig, task=None):
    method forward (line 394) | def forward(
    method forward_decoder (line 691) | def forward_decoder(
    method d2v_loss (line 703) | def d2v_loss(self, x, y):
    method make_targets (line 721) | def make_targets(self, y, num_layers):
    method compute_var (line 767) | def compute_var(y):
    method extract_features (line 783) | def extract_features(
    method remove_pretraining_modules (line 796) | def remove_pretraining_modules(self, modality=None, keep_decoder=False):

FILE: examples/data2vec/models/data2vec_audio.py
  class Data2VecAudioConfig (line 37) | class Data2VecAudioConfig(Wav2Vec2Config):
  function get_annealed_rate (line 87) | def get_annealed_rate(start, end, curr_step, total_steps):
  class Data2VecAudioModel (line 94) | class Data2VecAudioModel(BaseFairseqModel):
    method __init__ (line 95) | def __init__(self, cfg: Data2VecAudioConfig):
    method make_ema_teacher (line 149) | def make_ema_teacher(self):
    method set_num_updates (line 166) | def set_num_updates(self, num_updates):
    method state_dict (line 189) | def state_dict(self, destination=None, prefix="", keep_vars=False):
    method _load_from_state_dict (line 197) | def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
    method build_model (line 206) | def build_model(cls, cfg: Data2VecAudioConfig, task=None):
    method apply_mask (line 211) | def apply_mask(
    method _get_feat_extract_output_lengths (line 281) | def _get_feat_extract_output_lengths(self, input_lengths: torch.LongTe...
    method forward (line 298) | def forward(
    method compute_var (line 503) | def compute_var(y):
    method extract_features (line 519) | def extract_features(
    method remove_pretraining_modules (line 531) | def remove_pretraining_modules(self, last_layer=None):

FILE: examples/data2vec/models/data2vec_image_classification.py
  class Data2VecImageClassificationConfig (line 30) | class Data2VecImageClassificationConfig(FairseqDataclass):
  class Data2VecImageClassificationModel (line 45) | class Data2VecImageClassificationModel(BaseFairseqModel):
    method __init__ (line 46) | def __init__(self, cfg: Data2VecImageClassificationConfig):
    method load_model_weights (line 95) | def load_model_weights(self, state, model, cfg):
    method build_model (line 101) | def build_model(cls, cfg: Data2VecImageClassificationConfig, task=None):
    method forward (line 106) | def forward(

FILE: examples/data2vec/models/data2vec_text.py
  class Data2VecTextConfig (line 32) | class Data2VecTextConfig(FairseqDataclass):
  function get_annealed_rate (line 77) | def get_annealed_rate(start, end, curr_step, total_steps):
  class Data2VecTextModel (line 84) | class Data2VecTextModel(FairseqEncoderModel):
    method __init__ (line 85) | def __init__(self, cfg: Data2VecTextConfig, encoder):
    method build_model (line 95) | def build_model(cls, cfg, task):
    method forward (line 102) | def forward(
    method get_normalized_probs (line 127) | def get_normalized_probs(self, net_output, log_probs, sample=None):
    method register_classification_head (line 135) | def register_classification_head(
    method supported_targets (line 158) | def supported_targets(self):
    method upgrade_state_dict_named (line 161) | def upgrade_state_dict_named(self, state_dict, name):
    method remove_pretraining_modules (line 265) | def remove_pretraining_modules(self, last_layer=None):
  class Data2VecTextEncoder (line 280) | class Data2VecTextEncoder(FairseqEncoder):
    method __init__ (line 281) | def __init__(self, cfg: Data2VecTextConfig, dictionary, task_data):
    method build_embedding (line 314) | def build_embedding(self, vocab_size, embedding_dim, padding_idx):
    method build_encoder (line 317) | def build_encoder(self, cfg, dictionary, embed_tokens):
    method build_lm_head (line 322) | def build_lm_head(self, embed_dim, output_dim, activation_fn, weight):
    method make_ema_teacher (line 325) | def make_ema_teacher(self):
    method set_num_updates (line 352) | def set_num_updates(self, num_updates):
    method state_dict (line 373) | def state_dict(self, destination=None, prefix="", keep_vars=False):
    method _load_from_state_dict (line 379) | def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
    method forward (line 387) | def forward(
    method extract_features (line 498) | def extract_features(self, src_tokens, return_all_hiddens=False, **kwa...
    method output_layer (line 512) | def output_layer(self, features, masked_tokens=None, **unused):
    method max_positions (line 515) | def max_positions(self):

FILE: examples/data2vec/models/data2vec_text_classification.py
  class Data2VecTextClassificationConfig (line 33) | class Data2VecTextClassificationConfig(FairseqDataclass):
  class Data2VecTextClassificationModel (line 49) | class Data2VecTextClassificationModel(BaseFairseqModel):
    method __init__ (line 50) | def __init__(self, cfg: Data2VecTextClassificationConfig):
    method load_model_weights (line 79) | def load_model_weights(self, state, model, cfg):
    method build_model (line 91) | def build_model(cls, cfg: Data2VecTextClassificationConfig, task=None):
    method register_classification_head (line 96) | def register_classification_head(
    method forward (line 122) | def forward(

FILE: examples/data2vec/models/data2vec_vision.py
  class Data2VecVisionConfig (line 33) | class Data2VecVisionConfig(FairseqDataclass):
  function get_annealed_rate (line 91) | def get_annealed_rate(start, end, curr_step, total_steps):
  class Data2VecVisionModel (line 98) | class Data2VecVisionModel(BaseFairseqModel):
    method __init__ (line 99) | def __init__(self, cfg: Data2VecVisionConfig):
    method make_ema_teacher (line 137) | def make_ema_teacher(self):
    method set_num_updates (line 147) | def set_num_updates(self, num_updates):
    method state_dict (line 170) | def state_dict(self, destination=None, prefix="", keep_vars=False):
    method _load_from_state_dict (line 178) | def _load_from_state_dict(self, state_dict, prefix, *args, **kwargs):
    method build_model (line 187) | def build_model(cls, cfg: Data2VecVisionConfig, task=None):
    method make_mask (line 192) | def make_mask(self, bsz, num_masks, min_masks, max_masks):
    method forward (line 240) | def forward(
    method compute_var (line 349) | def compute_var(y):
    method remove_pretraining_modules (line 365) | def remove_pretraining_modules(self, last_layer=None):
  class PatchEmbed (line 376) | class PatchEmbed(nn.Module):
    method __init__ (line 379) | def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=...
    method forward (line 395) | def forward(self, x):
  class Attention (line 401) | class Attention(nn.Module):
    method __init__ (line 402) | def __init__(
    method forward (line 471) | def forward(self, x, rel_pos_bias=None):
  class RelativePositionBias (line 520) | class RelativePositionBias(nn.Module):
    method __init__ (line 521) | def __init__(self, window_size, num_heads):
    method forward (line 555) | def forward(self):
  class DropPath (line 571) | class DropPath(nn.Module):
    method __init__ (line 574) | def __init__(self, drop_prob=None):
    method forward (line 578) | def forward(self, x):
    method extra_repr (line 590) | def extra_repr(self) -> str:
  class Block (line 594) | class Block(nn.Module):
    method __init__ (line 595) | def __init__(
    method forward (line 638) | def forward(self, x, rel_pos_bias=None):
  class TransformerEncoder (line 653) | class TransformerEncoder(nn.Module):
    method __init__ (line 654) | def __init__(self, cfg: Data2VecVisionConfig, patch_shape):
    method init_weights (line 685) | def init_weights(self, m):
    method fix_init_weight (line 699) | def fix_init_weight(self):
    method extract_features (line 707) | def extract_features(self, x, layer_results):
    method forward (line 721) | def forward(self, x, layer_results=None):

FILE: examples/data2vec/models/mae.py
  class MaeConfig (line 36) | class MaeConfig(FairseqDataclass):
  function modify_relative_position_bias (line 75) | def modify_relative_position_bias(orig_bias, bsz, mask):
  class AltBlock (line 103) | class AltBlock(nn.Module):
    method __init__ (line 104) | def __init__(
    method forward (line 172) | def forward(self, x, rel_pos_bias=None, pos_mask=None):
  class AltAttention (line 203) | class AltAttention(nn.Module):
    method __init__ (line 204) | def __init__(
    method forward (line 274) | def forward(self, x, rel_pos_bias=None, pos_mask=None):
  class RelativePositionBias (line 324) | class RelativePositionBias(nn.Module):
    method __init__ (line 325) | def __init__(self, window_size, num_heads):
    method forward (line 359) | def forward(self):
  function get_2d_sincos_pos_embed (line 370) | def get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False):
  function get_2d_sincos_pos_embed_from_grid (line 388) | def get_2d_sincos_pos_embed_from_grid(embed_dim, grid):
  function get_1d_sincos_pos_embed_from_grid (line 399) | def get_1d_sincos_pos_embed_from_grid(embed_dim, pos):
  function interpolate_pos_embed (line 420) | def interpolate_pos_embed(model, checkpoint_model):
  class MaeModel (line 454) | class MaeModel(BaseFairseqModel):
    method __init__ (line 455) | def __init__(self, cfg: MaeConfig):
    method initialize_weights (line 611) | def initialize_weights(self):
    method _init_weights (line 645) | def _init_weights(self, m):
    method patchify (line 655) | def patchify(self, imgs):
    method unpatchify (line 669) | def unpatchify(self, x):
    method random_masking (line 683) | def random_masking(self, x, mask_ratio):
    method build_model (line 713) | def build_model(cls, cfg: MaeConfig, task=None):
    method forward_encoder (line 718) | def forward_encoder(self, x, mask_ratio):
    method forward_decoder (line 749) | def forward_decoder(self, x, ids_restore):
    method forward_loss (line 786) | def forward_loss(self, imgs, pred, mask):
    method forward (line 804) | def forward(self, imgs, predictions_only=False):
    method remove_pretraining_modules (line 821) | def remove_pretraining_modules(self):

FILE: examples/data2vec/models/mae_image_classification.py
  class PredictionMode (line 33) | class PredictionMode(Enum):
  class MaeImageClassificationConfig (line 40) | class MaeImageClassificationConfig(FairseqDataclass):
  function get_layer_id_for_vit (line 80) | def get_layer_id_for_vit(name, num_layers):
  class MaeImageClassificationModel (line 98) | class MaeImageClassificationModel(BaseFairseqModel):
    method __init__ (line 99) | def __init__(self, cfg: MaeImageClassificationConfig):
    method build_model (line 305) | def build_model(cls, cfg: MaeImageClassificationConfig, task=None):
    method forward (line 310) | def forward(
    method model_forward (line 370) | def model_forward(self, imgs):

FILE: examples/data2vec/models/modalities/audio.py
  class D2vAudioConfig (line 25) | class D2vAudioConfig(D2vModalityConfig):
  class AudioEncoder (line 50) | class AudioEncoder(ModalitySpecificEncoder):
    method __init__ (line 54) | def __init__(
    method convert_padding_mask (line 142) | def convert_padding_mask(self, x, padding_mask):
    method reset_parameters (line 186) | def reset_parameters(self):

FILE: examples/data2vec/models/modalities/base.py
  class D2vModalityConfig (line 27) | class D2vModalityConfig:
  class ModalitySpecificEncoder (line 75) | class ModalitySpecificEncoder(nn.Module):
    method __init__ (line 76) | def __init__(
    method upgrade_state_dict_named (line 148) | def upgrade_state_dict_named(self, state_dict, name):
    method convert_padding_mask (line 155) | def convert_padding_mask(self, x, padding_mask):
    method decoder_input (line 158) | def decoder_input(self, x, mask_info: MaskInfo):
    method local_features (line 190) | def local_features(self, features):
    method contextualized_features (line 205) | def contextualized_features(
    method forward (line 332) | def forward(
    method reset_parameters (line 353) | def reset_parameters(self):
    method compute_mask (line 356) | def compute_mask(
    method make_maskinfo (line 413) | def make_maskinfo(self, x, mask, shape=None):
    method apply_mask (line 443) | def apply_mask(self, x, mask_info):
    method remove_pretraining_modules (line 473) | def remove_pretraining_modules(self, keep_decoder=False):
  function get_annealed_rate (line 478) | def get_annealed_rate(start, end, curr_step, total_steps):
  function random_masking (line 487) | def random_masking(x, mask_ratio, mask_seed: Optional[MaskSeed]):
  function gather_unmasked (line 523) | def gather_unmasked(x: torch.Tensor, mask_info: MaskInfo) -> torch.Tensor:
  function gather_unmasked_mask (line 531) | def gather_unmasked_mask(x: torch.Tensor, mask_info: MaskInfo) -> torch....
  function get_alibi (line 539) | def get_alibi(
  function get_alibi_bias (line 608) | def get_alibi_bias(
  function _learned_alibi_bias (line 646) | def _learned_alibi_bias(
  function masked_alibi (line 667) | def masked_alibi(alibi_bias, mask_info):

FILE: examples/data2vec/models/modalities/images.py
  class D2vImageConfig (line 33) | class D2vImageConfig(D2vModalityConfig):
  class ImageEncoder (line 50) | class ImageEncoder(ModalitySpecificEncoder):
    method __init__ (line 54) | def __init__(
    method reset_parameters (line 159) | def reset_parameters(self):
    method patchify (line 165) | def patchify(self, imgs):
    method unpatchify (line 179) | def unpatchify(self, x):
    method compute_mask (line 193) | def compute_mask(
    method decoder_input (line 234) | def decoder_input(self, x, mask_info):

FILE: examples/data2vec/models/modalities/modules.py
  class D2vDecoderConfig (line 20) | class D2vDecoderConfig:
  class FixedPositionalEncoder (line 35) | class FixedPositionalEncoder(nn.Module):
    method __init__ (line 36) | def __init__(self, pos_embed):
    method forward (line 40) | def forward(self, x, padding_mask):
  class TextFeatPositionalEncoder (line 44) | class TextFeatPositionalEncoder(nn.Module):
    method __init__ (line 50) | def __init__(self, pos_encoder):
    method forward (line 54) | def forward(self, x, padding_mask):
  class BlockEncoder (line 60) | class BlockEncoder(nn.Module):
    method __init__ (line 61) | def __init__(self, blocks, norm_layer, layer_norm_first, layerdrop, dr...
    method forward (line 69) | def forward(self, x, padding_mask, alibi_bias, alibi_scale):
  class DecoderBase (line 97) | class DecoderBase(nn.Module):
    method __init__ (line 100) | def __init__(self, cfg: D2vDecoderConfig):
    method reset_parameters (line 105) | def reset_parameters(self):
    method add_residual (line 110) | def add_residual(self, x, residual, i, mask_info):
  class Decoder1d (line 123) | class Decoder1d(DecoderBase):
    method __init__ (line 124) | def __init__(self, cfg: D2vDecoderConfig, input_dim):
    method forward (line 165) | def forward(self, x, mask_info):
  class Decoder2d (line 181) | class Decoder2d(DecoderBase):
    method __init__ (line 182) | def __init__(self, cfg: D2vDecoderConfig, input_dim, h_size, w_size):
    method forward (line 215) | def forward(self, x, mask_info):
  class TransformerDecoder (line 232) | class TransformerDecoder(nn.Module):
    method __init__ (line 235) | def __init__(self, cfg: D2vDecoderConfig, input_dim, encoder):
    method reset_parameters (line 246) | def reset_parameters(self):
    method forward (line 251) | def forward(self, x, mask_info):
  class AltBlock (line 258) | class AltBlock(nn.Module):
    method __init__ (line 259) | def __init__(
    method forward (line 306) | def forward(self, x, padding_mask=None, alibi_bias=None):
  class AltAttention (line 326) | class AltAttention(nn.Module):
    method __init__ (line 327) | def __init__(
    method forward (line 354) | def forward(self, x, padding_mask=None, alibi_bias=None):
  class EncDecAttention (line 399) | class EncDecAttention(nn.Module):
    method __init__ (line 400) | def __init__(
    method forward (line 429) | def forward(self, q, kv, padding_mask=None, alibi_bias=None):
  class EncDecBlock (line 479) | class EncDecBlock(nn.Module):
    method __init__ (line 480) | def __init__(
    method forward (line 529) | def forward(self, q, kv, padding_mask=None, alibi_bias=None):
  class EncDecTransformerDecoder (line 546) | class EncDecTransformerDecoder(nn.Module):
    method __init__ (line 547) | def __init__(self, cfg: D2vDecoderConfig, input_dim):
    method reset_parameters (line 578) | def reset_parameters(self):
    method forward (line 583) | def forward(self, x, kv):

FILE: examples/data2vec/models/modalities/text.py
  class D2vTextConfig (line 22) | class D2vTextConfig(D2vModalityConfig):
  class TextEncoder (line 33) | class TextEncoder(ModalitySpecificEncoder):
    method __init__ (line 37) | def __init__(
    method reset_parameters (line 95) | def reset_parameters(self):
    method convert_padding_mask (line 98) | def convert_padding_mask(self, x, padding_mask):
  class TextLocalEncoder (line 118) | class TextLocalEncoder(nn.Module):
    method __init__ (line 119) | def __init__(
    method forward (line 153) | def forward(self, src_tokens):

FILE: examples/data2vec/models/utils.py
  function get_alibi (line 4) | def get_alibi(
  function masked_alibi (line 44) | def masked_alibi(alibi_bias, mask_indices, orig_B, orig_T):

FILE: examples/data2vec/scripts/convert_audioset_labels.py
  function get_parser (line 11) | def get_parser():
  function main (line 23) | def main():

FILE: examples/data2vec/scripts/text/glue_lr.py
  function get_best_stat_str (line 21) | def get_best_stat_str(task_vals, show_subdir):
  function get_all_stat_str (line 46) | def get_all_stat_str(task_vals):
  function get_tabular_stat_str (line 54) | def get_tabular_stat_str(task_vals):
  function main (line 78) | def main():

FILE: examples/data2vec/scripts/text/unprocess_data.py
  function load_dictionary (line 7) | def load_dictionary(dict_path):
  function load_dataset (line 10) | def load_dataset(split_path, src_dict):
  function load_bpe (line 20) | def load_bpe(enc_path):
  function detokenize (line 26) | def detokenize(tokens, src_dict, idx2bpe):
  function _main (line 32) | def _main(src_root, src_dict_path, src_bpe_path, src_splits, tgt_root, t...
  function main_pt (line 47) | def main_pt():
  function main_ft (line 71) | def main_ft():

FILE: examples/data2vec/scripts/text/valids.py
  function main (line 34) | def main(args, print_output):

FILE: examples/data2vec/tasks/audio_classification.py
  class AudioClassificationConfig (line 28) | class AudioClassificationConfig(AudioPretrainingConfig):
  class AudioClassificationTask (line 34) | class AudioClassificationTask(AudioPretrainingTask):
    method __init__ (line 39) | def __init__(
    method load_labels (line 47) | def load_labels(self):
    method labels (line 62) | def labels(self):
    method load_dataset (line 65) | def load_dataset(
    method calculate_stats (line 95) | def calculate_stats(self, output, target):
    method valid_step (line 143) | def valid_step(self, sample, model, criterion):
    method reduce_metrics (line 147) | def reduce_metrics(self, logging_outputs, criterion):

FILE: examples/data2vec/tasks/image_classification.py
  class ImageClassificationConfig (line 37) | class ImageClassificationConfig(ImagePretrainingConfig):
  class ImageClassificationTask (line 42) | class ImageClassificationTask(ImagePretrainingTask):
    method setup_task (line 47) | def setup_task(cls, cfg: ImageClassificationConfig, **kwargs):
    method load_dataset (line 50) | def load_dataset(self, split: str, task_cfg: FairseqDataclass = None, ...
    method build_model (line 108) | def build_model(self, model_cfg: FairseqDataclass, from_checkpoint=Fal...
    method reduce_metrics (line 118) | def reduce_metrics(self, logging_outputs, criterion):

FILE: examples/data2vec/tasks/image_pretraining.py
  class ImagePretrainingConfig (line 44) | class ImagePretrainingConfig(FairseqDataclass):
  class ImagePretrainingTask (line 52) | class ImagePretrainingTask(FairseqTask):
    method setup_task (line 58) | def setup_task(cls, cfg: ImagePretrainingConfig, **kwargs):
    method load_dataset (line 67) | def load_dataset(self, split: str, task_cfg: FairseqDataclass = None, ...
    method source_dictionary (line 101) | def source_dictionary(self):
    method target_dictionary (line 105) | def target_dictionary(self):
    method max_positions (line 108) | def max_positions(self):

FILE: examples/data2vec/tasks/mae_image_classification.py
  class MaeImageClassificationConfig (line 30) | class MaeImageClassificationConfig(FairseqDataclass):
  class MaeImageClassificationTask (line 39) | class MaeImageClassificationTask(FairseqTask):
    method setup_task (line 45) | def setup_task(cls, cfg: MaeImageClassificationConfig, **kwargs):
    method load_dataset (line 54) | def load_dataset(self, split: str, task_cfg: FairseqDataclass = None, ...
    method build_model (line 67) | def build_model(self, model_cfg: FairseqDataclass, from_checkpoint=Fal...
    method reduce_metrics (line 77) | def reduce_metrics(self, logging_outputs, criterion):
    method source_dictionary (line 91) | def source_dictionary(self):
    method target_dictionary (line 95) | def target_dictionary(self):
    method max_positions (line 98) | def max_positions(self):

FILE: examples/data2vec/tasks/mae_image_pretraining.py
  class ImageMaskingConfig (line 29) | class ImageMaskingConfig:
  class MaeImagePretrainingConfig (line 42) | class MaeImagePretrainingConfig(FairseqDataclass):
  class MaeImagePretrainingTask (line 63) | class MaeImagePretrainingTask(FairseqTask):
    method setup_task (line 69) | def setup_task(cls, cfg: MaeImagePretrainingConfig, **kwargs):
    method load_dataset (line 78) | def load_dataset(self, split: str, task_cfg: FairseqDataclass = None, ...
    method source_dictionary (line 110) | def source_dictionary(self):
    method target_dictionary (line 114) | def target_dictionary(self):
    method max_positions (line 117) | def max_positions(self):

FILE: examples/data2vec/tasks/multimodal.py
  class MultimodalPretrainingConfig (line 30) | class MultimodalPretrainingConfig(FairseqDataclass):
  class MultimodalPretrainingTask (line 47) | class MultimodalPretrainingTask(FairseqTask):
    method __init__ (line 52) | def __init__(self, cfg: MultimodalPretrainingConfig):
    method setup_task (line 65) | def setup_task(cls, cfg: MultimodalPretrainingConfig, **kwargs):
    method load_dataset (line 74) | def load_dataset(self, split: str, task_cfg: FairseqDataclass = None, ...
    method supported_modalities (line 100) | def supported_modalities(self):
    method get_batch_iterator (line 111) | def get_batch_iterator(
    method source_dictionary (line 156) | def source_dictionary(self):
    method target_dictionary (line 160) | def target_dictionary(self):
    method max_positions (line 163) | def max_positions(self):

FILE: examples/discriminative_reranking_nmt/criterions/discriminative_reranking_criterion.py
  class KLDivergenceRerankingCriterionConfig (line 23) | class KLDivergenceRerankingCriterionConfig(FairseqDataclass):
  class KLDivergenceRerankingCriterion (line 43) | class KLDivergenceRerankingCriterion(FairseqCriterion):
    method __init__ (line 44) | def __init__(
    method forward (line 52) | def forward(self, model, sample, reduce=True):
    method compute_kl_loss (line 106) | def compute_kl_loss(self, logits, target):
    method reduce_metrics (line 121) | def reduce_metrics(logging_outputs) -> None:
    method logging_outputs_can_be_summed (line 133) | def logging_outputs_can_be_summed() -> bool:

FILE: examples/discriminative_reranking_nmt/drnmt_rerank.py
  function init_loaded_scores (line 33) | def init_loaded_scores(mt_scores, model_scores, hyp, ref):
  function parse_fairseq_gen (line 41) | def parse_fairseq_gen(filename, task):
  function read_target (line 70) | def read_target(filename):
  function make_batches (line 76) | def make_batches(args, src, hyp, task, max_positions, encode_fn):
  function decode_rerank_scores (line 113) | def decode_rerank_scores(args):
  function get_score (line 174) | def get_score(mt_s, md_s, w1, lp, tgt_len):
  function get_best_hyps (line 178) | def get_best_hyps(mt_scores, md_scores, hypos, fw_weight, lenpen, beam):
  function eval_metric (line 199) | def eval_metric(args, hypos, ref):
  function score_target_hypo (line 208) | def score_target_hypo(args, fw_weight, lp):
  function print_result (line 224) | def print_result(best_scores, best_hypos, output_file):
  function main (line 229) | def main(args):
  function cli_main (line 309) | def cli_main():

FILE: examples/discriminative_reranking_nmt/models/discriminative_reranking_model.py
  function update_init_roberta_model_state (line 28) | def update_init_roberta_model_state(state):
  class BaseRanker (line 49) | class BaseRanker(nn.Module):
    method __init__ (line 50) | def __init__(self, args, task):
    method forward (line 56) | def forward(self, src_tokens):
    method get_segment_labels (line 59) | def get_segment_labels(self, src_tokens):
    method get_positions (line 69) | def get_positions(self, src_tokens, segment_labels):
  class BertRanker (line 96) | class BertRanker(BaseRanker):
    method __init__ (line 97) | def __init__(self, args, task):
    method forward (line 224) | def forward(self, src_tokens, src_lengths):
    method sentence_forward (line 237) | def sentence_forward(self, encoder_out, src_tokens=None, sentence_rep=...
    method joint_forward (line 263) | def joint_forward(self, x):
    method classification_forward (line 273) | def classification_forward(self, x):
  class DiscriminativeNMTRerankerConfig (line 279) | class DiscriminativeNMTRerankerConfig(FairseqDataclass):
  class DiscriminativeNMTReranker (line 342) | class DiscriminativeNMTReranker(BaseFairseqModel):
    method build_model (line 344) | def build_model(cls, args, task):
    method __init__ (line 348) | def __init__(self, args, model):
    method forward (line 355) | def forward(self, src_tokens, src_lengths, **kwargs):
    method sentence_forward (line 358) | def sentence_forward(self, encoder_out, src_tokens):
    method joint_forward (line 361) | def joint_forward(self, x):
    method classification_forward (line 364) | def classification_forward(self, x):

FILE: examples/discriminative_reranking_nmt/scripts/prep_data.py
  function read_text_file (line 11) | def read_text_file(filename):
  function get_bleu (line 18) | def get_bleu(in_sent, target_sent):
  function get_ter (line 26) | def get_ter(in_sent, target_sent):
  function init (line 32) | def init(sp_model):
  function process (line 38) | def process(source_sent, target_sent, hypo_sent, metric):
  function main (line 50) | def main(args):

FILE: examples/discriminative_reranking_nmt/tasks/discriminative_reranking_task.py
  class DiscriminativeRerankingNMTConfig (line 45) | class DiscriminativeRerankingNMTConfig(FairseqDataclass):
  class RerankerScorer (line 77) | class RerankerScorer(object):
    method __init__ (line 80) | def __init__(self, args, mt_beam):
    method generate (line 84) | def generate(self, models, sample, **kwargs):
  class DiscriminativeRerankingNMTTask (line 114) | class DiscriminativeRerankingNMTTask(FairseqTask):
    method __init__ (line 122) | def __init__(self, cfg: DiscriminativeRerankingNMTConfig, data_diction...
    method load_dictionary (line 130) | def load_dictionary(cls, cfg, filename):
    method setup_task (line 138) | def setup_task(cls, cfg: DiscriminativeRerankingNMTConfig, **kwargs):
    method load_dataset (line 149) | def load_dataset(self, split, epoch=0, combine=False, **kwargs):
    method build_dataset_for_inference (line 291) | def build_dataset_for_inference(self, src_tokens, src_lengths, **kwargs):
    method build_model (line 341) | def build_model(self, cfg: FairseqDataclass, from_checkpoint: bool = F...
    method build_generator (line 344) | def build_generator(self, args):
    method max_positions (line 347) | def max_positions(self):
    method source_dictionary (line 351) | def source_dictionary(self):
    method target_dictionary (line 355) | def target_dictionary(self):
    method create_dummy_batch (line 358) | def create_dummy_batch(self, device):
    method train_step (line 376) | def train_step(
    method valid_step (line 386) | def valid_step(self, sample, model, criterion):
    method reduce_metrics (line 437) | def reduce_metrics(self, logging_outputs, criterion):

FILE: examples/emotion_conversion/emotion_models/duration_predictor.py
  function save_ckpt (line 16) | def save_ckpt(model, path, model_class):
  function load_ckpt (line 25) | def load_ckpt(path):
  class Collator (line 36) | class Collator:
    method __init__ (line 37) | def __init__(self, padding_idx):
    method __call__ (line 40) | def __call__(self, batch):
  class Predictor (line 50) | class Predictor(nn.Module):
    method __init__ (line 51) | def __init__(self, n_tokens, emb_dim):
    method inflate_input (line 60) | def inflate_input(self, batch):
  class CnnPredictor (line 80) | class CnnPredictor(Predictor):
    method __init__ (line 81) | def __init__(self, n_tokens, emb_dim, channels, kernel, output_dim, dr...
    method forward (line 103) | def forward(self, x):
  function l2_log_loss (line 111) | def l2_log_loss(input, target):
  class DurationDataset (line 119) | class DurationDataset(Dataset):
    method __init__ (line 120) | def __init__(self, tsv_path, km_path, substring=""):
    method __len__ (line 135) | def __len__(self):
    method __getitem__ (line 138) | def __getitem__(self, i):
  function train (line 160) | def train(cfg):
  function train_epoch (line 188) | def train_epoch(model, loader, criterion, optimizer, device):
  function valid_epoch (line 209) | def valid_epoch(model, loader, criterion, device):
  function main (line 237) | def main(cfg):

FILE: examples/emotion_conversion/emotion_models/pitch_predictor.py
  function quantize_f0 (line 29) | def quantize_f0(speaker_to_f0, nbins, normalize, log):
  function save_ckpt (line 59) | def save_ckpt(model, path, model_class, f0_min, f0_max, f0_bins, speaker...
  function load_ckpt (line 72) | def load_ckpt(path):
  function freq2bin (line 86) | def freq2bin(f0, f0_min, f0_max, bins):
  function bin2freq (line 94) | def bin2freq(x, f0_min, f0_max, bins, mode):
  function load_wav (line 108) | def load_wav(full_path):
  function l1_loss (line 113) | def l1_loss(input, target):
  function l2_loss (line 117) | def l2_loss(input, target):
  class Collator (line 121) | class Collator:
    method __init__ (line 122) | def __init__(self, padding_idx):
    method __call__ (line 125) | def __call__(self, batch):
  class CnnPredictor (line 147) | class CnnPredictor(nn.Module):
    method __init__ (line 148) | def __init__(
    method forward (line 213) | def forward(self, x, gst=None):
    method setup_f0_stats (line 243) | def setup_f0_stats(self, f0_min, f0_max, f0_bins, speaker_stats):
    method inference (line 250) | def inference(self, x, spk_id=None, gst=None):
  class PitchDataset (line 276) | class PitchDataset(Dataset):
    method __init__ (line 277) | def __init__(
    method __len__ (line 332) | def __len__(self):
    method _load_f0 (line 335) | def _load_f0(self, tsv_line):
    method _preprocess_f0 (line 342) | def _preprocess_f0(self, f0, spk):
    method _compute_f0_minmax (line 364) | def _compute_f0_minmax(self):
    method _compute_f0_stats (line 374) | def _compute_f0_stats(self):
    method __getitem__ (line 386) | def __getitem__(self, i):
  function train (line 412) | def train(cfg):
  function run_epoch (line 489) | def run_epoch(model, loader, optimizer, device, cfg, mode):
  function main (line 543) | def main(cfg):

FILE: examples/emotion_conversion/emotion_models/utils.py
  class Stat (line 4) | class Stat:
    method __init__ (line 5) | def __init__(self, keep_raw=False):
    method update (line 15) | def update(self, new_x):
    method mean (line 29) | def mean(self):
    method std (line 33) | def std(self):
    method mean_log (line 37) | def mean_log(self):
    method std_log (line 41) | def std_log(self):
    method n_frms (line 45) | def n_frms(self):
    method n_utts (line 49) | def n_utts(self):
    method raw_data (line 53) | def raw_data(self):
  class F0Stat (line 58) | class F0Stat(Stat):
    method update (line 59) | def update(self, new_x):
  class Accuracy (line 65) | class Accuracy:
    method __init__ (line 66) | def __init__(self):
    method update (line 69) | def update(self, yhat, y):
    method acc (line 73) | def acc(self, tol):

FILE: examples/emotion_conversion/fairseq_models/__init__.py
  class MultilingualTransformerModelFromMbart (line 25) | class MultilingualTransformerModelFromMbart(MultilingualTransformerModel):
    method build_model (line 27) | def build_model(cls, args, task):
    method load_state_dict (line 150) | def load_state_dict(self, state_dict, strict=True, model_cfg=None):
  function transformer_small (line 202) | def transformer_small(args):
  function multilingual_small (line 217) | def multilingual_small(args):

FILE: examples/emotion_conversion/preprocess/build_hifigan_manifest.py
  function main (line 5) | def main():

FILE: examples/emotion_conversion/preprocess/build_translation_manifests.py
  function get_fname (line 17) | def get_fname(s):
  function get_emotion (line 20) | def get_emotion(s):
  function get_utt_id (line 23) | def get_utt_id(s):
  function dedup (line 26) | def dedup(seq):
  function remove_under_k (line 44) | def remove_under_k(seq, k):
  function call (line 57) | def call(cmd):
  function denoising_preprocess (line 62) | def denoising_preprocess(path, lang, dict):
  function translation_preprocess (line 78) | def translation_preprocess(path, src_lang, trg_lang, dict, only_train=Fa...
  function load_tsv_km (line 98) | def load_tsv_km(tsv_path, km_path):
  function main (line 107) | def main():

FILE: examples/emotion_conversion/preprocess/create_core_manifest.py
  function verify_dict_size (line 17) | def verify_dict_size(km, dict):
  function verify_files_exist (line 26) | def verify_files_exist(l):
  function run_cmd (line 34) | def run_cmd(cmd, print_output=True):
  function main (line 44) | def main():

FILE: examples/emotion_conversion/preprocess/extract_f0.py
  function extract_f0 (line 24) | def extract_f0(tsv_line):
  function main (line 51) | def main():

FILE: examples/emotion_conversion/preprocess/split_emov_km_tsv_by_uttid.py
  function train_val_test_split (line 12) | def train_val_test_split(tsv_lines, km_lines, valid_percent, test_percen...

FILE: examples/emotion_conversion/synthesize.py
  class AttrDict (line 36) | class AttrDict(dict):
    method __init__ (line 37) | def __init__(self, *args, **kwargs):
  function parse_generation_file (line 42) | def parse_generation_file(fname):
  function get_code_to_fname (line 87) | def get_code_to_fname(manifest, tokens):
  function code_to_str (line 117) | def code_to_str(s):
  function get_praat_f0 (line 122) | def get_praat_f0(audio, rate=16000, interp=False):
  function generate_from_code (line 141) | def generate_from_code(generator, h, code, spkr=None, f0=None, gst=None,...
  function synth (line 160) | def synth(argv, interactive=False):

FILE: examples/fast_noisy_channel/noisy_channel_beam_search.py
  class NoisyChannelBeamSearch (line 10) | class NoisyChannelBeamSearch(Search):
    method __init__ (line 12) | def __init__(self, tgt_dict):
    method _init_buffers (line 17) | def _init_buffers(self, t):
    method combine_fw_bw (line 26) | def combine_fw_bw(self, combine_method, fw_cum, bw, step):
    method step (line 35) | def step(self, step, fw_lprobs, scores, bw_lprobs, lm_lprobs, combine_...

FILE: examples/fast_noisy_channel/noisy_channel_sequence_generator.py
  class NoisyChannelSequenceGenerator (line 19) | class NoisyChannelSequenceGenerator(object):
    method __init__ (line 20) | def __init__(
    method generate (line 135) | def generate(
  function get_lm_scores (line 767) | def get_lm_scores(model, input_tokens, incremental_states, cand_tokens, ...
  function make_dict2dict (line 779) | def make_dict2dict(old_dict, new_dict):
  function dict2dict (line 786) | def dict2dict(tokens, dict2dict_map):
  function reorder_tokens (line 797) | def reorder_tokens(tokens, lengths, eos):
  function reorder_all_tokens (line 802) | def reorder_all_tokens(tokens, lengths, eos):
  function normalized_scores_with_batch_vocab (line 808) | def normalized_scores_with_batch_vocab(

FILE: examples/fast_noisy_channel/noisy_channel_translation.py
  class NoisyChannelTranslation (line 15) | class NoisyChannelTranslation(TranslationTask):
    method add_args (line 21) | def add_args(parser):
    method build_generator (line 50) | def build_generator(

FILE: examples/hubert/measure_teacher_quality.py
  function comp_purity (line 13) | def comp_purity(p_xy, axis):
  function comp_entropy (line 21) | def comp_entropy(p):
  function comp_norm_mutual_info (line 25) | def comp_norm_mutual_info(p_xy):
  function pad (line 35) | def pad(labs, n):
  function comp_avg_seg_dur (line 41) | def comp_avg_seg_dur(labs_list):
  function comp_joint_prob (line 54) | def comp_joint_prob(uid2refs, uid2hyps):
  function read_phn (line 87) | def read_phn(tsv_path, rm_stress=True):
  function read_lab (line 99) | def read_lab(tsv_path, lab_path, pad_len=0, upsample=1):
  function main_lab_lab (line 112) | def main_lab_lab(
  function main_phn_lab (line 140) | def main_phn_lab(
  function _main (line 166) | def _main(uid2refs, uid2hyps, verbose):

FILE: examples/hubert/simple_kmeans/dump_hubert_feature.py
  class HubertFeatureReader (line 28) | class HubertFeatureReader(object):
    method __init__ (line 29) | def __init__(self, ckpt_path, layer, max_chunk=1600000):
    method read_audio (line 42) | def read_audio(self, path, ref_len=None):
    method get_feats (line 51) | def get_feats(self, path, ref_len=None):
  function main (line 72) | def main(tsv_dir, split, ckpt_path, layer, nshard, rank, feat_dir, max_c...

FILE: examples/hubert/simple_kmeans/dump_hubert_feature_s2t.py
  class HubertFeatureReaderS2T (line 27) | class HubertFeatureReaderS2T(HubertFeatureReader):
    method read_audio (line 28) | def read_audio(self, path, ref_len=None):
  function get_path_iterator (line 40) | def get_path_iterator(root, tsv, nshard, rank, audio_col_name):
  function main (line 61) | def main(

FILE: examples/hubert/simple_kmeans/dump_km_label.py
  class ApplyKmeans (line 25) | class ApplyKmeans(object):
    method __init__ (line 26) | def __init__(self, km_path):
    method __call__ (line 37) | def __call__(self, x):
  function get_feat_iterator (line 54) | def get_feat_iterator(feat_dir, split, nshard, rank):
  function dump_label (line 70) | def dump_label(feat_dir, split, km_path, nshard, rank, lab_dir):

FILE: examples/hubert/simple_kmeans/dump_mfcc_feature.py
  class MfccFeatureReader (line 26) | class MfccFeatureReader(object):
    method __init__ (line 27) | def __init__(self, sample_rate):
    method read_audio (line 30) | def read_audio(self, path, ref_len=None):
    method get_feats (line 36) | def get_feats(self, path, ref_len=None):
  function main (line 55) | def main(tsv_dir, split, nshard, rank, feat_dir, sample_rate):

FILE: examples/hubert/simple_kmeans/dump_w2v2_feature.py
  class Wav2Vec2FeatureReader (line 27) | class Wav2Vec2FeatureReader(object):
    method __init__ (line 28) | def __init__(self, ckpt_path, layer, max_chunk=1600000):
    method read_audio (line 42) | def read_audio(self, path, ref_len=None):
    method get_feats (line 52) | def get_feats(self, path, ref_len=None):
  function main (line 74) | def main(tsv_dir, split, ckpt_path, layer, nshard, rank, feat_dir, max_c...

FILE: examples/hubert/simple_kmeans/feature_utils.py
  function get_shard_range (line 23) | def get_shard_range(tot, nshard, rank):
  function get_path_iterator (line 35) | def get_path_iterator(tsv, nshard, rank):
  function dump_feature (line 48) | def dump_feature(reader, generator, num, split, nshard, rank, feat_dir):

FILE: examples/hubert/simple_kmeans/learn_kmeans.py
  function get_km_model (line 24) | def get_km_model(
  function load_feature_shard (line 49) | def load_feature_shard(feat_dir, split, nshard, rank, percent):
  function load_feature (line 74) | def load_feature(feat_dir, split, nshard, seed, percent):
  function learn_kmeans (line 87) | def learn_kmeans(

FILE: examples/hubert/update_ckpt.py
  function update_state (line 13) | def update_state(state):

FILE: examples/laser/laser_src/laser_lstm.py
  class LSTMModel (line 22) | class LSTMModel(FairseqEncoderDecoderModel):
    method __init__ (line 23) | def __init__(self, encoder, decoder):
    method forward (line 26) | def forward(
    method add_args (line 44) | def add_args(parser):
    method build_model (line 147) | def build_model(cls, args, task):
  class LSTMEncoder (line 202) | class LSTMEncoder(FairseqEncoder):
    method __init__ (line 205) | def __init__(
    method forward (line 249) | def forward(self, src_tokens, src_lengths, dataset_name):
    method reorder_encoder_out (line 323) | def reorder_encoder_out(self, encoder_out_dict, new_order):
    method max_positions (line 336) | def max_positions(self):
  class LSTMDecoder (line 341) | class LSTMDecoder(FairseqIncrementalDecoder):
    method __init__ (line 344) | def __init__(
    method forward (line 400) | def forward(
    method reorder_incremental_state (line 510) | def reorder_incremental_state(self, incremental_state, new_order):
    method max_positions (line 526) | def max_positions(self):
  function Embedding (line 531) | def Embedding(num_embeddings, embedding_dim, padding_idx):
  function LSTM (line 538) | def LSTM(input_size, hidden_size, **kwargs):
  function LSTMCell (line 546) | def LSTMCell(input_size, hidden_size, **kwargs):
  function Linear (line 554) | def Linear(in_features, out_features, bias=True, dropout=0):
  function base_architecture (line 564) | def base_architecture(args):

FILE: examples/laser/laser_src/laser_task.py
  class LaserTask (line 33) | class LaserTask(LegacyFairseqTask):
    method add_args (line 35) | def add_args(parser):
    method __init__ (line 82) | def __init__(self, args, config, src_dictionary, tgt_dictionary, num_t...
    method setup_task (line 90) | def setup_task(cls, args, **kwargs):
    method build_model (line 115) | def build_model(self, args, from_checkpoint=False):
    method dataset (line 119) | def dataset(self, split):
    method load_dataset (line 124) | def load_dataset(self, split, epoch=1, **kwargs):
    method source_dictionary (line 263) | def source_dictionary(self):
    method target_dictionary (line 267) | def target_dictionary(self):
    method get_batch_iterator (line 270) | def get_batch_iterator(

FILE: examples/laser/laser_src/laser_transformer.py
  class LaserTransformerModel (line 34) | class LaserTransformerModel(FairseqEncoderDecoderModel):
    method __init__ (line 40) | def __init__(self, encoder, decoder):
    method forward (line 43) | def forward(
    method add_args (line 59) | def add_args(parser):
    method build_model (line 70) | def build_model(cls, args, task):
  class LaserTransformerEncoder (line 104) | class LaserTransformerEncoder(TransformerEncoder):
    method __init__ (line 105) | def __init__(self, *args, **kwargs):
    method forward (line 108) | def forward(self, src_tokens, *args, **kwargs):
    method reorder_encoder_out (line 127) | def reorder_encoder_out(self, encoder_out: Dict[str, List[Tensor]], ne...
  class LaserTransformerDecoder (line 141) | class LaserTransformerDecoder(TransformerDecoder):
    method __init__ (line 142) | def __init__(self, args, dictionary, *kargs, **kwargs):
    method build_decoder_layer (line 169) | def build_decoder_layer(self, args, no_encoder_attn=False):
    method extract_features (line 179) | def extract_features(
    method forward (line 307) | def forward(
  function base_laser_transformer_architecture (line 352) | def base_laser_transformer_architecture(args):

FILE: examples/laser/laser_src/multitask_data_utils.py
  class MultiItr (line 13) | class MultiItr(object):
    method __init__ (line 14) | def __init__(self, itr):
    method __len__ (line 18) | def __len__(self):
    method __iter__ (line 21) | def __iter__(self):
    method __next__ (line 24) | def __next__(self):
  class MultidatasetEpochBatchIterator (line 31) | class MultidatasetEpochBatchIterator(iterators.EpochBatchIterating):
    method __init__ (line 34) | def __init__(
    method __len__ (line 65) | def __len__(self):
    method next_epoch_itr (line 68) | def next_epoch_itr(self, shuffle=True, fix_batches_to_gpus=False):
    method end_of_epoch (line 79) | def end_of_epoch(self):
    method next_epoch_idx (line 83) | def next_epoch_idx(self):
    method iterations_in_epoch (line 93) | def iterations_in_epoch(self):
    method state_dict (line 96) | def state_dict(self):
    method load_state_dict (line 102) | def load_state_dict(self, state_dict):
  class MultitaskDatasetWrapper (line 108) | class MultitaskDatasetWrapper(BaseWrapperDataset):
    method __init__ (line 111) | def __init__(self, dataset, target_language_id, sample=1.0, name=""):
    method collater (line 117) | def collater(self, *args, **kwargs):
    method num_tokens (line 124) | def num_tokens(self, *args, **kwargs):
    method ordered_indices (line 127) | def ordered_indices(self, *args, **kwargs):
    method size (line 134) | def size(self, index: int):
    method supports_prefetch (line 138) | def supports_prefetch(self):
    method prefetch (line 142) | def prefetch(self, indices):

FILE: examples/latent_depth/latent_depth_src/loss/latent_depth.py
  class LatentLayersKLLoss (line 12) | class LatentLayersKLLoss(_Loss):
    method __init__ (line 13) | def __init__(self, args):
    method forward (line 17) | def forward(self, layer_samples, lang_idx, update_num, sample_size):
  class LatentLayersSparsityLoss (line 48) | class LatentLayersSparsityLoss(_Loss):
    method __init__ (line 49) | def __init__(self, args):
    method is_valid (line 53) | def is_valid(self, update_num):
    method forward (line 58) | def forward(self, layer_samples_list, update_num, sample_size):

FILE: examples/latent_depth/latent_depth_src/models/latent_multilingual_transformer.py
  class LatentMultilingualTransformerModel (line 19) | class LatentMultilingualTransformerModel(MultilingualTransformerModel):
    method add_args (line 26) | def add_args(parser):
    method _get_module_class (line 42) | def _get_module_class(cls, is_encoder, args, lang_dict, embed_tokens, ...
  function latent_multilingual_architecture (line 62) | def latent_multilingual_architecture(args):

FILE: examples/latent_depth/latent_depth_src/models/latent_transformer.py
  class LatentTransformerEncoder (line 17) | class LatentTransformerEncoder(TransformerEncoder):
    method __init__ (line 22) | def __init__(self, args, dictionary, embed_tokens, num_logits=1):
    method set_lang_idx (line 37) | def set_lang_idx(self, lang_idx):
    method _build_encoder_layer (line 40) | def _build_encoder_layer(self, args, idx=None):
    method forward (line 43) | def forward(self, src_tokens, src_lengths, return_all_hiddens: bool = ...
  class LatentTransformerEncoderLayer (line 48) | class LatentTransformerEncoderLayer(TransformerEncoderLayer):
    method __init__ (line 60) | def __init__(self, args, idx, layer_select=None):
    method residual_connection (line 65) | def residual_connection(self, x, residual):
  class LatentTransformerDecoder (line 69) | class LatentTransformerDecoder(TransformerDecoder):
    method __init__ (line 74) | def __init__(
    method set_lang_idx (line 96) | def set_lang_idx(self, lang_idx):
    method _build_decoder_layer (line 99) | def _build_decoder_layer(self, args, no_encoder_attn=False, idx=None):
    method forward (line 104) | def forward(
  class LatentTransformerDecoderLayer (line 127) | class LatentTransformerDecoderLayer(TransformerDecoderLayer):
    method __init__ (line 142) | def __init__(
    method residual_connection (line 155) | def residual_connection(self, x, residual):

FILE: examples/latent_depth/latent_depth_src/modules/latent_layers.py
  class LayerSelect (line 10) | class LayerSelect(nn.Module):
    method __init__ (line 15) | def __init__(self, num_layers, num_logits, soft_select=False, sampling...
    method sample (line 26) | def sample(self, logit_idx):
    method forward (line 45) | def forward(self, i):
    method _gumbel_sigmoid (line 49) | def _gumbel_sigmoid(

FILE: examples/latent_depth/latent_depth_src/multilingual_translation_latent_depth.py
  class MultilingualTranslationTaskLatentDepth (line 14) | class MultilingualTranslationTaskLatentDepth(MultilingualTranslationTask):
    method add_args (line 22) | def add_args(parser):
    method __init__ (line 42) | def __init__(self, args, dicts, training):
    method _per_lang_pair_train_loss (line 61) | def _per_lang_pair_train_loss(
    method train_step (line 116) | def train_step(
    method _per_lang_pair_valid_loss (line 145) | def _per_lang_pair_valid_loss(self, lang_pair, model, criterion, sample):
    method inference_step (line 158) | def inference_step(
    method encoder_latent_layer (line 176) | def encoder_latent_layer(self):
    method decoder_latent_layer (line 183) | def decoder_latent_layer(self):
    method src_lang_idx_dict (line 190) | def src_lang_idx_dict(self):
    method tgt_lang_idx_dict (line 194) | def tgt_lang_idx_dict(self):

FILE: examples/linformer/linformer_src/models/linformer_roberta.py
  class LinformerModel (line 30) | class LinformerModel(RobertaModel):
    method add_args (line 32) | def add_args(parser):
    method build_model (line 56) | def build_model(cls, args, task):
  class LinformerEncoder (line 69) | class LinformerEncoder(RobertaEncoder):
    method __init__ (line 72) | def __init__(self, args, dictionary):
    method build_encoder (line 76) | def build_encoder(self, args, dictionary, embed_tokens):
    method upgrade_state_dict_named (line 81) | def upgrade_state_dict_named(self, state_dict, name):
  function base_architecture (line 104) | def base_architecture(args):
  function linformer_roberta_base_architecture (line 113) | def linformer_roberta_base_architecture(args):
  function linformer_roberta_large_architecture (line 118) | def linformer_roberta_large_architecture(args):

FILE: examples/linformer/linformer_src/modules/linformer_sentence_encoder.py
  class LinformerTransformerEncoder (line 14) | class LinformerTransformerEncoder(TransformerEncoder):
    method __init__ (line 38) | def __init__(self, args, dictionary, embed_tokens):
    method build_encoder_layer (line 42) | def build_encoder_layer(self, args):

FILE: examples/linformer/linformer_src/modules/linformer_sentence_encoder_layer.py
  class LinformerTransformerEncoderLayer (line 13) | class LinformerTransformerEncoderLayer(TransformerEncoderLayer):
    method __init__ (line 19) | def __init__(self, args, shared_compress_layer):
    method build_self_attention (line 27) | def build_self_attention(self, embed_dim, args):
    method upgrade_state_dict_named (line 42) | def upgrade_state_dict_named(self, state_dict, name):

FILE: examples/linformer/linformer_src/modules/multihead_linear_attention.py
  class MultiheadLinearAttention (line 19) | class MultiheadLinearAttention(nn.Module):
    method __init__ (line 27) | def __init__(
    method prepare_for_onnx_export_ (line 115) | def prepare_for_onnx_export_(self):
    method reset_parameters (line 118) | def reset_parameters(self):
    method forward (line 152) | def forward(
    method _append_prev_key_padding_mask (line 375) | def _append_prev_key_padding_mask(
    method reorder_incremental_state (line 413) | def reorder_incremental_state(
    method _get_input_buffer (line 432) | def _get_input_buffer(
    method _set_input_buffer (line 442) | def _set_input_buffer(
    method apply_sparse_mask (line 449) | def apply_sparse_mask(attn_weights, tgt_len: int, src_len: int, bsz: i...
    method upgrade_state_dict_named (line 452) | def upgrade_state_dict_named(self, state_dict, name):

FILE: examples/m2m_100/process_data/clean_histogram.py
  function read_hist (line 17) | def read_hist(f):

FILE: examples/m2m_100/process_data/dedup_data.py
  function main (line 10) | def main(args):
  function existing_data (line 27) | def existing_data():
  function dedup (line 34) | def dedup(language_pair, data, verbose=True, output=True):

FILE: examples/m2m_100/process_data/remove_too_much_punc.py
  function len_no_punc (line 5) | def len_no_punc(s, punc):
  function filter_overpunc (line 8) | def filter_overpunc(len_npunc, len_sen):
  function main (line 11) | def main(args):

FILE: examples/megatron_11b/detok.py
  function main (line 13) | def main():

FILE: examples/mms/asr/infer/mms_infer.py
  function parser (line 16) | def parser():
  function reorder_decode (line 25) | def reorder_decode(hypos):
  function process (line 34) | def process(args):

FILE: examples/mms/data_prep/align_and_segment.py
  function generate_emissions (line 23) | def generate_emissions(model, audio_file):
  function get_alignments (line 67) | def get_alignments(
  function main (line 102) | def main(args):

FILE: examples/mms/data_prep/align_utils.py
  function normalize_uroman (line 13) | def normalize_uroman(text):
  function get_uroman_tokens (line 20) | def get_uroman_tokens(norm_transcripts, uroman_root_dir, iso = None):
  class Segment (line 48) | class Segment:
    method __repr__ (line 53) | def __repr__(self):
    method length (line 57) | def length(self):
  function merge_repeats (line 61) | def merge_repeats(path, idx_to_token_map):
  function time_to_frame (line 72) | def time_to_frame(time):
  function load_model_dict (line 79) | def load_model_dict():
  function get_spans (line 137) | def get_spans(tokens, segments):

FILE: examples/mms/data_prep/text_normalization.py
  function text_normalize (line 8) | def text_normalize(text, iso_code, lower_case=True, remove_numbers=True,...

FILE: examples/mms/lid/infer.py
  function subset_manifest (line 14) | def subset_manifest(infer_manifest, veri_pair):
  function wrap_target_dataset (line 36) | def wrap_target_dataset(infer_manifest, dataset, task):
  function resample_data (line 56) | def resample_data(source, padding_mask, n_sample, max_sample_len):
  function resample_sample (line 83) | def resample_sample(sample, n_sample, max_sample_len):
  function dict_to_nparr (line 99) | def dict_to_nparr(dd):

FILE: examples/mms/lid_rerank/mms-zs/uromanize.py
  function norm_uroman (line 16) | def norm_uroman(text):
  function uromanize (line 23) | def uromanize(words):
  function convert_sent (line 41) | def convert_sent(txt, char_lang=False):

FILE: examples/mms/lid_rerank/mms/run_single_lang.py
  function reorder_decode (line 12) | def reorder_decode(hypos):

FILE: examples/mms/lid_rerank/nllb/infer.py
  function fix_code (line 18) | def fix_code(x):

FILE: examples/mms/lid_rerank/rerank/rerank.py
  function select (line 19) | def select(w, feats, ref_lid, nbest_lid, ref_asr, nbest_asr, n=10, exclu...

FILE: examples/mms/lid_rerank/rerank/tune_coefficients.py
  function compute (line 11) | def compute(w, feats, ref_lid, nbest_lid, ref_asr, nbest_asr, n=10, excl...

FILE: examples/mms/tts/infer.py
  class TextMapper (line 25) | class TextMapper(object):
    method __init__ (line 26) | def __init__(self, vocab_file):
    method text_to_sequence (line 32) | def text_to_sequence(self, text, cleaner_names):
    method uromanize (line 47) | def uromanize(self, text, uroman_pl):
    method get_text (line 65) | def get_text(self, text, hps):
    method filter_oov (line 72) | def filter_oov(self, text, lang=None):
    method preprocess_char (line 79) | def preprocess_char(self, text, lang=None):
  function generate (line 88) | def generate():

FILE: examples/multilingual/data_scripts/binarize.py
  function call_output (line 10) | def call_output(cmd):
  function call (line 16) | def call(cmd):
  function get_data_size (line 48) | def get_data_size(raw):
  function encode_spm (line 53) | def encode_spm(model, direction, prefix='', splits=['train', 'test', 'va...
  function binarize_ (line 68) | def binarize_(
  function binarize (line 117) | def binarize(
  function load_langs (line 161) | def load_langs(path):

FILE: examples/multilingual/data_scripts/check_iswlt_test_data.py
  function run_eval_bleu (line 20) | def run_eval_bleu(cmd):
  function check_data_test_bleu (line 32) | def check_data_test_bleu(raw_folder, data_lang_pairs):

FILE: examples/multilingual/data_scripts/check_self_overlaps.py
  function get_directions (line 19) | def get_directions(folder):
  function diff_list (line 24) | def diff_list(lhs, rhs):
  function check_diff (line 27) | def check_diff(
  function main (line 64) | def main():

FILE: examples/multilingual/data_scripts/check_valid_test_overlaps.py
  function load_langs (line 19) | def load_langs(path):
  function load_sentences (line 26) | def load_sentences(raw_data, split, direction):
  function swap_direction (line 35) | def swap_direction(d):
  function get_all_test_data (line 39) | def get_all_test_data(raw_data, directions, split='test'):
  function check_train_sentences (line 57) | def check_train_sentences(src_path, tgt_path, direction, all_test_data, ...
  function check_train_all (line 83) | def check_train_all(raw_data, directions, all_test_data):
  function main (line 102) | def main():

FILE: examples/multilingual/data_scripts/dedup_all.py
  function main (line 21) | def main():

FILE: examples/multilingual/data_scripts/download_ted_and_extract.py
  function call (line 35) | def call(cmd):
  class MultiLingualAlignedCorpusReader (line 39) | class MultiLingualAlignedCorpusReader(object):
    method __init__ (line 43) | def __init__(self, corpus_path, delimiter='\t',
    method read_data (line 71) | def read_data(self, file_loc_):
    method filter_text (line 82) | def filter_text(self, dict_):
    method read_file (line 107) | def read_file(self, split_type, data_type):
    method save_file (line 110) | def save_file(self, path_, split_type, data_type, lang):
    method add_target_token (line 118) | def add_target_token(self, list_, lang_id):
    method read_from_single_file (line 125) | def read_from_single_file(self, path_, s_lang, t_lang):
    method read_aligned_corpus (line 139) | def read_aligned_corpus(self, split_type='train'):
  function read_langs (line 169) | def read_langs(corpus_path):
  function extra_english (line 177) | def extra_english(corpus_path, split):
  function tok_file_name (line 191) | def tok_file_name(filename, lang):
  function de_tok (line 197) | def de_tok(tok_file, lang):
  function extra_bitex (line 207) | def extra_bitex(
  function bar_custom (line 255) | def bar_custom(current, total, width=80):
  function download_and_extract (line 259) | def download_and_extract(download_to, extract_to):

FILE: examples/multilingual/data_scripts/download_wmt19_and_before.py
  class DLDataset (line 38) | class DLDataset(NamedTuple):
  function bar_custom (line 49) | def bar_custom(current, total, width=80):
  function get_downloaded_file (line 52) | def get_downloaded_file(dl_folder, url):
  function download_parts_and_combine (line 61) | def download_parts_and_combine(dl_folder, urls, filename):
  function download_a_url (line 79) | def download_a_url(dl_folder, url):
  function download_files (line 93) | def download_files(dl_folder, urls, completed_urls={}):
  function check_need_manual_downalod (line 100) | def check_need_manual_downalod(dl_folder, to_manually_download_urls):
  function download_dataset (line 114) | def download_dataset(to_folder, dl_dataset, completed_urls={}):
  function call (line 121) | def call(cmd, debug=False):
  function get_extract_name (line 127) | def get_extract_name(file_path):
  function extract_file (line 131) | def extract_file(downloaded_file, extract_folder, get_extract_name=get_e...
  function extract_all_files (line 160) | def extract_all_files(
  function my_glob (line 175) | def my_glob(folder):
  function sgm2raw (line 181) | def sgm2raw(sgm, debug):
  function tmx2raw (line 190) | def tmx2raw(tmx, debug):
  function cut_wikitles (line 206) | def cut_wikitles(wiki_file, debug):
  function cut_tsv (line 230) | def cut_tsv(file, debug):
  function convert_file_if_needed (line 250) | def convert_file_if_needed(file, debug):
  function convert_files_if_needed (line 267) | def convert_files_if_needed(extracted_foldrs, my_glob=my_glob, debug=Fal...
  function match_patt (line 273) | def match_patt(file_path, file_pattern, src, tgt, lang):
  function match_patts (line 276) | def match_patts(file_path, file_patterns, src, tgt, lang):
  function extracted_glob (line 290) | def extracted_glob(extracted_folder, file_patterns, src, tgt, lang):
  function all_extracted_files (line 316) | def all_extracted_files(split, src, tgt, extracted_folders, split_urls):
  function concat_files (line 327) | def concat_files(split, src, tgt, extracted_folders, split_urls, path_pa...
  function lid_filter (line 358) | def lid_filter(split, src, tgt, from_folder, to_folder, debug=False):
  function concat_into_splits (line 372) | def concat_into_splits(dl_dataset, src, tgt, extracted_folders, to_folde...
  function download_multi (line 394) | def download_multi(dl_folder, extract_folder, urls, num_processes=8, deb...
  function run_eval_bleu (line 402) | def run_eval_bleu(cmd):
  function check_wmt_test_bleu (line 414) | def check_wmt_test_bleu(raw_folder, wmt_lang_pairs):
  function download_and_extract (line 437) | def download_and_extract(
  function download_czang16 (line 479) | def download_czang16(download_to, username=None):
  function download_czeng17_script (line 496) | def download_czeng17_script(download_to, extract_folder, debug=False):
  function convert2czeng17 (line 508) | def convert2czeng17(file, debug):
  function extract_czeng17 (line 521) | def extract_czeng17(extract_folder, debug=False):
  function work_on_wmt (line 839) | def work_on_wmt(directions, wmt_data):

FILE: examples/multilingual/data_scripts/remove_valid_test_in_train.py
  function load_langs (line 12) | def load_langs(path):
  function load_sentences (line 19) | def load_sentences(raw_data, split, direction):
  function swap_direction (line 28) | def swap_direction(d):
  function get_all_test_data (line 32) | def get_all_test_data(raw_data, directions, split='test'):
  function check_train_sentences (line 49) | def check_train_sentences(raw_data, direction, all_test_data, mess_up_tr...
  function check_train_all (line 72) | def check_train_all(raw_data, directions, all_test_data):
  function count_train_in_other_set (line 80) | def count_train_in_other_set(mess_up_train):
  function train_size_if_remove_in_otherset (line 87) | def train_size_if_remove_in_otherset(data_sizes, mess_up_train):
  function remove_messed_up_sentences (line 95) | def remove_messed_up_sentences(raw_data, direction, mess_up_train, mess_...
  function merge_valid_test_messup (line 132) | def merge_valid_test_messup(mess_up_train_valid, mess_up_train_test):
  function check_train_pairs (line 145) | def check_train_pairs(raw_data, direction, all_test_data, mess_up_train=...
  function load_pairs (line 164) | def load_pairs(raw_data, split, direction):
  function get_messed_up_test_pairs (line 178) | def get_messed_up_test_pairs(split, directions):

FILE: examples/multilingual/data_scripts/utils/dedup.py
  function deup (line 9) | def deup(src_file, tgt_file, src_file_out, tgt_file_out):
  function main (line 26) | def main():

FILE: examples/multilingual/data_scripts/utils/fasttext_multi_filter.py
  function init (line 18) | def init(model_path):
  function pred (line 22) | def pred(lines):
  function main (line 25) | def main():

FILE: examples/noisychannel/rerank.py
  function score_target_hypo (line 23) | def score_target_hypo(
  function match_target_hypo (line 168) | def match_target_hypo(args, target_outfile, hypo_outfile):
  function load_score_files (line 230) | def load_score_files(args):
  function rerank (line 362) | def rerank(args):
  function cli_main (line 421) | def cli_main():

FILE: examples/noisychannel/rerank_generate.py
  function gen_and_reprocess_nbest (line 21) | def gen_and_reprocess_nbest(args):
  function cli_main (line 390) | def cli_main():

FILE: examples/noisychannel/rerank_options.py
  function get_reranking_parser (line 9) | def get_reranking_parser(default_task="translation"):
  function get_tuning_parser (line 15) | def get_tuning_parser(default_task="translation"):
  function add_reranking_args (line 22) | def add_reranking_args(parser):
  function add_tuning_args (line 110) | def add_tuning_args(parser):

FILE: examples/noisychannel/rerank_score_bw.py
  function score_bw (line 15) | def score_bw(args):
  function cli_main (line 136) | def cli_main():

FILE: examples/noisychannel/rerank_score_lm.py
  function score_lm (line 13) | def score_lm(args):
  function cli_main (line 74) | def cli_main():

FILE: examples/noisychannel/rerank_tune.py
  function random_search (line 15) | def random_search(args):
  function cli_main (line 94) | def cli_main():

FILE: examples/noisychannel/rerank_utils.py
  function reprocess (line 16) | def reprocess(fle):
  function reprocess_nbest (line 75) | def reprocess_nbest(fle):
  function write_reprocessed (line 124) | def write_reprocessed(
  function calc_length_from_frac (line 196) | def calc_length_from_frac(bpe_sentence, prefix_frac, bpe_symbol):
  function get_prefix (line 207) | def get_prefix(sentence, prefix_len):
  function get_prefix_no_bpe (line 216) | def get_prefix_no_bpe(sentence, bpe_symbol, prefix_len):
  function get_prefix_from_len (line 223) | def get_prefix_from_len(sentence, bpe_symbol, prefix_len):
  function get_num_bpe_tokens_from_len (line 234) | def get_num_bpe_tokens_from_len(sentence, bpe_symbol, prefix_len):
  function make_right_to_left (line 241) | def make_right_to_left(line):
  function remove_bpe (line 248) | def remove_bpe(line, bpe_symbol):
  function remove_bpe_dict (line 254) | def remove_bpe_dict(pred_dict, bpe_symbol):
  function parse_bleu_scoring (line 265) | def parse_bleu_scoring(line):
  function get_full_from_prefix (line 272) | def get_full_from_prefix(hypo_prefix, hypos):
  function get_score (line 283) | def get_score(
  class BitextOutput (line 325) | class BitextOutput(object):
    method __init__ (line 326) | def __init__(
  class BitextOutputFromGen (line 411) | class BitextOutputFromGen(object):
    method __init__ (line 412) | def __init__(
  function get_score_from_pos (line 482) | def get_score_from_pos(
  class LMOutput (line 510) | class LMOutput(object):
    method __init__ (line 511) | def __init__(
  function parse_lm (line 540) | def parse_lm(input_file, prefix_len=None, bpe_symbol=None, target_prefix...
  function get_directories (line 585) | def get_directories(
  function lm_scoring (line 652) | def lm_scoring(
  function rescore_file_name (line 829) | def rescore_file_name(

FILE: examples/operators/alignment_train_cpu.cpp
  function exclusiveCumprod (line 15) | void exclusiveCumprod(
  function clamp (line 55) | void clamp(
  function alignmentTrainCPUImpl (line 80) | void alignmentTrainCPUImpl(
  function alignmentTrainCPU (line 135) | void alignmentTrainCPU(
  function PYBIND11_MODULE (line 159) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: examples/operators/alignment_train_cuda.cpp
  function alignmentTrainCUDA (line 14) | void alignmentTrainCUDA(
  function PYBIND11_MODULE (line 24) | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {

FILE: examples/paraphraser/paraphrase.py
  function main (line 15) | def main():

FILE: examples/pointer_generator/pointer_generator_src/transformer_pg.py
  class TransformerPointerGeneratorModel (line 28) | class TransformerPointerGeneratorModel(TransformerModel):
    method add_args (line 48) | def add_args(parser):
    method build_model (line 70) | def build_model(cls, args, task):
    method build_encoder (line 141) | def build_encoder(cls, args, src_dict, embed_tokens):
    method build_decoder (line 145) | def build_decoder(cls, args, tgt_dict, embed_tokens):
  class TransformerPointerGeneratorEncoder (line 149) | class TransformerPointerGeneratorEncoder(TransformerEncoder):
    method forward (line 157) | def forward(
  class TransformerPointerGeneratorDecoder (line 215) | class TransformerPointerGeneratorDecoder(TransformerDecoder):
    method __init__ (line 227) | def __init__(self, args, dictionary, embed_tokens):
    method forward (line 253) | def forward(
    method output_layer (line 312) | def output_layer(
    method get_normalized_probs (line 364) | def get_normalized_probs(
  class Embedding (line 380) | class Embedding(nn.Embedding):
    method __init__ (line 418) | def __init__(
    method forward (line 431) | def forward(self, input):
  function transformer_pointer_generator (line 444) | def transformer_pointer_generator(args):
  function transformer_pointer_generator_iwslt_de_en (line 455) | def transformer_pointer_generator_iwslt_de_en(args):
  function transformer_pointer_generator_wmt_en_de (line 470) | def transformer_pointer_generator_wmt_en_de(args):
  function transformer_pointer_generator_vaswani_wmt_en_de_big (line 480) | def transformer_pointer_generator_vaswani_wmt_en_de_big(args):
  function transformer_pointer_generator_vaswani_wmt_en_fr_big (line 496) | def transformer_pointer_generator_vaswani_wmt_en_fr_big(args):
  function transformer_pointer_generator_wmt_en_de_big (line 504) | def transformer_pointer_generator_wmt_en_de_big(args):
  function transformer_pointer_generator_wmt_en_de_big_t2t (line 513) | def transformer_pointer_generator_wmt_en_de_big_t2t(args):

FILE: examples/pointer_generator/postprocess.py
  class OOVIndexError (line 12) | class OOVIndexError(IndexError):
    method __init__ (line 13) | def __init__(self, pos, source_seq, target_seq):
  function replace_oovs (line 26) | def replace_oovs(source_in, target_in, target_out):
  function main (line 50) | def main():

FILE: examples/pointer_generator/preprocess.py
  function replace_oovs (line 11) | def replace_oovs(source_in, target_in, vocabulary, source_out, target_out):
  function main (line 53) | def main():

FILE: examples/roberta/commonsense_qa/commonsense_qa_task.py
  class CommonsenseQATask (line 28) | class CommonsenseQATask(LegacyFairseqTask):
    method add_args (line 32) | def add_args(parser):
    method __init__ (line 45) | def __init__(self, args, vocab):
    method load_dictionary (line 53) | def load_dictionary(cls, filename):
    method setup_task (line 64) | def setup_task(cls, args, **kwargs):
    method load_dataset (line 75) | def load_dataset(
    method build_model (line 172) | def build_model(self, args, from_checkpoint=False):
    method source_dictionary (line 185) | def source_dictionary(self):
    method target_dictionary (line 189) | def target_dictionary(self):

FILE: examples/roberta/multiprocessing_bpe_encoder.py
  function main (line 17) | def main():
  class MultiprocessingEncoder (line 91) | class MultiprocessingEncoder(object):
    method __init__ (line 92) | def __init__(self, args):
    method initializer (line 95) | def initializer(self):
    method encode (line 99) | def encode(self, line):
    method decode (line 104) | def decode(self, tokens):
    method encode_lines (line 108) | def encode_lines(self, lines):
    method decode_lines (line 121) | def decode_lines(self, lines):

FILE: examples/roberta/preprocess_RACE.py
  class InputExample (line 14) | class InputExample:
    method __init__ (line 15) | def __init__(self, paragraph, qa_list, label):
  function get_examples (line 21) | def get_examples(data_dir, set_type):
  function main (line 60) | def main():

FILE: examples/roberta/wsc/wsc_criterion.py
  class WSCCriterion (line 16) | class WSCCriterion(LegacyFairseqCriterion):
    method __init__ (line 17) | def __init__(self, args, task):
    method __del__ (line 26) | def __del__(self):
    method add_args (line 31) | def add_args(parser):
    method get_masked_input (line 44) | def get_masked_input(self, tokens, mask):
    method get_lprobs (line 49) | def get_lprobs(self, model, tokens, mask):
    method get_loss (line 57) | def get_loss(self, query_lprobs, cand_lprobs):
    method forward (line 70) | def forward(self, model, sample, reduce=True):
    method aggregate_logging_outputs (line 118) | def aggregate_logging_outputs(logging_outputs):
  class WinograndeCriterion (line 141) | class WinograndeCriterion(WSCCriterion):
    method forward (line 142) | def forward(self, model, sample, reduce=True):

FILE: examples/roberta/wsc/wsc_task.py
  class WSCTask (line 32) | class WSCTask(LegacyFairseqTask):
    method add_args (line 36) | def add_args(parser):
    method __init__ (line 48) | def __init__(self, args, vocab):
    method load_dictionary (line 65) | def load_dictionary(cls, filename):
    method setup_task (line 76) | def setup_task(cls, args, **kwargs):
    method binarize (line 85) | def binarize(self, s: str, append_eos: bool = False):
    method binarize_with_mask (line 99) | def binarize_with_mask(self, txt, prefix, suffix, leading_space, trail...
    method load_dataset (line 110) | def load_dataset(
    method build_dataset_for_inference (line 225) | def build_dataset_for_inference(self, sample_json):
    method disambiguate_pronoun (line 235) | def disambiguate_pronoun(self, model, sentence, use_cuda=False):
    method source_dictionary (line 273) | def source_dictionary(self):
    method target_dictionary (line 277) | def target_dictionary(self):
  class WinograndeTask (line 282) | class WinograndeTask(WSCTask):
    method setup_task (line 289) | def setup_task(cls, args, **kwargs):
    method load_dataset (line 298) | def load_dataset(

FILE: examples/roberta/wsc/wsc_utils.py
  function convert_sentence_to_json (line 10) | def convert_sentence_to_json(sentence):
  function extended_noun_chunks (line 36) | def extended_noun_chunks(sentence):
  function find_token (line 52) | def find_token(sentence, start_pos):
  function find_span (line 61) | def find_span(sentence, search_text, start=0):
  function get_detokenizer (line 77) | def get_detokenizer():
  function get_spacy_nlp (line 85) | def get_spacy_nlp():
  function jsonl_iterator (line 92) | def jsonl_iterator(input_fname, positive_only=False, ngram_order=3, eval...
  function winogrande_jsonl_iterator (line 195) | def winogrande_jsonl_iterator(input_fname, eval=False):
  function filter_noun_chunks (line 215) | def filter_noun_chunks(

FILE: examples/rxf/rxf_src/label_smoothed_cross_entropy_r3f.py
  class LabelSmoothedCrossEntropyR3FCriterion (line 17) | class LabelSmoothedCrossEntropyR3FCriterion(FairseqCriterion):
    method __init__ (line 18) | def __init__(
    method add_args (line 39) | def add_args(parser):
    method _get_symm_kl (line 53) | def _get_symm_kl(self, noised_logits, input_logits):
    method forward (line 71) | def forward(self, model, sample, reduce=True):
    method compute_loss (line 118) | def compute_loss(self, model, net_output, sample, reduce=True):
    method reduce_metrics (line 132) | def reduce_metrics(logging_outputs) -> None:
    method logging_outputs_can_be_summed (line 152) | def logging_outputs_can_be_summed() -> bool:

FILE: examples/rxf/rxf_src/sentence_prediction_r3f.py
  class SentencePredictionR3F (line 15) | class SentencePredictionR3F(FairseqCriterion):
    method __init__ (line 16) | def __init__(
    method add_args (line 43) | def add_args(parser):
    method _get_symm_kl (line 58) | def _get_symm_kl(self, noised_logits, input_logits):
    method forward (line 76) | def forward(self, model, sample, reduce=True):
    method aggregate_logging_outputs (line 149) | def aggregate_logging_outputs(logging_outputs):

FILE: examples/simultaneous_translation/eval/agents/simul_t2t_enja.py
  class SimulTransTextAgentJA (line 22) | class SimulTransTextAgentJA(TextAgent):
    method __init__ (line 27) | def __init__(self, args):
    method initialize_states (line 43) | def initialize_states(self, states):
    method to_device (line 47) | def to_device(self, tensor):
    method load_model_vocab (line 53) | def load_model_vocab(self, args):
    method add_args (line 84) | def add_args(parser):
    method build_word_splitter (line 103) | def build_word_splitter(self, args):
    method segment_to_units (line 112) | def segment_to_units(self, segment, states):
    method update_model_encoder (line 116) | def update_model_encoder(self, states):
    method update_states_read (line 140) | def update_states_read(self, states):
    method units_to_segment (line 144) | def units_to_segment(self, units, states):
    method policy (line 167) | def policy(self, states):
    method predict (line 211) | def predict(self, states):

FILE: examples/simultaneous_translation/models/convtransformer_simul_trans.py
  class SimulConvTransformerModel (line 29) | class SimulConvTransformerModel(ConvTransformerModel):
    method add_args (line 40) | def add_args(parser):
    method build_decoder (line 50) | def build_decoder(cls, args, task, embed_tokens):
  function convtransformer_simul_trans_espnet (line 69) | def convtransformer_simul_trans_espnet(args):
  class AugmentedMemoryConvTransformerModel (line 75) | class AugmentedMemoryConvTransformerModel(SimulConvTransformerModel):
    method build_encoder (line 77) | def build_encoder(cls, args):
  function augmented_memory_convtransformer_espnet (line 91) | def augmented_memory_convtransformer_espnet(args):
  class ConvTransformerEmformerEncoder (line 102) | class ConvTransformerEmformerEncoder(ConvTransformerEncoder):
    method __init__ (line 103) | def __init__(self, args):
    method forward (line 131) | def forward(self, src_tokens, src_lengths):
    method conv_layer_stride (line 149) | def conv_layer_stride(args):
  class ConvtransformerEmformer (line 155) | class ConvtransformerEmformer(SimulConvTransformerModel):
    method add_args (line 157) | def add_args(parser):
    method build_encoder (line 190) | def build_encoder(cls, args):
  function convtransformer_emformer_base (line 203) | def convtransformer_emformer_base(args):

FILE: examples/simultaneous_translation/models/transformer_monotonic_attention.py
  class TransformerUnidirectionalModel (line 47) | class TransformerUnidirectionalModel(TransformerModel):
    method build_encoder (line 49) | def build_encoder(cls, args, src_dict, embed_tokens):
  class TransformerModelSimulTrans (line 54) | class TransformerModelSimulTrans(TransformerModel):
    method build_encoder (line 56) | def build_encoder(cls, args, src_dict, embed_tokens):
    method build_decoder (line 60) | def build_decoder(cls, args, tgt_dict, embed_tokens):
  class TransformerMonotonicEncoder (line 64) | class TransformerMonotonicEncoder(TransformerEncoder):
    method __init__ (line 65) | def __init__(self, args, dictionary, embed_tokens):
  class TransformerMonotonicDecoder (line 78) | class TransformerMonotonicDecoder(TransformerDecoder):
    method __init__ (line 91) | def __init__(self, args, dictionary, embed_tokens, no_encoder_attn=Fal...
    method set_num_updates (line 105) | def set_num_updates(self, num_updates):
    method pre_attention (line 108) | def pre_attention(
    method post_attention (line 155) | def post_attention(self, x):
    method clean_cache (line 167) | def clean_cache(
    method extract_features (line 185) | def extract_features(
  function base_monotonic_architecture (line 264) | def base_monotonic_architecture(args):
  function transformer_monotonic_iwslt_de_en (line 272) | def transformer_monotonic_iwslt_de_en(args):
  function transformer_monotonic_vaswani_wmt_en_de_big (line 281) | def transformer_monotonic_vaswani_wmt_en_de_big(args):
  function transformer_monotonic_vaswani_wmt_en_fr_big (line 288) | def transformer_monotonic_vaswani_wmt_en_fr_big(args):
  function transformer_unidirectional_iwslt_de_en (line 295) | def transformer_unidirectional_iwslt_de_en(args):
  function monotonic_tiny_architecture (line 300) | def monotonic_tiny_architecture(args):

FILE: examples/simultaneous_translation/modules/fixed_pre_decision.py
  function fixed_pooling_monotonic_attention (line 17) | def fixed_pooling_monotonic_attention(monotonic_attention):
  class WaitKAttentionFixedStride (line 177) | class WaitKAttentionFixedStride:
  class MonotonicAttentionFixedStride (line 183) | class MonotonicAttentionFixedStride:
  class MonotonicInfiniteLookbackAttentionFixedStride (line 189) | class MonotonicInfiniteLookbackAttentionFixedStride:

FILE: examples/simultaneous_translation/modules/monotonic_multihead_attention.py
  class MonotonicAttention (line 29) | class MonotonicAttention(MultiheadAttention):
    method __init__ (line 36) | def __init__(self, args):
    method add_args (line 67) | def add_args(parser):
    method energy_from_qk (line 90) | def energy_from_qk(
    method p_choose_from_qk (line 134) | def p_choose_from_qk(self, query, key, key_padding_mask, incremental_s...
    method p_choose (line 151) | def p_choose(self, query, key, key_padding_mask, incremental_states=No...
    method monotonic_attention_process_infer (line 154) | def monotonic_attention_process_infer(
    method monotonic_attention_process_train (line 272) | def monotonic_attention_process_train(
    method forward (line 325) | def forward(
    method _get_monotonic_buffer (line 402) | def _get_monotonic_buffer(self, incremental_state: Optional[Dict[str, ...
    method _set_monotonic_buffer (line 413) | def _set_monotonic_buffer(self, incremental_state: Optional[Dict[str, ...
  class MonotonicInfiniteLookbackAttention (line 422) | class MonotonicInfiniteLookbackAttention(
    method __init__ (line 425) | def __init__(self, args):
    method init_soft_attention (line 430) | def init_soft_attention(self):
  class WaitKAttention (line 451) | class WaitKAttention(
    method __init__ (line 459) | def __init__(self, args):
    method add_args (line 470) | def add_args(parser):
    method p_choose_from_qk (line 480) | def p_choose_from_qk(
  class ChunkwiseAttention (line 503) | class ChunkwiseAttention(
    method __init__ (line 506) | def __init__(self, args):
    method add_args (line 512) | def add_args(parser):

FILE: examples/simultaneous_translation/modules/monotonic_transformer_layer.py
  class TransformerMonotonicEncoderLayer (line 16) | class TransformerMonotonicEncoderLayer(TransformerEncoderLayer):
    method forward (line 17) | def forward(self, x, encoder_padding_mask):
  class TransformerMonotonicDecoderLayer (line 24) | class TransformerMonotonicDecoderLayer(TransformerDecoderLayer):
    method __init__ (line 25) | def __init__(self, args):
    method prune_incremental_state (line 31) | def prune_incremental_state(
    method forward (line 48) | def forward(

FILE: examples/simultaneous_translation/tests/test_alignment_train.py
  class AlignmentTrainTest (line 15) | class AlignmentTrainTest(TestCase):
    method _test_custom_alignment_train_ref (line 16) | def _test_custom_alignment_train_ref(self, p_choose, eps):
    method _test_custom_alignment_train_impl (line 48) | def _test_custom_alignment_train_impl(self, p_choose, alpha, eps):
    method test_alignment_train (line 63) | def test_alignment_train(self, bsz, tgt_len, src_len, device):

FILE: examples/simultaneous_translation/tests/test_text_models.py
  function generate_config (line 28) | def generate_config(overrides_kv):
  function make_sample_with_padding (line 35) | def make_sample_with_padding(longer_src=False) -> Dict[str, Any]:
  function build_transformer_monotonic_attention (line 71) | def build_transformer_monotonic_attention(**extra_args: Any):
  function expected_alignment_formula (line 98) | def expected_alignment_formula(
  function mass_perservation_formula (line 156) | def mass_perservation_formula(alpha, left_padding=False, padding_mask=No...
  function expected_soft_attention_formula (line 186) | def expected_soft_attention_formula(
  class MonotonicAttentionTestAbstractClass (line 227) | class MonotonicAttentionTestAbstractClass(object):
    method test_forward (line 228) | def test_forward(self):
    method test_p_choose (line 234) | def test_p_choose(self):
    method test_expected_alignment (line 242) | def test_expected_alignment(self):
  class HardMonotonicAttentionTestCase (line 265) | class HardMonotonicAttentionTestCase(
    method setUp (line 269) | def setUp(self):
  class InfiniteLookbackTestCase (line 275) | class InfiniteLookbackTestCase(
    method setUp (line 279) | def setUp(self):
    method test_fp16_for_long_input (line 289) | def test_fp16_for_long_input(self):
    method test_expected_attention (line 304) | def test_expected_attention(self):
  class ChunkwiswTestCase (line 345) | class ChunkwiswTestCase(
    method setUp (line 348) | def setUp(self):
  class WaitkTestCase (line 359) | class WaitkTestCase(InfiniteLookbackTestCase):
    method setUp (line 360) | def setUp(self):
    method check_waitk (line 370) | def check_waitk(self, p_choose, lagging, padding_mask):
    method test_waitk_p_choose (line 381) | def test_waitk_p_choose(self):

FILE: examples/simultaneous_translation/utils/functions.py
  function prob_check (line 9) | def prob_check(tensor, eps=1e-10):
  function exclusive_cumprod (line 20) | def exclusive_cumprod(tensor, dim: int, eps: float = 1e-10):
  function safe_cumprod (line 48) | def safe_cumprod(tensor, dim: int, eps: float = 1e-10):
  function moving_sum (line 69) | def moving_sum(x, start_idx: int, end_idx: int):

FILE: examples/simultaneous_translation/utils/monotonic_attention.py
  function expected_alignment_from_p_choose (line 12) | def expected_alignment_from_p_choose(
  function expected_soft_attention (line 62) | def expected_soft_attention(
  function mass_preservation (line 137) | def mass_preservation(

FILE: examples/simultaneous_translation/utils/p_choose_strategy.py
  function waitk_p_choose (line 6) | def waitk_p_choose(
  function learnable_p_choose (line 102) | def learnable_p_choose(

FILE: examples/speech_recognition/criterions/ASG_loss.py
  class ASGCriterion (line 15) | class ASGCriterion(FairseqCriterion):
    method add_args (line 17) | def add_args(parser):
    method __init__ (line 40) | def __init__(
    method build_criterion (line 74) | def build_criterion(cls, args, task):
    method linseg_step (line 84) | def linseg_step(self):
    method replace_eos_with_silence (line 98) | def replace_eos_with_silence(self, tgt):
    method forward (line 106) | def forward(self, model, sample, reduce=True):
    method aggregate_logging_outputs (line 158) | def aggregate_logging_outputs(logging_outputs):

FILE: examples/speech_recognition/criterions/cross_entropy_acc.py
  class CrossEntropyWithAccCriterion (line 18) | class CrossEntropyWithAccCriterion(FairseqCriterion):
    method __init__ (line 19) | def __init__(self, task, sentence_avg):
    method compute_loss (line 23) | def compute_loss(self, model, net_output, target, reduction, log_probs):
    method get_logging_output (line 46) | def get_logging_output(self, sample, target, lprobs, loss):
    method forward (line 69) | def forward(self, model, sample, reduction="sum", log_probs=True):
    method aggregate_logging_outputs (line 103) | def aggregate_logging_outputs(logging_outputs):

FILE: examples/speech_recognition/data/asr_dataset.py
  class AsrDataset (line 15) | class AsrDataset(FairseqDataset):
    method __init__ (line 33) | def __init__(
    method __getitem__ (line 74) | def __getitem__(self, index):
    method __len__ (line 94) | def __len__(self):
    method collater (line 97) | def collater(self, samples):
    method num_tokens (line 108) | def num_tokens(self, index):
    method size (line 111) | def size(self, index):
    method ordered_indices (line 119) | def ordered_indices(self):

FILE: examples/speech_recognition/data/collaters.py
  class Seq2SeqCollater (line 21) | class Seq2SeqCollater(object):
    method __init__ (line 29) | def __init__(
    method _collate_frames (line 43) | def _collate_frames(self, frames):
    method collate (line 60) | def collate(self, samples):

FILE: examples/speech_recognition/data/data_utils.py
  function calc_mean_invstddev (line 9) | def calc_mean_invstddev(feature):
  function apply_mv_norm (line 21) | def apply_mv_norm(features):
  function lengths_to_encoder_padding_mask (line 31) | def lengths_to_encoder_padding_mask(lengths, batch_first=False):
  function encoder_padding_mask_to_lengths (line 67) | def encoder_padding_mask_to_lengths(

FILE: examples/speech_recognition/data/replabels.py
  function replabel_symbol (line 13) | def replabel_symbol(i):
  function pack_replabels (line 21) | def pack_replabels(tokens, dictionary, max_reps):
  function unpack_replabels (line 49) | def unpack_replabels(tokens, dictionary, max_reps):

FILE: examples/speech_recognition/datasets/asr_prep_json.py
  function process_sample (line 24) | def process_sample(aud_path, lable, utt_id, sp, tgt_dict):
  function main (line 43) | def main():

FILE: examples/speech_recognition/infer.py
  function add_asr_eval_argument (line 31) | def add_asr_eval_argument(parser):
  function check_args (line 89) | def check_args(args):
  function get_dataset_itr (line 100) | def get_dataset_itr(args, task, models):
  function process_predictions (line 115) | def process_predictions(
  function prepare_result_files (line 158) | def prepare_result_files(args):
  function optimize_models (line 181) | def optimize_models(args, use_cuda, models):
  function apply_half (line 194) | def apply_half(t):
  class ExistingEmissionsDecoder (line 200) | class ExistingEmissionsDecoder(object):
    method __init__ (line 201) | def __init__(self, decoder, emissions):
    method generate (line 205) | def generate(self, models, sample, **unused):
  function main (line 216) | def main(args, task=None, model_state=None):
  function make_parser (line 423) | def make_parser():
  function cli_main (line 429) | def cli_main():

FILE: examples/speech_recognition/kaldi/add-self-loop-simple.cc
  function int32 (line 22) | int32 AddSelfLoopsSimple(fst::StdVectorFst* fst) {
  function print_usage (line 67) | void print_usage() {
  function main (line 73) | int main(int argc, char** argv) {

FILE: examples/speech_recognition/kaldi/kaldi_decoder.py
  class KaldiDecoderConfig (line 26) | class KaldiDecoderConfig(FairseqDataclass):
  class KaldiDecoder (line 50) | class KaldiDecoder(object):
    method __init__ (line 51) | def __init__(
    method generate (line 127) | def generate(self, models, sample, **unused):
    method get_emissions (line 137) | def get_emissions(self, models, encoder_input):
    method decode_one (line 177) | def decode_one(self, logits, padding):
    method decode (line 233) | def decode(self, emissions, padding):

FILE: examples/speech_recognition/kaldi/kaldi_initializer.py
  class KaldiInitializerConfig (line 30) | class KaldiInitializerConfig(FairseqDataclass):
  function create_units (line 42) | def create_units(fst_dir: Path, in_labels: str, vocab: Dictionary) -> Path:
  function create_lexicon (line 58) | def create_lexicon(
  function create_G (line 132) | def create_G(
  function create_L (line 154) | def create_L(
  function create_LG (line 230) | def create_LG(
  function create_H (line 290) | def create_H(
  function create_HLGa (line 430) | def create_HLGa(
  function create_HLa (line 493) | def create_HLa(
  function create_HLG (line 556) | def create_HLG(
  function initalize_kaldi (line 612) | def initalize_kaldi(cfg: KaldiInitializerConfig) -> Path:
  function cli_main (line 673) | def cli_main(cfg: KaldiInitializerConfig) -> None:

FILE: examples/speech_recognition/models/vggtransformer.py
  class VGGTransformerModel (line 31) | class VGGTransformerModel(FairseqEncoderDecoderModel):
    method __init__ (line 37) | def __init__(self, encoder, decoder):
    method add_args (line 41) | def add_args(parser):
    method build_encoder (line 124) | def build_encoder(cls, args, task):
    method build_decoder (line 134) | def build_decoder(cls, args, task):
    method build_model (line 144) | def build_model(cls, args, task):
    method get_normalized_probs (line 154) | def get_normalized_probs(self, net_output, log_probs, sample=None):
  function prepare_transformer_encoder_params (line 176) | def prepare_transformer_encoder_params(
  function prepare_transformer_decoder_params (line 196) | def prepare_transformer_decoder_params(
  class VGGTransformerEncoder (line 217) | class VGGTransformerEncoder(FairseqEncoder):
    method __init__ (line 220) | def __init__(
    method forward (line 328) | def forward(self, src_tokens, src_lengths, **kwargs):
    method infer_conv_output_dim (line 391) | def infer_conv_output_dim(self, in_channels, input_dim):
    method validate_transformer_config (line 401) | def validate_transformer_config(self, transformer_config):
    method parse_transformer_context (line 412) | def parse_transformer_context(self, transformer_context):
    method parse_transformer_sampling (line 446) | def parse_transformer_sampling(self, transformer_sampling, num_layers):
    method slice (line 484) | def slice(self, embedding, padding_mask, attn_mask, sampling_factor):
    method lengths_to_attn_mask (line 498) | def lengths_to_attn_mask(self, input_lengths, subsampling_factor=1):
    method reorder_encoder_out (line 550) | def reorder_encoder_out(self, encoder_out, new_order):
  class TransformerDecoder (line 561) | class TransformerDecoder(FairseqIncrementalDecoder):
    method __init__ (line 575) | def __init__(
    method forward (line 629) | def forward(self, prev_output_tokens, encoder_out=None, incremental_st...
    method buffered_future_mask (line 700) | def buffered_future_mask(self, tensor):
    method _transpose_if_training (line 716) | def _transpose_if_training(self, x, incremental_state):
    method _transpose_if_inference (line 721) | def _transpose_if_inference(self, x, incremental_state):
  class VGGTransformerEncoderModel (line 728) | class VGGTransformerEncoderModel(FairseqEncoderModel):
    method __init__ (line 729) | def __init__(self, encoder):
    method add_args (line 733) | def add_args(parser):
    method build_model (line 794) | def build_model(cls, args, task):
    method get_normalized_probs (line 809) | def get_normalized_probs(self, net_output, log_probs, sample=None):
  class VGGTransformerEncoderOnly (line 819) | class VGGTransformerEncoderOnly(VGGTransformerEncoder):
    method __init__ (line 820) | def __init__(
    method forward (line 842) | def forward(self, src_tokens, src_lengths, **kwargs):
    method max_positions (line 858) | def max_positions(self):
  function Embedding (line 863) | def Embedding(num_embeddings, embedding_dim, padding_idx):
  function Linear (line 870) | def Linear(in_features, out_features, bias=True, dropout=0):
  function LinearizedConv1d (line 879) | def LinearizedConv1d(in_channels, out_channels, kernel_size, dropout=0, ...
  function LayerNorm (line 888) | def LayerNorm(embedding_dim):
  function base_architecture (line 894) | def base_architecture(args):
  function vggtransformer_1 (line 913) | def vggtransformer_1(args):
  function vggtransformer_2 (line 934) | def vggtransformer_2(args):
  function vggtransformer_base (line 955) | def vggtransformer_base(args):
  function base_architecture_enconly (line 992) | def base_architecture_enconly(args):
  function vggtransformer_enc_1 (line 1007) | def vggtransformer_enc_1(args):

FILE: examples/speech_recognition/models/w2l_conv_glu_enc.py
  class W2lConvGluEncoderModel (line 44) | class W2lConvGluEncoderModel(FairseqEncoderModel):
    method __init__ (line 45) | def __init__(self, encoder):
    method add_args (line 49) | def add_args(parser):
    method build_model (line 74) | def build_model(cls, args, task):
    method get_normalized_probs (line 85) | def get_normalized_probs(self, net_output, log_probs, sample=None):
  class W2lConvGluEncoder (line 91) | class W2lConvGluEncoder(FairseqEncoder):
    method __init__ (line 92) | def __init__(
    method forward (line 123) | def forward(self, src_tokens, src_lengths, **kwargs):
    method reorder_encoder_out (line 159) | def reorder_encoder_out(self, encoder_out, new_order):
    method max_positions (line 168) | def max_positions(self):
  function w2l_conv_glu_enc (line 174) | def w2l_conv_glu_enc(args):

FILE: examples/speech_recognition/new/decoders/base_decoder.py
  class BaseDecoder (line 14) | class BaseDecoder:
    method __init__ (line 15) | def __init__(self, tgt_dict: Dictionary) -> None:
    method generate (line 31) | def generate(
    method get_emissions (line 40) | def get_emissions(
    method get_tokens (line 53) | def get_tokens(self, idxs: torch.IntTensor) -> torch.LongTensor:
    method decode (line 58) | def decode(

FILE: examples/speech_recognition/new/decoders/decoder.py
  function Decoder (line 16) | def Decoder(

FILE: examples/speech_recognition/new/decoders/decoder_config.py
  class DecoderConfig (line 19) | class DecoderConfig(FairseqDataclass):
  class FlashlightDecoderConfig (line 27) | class FlashlightDecoderConfig(FairseqDataclass):

FILE: examples/speech_recognition/new/decoders/flashlight_decoder.py
  class KenLMDecoder (line 54) | class KenLMDecoder(BaseDecoder):
    method __init__ (line 55) | def __init__(self, cfg: FlashlightDecoderConfig, tgt_dict: Dictionary)...
    method get_timesteps (line 123) | def get_timesteps(self, token_idxs: List[int]) -> List[int]:
    method decode (line 144) | def decode(
  class FairseqLM (line 181) | class FairseqLM(LM):
    method __init__ (line 182) | def __init__(self, dictionary: Dictionary, model: FairseqModel) -> None:
    method start (line 200) | def start(self, start_with_nothing: bool) -> LMState:
    method score (line 217) | def score(
    method finish (line 297) | def finish(self, state: LMState) -> Tuple[LMState, int]:
    method empty_cache (line 306) | def empty_cache(self) -> None:
  class FairseqLMDecoder (line 312) | class FairseqLMDecoder(BaseDecoder):
    method __init__ (line 313) | def __init__(self, cfg: FlashlightDecoderConfig, tgt_dict: Dictionary)...
    method decode (line 405) | def decode(

FILE: examples/speech_recognition/new/decoders/viterbi_decoder.py
  class ViterbiDecoder (line 15) | class ViterbiDecoder(BaseDecoder):
    method decode (line 16) | def decode(

FILE: examples/speech_recognition/new/infer.py
  class DecodingConfig (line 52) | class DecodingConfig(DecoderConfig, FlashlightDecoderConfig):
  class InferConfig (line 66) | class InferConfig(FairseqDataclass):
  function reset_logging (line 82) | def reset_logging():
  class InferenceProcessor (line 97) | class InferenceProcessor:
    method __init__ (line 100) | def __init__(self, cfg: InferConfig) -> None:
    method __enter__ (line 151) | def __enter__(self) -> "InferenceProcessor":
    method __exit__ (line 160) | def __exit__(self, *exc) -> bool:
    method __iter__ (line 169) | def __iter__(self) -> Any:
    method log (line 179) | def log(self, *args, **kwargs):
    method print (line 182) | def print(self, *args, **kwargs):
    method get_res_file (line 185) | def get_res_file(self, fname: str) -> None:
    method merge_shards (line 191) | def merge_shards(self) -> None:
    method optimize_model (line 223) | def optimize_model(self, model: FairseqModel) -> None:
    method load_model_ensemble (line 230) | def load_model_ensemble(self) -> Tuple[List[FairseqModel], FairseqData...
    method get_dataset_itr (line 244) | def get_dataset_itr(self, disable_iterator_cache: bool = False) -> None:
    method build_progress_bar (line 260) | def build_progress_bar(
    method data_parallel_world_size (line 277) | def data_parallel_world_size(self):
    method data_parallel_rank (line 283) | def data_parallel_rank(self):
    method process_sentence (line 288) | def process_sentence(
    method process_sample (line 330) | def process_sample(self, sample: Dict[str, Any]) -> None:
    method log_generation_time (line 357) | def log_generation_time(self) -> None:
  function parse_wer (line 369) | def parse_wer(wer_file: Path) -> float:
  function get_wer_file (line 374) | def get_wer_file(cfg: InferConfig) -> Path:
  function main (line 388) | def main(cfg: InferConfig) -> float:
  function hydra_main (line 444) | def hydra_main(cfg: InferConfig) -> Union[float, Tuple[float, Optional[f...
  function cli_main (line 479) | def cli_main() -> None:

FILE: examples/speech_recognition/tasks/speech_recognition.py
  function get_asr_dataset_from_json (line 18) | def get_asr_dataset_from_json(data_json_path, tgt_dict):
  class SpeechRecognitionTask (line 69) | class SpeechRecognitionTask(LegacyFairseqTask):
    method add_args (line 75) | def add_args(parser):
    method __init__ (line 96) | def __init__(self, args, tgt_dict):
    method setup_task (line 101) | def setup_task(cls, args, **kwargs):
    method load_dataset (line 117) | def load_dataset(self, split, combine=False, **kwargs):
    method build_generator (line 126) | def build_generator(self, models, args, **unused):
    method target_dictionary (line 144) | def target_dictionary(self):
    method source_dictionary (line 150) | def source_dictionary(self):
    method max_positions (line 155) | def max_positions(self):

FILE: examples/speech_recognition/utils/wer_utils.py
  class Code (line 24) | class Code(Enum):
  class Token (line 31) | class Token(object):
    method __init__ (line 32) | def __init__(self, lbl="", st=np.nan, en=np.nan):
  class AlignmentResult (line 39) | class AlignmentResult(object):
    method __init__ (line 40) | def __init__(self, refs, hyps, codes, score):
  function coordinate_to_offset (line 47) | def coordinate_to_offset(row, col, ncols):
  function offset_to_row (line 51) | def offset_to_row(offset, ncols):
  function offset_to_col (line 55) | def offset_to_col(offset, ncols):
  function trimWhitespace (line 59) | def trimWhitespace(str):
  function str2toks (line 63) | def str2toks(str):
  class EditDistance (line 71) | class EditDistance(object):
    method __init__ (line 72) | def __init__(self, time_mediated):
    method cost (line 80) | def cost(self, ref, hyp, code):
    method get_result (line 98) | def get_result(self, refs, hyps):
    method align (line 141) | def align(self, refs, hyps):
  class WERTransformer (line 205) | class WERTransformer(object):
    method __init__ (line 206) | def __init__(self, hyp_str, ref_str, verbose=True):
    method process (line 221) | def process(self, input):  # std::vector<std::string>&& input
    method report_result (line 294) | def report_result(self):
    method wer (line 320) | def wer(self):
    method stats (line 331) | def stats(self):
  function calc_wer (line 354) | def calc_wer(hyp_str, ref_str):
  function calc_wer_stats (line 359) | def calc_wer_stats(hyp_str, ref_str):
  function get_wer_alignment_codes (line 364) | def get_wer_alignment_codes(hyp_str, ref_str):
  function merge_counts (line 373) | def merge_counts(x, y):

FILE: examples/speech_recognition/w2l_decoder.py
  class W2lDecoder (line 49) | class W2lDecoder(object):
    method __init__ (line 50) | def __init__(self, args, tgt_dict):
    method generate (line 70) | def generate(self, models, sample, **unused):
    method get_emissions (line 80) | def get_emissions(self, models, encoder_input):
    method get_tokens (line 90) | def get_tokens(self, idxs):
  class W2lViterbiDecoder (line 97) | class W2lViterbiDecoder(W2lDecoder):
    method __init__ (line 98) | def __init__(self, args, tgt_dict):
    method decode (line 101) | def decode(self, emissions):
  class W2lKenLMDecoder (line 125) | class W2lKenLMDecoder(W2lDecoder):
    method __init__ (line 126) | def __init__(self, args, tgt_dict):
    method get_timesteps (line 198) | def get_timesteps(self, token_idxs: List[int]) -> List[int]:
    method decode (line 219) | def decode(self, emissions):
  class FairseqLM (line 246) | class FairseqLM(LM):
    method __init__ (line 247) | def __init__(self, dictionary, model):
    method start (line 263) | def start(self, start_with_nothing):
    method score (line 280) | def score(self, state: LMState, token_index: int, no_cache: bool = Fal...
    method finish (line 356) | def finish(self, state: LMState):
    method empty_cache (line 366) | def empty_cache(self):
  class W2lFairseqLMDecoder (line 372) | class W2lFairseqLMDecoder(W2lDecoder):
    method __init__ (line 373) | def __init__(self, args, tgt_dict):
    method decode (line 462) | def decode(self, emissions):

FILE: examples/speech_synthesis/data_utils.py
  function trim_or_pad_to_target_length (line 26) | def trim_or_pad_to_target_length(
  function extract_logmel_spectrogram (line 46) | def extract_logmel_spectrogram(
  function extract_pitch (line 79) | def extract_pitch(
  function extract_energy (line 137) | def extract_energy(
  function get_global_cmvn (line 190) | def get_global_cmvn(feature_root: Path, output_path: Optional[Path] = No...
  function ipa_phonemize (line 223) | def ipa_phonemize(text, lang="en-us", use_g2p=False):
  class ForceAlignmentInfo (line 249) | class ForceAlignmentInfo(object):
  function get_mfa_alignment_by_sample_id (line 256) | def get_mfa_alignment_by_sample_id(
  function get_mfa_alignment (line 299) | def get_mfa_alignment(
  function get_unit_alignment (line 310) | def get_unit_alignment(
  function get_feature_value_min_max (line 333) | def get_feature_value_min_max(feature_paths: List[str]):

FILE: examples/speech_synthesis/evaluation/eval_asr.py
  function preprocess_text (line 17) | def preprocess_text(text):
  function prepare_w2v_data (line 23) | def prepare_w2v_data(
  function run_asr (line 44) | def run_asr(asr_dir, split, w2v_ckpt, w2v_label, res_dir):
  function compute_error_rate (line 61) | def compute_error_rate(hyp_wrd_path, ref_wrd_path, unit="word"):
  function main (line 82) | def main(args):

FILE: examples/speech_synthesis/evaluation/eval_f0.py
  function difference_function (line 22) | def difference_function(x, n, tau_max):
  function cumulative_mean_normalized_difference_function (line 49) | def cumulative_mean_normalized_difference_function(df, n):
  function get_pitch (line 64) | def get_pitch(cmdf, tau_min, tau_max, harmo_th=0.1):
  function compute_yin (line 87) | def compute_yin(sig, sr, w_len=512, w_step=256, f0_min=100, f0_max=500,
  function extract_f0 (line 145) | def extract_f0(samples):
  function eval_f0_error (line 171) | def eval_f0_error(samples, distortion_fn):
  function eval_gross_pitch_error (line 194) | def eval_gross_pitch_error(samples):
  function eval_voicing_decision_error (line 198) | def eval_voicing_decision_error(samples):
  function eval_f0_frame_error (line 202) | def eval_f0_frame_error(samples):
  function print_results (line 206) | def print_results(results, show_bin):
  function main (line 236) | def main(eval_f0, gpe, vde, ffe, show_bin):

FILE: examples/speech_synthesis/evaluation/eval_sp.py
  function load_eval_spec (line 24) | def load_eval_spec(path):
  function eval_distortion (line 31) | def eval_distortion(samples, distortion_fn, device="cuda"):
  function eval_mel_cepstral_distortion (line 61) | def eval_mel_cepstral_distortion(samples, device="cuda"):
  function eval_mel_spectral_distortion (line 65) | def eval_mel_spectral_distortion(samples, device="cuda"):
  function print_results (line 69) | def print_results(results, show_bin):
  function main (line 109) | def main(eval_spec, mcd, msd, show_bin):

FILE: examples/speech_synthesis/evaluation/get_eval_manifest.py
  function main (line 11) | def main(args):

FILE: examples/speech_synthesis/generate_waveform.py
  function make_parser (line 28) | def make_parser():
  function postprocess_results (line 44) | def postprocess_results(
  function dump_result (line 67) | def dump_result(
  function main (line 126) | def main(args):
  function cli_main (line 185) | def cli_main():

FILE: examples/speech_synthesis/preprocessing/denoise_and_vad_audio.py
  function generate_tmp_filename (line 37) | def generate_tmp_filename(extension="txt"):
  function convert_sr (line 42) | def convert_sr(inpath, sr, output_path=None):
  function apply_vad (line 50) | def apply_vad(vad, inpath):
  function write (line 73) | def write(wav, filename, sr=16_000):
  function process (line 80) | def process(args):
  function main (line 174) | def main():

FILE: examples/speech_synthesis/preprocessing/denoiser/demucs.py
  class BLSTM (line 19) | class BLSTM(nn.Module):
    method __init__ (line 20) | def __init__(self, dim, layers=2, bi=True):
    method forward (line 30) | def forward(self, x, hidden=None):
  function rescale_conv (line 37) | def rescale_conv(conv, reference):
  function rescale_module (line 45) | def rescale_module(module, reference):
  class Demucs (line 51) | class Demucs(nn.Module):
    method __init__ (line 75) | def __init__(self,
    method valid_length (line 136) | def valid_length(self, length):
    method total_stride (line 155) | def total_stride(self):
    method forward (line 158) | def forward(self, mix):
  function fast_conv (line 197) | def fast_conv(conv, x):
  class DemucsStreamer (line 218) | class DemucsStreamer:
    method __init__ (line 235) | def __init__(self, demucs,
    method reset_time_per_frame (line 268) | def reset_time_per_frame(self):
    method time_per_frame (line 273) | def time_per_frame(self):
    method flush (line 276) | def flush(self):
    method feed (line 288) | def feed(self, wav):
    method _separate_frame (line 355) | def _separate_frame(self, frame):
  function test (line 426) | def test():

FILE: examples/speech_synthesis/preprocessing/denoiser/pretrained.py
  function _demucs (line 22) | def _demucs(pretrained, url, **kwargs):
  function dns48 (line 30) | def dns48(pretrained=True):
  function dns64 (line 34) | def dns64(pretrained=True):
  function master64 (line 38) | def master64(pretrained=True):
  function add_model_flags (line 42) | def add_model_flags(parser):
  function get_model (line 61) | def get_model(args):

FILE: examples/speech_synthesis/preprocessing/denoiser/resample.py
  function sinc (line 14) | def sinc(t):
  function kernel_upsample2 (line 23) | def kernel_upsample2(zeros=56):
  function upsample2 (line 35) | def upsample2(x, zeros=56):
  function kernel_downsample2 (line 51) | def kernel_downsample2(zeros=56):
  function downsample2 (line 63) | def downsample2(x, zeros=56):

FILE: examples/speech_synthesis/preprocessing/denoiser/utils.py
  function capture_init (line 19) | def capture_init(init):
  function deserialize_model (line 33) | def deserialize_model(package, strict=False):
  function copy_state (line 52) | def copy_state(state):
  function serialize_model (line 56) | def serialize_model(model):
  function swap_state (line 63) | def swap_state(model, state):
  function pull_metric (line 80) | def pull_metric(history, name):
  class LogProgress (line 88) | class LogProgress:
    method __init__ (line 101) | def __init__(self,
    method update (line 115) | def update(self, **infos):
    method __iter__ (line 118) | def __iter__(self):
    method __next__ (line 125) | def __next__(self):
    method _log (line 139) | def _log(self):
  function colorize (line 154) | def colorize(text, color):
  function bold (line 163) | def bold(text):
  function cal_snr (line 170) | def cal_snr(lbl, est):

FILE: examples/speech_synthesis/preprocessing/get_common_voice_audio_manifest.py
  function get_top_n (line 25) | def get_top_n(
  function get_splits (line 45) | def get_splits(
  function convert_to_wav (line 77) | def convert_to_wav(root: Path, filenames: List[str], target_sr=16_000):
  function process (line 92) | def process(args):
  function main (line 128) | def main():

FILE: examples/speech_synthesis/preprocessing/get_feature_manifest.py
  function process (line 36) | def process(args):
  function main (line 233) | def main():

FILE: examples/speech_synthesis/preprocessing/get_ljspeech_audio_manifest.py
  function process (line 23) | def process(args):
  function main (line 60) | def main():

FILE: examples/speech_synthesis/preprocessing/get_speaker_embedding.py
  function extract_embedding (line 22) | def extract_embedding(audio_path, embedder):
  function process (line 35) | def process(args):
  function main (line 74) | def main():

FILE: examples/speech_synthesis/preprocessing/get_vctk_audio_manifest.py
  function normalize_text (line 25) | def normalize_text(text):
  function process (line 29) | def process(args):
  function main (line 66) | def main():

FILE: examples/speech_synthesis/preprocessing/speaker_embedder/__init__.py
  function set_requires_grad (line 26) | def set_requires_grad(nets, requires_grad=False):
  class LinearNorm (line 41) | class LinearNorm(nn.Module):
    method __init__ (line 42) | def __init__(self, hp):
    method forward (line 46) | def forward(self, x):
  class SpeechEmbedder (line 50) | class SpeechEmbedder(nn.Module):
    method __init__ (line 51) | def __init__(self, hp):
    method forward (line 60) | def forward(self, mel):
  class SpkrEmbedder (line 75) | class SpkrEmbedder(nn.Module):
    method __init__ (line 78) | def __init__(
    method get_mel (line 110) | def get_mel(self, y):
    method forward (line 123) | def forward(self, inputs):

FILE: examples/speech_synthesis/preprocessing/vad/__init__.py
  function read_wave (line 26) | def read_wave(path):
  function write_wave (line 41) | def write_wave(path, audio, sample_rate):
  class Frame (line 52) | class Frame(object):
    method __init__ (line 54) | def __init__(self, bytes, timestamp, duration):
  function frame_generator (line 60) | def frame_generator(frame_duration_ms, audio, sample_rate):
  function vad_collector (line 76) | def vad_collector(sample_rate, frame_duration_ms,
  function main (line 145) | def main(args):

FILE: examples/speech_synthesis/utils.py
  function batch_mel_spectral_distortion (line 16) | def batch_mel_spectral_distortion(
  function _same_t_in_true_and_est (line 39) | def _same_t_in_true_and_est(func):
  function gross_pitch_error (line 55) | def gross_pitch_error(true_t, true_f, est_t, est_f):
  function _gross_pitch_error_frames (line 69) | def _gross_pitch_error_frames(true_t, true_f, est_t, est_f, eps=1e-8):
  function _true_voiced_frames (line 76) | def _true_voiced_frames(true_t, true_f, est_t, est_f):
  function _voicing_decision_error_frames (line 80) | def _voicing_decision_error_frames(true_t, true_f, est_t, est_f):
  function f0_frame_error (line 85) | def f0_frame_error(true_t, true_f, est_t, est_f):
  function voicing_decision_error (line 97) | def voicing_decision_error(true_t, true_f, est_t, est_f):

FILE: examples/speech_text_joint_to_text/criterions/multi_modality_compound.py
  class SpeechTextPreTrainCompoundCriterionConfig (line 23) | class SpeechTextPreTrainCompoundCriterionConfig(
  class SpeechTextPreTrainCompoundCriterion (line 43) | class SpeechTextPreTrainCompoundCriterion(FairseqCriterion):
    method __init__ (line 44) | def __init__(
    method forward (line 65) | def forward(self, model, sample, reduce=True):
    method logging_outputs_can_be_summed (line 84) | def logging_outputs_can_be_summed() -> bool:
    method mode2value (line 93) | def mode2value(mode):  # make the logging_outputs_can_be_summed = True
    method value2mode (line 101) | def value2mode(value):
    method reduce_metrics (line 109) | def reduce_metrics(logging_outputs) -> None:

FILE: examples/speech_text_joint_to_text/criterions/multi_modality_cross_entropy.py
  class SpeechTextPreTrainCrossEntCriterion (line 20) | class SpeechTextPreTrainCrossEntCriterion(LabelSmoothedCrossEntropyCrite...
    method __init__ (line 21) | def __init__(self, task, sentence_avg, label_smoothing, report_accurac...
    method forward (line 26) | def forward(self, model, sample, reduce=True):
    method get_lprobs_and_target (line 44) | def get_lprobs_and_target(self, model, net_output, sample):
    method compute_loss (line 57) | def compute_loss(self, model, net_output, sample, reduce=True):

FILE: examples/speech_text_joint_to_text/criterions/text_guide_cross_entropy_acc.py
  class GuidedCrossEntAccCriterion (line 16) | class GuidedCrossEntAccCriterion(FairseqCriterion):
    method __init__ (line 17) | def __init__(
    method add_args (line 44) | def add_args(parser):
    method forward (line 60) | def forward(self, model, sample, reduce=True):
    method compute_loss_and_acc (line 103) | def compute_loss_and_acc(self, model, lprobs, target, reduction='sum'):
    method guide_loss_and_acc (line 117) | def guide_loss_and_acc(self, model, lprobs, lprobs_teacher, target, re...
    method get_logging_output (line 140) | def get_logging_output(
    method aggregate_logging_outputs (line 180) | def aggregate_logging_outputs(logging_outputs):
    method reduce_metrics (line 218) | def reduce_metrics(cls, logging_outputs):

FILE: examples/speech_text_joint_to_text/data/pair_denoising_dataset.py
  class LanguagePairDenoisingDataset (line 18) | class LanguagePairDenoisingDataset(LanguagePairDataset):
    method __init__ (line 19) | def __init__(
    method can_reuse_epoch_itr_across_epochs (line 123) | def can_reuse_epoch_itr_across_epochs(self):
    method set_epoch (line 126) | def set_epoch(self, epoch, **unused):
    method __getitem__ (line 129) | def __getitem__(self, index):
    method word_starts (line 177) | def word_starts(self, source):
    method add_whole_word_mask (line 186) | def add_whole_word_mask(self, source, p):
    method add_insertion_noise (line 297) | def add_insertion_noise(self, tokens, p):

FILE: examples/speech_text_joint_to_text/models/joint_speech_text_pretrain_transformer.py
  class SpeechTextPreTrainEncoder (line 35) | class SpeechTextPreTrainEncoder(MultiModalityEncoder):
    method __init__ (line 36) | def __init__(
    method update_transformer_encoder_cfg (line 51) | def update_transformer_encoder_cfg(cls, args, update_dict):
    method build_text_encoder (line 60) | def build_text_encoder(cls, args, src_dictionary):
    method build_speech_encoder (line 71) | def build_speech_encoder(cls, args):
    method share_layers (line 83) | def share_layers(cls, src_layers, tgt_layers):  # share layer but not ...
    method build_unsup_speech_encoder (line 101) | def build_unsup_speech_encoder(cls, args, sup_speech_encoder):
    method build_encoder (line 130) | def build_encoder(cls, args, dictionary):
    method share_speech_text_encoder (line 185) | def share_speech_text_encoder(
    method select_encoder (line 202) | def select_encoder(self, mode, **kwargs):
    method forward (line 217) | def forward(self, src_tokens, src_lengths=None, mode="", alignment=Non...
  class SpeechDummyDecoder (line 222) | class SpeechDummyDecoder(FairseqDecoder):
    method __init__ (line 223) | def __init__(
    method extend_alignment (line 238) | def extend_alignment(self, alignment, src_lengths, prev_output_tokens):
    method forward (line 261) | def forward(
    method get_normalized_probs (line 353) | def get_normalized_probs(
  class SpeechTextPreTrainDecoder (line 364) | class SpeechTextPreTrainDecoder(MultiInputDecoder):
    method __init__ (line 365) | def __init__(self, dictionary, speech_decoder, text_decoder):
    method select_decoder (line 370) | def select_decoder(self, mode, **kwargs):
    method get_normalized_probs (line 387) | def get_normalized_probs(
    method build_text_decoder (line 401) | def build_text_decoder(cls, args, tgt_dictionary, dec_emb_share=None):
    method build_dummy_speech_decoder (line 413) | def build_dummy_speech_decoder(cls, args, dictionary, dec_emb_share=No...
    method build_decoder (line 428) | def build_decoder(
  class SpeechTextPreTrainModel (line 444) | class SpeechTextPreTrainModel(FairseqEncoderDecoderModel):
    method __init__ (line 445) | def __init__(self, encoder, decoder):
    method forward (line 449) | def forward(
    method max_positions (line 463) | def max_positions(self):
    method get_targets (line 466) | def get_targets(self, sample, net_output):
    method get_normalized_probs (line 474) | def get_normalized_probs(
    method add_args (line 486) | def add_args(parser):
    method build_model (line 550) | def build_model(cls, args, task):
    method upgrade_state_dict (line 558) | def upgrade_state_dict(self, state_dict):
  function speech_text_pretrain_bart_base (line 568) | def speech_text_pretrain_bart_base(args):
  function speech_text_pretrain_bart_base_stack (line 657) | def speech_text_pretrain_bart_base_stack(args):
  function speech_text_pretrain_bart_large (line 671) | def speech_text_pretrain_bart_large(args):
  function speech_text_pretrain_bart_large_stack (line 687) | def speech_text_pretrain_bart_large_stack(args):

FILE: examples/speech_text_joint_to_text/models/s2t_dualinputtransformer.py
  class SpeechEoSEncoder (line 35) | class SpeechEoSEncoder(FairseqEncoder):
    method __init__ (line 36) | def __init__(self, encoder, eos_num, feat_dim, adapter_type="None", ad...
    method add_adapter (line 47) | def add_adapter(self, adapter_type, adapter_dim):
    method add_eos (line 77) | def add_eos(self, src_tokens, src_lengths):
    method apply_adapter (line 94) | def apply_adapter(self, enc_out):
    method forward (line 111) | def forward(self, src_tokens, src_lengths=None, return_all_hiddens=Fal...
    method reorder_encoder_out (line 121) | def reorder_encoder_out(self, encoder_out, new_order):
  class DualInputEncoder (line 125) | class DualInputEncoder(FairseqEncoder):
    method __init__ (line 126) | def __init__(
    method set_shared_layer (line 148) | def set_shared_layer(cls, share_level, src_layer, tgt_layer):
    method build_spch_encoder (line 185) | def build_spch_encoder(cls, args):
    method build_text_encoder (line 221) | def build_text_encoder(cls, args, src_dictionary, spch_encoder):
    method mult_rst_grad (line 278) | def mult_rst_grad(self, rst, ratio):
    method process_attentive_loss_states (line 284) | def process_attentive_loss_states(self, rst, interstates):
    method forward (line 289) | def forward(
    method reorder_encoder_out (line 375) | def reorder_encoder_out(self, encoder_out, new_order):
  class TransformerMultiInputDecoder (line 381) | class TransformerMultiInputDecoder(FairseqDecoder):
    method __init__ (line 382) | def __init__(
    method share_spchdecoder (line 400) | def share_spchdecoder(cls, task_args, text_decoder, spch_decoder):
    method cross_attentive_loss (line 439) | def cross_attentive_loss(
    method forward (line 484) | def forward(
  class DualInputS2TTransformerModel (line 557) | class DualInputS2TTransformerModel(FairseqEncoderDecoderModel):
    method __init__ (line 558) | def __init__(self, encoder, decoder):
    method max_positions (line 562) | def max_positions(self):
    method add_args (line 566) | def add_args(parser):
    method build_encoder (line 783) | def build_encoder(cls, args, task):
    method build_decoder (line 830) | def build_decoder(cls, args, task):
    method build_model (line 907) | def build_model(cls, args, task):
    method get_normalized_probs (line 917) | def get_normalized_probs(self, net_output, log_probs, sample=None):
    method set_num_updates (line 923) | def set_num_updates(self, num_updates):
    method forward (line 928) | def forward(
  function dualinputs2ttransformer_base (line 990) | def dualinputs2ttransformer_base(args):
  function dualinputs2ttransformer_s (line 1045) | def dualinputs2ttransformer_s(args):
  function dualinputs2ttransformer_m (line 1058) | def dualinputs2ttransformer_m(args):
  function dualinputs2ttransformer_b (line 1071) | def dualinputs2ttransformer_b(args):
  function dualinputs2ttransformer_l (line 1084) | def dualinputs2ttransformer_l(args):

FILE: examples/speech_text_joint_to_text/models/s2t_dualinputwavtransformer.py
  class DualInputWavTransformerModel (line 32) | class DualInputWavTransformerModel(DualInputS2TTransformerModel):
    method __init__ (line 33) | def __init__(self, encoder, decoder):
    method add_args (line 37) | def add_args(parser):
    method update_transformer_encoder_cfg (line 223) | def update_transformer_encoder_cfg(cls, args, update_dict):
    method build_text_encoder (line 232) | def build_text_encoder(cls, args, src_dictionary):
    method build_speech_encoder (line 247) | def build_speech_encoder(cls, args):
    method check_args (line 255) | def check_args(cls, condition, is_strict, msg):
    method build_encoder (line 263) | def build_encoder(cls, args, task):
    method build_text_decoder (line 330) | def build_text_decoder(cls, args, tgt_dictionary, dec_emb_share=None):
    method build_decoder (line 342) | def build_decoder(cls, args, task):
    method load_pretrained_speech_text_components (line 381) | def load_pretrained_speech_text_components(cls, checkpoint, component_...
    method share_speech_text_encoder (line 398) | def share_speech_text_encoder(
  function dualinputs2twavtransformer_base (line 419) | def dualinputs2twavtransformer_base(args):
  function dualinputs2twavtransformer_base_stack (line 502) | def dualinputs2twavtransformer_base_stack(args):
  function dualinputs2twavtransformer_large (line 517) | def dualinputs2twavtransformer_large(args):

FILE: examples/speech_text_joint_to_text/models/s2t_dualinputxmtransformer.py
  class TransformerSentenceEncoderLayerStd (line 34) | class TransformerSentenceEncoderLayerStd(TransformerSentenceEncoderLayer):
    method __init__ (line 35) | def __init__(self, sent_enc_layer):
    method forward (line 59) | def forward(
  class SharedEncoder (line 74) | class SharedEncoder(FairseqEncoder):
    method __init__ (line 75) | def __init__(self, wav2vec_enc, mbart_enc, adaptor, shared_layers):
    method forward (line 96) | def forward(self, src_tokens, src_lengths=None, **kwargs):
  class StackedWav2VecEncoderWithAdaptor (line 127) | class StackedWav2VecEncoderWithAdaptor(FairseqEncoder):
    method __init__ (line 128) | def __init__(
    method forward (line 146) | def forward(self, src_tokens, src_lengths=None, return_all_hiddens=Fal...
    method reorder_encoder_out (line 177) | def reorder_encoder_out(self, encoder_out, new_order):
  class DualInputXMTransformerModel (line 221) | class DualInputXMTransformerModel(DualInputS2TTransformerModel):
    method __init__ (line 222) | def __init__(self, encoder, decoder):
    method add_args (line 226) | def add_args(parser):
    method build_encoder (line 401) | def build_encoder(cls, args, task):
    method build_decoder (line 473) | def build_decoder(cls, args, task):
    method build_model (line 522) | def build_model(cls, args, task):
  function dualinputxmtransformer_base (line 534) | def dualinputxmtransformer_base(args):

FILE: examples/speech_text_joint_to_text/scripts/convert_model.py
  function is_update (line 16) | def is_update(param_name, module_name):
  function load_checkpoint (line 22) | def load_checkpoint(src_cpt):
  function save_checkpoint (line 35) | def save_checkpoint(tgt_cpt, states):
  function main (line 45) | def main():

FILE: examples/speech_text_joint_to_text/scripts/g2p_encode.py
  function parse (line 19) | def parse():
  function process_sent (line 42) | def process_sent(sent, g2p, res_wrds, args):
  function remove_punc (line 59) | def remove_punc(sent):
  function do_g2p (line 70) | def do_g2p(g2p, sent, res_wrds, is_first_sent):
  function pre_process_sent (line 80) | def pre_process_sent(sent, do_filter, lower_case, res_wrds):
  function dup_pho (line 95) | def dup_pho(sent, dup_v_num, dup_c_num):
  function add_word_start (line 114) | def add_word_start(sent):
  function load_reserve_word (line 129) | def load_reserve_word(reserve_word):
  function process_sents (line 139) | def process_sents(sents, args):
  function main (line 154) | def main():

FILE: examples/speech_text_joint_to_text/tasks/pair_denoising.py
  function gen_whole_word_mask (line 35) | def gen_whole_word_mask(args, dictionary):
  class PairedDenoisingTask (line 59) | class PairedDenoisingTask(TranslationTask):
    method add_args (line 64) | def add_args(parser):
    method setup_task (line 138) | def setup_task(cls, args, **kwargs):
    method __init__ (line 173) | def __init__(self, args, src_dict, tgt_dict):
    method language_pair_denoising_dataset (line 183) | def language_pair_denoising_dataset(
    method _get_sample_prob (line 302) | def _get_sample_prob(self, dataset_lens):
    method resample_datasets (line 312) | def resample_datasets(self, lang_datasets, lang_pairs_all, epoch):
    method load_dataset_only (line 352) | def load_dataset_only(
    method load_dataset (line 444) | def load_dataset(self, split, epoch=1, combine=False, **kwargs):

FILE: examples/speech_text_joint_to_text/tasks/speech_text_denoise_pretrain.py
  class SpeechTextJointDenoisingPreTask (line 30) | class SpeechTextJointDenoisingPreTask(PairedDenoisingTask):
    method add_args (line 38) | def add_args(cls, parser):
    method setup_task (line 192) | def setup_task(cls, args, **kwargs):
    method __init__ (line 231) | def __init__(self, args, src_dict, tgt_dict):
    method build_model (line 277) | def build_model(self, args):
    method build_tokenizer (line 282) | def build_tokenizer(self, data_cfg, msg=""):
    method build_bpe (line 286) | def build_bpe(self, data_cfg, msg=""):
    method resolve_data_type (line 291) | def resolve_data_type(cls, split, use_sup_speech_ctc):
    method create_modalitydatasetitem (line 311) | def create_modalitydatasetitem(self, dtype, dataset):
    method load_dataset (line 337) | def load_dataset(self, split, epoch=1, combine=False, **kwargs):
    method get_sample_ratio (line 544) | def get_sample_ratio(self, epoch):
    method get_batch_iterator (line 570) | def get_batch_iterator(

FILE: examples/speech_text_joint_to_text/tasks/speech_text_joint.py
  class SpeechTextJointToTextTask (line 41) | class SpeechTextJointToTextTask(SpeechToTextTask):
    method add_args (line 47) | def add_args(cls, parser):
    method __init__ (line 122) | def __init__(self, args, src_dict, tgt_dict, infer_tgt_lang_id=None):
    method setup_task (line 132) | def setup_task(cls, args, **kwargs):
    method load_langpair_dataset (line 164) | def load_langpair_dataset(
    method inference_step (line 212) | def inference_step(
    method build_src_tokenizer (line 224) | def build_src_tokenizer(self, args):
    method build_src_bpe (line 228) | def build_src_bpe(self, args):
    method load_dataset (line 232) | def load_dataset(self, split, epoch=1, combine=False, **kwargs):
    method target_dictionary (line 302) | def target_dictionary(self):
    method source_dictionary (line 308) | def source_dictionary(self):
    method get_batch_iterator (line 313) | def get_batch_iterator(

FILE: examples/speech_to_speech/asr_bleu/compute_asr_bleu.py
  function merge_tailo_init_final (line 12) | def merge_tailo_init_final(text):
  function remove_tone (line 31) | def remove_tone(text):
  function extract_audio_for_eval (line 38) | def extract_audio_for_eval(audio_dirpath: str, audio_format: str):
  function extract_text_for_eval (line 70) | def extract_text_for_eval(
  function compose_eval_data (line 86) | def compose_eval_data(
  function load_eval_data_from_tsv (line 117) | def load_eval_data_from_tsv(eval_data_filepath: str):
  function run_asr_bleu (line 126) | def run_asr_bleu(args):
  function main (line 164) | def main():

FILE: examples/speech_to_speech/asr_bleu/utils.py
  class DownloadProgressBar (line 18) | class DownloadProgressBar(tqdm):
    method update_to (line 21) | def update_to(self, b=1, bsize=1, tsize=None) -> None:
  function retrieve_asr_config (line 30) | def retrieve_asr_config(lang_key: str, asr_version: str, json_path: str)...
  class ASRGenerator (line 47) | class ASRGenerator(object):
    method __init__ (line 50) | def __init__(
    method prepare_hf_model (line 110) | def prepare_hf_model(self, model_cfg: dict) -> None:
    method prepare_fairseq_model (line 155) | def prepare_fairseq_model(self, model_cfg: dict) -> None:
    method load_audiofile (line 220) | def load_audiofile(self, audio_path: str) -> torch.Tensor:
    method compute_emissions (line 247) | def compute_emissions(self, audio_input: torch.Tensor) -> torch.Tensor:
    method decode_emissions (line 270) | def decode_emissions(self, emissions: torch.Tensor) -> str:
    method transcribe_audiofile (line 290) | def transcribe_audiofile(self, audio_path: str, lower=True) -> str:

FILE: examples/speech_to_speech/benchmarking/core.py
  class BenchmarkingBase (line 2

Condensed preview: 1626 files, each showing path, character count, and a content snippet (10,093K chars in the full structured content).

[
  {
    "path": ".github/CODEOWNERS",
    "chars": 932,
    "preview": "# Setting up CODEOWNERS for UST related codebase\n# Documentation for open sourced models relevant to UST\nexamples/speech"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 1059,
    "preview": "---\nname: 🐛 Bug Report\nabout: Submit a bug report to help us improve\nlabels: 'bug, needs triage'\n---\n\n## 🐛 Bug\n\n<!-- A c"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/documentation.md",
    "chars": 262,
    "preview": "---\nname: 📚 Documentation/Typos\nabout: Report an issue related to documentation or a typo\nlabels: 'documentation, needs "
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 756,
    "preview": "---\nname: 🚀 Feature Request\nabout: Submit a proposal/request for a new feature\nlabels: 'enhancement, help wanted, needs "
  },
  {
    "path": ".github/ISSUE_TEMPLATE/how-to-question.md",
    "chars": 744,
    "preview": "---\nname: ❓ Questions/Help\nabout: If you have questions, please first search existing issues and docs\nlabels: 'question,"
  },
  {
    "path": ".github/ISSUE_TEMPLATE.md",
    "chars": 244,
    "preview": "## 👉 [Please follow one of these issue templates](https://github.com/pytorch/fairseq/issues/new/choose) 👈\n\nNote: to keep"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "chars": 590,
    "preview": "# Before submitting\n\n- [ ] Was this discussed/approved via a Github issue? (no need for typos, doc improvements)\n- [ ] D"
  },
  {
    "path": ".github/stale.yml",
    "chars": 1734,
    "preview": "# Configuration for probot-stale - https://github.com/probot/stale\n# Mostly copied from github.com/facebook/react/blob/m"
  },
  {
    "path": ".github/workflows/build.yml",
    "chars": 2687,
    "preview": "name: build\n\non:\n  # Trigger the workflow on push to main or any pull request\n  push:\n    branches:\n      - main\n  pull_"
  },
  {
    "path": ".github/workflows/depreview.yml",
    "chars": 290,
    "preview": "name: 'Dependency Review'\non: [pull_request]\n\npermissions:\n  contents: read\n\njobs:\n  dependency-review:\n    runs-on: ubu"
  },
  {
    "path": ".github/workflows/release.yml",
    "chars": 5020,
    "preview": "name: Fairseq Release\n\non:\n  workflow_dispatch:\n    inputs:\n      name:\n        description: 'Release Type'\n        defa"
  },
  {
    "path": ".gitignore",
    "chars": 1742,
    "preview": "# JetBrains PyCharm IDE\n.idea/\n\n# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extension"
  },
  {
    "path": ".gitmodules",
    "chars": 162,
    "preview": "[submodule \"fairseq/model_parallel/megatron\"]\n    path = fairseq/model_parallel/megatron\n    url = https://github.com/ng"
  },
  {
    "path": ".pre-commit-config.yaml",
    "chars": 935,
    "preview": "exclude: 'build|stubs'\n\ndefault_language_version:\n    python: python3\n\nrepos:\n-   repo: https://github.com/pre-commit/pr"
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 3350,
    "preview": "# Code of Conduct\n\n## Our Pledge\n\nIn the interest of fostering an open and welcoming environment, we as\ncontributors and"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 3329,
    "preview": "# Contributing to Facebook AI Research Sequence-to-Sequence Toolkit (fairseq)\nWe want to make contributing to this proje"
  },
  {
    "path": "LICENSE",
    "chars": 1086,
    "preview": "MIT License\n\nCopyright (c) Facebook, Inc. and its affiliates.\n\nPermission is hereby granted, free of charge, to any pers"
  },
  {
    "path": "MANIFEST.in",
    "chars": 28,
    "preview": "include fairseq/version.txt\n"
  },
  {
    "path": "README.md",
    "chars": 17398,
    "preview": "<p align=\"center\">\n  <img src=\"docs/fairseq_logo.png\" width=\"150\">\n  <br />\n  <br />\n  <a href=\"https://opensource.fb.co"
  },
  {
    "path": "RELEASE.md",
    "chars": 654,
    "preview": "# Creating a New Release\n\nIn order to create a new release:\n\n1. Navigate to the [Fairseq Workflows](https://github.com/f"
  },
  {
    "path": "docs/Makefile",
    "chars": 607,
    "preview": "# Minimal makefile for Sphinx documentation\n#\n\n# You can set these variables from the command line.\nSPHINXOPTS    =\nSPHI"
  },
  {
    "path": "docs/command_line_tools.rst",
    "chars": 1893,
    "preview": ".. _Command-line Tools:\n\nCommand-line Tools\n==================\n\nFairseq provides several command-line tools for training"
  },
  {
    "path": "docs/conf.py",
    "chars": 3133,
    "preview": "#!/usr/bin/env python3\n# -*- coding: utf-8 -*-\n#\n# fairseq documentation build configuration file, created by\n# sphinx-q"
  },
  {
    "path": "docs/criterions.rst",
    "chars": 758,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. _Criterions:\n\nCriterions\n==========\n\nCriterions compute the loss functio"
  },
  {
    "path": "docs/data.rst",
    "chars": 1202,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. module:: fairseq.data\n\nData Loading and Utilities\n======================"
  },
  {
    "path": "docs/docutils.conf",
    "chars": 25,
    "preview": "[writers]\noption-limit=0\n"
  },
  {
    "path": "docs/getting_started.rst",
    "chars": 9236,
    "preview": "Evaluating Pre-trained Models\n=============================\n\nFirst, download a pre-trained model along with its vocabula"
  },
  {
    "path": "docs/hydra_integration.md",
    "chars": 10440,
    "preview": "## Hydra\n\n[Hydra](https://github.com/facebookresearch/hydra) is an open-source Python\nframework that simplifies the deve"
  },
  {
    "path": "docs/index.rst",
    "chars": 1002,
    "preview": ".. fairseq documentation master file, created by\n   sphinx-quickstart on Fri Aug 17 21:45:30 2018.\n   You can adapt this"
  },
  {
    "path": "docs/lr_scheduler.rst",
    "chars": 1055,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. _Learning Rate Schedulers:\n\nLearning Rate Schedulers\n==================="
  },
  {
    "path": "docs/make.bat",
    "chars": 805,
    "preview": "@ECHO OFF\r\n\r\npushd %~dp0\r\n\r\nREM Command file for Sphinx documentation\r\n\r\nif \"%SPHINXBUILD%\" == \"\" (\r\n\tset SPHINXBUILD=py"
  },
  {
    "path": "docs/models.rst",
    "chars": 2830,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. module:: fairseq.models\n\n.. _Models:\n\nModels\n======\n\nA Model defines the"
  },
  {
    "path": "docs/modules.rst",
    "chars": 241,
    "preview": "Modules\n=======\n\nFairseq provides several stand-alone :class:`torch.nn.Module` classes that may\nbe helpful when implemen"
  },
  {
    "path": "docs/optim.rst",
    "chars": 846,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. _optimizers:\n\nOptimizers\n==========\n\nOptimizers update the Model paramet"
  },
  {
    "path": "docs/overview.rst",
    "chars": 2702,
    "preview": "Overview\n========\n\nFairseq can be extended through user-supplied `plug-ins\n<https://en.wikipedia.org/wiki/Plug-in_(compu"
  },
  {
    "path": "docs/tasks.rst",
    "chars": 1391,
    "preview": ".. role:: hidden\n    :class: hidden-section\n\n.. module:: fairseq.tasks\n\n.. _Tasks:\n\nTasks\n=====\n\nTasks store dictionarie"
  },
  {
    "path": "docs/tutorial_classifying_names.rst",
    "chars": 16996,
    "preview": "Tutorial: Classifying Names with a Character-Level RNN\n======================================================\n\nIn this t"
  },
  {
    "path": "docs/tutorial_simple_lstm.rst",
    "chars": 21220,
    "preview": "Tutorial: Simple LSTM\n=====================\n\nIn this tutorial we will extend fairseq by adding a new\n:class:`~fairseq.mo"
  },
  {
    "path": "examples/.gitignore",
    "chars": 16,
    "preview": "!*/*.sh\n!*/*.md\n"
  },
  {
    "path": "examples/MMPT/.gitignore",
    "chars": 1920,
    "preview": "# Byte-compiled / optimized / DLL files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packagi"
  },
  {
    "path": "examples/MMPT/CONFIG.md",
    "chars": 1746,
    "preview": "### Config Files Explained\n\nTaking `projects/mfmmlm.yaml` for example, which run pretraining using masked frame model (M"
  },
  {
    "path": "examples/MMPT/DATASET.md",
    "chars": 2629,
    "preview": "# Dataset\n\nWe understand video data are challenging to download and process. For videos, we provide our preprocessing sc"
  },
  {
    "path": "examples/MMPT/README.md",
    "chars": 9563,
    "preview": "# VideoCLIP and VLM\n\nYou just find this toolkit for multimodal video understanding! It contains implementation of two re"
  },
  {
    "path": "examples/MMPT/endtask.md",
    "chars": 2452,
    "preview": "# Zero-shot Transfer and Finetuning\n\n(If you are new to the ideas of `mmpt.processors`, see [README](README.md) first.)\n"
  },
  {
    "path": "examples/MMPT/locallaunch.py",
    "chars": 5336,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/__init__.py",
    "chars": 394,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/datasets/__init__.py",
    "chars": 273,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/datasets/fairseqmmdataset.py",
    "chars": 1785,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/datasets/mmdataset.py",
    "chars": 3873,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/evaluators/__init__.py",
    "chars": 305,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/evaluators/evaluator.py",
    "chars": 2026,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/evaluators/metric.py",
    "chars": 10898,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/evaluators/predictor.py",
    "chars": 23125,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/losses/__init__.py",
    "chars": 345,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/losses/fairseqmmloss.py",
    "chars": 2241,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/losses/loss.py",
    "chars": 2095,
    "preview": "# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport torch\n\nfrom torch import nn\n\n\nclass Loss(object):\n    def __c"
  },
  {
    "path": "examples/MMPT/mmpt/losses/nce.py",
    "chars": 4586,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/models/__init__.py",
    "chars": 395,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/models/fairseqmmmodel.py",
    "chars": 1417,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/models/mmfusion.py",
    "chars": 30634,
    "preview": "# coding=utf-8\n# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.\n# Copyright (c) 2018,"
  },
  {
    "path": "examples/MMPT/mmpt/models/mmfusionnlg.py",
    "chars": 48394,
    "preview": "# coding=utf-8\n# Copyright 2018 The Google AI Language Team Authors, Facebook AI Research authors and The HuggingFace In"
  },
  {
    "path": "examples/MMPT/mmpt/models/transformermodel.py",
    "chars": 26064,
    "preview": "# coding=utf-8\n# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.\n# Copyright (c) 2018,"
  },
  {
    "path": "examples/MMPT/mmpt/modules/__init__.py",
    "chars": 255,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/modules/mm.py",
    "chars": 5537,
    "preview": "# coding=utf-8\n# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.\n# Copyright (c) 2018,"
  },
  {
    "path": "examples/MMPT/mmpt/modules/retri.py",
    "chars": 15471,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/modules/vectorpool.py",
    "chars": 8278,
    "preview": "# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport torch\nimport os\nimport numpy as np\nimport pickle\n\nfrom . impo"
  },
  {
    "path": "examples/MMPT/mmpt/processors/__init__.py",
    "chars": 652,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/processors/dedupprocessor.py",
    "chars": 8834,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/processors/dsprocessor.py",
    "chars": 29891,
    "preview": "# Copyright (c) Facebook, Inc. All Rights Reserved\n\n\"\"\"\nProcessors for all downstream (ds) tasks.\n\"\"\"\n\nimport json\nimpor"
  },
  {
    "path": "examples/MMPT/mmpt/processors/how2processor.py",
    "chars": 32302,
    "preview": "# coding=utf-8\n# Copyright 2018 The Google AI Language Team Authors and The HuggingFace Inc. team.\n# Copyright (c) 2018,"
  },
  {
    "path": "examples/MMPT/mmpt/processors/how2retriprocessor.py",
    "chars": 3742,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/processors/models/s3dg.py",
    "chars": 12416,
    "preview": "# This source code is licensed under the MIT license found in the\n# LICENSE file in the root directory of this source tr"
  },
  {
    "path": "examples/MMPT/mmpt/processors/processor.py",
    "chars": 9358,
    "preview": "# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport numpy as np\nimport os\nimport torch\n\n\nclass Processor(object):"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/__init__.py",
    "chars": 445,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/fairseqmmtask.py",
    "chars": 3045,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/milncetask.py",
    "chars": 954,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/retritask.py",
    "chars": 8413,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/task.py",
    "chars": 6780,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/tasks/vlmtask.py",
    "chars": 856,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/utils/__init__.py",
    "chars": 1886,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/utils/load_config.py",
    "chars": 3155,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt/utils/shardedtensor.py",
    "chars": 1410,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt_cli/localjob.py",
    "chars": 3794,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/mmpt_cli/predict.py",
    "chars": 3937,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/pretraining.md",
    "chars": 1884,
    "preview": "# Pretraining\n\n(If you are new to the ideas of `mmpt.processors`, see [README](README.md) first.)\nWe mostly use [howto10"
  },
  {
    "path": "examples/MMPT/projects/mfmmlm.yaml",
    "chars": 1321,
    "preview": "project_dir: mfmmlm\nrun_task:\n  - how2.yaml\n  - [vtt.yaml, vttcap.yaml, vttqa.yaml, youcook.yaml, youcookcap.yaml, cross"
  },
  {
    "path": "examples/MMPT/projects/mtm/mmfusionmtm.yaml",
    "chars": 430,
    "preview": "includes: projects/mfmmlm.yaml\nproject_dir: mtm/mmfusionmtm\ntask_group:\n  pretrain:\n    task: VLMTask  # reproducible\n  "
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/coin.yaml",
    "chars": 1155,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: COINActionSegmentationMetaPr"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/crosstask.yaml",
    "chars": 1502,
    "preview": "dataset:\n  video_processor: CrossTaskVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: CrossTaskMetaProces"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/how2.yaml",
    "chars": 1302,
    "preview": "dataset:\n  video_processor: ShardedVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: ShardedHow2MetaProces"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_coin.yaml",
    "chars": 811,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: COINActio"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_crosstask.yaml",
    "chars": 1200,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: CrossTaskVideoProcessor\n  aligner: "
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_crosstask_zs.yaml",
    "chars": 1193,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: CrossTaskVideoProcessor\n  aligner: "
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_vtt.yaml",
    "chars": 693,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: DSAligner"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_vttqa.yaml",
    "chars": 687,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: MSRVTTQAA"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_youcook.yaml",
    "chars": 799,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: YoucookVideoProcessor\n  aligner: DS"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/test_youcookcap.yaml",
    "chars": 797,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: YoucookVideoProcessor\n  aligner: DS"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/vtt.yaml",
    "chars": 1217,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: MSRVTTMetaProcessor\n  train_"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/vttqa.yaml",
    "chars": 1113,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: MSRVTTMetaProcessor\n  train_"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/youcook.yaml",
    "chars": 1164,
    "preview": "dataset:\n  video_processor: YoucookVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: YoucookMetaProcessor\n"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm/youcookcap.yaml",
    "chars": 1112,
    "preview": "dataset:\n  video_processor: YoucookVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: YoucookNLGMetaProcess"
  },
  {
    "path": "examples/MMPT/projects/mtm/vlm.yaml",
    "chars": 153,
    "preview": "includes: projects/mtm/mmfusionmtm.yaml\nproject_dir: mtm/vlm\ntask_group:\n  pretrain:\n    dataset:\n      sampled_min_len:"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/coin_videoclip.yaml",
    "chars": 1244,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: COINActionSegmentationMetaPr"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/crosstask_videoclip.yaml",
    "chars": 1596,
    "preview": "dataset:\n  video_processor: CrossTaskVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: CrossTaskMetaProces"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/how2.yaml",
    "chars": 1640,
    "preview": "dataset:\n  video_processor: ShardedVideoRetriVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: ShardedHow2"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_coin_videoclip.yaml",
    "chars": 900,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: COINActio"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_coin_zs.yaml",
    "chars": 870,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: COINActio"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_crosstask_videoclip.yaml",
    "chars": 1291,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: CrossTaskVideoProcessor\n  aligner: "
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_crosstask_zs_videoclip.yaml",
    "chars": 1284,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: CrossTaskVideoProcessor\n  aligner: "
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_didemo_zs.yaml",
    "chars": 772,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: DiDeMoAli"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_vtt_videoclip.yaml",
    "chars": 779,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: DSAligner"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_vtt_zs.yaml",
    "chars": 778,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: DSAligner"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_vttqa_videoclip.yaml",
    "chars": 773,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: MSRVTTQAA"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_vttqa_zs.yaml",
    "chars": 770,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: VideoProcessor\n  aligner: MSRVTTQAA"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_youcook_videoclip.yaml",
    "chars": 885,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: YoucookVideoProcessor\n  aligner: DS"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/test_youcook_zs.yaml",
    "chars": 880,
    "preview": "slurm_config: big\ntask_type: local_predict\ndataset:\n  split: test\n  video_processor: YoucookVideoProcessor\n  aligner: DS"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/vtt_videoclip.yaml",
    "chars": 1303,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: MSRVTTMetaProcessor\n  train_"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/vttqa_videoclip.yaml",
    "chars": 1199,
    "preview": "dataset:\n  video_processor: VideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: MSRVTTMetaProcessor\n  train_"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip/youcook_videoclip.yaml",
    "chars": 1250,
    "preview": "dataset:\n  video_processor: YoucookVideoProcessor\n  bert_name: bert-base-uncased\n  meta_processor: YoucookMetaProcessor\n"
  },
  {
    "path": "examples/MMPT/projects/retri/videoclip.yaml",
    "chars": 271,
    "preview": "includes: projects/retri/videoretri.yaml\nproject_dir: retri/videoclip\ntask_group:\n  pretrain:\n    model:\n      model_cls"
  },
  {
    "path": "examples/MMPT/projects/retri/videoretri.yaml",
    "chars": 1514,
    "preview": "includes: projects/mfmmlm.yaml\nproject_dir: retri/videoretri\nrun_task:\n  - how2.yaml\ntask_group:\n  pretrain:\n    task: V"
  },
  {
    "path": "examples/MMPT/projects/task/coin.yaml",
    "chars": 653,
    "preview": "includes: projects/task/ft.yaml\ntask_type: sweep_big\ndataset:\n  meta_processor: COINActionSegmentationMetaProcessor\n  tr"
  },
  {
    "path": "examples/MMPT/projects/task/coin_videoclip.yaml",
    "chars": 237,
    "preview": "includes: projects/task/coin.yaml\nmodel:\n  model_cls: MMFusionSeparateActionSegmentation\n  mm_encoder_cls: \n  video_enco"
  },
  {
    "path": "examples/MMPT/projects/task/crosstask.yaml",
    "chars": 1057,
    "preview": "includes: projects/task/ft.yaml\ndataset:\n  meta_processor: CrossTaskMetaProcessor\n  train_path: data/crosstask/crosstask"
  },
  {
    "path": "examples/MMPT/projects/task/crosstask_videoclip.yaml",
    "chars": 333,
    "preview": "includes: projects/task/crosstask.yaml\nmodel:\n  model_cls: MMFusionSeparateActionLocalization\n  mm_encoder_cls: \n  video"
  },
  {
    "path": "examples/MMPT/projects/task/default.yaml",
    "chars": 573,
    "preview": "# this yaml cannot be run alone. you must use `how2.yaml`, `vtt.yaml` etc for training.\ndataset:\n  video_processor: Vide"
  },
  {
    "path": "examples/MMPT/projects/task/ft.yaml",
    "chars": 473,
    "preview": "includes: projects/task/default.yaml\n# all derived config will be run by fairseq-train.\ntask_type: sweep_small\nfairseq:\n"
  },
  {
    "path": "examples/MMPT/projects/task/how2.yaml",
    "chars": 653,
    "preview": "includes: projects/task/default.yaml\ntask_type: sweep_big\nslurm_config: big\ndataset:\n  meta_processor: ShardedHow2MetaPr"
  },
  {
    "path": "examples/MMPT/projects/task/test.yaml",
    "chars": 300,
    "preview": "# this yaml cannot be run alone: implement a test_${dataset}.yaml\nslurm_config: big\ntask_type: local_predict\ndataset:\n  "
  },
  {
    "path": "examples/MMPT/projects/task/test_coin.yaml",
    "chars": 669,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  split: test\n  test_path: data/coin/COIN.json\n  meta_processor: COINActionSe"
  },
  {
    "path": "examples/MMPT/projects/task/test_coin_videoclip.yaml",
    "chars": 242,
    "preview": "includes: projects/task/test_coin.yaml\nmodel:\n  model_cls: MMFusionSeparateActionSegmentation\n  mm_encoder_cls: \n  video"
  },
  {
    "path": "examples/MMPT/projects/task/test_coin_zs.yaml",
    "chars": 324,
    "preview": "includes: projects/task/test_coin.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBe"
  },
  {
    "path": "examples/MMPT/projects/task/test_crosstask.yaml",
    "chars": 1122,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  split: test\n  meta_processor: CrossTaskMetaProcessor\n  test_path: data/cros"
  },
  {
    "path": "examples/MMPT/projects/task/test_crosstask_videoclip.yaml",
    "chars": 235,
    "preview": "includes: projects/task/test_crosstask.yaml\nmodel:\n  model_cls: MMFusionSeparateActionLocalization\n  mm_encoder_cls: \n  "
  },
  {
    "path": "examples/MMPT/projects/task/test_crosstask_zs.yaml",
    "chars": 1188,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  split: test\n  meta_processor: CrossTaskMetaProcessor\n  test_path: data/cros"
  },
  {
    "path": "examples/MMPT/projects/task/test_crosstask_zs_videoclip.yaml",
    "chars": 238,
    "preview": "includes: projects/task/test_crosstask_zs.yaml\nmodel:\n  model_cls: MMFusionSeparateActionLocalization\n  mm_encoder_cls: "
  },
  {
    "path": "examples/MMPT/projects/task/test_didemo_zs.yaml",
    "chars": 636,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  meta_processor: DiDeMoMetaProcessor\n  test_path: data/didemo/test_data.json"
  },
  {
    "path": "examples/MMPT/projects/task/test_vtt.yaml",
    "chars": 536,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  meta_processor: MSRVTTMetaProcessor\n  test_path: data/msrvtt/MSRVTT_JSFUSIO"
  },
  {
    "path": "examples/MMPT/projects/task/test_vtt_videoclip.yaml",
    "chars": 192,
    "preview": "includes: projects/task/test_vtt.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBer"
  },
  {
    "path": "examples/MMPT/projects/task/test_vtt_zs.yaml",
    "chars": 346,
    "preview": "includes: projects/task/test_vtt.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBer"
  },
  {
    "path": "examples/MMPT/projects/task/test_vttqa.yaml",
    "chars": 551,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  meta_processor: MSRVTTQAMetaProcessor\n  test_path: data/msrvtt-qa/MSR_MC_te"
  },
  {
    "path": "examples/MMPT/projects/task/test_vttqa_videoclip.yaml",
    "chars": 194,
    "preview": "includes: projects/task/test_vttqa.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMB"
  },
  {
    "path": "examples/MMPT/projects/task/test_vttqa_zs.yaml",
    "chars": 350,
    "preview": "includes: projects/task/test_vttqa.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMB"
  },
  {
    "path": "examples/MMPT/projects/task/test_youcook.yaml",
    "chars": 750,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  meta_processor: YoucookMetaProcessor\n  test_path: data/youcook/youcook_val."
  },
  {
    "path": "examples/MMPT/projects/task/test_youcook_videoclip.yaml",
    "chars": 196,
    "preview": "includes: projects/task/test_youcook.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: M"
  },
  {
    "path": "examples/MMPT/projects/task/test_youcook_zs.yaml",
    "chars": 354,
    "preview": "includes: projects/task/test_youcook.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: M"
  },
  {
    "path": "examples/MMPT/projects/task/test_youcookcap.yaml",
    "chars": 661,
    "preview": "includes: projects/task/test.yaml\ndataset:\n  meta_processor: YoucookNLGMetaProcessor\n  test_path: data/youcook/val_list."
  },
  {
    "path": "examples/MMPT/projects/task/vtt.yaml",
    "chars": 658,
    "preview": "includes: projects/task/ft.yaml\ndataset:\n  meta_processor: MSRVTTMetaProcessor\n  train_path: data/msrvtt/MSRVTT_train.cs"
  },
  {
    "path": "examples/MMPT/projects/task/vtt_videoclip.yaml",
    "chars": 292,
    "preview": "includes: projects/task/vtt.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBertForE"
  },
  {
    "path": "examples/MMPT/projects/task/vttqa.yaml",
    "chars": 554,
    "preview": "includes: projects/task/ft.yaml\ndataset:\n  meta_processor: MSRVTTMetaProcessor\n  train_path: data/msrvtt/MSRVTT_train.cs"
  },
  {
    "path": "examples/MMPT/projects/task/vttqa_videoclip.yaml",
    "chars": 255,
    "preview": "includes: projects/task/vttqa.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBertFo"
  },
  {
    "path": "examples/MMPT/projects/task/youcook.yaml",
    "chars": 728,
    "preview": "includes: projects/task/ft.yaml\ndataset:\n  meta_processor: YoucookMetaProcessor\n  train_path: data/youcook/youcook_train"
  },
  {
    "path": "examples/MMPT/projects/task/youcook_videoclip.yaml",
    "chars": 256,
    "preview": "includes: projects/task/youcook.yaml\nmodel:\n  model_cls: MMFusionSeparate\n  mm_encoder_cls: \n  video_encoder_cls: MMBert"
  },
  {
    "path": "examples/MMPT/projects/task/youcookcap.yaml",
    "chars": 624,
    "preview": "# finetuning for youcook captioning.\nincludes: projects/task/ft.yaml\ndataset:\n  meta_processor: YoucookNLGMetaProcessor\n"
  },
  {
    "path": "examples/MMPT/scripts/text_token_extractor/configs/bert-base-uncased.yaml",
    "chars": 159,
    "preview": "dataset:\n  bert_name: bert-base-uncased\n  caption_pkl_path: data/how2/raw_caption_dedup.pkl\n  use_fast: true\n  target_di"
  },
  {
    "path": "examples/MMPT/scripts/text_token_extractor/pretokenization.py",
    "chars": 3408,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/extract.py",
    "chars": 5529,
    "preview": "# Copyright Howto100M authors.\n# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport torch as th\nimport torch.nn.fu"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/how2/s3d.sh",
    "chars": 219,
    "preview": "#!/bin/bash\n\n\npython scripts/video_feature_extractor/extract.py \\\n    --vdir <path_to_video_folder> \\\n    --fdir data/fe"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/model.py",
    "chars": 1921,
    "preview": "# Copyright (c) Howto100M authors and Facebook, Inc. All Rights Reserved\n\nimport torch as th\n\nfrom torch import nn\n\n\ncla"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/pathbuilder.py",
    "chars": 3410,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/preprocessing.py",
    "chars": 2071,
    "preview": "# Copyright Howto100m authors.\n# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport torch as th\n\nclass Normalize(o"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/random_sequence_shuffler.py",
    "chars": 829,
    "preview": "# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport numpy as np\n\nfrom torch.utils.data.sampler import Sampler\n\n\nc"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/shard_feature.py",
    "chars": 2166,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/MMPT/scripts/video_feature_extractor/videoreader.py",
    "chars": 8322,
    "preview": "# Copyright Howto100M authors.\n# Copyright (c) Facebook, Inc. All Rights Reserved\n\nimport torch as th\nimport pandas as p"
  },
  {
    "path": "examples/MMPT/setup.py",
    "chars": 668,
    "preview": "import setuptools\n\nwith open(\"README.md\", \"r\") as fh:\n    long_description = fh.read()\n\nsetuptools.setup(\n    name=\"mmpt"
  },
  {
    "path": "examples/__init__.py",
    "chars": 264,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/adaptive_span/README.md",
    "chars": 4375,
    "preview": "# Adaptive Span\n\nAdaptive Span is a novel self-attention mechanism that can learn its optimal\nattention span. This allow"
  },
  {
    "path": "examples/adaptive_span/__init__.py",
    "chars": 669,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/adaptive_span/adagrad_with_grad_clip.py",
    "chars": 4374,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/adaptive_span/adaptive_span_attention.py",
    "chars": 5881,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/adaptive_span/adaptive_span_loss.py",
    "chars": 4260,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/adaptive_span/adaptive_span_model.py",
    "chars": 8540,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n# All rights reserved.\n#\n# This source code is licensed under the lic"
  },
  {
    "path": "examples/adaptive_span/adaptive_span_model_wrapper.py",
    "chars": 4692,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/README.md",
    "chars": 6814,
    "preview": "# Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling (Gong et al., 202"
  },
  {
    "path": "examples/attention_head_selection/src/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/attention_head_selection/src/data/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/attention_head_selection/src/data/speech_to_text_dataset_with_domain.py",
    "chars": 8439,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/loss/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/attention_head_selection/src/loss/attention_head_selection.py",
    "chars": 872,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/models/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/attention_head_selection/src/models/head_selection_s2t_transformer.py",
    "chars": 6831,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/models/head_selection_transformer.py",
    "chars": 7637,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/modules/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "examples/attention_head_selection/src/modules/attn_head_selector.py",
    "chars": 3074,
    "preview": "# This source code is licensed under the MIT license found in the\n# LICENSE file in the root directory of this source tr"
  },
  {
    "path": "examples/attention_head_selection/src/modules/head_selection_transformer_layer.py",
    "chars": 3509,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/modules/multihead_attention_selection.py",
    "chars": 14046,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/modules/multihead_functional.py",
    "chars": 11418,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/attention_head_selection/src/speech_to_text_head_selection.py",
    "chars": 7727,
    "preview": "# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT license found in the\n#"
  },
  {
    "path": "examples/audio_nlp/nlu/README.md",
    "chars": 3778,
    "preview": "# End-to-end NLU\n\nEnd-to-end spoken language understanding (SLU) predicts intent directly from audio using a single mode"
  },
  {
    "path": "examples/audio_nlp/nlu/configs/nlu_finetuning.yaml",
    "chars": 1025,
    "preview": "# @package _group_\n\ncommon:\n  fp16: true\n  log_format: json\n  log_interval: 10\n  tensorboard_logdir: tb\n\ncheckpoint:\n  n"
  },
  {
    "path": "examples/audio_nlp/nlu/create_dict_stop.sh",
    "chars": 907,
    "preview": "#!/bin/bash\n\n### Script handling creation of data binaries\n### for model training within fairseq\n\n\nfairseq_root=\".\"\n\ndat"
  },
  {
    "path": "examples/audio_nlp/nlu/generate_manifests.py",
    "chars": 2574,
    "preview": "import argparse\nfrom pathlib import Path\nimport soundfile\n\ndef get_insl_frame(parse):\n    out = []\n    def is_ont_token("
  },
  {
    "path": "examples/backtranslation/README.md",
    "chars": 10775,
    "preview": "# Understanding Back-Translation at Scale (Edunov et al., 2018)\n\nThis page includes pre-trained models from the paper [U"
  },
  {
    "path": "examples/backtranslation/deduplicate_lines.py",
    "chars": 1221,
    "preview": "#!/usr/bin/python3\n# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT lic"
  },
  {
    "path": "examples/backtranslation/extract_bt_data.py",
    "chars": 2509,
    "preview": "#!/usr/bin/env python\n# Copyright (c) Facebook, Inc. and its affiliates.\n#\n# This source code is licensed under the MIT "
  },
  {
    "path": "examples/backtranslation/prepare-de-monolingual.sh",
    "chars": 3240,
    "preview": "#!/bin/bash\n\nSCRIPTS=mosesdecoder/scripts\nTOKENIZER=$SCRIPTS/tokenizer/tokenizer.perl\nNORM_PUNC=$SCRIPTS/tokenizer/norma"
  },
  {
    "path": "examples/backtranslation/prepare-wmt18en2de.sh",
    "chars": 3697,
    "preview": "#!/bin/bash\n# Adapted from https://github.com/facebookresearch/MIXER/blob/master/prepareData.sh\n\necho 'Cloning Moses git"
  },
  {
    "path": "examples/backtranslation/sacrebleu.sh",
    "chars": 961,
    "preview": "#!/bin/bash\n\nif [ $# -ne 5 ]; then\n    echo \"usage: $0 [dataset=wmt14/full] [langpair=en-de] [databin] [bpecode] [model]"
  },
  {
    "path": "examples/backtranslation/tokenized_bleu.sh",
    "chars": 1137,
    "preview": "#!/bin/bash\n\nif [ $# -ne 5 ]; then\n    echo \"usage: $0 [dataset=wmt14/full] [langpair=en-de] [databin] [bpecode] [model]"
  },
  {
    "path": "examples/bart/README.glue.md",
    "chars": 4060,
    "preview": "# Fine-tuning BART on GLUE tasks\n\n### 1) Download the data from GLUE website (https://gluebenchmark.com/tasks) using fol"
  }
]

// ... and 1426 more files (download for full content)

About this extraction

This page contains the full source code of the facebookresearch/fairseq GitHub repository, extracted and formatted as plain text. The extraction includes 1626 files (9.2 MB), approximately 2.5M tokens, and a symbol index with 9202 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract. Built by Nikandr Surkov.
