Full Code of cvqluu/TDNN for AI

Repository: cvqluu/TDNN
Branch: master
Commit: c9d3df7b342c
Files: 2
Total size: 3.7 KB

Directory structure:
gitextract_r05coniw/

├── README.md
└── tdnn.py

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
# TDNN
A simple Time Delay Neural Network (TDNN) implementation in PyTorch. Uses the `unfold` method to slide over an input sequence.

![Alt text](misc/diagram.png?raw=true "Diagram") [1] https://www.danielpovey.com/files/2015_interspeech_multisplice.pdf
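The frame offsets covered by one sliding window, for a given context size and dilation, can be sketched in plain Python. `context_offsets` is a hypothetical helper for illustration, not part of this repository; its outputs match the examples in the docstring of `tdnn.py`:

```python
def context_offsets(context_size, dilation):
    """Frame offsets, relative to the center frame, covered by one
    TDNN window with the given context size and dilation."""
    center = (context_size - 1) // 2
    return [(i - center) * dilation for i in range(context_size)]

print(context_offsets(5, 1))  # [-2, -1, 0, 1, 2]
print(context_offsets(3, 2))  # [-2, 0, 2]
print(context_offsets(1, 1))  # [0]
```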

# Factorized TDNN (TDNN-F)

I've also implemented the Factorized TDNN from Kaldi (TDNN-F) in PyTorch here: https://github.com/cvqluu/Factorized-TDNN

## Usage

To recreate the TDNN part of the x-vector network in [2]:

```python

from tdnn import TDNN

# Assuming 24 dim MFCCs per frame

frame1 = TDNN(input_dim=24, output_dim=512, context_size=5, dilation=1)
frame2 = TDNN(input_dim=512, output_dim=512, context_size=3, dilation=2)
frame3 = TDNN(input_dim=512, output_dim=512, context_size=3, dilation=3)
frame4 = TDNN(input_dim=512, output_dim=512, context_size=1, dilation=1)
frame5 = TDNN(input_dim=512, output_dim=1500, context_size=1, dilation=1)

# Input to frame1 is of shape (batch_size, T, 24)
# Output of frame5 will be (batch_size, T-14, 1500)

```
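The `T-14` figure follows from each layer shrinking the sequence by `(context_size - 1) * dilation` frames (stride 1, no padding). A quick sanity check in plain Python, with the layer configurations copied from the snippet above; `tdnn_out_len` is a hypothetical helper for illustration only:

```python
def tdnn_out_len(T, context_size, dilation):
    # F.unfold with kernel height `context_size` and the given dilation
    # produces T - (context_size - 1) * dilation windows.
    return T - (context_size - 1) * dilation

layers = [(5, 1), (3, 2), (3, 3), (1, 1), (1, 1)]  # (context_size, dilation)
T = 100
for c, d in layers:
    T = tdnn_out_len(T, c, d)
print(T)  # 86, i.e. 100 - 14
```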

![Alt text](misc/xvec_config.png?raw=true "Diagram") [2] https://www.danielpovey.com/files/2018_icassp_xvectors.pdf


================================================
FILE: tdnn.py
================================================
import torch
import torch.nn as nn
import torch.nn.functional as F

class TDNN(nn.Module):
    
    def __init__(
                    self, 
                    input_dim=23, 
                    output_dim=512,
                    context_size=5,
                    stride=1,
                    dilation=1,
                    batch_norm=True,
                    dropout_p=0.0
                ):
        '''
        TDNN as defined by https://www.danielpovey.com/files/2015_interspeech_multisplice.pdf

        The affine transformation is not applied globally to all frames, but to smaller windows with local context.

        batch_norm: True to include batch normalisation after the non linearity
        
        Context size and dilation determine the frames selected
        (although context size is not really defined in the traditional sense)
        For example:
            context size 5 and dilation 1 is equivalent to [-2,-1,0,1,2]
            context size 3 and dilation 2 is equivalent to [-2, 0, 2]
            context size 1 and dilation 1 is equivalent to [0]
        '''
        super(TDNN, self).__init__()
        self.context_size = context_size
        self.stride = stride
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.dilation = dilation
        self.dropout_p = dropout_p
        self.batch_norm = batch_norm
      
        self.kernel = nn.Linear(input_dim*context_size, output_dim)
        self.nonlinearity = nn.ReLU()
        if self.batch_norm:
            self.bn = nn.BatchNorm1d(output_dim)
        if self.dropout_p:
            self.drop = nn.Dropout(p=self.dropout_p)
        
    def forward(self, x):
        '''
        input: size (batch, seq_len, input_features)
        output: size (batch, new_seq_len, output_features)
        '''

        _, _, d = x.shape
        assert (d == self.input_dim), 'Input dimension was wrong. Expected ({}), got ({})'.format(self.input_dim, d)
        x = x.unsqueeze(1)

        # Unfold input into smaller temporal contexts
        x = F.unfold(
                        x,
                        (self.context_size, self.input_dim),
                        stride=(self.stride, self.input_dim),
                        dilation=(self.dilation, 1)
                    )

        # x shape: (N, input_dim*context_size, new_t)
        x = x.transpose(1,2)
        x = self.kernel(x)
        x = self.nonlinearity(x)
        
        if self.dropout_p:
            x = self.drop(x)

        if self.batch_norm:
            x = x.transpose(1,2)
            x = self.bn(x)
            x = x.transpose(1,2)

        return x
SYMBOL INDEX (3 symbols across 1 files)

FILE: tdnn.py
  class TDNN (line 5) | class TDNN(nn.Module):
    method __init__ (line 7) | def __init__(
    method forward (line 47) | def forward(self, x):

About this extraction

This page contains the full source code of the cvqluu/TDNN GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 2 files (3.7 KB), approximately 1.0k tokens, and a symbol index with 3 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
