causal_conv1d
causal_conv1d(x, weight, bias=None, activation='identity', channel_last=False)
Depthwise causal 1D convolution with optional activation.
Each channel is convolved with its own kernel. Causal means the output at time \(t\) depends only on inputs at times \(\le t\).
\[y_{b,c,t} = \mathrm{activation}\left( \sum_{k=0}^{K-1} x_{b,c,t-k} \cdot w_{c,k} + b_c \right)\]

- Parameters:
  - x (torch.Tensor) – Input tensor of shape (batch_size, dim, seq_len) if channel_last is False, otherwise (batch_size, seq_len, dim).
  - weight (torch.Tensor) – Weight tensor of shape (dim, kernel_size) if channel_last is False, otherwise (kernel_size, dim).
  - bias (torch.Tensor | None) – Optional bias tensor of shape (dim,).
  - activation (str) – Activation function to apply. Supported: "silu", "identity".
  - channel_last (bool) – Whether the channel dimension is the last dimension, i.e. the input layout is (batch_size, seq_len, dim) rather than (batch_size, dim, seq_len).
- Returns:
  Output tensor of shape (batch_size, dim, seq_len) if channel_last is False, otherwise (batch_size, seq_len, dim).
- Return type:
  torch.Tensor
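As a sanity check on the formula above, here is a pure-Python reference sketch using nested lists instead of tensors (the name causal_conv1d_ref is illustrative, not part of the library). It follows the formula's indexing convention, in which weight[c][0] multiplies the current input and positions before t = 0 contribute zero:

```python
import math

def causal_conv1d_ref(x, weight, bias=None, activation="identity"):
    """Reference for y[b][c][t] = act(sum_k x[b][c][t-k] * w[c][k] + b[c]).

    x: [batch][dim][seq_len] nested lists, weight: [dim][kernel_size].
    """
    B, C = len(x), len(x[0])
    T, K = len(x[0][0]), len(weight[0])
    silu = lambda v: v / (1.0 + math.exp(-v))  # x * sigmoid(x)
    act = silu if activation == "silu" else (lambda v: v)
    y = [[[0.0] * T for _ in range(C)] for _ in range(B)]
    for b in range(B):
        for c in range(C):
            for t in range(T):
                s = bias[c] if bias is not None else 0.0
                for k in range(K):
                    if t - k >= 0:  # causal: inputs before t=0 are treated as zero
                        s += x[b][c][t - k] * weight[c][k]
                y[b][c][t] = act(s)
    return y
```

For example, with a single channel, x = [1, 2, 3], and kernel [1, 1], the outputs are 1, 1+2, and 2+3.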
Example
batch_size, dim, seq_len, kernel_size = 2, 4, 10, 3

channel_last = False
x = torch.randn(batch_size, dim, seq_len, device="cuda")
weight = torch.randn(dim, kernel_size, device="cuda")
bias = torch.randn(dim, device="cuda")
y = causal_conv1d(x, weight, bias, activation="silu", channel_last=channel_last)
print(y.shape)  # torch.Size([2, 4, 10])

channel_last = True
x = torch.randn(batch_size, seq_len, dim, device="cuda")
weight = torch.randn(kernel_size, dim, device="cuda")
bias = torch.randn(dim, device="cuda")
y = causal_conv1d(x, weight, bias, activation="silu", channel_last=channel_last)
print(y.shape)  # torch.Size([2, 10, 4])
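For channel_last=False, the same computation can be reproduced with stock PyTorch — a sketch that assumes only the formula above, not the library's internals. F.conv1d computes cross-correlation, so the kernel is flipped to match the y_t = Σ_k x_{t-k} · w_k convention, and left padding by kernel_size - 1 enforces causality:

```python
import torch
import torch.nn.functional as F

batch_size, dim, seq_len, kernel_size = 2, 4, 10, 3
x = torch.randn(batch_size, dim, seq_len)
weight = torch.randn(dim, kernel_size)
bias = torch.randn(dim)

# Left-pad by kernel_size - 1 so output t sees only inputs at times <= t.
x_pad = F.pad(x, (kernel_size - 1, 0))
# groups=dim makes the convolution depthwise: one kernel per channel.
# flip(-1) converts F.conv1d's cross-correlation into the convolution
# y_t = sum_k x_{t-k} * w_k used in the formula above.
y = F.conv1d(x_pad, weight.flip(-1).unsqueeze(1), bias, groups=dim)
y = F.silu(y)
print(y.shape)  # torch.Size([2, 4, 10])
```

At t = 0 only the current input contributes, so y[:, :, 0] equals silu(x[:, :, 0] * weight[:, 0] + bias), which is a quick way to verify the alignment.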