Feed Forward Networks (FFN)
About
The d9d.module.block.ffn package implements the standard dense feed-forward networks (FFNs) used in Transformer blocks.
Features
SwiGLU
SwiGLU is a gated feed-forward layer that computes down(SiLU(gate(x)) * up(x)).
It uses an efficient SiLU-Mul kernel for the SiLU(gate(x)) * up(x) step.
Kernel Benchmarks (BF16, H100)

d9d.module.block.ffn
SwiGLU
Bases: Module, ModuleLateInit
Implements the SwiGLU Feed-Forward Network (FFN).
This module applies the gated activation function: down(SiLU(gate(x)) * up(x)).
It corresponds to the standard MLP block used in architectures like LLaMA.
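The formula above can be traced by hand on a tiny input. The following is a minimal reference sketch in plain Python, not the d9d implementation: the helper names (silu, matvec, swiglu_ffn) and the weight values are hypothetical, the projections are written as plain lists of rows, and bias terms are left out to match the bias=False default.

```python
import math


def silu(v: float) -> float:
    # SiLU (also called swish): v * sigmoid(v)
    return v / (1.0 + math.exp(-v))


def matvec(w, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def swiglu_ffn(x, w_gate, w_up, w_down):
    # gate and up project hidden_size -> intermediate_size
    gate = matvec(w_gate, x)
    up = matvec(w_up, x)
    # The SiLU-Mul step: elementwise SiLU(gate) * up
    hidden = [silu(g) * u for g, u in zip(gate, up)]
    # down projects intermediate_size -> hidden_size
    return matvec(w_down, hidden)


# Illustrative shapes: hidden_size=2, intermediate_size=3 (values are arbitrary)
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
w_down = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

y = swiglu_ffn([1.0, -1.0], w_gate, w_up, w_down)
print(y)
```

The output has the same length as the input (hidden_size), while the gated product lives in the wider intermediate_size space. A production layer would run the three projections as batched matrix multiplications and fuse the SiLU and multiply into one kernel pass; this sketch only shows the arithmetic.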
__init__(hidden_size, intermediate_size, bias=False)
forward(x)
reset_parameters()
Resets module parameters.