Positional Embeddings
About
The d9d.module.block.positional package manages positional encoding logic.
Features
Rotary Positional Encoding
Rotary Positional Encoding (RoPE), as introduced in RoFormer.
See the RotaryEmbeddingProvider and RotaryEmbeddingApplicator classes.
The provider is typically bound to a model class and supplies the (cos, sin) embedding tensors for the specified position IDs.
The applicator is typically bound to an attention module implementation and rotates the query and key states at runtime.
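The split between the two roles can be illustrated with a minimal pure-Python sketch. It does not use the d9d classes: `provide_cos_sin` and `apply_rotary` are hypothetical stand-ins for the provider and applicator roles, and the half-split layout and the base of 10000 are assumptions made for illustration.

```python
import math


def provide_cos_sin(position, head_dim, rope_base=10000.0):
    """Provider role (hypothetical): cos/sin vectors for one position ID."""
    half = head_dim // 2
    # One rotation frequency per feature pair, in geometric progression.
    inv_freq = [rope_base ** (-2.0 * i / head_dim) for i in range(half)]
    angles = [position * f for f in inv_freq]
    # Assumed half-split layout: the angles are repeated for both halves.
    cos = [math.cos(a) for a in angles] * 2
    sin = [math.sin(a) for a in angles] * 2
    return cos, sin


def apply_rotary(x, cos, sin):
    """Applicator role (hypothetical): rotate one head vector."""
    half = len(x) // 2
    # Pair element i with element i + half: (x1, x2) -> (-x2, x1).
    rotated = [-v for v in x[half:]] + x[:half]
    return [xi * c + ri * s for xi, ri, c, s in zip(x, rotated, cos, sin)]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]


def score(q_pos, k_pos):
    """Attention logit between q at q_pos and k at k_pos after rotation."""
    q_rot = apply_rotary(q, *provide_cos_sin(q_pos, len(q)))
    k_rot = apply_rotary(k, *provide_cos_sin(k_pos, len(k)))
    return dot(q_rot, k_rot)


# The logit depends only on the relative offset (2 in both calls).
same_offset = math.isclose(score(3, 1), score(10, 8), abs_tol=1e-9)
```

Because every feature pair is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends only on the distance between the two positions, which is the property RoPE relies on.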
Embedding Layout Styles
The package supports multiple internal memory layouts for RoPE operations via the RotaryEmbeddingStyle enumeration. The provider and the applicator must be configured with the same style, because the layout of the (cos, sin) tensors has to match the way the applicator pairs feature elements.
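The following pure-Python sketch shows why the two sides must agree. It is not the d9d implementation: all function names are hypothetical, and the two layouts are modeled on the common "split into halves" and "adjacent pairs" conventions, which is an assumption about what HALF and INTERLEAVED mean internally.

```python
import math


def angles_for(position, head_dim, rope_base=10000.0):
    """One rotation angle per feature pair."""
    return [position * rope_base ** (-2.0 * i / head_dim)
            for i in range(head_dim // 2)]


def cos_sin_half(angles):
    # Layout [a0, a1, ..., a0, a1, ...]: element i pairs with i + half.
    return [math.cos(a) for a in angles] * 2, [math.sin(a) for a in angles] * 2


def cos_sin_interleaved(angles):
    # Layout [a0, a0, a1, a1, ...]: element 2i pairs with 2i + 1.
    cos = [math.cos(a) for a in angles for _ in range(2)]
    sin = [math.sin(a) for a in angles for _ in range(2)]
    return cos, sin


def rotate_half_style(x, cos, sin):
    half = len(x) // 2
    rotated = [-v for v in x[half:]] + x[:half]
    return [xi * c + ri * s for xi, ri, c, s in zip(x, rotated, cos, sin)]


def rotate_interleaved_style(x, cos, sin):
    rotated = []
    for i in range(0, len(x), 2):
        rotated += [-x[i + 1], x[i]]
    return [xi * c + ri * s for xi, ri, c, s in zip(x, rotated, cos, sin)]


def to_interleaved(x):
    """Reorder a half-layout vector so that paired elements become adjacent."""
    half = len(x) // 2
    out = []
    for i in range(half):
        out += [x[i], x[i + half]]
    return out


x = [1.0, 2.0, 3.0, 4.0]
angles = angles_for(position=5, head_dim=len(x))

# Matching styles: both layouts describe the same rotation, up to reordering.
half_result = rotate_half_style(x, *cos_sin_half(angles))
inter_result = rotate_interleaved_style(to_interleaved(x), *cos_sin_interleaved(angles))

# Mismatched styles: half-style rotation fed with interleaved cos/sin.
mismatched = rotate_half_style(x, *cos_sin_interleaved(angles))
```

With matching styles the two results agree once `half_result` is reordered with `to_interleaved`. In the mismatched case each feature pair receives the wrong frequency, so the output no longer corresponds to a valid RoPE rotation.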
d9d.module.block.positional
Provides modules for positional embeddings, such as Rotary Positional Embeddings.
RotaryEmbeddingApplicator
Bases: Module
Applies Rotary Positional Embeddings (RoPE) to Q and K projections.
__init__(style)
Constructs a RotaryEmbeddingApplicator object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| style | RotaryEmbeddingStyle | Rotary embedding layout style alignment. | required |
forward(query_states, key_states, position_embedding_cos, position_embedding_sin)
Rotates query and key states using provided cosine and sine embeddings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| query_states | Tensor | Query tensor. Shape: | required |
| key_states | Tensor | Key tensor. Shape: | required |
| position_embedding_cos | Tensor | Cosine values for positions. Shape: | required |
| position_embedding_sin | Tensor | Sine values for positions. Shape: | required |
Returns:
| Type | Description |
|---|---|
| tuple[Tensor, Tensor] | A tuple containing the rotated query and key tensors. |
RotaryEmbeddingProvider
Bases: Module, ModuleLateInit
Module that manages and provides Rotary Positional Embeddings.
__init__(rope_base, head_dim, max_position_ids, style)
Constructs the RotaryEmbeddingProvider.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| rope_base | int | Base geometrical progression period for RoPE. | required |
| head_dim | int | Dimensionality of the attention head. | required |
| max_position_ids | int | Maximum supported sequence length for caching. | required |
| style | RotaryEmbeddingStyle | Embedding layout alignment. | required |
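How the constructor parameters interact can be sketched with a small cached lookup table in pure Python. This is an illustration under assumptions, not the d9d implementation: `CachedRotaryTable` and its `lookup` method are hypothetical names, the table stores one value per feature pair without any style-specific layout, and the out-of-range behavior shown is simply that of Python list indexing.

```python
import math


class CachedRotaryTable:
    """Hypothetical stand-in for a provider that caches cos/sin per position."""

    def __init__(self, rope_base, head_dim, max_position_ids):
        # Frequencies follow a geometric progression controlled by rope_base.
        inv_freq = [rope_base ** (-2.0 * i / head_dim)
                    for i in range(head_dim // 2)]
        # Precompute one row per position ID in [0, max_position_ids).
        self.cos = [[math.cos(p * f) for f in inv_freq]
                    for p in range(max_position_ids)]
        self.sin = [[math.sin(p * f) for f in inv_freq]
                    for p in range(max_position_ids)]

    def lookup(self, position_ids):
        """Gather the cached rows for the requested position IDs."""
        cos = [self.cos[p] for p in position_ids]
        sin = [self.sin[p] for p in position_ids]
        return cos, sin


table = CachedRotaryTable(rope_base=10000, head_dim=8, max_position_ids=16)
# Position IDs may repeat or be non-contiguous; each one selects a cached row.
cos, sin = table.lookup([0, 3, 3, 15])
```

In this sketch, a position ID equal to or greater than `max_position_ids` has no cached row, which is why the cache size has to cover the longest sequence the model is expected to process.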
forward(position_ids)
reset_parameters()
Resets the values populated in the module's buffers.
RotaryEmbeddingStyle
Bases: StrEnum
Supported Rotary Positional Embedding (RoPE) layout styles.
Attributes:
| Name | Type | Description |
|---|---|---|
| HALF | | Applies transformations by splitting the feature dimension into two halves. |
| INTERLEAVED | | Applies transformations by treating adjacent feature elements as pairs. |