Embeddings

About

The d9d.module.block.embedding package provides enhanced embedding layers.

Features

Currently, this package provides a single module, SplitTokenEmbeddings. You can use it in two ways:

  • As a regular token embedding layer: specify a single split whose size equals the global vocabulary size.
  • For prompt tuning: add the new tokens to your tokenizer and specify two splits. The first split holds the original token embeddings; the second holds the newly added learnable prompt tokens. Then unfreeze only the nn.Embedding module associated with the second split.
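The prompt-tuning layout above can be sketched with a few lines of arithmetic. The split names ("orig", "prompt") and vocabulary sizes (32000 original tokens, 20 added prompt tokens) are illustrative, not values mandated by the library; the key point is that splits are concatenated in order, so each split owns a contiguous range of global vocabulary indices.

```python
# Illustrative prompt-tuning configuration: a tokenizer with 32000
# original tokens plus 20 newly added learnable prompt tokens.
split_vocab_size = {"orig": 32000, "prompt": 20}
split_order = ["orig", "prompt"]

# Splits are concatenated in split_order, so each split owns a
# contiguous half-open range [start, end) of global vocabulary indices.
ranges = {}
offset = 0
for name in split_order:
    ranges[name] = (offset, offset + split_vocab_size[name])
    offset += split_vocab_size[name]

print(ranges)  # {'orig': (0, 32000), 'prompt': (32000, 32020)}
```

With this layout, global ids 0–31999 route to the frozen original table and ids 32000–32019 route to the trainable prompt table.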

d9d.module.block.embedding

Package providing various embedding layer implementations.

SplitTokenEmbeddings

Bases: Module, ModuleLateInit

A token embedding layer composed of multiple named, independent embedding tables.

This class maintains a dictionary of embedding layers, mapping contiguous ranges of global vocabulary indices to specific named splits (e.g., 'orig', 'special', 'prompt_prefix'). This is useful for model adaptation strategies where different sets of tokens require different initialization or training behavior.

__init__(split_vocab_size, split_order, hidden_size)

Constructs the SplitTokenEmbeddings object.

Parameters:

  • split_vocab_size (dict[str, int], required): A dictionary mapping split names to their vocabulary sizes.
  • split_order (Sequence[str], required): A sequence defining the order in which splits are concatenated to form the global vocabulary. Keys provided here must exist in split_vocab_size.
  • hidden_size (int, required): The dimensionality of the embedding vectors.
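The constraint the parameters imply can be made concrete with a small helper. The function name `global_vocab_size` is hypothetical (it is not part of the library's API); it simply shows that every name in split_order must be a key of split_vocab_size, and that the global vocabulary size is the sum of the per-split sizes.

```python
from typing import Sequence

def global_vocab_size(split_vocab_size: dict[str, int],
                      split_order: Sequence[str]) -> int:
    # Hypothetical helper, not part of the d9d API: checks the
    # documented constraint and computes the resulting vocab size.
    missing = [name for name in split_order if name not in split_vocab_size]
    if missing:
        raise KeyError(f"split_order names missing from split_vocab_size: {missing}")
    # The global vocabulary is the concatenation of the splits in
    # split_order, so its size is the sum of the per-split sizes.
    return sum(split_vocab_size[name] for name in split_order)

print(global_vocab_size({"orig": 32000, "prompt": 20}, ["orig", "prompt"]))  # 32020
```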

forward(input_ids)

Retrieves embeddings for the input indices by routing them to appropriate internal layers.

Parameters:

  • input_ids (Tensor, required): Tensor of arbitrary shape containing global vocabulary indices.

Returns:

  • Tensor: A tensor with the same shape as input_ids, plus a trailing dimension of size hidden_size.
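The routing that forward() performs can be sketched in pure Python. The real module operates on tensors and per-split nn.Embedding tables, but the index arithmetic is the same: find the split whose contiguous range contains a global id, then look it up with a split-local index. The split names, sizes, and toy tables below are illustrative.

```python
# Illustrative configuration: two splits of 4 and 2 tokens.
split_vocab_size = {"orig": 4, "prompt": 2}
split_order = ["orig", "prompt"]
hidden_size = 3

# Toy embedding tables (lists of rows stand in for nn.Embedding weights).
tables = {
    "orig": [[float(i)] * hidden_size for i in range(4)],
    "prompt": [[10.0 + i] * hidden_size for i in range(2)],
}

def lookup(global_id):
    # Walk the splits in order; the first split whose range contains
    # global_id owns the token, which is re-indexed locally.
    offset = 0
    for name in split_order:
        size = split_vocab_size[name]
        if global_id < offset + size:
            return tables[name][global_id - offset]
        offset += size
    raise IndexError(global_id)

print(lookup(1))  # row 1 of 'orig' -> [1.0, 1.0, 1.0]
print(lookup(4))  # row 0 of 'prompt' -> [10.0, 10.0, 10.0]
```

Because each lookup lands in exactly one split's table, gradients only flow into the table that owns the token, which is what makes per-split freezing work.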

reset_parameters()

Resets parameters for all registered embedding splits.