Visual breakdown of the single-head attention mechanism, showing the query, key, and value vectors and the matrix operations that produce the attention output.

This image demonstrates the complete process of a single-head attention mechanism in a transformer model. It includes the linear transformations of input tokens into query, key, and value vectors, the scaled dot-product attention computation, application of softmax, and final multiplication to obtain the attention-weighted output. This mechanism helps models focus on relevant parts of the input sequence.
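The pipeline described above can be sketched in a few lines of NumPy. This is an illustrative implementation of scaled dot-product attention, not the figure's exact code; the token count, model dimension, and random weight matrices are arbitrary choices for demonstration.

```python
import numpy as np

def single_head_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention over a matrix of token embeddings X."""
    Q = X @ W_q                              # linear projection to queries
    K = X @ W_k                              # linear projection to keys
    V = X @ W_v                              # linear projection to values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # scaled dot products: (tokens, tokens)
    # softmax over the key axis (numerically stabilized by subtracting the row max)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # attention-weighted combination of values

# Toy example: 4 tokens with model dimension 8 (values chosen for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = single_head_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 8): one attention-weighted vector per input token
```

Each row of `weights` sums to 1, so every output row is a convex combination of the value vectors, which is what lets the model weight relevant tokens more heavily.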
