
Glue Labs
Oct 28, 2022

Part 2: What are the different types Of Attention Mechanisms?

In our last post, we explored generalised attention and self-attention. Now let's move forward and learn about more types of attention mechanisms.

(In case you missed part 1)

3. Multi-Head Attention

Multi-head attention is the attention mechanism used in the Transformer model. The attention module repeats its computation several times in parallel, and each of these parallel computations is called an attention head.

Each head processes the input sequence (and the corresponding output-sequence elements) independently and computes its own set of attention scores.

A final attention output is produced by combining the results from every head, typically by concatenating them and applying a linear projection, so that every nuance of the input sequence is taken into consideration.
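
To make this concrete, here is a minimal NumPy sketch of multi-head self-attention. The random projection matrices stand in for learned parameters, and the function name and shapes are our own illustrative choices rather than any particular library's API.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    """Multi-head self-attention over x of shape (seq_len, d_model)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # Random placeholder weights; a real model learns these.
    W_q, W_k, W_v, W_o = (
        rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        for _ in range(4))

    def split_heads(m):
        # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(x @ W_q), split_heads(x @ W_k), split_heads(x @ W_v)

    # Scaled dot-product attention, computed for all heads in parallel.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    heads = softmax(scores) @ v                          # (heads, seq, d_head)

    # Concatenate the heads and mix them with a final linear projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ W_o

rng = np.random.default_rng(0)
out = multi_head_self_attention(rng.standard_normal((5, 8)), num_heads=2, rng=rng)
print(out.shape)  # (5, 8)
```

Note how the heads attend over the same sequence in parallel; only the final projection mixes information across them.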

4. Additive Attention

This type of attention is also known as the Bahdanau attention mechanism. It computes attention alignment scores using a small feed-forward network that takes the decoder's previous hidden state and an encoder hidden state as inputs.

These alignment scores are recalculated at every decoding step, one score per source position.

Source (input-sequence) words are softly correlated with target (output-sequence) words, rather than being matched to an exact one-to-one degree.

This correlation takes all encoder hidden states into account. The mechanism is called additive because the projected decoder and encoder states are added together inside the scoring function, and the resulting scores are normalised and used to form a weighted sum of the encoder states.
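
Below is a minimal sketch of Bahdanau-style additive scoring, assuming a single decoder state attending over a matrix of encoder states. The names W1, W2, and v follow the standard notation for the learned projections, here filled with random placeholders.

```python
import numpy as np

def additive_attention(decoder_state, encoder_states, W1, W2, v):
    """Bahdanau attention: decoder_state (d_dec,), encoder_states (src_len, d_enc)."""
    # Additive score: e_i = v^T tanh(W1 s + W2 h_i); the projected states
    # are added together, which is where the name comes from.
    scores = np.tanh(decoder_state @ W1.T + encoder_states @ W2.T) @ v
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over source positions
    context = weights @ encoder_states  # weighted sum of all encoder states
    return context, weights

rng = np.random.default_rng(0)
d_dec, d_enc, d_attn, src_len = 6, 4, 5, 7
W1 = rng.standard_normal((d_attn, d_dec))
W2 = rng.standard_normal((d_attn, d_enc))
v = rng.standard_normal(d_attn)
context, weights = additive_attention(
    rng.standard_normal(d_dec), rng.standard_normal((src_len, d_enc)), W1, W2, v)
print(context.shape, round(weights.sum(), 6))  # (4,) 1.0
```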

5. Global Attention

This type of attention mechanism is also referred to as the Luong mechanism. It is a multiplicative attention model, replacing Bahdanau's additive scoring with a cheaper dot-product-style score, and is generally regarded as an improvement over the Bahdanau model.

In neural machine translation, the Luong model comes in two flavours: global attention, which attends to all source words, and local attention, which first predicts a position in the source sentence and then attends only to a small window of words around it.

While both the global and local attention models are equally viable, the context vectors used in each method differ: global attention builds its context vector from every source position, while local attention builds it from the selected window only.
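
As a rough illustration, here is a sketch of Luong's global attention with the "general" multiplicative score, under the simplifying assumption that encoder and decoder states share one dimensionality; the function name and the random weight matrix are our own placeholders.

```python
import numpy as np

def luong_global_attention(decoder_state, encoder_states, W):
    """Luong 'general' score: decoder_state (d,), encoder_states (src_len, d)."""
    # Multiplicative score: e_i = s^T W h_i, a single matrix-vector product
    # followed by dot products, cheaper than the additive feed-forward scorer.
    scores = encoder_states @ (W @ decoder_state)  # (src_len,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()            # softmax over all source words
    context = weights @ encoder_states  # global context vector
    return context, weights

rng = np.random.default_rng(0)
d, src_len = 4, 6
W = rng.standard_normal((d, d))
context, weights = luong_global_attention(
    rng.standard_normal(d), rng.standard_normal((src_len, d)), W)
print(context.shape)  # (4,)
```

A local variant would apply the same scoring, but only to the encoder states inside the predicted window.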