Beam search decoding is a technique used in text generation models, such as transformers, to improve the quality of generated text. It addresses a key limitation of greedy search — committing to the single most probable token at each step can lock the model out of better overall sequences — by keeping multiple candidate sequences alive at each step of generation.
Beam search maintains a fixed number of candidate sequences (known as the “beam width” or “beam size”) at each step in the generation process. Here’s how it works:

1. Start with the initial sequence as the only candidate.
2. At each step, extend every candidate with each possible next token, scoring each extended sequence by its cumulative (log-)probability.
3. Keep only the top-k highest-scoring sequences, where k is the beam width, and discard the rest.
4. Repeat until an end-of-sequence token is generated or a maximum length is reached, then return the highest-scoring completed sequence.
Let’s say the model is generating text with a beam width of 2. Starting with the sentence “The cat”, it will:

1. Score every possible next token and keep the two most probable continuations, e.g. “The cat sat” and “The cat ran”.
2. Extend both candidates with every possible next token, then again keep only the two extended sequences with the highest cumulative probability.
3. Continue this way, so a continuation whose first token was slightly less probable (say “ran”) can still win if its later tokens are much more likely.
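The process above can be sketched in a few lines of Python. This is a minimal illustration, not a production decoder: the `TOY_PROBS` table is a hypothetical stand-in for a real model's next-token distribution (a transformer would compute these probabilities instead), and the token strings are invented for the example.

```python
import math

# Hypothetical toy "model": maps a token sequence to next-token probabilities.
# In practice a neural model would produce these scores.
TOY_PROBS = {
    ("The", "cat"): {"sat": 0.40, "ran": 0.35, "meowed": 0.25},
    ("The", "cat", "sat"): {"down": 0.70, "quietly": 0.30},
    ("The", "cat", "ran"): {"fast": 0.90, "away": 0.10},
    ("The", "cat", "meowed"): {"loudly": 1.00},
}

def beam_search(start, beam_width, steps):
    # Each beam entry is (sequence, cumulative log-probability).
    beams = [(start, 0.0)]
    for _ in range(steps):
        candidates = []
        for seq, score in beams:
            for token, p in TOY_PROBS.get(seq, {}).items():
                # Sum log-probs rather than multiplying raw probabilities,
                # which is the numerically stable convention.
                candidates.append((seq + (token,), score + math.log(p)))
        if not candidates:
            break
        # Keep only the beam_width highest-scoring sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams

best_seq, best_score = beam_search(("The", "cat"), beam_width=2, steps=2)[0]
print(" ".join(best_seq), round(math.exp(best_score), 3))
```

In this toy example, greedy search would commit to “sat” (p = 0.40) and end at “The cat sat down” (total p = 0.28), while beam search keeps “ran” alive and finds “The cat ran fast” (total p = 0.315) — exactly the advantage over greedy decoding described above.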
In summary, beam search provides a balance between greedy search (which considers only one sequence) and exhaustive search (which considers all possible sequences), aiming to efficiently find a high-quality sequence in a large search space.