RE: Why do I need to add positional embeddings to my transformer inputs?
Why do I need to add positional embeddings to my transformer inputs? I thought the transformer was supposed to be so smart; how does it not know about input positions a priori?
By design, the Transformer does not take the position or order of its inputs into account. Its self-attention mechanism computes attention weights purely from the content of the query and key vectors, so it treats the inputs as an unordered set: permuting the input tokens simply permutes the outputs in the same way, as the sketch below demonstrates.
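Here is a minimal NumPy sketch of this permutation equivariance, using a single attention head with randomly initialized projection matrices (the matrix names and sizes are illustrative assumptions, not from any particular implementation):

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Plain scaled dot-product self-attention over the rows of x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ v

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
x = rng.normal(size=(seq_len, d_model))              # token embeddings, no positional info
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

perm = rng.permutation(seq_len)
out = self_attention(x, w_q, w_k, w_v)
out_perm = self_attention(x[perm], w_q, w_k, w_v)

# Shuffling the inputs just shuffles the outputs: the layer has no notion of order.
print(np.allclose(out[perm], out_perm))              # True
```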
Positional embeddings are added to give the model a sense of order: they inject information about where each token sits in the sequence. Without them, a Transformer cannot tell "dog bites man" from "man bites dog", even though word order is crucial in tasks like natural language understanding.
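As a concrete example, here is a minimal sketch of the fixed sinusoidal scheme from the original "Attention Is All You Need" paper (assuming an even embedding dimension; learned positional embeddings are an equally common alternative):

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of fixed positional encodings."""
    positions = np.arange(seq_len)[:, None]             # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]             # (1, d_model // 2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                          # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)                          # odd dimensions: cosine
    return pe

# Usage: add the encodings to the (hypothetical) token embeddings so that the
# same token at different positions produces a different input vector.
seq_len, d_model = 10, 16
token_embeddings = np.random.default_rng(0).normal(size=(seq_len, d_model))
model_input = token_embeddings + sinusoidal_positional_encoding(seq_len, d_model)
```

Because each position gets a unique pattern of phases across frequencies, the attention layers can recover both absolute and relative position information from these added vectors.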
So in essence, the need for positional embeddings is not a flaw in the Transformer. Order-invariant attention is a deliberate design choice, and adding positional embeddings is the standard way to reintroduce sequence information for tasks where order matters.