Interview question at AMD

How does the self-attention layer work in transformers?
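
One way to approach the question is with a minimal sketch of single-head scaled dot-product self-attention, written here in NumPy. The function name `self_attention`, the weight matrices `W_q`, `W_k`, `W_v`, and the toy dimensions are illustrative assumptions and do not come from the original question. Multi-head splitting, masking, and the final output projection are left out for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the per-row maximum before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    X:        (seq_len, d_model) token embeddings
    W_q, W_k: (d_model, d_k) query and key projections
    W_v:      (d_model, d_v) value projection
    """
    Q = X @ W_q  # what each token is looking for
    K = X @ W_k  # what each token offers for matching
    V = X @ W_v  # the content each token contributes
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k)
    # so the softmax inputs do not grow with the key dimension.
    scores = Q @ K.T / np.sqrt(d_k)
    # Each row becomes a probability distribution over all positions.
    weights = softmax(scores, axis=-1)
    # Each output row is a weighted average of the value vectors.
    return weights @ V, weights

# Toy inputs (assumed sizes, random values).
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

out, weights = self_attention(X, W_q, W_k, W_v)
```

In this sketch `out` has one row per input token, and each row of `weights` sums to 1, showing how much that token attends to every position in the same sequence, itself included. The queries, keys, and values are all computed from the same input `X`, which is what makes it "self" attention.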