Self-attention is the primary building block of large language models (LLMs) and of transformers in general.
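
As a quick point of reference for the mechanism discussed here, the sketch below implements single-head scaled dot-product self-attention, softmax(QK^T / sqrt(d)) V, in plain NumPy. The projection matrices `W_q`, `W_k`, `W_v` and the toy dimensions are illustrative assumptions, not taken from any particular model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product self-attention.

    X:            (seq_len, d_model) token embeddings.
    W_q, W_k, W_v: (d_model, d_head) projection matrices.
    Returns:      (seq_len, d_head) context vectors.
    """
    Q = X @ W_q  # queries
    K = X @ W_k  # keys
    V = X @ W_v  # values
    d_head = Q.shape[-1]
    # Each token scores every token in the sequence, then mixes the values.
    scores = Q @ K.T / np.sqrt(d_head)
    weights = softmax(scores, axis=-1)
    return weights @ V

# Toy usage: 4 tokens, model dimension 8, head dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # (4, 4)
```

In a full transformer block this operation is applied in parallel across multiple heads and followed by a feed-forward layer, but the core computation is the weighted mixing of value vectors shown above.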