Large language models (LLMs) have significantly advanced artificial intelligence (AI) and natural language processing (NLP), excelling in tasks such as text generation, machine translation, question answering and sentiment analysis, often at levels rivaling human performance. This paper reviews the foundations, advancements and applications of LLMs, beginning with the transformer architecture, which improved on earlier models such as recurrent neural networks and convolutional neural networks through self-attention mechanisms that capture long-range dependencies and contextual relationships. Key innovations such as masked language modeling and causal language modeling underpin leading models like Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT) series. The paper highlights the scaling laws, growth in model size and advanced training techniques that have driven LLMs' progress. It also examines methodologies for enhancing their precision and adaptability, including parameter-efficient fine-tuning and prompt engineering. Challenges such as high computational demands, bias and hallucination are addressed, along with mitigations such as retrieval-augmented generation (RAG) to improve factual accuracy. By discussing the strengths, limitations and transformative potential of LLMs, this paper provides researchers, practitioners and students with a comprehensive overview of the field. It underscores the importance of ongoing research to improve efficiency, manage ethical concerns and shape the future of AI and language technologies.
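As a compact illustration of the mechanism referenced above, the scaled dot-product self-attention at the core of the transformer can be written in its standard form (the notation below is the common one from the literature, not necessarily this paper's):

\[
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
\]

where \(Q\), \(K\) and \(V\) are the query, key and value matrices derived from the input tokens and \(d_k\) is the key dimension; because the softmax produces attention weights between every pair of token positions, each token can draw on information from anywhere in the sequence, which is what enables the long-range dependency modeling noted above.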