Large language model in the context of "Fine-tuning (deep learning)"

⭐ Core Definition: Large language model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) and provide the core capabilities of modern chatbots. LLMs can be fine-tuned for specific tasks or guided by prompt engineering. These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained on.

They consist of billions to trillions of parameters and operate as general-purpose sequence models, generating, summarizing, translating, and reasoning over text. LLMs represent a significant new technology in their ability to generalize across tasks with minimal task-specific supervision, enabling capabilities like conversational agents, code generation, knowledge retrieval, and automated reasoning that previously required bespoke systems.
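As a rough illustration of what "self-supervised" means here, the training targets can be read directly off the raw text, with no human labels: each position's target is simply the token that follows it. The sketch below is a minimal example under stated assumptions. The function name `next_token_pairs`, the `context_size` parameter, and the whitespace tokenization are all hypothetical simplifications; production LLMs use subword tokenizers and far longer contexts.

```python
def next_token_pairs(text, context_size=3):
    """Turn raw text into (context, target) training pairs.

    Assumption: tokens are whitespace-separated words. The "label" for
    each context is just the next token in the same text, which is why
    no manual annotation is needed.
    """
    tokens = text.split()
    pairs = []
    for i in range(1, len(tokens)):
        # Keep at most `context_size` tokens immediately before position i.
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


for context, target in next_token_pairs("the cat sat on the mat"):
    print(context, "->", target)
```

A model trained on such pairs learns to predict the target from the context; the same objective underlies language generation, where the model's own predictions are fed back in as new context.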

In this Dossier

Large language model in the context of Generative literature

Generative literature is poetry or fiction that is automatically generated, often using computers. It is a genre of electronic literature, and also related to generative art.

John Clark's Latin Verse Machine (1830–1843) is probably the first example of mechanised generative literature, while Christopher Strachey's love letter generator (1952) is the first digital example. With the large language models (LLMs) of the 2020s, generative literature is becoming increasingly common.

Large language model in the context of Generative artificial intelligence

Generative artificial intelligence (Generative AI, or GenAI) is a subfield of artificial intelligence that uses generative models to produce text, images, videos, audio, software code or other forms of data. These models learn the underlying patterns and structures of their training data and use them to produce new data based on the input, which often comes in the form of natural language prompts.

Generative AI tools have become more common since the AI boom in the 2020s. This boom was made possible by improvements in transformer-based deep neural networks, particularly large language models (LLMs). Major tools include chatbots such as ChatGPT, Copilot, Gemini, Claude, Grok, and DeepSeek; text-to-image models such as Stable Diffusion, Midjourney, and DALL-E; and text-to-video models such as Veo and Sora. Technology companies developing generative AI include OpenAI, xAI, Anthropic, Meta AI, Microsoft, Google, Mistral AI, DeepSeek, Baidu and Yandex.

Large language model in the context of Language model

A language model is a model of the human brain's ability to produce natural language. Language models are useful for a variety of tasks, including speech recognition, machine translation, natural language generation (generating more human-like text), optical character recognition, route optimization, handwriting recognition, grammar induction, and information retrieval.

Large language models (LLMs), the most advanced form of language model as of 2019, are predominantly based on transformers trained on larger datasets (frequently using texts scraped from the public internet). They have superseded recurrent neural network-based models, which had previously superseded the purely statistical models, such as the word n-gram language model.
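For contrast with transformer-based models, the word n-gram approach mentioned above can be sketched in a few lines. This is a minimal bigram (n = 2) example, not a reference implementation: the function names `train_bigram` and `next_word_probs` are hypothetical, tokenization is assumed to have been done already, and no smoothing is applied, so unseen word pairs get no probability at all.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Maximum-likelihood estimate of P(next word | previous word)."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # the model has never seen `prev` followed by anything
    return {word: n / total for word, n in followers.items()}


tokens = "the cat sat on the mat".split()
model = train_bigram(tokens)
print(next_word_probs(model, "the"))
```

Because the model only looks one word back and relies on raw counts, it cannot use longer-range context, which is one reason neural models replaced it.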

Large language model in the context of Chatbot

A chatbot (originally chatterbot) is a software application or web interface designed to have textual or spoken conversations. Modern chatbots are typically online and use generative artificial intelligence systems that are capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such chatbots often use deep learning and natural language processing, but simpler chatbots have existed for decades.

Chatbots have grown in popularity as part of the AI boom of the 2020s, driven by the success of ChatGPT and followed by competitors such as Gemini, Claude and later Grok. AI chatbots typically use a foundational large language model, such as GPT-4 or the Gemini language model, which is fine-tuned for specific uses.

Large language model in the context of Prompt (natural language)

Prompt engineering is the process of structuring or crafting an instruction in order to produce better outputs from a generative artificial intelligence (AI) model. It typically involves designing clear queries, adding relevant context, and refining wording to guide the model toward more accurate, useful, and consistent responses.

A prompt is natural language text describing the task that an AI should perform. A prompt for a text-to-text language model can be a query, a command, or a longer statement including context, instructions, and conversation history. Prompt engineering may involve phrasing a query, specifying a style, choice of words and grammar, providing relevant context, or describing a character for the AI to mimic.
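The pieces a prompt may contain (context, instructions, conversation history, a requested style) can be assembled programmatically. The sketch below is one possible layout, assuming a plain-text prompt. The function `build_prompt` and the labels `Style:`, `Context:`, and `Task:` are hypothetical choices for illustration and are not a format required by any particular model.

```python
def build_prompt(instruction, context="", history=(), style=""):
    """Assemble a text prompt from optional parts plus a required task.

    `history` is a sequence of (speaker, message) pairs. All section
    labels are illustrative assumptions, not a standard.
    """
    parts = []
    if style:
        parts.append(f"Style: {style}")
    if context:
        parts.append(f"Context: {context}")
    for speaker, message in history:
        parts.append(f"{speaker}: {message}")
    parts.append(f"Task: {instruction}")
    return "\n".join(parts)


print(build_prompt(
    "Summarize the report in two sentences.",
    context="The report covers third-quarter sales.",
    history=[("User", "Hello"), ("Assistant", "Hi, how can I help?")],
    style="formal",
))
```

Keeping the parts separate makes it easy to refine one element, such as the wording of the task, while holding the rest fixed.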

Large language model in the context of AI boom

An AI boom is a period of rapid growth in the field of artificial intelligence (AI). The current boom is ongoing: it began gradually between 2010 and 2016 and accelerated in the 2020s. Examples include generative AI technologies, such as large language models and AI image generators developed by companies like OpenAI, as well as scientific advances, such as protein folding prediction led by Google DeepMind. This period is sometimes referred to as an AI spring, a term used to differentiate it from previous AI winters. As of 2025, ChatGPT has emerged as the 4th most visited website globally, surpassed only by Google, YouTube, and Facebook.

Large language model in the context of Transformer (machine learning model)

In deep learning, the transformer is an artificial neural network architecture based on the multi-head attention mechanism, in which text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table. At each layer, each token is then contextualized within the scope of the context window with other (unmasked) tokens via a parallel multi-head attention mechanism, allowing the signal for key tokens to be amplified and less important tokens to be diminished.

Transformers have the advantage of having no recurrent units, therefore requiring less training time than earlier recurrent neural architectures (RNNs) such as long short-term memory (LSTM). Later variations have been widely adopted for training large language models (LLMs) on large (language) datasets.
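The token lookup and attention step described in this section can be sketched in pure Python. This is a heavily simplified illustration under stated assumptions: the vocabulary `VOCAB` and the embedding values in `EMBEDDINGS` are made up, there is a single head rather than several, queries, keys and values are the raw embeddings with no learned projections, and the mask is assumed to be causal (each token sees only itself and earlier tokens).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_self_attention(vectors):
    """Single-head scaled dot-product attention with a causal mask."""
    dim = len(vectors[0])
    outputs = []
    for i, query in enumerate(vectors):
        visible = vectors[: i + 1]  # later tokens are masked out
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        weights = softmax(scores)  # higher score, larger share of the output
        outputs.append([
            sum(w * vec[d] for w, vec in zip(weights, visible))
            for d in range(dim)
        ])
    return outputs


# Hypothetical toy vocabulary and embedding table (values are invented).
VOCAB = {"the": 0, "cat": 1, "sat": 2}
EMBEDDINGS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]


def embed(text):
    """Map whitespace-split words to token ids, then to vectors by lookup."""
    return [EMBEDDINGS[VOCAB[word]] for word in text.split()]


for row in causal_self_attention(embed("the cat sat")):
    print(row)
```

Each output row is a weighted average of the visible token vectors, so tokens with higher attention scores contribute more. The outer loop here is only for readability: every position is computed independently of the others, which is what lets real implementations process all positions in parallel, unlike a recurrent network.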

Large language model in the context of Microsoft Copilot

Microsoft Copilot is a generative artificial intelligence chatbot developed by Microsoft AI, a division of Microsoft. Based on OpenAI's GPT-4 and GPT-5 series of large language models, it was launched in 2023 as Microsoft's main replacement for the discontinued Cortana.

The service was introduced in February 2023 under the name Bing Chat, as a built-in feature for Microsoft Bing and Microsoft Edge. Over the course of 2023, Microsoft began to unify the Copilot branding across its various chatbot products, cementing the "copilot" analogy. At its Build 2023 conference, Microsoft announced its plans to integrate Copilot into Windows 11, allowing users to access it directly through the taskbar. In January 2024, a dedicated Copilot key was announced for Windows keyboards.
