Fine-tuning (deep learning) in the context of Backpropagation


⭐ Core Definition: Fine-tuning (deep learning)

Fine-tuning (in deep learning) is the process of adapting a model trained for one task (the upstream task) to perform a different, usually more specific, task (the downstream task). It is considered a form of transfer learning, as it reuses knowledge learned from the original training objective.

Fine-tuning involves applying additional training (for example, on new task-specific data) to the parameters of a pre-trained neural network. Many variants exist. The additional training can be applied to the entire network or to only a subset of its layers, in which case the layers that are not being fine-tuned are "frozen", meaning their weights are not updated during backpropagation. A model may also be augmented with "adapters": lightweight modules inserted into the model's architecture that adjust its internal representations for the new domain. Adapters contain far fewer parameters than the original model and enable parameter-efficient fine-tuning, since only the adapter weights are trained while the rest of the model's weights remain frozen.
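To make the freezing and adapter ideas concrete, here is a minimal PyTorch sketch; the backbone, adapter bottleneck size, learning rate, and dummy data are illustrative assumptions rather than any particular recipe. Because the backbone's parameters have requires_grad set to False, backpropagation updates only the adapter and the new task head.

```python
import torch
import torch.nn as nn

# Stand-in for a pre-trained backbone (hypothetical; any pre-trained network
# with accessible sub-modules behaves the same way under fine-tuning).
backbone = nn.Sequential(
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
)
head = nn.Linear(256, 10)  # new task-specific head for the downstream task

# Freeze the backbone: frozen parameters receive no updates during
# backpropagation because no gradients are accumulated for them.
for param in backbone.parameters():
    param.requires_grad = False

# A lightweight "adapter": a residual bottleneck whose few parameters can be
# trained while everything around it stays frozen.
class Adapter(nn.Module):
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(torch.relu(self.down(x)))

adapter = Adapter(256)

# Only the adapter and the new head are optimized; the backbone is untouched.
trainable = list(adapter.parameters()) + list(head.parameters())
optimizer = torch.optim.AdamW(trainable, lr=1e-4)

# One fine-tuning step on a dummy batch.
x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(head(adapter(backbone(x))), y)
loss.backward()          # gradients flow only into the adapter and head
optimizer.step()
optimizer.zero_grad()
```

Full fine-tuning is recovered by simply leaving requires_grad set to True on every parameter and passing all of them to the optimizer.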


In this Dossier

Fine-tuning (deep learning) in the context of Large language model

A large language model (LLM) is a language model trained with self-supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation. The largest and most capable LLMs are generative pre-trained transformers (GPTs) and provide the core capabilities of modern chatbots. LLMs can be fine-tuned for specific tasks or guided by prompt engineering. These models acquire predictive power regarding syntax, semantics, and ontologies inherent in human language corpora, but they also inherit inaccuracies and biases present in the data they are trained on.

They consist of billions to trillions of parameters and operate as general-purpose sequence models, generating, summarizing, translating, and reasoning over text. LLMs are notable for their ability to generalize across tasks with minimal task-specific supervision, enabling capabilities such as conversational agents, code generation, knowledge retrieval, and automated reasoning that previously required bespoke systems.
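As a rough illustration of task-specific fine-tuning of an LLM, the sketch below updates only the final transformer block of a small pre-trained causal language model using the Hugging Face transformers library. The checkpoint name ("gpt2"), the learning rate, and the toy training text are assumptions made for the example, and the module path model.transformer.h is specific to GPT-2-style checkpoints.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative checkpoint; any causal LM would do
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Freeze every parameter, then unfreeze only the final transformer block,
# so only a small fraction of the model is adapted to the downstream data.
for param in model.parameters():
    param.requires_grad = False
for param in model.transformer.h[-1].parameters():
    param.requires_grad = True

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)

# A toy "downstream" example; in practice this would be domain-specific text.
text = "Fine-tuning adapts a pre-trained model to a downstream task."
batch = tokenizer(text, return_tensors="pt")

model.train()
outputs = model(**batch, labels=batch["input_ids"])  # causal-LM loss
outputs.loss.backward()   # gradients reach only the unfrozen block
optimizer.step()
optimizer.zero_grad()
```

The same pattern extends to adapter-based methods, where small inserted modules are trained instead of any of the original blocks.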
