Regular expression in the context of "AWK"

Play Trivia Questions online!

or

Skip to study material about Regular expression in the context of "AWK"




⭐ Core Definition: Regular expression

A regular expression (shortened as regex or regexp), sometimes referred to as a rational expression, is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.

The concept of regular expressions began in the 1950s, when the American mathematician Stephen Cole Kleene formalized the concept of a regular language. They came into common use with Unix text-processing utilities. Different syntaxes for writing regular expressions have existed since the 1980s, one being the POSIX standard and another, widely used, being the Perl syntax.

↓ Menu

👉 Regular expression in the context of AWK

AWK (/ɔːk/) is a scripting language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and it is a standard feature of most Unix-like operating systems. The shell command that runs the AWK processor is named awk.

The AWK language is a data-driven scripting language consisting of a set of actions to be taken against streams of textual data – either run directly on files or used as part of a pipeline – for purposes of extracting or transforming text, such as producing formatted reports. The language extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions. While AWK has a limited intended application domain and was especially designed to support one-liner programs, the language is Turing-complete, and even the early Bell Labs users of AWK often wrote well-structured large AWK programs.

↓ Explore More Topics
In this Dossier

Regular expression in the context of Regular language

In theoretical computer science and formal language theory, a regular language (also called a rational language) is a formal language that can be defined by a regular expression, in the strict sense in theoretical computer science (as opposed to many modern regular expression engines, which are augmented with features that allow the recognition of non-regular languages).

Alternatively, a regular language can be defined as a language recognised by a finite automaton. The equivalence of regular expressions and finite automata is known as Kleene's theorem (after American mathematician Stephen Cole Kleene). In the Chomsky hierarchy, regular languages are the languages generated by Type-3 grammars.

↑ Return to Menu

Regular expression in the context of OpenBSD

OpenBSD is a security-focused, free software, Unix-like operating system based on the Berkeley Software Distribution (BSD). Theo de Raadt created OpenBSD in 1995 by forking NetBSD 1.0. The OpenBSD project emphasizes portability, standardization, correctness, proactive security, and integrated cryptography.

The OpenBSD project maintains portable versions of many subsystems as packages for other operating systems. Because of the project's preferred BSD license, which allows binary redistributions without the source code, many components are reused in proprietary and corporate-sponsored software projects. The firewall code in Apple's macOS is based on OpenBSD's PF firewall code, Android's Bionic C standard library is based on OpenBSD code, LLVM uses OpenBSD's regular expression library, and Windows 10 uses OpenSSH (OpenBSD Secure Shell) with LibreSSL.

↑ Return to Menu

Regular expression in the context of Nondeterministic finite automaton

In automata theory, a finite-state machine is called a deterministic finite automaton (DFA), if

  • each of its transitions is uniquely determined by its source state and input symbol, and
  • reading an input symbol is required for each state transition.

A nondeterministic finite automaton (NFA), or nondeterministic finite-state machine, does not need to obey these restrictions. In particular, every DFA is also an NFA. Sometimes the term NFA is used in a narrower sense, referring to an NFA that is not a DFA, but not in this article.

↑ Return to Menu

Regular expression in the context of Kate (text editor)

The KDE Advanced Text Editor, or Kate, is a source code editor developed by the KDE free software community. It has been a part of KDE Software Compilation since version 2.2, which was first released in 2001. Intended for software developers, it features syntax highlighting, code folding, customizable layouts, multiple cursors and selections, regular expression support, and extensibility via plugins. The text editor's mascot is Kate the Cyber Woodpecker.

↑ Return to Menu

Regular expression in the context of Wordfilter

A wordfilter (sometimes referred to as just "filter" or "censor") is a script typically used on Internet forums or chat rooms that automatically scans users' posts or comments as they are submitted and automatically changes or censors particular words or phrases.

The most basic wordfilters search only for specific strings of letters, and remove or overwrite them regardless of their context. More advanced wordfilters make some exceptions for context (such as filtering "butt" but not "butter"), and more advanced wordfilters may use regular expressions.

↑ Return to Menu

Regular expression in the context of SNOBOL

SNOBOL (StriNg Oriented and symBOlic Language) is a series of programming languages developed between 1962 and 1967 at AT&T Bell Laboratories by David J. Farber, Ralph Griswold and Ivan P. Polonsky, culminating in SNOBOL4. It was one of a number of text-string-oriented languages developed during the 1950s and 1960s; others included COMIT and TRAC. Despite the similar name, it is entirely unlike COBOL.

SNOBOL4 stands apart from most programming languages of its era by having patterns as a first-class data type, a data type whose values can be manipulated in all ways permitted to any other data type in the programming language, and by providing operators for pattern concatenation and alternation. SNOBOL4 patterns are a type of object and admit various manipulations, much like later object-oriented languages such as JavaScript whose patterns are known as regular expressions. In addition SNOBOL4 strings generated during execution can be treated as programs and either interpreted or compiled and executed (as in the eval function of other languages).

↑ Return to Menu