Statistical dispersion in the context of "Summary statistic"

⭐ Core Definition: Statistical dispersion

In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed. Common examples of measures of statistical dispersion are the variance, standard deviation, and interquartile range. For instance, when the variance of data in a set is large, the data is widely scattered. On the other hand, when the variance is small, the data in the set is clustered.

Dispersion is contrasted with location or central tendency, and together they are the most used properties of distributions.
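
As a rough illustration of the definition above, the following Python sketch compares two made-up data sets that share the same mean but differ in spread. The sample values and variable names are assumptions chosen for illustration; only the standard-library `statistics` module is used.

```python
import statistics

# Two hypothetical data sets with the same mean (50) but different spread.
clustered = [48, 49, 50, 51, 52]
scattered = [10, 30, 50, 70, 90]

for name, data in (("clustered", clustered), ("scattered", scattered)):
    mean = statistics.mean(data)
    var = statistics.pvariance(data)  # population variance
    sd = statistics.pstdev(data)      # population standard deviation
    print(f"{name}: mean={mean}, variance={var}, sd={sd:.2f}")
```

Both sets have the same location, yet the variance of the second is several hundred times larger, which is the sense in which location and dispersion describe different properties of a distribution.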

In this Dossier

Statistical dispersion in the context of Climate

Climate is the long-term weather pattern in a region, typically averaged over 30 years. More rigorously, it is the mean and variability of meteorological variables over a time spanning from months to millions of years. Some of the meteorological variables that are commonly measured are temperature, humidity, atmospheric pressure, wind, and precipitation. In a broader sense, climate is the state of the components of the climate system, including the atmosphere, hydrosphere, cryosphere, lithosphere and biosphere and the interactions between them. The climate of a location is affected by its latitude, longitude, terrain, altitude, land use and nearby water bodies and their currents.

Climates can be classified according to the average and typical variables, most commonly temperature and precipitation. The most widely used classification scheme is the Köppen climate classification. The Thornthwaite system, in use since 1948, incorporates evapotranspiration along with temperature and precipitation information and is used in studying biological diversity and how climate change affects it. The major classifications in Thornthwaite's climate classification are microthermal, mesothermal, and megathermal. Finally, the Bergeron and Spatial Synoptic Classification systems focus on the origin of air masses that define the climate of a region.

Statistical dispersion in the context of Variance

In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from its average value. It is the second central moment of a distribution, and the covariance of the random variable with itself, and it is often represented by σ², s², Var(X), V(X), or 𝕍(X).

An advantage of variance as a measure of dispersion is that it is more amenable to algebraic manipulation than other measures of dispersion such as the expected absolute deviation; for example, the variance of a sum of uncorrelated random variables is equal to the sum of their variances. A disadvantage of the variance for practical applications is that, unlike the standard deviation, its units differ from the random variable, which is why the standard deviation is more commonly reported as a measure of dispersion once the calculation is finished. Another disadvantage is that the variance is not finite for many distributions.
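
The additivity property mentioned above can be checked numerically. The sketch below is a minimal example, assuming small hand-built lists `x` and `y` (hypothetical data constructed so that their covariance is exactly zero) and a helper `pvariance_by_definition` introduced here for illustration.

```python
def pvariance_by_definition(values):
    """Population variance: mean of squared deviations from the mean."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

# Hypothetical paired observations, constructed so x and y are uncorrelated.
x = [-1, 1, -1, 1]
y = [-1, -1, 1, 1]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

total = [a + b for a, b in zip(x, y)]

var_x = pvariance_by_definition(x)
var_y = pvariance_by_definition(y)
var_sum = pvariance_by_definition(total)

print(covariance)              # 0.0, so x and y are uncorrelated
print(var_x + var_y, var_sum)  # 2.0 2.0
print(var_x ** 0.5)            # standard deviation, in the same units as x
```

If `x` were measured in metres, `var_x` would be in square metres, while its square root is back in metres, which is why the standard deviation is usually the figure reported.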

Statistical dispersion in the context of Descriptive statistics

A descriptive statistic (in the count noun sense) is a summary statistic that quantitatively describes or summarizes features from a collection of information, while descriptive statistics (in the mass noun sense) is the process of using and analysing those statistics. Descriptive statistics is distinguished from inferential statistics (or inductive statistics) by its aim to summarize a sample, rather than use the data to learn about the population that the sample of data is thought to represent. This generally means that descriptive statistics, unlike inferential statistics, is not developed on the basis of probability theory and is frequently nonparametric. Even when a data analysis draws its main conclusions using inferential statistics, descriptive statistics are generally also presented. For example, papers reporting on human subjects typically include a table giving the overall sample size, sample sizes in important subgroups (e.g., for each treatment or exposure group), and demographic or clinical characteristics such as the average age, the proportion of subjects of each sex, and the proportion of subjects with related co-morbidities.

Some measures that are commonly used to describe a data set are measures of central tendency and measures of variability or dispersion. Measures of central tendency include the mean, median and mode, while measures of variability include the standard deviation (or variance), the minimum and maximum values of the variables, kurtosis and skewness.
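
A descriptive summary of this kind might be assembled as follows. This is only a sketch: the `ages` list is invented sample data, and the choice of the sample standard deviation (n − 1 denominator) is an assumption rather than something the text prescribes.

```python
import statistics

# Hypothetical ages of study participants.
ages = [23, 25, 25, 31, 38, 42, 47]

summary = {
    "n": len(ages),
    # Measures of central tendency
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "mode": statistics.mode(ages),
    # Measures of variability
    "sd": statistics.stdev(ages),  # sample SD (n - 1 denominator)
    "min": min(ages),
    "max": max(ages),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```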

Statistical dispersion in the context of Interquartile range

In descriptive statistics, the interquartile range (IQR) is a measure of statistical dispersion, that is, of the spread of the data. The IQR may also be called the midspread, middle 50%, fourth spread, or H‑spread. It is defined as the difference between the 75th and 25th percentiles of the data. To calculate the IQR, the rank-ordered data set is divided into four equal parts, or quartiles, with linear interpolation used where a quartile falls between two data points. The quartiles are denoted by Q1 (also called the lower quartile), Q2 (the median), and Q3 (also called the upper quartile). The lower quartile corresponds to the 25th percentile and the upper quartile to the 75th percentile, so IQR = Q3 − Q1.
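
The calculation can be sketched in Python as below. Several quartile conventions exist; this example assumes the one that interpolates linearly at position (n − 1)·p in the sorted data, which matches `statistics.quantiles(..., method="inclusive")`. The helper `percentile_linear` and the data values are hypothetical.

```python
import statistics

def percentile_linear(sorted_values, p):
    """Quantile for 0 <= p <= 1 by linear interpolation between ranks."""
    position = (len(sorted_values) - 1) * p
    lower = int(position)
    frac = position - lower
    if lower + 1 >= len(sorted_values):
        return float(sorted_values[-1])
    step = sorted_values[lower + 1] - sorted_values[lower]
    return sorted_values[lower] + frac * step

data = sorted([7, 2, 10, 4, 12, 5, 9, 4])  # hypothetical sample
q1 = percentile_linear(data, 0.25)
q2 = percentile_linear(data, 0.50)
q3 = percentile_linear(data, 0.75)
iqr = q3 - q1

print(q1, q2, q3, iqr)  # 4.0 6.0 9.25 5.25
# The standard library gives the same quartiles under the same convention.
print(statistics.quantiles(data, n=4, method="inclusive"))  # [4.0, 6.0, 9.25]
```

Other conventions (for example the default `method="exclusive"`) can give slightly different quartiles for small samples, so the method should be stated when reporting an IQR.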

The IQR is an example of a trimmed estimator, defined as the 25% trimmed range, which improves the accuracy of data set statistics by dropping outlying points that contribute little. It is also used as a robust measure of scale. It can be clearly visualized by the box on a box plot.
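
The robustness claim can be illustrated by adding a single extreme value to a sample and comparing how the IQR and the standard deviation respond. The data and the helper `iqr` below are assumptions for illustration, and the "inclusive" quartile method is one convention among several.

```python
import statistics

def iqr(values):
    """Interquartile range using the 'inclusive' interpolation method."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

base = [2, 4, 4, 5, 7, 9, 10, 12]
with_outlier = base + [500]  # one hypothetical extreme value

print(iqr(base), iqr(with_outlier))  # 5.25 6.0
print(statistics.stdev(base), statistics.stdev(with_outlier))
```

The IQR barely moves, while the standard deviation grows by well over an order of magnitude, because the outlier lies outside the middle 50% of the data that the IQR measures.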

Statistical dispersion in the context of Precision (statistics)

In statistics, the precision matrix or concentration matrix is the matrix inverse of the covariance matrix or dispersion matrix, P = Σ⁻¹. For univariate distributions, the precision matrix degenerates into a scalar precision, defined as the reciprocal of the variance, p = 1/σ².

Other summary statistics of statistical dispersion also called precision (or imprecision) include the reciprocal of the standard deviation, p = 1/σ; the standard deviation itself and the relative standard deviation; as well as the standard error and the confidence interval (or its half-width, the margin of error).
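
For a concrete sense of these definitions, the sketch below inverts a small covariance matrix by hand. The 2×2 matrix values and the helper `invert_2x2` are hypothetical; a real analysis would more likely use a linear algebra library.

```python
def invert_2x2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular; no precision matrix")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical covariance matrix of two variables.
covariance = [[4.0, 2.0], [2.0, 3.0]]
precision = invert_2x2(covariance)
print(precision)  # [[0.375, -0.25], [-0.25, 0.5]]

# Univariate case: scalar precision is the reciprocal of the variance.
variance = 4.0
scalar_precision = 1.0 / variance
print(scalar_precision)  # 0.25
```

Note that a large variance corresponds to a small precision, so precision measures concentration rather than spread.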

Statistical dispersion in the context of Quartile

In statistics, quartiles are a type of quantile that divides the data points into four parts, or quarters, of more-or-less equal size. The data must be ordered from smallest to largest to compute quartiles; as such, quartiles are a form of order statistic. The three quartiles, which produce four divisions of the data, are the first quartile Q1 (the lower quartile, or 25th percentile), the second quartile Q2 (the median), and the third quartile Q3 (the upper quartile, or 75th percentile).

Along with the minimum and maximum of the data (which are also quartiles), the three quartiles described above provide a five-number summary of the data. This summary is important in statistics because it provides information about both the center and the spread of the data. Knowing the lower and upper quartile provides information on how big the spread is and if the dataset is skewed toward one side. Since quartiles divide the number of data points evenly, the range is generally not the same between adjacent quartiles (i.e. usually (Q3 - Q2) ≠ (Q2 - Q1)). Interquartile range (IQR) is defined as the difference between the 75th and 25th percentiles or Q3 - Q1. While the maximum and minimum also show the spread of the data, the upper and lower quartiles can provide more detailed information on the location of specific data points, the presence of outliers in the data, and the difference in spread between the middle 50% of the data and the outer data points.
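
A five-number summary can be sketched in a few lines of Python. The function name `five_number_summary` and the sample are assumptions for illustration, and the "inclusive" quartile method is only one of several conventions.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

data = [1, 2, 3, 4, 5, 8, 12, 15, 20]  # hypothetical right-skewed sample
minimum, q1, median, q3, maximum = five_number_summary(data)

print(minimum, q1, median, q3, maximum)  # 1 3.0 5.0 12.0 20
print(q3 - median, median - q1)          # 7.0 2.0, unequal: a sign of skew
print(q3 - q1)                           # IQR: 9.0
```

Here the upper half of the middle 50% spans 7 units while the lower half spans only 2, which reflects the longer right tail of this sample.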
