Robust statistics in the context of Median (statistics)


⭐ Core Definition: Robust statistics

Robust statistics are statistics that maintain their properties even if the underlying distributional assumptions are incorrect. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods like a t-test work poorly.
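A small simulation can make the mixture example concrete. The sketch below is an illustration only: the 10% contamination rate, the standard deviations 1 and 10, the sample size, the seed, and the helper names `draw_sample` and `spread_of` are all assumptions chosen for the demonstration, not values from the text. It compares the sample mean with the sample median as estimators of the common center 0, rather than running a t-test.

```python
import random
import statistics

def draw_sample(rng, n, contamination=0.10, wide_sd=10.0):
    """Draw n values from a mixture of N(0, 1) and N(0, wide_sd**2).

    Each value comes from the wide component with probability
    `contamination` (an assumed rate), otherwise from N(0, 1).
    """
    return [
        rng.gauss(0.0, wide_sd if rng.random() < contamination else 1.0)
        for _ in range(n)
    ]

def spread_of(estimator, rng, n=50, replicates=2000):
    """Standard deviation of an estimator across repeated samples."""
    estimates = [estimator(draw_sample(rng, n)) for _ in range(replicates)]
    return statistics.pstdev(estimates)

rng = random.Random(0)  # fixed seed so the run is repeatable
mean_spread = spread_of(statistics.mean, rng)
median_spread = spread_of(statistics.median, rng)

print(f"spread of sample mean:   {mean_spread:.3f}")
print(f"spread of sample median: {median_spread:.3f}")
```

Under these assumptions the median's estimates should cluster noticeably more tightly around 0 than the mean's, because an occasional draw from the wide component shifts the mean but rarely moves the middle value.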

Robust statistics in the context of Median

The median of a set of numbers is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as the "middle" value. Compared with the mean (often simply called the "average"), the median's key advantage in describing data is that it is not skewed by a small proportion of extremely large or small values, so it often gives a better representation of the center. Median income, for example, may be a better way to describe the center of the income distribution because increases in the largest incomes alone have no effect on the median. For this reason, the median is of central importance in robust statistics.

The median is the 2-quantile: the value that partitions a data set into two parts of equal size.
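The income example can be checked with a few lines of Python. This is a minimal sketch using the standard-library `statistics` module; the income figures are made up for illustration and do not come from any data source.

```python
import statistics

# Hypothetical annual incomes (in thousands); values are illustrative only.
incomes = [28, 31, 35, 40, 44, 52, 60]

mean_before = statistics.mean(incomes)
median_before = statistics.median(incomes)

# Raise only the largest income and recompute both summaries.
incomes_after = incomes[:-1] + [600]

mean_after = statistics.mean(incomes_after)
median_after = statistics.median(incomes_after)

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)

# The median is the single cut point returned when the data are split
# into n=2 equal-probability groups (the 2-quantile).
two_quantile = statistics.quantiles(incomes, n=2)[0]
print("2-quantile:", two_quantile)
```

The mean rises sharply after the change while the median stays at 40, which is the behavior the paragraph above describes.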

Robust statistics in the context of Outlier

In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement, an indication of novel data, or the result of experimental error; the latter are sometimes excluded from the data set. An outlier can be an indication of an exciting possibility, but can also cause serious problems in statistical analyses.

Outliers can occur by chance in any distribution, but they can indicate novel behaviour or structures in the data set, measurement error, or that the population has a heavy-tailed distribution. In the case of measurement error, one wishes to discard them or use statistics that are robust to outliers, while in the case of heavy-tailed distributions, they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution. A frequent cause of outliers is a mixture of two distributions, which may be two distinct sub-populations, or may indicate 'correct trial' versus 'measurement error'; this is modeled by a mixture model.

Robust statistics in the context of Robust measures of scale

In statistics, robust measures of scale are methods that quantify the statistical dispersion in a sample of numerical data while resisting outliers. These are contrasted with conventional or non-robust measures of scale, such as the sample standard deviation, which are greatly influenced by outliers.

The most common such robust statistics are the interquartile range (IQR) and the median absolute deviation (MAD). Alternative robust estimators have also been developed, such as those based on pairwise differences and biweight midvariance.
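As a rough illustration of how the IQR and MAD resist outliers while the sample standard deviation does not, the sketch below computes all three on a small made-up data set before and after one value is replaced by an extreme one. The helper names `iqr` and `mad` are hypothetical, the data are invented, and the quartile convention (the `"inclusive"` method of `statistics.quantiles`) is an assumption; other conventions give slightly different quartiles. The MAD here is unscaled, with no consistency constant applied.

```python
import statistics

def iqr(data):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return q3 - q1

def mad(data):
    """Median absolute deviation from the median (unscaled)."""
    center = statistics.median(data)
    return statistics.median([abs(x - center) for x in data])

# Illustrative data: the second list swaps the largest value for an outlier.
clean = [2, 3, 4, 5, 6, 7, 8, 9, 10]
contaminated = clean[:-1] + [1000]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(
        f"{name:>12}: stdev={statistics.stdev(data):.2f} "
        f"IQR={iqr(data)} MAD={mad(data)}"
    )
```

On this data the IQR and MAD are identical for both lists, while the standard deviation grows by two orders of magnitude once the outlier is present.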
