Standard Deviation
Variation always exists in a data set, regardless of which characteristics are measured, because not
every data
point will have the same exact value for every variable. It is important to be able to measure variation in a
way that best captures
The standard deviation is a measurement for the amount of variability (or spread) among the numbers in a
data set. It is the most common measure of variation for numerical data.
In very rough terms, standard deviation
is the average distance from the mean. The more concentrated data is, the smaller the
standard
Where,
n = the number of values in the data set
x = the numbers in the data set
x̄ = the mean of the data set
The standard deviation of the entire population is denoted by
lowercase Greek letter
"σ",
pronounced as
Properties of a standard deviation
- The standard deviation is always a positive number, due to the way it is calculated and the fact that it measures a distance (distances are never negative numbers).
- The smallest possible value for the standard deviation is zero, and that happens only in contrived situations where every single number in the data set is exactly the same.
- The standard deviation calculated from the mean and thus it is affected by outliers. When outliers are
removed from a data set, remaining values are more concentrated around the
mean, and standard deviation
decreases.[1,p.83] - The standard deviation has the same units as the original data.
A
small standard deviation means that the values in the data set are close ot the mean of the data set, on
average, and large standard deviation means that the values are spread out further away from the mean.
A small standard deviation can be a goal in certain situations, for example, in product manufacturing and
quality
The standard deviation is also used to describe where most of the data should fall compared to the
mean. For
example, if the data has the normal
distribution curve, about 95% of the data lies within two standard
deviations of the mean. This is called the empirical rule or 68-95-99.7%
Sample variance is another way to measure variation in a data set. Sample variance is equal
to standard
deviation squared, or "s2", it is an intermediate value to calculating standard deviation. The
downside
of sample variance is that it is in square units. If the data is in dollars, for example, the variance would be
in square dollars, which makes no