The Standard Deviation (s or σ) and the Coefficient of Variation (CV) are two commonly used measures of dispersion for data sets. Each has its own strengths and, as a result, its own applications. Which one to apply depends on the nature of the data, the comparison or inference to be drawn, or the control standards to be set.
Standard Deviation
The standard deviation is one of the most commonly used statistics. Its key advantages are that it is measured in the same units as the data, that it is important in setting and maintaining limits and in testing, and that it is unaffected by adding a constant to every value (it is, however, sensitive to the scale of the data). When two data sets have the same mean (or measure the same output or variable), the standard deviation is the clearest indicator of how widely each is dispersed relative to the other. For example, consider two production lines making 250 mm long screws, perhaps with acceptable deviations of +/- 5 mm in length for them to be ready for use. The two processes may yield data as follows:
In these two data sets, the standard deviation shows how much more widely dispersed the output of Production Line A is than that of Line B. The Coefficient of Variation (CV) adds no useful information in this case: the two variables are similar enough that the standard deviation alone describes their dispersion adequately for our requirements.
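The comparison can be sketched in a few lines of Python. The screw lengths below are hypothetical values chosen only to illustrate the point (they are not the measurements discussed above), and the sample standard deviation (n - 1 denominator) is an assumption; the population form would show the same pattern.

```python
import statistics

# Hypothetical screw lengths in mm (illustrative values, not measured data).
# Both lines are centred on the 250 mm target.
line_a = [244, 247, 250, 253, 256]
line_b = [249, 250, 250, 250, 251]

mean_a = statistics.mean(line_a)
mean_b = statistics.mean(line_b)

# Sample standard deviation (n - 1 in the denominator), in mm.
sd_a = statistics.stdev(line_a)
sd_b = statistics.stdev(line_b)

print(f"Line A: mean = {mean_a} mm, s = {sd_a:.2f} mm")
print(f"Line B: mean = {mean_b} mm, s = {sd_b:.2f} mm")
```

Because both hypothetical lines share the same mean, dividing each standard deviation by that mean would rank the two lines in exactly the same order, which is why the CV has nothing to add here.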
Coefficient of Variation
The Coefficient of Variation (CV) is the ratio of the standard deviation to the mean. Its strengths are that it is dimensionless and is usually expressed as a percentage, so it is neutral to the magnitude of the data. It is therefore useful for comparing the relative dispersion of distributions when the standard deviation is affected by differences in magnitude or scale. For example, consider data from samples testing particulate pollution (SPM), in parts per million, taken from two sites at intervals during the day:
As can be seen, Site A and Site B appear to be quite different; perhaps Site A is a garden or residential area while Site B is a commercial or roadside area. Comparing the standard deviations alone might suggest that Site B has much higher variation across the day. However, the higher overall (mean) levels of pollutants at Site B distort this comparison. Using the CV, we can see that there is actually higher intra-day variation at Site A, although its mean, maximum and minimum levels are all lower than those at Site B.
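A minimal sketch of this reversal, again with hypothetical readings rather than the original measurements: `coefficient_of_variation` is a helper name introduced here for illustration, and it assumes the sample standard deviation and a non-zero mean.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean (as a fraction)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean

# Hypothetical SPM readings in ppm at five times of day (illustrative only).
site_a = [20, 35, 50, 35, 20]       # lower overall levels
site_b = [180, 200, 220, 200, 180]  # higher overall levels

sd_a = statistics.stdev(site_a)
sd_b = statistics.stdev(site_b)
cv_a = coefficient_of_variation(site_a)
cv_b = coefficient_of_variation(site_b)

print(f"Site A: s = {sd_a:.2f} ppm, CV = {cv_a:.1%}")
print(f"Site B: s = {sd_b:.2f} ppm, CV = {cv_b:.1%}")
```

With these made-up readings, Site B has the larger standard deviation but Site A has the larger CV, which is the pattern described above. The CV is only meaningful for data measured on a ratio scale with a mean well away from zero, hence the guard in the helper.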