The coefficient of variation (CoV) is defined as the ratio of the standard deviation to the mean. It measures variability relative to the mean and has been recommended as a metric in many applications. In some instances, however, it can be a misleading measure. We discuss why here and make recommendations.
The bottom line is that the standard deviation is often driven by different variables than those that affect the mean. As such, using their ratio as a response can lead to misleading results unless the standard deviation and the mean respond in the same way to the variables under study.
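As a minimal sketch of this point, the Python snippet below computes the CoV for two sets of measurements. The data values and the names setting_a and setting_b are invented for illustration and do not come from any study cited here. The two settings differ in both mean and standard deviation, yet their ratio is the same, so the CoV alone cannot tell them apart.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean."""
    return stdev(values) / mean(values)

# Hypothetical measurements at two process settings (illustrative values only).
setting_a = [98.0, 100.0, 102.0]  # mean 100, sample standard deviation 2
setting_b = [49.0, 50.0, 51.0]    # mean 50, sample standard deviation 1

for name, data in [("A", setting_a), ("B", setting_b)]:
    print(name, mean(data), stdev(data), coefficient_of_variation(data))
```

Both settings give a CoV of about 0.02, even though the mean and the standard deviation each differ by a factor of two.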
We will give two references for this and include recommendations of how to proceed when the coefficient of variation, its inverse, or other derivatives of the ratio of standard deviation to the mean are inappropriate.
Finally, we include a semiconductor example in which the inverse of the CoV was not an appropriate response.
Problems of the Coefficient of Variation in Organizational Demography
In a paper titled “The Use and Misuse of the Coefficient of Variation in Organizational Demography Research,” Jesper B. Sørensen, currently The Robert A. and Elizabeth R. Jeffe Professor and Professor of Organizational Behavior at Stanford University School of Business, discusses when using the coefficient of variation as a response is inappropriate. While the paper focuses on organizational demography, the reasoning applies generally. We include two relevant quotes.
“The coefficient of variation confounds two characteristics of demographic distributions (the standard deviation and the mean) that may have independent effects on organizational outcomes.” Page 10
“A second problem arising from the use of the coefficient of variation relates to model specification. The coefficient of variation can be thought of as an interaction effect between the standard deviation and the inverse of the mean. As noted earlier, the theory may suggest that such an interaction is appropriate if the effect of the standard deviation is thought to be dampened in proportion to the mean.” Page 11
You can access Professor Sørensen’s article here.
Taguchi Signal to Noise Ratio
In a publication titled “Robust Designs” from the National Institute of Standards and Technology (NIST), the use of Taguchi’s Signal to Noise Ratio as a response is discussed. Many Six Sigma practitioners are familiar with the Signal to Noise Ratio for three types of responses, where the goal in each case is to maximize the ratio:
· Smaller is Better
· Target is Best
· Larger is Better
However, as mentioned on page 8.10 of the article, “The signal to noise ratio confounds the mean and the variance together and assumes that the variance is proportional to the mean.”
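To make the confounding concrete, here is a sketch of the three ratios in Python, using the forms commonly given in the Taguchi literature. This is an assumption on our part: the NIST publication may state them with different notation or conventions, and the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev

def sn_smaller_is_better(y):
    """-10 * log10 of the mean squared response."""
    return -10.0 * math.log10(mean([v * v for v in y]))

def sn_larger_is_better(y):
    """-10 * log10 of the mean squared reciprocal response."""
    return -10.0 * math.log10(mean([1.0 / (v * v) for v in y]))

def sn_target_is_best(y):
    """10 * log10 of (mean squared / sample variance)."""
    return 10.0 * math.log10(mean(y) ** 2 / stdev(y) ** 2)

# Hypothetical replicate measurements (illustrative values only).
y = [98.0, 100.0, 102.0]
cov = stdev(y) / mean(y)

# In this form, Target is Best is a function of the CoV alone.
print(sn_target_is_best(y), -20.0 * math.log10(cov))
```

Under these assumed formulas, the Target is Best ratio equals -20 times the base-10 logarithm of the CoV, so it carries the same confounding of mean and standard deviation discussed above.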
And on page 8.42 a final recommendation:
“Analyze the mean and variance separately and make decisions concerning the tradeoffs between reducing variance and achieving the desired mean value.”
You can access the NIST publication at this link.
Semiconductor Example: Uniformity
Some years ago, during a consulting engagement with a semiconductor manufacturer, a young engineer brought in a problem with a confidential response. Based on that response, a response surface experiment was designed, and once the experiment was completed and analyzed, optimal settings were recommended for the process.
With the settings in hand, the engineer returned to the factory and gave them to her boss. She later came back and reported that her boss said the settings bore no relation to the process under study.
She was then told that it was time to reveal exactly what the response was. It turned out to be a measure of uniformity: the mean of multiple wafer thicknesses divided by their standard deviation. (This is the inverse of the coefficient of variation.)
As mentioned in the discussion of the coefficient of variation, the variables that affect the standard deviation may differ from those that affect the mean. That was true in this case.
Fortunately, the engineer had the separate standard deviation and mean data that had gone into the uniformity measure as a function of the experimental settings.
The solution was to build one model for the standard deviation and a separate model for the mean. At this point, one could have found “optimal” settings by doing a multiple response optimization of the two responses.
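The idea of analyzing the two responses separately can be sketched with a simplified two-factor, two-level design. This is not the engineer's actual experiment (which was a response surface design with a confidential response). The factor names x1 and x2, the run values, and the main_effect helper are all hypothetical and chosen so that x1 moves only the mean and x2 moves only the standard deviation.

```python
# Hypothetical 2x2 factorial in coded units (-1/+1); values invented for illustration.
runs = [
    {"x1": -1, "x2": -1, "mean": 95.0, "sd": 2.0},
    {"x1": +1, "x2": -1, "mean": 105.0, "sd": 2.0},
    {"x1": -1, "x2": +1, "mean": 95.0, "sd": 1.0},
    {"x1": +1, "x2": +1, "mean": 105.0, "sd": 1.0},
]

def main_effect(runs, factor, response):
    """Average response at the high level minus average at the low level."""
    high = [r[response] for r in runs if r[factor] == +1]
    low = [r[response] for r in runs if r[factor] == -1]
    return sum(high) / len(high) - sum(low) / len(low)

# Uniformity as described in the text: mean divided by standard deviation.
for r in runs:
    r["uniformity"] = r["mean"] / r["sd"]

for response in ("mean", "sd", "uniformity"):
    effects = {f: main_effect(runs, f, response) for f in ("x1", "x2")}
    print(response, effects)
```

In this toy data, the separate analyses show cleanly that x1 drives the mean and x2 drives the standard deviation, while the uniformity ratio shows effects from both factors at once and hides which quantity each factor is actually changing.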
However, management wanted to find the optimal process settings expressly for the uniformity measure.