Statistical Inference, Part 2


The following is an excerpt from Chapter 3 of The Quality Engineering Handbook by Thomas Pyzdek, © Quality Publishing. It may be ordered from the Quality Publishing Order Form.

Tolerance intervals

We have found that confidence limits may be determined so that the interval between these limits will cover a population parameter with a certain confidence, that is, a certain proportion of the time. Sometimes it is desirable to obtain an interval which will cover a fixed portion of the population distribution with a specified confidence. These intervals are called tolerance intervals, and the end points of such intervals are called tolerance limits. For example, a manufacturer may wish to estimate what proportion of product will have dimensions that meet the engineering requirement. In quality engineering, tolerance intervals are typically of the form X-bar ± Ks, where K is determined so that the interval will cover a proportion P of the population with confidence γ. Confidence limits for μ are also of the form X-bar ± Ks. However, in that case we determine K so that the confidence interval covers the population mean μ a certain proportion of the time. The interval must be wider to cover a large portion of the distribution than to cover just the single value μ. Table 11 in the Appendix gives K for P = 0.75, 0.90, 0.95, 0.99, 0.999 and γ = 0.75, 0.90, 0.95, 0.99 and for many different sample sizes n.

Example of calculating a tolerance interval

Assume that a sample of n=20 from a stable process produced the following results: X-bar = 20, s = 1.5. We can estimate that the interval X-bar ± Ks = 20 ± 3.615(1.5) = 20 ± 5.4225, or the interval from 14.5775 to 25.4225, will contain 99% of the population with 95% confidence. The K values in the table assume normally distributed populations.
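The arithmetic in this example can be checked with a short script. This is only a sketch: the function name `tolerance_interval` is made up for illustration, and K is typed in from the table rather than computed.

```python
def tolerance_interval(x_bar, s, k):
    """Return (lower, upper) limits of the interval x_bar ± k*s."""
    half_width = k * s
    return x_bar - half_width, x_bar + half_width

# Values from the example: X-bar = 20, s = 1.5.
# K = 3.615 is the tabulated factor for n = 20, P = 0.99, confidence 0.95.
lower, upper = tolerance_interval(20.0, 1.5, 3.615)
print(f"{lower:.4f} to {upper:.4f}")  # 14.5775 to 25.4225
```

Because K is looked up, the same function works for any P, confidence, and n covered by the table, subject to the same normality assumption.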

Hypothesis Testing

Statistical inference generally involves four steps:

1. Formulating a hypothesis about the population or "state of nature,"

2. Collecting a sample of observations from the population,

3. Calculating statistics based on the sample,

4. Either accepting or rejecting the hypothesis based on a pre-determined acceptance criterion.

There are two types of error associated with statistical inference:

Type I error (α error): Rejecting a hypothesis that is actually true. The probability of this error, α (alpha), is known as the significance level of the test.

Type II error (β error): Accepting a hypothesis that is actually false. The probability of this error is denoted β (beta).

Type II errors are often plotted in what is known as an operating characteristics curve. Operating characteristics curves will be used extensively in subsequent chapters of this book in evaluating the properties of various statistical quality control techniques.
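As a rough illustration of what an operating characteristics curve contains, the sketch below computes β for several shifts of the true mean. It assumes a two-sided z test with a known standard deviation, which is a simplification not taken from the text, and the values sigma = 6 and n = 25 are hypothetical.

```python
from statistics import NormalDist

def beta_two_sided_z(shift, sigma, n, alpha=0.05):
    """Probability of accepting H0 when the true mean is off by `shift`.

    Assumes a two-sided z test with known sigma (a simplifying assumption).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    d = shift * n ** 0.5 / sigma  # shift expressed in standard errors
    return std_normal.cdf(z_crit - d) - std_normal.cdf(-z_crit - d)

# Points on the OC curve for hypothetical sigma = 6 and n = 25.
for shift in (0, 1, 2, 3, 4):
    print(shift, round(beta_two_sided_z(shift, sigma=6, n=25), 3))
```

With no shift the acceptance probability is 1 − α, and it falls as the shift grows; plotting β against the shift gives the OC curve.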

Confidence intervals are usually constructed as part of a statistical test of hypotheses. The hypothesis test is designed to help us make an inference about the true population value at a desired level of confidence. We will look at a few examples of how hypothesis testing can be used in quality control applications.

Example: hypothesis test of sample mean

Experiment: The nominal specification for filling a bottle with a test chemical is 30 cc. The plan is to draw a sample of n=25 units from a stable process and, using the sample mean and standard deviation, construct a two-sided confidence interval (an interval that extends on either side of the sample average) that has a 95% probability of including the true population mean. If the interval includes 30, conclude that the lot mean is 30; otherwise conclude that the lot mean is not 30.

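Result: A sample of 25 bottles was measured and the sample mean X-bar and sample standard deviation s were computed.

The appropriate test statistic is t, given by the formula t = (X-bar − μ0) / (s / √n), where μ0 = 30 cc is the hypothesized mean and n = 25 is the sample size.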

Table 6 in the Appendix gives values for the t statistic at various degrees of freedom. There are n−1 degrees of freedom. For our example we need the t.975 column and the row for 24 df. This gives a critical t value of 2.064. Since the absolute value of our test statistic (1.67) is less than this critical value, we fail to reject the hypothesis that the lot mean is 30 cc. Using statistical notation this is shown as:

H0: μ = 30 cc (the null hypothesis)

H1: μ ≠ 30 cc (the alternate hypothesis)

α = .05 (type I error or level of significance)

Acceptance region: -2.064 ≤ t0 ≤ +2.064

Test statistic: t = -1.67.

Since t lies inside the acceptance region, we fail to reject H0; the data at hand are consistent with a lot mean of 30 cc.
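The test can be sketched in a few lines of code. The sample statistics below (X-bar = 28, s = 6) are hypothetical values chosen only because they reproduce the reported t = -1.67; they are not stated in the text above. The critical value 2.064 is the tabulated t.975 for 24 df.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """One-sample t statistic: (x_bar - mu_0) / (s / sqrt(n))."""
    return (x_bar - mu_0) / (s / math.sqrt(n))

# Hypothetical sample statistics (assumed for illustration only).
x_bar, s, n = 28.0, 6.0, 25
mu_0 = 30.0      # hypothesized lot mean, cc
t_crit = 2.064   # t.975 with n - 1 = 24 degrees of freedom, from the table

t = t_statistic(x_bar, mu_0, s, n)
reject_h0 = abs(t) > t_crit
print(f"t = {t:.2f}, reject H0: {reject_h0}")  # t = -1.67, reject H0: False
```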

Example: hypothesis test of two sample variances

The variance of machine X’s output, based on a sample of n = 25 taken from a stable process, is 100. Machine Y’s variance, based on a sample of 10, is 50. The manufacturing representative from the supplier of machine X contends that the result is a mere "statistical fluke." Assuming that a "statistical fluke" is something that has less than 1 chance in 100, test the hypothesis that both variances are actually equal.

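The test statistic used to test for equality of two sample variances is the F statistic, which, for this example, is given by the equation F = s₁² / s₂² = 100 / 50 = 2, where s₁² is the sample variance of machine X and s₂² is the sample variance of machine Y.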

Using Table 8 in the Appendix for F.99 we find that for 24 df in the numerator and 9 df in the denominator the critical value is F = 4.73. Since our computed F is less than this critical value, we cannot reject the hypothesis that the variances are equal: the manufacturer of machine X could be right, and the result could be a statistical fluke. This example demonstrates the volatile nature of the sampling error of sample variances and standard deviations.
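A minimal sketch of this comparison follows. The variable names are illustrative, and the critical value is typed in from the table rather than computed.

```python
# Sample variances and sample sizes from the example.
var_x, n_x = 100.0, 25   # machine X
var_y, n_y = 50.0, 10    # machine Y

f_stat = var_x / var_y           # larger variance in the numerator
df_num, df_den = n_x - 1, n_y - 1
f_crit = 4.73                    # F.99 for 24 and 9 df, from the table

reject_equal_variances = f_stat > f_crit
print(f"F = {f_stat:.2f} with ({df_num}, {df_den}) df, "
      f"reject equality: {reject_equal_variances}")
```

Even a 2-to-1 ratio of sample variances falls well short of the critical value at these sample sizes.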

Example: hypothesis test of a standard deviation compared to a standard value

A machine is supposed to produce parts in the range of 0.500 inches plus or minus 0.006 inches. Based on this, your statistician computes that the absolute worst standard deviation tolerable is 0.002 inches. In looking over your capability charts you find that the best machine in the shop has a standard deviation of 0.0022, based on a sample of 25 units. In discussing the situation with the statistician and management, it is agreed that the machine will be used if a one-sided 95% confidence interval on sigma includes 0.002.

The correct statistic for comparing a sample standard deviation with a standard value is the chi-square statistic. For our data we have s=0.0022, n=25, and σ0=0.002. The χ² statistic has n-1 = 24 degrees of freedom. Thus, χ² = (n − 1)s² / σ0² = 24 × (0.0022)² / (0.002)² = 29.04.

Table 7 gives, in the 0.95 column (since we are constructing a one-sided confidence interval) and the df = 24 row, the critical value χ² = 36.42. Since our computed value of χ² is less than 36.42, we use the machine. The reader should recognize that all of these exercises involved a number of assumptions, for example, that we "know" that the best machine has a standard deviation of 0.0022. In reality, this knowledge must be confirmed by a stable control chart.
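The chi-square computation can be verified as follows. This is a sketch only: the function name is made up, and the critical value is taken from the table.

```python
def chi_square_statistic(s, sigma_0, n):
    """Chi-square statistic (n - 1) * s^2 / sigma_0^2 with n - 1 df."""
    return (n - 1) * s ** 2 / sigma_0 ** 2

# Values from the example.
chi_sq = chi_square_statistic(s=0.0022, sigma_0=0.002, n=25)
chi_sq_crit = 36.42  # 0.95 column, 24 df, from the table

use_machine = chi_sq < chi_sq_crit
print(f"chi-square = {chi_sq:.2f}, use machine: {use_machine}")
```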

Resampling (Bootstrapping)

A number of criticisms have been raised regarding the methods used for estimation and hypothesis testing:

• They are not intuitive.

• They are based on strong assumptions (e.g., normality) that are often not met in practice.

• They are difficult to learn and to apply.

• They are error-prone.

In recent years a new method of performing these analyses has been developed. It is known as resampling or bootstrapping. The new methods are conceptually quite simple: using the data from a sample, calculate the statistic of interest repeatedly and examine the distribution of the statistic. For example, say you obtained a sample of n=25 measurements from a lot and you wished to determine a confidence interval on the statistic Cpk. Using resampling, you would tell the computer to select a sample of n=25 from the sample results, compute Cpk, and repeat the process many times, say 10,000 times. You would then determine whatever percentage point value you wished by simply looking at the results. The samples would be taken "with replacement," i.e., a particular value from the original sample might appear several times (or not at all) in a resample.
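The Cpk procedure just described might be sketched as below. Everything specific here is an assumption made for illustration: the 25 measurements are simulated, the specification limits of 0.494 and 0.506 are hypothetical, and the percentile method shown is only one simple way to read limits off the resampled results.

```python
import math
import random

def cpk(data, lsl, usl):
    """Cpk = min(USL - mean, mean - LSL) / (3 * sample standard deviation)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / (n - 1)
    if var == 0:
        return math.inf  # degenerate resample with all values identical
    return min(usl - mean, mean - lsl) / (3 * math.sqrt(var))

def bootstrap_cpk(data, lsl, usl, n_resamples=10_000, seed=0):
    """Return the sorted Cpk values of resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(data)
    results = [cpk(rng.choices(data, k=n), lsl, usl)
               for _ in range(n_resamples)]
    results.sort()
    return results

def percentile(sorted_values, p):
    """Simple percentile lookup in an already sorted list (0 <= p <= 1)."""
    return sorted_values[int(p * (len(sorted_values) - 1))]

# Hypothetical sample of n = 25 measurements and hypothetical spec limits.
data_rng = random.Random(42)
sample = [data_rng.gauss(0.500, 0.002) for _ in range(25)]
lsl, usl = 0.494, 0.506

results = bootstrap_cpk(sample, lsl, usl)
lower = percentile(results, 0.025)
upper = percentile(results, 0.975)
print(f"Cpk estimate: {cpk(sample, lsl, usl):.3f}")
print(f"95% percentile interval: {lower:.3f} to {upper:.3f}")
```

The call to `rng.choices` is what makes the sampling "with replacement": each resample has 25 values, but any original measurement may appear several times or not at all.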

Resampling has many advantages, especially in the era of easily available, low-cost computer power. Spreadsheets can be programmed to resample and calculate the statistics of interest. Compared with traditional statistical methods, resampling is easier for most people to understand. It works without strong assumptions, and it is simple. Resampling doesn’t impose as much baggage between the engineering problem and the statistical result as conventional methods. It can also be used for more advanced problems, such as modeling, design of experiments, etc.

For a discussion of the theory behind resampling, see Efron (1982). For a presentation of numerous examples using a resampling computer program see Simon (1992).

 

 



Copyright © 1995-2008 Quality America Inc. All Rights Reserved