Applying Experimental Design
by C.J. Keller
Let's use
an example we're all familiar with: a fluid being pumped through
a pipe. We're interested in the effect of the system on two responses:
- Pumping pressure.
- Pumping energy.
As factors for
a designed experiment we'll select:
- A hardware option, the pipe size.
- An operating option, the fluid flow rate.
- A design option, an additive that is added to the base fluid.
We also suspect
that the ambient temperature may have an effect on the responses. We don't
control temperature in our environment, but we can measure it. If we were
able to control temperature during the experiment, we could assign Temperature
as a Subsidiary (Noise, Outer) factor. Because we can't control the
temperature, we will record it as a Casual Factor.

The parameters
we want to define are: three main design factors (pipe size, fluid
flow rate and additive), a casual factor (temperature),
two responses (pumping pressure and energy) and a Run number
as a comment column. We could also add other information columns to record
operator notes, time, etc.
We suspect that
the responses will be influenced by more than linear factor effects, so
we want more than two levels for each factor. The factor-levels are the
values of each factor to be used in the runs. The design will consist
of using combinations of a few levels for each factor, where each combination
will be for a run which will produce a response (in this case two responses).
If we have only two levels for a factor and we plot a response vs. the
two levels, only a straight line can be drawn between them to estimate
the effect of other non-tested levels. If there are at least three levels,
a curve may be fitted to the three points.
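
The two-level versus three-level argument above can be sketched numerically. This is a minimal illustration, assuming NumPy is available; the response values are invented (generated from 1 + x**2) and are not data from the experiment.

```python
import numpy as np

# Hypothetical responses at three flow-rate levels (illustrative values only,
# generated from 1 + x**2; they are not measurements from the article).
flow = np.array([1.0, 3.0, 5.0])
response = np.array([2.0, 10.0, 26.0])

# Two levels (the end points) can only support a straight line.
line = np.polyfit(flow[[0, 2]], response[[0, 2]], deg=1)

# Three levels support a quadratic, which can capture curvature.
quad = np.polyfit(flow, response, deg=2)

mid_line = np.polyval(line, 3.0)  # straight-line estimate at the middle level
mid_quad = np.polyval(quad, 3.0)  # quadratic estimate at the middle level
print(mid_line, mid_quad)
```

With only the two end points, the straight line misses the curvature at the middle level; the three-point quadratic reproduces it.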
We also suspect
that these factors may interact with one another. If an interaction is
present then the response depends not only on adding together the effect
from each factor, but also depends on an added effect from the product
of two or more factors or on the product of a factor with itself.
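
One common way to write such a model is the second-order polynomial below. The symbols are assumptions introduced here for illustration, not notation from the program: y is a response, the x_i are the factors, the beta terms are coefficients to be estimated, and epsilon is random error.

```latex
y = \beta_0 + \sum_{i} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \sum_{i} \beta_{ii} x_i^2 + \varepsilon
```

The first sum adds the effect of each factor, the second holds the products of two different factors, and the third holds the product of each factor with itself.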
There are a number
of other factors which may influence the results. We can readily identify
some of them: pipe length, pipe interior surface roughness, etc. There
may be, and probably are, others which we do not suspect as having an
influence. To the extent we can, we will fix or make constant the recognized
non-experiment factors. The others we hope will have a random effect,
which we will try to enhance by randomizing the order in which the runs
are executed.
We'll consider
three pipe diameters: 0.5, 0.75 and 1.00; three flow rates: 1, 3 and 5;
and three additive levels: 25, 50 and 75. The choice of at least three
levels allows nonlinear effects to be analyzed. Using the same number
of levels for all the design factors allows more choices of experimental
designs. Using 3 or 5 levels for each factor makes special response surface
designs available. The program itself can handle any mix of factor-levels.
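
As a rough sketch of what these levels imply, the snippet below enumerates every combination of one level per factor. The dictionary keys are names chosen here for illustration; the source does not state units for the levels.

```python
from itertools import product

# Factor levels as stated in the text; units are not given in the source.
levels = {
    "pipe_diameter": [0.5, 0.75, 1.00],
    "flow_rate": [1, 3, 5],
    "additive": [25, 50, 75],
}

# Every combination of one level per factor is one candidate run.
full_factorial = [dict(zip(levels, combo)) for combo in product(*levels.values())]
print(len(full_factorial))  # 3 * 3 * 3 = 27
```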
The casual factor,
Temperature, will take on whatever value occurs during each experimental
run.
The information
for each parameter is defined using a dialog initiated from the parameter
information summary screen.


Once the parameters
have been defined using the parameter information dialog, the requirements
may be defined by specifying an Interaction set. The requirements specify
the minimum estimating capability of the proposed design. Interactions
not listed may not be capable of being estimated or may be confounded
with other factors or interactions.

If a limited
set of requirements is selected, restricted to two-factor quadratic
interactions, then a 27-run Complete Factorial design (such as a Taguchi L27)
would be available. If a still more modest set of requirements, a Response
Surface set, is chosen, a 20-run Central Composite (CC20) design is available.
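
To make the 20-run count concrete, here is a minimal sketch of one way a three-factor central composite design can reach 20 runs. It assumes a face-centered layout (axial points at the low and high levels) and six center points; the source does not say how the program's CC20 is composed, so both choices are assumptions.

```python
from itertools import product

def face_centered_ccd(k, n_center):
    """Coded-unit central composite design with axial points on the cube faces."""
    factorial = [list(p) for p in product([-1, 1], repeat=k)]  # 2**k corner runs
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0] * k
            point[i] = sign
            axial.append(point)  # 2*k axial runs in total
    center = [[0] * k for _ in range(n_center)]
    return factorial + axial + center

# Map coded levels (-1, 0, +1) onto the factor levels given in the text.
actual = [
    {-1: 0.5, 0: 0.75, 1: 1.00},  # pipe diameter
    {-1: 1, 0: 3, 1: 5},          # flow rate
    {-1: 25, 0: 50, 1: 75},       # additive
]
coded = face_centered_ccd(k=3, n_center=6)  # six center points is an assumption
design = [[actual[j][x] for j, x in enumerate(run)] for run in coded]
print(len(design))  # 8 corner + 6 axial + 6 center = 20
```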

A default setting under Options
in the title bar has selected Experimental Units as our preferred display
units. If the User prefers to see designs in a standardized format, as
is commonly shown in classical references, or in the Taguchi notation,
those options are also available.

We could, if necessary,
trim some of the center point runs from the 20-run design by selecting
Action + Change Design Array. Both trimming and augmenting of a design
can be done by the program so that the User can tailor the number of runs
and the performance of the design to his requirements.
So far, we've
defined our parameters, selected a set of requirements by specifying an
interaction set and have chosen a design array.
We can now randomize
the run order, produce a Final Design Array for use during the experiment
and build a Base Analysis array. The final design array takes the selected
and possibly modified design array, combines it with a subsidiary (noise,
outer) design if present, provides repeated runs or replicates as selected
by options and randomizes the run order. The final design array is then
available for printing, copying to the clipboard, exporting to another
application and transfer to the Base Analysis array.
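
The randomization step can be pictured with a short sketch. The function name and the three-row design are hypothetical and stand in for whatever the program does internally; the seed is there only so the order can be reproduced.

```python
import random

def randomize_run_order(design, seed=None):
    """Return (run_number, settings) pairs with the design rows in random order."""
    rng = random.Random(seed)  # a seed makes the order reproducible
    shuffled = list(design)    # copy so the original array is untouched
    rng.shuffle(shuffled)
    return list(enumerate(shuffled, start=1))

# Hypothetical three-run design: (pipe diameter, flow rate, additive).
design = [(0.5, 1, 25), (0.75, 3, 50), (1.00, 5, 75)]
final = randomize_run_order(design, seed=1)
for run_number, settings in final:
    print(run_number, settings)
```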
The Base Analysis
array is where we merge the design with the experimental data. When we
get the experimental data, we can enter into the Base Analysis array the
Temperature, Pressure and Energy data either by pasting it in or by keyboard
entry in the window.

Because we constructed
this design using a response surface interaction requirement, we can
analyze our data with a response surface analysis as well as with plots
and regressions.
An excellent
starting plot is a response vs. the run order. We'll show Pressure
vs. Run order.

The plot (for
this limited data) doesn't show any obvious order dependency, but
run 17 may be an outlier or may signal a shift in the fluid flow model.
In any event, it should be further evaluated.
A Half-Normal
plot of the effects for Energy shows that the ambient Temperature does
indeed have a strong effect on Energy consumption, stronger than Flow.
Does this indicate that Flow is not important? No, it does not. The half-normal
plot only indicates the change in effect over the range of the data. A
thought experiment reducing the flow to zero confirms that the flow is
important to the process.
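
For readers unfamiliar with the half-normal plot, the sketch below shows one common way to compute its points: the absolute effects are sorted and paired with half-normal quantiles. The effect values are invented for illustration, and the plotting-position formula is a common textbook choice rather than necessarily the one the program uses.

```python
from statistics import NormalDist

def half_normal_points(effects):
    """Pair sorted |effect| values with half-normal quantiles."""
    ranked = sorted(effects.items(), key=lambda kv: abs(kv[1]))
    m = len(ranked)
    points = []
    for i, (name, value) in enumerate(ranked, start=1):
        p = 0.5 + 0.5 * (i - 0.5) / m  # plotting position for rank i of m
        points.append((name, abs(value), NormalDist().inv_cdf(p)))
    return points

# Hypothetical effect estimates, for illustration only.
effects = {"Diameter": -1.2, "Flow": 0.8, "Additive": 0.3, "Temperature": 2.5}
points = half_normal_points(effects)
for name, size, quantile in points:
    print(f"{name:12s} |effect|={size:4.1f}  quantile={quantile:5.2f}")
```

Effects that fall well off the line through the small effects are the ones that stand out over the range of the data.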

Next we'll
do a regression analysis using Energy as the response and selecting that
the fitted Response be calculated. All the parameters are significant
at a probability level of 0.75, except for the quadratic term of Flow.
The following image is a partial view of the ANOVA/Regression report window.

The residual
error, the difference between the data response and the fitted response,
may be calculated by generating a Working Analysis Array from the Base
array and then selecting Action + Calculated Response. In that dialog,
a calculated response may be generated from one or more of the data columns.
If this had been a design suitable for data grouping, several Taguchi ratios
could also be generated.
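
The relationship between the data response, the fitted response and the residual can be sketched with an ordinary least-squares fit. This assumes NumPy; the single coded factor and the response values are made up for illustration and are not the article's Energy data.

```python
import numpy as np

# Made-up data: one coded factor with replicated runs (illustration only).
x = np.array([-1.0, 0.0, 1.0, -1.0, 0.0, 1.0])
y = np.array([0.1, 2.0, 6.1, -0.1, 2.0, 5.9])

# Model matrix for y = b0 + b1*x + b2*x**2.
X = np.column_stack([np.ones_like(x), x, x**2])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares coefficients
fitted = X @ coef                             # fitted response
residual = y - fitted                         # data response minus fitted response
print(coef)
print(residual)
```

Plotting `residual` against run order is then the same kind of check as the earlier plot of the raw response against run order.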
A plot of the
Energy Residual vs. Run order shows that Run 17, which had previously
been identified as a potential outlier, is fitted by the polynomial model
about as well as the other points and that further consideration as an
outlier is probably not warranted. There is justification for examining
the effect of that data run on the model.

Up to this point
we have not considered any interactions between the ambient Temperature
and the other factors. By selecting Action + Populate Interaction Array
+ Response Surface, a new interaction list may be generated. It may be
tested for feasibility by selecting Action + Audit, which confirms that
the data set can support the increased requirements. (All data sets are
automatically checked for adequacy to estimate the requirements list before
any numerical operation is attempted.)
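
The source does not describe how the audit works internally. One plausible way to picture such a check is a rank test on the model matrix: if the columns for the requested terms are not linearly independent on the available runs, the terms cannot all be estimated. The sketch below assumes NumPy and uses hypothetical function names.

```python
import numpy as np
from itertools import combinations, product

def quadratic_model_matrix(design):
    """Columns: intercept, linear terms, two-factor products, and squares."""
    design = np.asarray(design, dtype=float)
    k = design.shape[1]
    cols = [np.ones(len(design))]
    cols += [design[:, i] for i in range(k)]
    cols += [design[:, i] * design[:, j] for i, j in combinations(range(k), 2)]
    cols += [design[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def audit(design):
    """True if every term of the full quadratic model can be estimated."""
    X = quadratic_model_matrix(design)
    return bool(np.linalg.matrix_rank(X) == X.shape[1])

two_level = list(product([-1, 1], repeat=3))       # 8 runs, two levels per factor
three_level = list(product([-1, 0, 1], repeat=3))  # 27 runs, three levels per factor
print(audit(two_level), audit(three_level))
```

The two-level design fails because each squared column is identical to the intercept column; the three-level design passes.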
A new regression
may now be generated based on the expanded interaction list. The expanded
model fits the data nicely, but there are some parameters whose levels
of significance are much less than our previous criterion of 0.75. In
particular, the significance of Temperature has been reduced, apparently
as a result of including interactions between Temperature and the other
factors. The presence of high-significance interactions that include a
low-significance linear factor may be an indication of an improper model.
That subject is discussed in another note.

We'll keep the
current model for the time being and select Action + Response Surface
Analysis. The resulting screen shows that there is no Energy maximum or
minimum within the data range, although there is a saddle point very far
from the data.
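
A standard way to locate and classify the stationary point of a fitted second-order model is sketched below. It assumes the model is written as y = b0 + x.b + x.B.x, where B is symmetric, holds the squared-term coefficients on its diagonal and half of each two-factor coefficient off the diagonal, and is nonsingular. The coefficients are hypothetical, not the article's fitted values, and the program's own method is not described in the source.

```python
import numpy as np

def stationary_point(b, B):
    """Stationary point of y = b0 + x.b + x.B.x and its type (B nonsingular)."""
    b = np.asarray(b, dtype=float)
    B = np.asarray(B, dtype=float)
    x_s = -0.5 * np.linalg.solve(B, b)  # solves gradient b + 2*B*x = 0
    eig = np.linalg.eigvalsh(B)         # signs of eigenvalues give the shape
    if np.all(eig > 0):
        kind = "minimum"
    elif np.all(eig < 0):
        kind = "maximum"
    else:
        kind = "saddle"                 # mixed signs
    return x_s, kind

# Hypothetical coefficients in coded units, for illustration only.
x_s, kind = stationary_point(b=[1.0, -2.0], B=[[1.0, 0.0], [0.0, -1.0]])
print(x_s, kind)
```

Comparing `x_s` with the coded range of the design (for example, -1 to +1) shows whether the stationary point lies inside or far outside the data.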

Let's go
to a Contour plot to understand the data a bit more. We have 4 factors,
of which only two at a time may be used in a Contour representation of
a response; there are 6 such plots, of which only one has fitted values
within the data range. The plots themselves may be altered by the selection
of the value used for non-plot factors; we'll use median data values for
the non-plot factors.
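
The count of six contour plots is simply the number of ways to choose two factors out of four, which a few lines can confirm. The factor labels are taken from the text.

```python
from itertools import combinations

factors = ["Pipe diameter", "Flow rate", "Additive", "Temperature"]

# Each contour plot uses two factors as axes; the others are held at fixed values.
pairs = list(combinations(factors, 2))
for first, second in pairs:
    print(first, "vs.", second)
print(len(pairs))  # 6
```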

We'll look at
one more set of Interaction charts, of which we'll show one plot which
is representative for this data. It clearly shows that for the design
selection range for Additive and the casual range of Temperature, there
is less variation and a lower level of Energy for the largest pipe diameter.

In the real world
we would look at more charts, we'd eliminate non-significant factors and
interactions from our analysis model and most likely recognize that we
don't have enough data for a robust conclusion. We certainly do have enough
data to draw some conclusions.
One is that the
Pressure and Energy responses tend to indicate different selections from
among the factor levels for best operation. Another is that larger pipe
diameters tend to give less variation and lower Energy; low to moderate levels
of Additive require less Energy. Some anomalies in the data indicate that
we should repeat the experiment for the anomalous data points and replicate
a better selected design to get a better measure of variation as well
as to look to improving the model.