The analysis of variance is a procedure that tests whether differences exist between two or more population means. The **one-way analysis of variance** is the procedure to apply when the samples are independently drawn. The hypotheses are:

\(H_{0}\): \(\mu_{1} = \mu_{2} = \cdots = \mu_{k}\)

\(H_{1}\): at least two means differ

The statistic that measures the proximity of the sample means to each other is called the **between-treatments variation**; it is denoted **SST**, which stands for **sum of squares for treatments**.

\[SST = \sum_{j=1}^k n_{j}(\bar x_{j} - \bar{\bar x})^2\]

\[\bar{\bar x} =\frac{\sum_{j=1}^k \sum_{i=1}^{n_{j}} x_{ij}}{n}\]

\[n = n_{1} + n_{2} + \cdots + n_{k}\]

\[\bar x_{j} = \frac{\sum_{i=1}^{n_{j}}x_{ij}}{n_{j}}\]

We also need to know how much variation exists in the response variable within each sample (in the example below, the percentage of assets invested in stocks). This is measured by the **within-treatments variation**, which is denoted by **SSE** (**sum of squares for error**). The within-treatments variation provides a measure of the amount of variation in the response variable that *is not caused by the treatments*.

\[SSE = \sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar x_{j})^2\]

\[SSE = (n_{1}-1)s_{1}^2 + (n_{2}-1)s_{2}^2 + \cdots + (n_{k}-1)s_{k}^2\]
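The two expressions for SSE give the same number, which a small sketch can confirm. The data below are hypothetical values chosen only for illustration, and the function names are not from the source.

```python
from statistics import mean, variance

# Hypothetical samples for k = 3 treatments (illustrative values only).
samples = [
    [12.0, 15.0, 11.0, 14.0],
    [18.0, 17.0, 20.0],
    [10.0, 9.0, 13.0, 12.0, 11.0],
]

def sse_from_deviations(groups):
    """Sum of squared deviations of each observation from its own sample mean."""
    total = 0.0
    for g in groups:
        g_mean = mean(g)
        total += sum((x - g_mean) ** 2 for x in g)
    return total

def sse_from_variances(groups):
    """Same quantity as (n_j - 1) * s_j^2, where s_j^2 is the sample variance."""
    return sum((len(g) - 1) * variance(g) for g in groups)

sse_a = sse_from_deviations(samples)
sse_b = sse_from_variances(samples)
print(sse_a, sse_b)  # the two values agree up to floating-point rounding
```

The second form is convenient when only the sample sizes and sample variances are reported rather than the raw observations.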

The mean square for treatments is computed by dividing SST by the number of treatments minus 1.

\[MST = \frac{SST}{k-1}\]

The mean square for error is determined by dividing SSE by the total sample size (labeled n) minus the number of treatments.

\[MSE = \frac{SSE}{n-k}\]

Finally, the test statistic is defined as the ratio of the two mean squares.

\[F = \frac{MST}{MSE}\]

The test statistic is F-distributed with k − 1 and n − k degrees of freedom, provided that the response variable is normally distributed. We reject the null hypothesis only if

\[F > F_{\alpha, k-1, n-k}\]
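The steps from SST and SSE through the rejection rule can be sketched as follows. This is a minimal illustration on hypothetical data, not a substitute for a statistics package. The function names are assumptions, and the critical value is supplied by the caller (here taken from a standard F table) rather than computed.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the pieces of the one-way ANOVA table for a list of samples."""
    k = len(groups)                          # number of treatments
    n = sum(len(g) for g in groups)          # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [mean(g) for g in groups]
    # Between-treatments variation.
    sst = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-treatments variation.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {"SST": sst, "SSE": sse, "MST": mst, "MSE": mse,
            "F": mst / mse, "df": (k - 1, n - k)}

def reject_null(f_stat, f_critical):
    """Rejection rule: reject H0 only if F exceeds the critical value."""
    return f_stat > f_critical

# Hypothetical samples (illustrative values only).
samples = [
    [12.0, 15.0, 11.0, 14.0],
    [18.0, 17.0, 20.0],
    [10.0, 9.0, 13.0, 12.0, 11.0],
]

result = one_way_anova(samples)
F_CRITICAL = 4.26  # F_{0.05, 2, 9} read from a standard F table (assumed input)
decision = reject_null(result["F"], F_CRITICAL)
print(result["df"], round(result["F"], 2), decision)
```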

The **total variation** of all the data is denoted **SS(Total)**:

\[SS(Total) = SST + SSE = \sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar{\bar x})^2\]
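A brief sketch of why the total variation splits into exactly these two parts, using only the definitions of \(\bar x_{j}\) and \(\bar{\bar x}\) given earlier. Write each deviation from the grand mean as the sum of a within-sample and a between-sample deviation:

\[x_{ij} - \bar{\bar x} = (x_{ij} - \bar x_{j}) + (\bar x_{j} - \bar{\bar x})\]

Squaring and summing over all observations gives

\[\sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar{\bar x})^2 = \sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar x_{j})^2 + \sum_{j=1}^k n_{j}(\bar x_{j} - \bar{\bar x})^2 + 2\sum_{j=1}^k (\bar x_{j} - \bar{\bar x})\sum_{i=1}^{n_{j}}(x_{ij} - \bar x_{j})\]

The last term vanishes because the deviations from a sample mean sum to zero within each sample:

\[\sum_{i=1}^{n_{j}}(x_{ij} - \bar x_{j}) = 0\]

The first two terms that remain are SSE and SST, respectively.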

ANOVA Table for the One-Way Analysis of Variance:

SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC |
---|---|---|---|---|
Treatments | k − 1 | SST | MST = SST / (k − 1) | F = MST / MSE |
Error | n − k | SSE | MSE = SSE / (n − k) | |
Total | n − 1 | SS(Total) | | |

Example: A financial analyst randomly sampled 366 American households and asked each to report the age category of the head of the household and the proportion of its financial assets that are invested in the stock market. The age categories are Young (less than 35), Early middle age (35 to 49), Late middle age (50 to 65), and Senior (older than 65). The analyst was particularly interested in determining whether the ownership of stocks varied by age.

SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Treatments | 3 | 3741.4 | 1247.12 | 2.79 | 0.0405 |
Error | 362 | 161871.0 | 447.16 | | |
Total | 365 | 165612.4 | | | |

Interpret: The value of the test statistic is F = 2.79, and its p-value is .0405. At the 5% significance level, there is evidence to infer that the percentage of total assets invested in stocks differs in at least two of the age categories.
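The mean squares and F-statistic in the table can be recomputed from the reported sums of squares, with k = 4 age categories and n = 366 households. The helper name below is an assumption for illustration. Because the inputs are the rounded values printed in the table, the results agree with the table only up to rounding.

```python
def anova_from_sums_of_squares(sst, sse, k, n):
    """Mean squares and F-statistic for a one-way ANOVA table."""
    mst = sst / (k - 1)   # mean square for treatments
    mse = sse / (n - k)   # mean square for error
    return mst, mse, mst / mse

# Sums of squares as reported in the example table.
mst, mse, f_stat = anova_from_sums_of_squares(sst=3741.4, sse=161871.0, k=4, n=366)
print(round(mst, 2), round(mse, 2), round(f_stat, 2))
```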

There are two reasons why we don't use multiple t-tests instead of one F-test. First, we would have to perform many more calculations. Second, and more important, conducting multiple tests increases the probability of making Type I errors.
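A rough sense of the second reason can be had from a short calculation. The sketch below assumes, purely for simplicity, that the pairwise tests are independent. Pairwise t-tests on shared samples are not independent, so the figure is an approximation rather than an exact rate.

```python
from math import comb

def approx_error_rate(k, alpha):
    """Approximate chance of at least one Type I error across all pairwise tests.

    Assumes independent tests, which is a simplification.
    """
    num_tests = comb(k, 2)                 # number of pairs among k treatments
    return 1 - (1 - alpha) ** num_tests

# Four treatments, as in the age-category example, each test run at alpha = 0.05.
rate = approx_error_rate(k=4, alpha=0.05)
print(comb(4, 2), round(rate, 4))
```

With four treatments there are six pairwise tests, and under this assumption the chance of at least one Type I error is far above the 5% used for each individual test.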

If we want to determine whether \(\mu_{1}\) is greater than \(\mu_{2}\) (or vice versa), we cannot use the analysis of variance because this technique allows us to test for a difference only. Thus, if we want to test to determine whether one population mean exceeds the other, we must use the t-test of \(\mu_{1} - \mu_{2}\) (with \(\sigma_{1}^2=\sigma_{2}^2\)). Moreover, the analysis of variance requires that the population variances are equal. If they are not, we must use the unequal variances test statistic.

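Bonferroni adjustment:

\[\alpha = \frac{\alpha_{E}}{C}\]

\(\alpha_{E}\), the true probability of making at least one Type I error, is called the **experimentwise Type I error rate**. \(C\) is the number of pairwise comparisons (written \(C\) here to avoid confusion with the total sample size \(n\)); when all pairs of \(k\) treatments are compared, \(C = k(k-1)/2\). Each individual comparison is then conducted at significance level \(\alpha\).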
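As a small worked sketch of the adjustment: the function and variable names are illustrative assumptions, and the number of comparisons is taken to be all pairs among the treatments.

```python
from math import comb

def bonferroni_alpha(alpha_experimentwise, num_comparisons):
    """Significance level to use for each individual pairwise comparison."""
    return alpha_experimentwise / num_comparisons

k = 4                              # treatments, e.g. the four age categories
num_comparisons = comb(k, 2)       # all pairwise comparisons: k(k - 1) / 2
alpha_per_test = bonferroni_alpha(0.05, num_comparisons)
print(num_comparisons, round(alpha_per_test, 4))
```

Each of the six comparisons would then be judged against the smaller per-test level rather than 0.05.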

A **single-factor** analysis of variance addresses the problem of comparing two or more populations defined on the basis of only one factor. A **multifactor** experiment is one in which two or more factors define the treatments.

The example in `14.1` is a single-factor design because we had one factor: age of the head of the household. Suppose that we can also look at the gender of the household head in another study. We would then develop a two-factor analysis of variance in which the first factor, age, has four levels, and the second factor, gender, has two levels.

When the problem objective is to compare more than two populations, the experimental design that is the counterpart of the matched pairs experiment is called the **randomized block design**. The term *block* refers to a matched group of observations from each population. The randomized block experiment is also called the **two-way analysis of variance**.

We can determine whether sleeping pills are effective by giving three brands of pills to the same group of people to measure the effects. Such experiments are called **repeated measures** designs.

The data are analyzed in the same way for both designs.

If our analysis includes all possible levels of a factor, the technique is called a **fixed effects analysis of variance**. If the levels included in the study represent a random sample of all the levels that exist, the technique is called a **random-effects analysis of variance**.

The purpose of designing a randomized block experiment is to reduce the within-treatments variation to more easily detect differences between the treatment means. In the one-way analysis of variance, we partitioned the total variation into the between-treatments and the within-treatments variation; that is,

\[SS(Total) = SST + SSE\]

In the randomized block design of the analysis of variance, we partition the total variation into three sources of variation:

\[SS(Total) = SST + SSB + SSE\]

where SSB, the **sum of squares for blocks**, measures the variation between the blocks.

Block | Treatment 1 | Treatment 2 | … | Treatment k | Block Mean |
---|---|---|---|---|---|
1 | \(x_{11}\) | \(x_{12}\) | … | \(x_{1k}\) | \(\bar x[B]_{1}\) |
2 | \(x_{21}\) | \(x_{22}\) | … | \(x_{2k}\) | \(\bar x[B]_{2}\) |
\(\vdots\) | \(\vdots\) | \(\vdots\) | | \(\vdots\) | \(\vdots\) |
b | \(x_{b1}\) | \(x_{b2}\) | … | \(x_{bk}\) | \(\bar x[B]_{b}\) |
Treatment Mean | \(\bar x[T]_{1}\) | \(\bar x[T]_{2}\) | … | \(\bar x[T]_{k}\) | |

Sums of Squares in the Randomized Block Experiment:

\[SS(Total) = \sum_{j=1}^k \sum_{i=1}^b (x_{ij} - \bar{\bar x})^2\]

\[SST = \sum_{j=1}^k b(\bar x[T]_{j} - \bar{\bar x})^2\]

\[SSB = \sum_{i=1}^b k(\bar x[B]_{i} - \bar{\bar x})^2\]

\[SSE = \sum_{j=1}^k \sum_{i=1}^b (x_{ij} - \bar x[T]_{j} - \bar x[B]_{i} + \bar{\bar x})^2\]

Mean Squares for the Randomized Block Experiment:

\[MST = \frac{SST}{k-1}\]

\[MSB = \frac{SSB}{b-1}\]

\[MSE = \frac{SSE}{n-k-b+1}\]

Test Statistic for the Randomized Block Experiment:

\[F = \frac{MST}{MSE}\]

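which is F-distributed with \(\nu_{1} = k - 1\) and \(\nu_{2} = n - k - b + 1\) degrees of freedom. Because each of the b blocks contains one observation per treatment, \(n = bk\), so \(\nu_{2} = (k-1)(b-1)\).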
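The randomized block computations can be sketched on a small data set. The numbers below are hypothetical, the function name is an assumption, and rows are taken to be blocks with columns as treatments, matching the layout of the data table above.

```python
def randomized_block_anova(data):
    """Sums of squares and F-statistics; data[i][j] is block i, treatment j."""
    b = len(data)          # number of blocks
    k = len(data[0])       # number of treatments
    n = b * k              # one observation per block-treatment cell
    grand = sum(sum(row) for row in data) / n
    treatment_means = [sum(row[j] for row in data) / b for j in range(k)]
    block_means = [sum(row) / k for row in data]

    sst = b * sum((m - grand) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sse = sum(
        (data[i][j] - treatment_means[j] - block_means[i] + grand) ** 2
        for i in range(b) for j in range(k)
    )
    ss_total = sum((x - grand) ** 2 for row in data for x in row)

    mst = sst / (k - 1)
    msb = ssb / (b - 1)
    mse = sse / (n - k - b + 1)
    return {"SST": sst, "SSB": ssb, "SSE": sse, "SS(Total)": ss_total,
            "F_treatments": mst / mse, "F_blocks": msb / mse,
            "df_error": n - k - b + 1}

# Hypothetical data: 4 blocks (rows) by 3 treatments (columns).
data = [
    [10.0, 12.0, 14.0],
    [8.0, 11.0, 11.0],
    [13.0, 14.0, 18.0],
    [9.0, 11.0, 13.0],
]

table = randomized_block_anova(data)
print(table)
```

In this sketch the three sums of squares add up to SS(Total), which is the partition the block design relies on.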

ANOVA Table for the Randomized Block Analysis of Variance:

SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC |
---|---|---|---|---|
Treatments | k − 1 | SST | MST = SST / (k − 1) | F = MST / MSE |
Blocks | b − 1 | SSB | MSB = SSB / (b − 1) | F = MSB / MSE |
Error | n − k − b + 1 | SSE | MSE = SSE / (n − k − b + 1) | |
Total | n − 1 | SS(Total) | | |

Example: A company selected 25 groups of four men, each of whom had cholesterol levels in excess of 280. In each group, the men were matched according to age and weight. Four drugs were administered over a 2-month period, and the reduction in cholesterol was recorded. Do these results allow the company to conclude that differences exist between the four drugs?

SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Drug | 3 | 196.0 | 65.3 | 4.12 | 0.009 |
Group | 24 | 3848.7 | 160.4 | 10.11 | 0.000 |
Error | 72 | 1142.6 | 15.9 | | |
Total | 99 | 5187.2 | | | |

Interpret: The test statistic for the drugs is F = 4.12 with a p-value of .009. We conclude that there is sufficient evidence to infer that at least two of the drugs differ.
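The degrees of freedom and F-statistics in this table can be checked from the reported sums of squares, with k = 4 drugs and b = 25 groups. The variable names are illustrative, and the inputs are the rounded values printed in the table.

```python
k, b = 4, 25               # treatments (drugs) and blocks (matched groups)
n = k * b                  # 100 men in total
sst, ssb, sse = 196.0, 3848.7, 1142.6   # sums of squares from the table

df_error = n - k - b + 1   # error degrees of freedom
mse = sse / df_error
f_drug = (sst / (k - 1)) / mse
f_group = (ssb / (b - 1)) / mse
print(df_error, round(f_drug, 2), round(f_group, 2))
```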

The general term for an experiment that features two factors is **factorial experiment**. In factorial experiments, we can examine the effect on the response variable of two or more factors. We will present the technique for fixed effects only. That means we will address problems where all the levels of the factors are included in the experiment.

Example: As part of a study on job tenure, a survey was conducted in which Americans aged between 37 and 45 were asked how many jobs they have held in their lifetimes. Also recorded were gender and educational attainment. The educational categories are E1, E2, E3, and E4. Can we infer that differences exist between genders and educational levels? We begin by treating each of the eight combinations of gender and educational level as a separate treatment.

\(H_{0}\): \(\mu_{1} = \mu_{2} = \mu_{3} = \mu_{4} = \mu_{5} = \mu_{6} = \mu_{7} = \mu_{8}\)

\(H_{1}\): At least two means differ

Summary:

Groups | Count | Sum | Average | Variance |
---|---|---|---|---|
Male E1 | 10 | 126 | 12.60 | 8.27 |
Male E2 | 10 | 110 | 11.00 | 8.67 |
Male E3 | 10 | 106 | 10.60 | 11.60 |
Male E4 | 10 | 90 | 9.00 | 5.33 |
Female E1 | 10 | 115 | 11.50 | 8.28 |
Female E2 | 10 | 112 | 11.20 | 9.73 |
Female E3 | 10 | 94 | 9.40 | 16.49 |
Female E4 | 10 | 81 | 8.10 | 12.32 |

One-way ANOVA:

SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Between Groups | 7 | 153.35 | 21.91 | 2.17 | 0.0467 |
Within Groups | 72 | 726.20 | 10.09 | | |
Total | 79 | 879.55 | | | |

Interpret: The value of the test statistic is F = 2.17 with a p-value of .0467. We conclude that there are differences in the number of jobs between the eight treatments.
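The one-way table can be rebuilt from the summary statistics alone, using the sample-variance form of SSE. The sketch below takes the counts, averages, and variances from the summary table. The variable names are illustrative, and small discrepancies from the printed table reflect rounding of the variances.

```python
# (group, count, average, variance) as listed in the summary table.
summary = [
    ("Male E1", 10, 12.60, 8.27),
    ("Male E2", 10, 11.00, 8.67),
    ("Male E3", 10, 10.60, 11.60),
    ("Male E4", 10, 9.00, 5.33),
    ("Female E1", 10, 11.50, 8.28),
    ("Female E2", 10, 11.20, 9.73),
    ("Female E3", 10, 9.40, 16.49),
    ("Female E4", 10, 8.10, 12.32),
]

k = len(summary)
n = sum(count for _, count, _, _ in summary)
grand_mean = sum(count * avg for _, count, avg, _ in summary) / n

# Between-treatments variation from the group averages.
sst = sum(count * (avg - grand_mean) ** 2 for _, count, avg, _ in summary)
# Within-treatments variation from the sample variances: (n_j - 1) * s_j^2.
sse = sum((count - 1) * var for _, count, _, var in summary)

f_stat = (sst / (k - 1)) / (sse / (n - k))
print(round(sst, 2), round(sse, 2), round(f_stat, 2))
```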

This statistical result raises more questions: can we conclude that the differences in the mean number of jobs are caused by differences between males and females? Or are they caused by differences between educational levels? Or, perhaps, are there combinations, called interactions, of gender and education that result in especially high or low numbers?

A **complete factorial experiment** is an experiment in which the data for all possible combinations of the levels of the factors are gathered. That means that in the above example we measured the number of jobs for all eight combinations. This experiment is called a complete 2 × 4 factorial experiment. In general, we will refer to one of the factors as factor A (arbitrarily chosen). The number of levels of this factor will be denoted by a. The other factor is called factor B, and its number of levels is denoted by b. Each observation for a given combination is called a replicate, and the number of replicates per combination is denoted by r. We address only problems in which the number of replicates is the same for each treatment. Such a design is called **balanced**.
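The layout of the 2 × 4 example can be enumerated directly. In the sketch below the level labels and the replicate count of 10 come from the summary table, while the variable names and the `is_balanced` helper are illustrative assumptions.

```python
from itertools import product

factor_a = ["Male", "Female"]          # factor A: a = 2 levels
factor_b = ["E1", "E2", "E3", "E4"]    # factor B: b = 4 levels
r = 10                                 # replicates per treatment (summary table counts)

# A complete factorial experiment covers every combination of levels.
treatments = list(product(factor_a, factor_b))
total_observations = len(treatments) * r

def is_balanced(counts):
    """A design is balanced when every treatment has the same number of replicates."""
    return len(set(counts.values())) == 1

counts = {treatment: r for treatment in treatments}
print(len(treatments), total_observations, is_balanced(counts))
```

The total of a × b × r observations matches the 80 observations implied by the 79 total degrees of freedom in the one-way table.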