Statistics for Management and Economics: Study Notes 4
14. Analysis of Variance
14.1 One-way Analysis of Variance
The analysis of variance is a procedure that tests to determine whether differences exist between two or more population means. One-way analysis of variance is the procedure to apply when the samples are drawn independently.
\(H_{0}\): \(\mu_{1} = \mu_{2} = \cdots = \mu_{k}\)
\(H_{1}\): at least two means differ
The statistic that measures the proximity of the sample means to each other is called the between-treatments variation; it is denoted SST, which stands for sum of squares for treatments.
\[SST = \sum_{j=1}^k n_{j}(\bar x_{j} - \bar{\bar x})^2\]
\[\bar{\bar x} =\frac{\sum_{j=1}^k \sum_{i=1}^{n_{j}} x_{ij}}{n}\]
\[n = n_{1} + n_{2} + \cdots + n_{k}\]
\[\bar x_{j} = \frac{\sum_{i=1}^{n_{j}}x_{ij}}{n_{j}}\]
We also need to measure how much variation exists within the samples. This is the within-treatments variation, denoted SSE (sum of squares for error). The within-treatments variation provides a measure of the amount of variation in the response variable that is not caused by the treatments.
\[SSE = \sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar x_{j})^2\]
\[SSE = (n_{1}-1)s_{1}^2 + (n_{2}-1)s_{2}^2 + \cdots + (n_{k}-1)s_{k}^2\]
The mean square for treatments is computed by dividing SST by the number of treatments minus 1.
\[MST = \frac{SST}{k-1}\]
The mean square for error is determined by dividing SSE by the total sample size (labeled n) minus the number of treatments.
\[MSE = \frac{SSE}{n-k}\]
Finally, the test statistic is defined as the ratio of the two mean squares.
\[F = \frac{MST}{MSE}\]
The test statistic is F-distributed with k − 1 and n − k degrees of freedom, provided that the response variable is normally distributed and the population variances are equal. We reject the null hypothesis only if
\[F > F_{\alpha, k-1, n-k}\]
The total variation of all the data is denoted SS(Total):
\[SS(Total) = SST + SSE = \sum_{j=1}^k \sum_{i=1}^{n_{j}}(x_{ij} - \bar{\bar x})^2\]
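The calculations above map directly onto code. Below is a minimal sketch in Python (NumPy and SciPy are assumed to be available) that computes SST, SSE, the mean squares, and the F statistic for three small hypothetical samples, then cross-checks the result against `scipy.stats.f_oneway`.

```python
import numpy as np
from scipy import stats

# Hypothetical data: k = 3 treatments with independent samples.
samples = [np.array([20.0, 23.0, 25.0, 22.0]),
           np.array([28.0, 30.0, 27.0, 31.0]),
           np.array([22.0, 24.0, 26.0, 21.0])]

k = len(samples)                                   # number of treatments
n = sum(len(s) for s in samples)                   # total sample size
grand_mean = np.concatenate(samples).mean()        # grand mean (x-double-bar)

# Between-treatments and within-treatments variation
sst = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
sse = sum(((s - s.mean()) ** 2).sum() for s in samples)

mst = sst / (k - 1)                                # mean square for treatments
mse = sse / (n - k)                                # mean square for error
f_stat = mst / mse

p_value = stats.f.sf(f_stat, k - 1, n - k)         # P(F > f_stat)
f_crit = stats.f.ppf(0.95, k - 1, n - k)           # rejection point at alpha = .05

print(f_stat, p_value, f_crit)
print(stats.f_oneway(*samples))                    # should reproduce F and the p-value
```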
ANOVA Table for the One-Way Analysis of Variance:
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC |
---|---|---|---|---|
Treatments | k − 1 | SST | MST = SST/ (k − 1) | F = MST/MSE |
Error | n − k | SSE | MSE = SSE/ (n − k) | |
Total | n − 1 | SS(Total) | | |
Example: a financial analyst randomly sampled 366 American households and asked each to report the age category of the head of the household and the proportion of its financial assets that are invested in the stock market. The age categories are Young (less than 35), Early middle age (35 to 49), Late middle age (50 to 65), Senior (older than 65). The analyst was particularly interested in determining whether the ownership of stocks varied by age.
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Treatments | 3 | 3741.4 | 1247.12 | 2.79 | 0.0405 |
Error | 362 | 161871.0 | 447.16 | | |
Total | 365 | 165612.4 | | | |
Interpret: The value of the test statistic is F = 2.79, and its p-value is .0405, which means there is evidence to infer that the percentage of total assets invested in stocks differs in at least two of the age categories.
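As a hedged check, the reported p-value can be recomputed from the F statistic and its degrees of freedom (3 and 362); SciPy is assumed to be available.

```python
from scipy import stats

p_value = stats.f.sf(2.79, 3, 362)       # should be close to the reported .0405
f_crit = stats.f.ppf(0.95, 3, 362)       # F_{.05, 3, 362}, roughly 2.6
print(p_value, f_crit)                   # F = 2.79 exceeds the critical value
```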
14.1.1 Can We Use the t-Test of the Difference between Two Means Instead of the Analysis of Variance?
There are two reasons why we don’t use multiple t-tests instead of one F-test. First, we would have to perform many more calculations. Second, and more important, conducting multiple tests increases the probability of making Type I errors.
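A quick illustration of the second point: if each pairwise t-test were conducted at α = .05 and the tests were independent (an assumption made purely for illustration), the probability of at least one Type I error grows quickly with the number of comparisons.

```python
# Probability of at least one Type I error over C independent tests at alpha = .05
alpha = 0.05
for C in (1, 3, 6, 10):
    print(C, 1 - (1 - alpha) ** C)   # e.g. C = 6 gives about .26
```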
14.1.2 Can We Use the Analysis of Variance Instead of the t-Test of \(\mu_{1} − \mu_{2}\)?
If we want to determine whether \(\mu_{1}\) is greater than \(\mu_{2}\) (or vice versa), we cannot use the analysis of variance because this technique allows us to test for a difference only. Thus, if we want to test to determine whether one population mean exceeds the other, we must use the t-test of \(\mu_{1} − \mu_{2}\) (with \(\sigma_{1}^2=\sigma_{2}^2\)). Moreover, the analysis of variance requires that the population variances are equal. If they are not, we must use the unequal variances test statistic.
14.2 Multiple Comparisons
Bonferroni adjustment:
\[\alpha = \frac{\alpha_{E}}{n}\]
\(\alpha_{E}\), which denotes the true probability of making at least one Type I error, is called the experimentwise Type I error rate. Here n is the number of pairwise comparisons.
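For the age-category example in Section 14.1, k = 4 treatments give n = k(k − 1)/2 = 6 pairwise comparisons; the sketch below assumes a target experimentwise rate of α_E = .05.

```python
alpha_E = 0.05                      # assumed experimentwise Type I error rate
k = 4                               # number of treatments (age categories)
n_comparisons = k * (k - 1) // 2    # 6 pairwise comparisons
alpha = alpha_E / n_comparisons     # significance level for each comparison
print(n_comparisons, alpha)         # 6, about .0083
```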
14.3 Analysis of Variance Experimental Designs
14.3.1 Single-Factor and Multifactor Experimental Designs
A single-factor analysis of variance addresses the problem of comparing two or more populations defined on the basis of only one factor. A multifactor experiment is one in which two or more factors define the treatments.
The example in Section 14.1 is a single-factor design because the treatments were defined by one factor: the age of the head of the household. Suppose that in another study we also recorded the gender of the head of the household. We would then develop a two-factor analysis of variance in which the first factor, age, has four levels and the second factor, gender, has two levels.
14.3.2 Independent Samples and Blocks
When the problem objective is to compare more than two populations, the experimental design that is the counterpart of the matched pairs experiment is called the randomized block design. The term block refers to a matched group of observations from each population. The randomized block experiment is also called the two-way analysis of variance.
For example, we can determine whether sleeping pills are effective by giving three brands of pills to the same group of people and measuring the effects. Such experiments are called repeated measures designs.
The data are analyzed in the same way for both designs.
14.3.3 Fixed and Random Effects
If our analysis includes all possible levels of a factor, the technique is called a fixed effects analysis of variance. If the levels included in the study represent a random sample of all the levels that exist, the technique is called a random-effects analysis of variance.
14.4 Randomized Block (Two-Way) Analysis of Variance
The purpose of designing a randomized block experiment is to reduce the within-treatments variation to more easily detect differences between the treatment means. In the one-way analysis of variance, we partitioned the total variation into the between-treatments and the within-treatments variation; that is,
\[SS(Total) = SST + SSE\]
In the randomized block design of the analysis of variance, we partition the total variation into three sources of variation:
\[SS(Total) = SST + SSB + SSE\]
where SSB, the sum of squares for blocks, measures the variation between the blocks.
BLOCK | TREATMENT 1 | TREATMENT 2 | … | TREATMENT k | BLOCK MEAN |
---|---|---|---|---|---|
1 | \(x_{11}\) | \(x_{12}\) | … | \(x_{1k}\) | \(\bar x[B]_{1}\) |
2 | \(x_{21}\) | \(x_{22}\) | … | \(x_{2k}\) | \(\bar x[B]_{2}\) |
\(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |
b | \(x_{b1}\) | \(x_{b2}\) | … | \(x_{bk}\) | \(\bar x[B]_{b}\) |
Treatment Mean | \(\bar x[T]_{1}\) | \(\bar x[T]_{2}\) | … | \(\bar x[T]_{k}\) | |
Sums of Squares in the Randomized Block Experiment:
\[SS(Total) = \sum_{j=1}^k \sum_{i=1}^b (x_{ij} - \bar{\bar x})^2\] \[SST = \sum_{j=1}^k b(\bar x[T]_{j} - \bar{\bar x})^2\] \[SSB = \sum_{i=1}^b k(\bar x[B]_{i} - \bar{\bar x})^2\] \[SSE = \sum_{j=1}^k \sum_{i=1}^b (x_{ij} - \bar x[T]_{j} - \bar x[B]_{i} + \bar{\bar x})^2\]
Mean Squares for the Randomized Block Experiment:
\[MST = \frac{SST}{k-1}\] \[MSB = \frac{SSB}{b-1}\] \[MSE = \frac{SSE}{n-k-b+1}\]
Test Statistic for the Randomized Block Experiment
\[F = \frac{MST}{MSE}\]
which is F-distributed with \(\nu_{1} = k - 1\) and \(\nu_{2} = n - k - b + 1\) degrees of freedom.
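The randomized block computations can also be sketched directly. The example below uses a small hypothetical b × k matrix (rows are blocks, columns are treatments) and assumes NumPy and SciPy are available.

```python
import numpy as np
from scipy import stats

# Hypothetical data: b = 4 blocks (rows), k = 3 treatments (columns).
x = np.array([[10.0, 12.0, 13.0],
              [ 9.0, 11.0, 14.0],
              [11.0, 13.0, 12.0],
              [ 8.0, 10.0, 13.0]])

b, k = x.shape
n = b * k
grand_mean = x.mean()
treat_means = x.mean(axis=0)          # x-bar[T]_j
block_means = x.mean(axis=1)          # x-bar[B]_i

sst = b * ((treat_means - grand_mean) ** 2).sum()
ssb = k * ((block_means - grand_mean) ** 2).sum()
sse = ((x - treat_means[None, :] - block_means[:, None] + grand_mean) ** 2).sum()

mst = sst / (k - 1)
msb = ssb / (b - 1)
mse = sse / (n - k - b + 1)

print(mst / mse, stats.f.sf(mst / mse, k - 1, n - k - b + 1))   # treatments
print(msb / mse, stats.f.sf(msb / mse, b - 1, n - k - b + 1))   # blocks
```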
ANOVA Table for the Randomized Block Analysis of Variance
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC |
---|---|---|---|---|
Treatments | k − 1 | SST | MST = SST / (k − 1) | F = MST/MSE |
Blocks | b - 1 | SSB | MSB = SSB / (b - 1) | F = MSB/MSE |
Error | n − k - b + 1 | SSE | MSE = SSE / (n − k - b + 1) | |
Total | n − 1 | SS(Total) | | |
Example: A company selected 25 groups of four men, each of whom had cholesterol levels in excess of 280. In each group, the men were matched according to age and weight. Four drugs were administered over a 2-month period, and the reduction in cholesterol was recorded. Do these results allow the company to conclude that differences exist between the four drugs?
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Drug | 3 | 196.0 | 65.3 | 4.12 | 0.009 |
Group | 24 | 3848.7 | 160.4 | 10.11 | 0.000 |
Error | 72 | 1142.6 | 15.9 | | |
Total | 99 | 5187.2 | | | |
Interpret: We conclude that there is sufficient evidence to infer that at least two of the drugs differ.
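As a hedged check, the reported p-values can be recomputed from the F statistics and their degrees of freedom (SciPy assumed).

```python
from scipy import stats

print(stats.f.sf(4.12, 3, 72))     # Drug (treatments): close to the reported 0.009
print(stats.f.sf(10.11, 24, 72))   # Group (blocks): effectively 0.000
```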
14.5 Two-Factor Analysis of Variance
The general term for an experiment that features two factors is a factorial experiment. In factorial experiments, we can examine the effect on the response variable of two or more factors. We will present the technique for fixed effects only; that is, we will address only problems in which all the levels of the factors are included in the experiment.
Example: As part of a study on job tenure, a survey was conducted in which Americans aged between 37 and 45 were asked how many jobs they had held in their lifetimes. Also recorded were gender and educational attainment; the education categories are E1, E2, E3, and E4. Can we infer that differences exist between genders and educational levels?
\(H_{0}\): \(\mu_{1} = \mu_{2} = \mu_{3} = \mu_{4} = \mu_{5} = \mu_{6} = \mu_{7} = \mu_{8}\)
\(H_{1}\): At least two means differ
Summary:
Groups | Count | Sum | Average | Variance |
---|---|---|---|---|
Male E1 | 10 | 126 | 12.60 | 8.27 |
Male E2 | 10 | 110 | 11.00 | 8.67 |
Male E3 | 10 | 106 | 10.60 | 11.60 |
Male E4 | 10 | 90 | 9.00 | 5.33 |
Female E1 | 10 | 115 | 11.50 | 8.28 |
Female E2 | 10 | 112 | 11.20 | 9.73 |
Female E3 | 10 | 94 | 9.40 | 16.49 |
Female E4 | 10 | 81 | 8.10 | 12.32 |
One-way ANOVA:
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Between Groups | 7 | 153.35 | 21.91 | 2.17 | 0.0467 |
Within Groups | 72 | 726.20 | 10.09 | | |
Total | 79 | 879.55 | | | |
Interpret: The value of the test statistic is F = 2.17 with a p-value of .0467. At the 5% significance level, we conclude that differences exist in the mean number of jobs between the eight treatments.
This statistical result raises more questions—namely, can we conclude that the differences in the mean number of jobs are caused by differences between males and females? Or are they caused by differences between educational levels? Or, perhaps, are there combinations, called interactions, of gender and education that result in especially high or low numbers?
A complete factorial experiment is an experiment in which the data for all possible combinations of the levels of the factors are gathered. That means that in the above example we measured the number of jobs for all eight combinations. This experiment is called a complete 2 × 4 factorial experiment. In general, we will refer to one of the factors as factor A (arbitrarily chosen). The number of levels of this factor will be denoted by a. The other factor is called factor B, and its number of levels is denoted by b. The number of observations for each combination is called a replicate. The number of replicates is denoted by r. We address only problems in which the number of replicates is the same for each treatment. Such a design is called balanced.
\(x_{ijk}\) = \(k\)th observation in the \(ij\)th treatment
\(\bar x[AB]_{ij}=\) mean of the treatment when the factor A level is i and the factor B level is j
\(\bar x[A]_{i}=\) Mean of the observations when the factor A level is i
\(\bar x[B]_{j}=\) Mean of the observations when the factor B level is j
\(\bar{\bar x}=\) Mean of all the observations
a = Number of factor A levels
b = Number of factor B levels
r = Number of replicates
\[SS(Total) = \sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^r (x_{ijk} - \bar{\bar x})^2\] \[SS(A) = rb \sum_{i=1}^a (\bar x[A]_{i} - \bar{\bar x})^2\] \[SS(B) = ra \sum_{j=1}^b (\bar x[B]_{j} - \bar{\bar x})^2\] \[SS(AB) = r \sum_{i=1}^a \sum_{j=1}^b (\bar x[AB]_{ij} - \bar x[A]_{i} - \bar x[B]_{j} + \bar{\bar x})^2\] \[SSE = \sum_{i=1}^a \sum_{j=1}^b \sum_{k=1}^r (x_{ijk} - \bar x[AB]_{ij})^2\]
\(\nu_{SS(A)} = a -1\)
\(\nu_{SS(B)} = b -1\)
\(\nu_{SS(AB)} = (a -1)(b-1)\)
\(\nu_{SSE} = n - ab\)
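The two-factor sums of squares map directly onto array operations. The sketch below uses a small hypothetical a × b × r array (factor A levels × factor B levels × replicates) and verifies that the four pieces add up to SS(Total); NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical data: a = 2 factor A levels, b = 2 factor B levels, r = 3 replicates.
x = np.array([[[12.0, 14.0, 11.0], [10.0,  9.0, 11.0]],
              [[13.0, 15.0, 14.0], [ 8.0,  7.0,  9.0]]])

a, b, r = x.shape
n = a * b * r
grand_mean = x.mean()
a_means = x.mean(axis=(1, 2))      # x-bar[A]_i
b_means = x.mean(axis=(0, 2))      # x-bar[B]_j
ab_means = x.mean(axis=2)          # x-bar[AB]_ij

ss_total = ((x - grand_mean) ** 2).sum()
ss_a = r * b * ((a_means - grand_mean) ** 2).sum()
ss_b = r * a * ((b_means - grand_mean) ** 2).sum()
ss_ab = r * ((ab_means - a_means[:, None] - b_means[None, :] + grand_mean) ** 2).sum()
sse = ((x - ab_means[:, :, None]) ** 2).sum()

print(ss_total, ss_a + ss_b + ss_ab + sse)   # the four pieces sum to SS(Total)

mse = sse / (n - a * b)
print((ss_a / (a - 1)) / mse)                # F for factor A
print((ss_b / (b - 1)) / mse)                # F for factor B
print((ss_ab / ((a - 1) * (b - 1))) / mse)   # F for interaction
```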
F-Tests Conducted in Two-Factor Analysis of Variance
Test for Differences between the Levels of Factor A
\(H_{0}\): The means of the a levels of factor A are equal
\(H_{1}\): At least two means differ
Test for Differences between the Levels of Factor B
\(H_{0}\): The means of the b levels of factor B are equal
\(H_{1}\): At least two means differ
Test for Interaction between Factors A and B
\(H_{0}\): Factors A and B do not interact to affect the mean responses
\(H_{1}\): Factors A and B do interact to affect the mean responses
Required Conditions
* The response variable is normally distributed.
* The variance for each treatment is identical.
* The samples are independent.
ANOVA Table for the Two-Factor Experiment:
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC |
---|---|---|---|---|
Factor A | a-1 | SS(A) | MS(A) | MS(A)/MSE |
Factor B | b-1 | SS(B) | MS(B) | MS(B)/MSE |
Interaction | (a-1)(b-1) | SS(AB) | MS(AB) | MS(AB)/MSE |
Error | n - ab | SSE | MSE | |
Total | n − 1 | SS(Total) | | |
Two-way ANOVA: Jobs versus Gender, Education
SOURCE OF VARIATION | DEGREES OF FREEDOM | SUMS OF SQUARES | MEAN SQUARES | F-STATISTIC | P |
---|---|---|---|---|---|
Gender | 1 | 11.25 | 11.25 | 1.12 | 0.294 |
Education | 3 | 135.85 | 45.28 | 4.49 | 0.006 |
Interaction | 3 | 6.25 | 2.08 | 0.21 | 0.892 |
Error | 72 | 726.20 | 10.09 | | |
Total | 79 | 879.55 | | | |
Interpret: There is no evidence at the 5% significance level to infer that differences in the number of jobs exist between men and women. There is sufficient evidence at the 5% significance level to infer that differences in the number of jobs exist between educational levels. There is not enough evidence to conclude that there is an interaction between gender and education.
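As a hedged check, the three reported p-values can be recomputed from the F statistics and their degrees of freedom (SciPy assumed).

```python
from scipy import stats

print(stats.f.sf(1.12, 1, 72))   # Gender:      close to the reported 0.294
print(stats.f.sf(4.49, 3, 72))   # Education:   close to the reported 0.006
print(stats.f.sf(0.21, 3, 72))   # Interaction: close to the reported 0.892
```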
Order of Testing in the Two-Factor Analysis of Variance: Test for interaction first. If there is enough evidence to infer that there is interaction, do not conduct the other tests. If there is not enough evidence to conclude that there is interaction, proceed to conduct the F-tests for factors A and B.
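The testing order can be written as a small decision rule; the function and names below are purely illustrative.

```python
def two_factor_conclusions(p_interaction, p_a, p_b, alpha=0.05):
    """Apply the interaction-first testing order for a two-factor ANOVA."""
    if p_interaction < alpha:
        return "Interaction is significant; do not conduct the factor A and B tests."
    parts = []
    parts.append("factor A means differ" if p_a < alpha else "no evidence factor A means differ")
    parts.append("factor B means differ" if p_b < alpha else "no evidence factor B means differ")
    return "; ".join(parts)

# Values from the job-tenure example (A = gender, B = education):
print(two_factor_conclusions(p_interaction=0.892, p_a=0.294, p_b=0.006))
```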