statistics for management and economics study notes 3


9. Sampling Distributions

9.1 Sampling Distribution of the Mean

Central Limit Theorem: The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of \(\bar{X}\) will resemble a normal distribution.

\[ \mu_{\bar{x}} = \mu \]

\[ \sigma^2_{\bar{x}} = \frac{\sigma^2}{n} \]

If X is normal, then \(\bar{X}\) is normal. If X is nonnormal, then \(\bar{X}\) is approximately normal for sufficiently large sample sizes. The definition of “sufficiently large” depends on the extent of nonnormality of X.

Standardizing the sample mean:

\[ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \]
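As a quick numerical check of the standardization above, the following Python sketch computes the probability that a sample mean falls below a given value. The function name `prob_sample_mean_below` and the population values (μ = 100, σ = 15, n = 36) are illustrative assumptions, not figures from these notes.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(X-bar < x_bar) under the normal sampling distribution of the mean."""
    standard_error = sigma / sqrt(n)      # sigma / sqrt(n)
    z = (x_bar - mu) / standard_error     # standardized sample mean
    return NormalDist().cdf(z)


# Hypothetical population: mu = 100, sigma = 15, sample size n = 36.
p = prob_sample_mean_below(103, mu=100, sigma=15, n=36)
print(round(p, 4))
```

Here the standard error is 15/6 = 2.5, so the value 103 corresponds to z = 1.2.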

9.2 Sampling Distribution of a Sample Proportion

\(\hat{P}\) is approximately normally distributed provided that np and n(1 − p) are greater than or equal to 5.

\[ E(\hat{P}) = p \]

\[ V(\hat{P}) = \sigma^2_{\hat{p}} = \frac{p(1-p)}{n} \]

Standardizing the sample proportion:

\[ Z = \frac{\hat{P} - p}{\sqrt{p(1-p)/n}} \]
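The same idea applies to a sample proportion. This is a minimal sketch, assuming a hypothetical population proportion p = 0.5 and sample size n = 100. The helper name `prob_sample_proportion_above` is made up for illustration, and the guard simply enforces the rule of thumb stated in this section.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_proportion_above(p_hat, p, n):
    """P(P-hat > p_hat) using the normal approximation."""
    # Rule of thumb from this section: np and n(1 - p) should be at least 5.
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("normal approximation needs np >= 5 and n(1 - p) >= 5")
    standard_error = sqrt(p * (1 - p) / n)
    z = (p_hat - p) / standard_error
    return 1 - NormalDist().cdf(z)


# Hypothetical values: p = 0.5, n = 100, threshold 0.55 (z is about 1).
prob = prob_sample_proportion_above(0.55, p=0.5, n=100)
print(round(prob, 4))
```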

9.3 Sampling Distribution of the Difference between Two Means

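\[ E(\bar{X}_1 - \bar{X}_2) = \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \]

\[ V(\bar{X}_1 - \bar{X}_2) = \sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]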

Standardizing the difference between two sample means:

\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

10. Introduction to Estimation

  • An unbiased estimator of a population parameter is an estimator whose expected value is equal to that parameter.
  • An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size grows larger.
  • If there are two unbiased estimators of a parameter, the one whose variance is smaller is said to be relatively more efficient.

10.1 Estimating the Population Mean When the Population Standard Deviation is Known

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
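The interval estimator can be sketched in a few lines of Python. The function name `z_interval` and the input values (sample mean 178, σ = 65, n = 400) are illustrative assumptions. The critical value comes from the standard library's `NormalDist.inv_cdf` rather than from a table.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for mu when sigma is known."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z_crit * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Illustrative inputs: x-bar = 178, sigma = 65, n = 400, 95% confidence.
lower, upper = z_interval(178, sigma=65, n=400)
print(round(lower, 2), round(upper, 2))
```

With these inputs the margin is about 1.96 × 3.25, so the interval runs from roughly 171.63 to 184.37.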

10.2 Determining the Sample Size to Estimate μ

\[ n = \left( \frac{z_{\alpha/2}\,\sigma}{B} \right)^2 \]

\[ B = z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

B stands for the bound on the error of estimation.
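A short sketch of the sample-size formula, assuming hypothetical inputs σ = 65 and a bound of B = 5 at 95% confidence. The helper name `sample_size_for_mean` is illustrative. The result is rounded up because a sample size must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, bound, confidence=0.95):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_crit * sigma / bound) ** 2)


# Hypothetical inputs: sigma = 65, bound on the error of estimation B = 5.
n_required = sample_size_for_mean(sigma=65, bound=5)
print(n_required)
```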

11. Introduction to Hypothesis Testing

11.1 Concepts of Hypothesis Testing

  • The null hypothesis, \(H_0\), usually refers to a general statement or default position that there is no relationship between two measured phenomena, or no association among groups.
  • The alternative hypothesis, \(H_1\) (also called the maintained hypothesis or research hypothesis), is the hypothesis to be accepted if the null hypothesis is rejected.
  • A Type I error occurs when we reject a true null hypothesis. Its probability is denoted α.
  • A Type II error is defined as not rejecting a false null hypothesis. Its probability is denoted β.
  • The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed given that the null hypothesis is true.
  • If we reject the null hypothesis, we conclude that there is enough statistical evidence to infer that the alternative hypothesis is true.
  • If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.

11.2 Testing the Population Mean When the Population Standard Deviation is Known

  • A two-tail test is conducted whenever the alternative hypothesis specifies that the mean is not equal to the value stated in the null hypothesis.
  • A one-tail test that focuses on the right tail of the sampling distribution is conducted whenever we want to know whether there is enough evidence to infer that the mean is greater than the quantity specified by the null hypothesis.
  • A one-tail test that focuses on the left tail of the sampling distribution is conducted whenever we want to know whether there is enough evidence to infer that the mean is less than the quantity specified by the null hypothesis.

11.2.1 Standardized Test Statistic

\[ z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \]

The rejection region for a two-tail test:

\[ z > z_{\alpha/2} \]

or

\[ z < -z_{\alpha/2} \]
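The test statistic and rejection regions can be combined into one small function. This is a sketch under stated assumptions: the name `z_test`, its `tail` argument, and the sample figures (sample mean 52.3, hypothesized mean 50, σ = 8, n = 64) are all hypothetical. The one-tail branches use \(z_\alpha\) rather than \(z_{\alpha/2}\).

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, p_value, reject) for a test of mu with sigma known."""
    std_normal = NormalDist()
    z = (x_bar - mu0) / (sigma / sqrt(n))
    if tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z)
    elif tail == "left":
        reject = z < -std_normal.inv_cdf(1 - alpha)
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("tail must be 'two', 'right', or 'left'")
    return z, p_value, reject


# Hypothetical two-tail test: x-bar = 52.3, mu0 = 50, sigma = 8, n = 64.
z, p_value, reject = z_test(52.3, mu0=50, sigma=8, n=64)
print(round(z, 2), round(p_value, 4), reject)
```

In this example z is 2.3, which lies beyond the two-tail critical value of about 1.96, so the null hypothesis is rejected at the 5% level.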

11.2.2 Testing Hypotheses and Confidence Interval Estimators

\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

We compute the interval estimate and determine whether the hypothesized value of the mean falls into the interval. If it does not, a two-tail test at the same α rejects the null hypothesis.

11.3 Calculating the Probability of a Type II Error

Example: A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. The accounts are approximately normally distributed with a standard deviation of $65. Can we conclude at the 5% significance level (α = 0.05) that the mean is greater than $170?

\[ H_0: \mu \le 170 \]

\[ H_1: \mu > 170 \]

\[ \frac{\bar{x}_L - 170}{65/\sqrt{400}} = 1.645 \]

\[ \bar{x}_L = 175.34 \]

Therefore, the rejection region is:

\[ \bar{x} > 175.34 \]

The sample mean was computed to be 178. Because the test statistic (sample mean) is in the rejection region (it is greater than 175.34), we reject the null hypothesis. Thus, there is sufficient evidence to infer that the mean monthly account is greater than $170.

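\[ \beta = P(\bar{X} < 175.34 \text{, given that the null hypothesis is false}) \]

Suppose that we want β when the true mean account is $180.

\[ \beta = P(\bar{X} < 175.34 \text{, given that } \mu = 180) \]

\[ \beta = P\left( \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{175.34 - 180}{65/\sqrt{400}} \right) = P(Z < -1.43) = 0.0764 \]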
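The Type II error calculation in this example can be reproduced with the standard library. The variable names are illustrative. Because the sketch uses the unrounded critical value (about 1.6449) instead of the table value 1.645 and does not round z to two decimals, its results differ slightly from the hand calculation: the boundary comes out near 175.35 and β near 0.076.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()
mu0, mu_alt = 170, 180            # hypothesized mean and assumed true mean
sigma, n, alpha = 65, 400, 0.05
standard_error = sigma / sqrt(n)  # 65 / 20 = 3.25

# Boundary of the right-tail rejection region, in units of the sample mean.
x_bar_limit = mu0 + std_normal.inv_cdf(1 - alpha) * standard_error

# beta = P(X-bar < x_bar_limit) when the true mean is mu_alt.
beta = std_normal.cdf((x_bar_limit - mu_alt) / standard_error)
print(round(x_bar_limit, 2), round(beta, 4))
```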

For a fixed sample size, the probabilities of Type I and Type II errors are inversely related: lowering α moves the rejection boundary further from the hypothesized mean, which raises β. Unfortunately, there is no simple formula to determine what the significance level should be.

11.4 Larger Sample Size Equals More Information Equals Better Decisions

11.5 Power of a Test

The power of a test is the probability that it leads us to reject the null hypothesis when it is false. Thus, the power of a test is 1 − β.
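To see how power responds to sample size, the sketch below evaluates 1 − β for the right-tail test of the monthly accounts example (hypothesized mean 170, assumed true mean 180, σ = 65) at several sample sizes. The function name `power_right_tail` and the choice of sample sizes 100, 400, and 1000 are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tail(mu0, mu_alt, sigma, n, alpha=0.05):
    """Power (1 - beta) of a right-tail z test when the true mean is mu_alt."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    x_bar_limit = mu0 + std_normal.inv_cdf(1 - alpha) * standard_error
    beta = std_normal.cdf((x_bar_limit - mu_alt) / standard_error)
    return 1 - beta


# Same alpha and same alternative, three hypothetical sample sizes.
powers = {n: power_right_tail(170, 180, 65, n) for n in (100, 400, 1000)}
for n, power in powers.items():
    print(n, round(power, 3))
```

With α held fixed, the power rises as n grows, which is the sense in which a larger sample carries more information.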

12. Inference About a Population

12.1 Inference about a Population Mean When the Population Standard Deviation is Unknown

When the population standard deviation is unknown and the population is normal, the test statistic for testing hypotheses about μ is

\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} \]

which is Student t-distributed with ν = n − 1 degrees of freedom.

Confidence Interval Estimator of μ When σ Is Unknown

\[ \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} \]
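A minimal sketch of the t statistic and interval, using only the standard library. Python's standard library has no Student t quantile function, so the critical value is passed in by hand. The value 2.262 is the usual table entry for \(t_{0.025}\) with 9 degrees of freedom. The data and the function names `t_statistic` and `t_interval` are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu0):
    """t = (x-bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def t_interval(sample, t_crit):
    """x-bar +/- t_crit * s / sqrt(n); t_crit must be looked up for n - 1 df."""
    n = len(sample)
    margin = t_crit * stdev(sample) / sqrt(n)
    return mean(sample) - margin, mean(sample) + margin


# Hypothetical sample of n = 10 observations.
sample = [10, 12, 9, 11, 13, 10, 12, 11, 9, 13]
t = t_statistic(sample, mu0=10)
lower, upper = t_interval(sample, t_crit=2.262)  # t_{0.025, 9} from a t table
print(round(t, 4), round(lower, 3), round(upper, 3))
```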

12.2 Inference about a Population Variance

The test statistic used to test hypotheses about \(\sigma^2\) is

\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} \]

which is chi-squared distributed with ν = n − 1 degrees of freedom when the population random variable is normally distributed with variance equal to \(\sigma^2\).

Confidence Interval Estimator of \(\sigma^2\)

\[ \text{Lower confidence limit (LCL)} = \frac{(n-1)s^2}{\chi^2_{\alpha/2}} \]

\[ \text{Upper confidence limit (UCL)} = \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}} \]
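The chi-squared statistic and the confidence limits are simple arithmetic once the critical values are known. In this sketch the critical values are supplied by hand, since the standard library has no chi-squared quantile function. The values 19.023 and 2.700 are the usual table entries for \(\chi^2_{0.025}\) and \(\chi^2_{0.975}\) with 9 degrees of freedom. The sample figures and function names are hypothetical.

```python
def chi2_statistic(n, s2, sigma0_sq):
    """chi^2 = (n - 1) s^2 / sigma0^2, with n - 1 degrees of freedom."""
    return (n - 1) * s2 / sigma0_sq


def variance_interval(n, s2, chi2_upper, chi2_lower):
    """(LCL, UCL) for sigma^2.

    chi2_upper is chi^2_{alpha/2} and chi2_lower is chi^2_{1 - alpha/2},
    both looked up for n - 1 degrees of freedom.
    """
    return (n - 1) * s2 / chi2_upper, (n - 1) * s2 / chi2_lower


# Hypothetical sample: n = 10, s^2 = 20/9; test value sigma0^2 = 4.
n, s2 = 10, 20 / 9
stat = chi2_statistic(n, s2, sigma0_sq=4)
lcl, ucl = variance_interval(n, s2, chi2_upper=19.023, chi2_lower=2.700)
print(round(stat, 3), round(lcl, 3), round(ucl, 3))
```

Note that the larger critical value produces the lower limit, because it sits in the denominator.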

12.3 Inference about a Population Proportion

\[ \hat{p} = \frac{x}{n} \]

Test Statistic for p

\[ z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \]

which is approximately normal when np and n(1 − p) are greater than 5.

Confidence Interval Estimator of p

\[ \hat{p} \pm z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})/n} \]

Sample Size to Estimate a Proportion

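\[ n = \left( \frac{z_{\alpha/2} \sqrt{\hat{p}(1-\hat{p})}}{B} \right)^2 \]

\[ B = z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]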
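Both proportion formulas can be sketched as follows. The function names, the counts (520 successes in 1000 trials), and the bound B = 0.03 are illustrative assumptions. When no estimate of the proportion is available before sampling, the sketch defaults to 0.5, which maximizes \(\hat{p}(1-\hat{p})\) and therefore gives the most conservative sample size.

```python
from math import ceil, sqrt
from statistics import NormalDist


def proportion_interval(x, n, confidence=0.95):
    """Confidence interval for p from x successes in n trials."""
    p_hat = x / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def sample_size_for_proportion(bound, p_guess=0.5, confidence=0.95):
    """Smallest n that keeps the error of estimation within `bound`."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_crit * sqrt(p_guess * (1 - p_guess)) / bound) ** 2)


# Hypothetical inputs: 520 successes out of 1000; desired bound of 0.03.
lower, upper = proportion_interval(x=520, n=1000)
n_required = sample_size_for_proportion(bound=0.03)
print(round(lower, 3), round(upper, 3), n_required)
```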

13. Inference about Comparing Two Populations

13.1 Inference about the Difference between two Means: Independent Samples

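Sampling Distribution of \(\bar{x}_1 - \bar{x}_2\):

\(\bar{x}_1 - \bar{x}_2\) is normally distributed if the populations are normal and approximately normal if the populations are nonnormal and the sample sizes are large.

\[ E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2 \]

\[ V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]

\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]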

13.1.1 Test Statistic for \(\mu_1 - \mu_2\) when \(\sigma_1^2 = \sigma_2^2\)

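\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]

where \(s_p^2\) is called the pooled variance estimator:

\[ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} \]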

13.1.2 Confidence Interval Estimator of \(\mu_1 - \mu_2\) when \(\sigma_1^2 = \sigma_2^2\)

\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} \]

13.1.3 Test Statistic for \(\mu_1 - \mu_2\) when \(\sigma_1^2 \ne \sigma_2^2\)

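\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

\[ \nu = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}} \]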

13.1.4 Confidence Interval Estimator of \(\mu_1 - \mu_2\) when \(\sigma_1^2 \ne \sigma_2^2\)

\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \]
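The equal-variances and unequal-variances statistics differ only in the standard error and the degrees of freedom, which the following sketch makes explicit. It works from summary statistics. The function names and the numbers are hypothetical. The pooled test's degrees of freedom, \(n_1 + n_2 - 2\), are a standard result and match the denominator of the pooled variance estimator.

```python
from math import sqrt


def pooled_t(x1, x2, s1_sq, s2_sq, n1, n2, diff0=0.0):
    """Equal-variances t statistic and its degrees of freedom."""
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    t = ((x1 - x2) - diff0) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


def unequal_variances_t(x1, x2, s1_sq, s2_sq, n1, n2, diff0=0.0):
    """Unequal-variances t statistic and its approximate degrees of freedom."""
    v1, v2 = s1_sq / n1, s2_sq / n2
    t = ((x1 - x2) - diff0) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, nu


# Hypothetical summary statistics: means 20 and 18, variances 4 and 9,
# sample sizes 10 and 12.
t_pooled, df_pooled = pooled_t(20, 18, 4, 9, 10, 12)
t_unequal, df_unequal = unequal_variances_t(20, 18, 4, 9, 10, 12)
print(round(t_pooled, 4), df_pooled)
print(round(t_unequal, 4), round(df_unequal, 2))
```

The unequal-variances degrees of freedom are generally not an integer. A common convention is to round to the nearest integer before consulting a t table.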

13.1.5 Testing the Population Variances

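\[ H_0: \sigma_1^2 / \sigma_2^2 = 1 \]

\[ H_1: \sigma_1^2 / \sigma_2^2 \ne 1 \]

\[ F = \frac{s_1^2}{s_2^2} \]

\(\nu_1 = n_1 - 1\) and \(\nu_2 = n_2 - 1\). This is a two-tail test, so the rejection region is \(F > F_{\alpha/2,\nu_1,\nu_2}\) or \(F < F_{1-\alpha/2,\nu_1,\nu_2}\).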

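Confidence Interval Estimator of \(\sigma_1^2 / \sigma_2^2\)

\[ \text{LCL} = \left( \frac{s_1^2}{s_2^2} \right) \frac{1}{F_{\alpha/2,\nu_1,\nu_2}} \]

\[ \text{UCL} = \left( \frac{s_1^2}{s_2^2} \right) F_{\alpha/2,\nu_2,\nu_1} \]

Note that the degrees of freedom are reversed in the upper limit.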

13.2 Inference about the Difference between two Means: Matched Pairs Experiment

\(\mu_D\) is the mean of the population of differences.

Test Statistic for \(\mu_D\)

\[ t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n_D}} \]

which is Student t distributed with \(\nu = n_D - 1\) degrees of freedom, provided that the differences are normally distributed.

Confidence Interval Estimator of \(\mu_D\)

\[ \bar{x}_D \pm t_{\alpha/2} \frac{s_D}{\sqrt{n_D}} \]
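In a matched pairs experiment the analysis reduces to a one-sample t procedure on the differences. The sketch below shows that reduction. The paired data and the function name `matched_pairs_t` are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def matched_pairs_t(first, second, mu_d0=0.0):
    """t statistic on the paired differences, with n_D - 1 degrees of freedom."""
    differences = [a - b for a, b in zip(first, second)]
    n_d = len(differences)
    t = (mean(differences) - mu_d0) / (stdev(differences) / sqrt(n_d))
    return t, n_d - 1


# Hypothetical paired measurements on the same eight subjects.
after = [12, 11, 15, 10, 13, 16, 11, 14]
before = [10, 10, 12, 10, 11, 12, 10, 11]
t, df = matched_pairs_t(after, before)
print(round(t, 4), df)
```

Here the differences are 2, 1, 3, 0, 2, 4, 1, 3, with mean 2, so the statistic measures how far that mean lies from zero in standard-error units.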

13.3 Inference about the Difference between two Population Proportions

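The statistic \(\hat{p}_1 - \hat{p}_2\) is approximately normally distributed provided that the sample sizes are large enough so that \(n_1 p_1\), \(n_1(1 - p_1)\), \(n_2 p_2\), and \(n_2(1 - p_2)\) are all greater than or equal to 5.

\[ E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2 \]

\[ V(\hat{p}_1 - \hat{p}_2) = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2} \]

\[ Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \]

\[ \hat{p}_1 = \frac{x_1}{n_1} \qquad \hat{p}_2 = \frac{x_2}{n_2} \]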

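13.3.1 Test Statistic for \(p_1 - p_2\): Case 1

\[ H_0: p_1 - p_2 = 0 \]

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p}) \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]

\[ \hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \]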

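13.3.2 Test Statistic for \(p_1 - p_2\): Case 2

\[ H_0: p_1 - p_2 = D, \quad D \ne 0 \]

\[ z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} \]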
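The two cases differ in how the standard error is estimated: Case 1 pools the samples because the null hypothesis says the proportions are equal, while Case 2 uses each sample proportion separately. The following sketch shows both side by side. The function names `z_case1` and `z_case2` and the counts are hypothetical.

```python
from math import sqrt


def z_case1(x1, n1, x2, n2):
    """Case 1 (H0: p1 - p2 = 0): pooled proportion in the standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))


def z_case2(x1, n1, x2, n2, d):
    """Case 2 (H0: p1 - p2 = d, d != 0): unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    return ((p1 - p2) - d) / sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical counts: 60 of 100 successes versus 45 of 100.
z1 = z_case1(60, 100, 45, 100)
z2 = z_case2(60, 100, 45, 100, d=0.05)
print(round(z1, 4), round(z2, 4))
```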

