# Statistics for Management and Economics Study Notes 2

# 5. Data Collection And Sampling

## 5.1 Simple Random Sample

A simple random sample is a sample selected in such a way that every possible sample with the same number of observations is equally likely to be chosen.

## 5.2 Stratified Random Sampling

A stratified random sample is obtained by separating the population into mutually exclusive sets, or strata, and then drawing simple random samples from each stratum.

## 5.3 Cluster Sampling

A cluster sample is a simple random sample of groups or clusters of elements.
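The two sampling plans above can be sketched with Python's standard library. This is a minimal illustration on a made-up population of 1,000 units (the IDs, stratum labels, and sample sizes are all hypothetical), not a full survey-sampling implementation.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 units, each tagged with a stratum label.
population = [(i, "urban" if i % 4 else "rural") for i in range(1000)]

def simple_random_sample(pop, n):
    # Every possible sample of size n is equally likely (Section 5.1).
    return random.sample(pop, n)

def stratified_sample(pop, n_per_stratum):
    # Separate the population into mutually exclusive strata,
    # then draw a simple random sample from each stratum (Section 5.2).
    strata = {}
    for unit in pop:
        strata.setdefault(unit[1], []).append(unit)
    sample = []
    for units in strata.values():
        sample.extend(random.sample(units, n_per_stratum))
    return sample
```

A cluster sample would instead apply `random.sample` to a list of groups and keep every element of each chosen group.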

## 5.4 Sampling Error

Sampling error refers to differences between the sample and the population that exist only because of the observations that happened to be selected for the sample.

## 5.5 Nonsampling Error

Nonsampling errors result from mistakes made in the acquisition of data or from the sample observations being selected improperly.

- Errors in data acquisition.
- Nonresponse error refers to error (or bias) introduced when responses are not obtained from some members of the sample.
- Selection bias occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample.

# 6 Probability

## 6.1 Intersection

The intersection of events A and B is the event that occurs when both A and B occur. The probability of the intersection is called the **joint probability**.

## 6.2 Marginal Probability

Marginal probabilities, computed by adding across rows or down columns, are so named because they are calculated in the margins of the table.

## 6.3 Conditional Probability

The probability of event A given event B is

\[p(A|B) = \frac{p(AB)}{p(B)}\]

The probability of event B given event A is

\[p(B|A) = \frac{p(AB)}{p(A)}\]

## 6.4 Independence

Two events A and B are said to be independent if

\[p(A|B) = p(A)\]

or

\[p(B|A) = p(B)\]
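Sections 6.1 through 6.4 can be tied together with one small numeric sketch. The joint probability table below is hypothetical (the four numbers are invented for illustration); the code computes marginal and conditional probabilities from it and then tests for independence.

```python
# Hypothetical joint probability table p(A and B), with Ac and Bc
# denoting the complements of A and B.
joint = {("A", "B"): 0.11, ("A", "Bc"): 0.29,
         ("Ac", "B"): 0.06, ("Ac", "Bc"): 0.54}

# Marginal probabilities: add across a row or down a column (Section 6.2).
p_A = joint[("A", "B")] + joint[("A", "Bc")]   # p(A)
p_B = joint[("A", "B")] + joint[("Ac", "B")]   # p(B)

# Conditional probability p(A|B) = p(AB) / p(B) (Section 6.3).
p_A_given_B = joint[("A", "B")] / p_B

# Independence check (Section 6.4): A and B are independent iff p(A|B) = p(A).
independent = abs(p_A_given_B - p_A) < 1e-9
```

Here p(A|B) ≈ 0.647 while p(A) = 0.40, so for this table the two events are dependent.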

## 6.5 Union

The union of events A and B is the event that occurs when either A or B or both occur. It is denoted `A or B`.

## 6.6 Complement Rule

The complement of event A is the event that occurs when event A does not occur.

\[p(A^c) = 1 - p(A)\]

## 6.7 Multiplication Rule

\[p(AB) = p(A)p(B|A) = p(B)p(A|B)\]

## 6.8 Addition Rule

The probability that event A, or event B, or both occur is

\[p(A \text{ or } B) = p(A) + p(B) - p(AB)\]

## 6.9 Bayes’s Law Formula

\[p(A_{i}|B) = \frac{p(A_{i})p(B|A_{i})}{p(A_{1})p(B|A_{1}) + p(A_{2})p(B|A_{2}) + \cdots + p(A_{k})p(B|A_{k})}\]
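Bayes's law is a one-line computation once the priors \(p(A_i)\) and likelihoods \(p(B|A_i)\) are in lists. The screening-test numbers below (1% prevalence, 95% true-positive rate, 5% false-positive rate) are hypothetical and chosen only to illustrate the formula.

```python
def bayes(priors, likelihoods, i):
    # p(A_i | B) = p(A_i) p(B|A_i) / sum over k of p(A_k) p(B|A_k)
    total = sum(p * l for p, l in zip(priors, likelihoods))
    return priors[i] * likelihoods[i] / total

# Hypothetical screening example: A_1 = "has condition" (prior 0.01),
# A_2 = "does not" (prior 0.99); B = "tests positive".
posterior = bayes([0.01, 0.99], [0.95, 0.05], 0)
```

Even with a fairly accurate test, the posterior probability of the condition given a positive result is only about 0.16, because the prior is so small.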

# 7. Random Variables and Discrete Probability Distributions

## 7.1 Describing the Population Probability Distribution

\[E(x) = \mu = \sum xp(x)\]

\[V(x) = \sigma^2 = \sum (x-\mu)^2p(x)\]
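These two definitions translate directly into code. The distribution `dist` below is a made-up example in which the probabilities sum to 1; the functions implement exactly the sums above.

```python
def expected_value(dist):
    # E(x) = sum of x * p(x)
    return sum(x * p for x, p in dist.items())

def variance(dist):
    # V(x) = sum of (x - mu)^2 * p(x)
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical probability distribution p(x)
dist = {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1}
```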

## 7.2 Laws of Expected Value and Variance

\[E(c) = c\]

\[E(x + c) = E(x) + c\]

\[E(cx) = cE(x)\]

\[V(c) = 0\]

\[V(x + c) = V(x)\]

\[V(cx) = c^2V(x)\]

## 7.3 Bivariate Distributions

The covariance of two discrete variables is defined as

\[COV(x, y) = \sigma_{xy} = \sum \sum (x - \mu_{x})(y-\mu_{y})p(x, y)\]

Coefficient of Correlation:

\[\rho = \frac{\sigma_{xy}}{\sigma_{x}\sigma_{y}}\]
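A short sketch of the covariance and correlation computations, using a hypothetical bivariate table p(x, y) (the four probabilities are invented). The marginals are obtained by summing out the other variable, as in Section 6.2.

```python
import math

# Hypothetical bivariate distribution p(x, y)
pxy = {(1, 1): 0.5, (1, 2): 0.1, (2, 1): 0.1, (2, 2): 0.3}

def marginal(pxy, axis):
    # Sum the joint probabilities over the other variable (axis 0 -> p(x)).
    m = {}
    for key, p in pxy.items():
        m[key[axis]] = m.get(key[axis], 0.0) + p
    return m

def mean_and_var(m):
    mu = sum(k * p for k, p in m.items())
    return mu, sum((k - mu) ** 2 * p for k, p in m.items())

def covariance(pxy):
    # COV(x, y) = sum over all (x, y) of (x - mu_x)(y - mu_y) p(x, y)
    mx, _ = mean_and_var(marginal(pxy, 0))
    my, _ = mean_and_var(marginal(pxy, 1))
    return sum((x - mx) * (y - my) * p for (x, y), p in pxy.items())

def correlation(pxy):
    # rho = sigma_xy / (sigma_x * sigma_y)
    _, vx = mean_and_var(marginal(pxy, 0))
    _, vy = mean_and_var(marginal(pxy, 1))
    return covariance(pxy) / math.sqrt(vx * vy)
```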

## 7.4 Laws of Expected Value and Variance of the Sum of Two Variables

\[E(x + y) = E(x) + E(y)\]

\[V(x + y) = V(x) + V(y) + 2COV(x, y)\]

## 7.5 Mean and Variance of a Portfolio of Two Stocks

\[E(R_{p}) = w_{1}E(R_{1}) + w_{2}E(R_{2})\]

\[V(R_{p}) = w_{1}^2V(R_{1}) + w_{2}^2V(R_{2}) + 2w_{1}w_{2}COV(R_{1}, R_{2}) = w_{1}^2V(R_{1}) + w_{2}^2V(R_{2}) + 2w_{1}w_{2}\rho\sigma_{1}\sigma_{2}\]
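The two-stock formulas can be checked numerically. The weights, expected returns, variances, and correlation below are hypothetical inputs; the second form of the variance formula (using \(\rho\sigma_1\sigma_2\) in place of the covariance) is the one implemented.

```python
import math

def portfolio_mean(w1, w2, e1, e2):
    # E(R_p) = w1 E(R_1) + w2 E(R_2)
    return w1 * e1 + w2 * e2

def portfolio_variance(w1, w2, v1, v2, rho):
    # V(R_p) = w1^2 V(R_1) + w2^2 V(R_2) + 2 w1 w2 rho sigma_1 sigma_2
    return (w1 ** 2 * v1 + w2 ** 2 * v2
            + 2 * w1 * w2 * rho * math.sqrt(v1) * math.sqrt(v2))
```

With positive correlation the cross term adds risk; with \(\rho = 0\) the portfolio variance is just the weighted sum \(w_1^2 V(R_1) + w_2^2 V(R_2)\).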

## 7.6 Portfolios with More Than Two Stocks

\[E(R_{p}) = \sum_{i=1}^k w_{i}E(R_{i})\]

\[V(R_{p}) = \sum_{i=1}^k w_{i}^2V(R_{i}) + 2\sum_{i=1}^k \sum_{j=i+1}^k w_{i}w_{j}COV(R_{i}, R_{j})\]
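Because the covariance matrix is symmetric and \(COV(R_i, R_i) = V(R_i)\), the k-stock variance formula is equivalent to a double sum over the full covariance matrix. The sketch below uses that equivalent form; the example weights and covariance entries in the test are hypothetical.

```python
def portfolio_mean_k(w, expected):
    # E(R_p) = sum of w_i E(R_i)
    return sum(wi * ei for wi, ei in zip(w, expected))

def portfolio_variance_k(w, cov):
    # V(R_p) = sum_i sum_j w_i w_j COV(R_i, R_j), where COV(R_i, R_i) = V(R_i).
    # The diagonal terms give sum w_i^2 V(R_i); the off-diagonal pairs give
    # 2 * sum_{i<j} w_i w_j COV(R_i, R_j), matching the formula above.
    k = len(w)
    return sum(w[i] * w[j] * cov[i][j] for i in range(k) for j in range(k))
```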

## 7.7 Binomial Distribution

- The binomial experiment consists of a fixed number of trials (n).
- Each trial has two possible outcomes: success or failure.
- The probability of success is p. The probability of failure is 1 − p.
- The trials are independent.

The probability of x successes in a binomial experiment with n trials and probability of success = p is

\[p(x) = \frac{n!}{x!(n-x)!}p^x(1-p)^{n-x}\]

### 7.7.1 Cumulative Probability

\[p(X \le 4) = p(0) + p(1) + p(2) + p(3) + p(4)\]

### 7.7.2 Binomial Probability p(X ≥ x)

\[p(X \ge x) = 1 - p(X \le (x-1))\]

### 7.7.3 Binomial Probability P(X = x)

\[p(x) = p(X \le x) - p(X \le (x-1))\]

### 7.7.4 Mean and Variance of a Binomial Distribution

\[\mu = np\]

\[\sigma^2 = np(1-p)\]

\[\sigma = \sqrt{np(1-p)}\]
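The binomial formulas of Sections 7.7.1 to 7.7.4 can be written directly with `math.comb` (Python 3.8+). The parameters n = 10 and p = 0.3 used in the checks are arbitrary example values.

```python
from math import comb

def binom_pmf(x, n, p):
    # p(x) = n! / (x! (n - x)!) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    # Cumulative probability p(X <= x) (Section 7.7.1)
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

def binom_sf(x, n, p):
    # p(X >= x) = 1 - p(X <= x - 1) (Section 7.7.2)
    return 1.0 - binom_cdf(x - 1, n, p)
```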

## 7.8 Poisson Distribution

Like the binomial random variable, the Poisson random variable is the number of occurrences of events, which we’ll continue to call successes. The difference between the two random variables is that a binomial random variable is the number of successes in a set number of trials, whereas a Poisson random variable is the number of successes in an interval of time or specific region of space.

- The number of successes that occur in any interval is independent of the number of successes that occur in any other interval.
- The probability of a success in an interval is the same for all equal-size intervals.
- The probability of a success in an interval is proportional to the size of the interval.
- The probability of more than one success in an interval approaches 0 as the interval becomes smaller.

The probability that a Poisson random variable assumes a value of x in a specific interval is

\[p(x) = \frac{e^{-\mu}\mu^x}{x!}\]

The variance of a Poisson random variable is equal to its mean; that is,

\[\sigma^2 = \mu\]

\[p(X \ge x) = 1 - p(X \le (x-1))\]

\[p(x) = p(X \le x) - p(X \le (x-1))\]
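The Poisson probabilities follow the same pattern as the binomial code. The mean \(\mu = 2\) used in the checks is an arbitrary example; the tests confirm numerically that the mean and variance both equal \(\mu\).

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    # p(x) = e^(-mu) * mu^x / x!
    return exp(-mu) * mu ** x / factorial(x)

def poisson_cdf(x, mu):
    # p(X <= x), so p(X >= x) = 1 - poisson_cdf(x - 1, mu)
    return sum(poisson_pmf(k, mu) for k in range(x + 1))
```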

# 8. Continuous Probability Distributions

## 8.1 Uniform Distribution

\[f(x) = \frac{1}{b-a}, a \le x \le b\]
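Because the uniform density is flat, any probability is just the area of a rectangle. A minimal sketch (the interval [0, 10] in the checks is an arbitrary example):

```python
def uniform_pdf(x, a, b):
    # f(x) = 1 / (b - a) on [a, b], and 0 elsewhere
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_prob(x1, x2, a, b):
    # P(x1 < X < x2): width of the overlap with [a, b] times 1 / (b - a)
    lo, hi = max(x1, a), min(x2, b)
    return max(hi - lo, 0.0) / (b - a)
```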

## 8.2 Normal Distribution

\[f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}(\frac{x-\mu}{\sigma})^2}\]

### 8.2.1 Calculating Normal Probabilities

We standardize a random variable by subtracting its mean and dividing by its standard deviation. When the variable is normal, the transformed variable is called a standard normal random variable and denoted by Z; that is,

\[Z = \frac{X - \mu}{\sigma}\]
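In place of a normal table, the standard normal cumulative probability can be computed from the error function via \(\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\), a standard identity. The sketch standardizes exactly as above; the values in the checks (e.g. \(\mu = 100\), \(\sigma = 10\)) are arbitrary examples.

```python
from math import erf, sqrt

def std_normal_cdf(z):
    # P(Z <= z) via the error function: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_cdf(x, mu, sigma):
    # Standardize first: Z = (X - mu) / sigma
    return std_normal_cdf((x - mu) / sigma)
```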

## 8.3 Exponential Distribution

\[f(x) = \lambda e^{-\lambda x}, x \ge 0\]

\[\mu = \sigma = \frac{1}{\lambda}\]

\[p(X > x) = e^{-\lambda x}\]

\[p(X < x) = 1 - e^{-\lambda x}\]

\[p(x_{1} < X < x_{2}) = p(X < x_{2}) - p(X < x_{1}) = e^{-\lambda x_{1}} - e^{-\lambda x_{2}}\]
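The exponential probability formulas need only `math.exp`. The rate \(\lambda = 0.5\) in the checks is an arbitrary example value.

```python
from math import exp

def exp_prob_above(x, lam):
    # p(X > x) = e^(-lambda * x)
    return exp(-lam * x)

def exp_prob_between(x1, x2, lam):
    # p(x1 < X < x2) = e^(-lambda * x1) - e^(-lambda * x2)
    return exp(-lam * x1) - exp(-lam * x2)
```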

## 8.4 Student t Distribution

\[f(t)=\frac{\Gamma[(\nu + 1)/2]}{\sqrt{\nu \pi} \Gamma (\nu /2)}[1 + \frac{t^2}{\nu}]^{-(\nu + 1)/2}\]

where \(\nu\) (Greek letter nu) is the parameter of the Student t distribution called the **degrees of freedom**, and \(\Gamma\) is the gamma function.

\[E(t) = 0\]

\[V(t) = \frac{\nu}{\nu - 2}, \nu \gt 2\]

The Student t distribution is similar to the standard normal distribution. Both are symmetrical about 0. We describe the Student t distribution as mound shaped, whereas the normal distribution is bell shaped; the t distribution has heavier tails. As \(\nu\) grows larger, the Student t distribution approaches the standard normal distribution.

## 8.5 Chi-Squared Distribution

\[f(\chi^2) = \frac{1}{\Gamma(\nu/2)} \frac{1}{2^{\nu/2}}(\chi^2)^{(\nu/2)-1}e^{-\chi^2/2}\]

\[E(\chi^2) = \nu\]

\[V(\chi^2) = 2\nu\]

## 8.6 F Distribution

\[E(F) = \frac{\nu_{2}}{\nu_{2} - 2}, \nu_{2} \gt 2\]

\[V(F) = \frac{2\nu_{2}^2(\nu_{1} + \nu_{2} -2)}{\nu_{1}(\nu_{2}-2)^2(\nu_{2} -4)}, \nu_{2} \gt 4\]

where \(\nu_{1}\) is the numerator degrees of freedom and \(\nu_{2}\) is the denominator degrees of freedom.