What is the key concept to make inferences about parameters based on statistics?
Sampling distribution
What is a sampling distribution?
It is the distribution of a statistic treated as a random variable. In practice: draw many random samples of the same size from a population, compute the statistic (the quantity you are interested in, for example the mean) for each sample, and plot those values as a frequency distribution. For the sample mean, the resulting curve is approximately normal when each sample is large enough.
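The idea can be sketched with a small simulation. This is a minimal illustration, not a prescribed procedure: the exponential population, the sample size of 40, the 1,000 repetitions, and the helper name sampling_distribution are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(10_000)]

def sampling_distribution(population, n, num_samples):
    """Draw num_samples random samples of size n and return the mean of each."""
    return [statistics.fmean(random.sample(population, n))
            for _ in range(num_samples)]

means = sampling_distribution(population, n=40, num_samples=1_000)

# Frequency distribution of the statistic: count sample means per 0.1-wide bin.
bins = Counter(round(m, 1) for m in means)
for value in sorted(bins):
    print(f"{value:4.1f} | {'#' * (bins[value] // 10)}")

print("population mean:     ", round(statistics.fmean(population), 3))
print("mean of sample means:", round(statistics.fmean(means), 3))
```

The printed bars form a rough bell shape centred near the population mean, even though the population itself is strongly skewed.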
What is random sampling?
It is the prerequisite for valid statistical inference. It means selecting the observations from the population by chance, so that every member of the population has the same probability of being chosen.
Central Limit Theorem
“The sampling distribution of the mean of a random sample drawn from any population is approximately normal for a sufficiently large sample size n; the larger the sample size, the more closely it resembles a normal distribution.”
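One way to see the theorem at work is to watch the skewness of the sample means shrink as n grows. The sketch below assumes an exponential population and uses skewness (about 0 for a normal distribution) as a rough stand-in for "how normal" the result looks; the helper names and the sample sizes 2, 10, and 50 are hypothetical choices.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is reproducible

def skewness(values):
    """Skewness of a list of values: close to 0 for a symmetric distribution."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

def means_of_samples(n, num_samples=3_000):
    """Means of num_samples random samples of size n from an exponential population."""
    return [statistics.fmean([random.expovariate(1.0) for _ in range(n)])
            for _ in range(num_samples)]

skew_by_n = {n: skewness(means_of_samples(n)) for n in (2, 10, 50)}
for n, skew in skew_by_n.items():
    print(f"n = {n:2d}: skewness of sample means = {skew:.2f}")
```

The skewness moves toward 0 as n increases, which matches the statement that larger samples give a sampling distribution closer to normal.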
What is the estimator and the estimand?
The statistic is the estimator; the parameter it estimates is the estimand.
Procedure of making an inference about the population variance
sample variance notation
s^2
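As a small worked example of the notation, s^2 can be computed by hand and checked against the standard library. The data values and the function name sample_variance are made up for illustration; the n - 1 divisor is the usual definition of the sample variance.

```python
import statistics

data = [4.0, 7.0, 6.0, 3.0, 5.0]  # hypothetical sample

def sample_variance(xs):
    """s^2: sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

s2 = sample_variance(data)
print(s2)                         # 2.5
print(statistics.variance(data))  # 2.5, the library's sample variance
```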
What is a hypothesis?
An educated guess
Research hypothesis
It states our research goal (what we want to show). It is called H1, the alternative hypothesis.
Type 1 error (probability = alpha)
Rejecting H0 when it is true
Type 2 error (probability = beta)
Not rejecting H0 when it is false
How are the error probabilities related?
They are inversely related: for a fixed sample size, lowering alpha raises beta, and raising alpha lowers beta.
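The trade-off can be made concrete for a one-tail Z test. This is a sketch under assumed numbers: H0 mean 100, a specific true mean of 103, known sigma of 10, and n = 25 are all hypothetical, and beta_for_alpha is a name introduced here.

```python
import math
from statistics import NormalDist

def beta_for_alpha(alpha, mu0=100.0, mu1=103.0, sigma=10.0, n=25):
    """Type 2 error probability of an upper-tail Z test when the true mean is mu1."""
    se = sigma / math.sqrt(n)  # standard deviation of the sample mean
    # Reject H0 when the sample mean exceeds this cutoff.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean <= cutoff) when the true mean is mu1, i.e. H0 is false.
    return NormalDist(mu1, se).cdf(critical_mean)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {beta_for_alpha(alpha):.3f}")
```

With the sample size held fixed, each smaller alpha pushes the cutoff further out, so a false H0 is rejected less often and beta grows.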
Which type of error is more serious?
Type 1 error (To reject H0 although it is true)
–> it is desirable to have a small alpha
How does the hypothesis testing procedure start?
How can we gather evidence to reject or accept the null hypothesis?
How do we know whether the test statistic provides enough evidence to reject or not reject the null hypothesis?
2. p-value
What is the significance level (alpha)?
It is the upper bound (the maximum tolerance) for the Type 1 error probability in a statistical test
What is defined according to the significance level?
The extreme regions (rejection regions) of the sampling distribution are defined according to the significance level.
One-tail test
Two-tail test
Which distribution do the one-tail and two-tail tests use?
They use the sampling distribution of the mean.
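The rejection regions for the two kinds of test can be sketched with the standard normal distribution. The example assumes a Z test, an upper-tail alternative for the one-tail case, and alpha = 0.05; the function names critical_values and rejects_h0 are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def critical_values(alpha, tails):
    """Boundaries of the rejection region(s) on the z scale."""
    if tails == 1:
        # One-tail (upper) test: all of alpha sits in the right tail.
        return (Z.inv_cdf(1 - alpha),)
    # Two-tail test: alpha is split evenly between both tails.
    return (Z.inv_cdf(alpha / 2), Z.inv_cdf(1 - alpha / 2))

def rejects_h0(z, alpha=0.05, tails=2):
    """True if the test statistic z falls inside a rejection region."""
    crit = critical_values(alpha, tails)
    if tails == 1:
        return z > crit[0]
    return z < crit[0] or z > crit[1]

print(critical_values(0.05, tails=1))  # about (1.645,)
print(critical_values(0.05, tails=2))  # about (-1.96, 1.96)
print(rejects_h0(1.8, tails=1), rejects_h0(1.8, tails=2))
```

The same test statistic (here z = 1.8) can fall in the rejection region of the one-tail test but not of the two-tail test, because the two-tail test splits alpha across both extremes.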
P-value
if P-value
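P-value for a two-tail Z test
Sum the probabilities of both tails, i.e. the probability of a test statistic at least as extreme as the observed one in either direction.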
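A minimal sketch of that calculation, assuming a known population standard deviation; the numbers (sample mean 103, H0 mean 100, sigma 10, n = 25) and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardised distance of the sample mean from the H0 mean."""
    return (sample_mean - mu0) / (sigma / math.sqrt(n))

def two_tail_p_value(z):
    """Probability of a statistic at least as extreme as z in either tail."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail  # both tails have the same area by symmetry

z = z_statistic(sample_mean=103.0, mu0=100.0, sigma=10.0, n=25)
p = two_tail_p_value(z)
print(f"z = {z:.2f}, two-tail p-value = {p:.4f}")
```

Because the standard normal distribution is symmetric, summing both tails is the same as doubling the area of one tail beyond |z|.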