week 2 - descriptive statistics part 2 Flashcards Preview

Flashcards in week 2 - descriptive statistics part 2 Deck (20)
1
Q

What is the difference between sampling errors and non-sampling errors?

A

Sampling errors result from the fact that only a fraction of the population is being observed. They become less important when the sample size increases.

Non-sampling errors arise when the sample is not representative of the whole population, and they do not necessarily decrease as the sample size increases. E.g. people with no permanent home address are left out of a study because their existence is not recorded.

2
Q

Name some different types of sampling.

A

Non-random sampling, simple random sampling (SRS) and stratified random sampling.

3
Q

What is non-random sampling?

A

The sample is not chosen at random. For example, participants opt in to a study themselves.

4
Q

What is SRS (simple random sampling)?

A

SRS means everyone in a population has an equal chance of being selected for a study. For example, everyone in the population is given a number, and a set of numbers is then generated at random to decide which members of the population are to be involved.
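
The numbering procedure described above can be sketched in Python. This is only an illustration: the population of 100 numbered people, the sample size of 10 and the helper name `simple_random_sample` are assumptions, not part of the original card.

```python
import random

def simple_random_sample(population_ids, n, seed=None):
    """Draw n distinct IDs so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    return rng.sample(population_ids, n)

# Hypothetical population: everyone is given a number from 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 10, seed=42)
print(sample)
```

Sampling is done without replacement, so nobody can be selected twice.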

5
Q

How does stratified random sampling differ from SRS?

A

In stratified random sampling the population is first divided into groups (strata) based on particular characteristics (e.g. sex, occupation), and a random sample is then selected from each of these groups.
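
A minimal sketch of the idea, assuming proportional allocation (the same sampling fraction in every stratum). The card does not say how many people to take from each group, so that choice, the made-up population and the helper name `stratified_sample` are all assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(people, key, fraction, seed=None):
    """Split people into strata by `key`, then draw a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in people:
        strata[person[key]].append(person)

    sample = []
    for members in strata.values():
        # Assumed rule: take the same fraction of every stratum, at least one person.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 40 people: 20 female, 20 male.
people = [{"id": i, "sex": "F" if i % 2 else "M"} for i in range(1, 41)]
chosen = stratified_sample(people, "sex", 0.25, seed=1)
print(len(chosen), sum(p["sex"] == "F" for p in chosen))
```

Unlike SRS over the whole population, this guarantees that each group is represented in the sample.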

6
Q

Calculate the mean and median for the following numbers:

83, 90, 80, 105, 85, 74, 88.

A

Mean = 86.4. Median = 85.
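
The answer can be checked with Python's standard `statistics` module. This is just a quick verification sketch using the seven numbers from the question.

```python
import statistics

values = [83, 90, 80, 105, 85, 74, 88]

mean = statistics.mean(values)      # sum of the values divided by how many there are (605 / 7)
median = statistics.median(values)  # middle value once the numbers are sorted

print(round(mean, 1), median)  # 86.4 85
```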

7
Q

Why is the mean most commonly used to describe the distribution of data?

A

Because it is the most amenable to statistical analysis.

8
Q

What are the most commonly used measures of location (also known as measures of central tendency, or averages)?

A

Mean, Mode and Median.

9
Q

When would you not use the mean to describe the measure of location and what might you use instead?

A

When the distribution is skewed, the mean can be misleading, so you would use the median instead.
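
A small illustration of why the mean can mislead. The lengths of hospital stay below are invented for this example and are not from the course material.

```python
import statistics

# Hypothetical, right-skewed data: lengths of stay in days, with one very long stay.
stays = [2, 3, 3, 4, 5, 6, 40]

mean_stay = statistics.mean(stays)      # pulled upwards by the single value of 40
median_stay = statistics.median(stays)  # unaffected by how extreme the largest value is

print(mean_stay, median_stay)
```

Here the mean is 9 days even though six of the seven patients stayed 6 days or fewer, while the median of 4 days describes a typical patient better.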

10
Q

A positively skewed distribution is skewed to the … and a negatively skewed distribution is skewed to the …

A

Positive = right. Negative = left.

11
Q

What is the range and what are its disadvantages?

A

The difference between the largest and the smallest values. Its disadvantages are that it wastes information (it uses only two of the observations), extreme values (outliers) may make it unreliable, and it tends to increase as the sample size increases.
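
As a rough illustration of the outlier problem, the sketch below computes the range of the seven values from card 6 with and without the largest value. Dropping a value is done here only to show the sensitivity, not as a recommended analysis step.

```python
values = [83, 90, 80, 105, 85, 74, 88]

full_range = max(values) - min(values)  # 105 - 74

# Remove the single largest observation to see how much the range depends on it.
without_largest = sorted(values)[:-1]
reduced_range = max(without_largest) - min(without_largest)  # 90 - 74

print(full_range, reduced_range)  # 31 16
```

One observation almost doubles the range, which is why it is considered unreliable on its own.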

12
Q

What are these two symbols and what is the difference between them? µ and x̄

A

µ is ‘mu’ and x̄ is ‘x-bar’. x̄ represents the mean of a sample, whereas µ represents the mean of the whole population.

13
Q

What does the formula below represent?

$s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

A

The formula for sample standard deviation.

14
Q

What is the formula (algebraic) of population standard deviation?

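A

$\sigma = \sqrt{\dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$, where µ is the population mean and N is the population size.
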
15
Q

How do you calculate the sample standard deviation from the sample variance?

A

The sample standard deviation is the square root of the sample variance.

16
Q

Calculate the mean, variance, standard deviation and coefficient of variation (%) of the following diastolic blood pressures. (74, 80, 83, 85, 88, 90, 105).

A

Mean = 86.4

Variance = 94.95

Standard deviation = 9.74

Coefficient of variation = 11%
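
The four results can be reproduced step by step. The sketch below uses the sample (n − 1) formulas, which is what the answers above correspond to. The variable names are illustrative.

```python
import math

bp = [74, 80, 83, 85, 88, 90, 105]  # diastolic blood pressures from the question
n = len(bp)

mean = sum(bp) / n
# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in bp) / (n - 1)
# Sample standard deviation: square root of the sample variance.
sd = math.sqrt(variance)
# Coefficient of variation: standard deviation as a percentage of the mean.
cv = 100 * sd / mean

print(f"mean={mean:.1f} variance={variance:.2f} sd={sd:.2f} cv={cv:.0f}%")
```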

17
Q

Boxplots display….

A

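The first, second and third quartiles (Q1, Q2 = the median, Q3), plus whiskers that extend to the smallest value lying within 1.5 × IQR below Q1 and the largest value lying within 1.5 × IQR above Q3. Values beyond the whiskers are plotted individually as outliers.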
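
A sketch of how the boxplot values could be computed for the blood pressure data from card 16. Quartile conventions differ between textbooks and software, so the default method of Python's `statistics.quantiles` is an assumption here and other packages may give slightly different quartiles.

```python
import statistics

values = sorted([74, 80, 83, 85, 88, 90, 105])

# Quartiles using the module's default ('exclusive') method; other conventions exist.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# The whiskers stop at the most extreme data values that are still within 1.5 x IQR.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
whisker_low = min(v for v in values if v >= lower_fence)
whisker_high = max(v for v in values if v <= upper_fence)
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, q2, q3, iqr, whisker_low, whisker_high, outliers)
```

With this method the value 105 sits exactly on the upper limit, so it counts as the end of the whisker rather than as an outlier.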

18
Q

The interquartile range represents what percentage of the data? When would you use the IQR to describe your distribution?

A

50% of the data lies between Q1 and Q3 (the IQR). It is used alongside the median when describing data that are not normally distributed.

19
Q

When discussing the standard deviation, what other appropriate statistic do you use? When would you not use these two measurements?

A

The mean and the standard deviation go together. You would only use them when the data are approximately normally distributed, so not when the distribution is skewed.

20
Q

In SPSS, does "Percent" or "Valid Percent" represent the relative frequency? Which one should you present in a table?

A

Valid percent represents the relative frequency, so it is the one to present. It is calculated as the number of people in a category divided by the number of people who answered the question, i.e. it does not include the missing values.
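
The difference between the two columns comes down to the denominator. The sketch below reproduces that arithmetic in Python. It is not SPSS code, and the ten responses (with `None` standing for a missing answer) and the helper name `percent_table` are invented for illustration.

```python
from collections import Counter

def percent_table(responses):
    """Return percent (all cases) and valid percent (non-missing cases) per category."""
    total = len(responses)
    valid = [r for r in responses if r is not None]  # drop missing values
    counts = Counter(valid)
    return {
        category: {
            "percent": 100 * count / total,             # denominator includes missing
            "valid_percent": 100 * count / len(valid),  # denominator excludes missing
        }
        for category, count in counts.items()
    }

# Hypothetical answers from 10 participants, 2 of whom did not answer.
responses = ["yes", "no", "yes", None, "yes", "no", None, "yes", "no", "yes"]
table = percent_table(responses)
print(table)
```

In this example "yes" is 50% of all participants but 62.5% of those who answered, and only the valid percentages add up to 100.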