Flashcards in Hypothesis Testing And Formulas In Excel Deck (18)

Loading flashcards...

1

## One sample t-test

###
Used for data with a NORMAL DISTRIBUTION

Example: effectiveness of a medicine. The same group of people are tested, one time before treatment and one time after.

The measurements cannot be carried out at the same time and the data are gathered from the same subjects.

Nullhypothesis: there is no significant difference (the medicine is not effective).

Degree of freedom: (n-1)

In this example that would mean the amount of people participating in the study minus one.

P value: T.TEST (array1, array2, 2,1)

Array 1: patients before treatment

Array 2: patients after treatment

Tail: 2

Type: 1

T value: T.INV.2T

Appropriate statistical parameter: T-value (p-value, degree of freedom)

2

## Two sample t-test

###
Used for data with a NORMAL DISTRIBUTION

Used for two individual groups with one variable. Example: is there a significant difference in body height between men and women?

First you have to use a F-test to find out which t-test to use. The null hypothesis of a f-test is that the variances of the two datasets are equal variance/variance = 1.

The f-test/f-value is variance of one dataset (the variance with the biggest value) divided by the variance of the other dataset.

The p-value of the f-test: t-test function.

If the f-test gives a value over 5% the variances are equal, and if the f-test gives a value under 5% the variances are unequal.

Equal variance: t-test type 2

Unequal variance: t-test type 3

Nullhypothesis of the t-test: there is no significant difference.

Degree of freedom: count both datasets (different groups) and subtract 2.

P value: T.TEST (array1, array2,2,2)

Array 1: height of men

Array 2: height of women

Tails: 2

Type: 2

Appropriate statistical parameter is T value: T.INV.2T (probability, degree of freedom)

Probability:

Degree of freedom: count both groups - 2

3

## What is the f-value and when do you need it?

###
You only need the f-value when you have two different samples.

F-value: variance/variance. The biggest value is always on top, f-value is never below 0. If above 5%, variances are equal. Below, it's unequal.

Variance in excel: Var.S

F.test gives p-value

P-value of two-sided f-test: F.dist.2t

4

## How do you use frequency function?

###
Create intervals for your data.

Create “bins” for your data. The bins are the last, or highest, number in the intervals.

Example:

Intervals: 0-10, 11-20, 21-30, 31-40

Bins: 10, 20, 30, 40

Select empty boxes for the frequencies, remember to add an extra box for at the bottom of the colon.

Press =FREQUENCY(array;bins). Select you data as the array and your bins for bins. Do not directly press enter, but press instead ctrl + shift + enter.

Create a histogram with your frequency, and change the x-axis to the intervals. Under chart design, press “select data” and select the intervals.

5

## Relative frequency

### The frequency in that “class” or category divided by total number of elements

6

## Frequency density

### Frequency divided by the width of the “class” or interval.

7

## Relative frequency density

### Frequency density divided by total number of elements

8

## Cumulative distribution

### Sum the frequencies as you go down the colons.

9

## Integral discriminator

### Total element number minus cumulative distribution

10

## Percentile

###
Function: percentile.inc

Array: data you want to find the percentile of

K= how many percent in decimals. (10% = 0,1)

11

## Correlation (Pearsons) test

###
Used for data with NORMAL DISTRIBUTION

- usually the work correlation is mentioned in question.¨

1 group with 2 variables. Example: is there a correlation between blood potassium conc. and the length of musculus palmaris longus?

Degree of freedom: n - 2

Remember to only count one dataset if the same group is tested.

Correlation coefficient (r): CORREL function. Not important which dataset is in which array. It is always between 1 and 0. If r = 0, there is no correlation.

Determination coefficient (r^2): tells how strong the relationship between the

data is.

Appropriate statistical parameter: t-value = (r* root square (degree of freedom)) / (1- r^2)

P-value: T.dist.2t (t-value, degree of freedom)

First column: x, second column: y

Slope: SLOPE (x,y)

Intercept: INTERCEPT(x,y) (intercept is where the y axis is intercepted?)

y: a x + b

a: slope

b: intercept

If 5 is given as x;

y: a * x + b

If 5 is y;

x: (y - b) / a

12

## Wilcoxon signed rank test

###
The dataset has NON-NORMAL DISTRIBUTION

1 sample and 1 variable. Example: weight before and after diet

Degree of freedom: count - 1

Mean: =AVERAGE

Standard deviation: =STDEV.S

Standard error: ST DEV/ square root (count)

T-value: (mean/ standard error)

P-value: T.DIST (t-value, degree of freedom)

Wilcoxon value: wilcoxon function (array 1, array 2, 2, 0)

TYPE 0

Array 1: weight before diet

Array 2: weight after diet

Tail (?): 2

Type: 0

13

## Mann-whitney U- test

###
Used for dataset that has a NON-NORMAL DISTRIBUTION

When we have two independent samples. Example: change in headache in subjects given aspirin and subjects given a placebo pill.

Function: Mann-whitney function: (array 1, array 2, 2)

TYPE 2

14

## Chi- square test

###
For datasets that has NON-NORMAL DISTRIBUTION

For independency:

- two of more groups. e.g. 'frequency of pulmonary cancer amongst smokers and non-smokers'. (No dependency)

For homogenity:

- two or more groups:

Tests:

- frequency data

- probability chart

- observed frequency and expected frequency. e.g, 'frequency of wearing glasses among girls and boys'.

- one group (known distribution), frequency data, testing how well the observed frequencies fit the expected frequencies. e.g, 'testing the normality of frog red blood cells'.

Degree of freedom: (number of rows - 1 )* (number of columns - 1)

Chi-square test: CHISQ.TEST (actual range; expected range)

Actual range: INNER four squares of measured.

Expected range: the calculated expected range of the measured.

Appropriate statistical parameter: Chi-square value: CHISQ.INV.RT (probability; degree of freedom)

Probability: chi-square test

15

## ANOVA

###
Analysis of variance.

Three or more independent group, separated from each other. One variable has normal distribution.

16

## Kruskall- wallis test

###
3 or more groups

non-parametric (non-normal distribution)

Ranked data without separating the groups

17

## Diagnostic tests

###
Example used: sick/healthy people.

Sensitivity - how many are true positive

Specificity - how many are true negative

Prevalence - how many are negative (all the sick true positive and false negative)

Start by making a chart with total set to 2000. Divide 2000 between sick and healthy people. If prevalence is 0,5, 1000 sick and 1000 healthy.

Sensitivity is how many true positive there are. Divide the amount of sick people by 100 and multiply that number with the percentage of sensitivity.

Specificity is how many true negative there are. Divide amount of healthy people by 100 and multiply that number with the percentage of specificity.

Positive predicted value (relevance): True positive / all positive

Negative predicted value (segregation): True negative / all negative

A-prior-probability: all bacterial / total

Correct classification rate/efficiency: all negative / total

18