Correlation Flashcards Preview

Flashcards in Correlation Deck (14)
1
Q

Does the steepness of the line of best fit change with the correlation?

A

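Not necessarily. The correlation measures how tightly the points cluster around the line of best fit, not how steep the line is. The slope equals the correlation only when both variables are standardised; otherwise it also depends on the units and spread of each variable.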

2
Q

What are the two halves of the correlation formula?

A

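The top half is the covariance with one step missing: the sum of the cross-product deviations has not yet been divided by N − 1.

The bottom half gives the maximum possible covariance, i.e. the largest value the top half could take for these data.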
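
The two halves can be sketched in a few lines of Python. This is only an illustration, assuming the usual deviation-score form of Pearson's r; the helper name `pearson_r` and the data (`hours`, `marks`) are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviation scores (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Top half: sum of cross-product deviations.
    # Dividing this by (n - 1) would give the covariance.
    top = sum(dx * dy for dx, dy in zip(dev_x, dev_y))

    # Bottom half: the largest value the top half could reach
    # given the spread of each variable.
    bottom = math.sqrt(sum(dx ** 2 for dx in dev_x) *
                       sum(dy ** 2 for dy in dev_y))

    return top / bottom

# Made-up data: hours revised and marks obtained.
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

r = pearson_r(hours, marks)
print(round(r, 3))  # 0.775
```

Dividing both halves by N − 1 would cancel out, which is why the computational formula can skip that step.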

3
Q

Why is correlation better than covariance?

A

Because it is standardised, it can be compared directly with results from other tests and data sets, whereas covariance depends on the units of measurement.

4
Q

What is the most famous version of the correlation test?

A

Pearson’s Correlation Coefficient

5
Q

How is the coefficient written?

A

r =

6
Q

Why should we always make a scatter plot?

A

Because a small coefficient suggests a weak relationship, yet a scatter plot may reveal a clear relationship that simply isn't a straight line: it could be curved or bell-shaped (a curvilinear relationship).
A small coefficient can also be caused by an outlier that skews the line of best fit and so distorts the coefficient.
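
Both situations can be reproduced with made-up numbers. The sketch below assumes a hand-written `pearson_r` helper (hypothetical, not from any library) and invented data; it only computes the coefficients, which is exactly why the scatter plot is still needed to see what is going on.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviation scores (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    top = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    bottom = math.sqrt(sum((x - mean_x) ** 2 for x in xs) *
                       sum((y - mean_y) ** 2 for y in ys))
    return top / bottom

# Case 1: a perfect curved (U-shaped) relationship, y = x squared.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]
r_curve = pearson_r(curve_x, curve_y)

# Case 2: a perfect straight line, then the same line plus one outlier.
line_x = [1, 2, 3, 4, 5]
line_y = [1, 2, 3, 4, 5]
r_line = pearson_r(line_x, line_y)
r_outlier = pearson_r(line_x + [6], line_y + [0])

print(abs(round(r_curve, 3)))  # 0.0: r sees no linear relationship
print(round(r_line, 3))        # 1.0
print(round(r_outlier, 3))     # 0.143: one point drags r down
```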

7
Q

What can we do with outliers?

A

Remove them, when there's a good reason.

8
Q

What is the Coefficient of Determination?

A

This shows how much variance the two variables share, i.e. how much of the change in one variable is predicted by change in the other. It is written as r² =
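
As a rough check on what "shared variance" means, the sketch below squares r and compares it with the proportion of the variance in y that a least-squares line of best fit accounts for. The data and variable names are invented for the illustration.

```python
import math

# Made-up data: hours revised (xs) and marks obtained (ys).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products of the deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

r = sxy / math.sqrt(sxx * syy)
r_squared = r ** 2

# Least-squares line of best fit and the values it predicts.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in xs]

# Proportion of the variation in y reproduced by the line.
explained = sum((p - mean_y) ** 2 for p in predicted) / syy

print(round(r_squared, 3))  # 0.6
print(round(explained, 3))  # 0.6
```

With these numbers, 60% of the variance in one variable is shared with the other.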

9
Q

What is Spearman’s Rho?

A

Spearman’s rank correlation coefficient.
Suitable for continuous, discrete and ordinal data.
Non-parametric: it makes no assumption of normality.
Based on rank scores.

10
Q

What are statistical tests separated into?

A

Parametric and non-parametric.

11
Q

What is parametric?

A

Making assumptions about the qualities of your data.
Examples:
The data need to be interval or ratio, not nominal, because categorical data doesn’t work in parametric tests.
The assumption of homogeneity of variance: the variances of the two variables shouldn’t be too different.

12
Q

How does Spearman’s Rho work?

A

First, the raw data for each variable have to be put into rank order: rewrite the data in ascending order and give each value a rank score, with the smallest value ranked 1.
Then the ranks are entered into exactly the same computational formula used for Pearson’s correlation.
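
The two steps can be sketched as follows. This is a minimal illustration that assumes there are no tied scores; the helper names (`rank_scores`, `pearson_r`, `spearman_rho`) and the data are hypothetical.

```python
import math

def rank_scores(values):
    """Rank from 1 (smallest) upwards. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def pearson_r(xs, ys):
    """Pearson's r from deviation scores (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    top = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    bottom = math.sqrt(sum((x - mean_x) ** 2 for x in xs) *
                       sum((y - mean_y) ** 2 for y in ys))
    return top / bottom

def spearman_rho(xs, ys):
    """Step 1: rank each variable. Step 2: correlate the ranks."""
    return pearson_r(rank_scores(xs), rank_scores(ys))

# Made-up data: y always rises with x, but not in a straight line.
dose = [10, 20, 30, 40, 50]
response = [1, 4, 9, 16, 100]

print(rank_scores(response))                   # [1, 2, 3, 4, 5]
print(round(spearman_rho(dose, response), 3))  # 1.0
print(round(pearson_r(dose, response), 2))     # 0.8
```

Because only the order of the scores matters, the extreme value of 100 does not pull rho down the way it pulls down Pearson's r on the raw data.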

13
Q

When can Spearman’s Rho be difficult, and how is it fixed?

A

With tied ranks, when two or more individuals have the same score.
First you put the data in order and assign ranks as usual. When you reach the tied scores, work out which ranks they would have occupied had they not been tied, add those ranks up and average them. Each tied score is then given that average as its rank.
Example: 2 tied scores lie over ranks 3 and 4.
Average the ranks: (3 + 4) / 2 = 3.5
3.5 becomes the rank for both scores.
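
The averaging rule can be sketched in Python. The function name `average_ranks` and the scores are invented for illustration; the example reproduces the case of two scores tied over ranks 3 and 4.

```python
def average_ranks(values):
    """Rank from 1 (smallest) upwards, giving tied values the
    average of the ranks they would otherwise have occupied."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1           # lowest rank the value covers
        last = first + ordered.count(v) - 1    # highest rank the value covers
        # The covered ranks are consecutive, so their average
        # is the midpoint of the first and last.
        ranks.append((first + last) / 2)
    return ranks

# Made-up scores: the two 18s would have taken ranks 3 and 4.
scores = [12, 15, 18, 18, 20]
print(average_ranks(scores))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```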

14
Q

What is the correlation?

A

It is a calculation that gives us a standardised value describing the relationship between two variables. The coefficient ranges from −1 to +1: its size (between 0 and 1) shows the strength of the relationship and its sign shows the direction, either positive or negative. The closer the size is to 1, the stronger the relationship.