regression analysis Flashcards Preview


Flashcards in regression analysis Deck (27)
1
Q

when did Legendre publish the method of least squares

A

1805

2
Q

when did Gauss publish the method of least squares

A

1809

3
Q

when did Sir Francis Galton, who coined the term ‘REGRESSION’, live

A

1822-1911 (his lifespan, not the date he coined the term)

4
Q

when did George Udny Yule, in whose work the joint distribution was assumed to be Gaussian, live

A

1871-1951

5
Q

when did Karl Pearson, in whose work the joint distribution was assumed to be Gaussian, live

A

1857-1936

6
Q

when did Sir Ronald Fisher, who weakened the assumption of Yule and Pearson, live

A

1890-1962

7
Q

what is the earliest form of regression

A

the method of least squares, published by

  • LEGENDRE 1805
  • GAUSS 1809

used to determine, from astronomical observations, the orbits of bodies around the Sun (mostly comets, later also minor planets)

8
Q

what did Gauss do in 1821

A

Gauss published a further development of the theory of least squares, including a
version of the Gauss–Markov theorem

9
Q

what did Francis Galton do in 1890

A

he coined the term “regression” to describe a biological phenomenon:

‘heights of descendants of tall ancestors tend to
regress down towards a normal average’

10
Q

when did Udny Yule and Karl Pearson extend Galton’s work

A

1897-1903

Galton’s work was later extended by Udny Yule and
Karl Pearson to a more general statistical context.
In the work of Yule and Pearson, the joint
distribution of the response and explanatory
variables is assumed to be Gaussian.

11
Q

when did R.A. Fisher weaken the assumption of Pearson and Yule

A

1922-1925

  • his assumption is similar to Gauss’s formulation of 1821
  • he assumed that ‘the conditional distribution of the response
    variable is Gaussian, but the joint distribution need not
    be’
12
Q

when did economists use electromechanical desk calculators to calculate regressions

A

1950s-1960s

13
Q

before what date could it take up to 24 hours to
receive the result from one regression

A

before 1970

14
Q

types of statistical modelling

A

deterministic and probabilistic models

15
Q

types of probabilistic models

A
  1. regression models
  2. correlation models
  3. other models
16
Q

types of regression models

A
  • simple: 1 explanatory variable
    • linear or non-linear
  • multiple: 2+ explanatory variables
    • linear or non-linear
17
Q

what is regression analysis

A

the nature and strength of the relationship between variables can be examined by regression and correlation analysis

regression:

assessment of the specific form of the relationship between variables in order to predict/estimate the value of one variable corresponding to a given value of another variable.

18
Q

7 steps of regression modelling

A
  1. Define the problem or question
  2. Specify model
  3. Collect data
  4. Do descriptive data analysis
  5. Estimate unknown parameters
  6. Evaluate model
  7. Use model for prediction
19
Q

simple vs multiple regression analysis

A

simple

  • 𝛽 is the change in Y per unit change in X
  • doesn’t take into account any variable other than the single independent variable

multiple

  • 𝛽𝑖 is the change in Y per unit change in Xi, with the other explanatory variables held constant
  • takes into account the effect of the other explanatory variables
  • 𝛽𝑖 is the net regression coefficient
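
A minimal sketch of the difference between the simple slope and the net regression coefficient. The data are hypothetical: y is built exactly as 1 + 2·x1 + 3·x2, and x1 and x2 are correlated. The helper names (`mean`, `cross`) are illustrative and not from the deck.

```python
# Hypothetical toy data: y is constructed exactly as 1 + 2*x1 + 3*x2,
# and x1, x2 are correlated with each other.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]


def mean(v):
    return sum(v) / len(v)


def cross(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Simple regression of y on x1 alone: slope = S1y / S11.
simple_slope = cross(x1, y) / cross(x1, x1)

# Multiple regression of y on x1 and x2: solve the 2x2 normal equations
# written in terms of deviations from the means.
s11, s22, s12 = cross(x1, x1), cross(x2, x2), cross(x1, x2)
s1y, s2y = cross(x1, y), cross(x2, y)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det  # net coefficient of x1
b2 = (s11 * s2y - s12 * s1y) / det  # net coefficient of x2

print(simple_slope, b1, b2)
```

With these data the simple slope for x1 comes out near 5, because it also absorbs the effect of the correlated x2, while the net coefficients come out near 2 and 3.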
20
Q

6 assumptions required for regression analysis

A
  1. CONTINUOUS VARIABLES: the two variables should be measured on an interval or ratio scale
  2. LINEARITY: the Y variable is linearly related to the X variable
  3. INDEPENDENCE OF ERRORS: the residual errors are independent of one another for all values of X
  4. NO SIGNIFICANT OUTLIERS: outliers can distort the analysis
  5. HOMOSCEDASTICITY: the variation around the line of regression is constant for all values of X (random errors have a constant variance)
  6. NORMALITY: the values of Y are normally distributed at each value of X
21
Q

what is the goal of regression analysis

A

to build a statistical model that can predict the values of a dependent (response) variable based on the values of the independent (explanatory) variable

22
Q

what is SIMPLE LINEAR REGRESSION

A

describes the linear relationship between a predictor/independent variable, plotted on the x-axis, and a response/dependent variable, plotted on the y-axis

23
Q

what is the simplest model of the relationship
between two interval-scaled attributes

A

a straight line

  • its slope shows whether there is an association between them.
  • thus an objective way to investigate an association between interval attributes is to draw a
    straight line through the center of the cloud of points and measure its slope.
24
Q

what if the slope is 0

A

the line is horizontal and we conclude that there is no association.

non-zero slope = association

25
Q

which 2 problems must be solved when drawing a straight line

A
  1. how to draw the straight line that best models the relationship between the attributes, and
  2. how to determine whether its slope is different from zero.
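
A minimal sketch of both problems, under stated assumptions. The data are hypothetical, the line is fitted by least squares, and the slope is checked with a t-test. The t-test is one common choice and is not named in the deck. The critical value 2.306 is the two-sided 5% value of the t distribution with 8 degrees of freedom.

```python
import math

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxx = sum((a - mx) ** 2 for a in x)
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))

# Problem 1: the best-fitting line (least squares).
slope = sxy / sxx
intercept = my - slope * mx

# Problem 2: is the slope different from zero?
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
sse = sum(r ** 2 for r in residuals)
se_slope = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
t_stat = slope / se_slope                  # t statistic with n - 2 df

T_CRIT = 2.306  # two-sided 5% critical value, t distribution, 8 df
slope_differs_from_zero = abs(t_stat) > T_CRIT

print(slope, intercept, t_stat, slope_differs_from_zero)
```

If |t| exceeds the critical value, a slope of zero (no association) is rejected at the 5% level.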
26
Q

what is the linear regression model

A

states that the relationship between the variables is a linear function
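
In symbols, the model is usually written as follows. The notation is an assumption, since the deck only names 𝛽 as the slope: here \beta_0 is the intercept, \beta_1 the slope, and \varepsilon_i the random error of observation i.

```latex
Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i , \qquad i = 1, \dots, n
```

The deterministic part \beta_0 + \beta_1 X_i is the straight line; \varepsilon_i is what makes the model probabilistic rather than deterministic.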

27
Q

what is the ORDINARY LEAST SQUARES METHOD

A
  • errors are squared because, even for the best fit where the differences between actual Y and predicted Y are minimal, positive differences offset negative ones
  • LEAST SQUARES minimizes the sum of the squared errors (deviations) around the line
  • used to determine the line of best fit
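
A minimal sketch of why the errors are squared, on hypothetical data. The raw residuals of the fitted line sum to zero, but so do those of a horizontal line through the mean of Y, so the raw sum cannot pick out the best line; the sum of squared errors can. The function names are illustrative.

```python
# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Least-squares estimates of slope and intercept.
slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
intercept = my - slope * mx


def residuals(b0, b1):
    """Actual Y minus predicted Y for the line b0 + b1 * x."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def sse(b0, b1):
    """Sum of squared errors around the line b0 + b1 * x."""
    return sum(r ** 2 for r in residuals(b0, b1))


raw_sum_fit = sum(residuals(intercept, slope))  # positives cancel negatives
raw_sum_flat = sum(residuals(my, 0.0))          # also cancels for a flat line
sse_fit = sse(intercept, slope)                 # smallest possible SSE
sse_flat = sse(my, 0.0)                         # larger SSE for the flat line

print(raw_sum_fit, raw_sum_flat, sse_fit, sse_flat)
```

Both raw sums are zero up to rounding, while the squared sums differ (about 2.4 for the fitted line against 6 for the flat line), which is why the squared criterion is the one minimized.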