Probability and Statistics Basics Flashcards

(18 cards)

1
Q

Prob: What are the two equivalent definitions of events A and B being independent?

A

P(A,B) = P(A)P(B)

OR

P(A) = P(A | B=b) for all values of b

(Pretty darn sure second is correct)
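A tiny sanity check of the first definition (my own toy example, not from the deck): for two fair dice, the events A = "first die is even" and B = "second die shows 6" are independent, so P(A,B) should equal P(A)P(B).

```python
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of rolling two fair dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

A = {o for o in outcomes if o[0] % 2 == 0}   # first die even
B = {o for o in outcomes if o[1] == 6}       # second die shows 6

def p(event):
    return Fraction(len(event), len(outcomes))

# P(A, B) == P(A) * P(B) for independent events.
print(p(A & B) == p(A) * p(B))  # → True
```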

2
Q

Prob: Conceptually, what does it mean for A and B to be independent events?

A

A and B are independent events if knowing whether one event happened or not gives you no information on whether the other happened.

3
Q

Prob: What is Bayes’ Theorem?

A

P(A | B) = P(B | A) · P(A) / P(B)
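Bayes' theorem, P(A|B) = P(B|A)·P(A)/P(B), can be sanity-checked numerically; the probabilities below are made-up values for illustration.

```python
# Hypothetical numbers: a test with 99% sensitivity and a 5% false
# positive rate, for a condition with 1% prevalence.
p_a = 0.01                      # P(A): has condition
p_b_given_a = 0.99              # P(B | A): positive test given condition
p_b_given_not_a = 0.05          # P(B | not A): false positive rate

# Law of total probability: P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes: P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # → 0.167
```

Even with a sensitive test, the posterior probability is only about 1/6 because the condition is rare.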
4
Q

Prob: What is the formula for the expected value of discrete RV X?

A

E[X] = Σ x·P(X = x), summing over all possible values x of X
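The formula E[X] = Σ x·P(X = x) is easy to check on a fair six-sided die (a standard example, not from the deck), whose expected value is 3.5:

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] = sum over x of x * P(X = x)
expected = sum(x * p for x, p in pmf.items())
print(expected)  # → 7/2
```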
5
Q
A
6
Q

Stat: What is the formula for MSE(θ_hat), or Mean Squared Error?

A

MSE(θ_hat) = E[(θ_hat − θ)²]

= V(θ_hat) + bias(θ_hat)²

but you really only need that first part
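The decomposition MSE = V(θ_hat) + bias(θ_hat)² can be verified empirically. A sketch using a deliberately biased estimator (0.9 × sample mean) of a known mean; the setup is my own toy example.

```python
import random

random.seed(0)
theta = 5.0          # true mean we are estimating
n, trials = 20, 20000

# A deliberately biased estimator: shrink the sample mean toward 0.
estimates = []
for _ in range(trials):
    sample = [random.gauss(theta, 2.0) for _ in range(n)]
    estimates.append(0.9 * sum(sample) / n)

mean_est = sum(estimates) / trials
mse = sum((e - theta) ** 2 for e in estimates) / trials
var = sum((e - mean_est) ** 2 for e in estimates) / trials
bias = mean_est - theta

# MSE equals variance + bias^2 (exactly, for these empirical moments).
print(abs(mse - (var + bias ** 2)) < 1e-9)  # → True
```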

7
Q

Stat: Conceptually, what does the likelihood of a distribution's parameters describe?
A

It describes how likely a distribution with those parameters was to produce that dataset.

(I think it is often talked about in the context of a specific family of distributions. So we might ask: what is the likelihood of a normal distribution with these parameters, given this dataset?)

8
Q
A
9
Q

Stat: When we find a Maximum Likelihood Estimator, Min-Var Unbiased Estimator, Method of Moments Estimator, or something similar, do we typically find it in the context of some assumed distribution family (i.e. assume the distribution is normal, exponential, etc), or estimate parameters without a suspected distribution?

A

While we sometimes estimate parameters without a suspected distribution, such as a distribution's mean and variance, we more often work within an assumed distribution family.

(This is mostly my opinion, and also me wanting to remember that when we, for example, "find the MLE", it generally has quite a bit of structure, due to an assumed distribution, that we can differentiate/optimize.)

10
Q

Stat: What is a Maximum Likelihood Estimator?

A

It is the estimator θ_hat of θ that maximizes the likelihood of your data.

So, generally for some assumed distribution family such as Exponential Distributions, you try to find an estimator lambda_hat for parameter lambda that leads to the exponential distribution that was most likely to produce this data.

11
Q

Stat: At a high level, how do you find the MLE estimate for the parameters?

A

Differentiate the likelihood (or, more often, the log-likelihood, which is easier to work with) with respect to the parameters, set the derivative equal to zero, and solve.
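As a concrete sketch (my own example, assuming an Exponential family): for data x1,…,xn the log-likelihood is n·log(λ) − λ·Σxi, and setting its derivative to zero gives λ_hat = n/Σxi = 1/(sample mean). The code checks that this closed form really is a maximum.

```python
import math

# Toy data assumed to come from an Exponential(lambda) distribution.
data = [0.3, 1.2, 0.7, 2.5, 0.4, 1.1]
n = len(data)

def log_likelihood(lam):
    # log L(lambda) = n*log(lambda) - lambda * sum(x_i)
    return n * math.log(lam) - lam * sum(data)

# Setting d/d(lambda) [log L] = n/lambda - sum(x_i) = 0 gives:
lam_hat = n / sum(data)  # i.e. 1 / sample mean

# The closed-form solution should beat nearby candidate values.
assert all(log_likelihood(lam_hat) >= log_likelihood(lam)
           for lam in [0.5 * lam_hat, 0.9 * lam_hat, 1.1 * lam_hat, 2 * lam_hat])
print(round(lam_hat, 3))  # → 0.968
```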

12
Q

Stat: Given observations X1,…,Xn, what is the maximum likelihood estimator for a population proportion: for example, the proportion of red balls if we’re drawing from red, green or blue?

A

(number of reds observed)/n
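A minimal sketch of this (the sequence of draws is made up): the MLE of the proportion of reds is just the observed fraction of reds.

```python
draws = ["red", "green", "red", "blue", "red", "green", "blue", "red"]

# MLE of the proportion of reds: (number of reds observed) / n
p_hat = draws.count("red") / len(draws)
print(p_hat)  # → 0.5
```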

13
Q

Stat: What is the key feature of classical, or frequentist, statistics? And what are some types of analytical tools used in this statistical philosophy?

A

In classical/frequentist statistics, the parameter θ is constant. We examine it using estimators θ_hat, we quantify our uncertainty about its value using confidence intervals, and we test theories using hypothesis tests and p-values.

14
Q

Stat: What is the key feature of Bayesian statistics? And what are some types of analytical tools used in this statistical philosophy?

A

The parameter θ is viewed as a random variable, and we quantify our beliefs about its potential values using a probability distribution π.

15
Q

Stats: In Bayesian statistics, how do we update π, our prior distribution of θ, using data X_i?

A

You incorporate it using a method that looks very similar to Bayes' law.

Specifically:

π(θ | x1,…,xn) = f(x1,…,xn | θ) · π(θ) / ∫ f(x1,…,xn | θ) · π(θ) dθ
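As a concrete sketch of this update, assuming the standard Beta–Binomial conjugate pair (my own example, not from the deck): with a Beta(a, b) prior on a coin's heads probability and h heads in n flips, the posterior works out to Beta(a + h, b + n − h).

```python
# Prior: Beta(a, b) on theta, the probability of heads.
a, b = 2, 2            # a mild prior belief that theta is near 0.5

# Data: h heads in n flips (Binomial likelihood).
n, h = 10, 7

# Conjugate update: posterior is Beta(a + h, b + n - h).
a_post, b_post = a + h, b + (n - h)

posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(posterior_mean, 3))  # → 9 5 0.643
```

The conjugate pair lets the integral in the denominator be skipped entirely, since the posterior's family is known in advance.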

16
Q

Stats: In Bayesian statistics, what happens to the prior distribution as we get more and more data?

A

With enough data, the impact of the prior distribution on the posterior distribution tends towards 0.
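This can be seen directly in a Beta–Binomial setting (a toy illustration of mine): as n grows with a fixed 30% heads rate, the posterior mean approaches 0.3 even under a fairly strong prior centered at 0.5.

```python
# A fairly strong prior centered at 0.5: Beta(10, 10).
a, b = 10, 10

# Observe 30% heads at increasing sample sizes.
for n in [10, 100, 10000]:
    h = int(0.3 * n)
    posterior_mean = (a + h) / (a + b + n)
    print(n, round(posterior_mean, 3))  # posterior mean heads toward 0.3
```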

17
Q

ML: When would you use MAP estimation of a parameter instead of MLE estimation?

A

MAP allows you to include some prior distribution of the parameter before factoring in the data. So, if you have suspicions about the value of a parameter, perhaps because of domain knowledge, then MAP could be better, as it will allow you to incorporate that knowledge.
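A sketch of the difference, again assuming a Beta prior on a coin's heads probability (my own example): with little data, the MAP estimate is pulled toward the prior, while the MLE uses the data alone.

```python
# Domain knowledge says the coin is probably close to fair: Beta(5, 5) prior.
a, b = 5, 5

# Small dataset: 4 heads in 5 flips.
n, h = 5, 4

mle = h / n                                   # data only
map_est = (a + h - 1) / (a + b + n - 2)       # mode of the Beta posterior

print(mle, round(map_est, 3))  # → 0.8 0.615
```

The MAP estimate (≈0.615) sits between the MLE (0.8) and the prior's center (0.5), reflecting both the data and the domain knowledge.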