Probability and Statistics Basics Flashcards

(18 cards)

1
Q

Prob: What are the two equivalent definitions of events A and B being independent?

A

P(A,B) = P(A)P(B)

OR

P(A) = P(A | B=b) for all values of b

(Pretty darn sure second is correct)
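A tiny sanity check of the first definition (my own toy example, not from the deck): for two fair dice, the events A = "first die is even" and B = "second die shows 6" are independent, so P(A,B) should equal P(A)P(B).

```python
from fractions import Fraction

# Enumerate the 36 equally likely outcomes of rolling two fair dice.
outcomes = [(i, j) for i in range(1, 7) for j in range(1, 7)]

A = {o for o in outcomes if o[0] % 2 == 0}   # first die even
B = {o for o in outcomes if o[1] == 6}       # second die shows 6

def p(event):
    return Fraction(len(event), len(outcomes))

# P(A, B) == P(A) * P(B) for independent events.
print(p(A & B) == p(A) * p(B))  # → True
```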

2
Q

Prob: Conceptually, what does it mean for A and B to be independent events?

A

A and B are independent events if knowing whether one event happened or not gives you no information on whether the other happened.

3
Q

Prob: What is Bayes’ Theorem?

A

P(A | B) = P(B | A) · P(A) / P(B)
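Bayes' theorem, P(A|B) = P(B|A)·P(A)/P(B), can be sanity-checked numerically; the probabilities below are made-up values for illustration.

```python
# Hypothetical numbers: a test with 99% sensitivity and a 5% false
# positive rate, for a condition with 1% prevalence.
p_a = 0.01                      # P(A): has condition
p_b_given_a = 0.99              # P(B | A): positive test given condition
p_b_given_not_a = 0.05          # P(B | not A): false positive rate

# Law of total probability: P(B) = P(B|A)P(A) + P(B|~A)P(~A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes: P(A | B) = P(B | A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))  # → 0.167
```

Even with a sensitive test, the posterior probability is only about 1/6 because the condition is rare.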
4
Q

Prob: What is the formula for the expected value of discrete RV X?

A

E[X] = Σ x·P(X = x), summing over all possible values x of X
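The formula E[X] = Σ x·P(X = x) is easy to check on a fair six-sided die (a standard example, not from the deck), whose expected value is 3.5:

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# E[X] = sum over x of x * P(X = x)
expected = sum(x * p for x, p in pmf.items())
print(expected)  # → 7/2
```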
5
Q
A
6
Q

Stat: What is the formula for MSE(θ_hat), or Mean Squared Error?

A

MSE(θ_hat) = E[(θ_hat − θ)²]

= V(θ_hat) + bias(θ_hat)²

but you really only need that first part
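The decomposition MSE = V(θ_hat) + bias(θ_hat)² can be verified empirically. A sketch using a deliberately biased estimator (0.9 × sample mean) of a known mean; the setup is my own toy example.

```python
import random

random.seed(0)
theta = 5.0          # true mean we are estimating
n, trials = 20, 20000

# A deliberately biased estimator: shrink the sample mean toward 0.
estimates = []
for _ in range(trials):
    sample = [random.gauss(theta, 2.0) for _ in range(n)]
    estimates.append(0.9 * sum(sample) / n)

mean_est = sum(estimates) / trials
mse = sum((e - theta) ** 2 for e in estimates) / trials
var = sum((e - mean_est) ** 2 for e in estimates) / trials
bias = mean_est - theta

# MSE equals variance + bias^2 (exactly, for these empirical moments).
print(abs(mse - (var + bias ** 2)) < 1e-9)  # → True
```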

7
Q

Stat: Conceptually, what does the likelihood of a distribution's parameters describe?
A

It describes how likely a distribution with those parameters was to produce that dataset.

(I think it is often talked about in the context of a specific family of distributions. So we might ask: what is the likelihood of a normal distribution with these parameters, given this dataset?)

8
Q
A
9
Q

Stat: When we find a Maximum Likelihood Estimator, Min-Var Unbiased Estimator, Method of Moments Estimator, or something similar, do we typically find it in the context of some assumed distribution family (i.e. assume the distribution is normal, exponential, etc), or estimate parameters without a suspected distribution?

A

While we sometimes estimate parameters without a suspected distribution, such as a distribution's mean and variance, we more often work within an assumed distribution family.

(This is mostly my opinion, and also me wanting to remember that when we, for example, "find the MLE", it generally has quite a bit of structure, due to an assumed distribution, that we can differentiate/optimize.)

10
Q

Stat: What is a Maximum Likelihood Estimator?

A

It is the estimator θ_hat of θ that maximizes the likelihood of your data.

So, generally for some assumed distribution family such as Exponential Distributions, you try to find an estimator lambda_hat for parameter lambda that leads to the exponential distribution that was most likely to produce this data.

11
Q

Stat: At a high level, how do you find the MLE estimate for the parameters?

A

Differentiate the likelihood (or, more often, the log-likelihood, which is easier to work with) with respect to the parameters, set the derivative equal to zero, and solve.
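As a concrete sketch (my own example, assuming an Exponential family): for data x1,…,xn the log-likelihood is n·log(λ) − λ·Σxi, and setting its derivative to zero gives λ_hat = n/Σxi = 1/(sample mean). The code checks that this closed form really is a maximum.

```python
import math

# Toy data assumed to come from an Exponential(lambda) distribution.
data = [0.3, 1.2, 0.7, 2.5, 0.4, 1.1]
n = len(data)

def log_likelihood(lam):
    # log L(lambda) = n*log(lambda) - lambda * sum(x_i)
    return n * math.log(lam) - lam * sum(data)

# Setting d/d(lambda) [log L] = n/lambda - sum(x_i) = 0 gives:
lam_hat = n / sum(data)  # i.e. 1 / sample mean

# The closed-form solution should beat nearby candidate values.
assert all(log_likelihood(lam_hat) >= log_likelihood(lam)
           for lam in [0.5 * lam_hat, 0.9 * lam_hat, 1.1 * lam_hat, 2 * lam_hat])
print(round(lam_hat, 3))  # → 0.968
```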

12
Q

Stat: Given observations X1,…,Xn, what is the maximum likelihood estimator for a population proportion: for example, the proportion of red balls if we’re drawing from red, green or blue?

A

(number of reds observed)/n
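A minimal sketch of this (the sequence of draws is made up): the MLE of the proportion of reds is just the observed fraction of reds.

```python
draws = ["red", "green", "red", "blue", "red", "green", "blue", "red"]

# MLE of the proportion of reds: (number of reds observed) / n
p_hat = draws.count("red") / len(draws)
print(p_hat)  # → 0.5
```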

13
Q

Stat: What is the key feature of classical, or frequentist, statistics? And what are some types of analytical tools used in this statistical philosophy?

A

In classical/frequentist statistics, the parameter θ is constant. We examine it using estimators θ_hat, we quantify our uncertainty about its value using confidence intervals, and we test theories using hypothesis tests and p-values.

14
Q

Stat: What is the key feature of Bayesian statistics? And what are some types of analytical tools used in this statistical philosophy?

A

The parameter θ is viewed as a random variable, and we quantify our beliefs about its potential values using a probability distribution π.

15
Q

Stats: In Bayesian statistics, how do we update π, our prior distribution of θ, using data X_i?

A

You incorporate it using a method that looks very similar to Bayes' law.

Specifically:

π(θ | x1,…,xn) = f(x1,…,xn | θ) · π(θ) / ∫ f(x1,…,xn | θ) · π(θ) dθ
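As a concrete sketch of this update, assuming the standard Beta–Binomial conjugate pair (my own example, not from the deck): with a Beta(a, b) prior on a coin's heads probability and h heads in n flips, the posterior works out to Beta(a + h, b + n − h).

```python
# Prior: Beta(a, b) on theta, the probability of heads.
a, b = 2, 2            # a mild prior belief that theta is near 0.5

# Data: h heads in n flips (Binomial likelihood).
n, h = 10, 7

# Conjugate update: posterior is Beta(a + h, b + n - h).
a_post, b_post = a + h, b + (n - h)

posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, round(posterior_mean, 3))  # → 9 5 0.643
```

The conjugate pair lets the integral in the denominator be skipped entirely, since the posterior's family is known in advance.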

16
Q

Stats: In Bayesian statistics, what happens to the prior distribution as we get more and more data?

A

With enough data, the impact of the prior distribution on the posterior distribution tends towards 0.
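This can be seen directly in a Beta–Binomial setting (a toy illustration of mine): as n grows with a fixed 30% heads rate, the posterior mean approaches 0.3 even under a fairly strong prior centered at 0.5.

```python
# A fairly strong prior centered at 0.5: Beta(10, 10).
a, b = 10, 10

# Observe 30% heads at increasing sample sizes.
for n in [10, 100, 10000]:
    h = int(0.3 * n)
    posterior_mean = (a + h) / (a + b + n)
    print(n, round(posterior_mean, 3))  # posterior mean heads toward 0.3
```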

17
Q

ML: When would you use MAP estimation of a parameter instead of MLE estimation?

A

MAP allows you to include some prior distribution of the parameter before factoring in the data. So, if you have suspicions about the value of a parameter, perhaps because of domain knowledge, then MAP could be better, as it will allow you to incorporate that knowledge.
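A sketch of the difference, again assuming a Beta prior on a coin's heads probability (my own example): with little data, the MAP estimate is pulled toward the prior, while the MLE uses the data alone.

```python
# Domain knowledge says the coin is probably close to fair: Beta(5, 5) prior.
a, b = 5, 5

# Small dataset: 4 heads in 5 flips.
n, h = 5, 4

mle = h / n                                   # data only
map_est = (a + h - 1) / (a + b + n - 2)       # mode of the Beta posterior

print(mle, round(map_est, 3))  # → 0.8 0.615
```

The MAP estimate (≈0.615) sits between the MLE (0.8) and the prior's center (0.5), reflecting both the data and the domain knowledge.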