Core Probability & Random Variables Flashcards by O Cam

What is a sample space in probability theory?

The set of all possible basic outcomes of a random experiment.

What is an event in probability?

Any subset of the sample space, representing a collection of outcomes we care about.

How is the probability of an event informally interpreted?

As a number between 0 and 1 representing how likely the event is to occur, with 0 impossible and 1 certain.

What are the three basic properties any probability measure must satisfy?

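Non-negativity (P(A) ≥ 0 for every event A), normalization (P(sample space) = 1), and additivity for mutually exclusive events (P(A ∪ B) = P(A) + P(B) when A and B cannot both occur). The formal axioms extend additivity to countable collections of pairwise disjoint events.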

What is the complement of an event A?

The event consisting of all outcomes in the sample space that are not in A.

How is the probability of the complement of A related to P(A)?

P(Aᶜ) = 1 − P(A).

What does it mean for two events A and B to be mutually exclusive (disjoint)?

They cannot happen at the same time, so their intersection is empty and P(A ∩ B)=0.

What is the general addition rule for probabilities of two events A and B?

P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
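
The addition rule can be checked by direct counting on a small sample space. The sketch below assumes a fair six-sided die and two illustrative events; the helper name `prob` is made up for this example.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative choice).
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a subset of omega) when outcomes are equally likely."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is at least 4

lhs = prob(A | B)                       # P(A ∪ B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # P(A) + P(B) − P(A ∩ B)
print(lhs, rhs)                         # both are 2/3
```

Subtracting P(A ∩ B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.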

What is conditional probability P(A|B)?

The probability that event A occurs given that event B has occurred, defined as P(A ∩ B) / P(B) when P(B) > 0.

How can you interpret P(A|B) intuitively?

It is the probability of A when we restrict our attention to the subset of outcomes where B has happened.
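
A small sketch may make the "restricted sample space" reading concrete. It assumes a fair die and two illustrative events, and compares the ratio definition of P(A|B) with counting inside B only; `prob` is a hypothetical helper.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (illustrative)
A = {2, 4, 6}                # roll is even
B = {4, 5, 6}                # roll is at least 4

def prob(event, space):
    """P(event) when every outcome in `space` is equally likely."""
    return Fraction(len(event & space), len(space))

# Definition: P(A|B) = P(A ∩ B) / P(B), valid because P(B) > 0.
by_definition = prob(A & B, omega) / prob(B, omega)

# Intuition: treat B as the new sample space and ask how much of it lies in A.
by_restriction = prob(A, B)

print(by_definition, by_restriction)   # both are 2/3
```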

What is the multiplication rule relating joint and conditional probability?

P(A ∩ B) = P(A|B) · P(B) = P(B|A) · P(A).

What does it mean for two events A and B to be independent?

Knowing that one occurs gives no information about the other. Formally, P(A ∩ B) = P(A)P(B); equivalently, P(A|B) = P(A) whenever P(B) > 0.

Can events be mutually exclusive and independent at the same time (with nonzero probabilities)?

No; if they are mutually exclusive and both have nonzero probability, then P(A ∩ B) = 0 while P(A)P(B) > 0, so the product rule P(A ∩ B) = P(A)P(B) cannot hold.
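
The contrast between independence and mutual exclusivity can be checked on a toy example. The sketch below assumes a fair die; the events and the helper `independent` are illustrative, not part of any library.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}   # one roll of a fair die (illustrative)

def prob(event):
    return Fraction(len(event), len(omega))

def independent(e1, e2):
    """Check the product rule P(e1 ∩ e2) == P(e1) * P(e2)."""
    return prob(e1 & e2) == prob(e1) * prob(e2)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
odd = {1, 3, 5}

# Overlapping events can be independent: 1/3 == 1/2 * 2/3.
print(independent(even, at_most_four))   # True

# Disjoint events with nonzero probability are not: 0 != 1/2 * 1/2.
print(independent(even, odd))            # False
```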

What is Bayes’ theorem for two events A and B?

P(A|B) = P(B|A)P(A) / P(B), assuming P(B) > 0.

Why is Bayes’ theorem important in ML contexts?

It provides a way to invert conditional probabilities and update beliefs about hypotheses given observed evidence.

What is the law of total probability?

If {B₁,…,Bₙ} is a partition of the sample space with each P(Bᵢ) > 0, then P(A) = Σᵢ P(A|Bᵢ)P(Bᵢ).
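
The law of total probability usually supplies the denominator in Bayes' theorem. The sketch below works through a hypothetical diagnostic test; the prevalence and test accuracy are made-up numbers chosen only for illustration.

```python
from fractions import Fraction

# Partition of the sample space: a person is either "disease" or "healthy".
# All numbers below are invented for illustration.
prior = {"disease": Fraction(1, 100), "healthy": Fraction(99, 100)}

# P(positive test | B_i) for each block B_i of the partition.
likelihood = {"disease": Fraction(95, 100), "healthy": Fraction(5, 100)}

# Law of total probability: P(positive) = Σ P(positive | B_i) P(B_i).
p_positive = sum(likelihood[b] * prior[b] for b in prior)

# Bayes' theorem: P(disease | positive) = P(positive | disease) P(disease) / P(positive).
posterior = likelihood["disease"] * prior["disease"] / p_positive

print(p_positive)         # 59/1000
print(float(posterior))   # roughly 0.16
```

Even with an accurate test, the posterior stays well below the test's sensitivity because the prior probability of disease is small.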

What is a random variable?

A function that maps outcomes in the sample space to numeric values, allowing us to analyze numerical aspects of randomness.

What is the difference between a discrete and a continuous random variable?

A discrete random variable takes values in a countable set; a continuous random variable takes values in an interval or continuum of real numbers.

What is a probability mass function (pmf)?

A function p(x) that gives P(X = x) for a discrete random variable X.

What two key properties must a pmf satisfy?

p(x) ≥ 0 for all x, and the sum over all possible x of p(x) equals 1.

What is a probability density function (pdf)?

A nonnegative function f(x) such that the probability that a continuous random variable X lies in an interval is given by the integral of f over that interval.

Why does it not make sense to ask for P(X = x) for a continuous random variable with a pdf?

When X has a pdf, P(X = x) is exactly 0 for every x, because the integral of f over a single point is zero. Probabilities are attached to intervals, not points, and f(x) itself is a density rather than a probability (it can exceed 1).

What is a cumulative distribution function (cdf)?

A function F(x) = P(X ≤ x) that gives the probability that the random variable X is less than or equal to x.

How is the cdf of a discrete random variable related to its pmf?

F(x) is the sum of p(t) over all t ≤ x.
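
As a minimal sketch, a pmf can be stored as a dictionary and the cdf built by summing it. The pmf below (number of heads in two fair coin flips) is an illustrative choice.

```python
from fractions import Fraction

# pmf of X = number of heads in two fair coin flips (illustrative choice).
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# The two defining properties of a pmf.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

def cdf(x):
    """F(x) = P(X <= x): sum p(t) over all support points t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

for x in (-1, 0, 1, 1.5, 2):
    print(x, cdf(x))
```

The cdf is a step function: it is flat between support points (F(1.5) equals F(1)) and jumps by p(t) at each t.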

How is the cdf of a continuous random variable related to its pdf?

F(x) is the integral of f(t) from −∞ up to x.
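
The integral relationship can be checked numerically. The sketch below assumes an exponential density with rate 1, whose cdf is known in closed form as 1 − e^(−x), and uses a simple midpoint rule; `integrate` is a hypothetical helper, not a library function.

```python
import math

def pdf(t):
    """Density of an exponential distribution with rate 1 (illustrative choice)."""
    return math.exp(-t) if t >= 0 else 0.0

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def cdf(x):
    # The density is zero below 0, so integrating from 0 matches integrating from -infinity.
    return integrate(pdf, 0.0, x) if x > 0 else 0.0

print(cdf(2.0), 1 - math.exp(-2.0))   # numerical value vs. closed form
print(integrate(pdf, 1.0, 1.0))       # a zero-width interval carries probability 0.0
```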

What does it mean for two random variables X and Y to be identically distributed?

They share the same probability distribution (same cdf, and therefore the same pmf or pdf), even if they are unrelated or never observed together.

What does it mean for random variables X₁,…,Xₙ to be independent?

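For any selection of values, the joint probability factorizes into the product of the individual probabilities, e.g., P(X₁=x₁,…,Xₙ=xₙ) = ∏ P(Xᵢ=xᵢ) in the discrete case; for continuous variables the joint pdf factorizes into the product of the marginal pdfs.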

What is a joint distribution of two random variables X and Y?

A function (pmf or pdf) that gives probabilities (or densities) for all pairs (x,y) together, not just individually.

What is a marginal distribution?

The distribution of one variable obtained from a joint distribution by summing or integrating over the other variable(s).

How do you obtain the marginal pmf of X from a joint pmf of X and Y?

Sum the joint pmf over all possible values of Y: p_X(x) = Σ_y p_{X,Y}(x,y).
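
Marginalization is a sum over one coordinate of the joint table. The sketch below uses a made-up joint pmf on {0, 1} × {0, 1}; the function name `marginal` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Illustrative joint pmf p_{X,Y}(x, y), keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

def marginal(joint_pmf, axis):
    """Sum the joint pmf over the other variable; axis=0 gives p_X, axis=1 gives p_Y."""
    out = defaultdict(Fraction)   # Fraction() is 0
    for pair, p in joint_pmf.items():
        out[pair[axis]] += p
    return dict(out)

p_X = marginal(joint, 0)
p_Y = marginal(joint, 1)
print(p_X)   # X: 0 -> 1/2, 1 -> 1/2
print(p_Y)   # Y: 0 -> 3/8, 1 -> 5/8
```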

How do you obtain the marginal pdf of X from a joint pdf of X and Y?

Integrate the joint pdf over all values of Y: f_X(x) = ∫ f_{X,Y}(x,y) dy.

What is the conditional distribution of Y given X=x?

A distribution with pmf or pdf proportional to the joint distribution at X=x, normalized so probabilities sum or integrate to 1.

How is conditional pmf p(Y=y | X=x) related to the joint and marginal pmfs?

p(Y=y | X=x) = p_{X,Y}(x,y) / p_X(x), assuming p_X(x) > 0.

How is the concept of independence expressed in terms of joint and marginal distributions?

X and Y are independent if and only if their joint distribution factorizes for all x and y: p_{X,Y}(x,y) = p_X(x)p_Y(y) or f_{X,Y}(x,y) = f_X(x)f_Y(y).
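
Conditioning and the factorization test can both be read off a joint table. The sketch below uses a made-up joint pmf on {0, 1} × {0, 1}; all names are illustrative.

```python
from fractions import Fraction

# Illustrative joint pmf p_{X,Y}(x, y), keyed by (x, y).
joint = {
    (0, 0): Fraction(1, 4), (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8), (1, 1): Fraction(3, 8),
}

# Marginals obtained by summing out the other variable.
p_X = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_Y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

def conditional_y_given_x(x):
    """p(Y=y | X=x) = p_{X,Y}(x, y) / p_X(x), defined only when p_X(x) > 0."""
    return {y: joint[(x, y)] / p_X[x] for y in (0, 1)}

cond = conditional_y_given_x(1)
print(cond)   # Y given X=1: 0 -> 1/4, 1 -> 3/4 (sums to 1)

# Independence requires the joint to equal the product of marginals at every (x, y).
factorizes = all(joint[(x, y)] == p_X[x] * p_Y[y] for (x, y) in joint)
print(factorizes)   # False: p(0,0) = 1/4 but p_X(0) p_Y(0) = 3/16
```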

What is the expectation (mean) of a random variable at a high level?

The long-run average value of the variable if we could repeat the random experiment many times.

How is the expectation of a discrete random variable X with pmf p(x) defined?

E[X] = Σ x · p(x), summing over all possible values x.

How is the expectation of a continuous random variable X with pdf f(x) defined?

E[X] = ∫ x · f(x) dx, integrating over the support of X.

What does linearity of expectation mean?

For any random variables X and Y with finite expectations and constants a, b, E[aX + bY] = aE[X] + bE[Y], regardless of whether X and Y are independent.
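
Linearity can be verified exactly on a case where the variables are clearly dependent. The sketch below assumes a fair die with X the roll and Y = X²; the constants and the helper `expect` are illustrative.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]   # one roll of a fair die (illustrative)
p = Fraction(1, 6)

def expect(g):
    """E[g] for a function g of the outcome, by direct enumeration."""
    return sum(g(w) * p for w in outcomes)

X = lambda w: w        # the roll itself
Y = lambda w: w * w    # the square of the roll, fully determined by X

a, b = 2, 3
lhs = expect(lambda w: a * X(w) + b * Y(w))   # E[aX + bY]
rhs = a * expect(X) + b * expect(Y)           # aE[X] + bE[Y]
print(lhs, rhs)                               # both are 105/2
```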

What is the variance of a random variable at a high level?

A measure of how much the variable’s values spread around its mean.

How is variance Var(X) defined in terms of expectation?

Var(X) = E[(X − E[X])²].

What is the standard deviation of a random variable?

The square root of the variance, providing spread in the same units as the original variable.
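
The definitions of mean, variance, and standard deviation translate directly into sums over a pmf. The sketch below assumes a fair die and also checks the standard identity Var(X) = E[X²] − (E[X])².

```python
import math
from fractions import Fraction

# pmf of a fair six-sided die (illustrative choice).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())                     # E[X]
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())   # E[(X − E[X])²]
shortcut = sum(x * x * p for x, p in pmf.items()) - mean ** 2 # E[X²] − (E[X])²
std = math.sqrt(variance)                                     # same units as X

print(mean, variance, shortcut)   # 7/2 35/12 35/12
print(std)                        # about 1.71
```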

What is covariance between two random variables X and Y (intuitively)?

A measure of how X and Y vary together, positive if they tend to go up and down together, negative if they move in opposite directions.

Why is correlation often used instead of raw covariance?

Correlation divides the covariance by the product of the two standard deviations, which yields a unit-free value between −1 and 1 and makes the strength and direction of a linear relationship easier to interpret.
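
A short sketch shows why the normalization helps: rescaling a variable changes the covariance but not the correlation. It assumes a fair die for X and two linear transforms of it; the helper names are hypothetical.

```python
import math

# X is a fair die roll; each outcome has probability 1/6 (illustrative choice).
xs = [1, 2, 3, 4, 5, 6]
p = 1 / 6

def mean(values):
    return sum(v * p for v in values)

def cov(us, vs):
    """Covariance of two variables listed outcome by outcome."""
    mu_u, mu_v = mean(us), mean(vs)
    return sum((u - mu_u) * (v - mu_v) * p for u, v in zip(us, vs))

def corr(us, vs):
    """Covariance divided by the product of the standard deviations."""
    return cov(us, vs) / math.sqrt(cov(us, us) * cov(vs, vs))

ys = [2 * x + 1 for x in xs]     # Y = 2X + 1
zs = [1000 * y for y in ys]      # the same quantity in units 1000 times smaller

print(cov(xs, ys), cov(xs, zs))     # covariance scales with the units
print(corr(xs, ys), corr(xs, zs))   # correlation stays at (approximately) 1
```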

Why does zero covariance (or correlation) not necessarily imply independence?

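Covariance only captures linear association, so a purely nonlinear relationship can have zero covariance while the variables remain dependent. For example, if X is uniform on {−1, 0, 1} and Y = X², then Cov(X, Y) = 0 even though Y is completely determined by X.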
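
A standard counterexample can be verified by enumeration. The sketch below assumes X is uniform on {−1, 0, 1} and Y = X², then shows that the covariance is zero while the joint probability fails to factorize.

```python
from fractions import Fraction

xs = [-1, 0, 1]       # X is uniform on these three values (illustrative choice)
p = Fraction(1, 3)

E_X = sum(x * p for x in xs)              # 0
E_Y = sum(x * x * p for x in xs)          # E[X²] = 2/3
E_XY = sum(x * (x * x) * p for x in xs)   # E[X · X²] = E[X³] = 0

covariance = E_XY - E_X * E_Y
print(covariance)                         # 0

# Dependence check: compare P(X=1, Y=1) with P(X=1) P(Y=1).
p_joint = p                 # X=1 forces Y=1, so the joint probability is 1/3
p_x1 = p                    # P(X=1) = 1/3
p_y1 = 2 * p                # P(Y=1) = P(X=-1) + P(X=1) = 2/3
print(p_joint, p_x1 * p_y1) # 1/3 vs 2/9, so X and Y are not independent
```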

(44 cards)