What is a sample space in probability theory?
The set of all possible basic outcomes of a random experiment.
What is an event in probability?
Any subset of the sample space, representing a collection of outcomes we care about.
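As a rough illustration of the two definitions above, the following Python sketch enumerates a sample space and builds one event as a subset of it. The two-dice experiment, the equally-likely assumption, and the names sample_space, sum_is_seven, and prob are illustrative choices, not part of the definitions.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered pair of faces from two six-sided dice.
sample_space = set(product(range(1, 7), repeat=2))

# Event: the subset of outcomes whose faces sum to 7.
sum_is_seven = {outcome for outcome in sample_space if sum(outcome) == 7}

def prob(event):
    """Probability of an event under the assumed equally-likely model."""
    return Fraction(len(event), len(sample_space))

print(len(sample_space))   # 36
print(prob(sum_is_seven))  # 1/6
```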
How is the probability of an event informally interpreted?
As a number between 0 and 1 measuring how likely the event is to occur, where 0 means impossible and 1 means certain.
What are the three basic properties any probability measure must satisfy?
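Non-negativity (P(A) ≥ 0 for every event A), normalization (P(sample space) = 1), and additivity for mutually exclusive events (P(A ∪ B) = P(A) + P(B) when A and B cannot both occur). The formal Kolmogorov axioms require additivity for countable collections of pairwise disjoint events.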
What is the complement of an event A?
The event consisting of all outcomes in the sample space that are not in A.
How is the probability of the complement of A related to P(A)?
P(Aᶜ) = 1 − P(A).
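The complement rule can be checked by direct counting. This is a minimal sketch assuming one fair six-sided die; the event A and the helper prob are hypothetical names chosen for the example.

```python
from fractions import Fraction

# Assumed model: one fair six-sided die, all faces equally likely.
sample_space = set(range(1, 7))

def prob(event):
    """Probability of an event (a subset of sample_space)."""
    return Fraction(len(event), len(sample_space))

A = {5, 6}                        # event: the roll is greater than 4
A_complement = sample_space - A   # every outcome not in A

print(prob(A))             # 1/3
print(prob(A_complement))  # 2/3
```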
What does it mean for two events A and B to be mutually exclusive (disjoint)?
They cannot happen at the same time, so their intersection is empty and P(A ∩ B) = 0.
What is the general addition rule for probabilities of two events A and B?
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
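The subtraction of P(A ∩ B) corrects for outcomes counted twice. A small sketch, assuming one fair die and two overlapping events picked only for illustration, compares both sides of the rule.

```python
from fractions import Fraction

# Assumed model: one fair six-sided die.
sample_space = set(range(1, 7))

def prob(event):
    return Fraction(len(event), len(sample_space))

A = {2, 4, 6}   # event: the roll is even
B = {4, 5, 6}   # event: the roll is greater than 3

lhs = prob(A | B)                       # P(A ∪ B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)   # general addition rule

print(lhs, rhs)  # 2/3 2/3
```

Without the subtraction, the outcomes 4 and 6 would be counted once for A and again for B.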
What is conditional probability P(A|B)?
The probability that event A occurs given that event B has occurred, defined as P(A ∩ B) / P(B) when P(B) > 0.
How can you interpret P(A|B) intuitively?
It is the probability of A when we restrict our attention to the subset of outcomes where B has happened.
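Both readings of P(A|B), the ratio definition and the restriction to B, can be compared by counting. The sketch below assumes two fair dice; the events A and B and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumed model: two fair six-sided dice, 36 equally likely outcomes.
sample_space = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(sample_space))

A = {o for o in sample_space if sum(o) == 8}   # event: the sum is 8
B = {o for o in sample_space if o[0] == 6}     # event: first die shows 6

# Definition: P(A|B) = P(A ∩ B) / P(B), valid because P(B) > 0.
p_a_given_b = prob(A & B) / prob(B)

# Restriction view: treat B as the new sample space and count A inside it.
p_by_restriction = Fraction(len(A & B), len(B))

print(p_a_given_b, p_by_restriction)  # 1/6 1/6
print(prob(A))                        # 5/36
```

Here conditioning on B raises the probability of A from 5/36 to 1/6.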
What is the multiplication rule relating joint and conditional probability?
P(A ∩ B) = P(A|B) · P(B) = P(B|A) · P(A).
What does it mean for two events A and B to be independent?
Knowing that one occurs gives no information about the other. Formally, P(A ∩ B) = P(A)P(B); equivalently, P(A|B) = P(A) whenever P(B) > 0.
Can events be mutually exclusive and independent at the same time (with nonzero probabilities)?
No. If they are mutually exclusive, then P(A ∩ B) = 0, but if both have nonzero probability, then P(A)P(B) > 0, so the product rule for independence fails.
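The contrast between independence and mutual exclusivity shows up clearly in a small enumeration. This sketch assumes two fair dice; the four events and the helper independent are hypothetical names for the example.

```python
from fractions import Fraction
from itertools import product

# Assumed model: two fair six-sided dice.
sample_space = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(sample_space))

def independent(A, B):
    """Check the product rule P(A ∩ B) = P(A) P(B)."""
    return prob(A & B) == prob(A) * prob(B)

# Events about different dice: independent.
first_even = {o for o in sample_space if o[0] % 2 == 0}
second_high = {o for o in sample_space if o[1] > 4}

# Events that cannot both occur: mutually exclusive.
sum_is_2 = {o for o in sample_space if sum(o) == 2}
sum_is_12 = {o for o in sample_space if sum(o) == 12}

print(independent(first_even, second_high))  # True
print(sum_is_2 & sum_is_12)                  # set()
print(independent(sum_is_2, sum_is_12))      # False
```

Learning that the sum is 2 rules out a sum of 12 entirely, which is strong information, so the two events are far from independent.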
What is Bayes’ theorem for two events A and B?
P(A|B) = P(B|A)P(A) / P(B), assuming P(B) > 0.
Why is Bayes’ theorem important in ML contexts?
It provides a way to invert conditional probabilities and update beliefs about hypotheses given observed evidence.
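A belief update of this kind might look like the sketch below. The two hypotheses, the prior, the likelihood values, and the function name posterior are all assumptions made up for the example; the denominator P(E) is obtained by summing P(E|H)P(H) over the hypotheses.

```python
from fractions import Fraction

# Hypothetical setup: a coin is either fair or biased toward heads.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
# Likelihood of observing heads under each hypothesis (assumed values).
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

def posterior(prior, likelihood):
    """Apply Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)."""
    # P(E), obtained by summing P(E|H) P(H) over the hypotheses.
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    return {h: likelihood[h] * prior[h] / evidence for h in prior}

post = posterior(prior, likelihood_heads)
print(post["biased"])  # 3/5
print(post["fair"])    # 2/5
```

Observing one head shifts belief toward the biased coin, from 1/2 to 3/5; feeding the posterior back in as the next prior would repeat the update for further observations.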
What is the law of total probability?
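If {B₁,…,Bₙ} is a partition of the sample space (the Bᵢ are mutually exclusive and together cover the whole space) with each P(Bᵢ) > 0, then P(A) = Σᵢ P(A|Bᵢ)P(Bᵢ), summing over i = 1,…,n.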
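A worked instance of the sum, under invented numbers: three machines form the partition (each item comes from exactly one machine), and each has its own defect rate. The machine shares, defect rates, and variable names are hypothetical.

```python
from fractions import Fraction

# Hypothetical partition: which of three machines produced an item.
p_machine = {
    "M1": Fraction(5, 10),
    "M2": Fraction(3, 10),
    "M3": Fraction(2, 10),
}
# Assumed conditional defect rates P(defect | machine).
p_defect_given = {
    "M1": Fraction(1, 100),
    "M2": Fraction(2, 100),
    "M3": Fraction(3, 100),
}

# The probabilities of a partition's cells must add up to 1.
assert sum(p_machine.values()) == 1

# Law of total probability: weight each conditional rate by its cell.
p_defect = sum(p_defect_given[m] * p_machine[m] for m in p_machine)
print(p_defect)  # 17/1000
```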
What is a random variable?
A function that maps outcomes in the sample space to numeric values, allowing us to analyze numerical aspects of randomness.
What is the difference between a discrete and a continuous random variable?
A discrete random variable takes values in a countable set; a continuous random variable takes values in an interval or continuum of real numbers.
What is a probability mass function (pmf)?
A function p(x) that gives P(X = x) for a discrete random variable X.
What two key properties must a pmf satisfy?
p(x) ≥ 0 for all x, and the sum over all possible x of p(x) equals 1.
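To make the last few cards concrete, the sketch below defines a random variable as an ordinary function on outcomes and tabulates its pmf. The two-dice model and the names X, counts, and pmf are assumptions for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed model: two fair six-sided dice.
sample_space = list(product(range(1, 7), repeat=2))

def X(outcome):
    """Random variable: maps an outcome (a pair of faces) to their sum."""
    return outcome[0] + outcome[1]

# pmf: p(x) = P(X = x), found by counting outcomes that map to each x.
counts = Counter(X(o) for o in sample_space)
pmf = {x: Fraction(n, len(sample_space)) for x, n in counts.items()}

print(pmf[7])             # 1/6
print(pmf[2])             # 1/36
print(sum(pmf.values()))  # 1
```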
What is a probability density function (pdf)?
A nonnegative function f(x) whose integral over the whole real line equals 1, such that the probability that a continuous random variable X lies in an interval is the integral of f over that interval.
Why is P(X = x) not a useful quantity for a continuous random variable with a pdf?
Because it is 0 for every x: the integral of f over a single point vanishes. Probabilities are attached to intervals, not points, and the density value f(x) is not itself a probability (it can exceed 1).
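A numerical sketch can show interval probabilities as integrals of the density, and how they shrink toward 0 around a single point. It assumes an exponential distribution with rate 2 and uses a simple midpoint rule; RATE, f, and prob_interval are hypothetical names, and the integration scheme is just one convenient choice.

```python
import math

RATE = 2.0  # assumed rate parameter of an exponential distribution

def f(x):
    """Exponential pdf: RATE * exp(-RATE * x) for x >= 0, else 0."""
    return RATE * math.exp(-RATE * x) if x >= 0 else 0.0

def prob_interval(a, b, n=10_000):
    """Approximate P(a <= X <= b) with the midpoint rule on n subintervals."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width

approx = prob_interval(0.5, 1.5)
exact = math.exp(-RATE * 0.5) - math.exp(-RATE * 1.5)  # closed form
print(abs(approx - exact) < 1e-6)  # True

# Shrinking the interval around a point drives the probability toward 0,
# even though the density there, f(0.5), is positive.
for half_width in (0.1, 0.001, 0.00001):
    print(half_width, prob_interval(0.5 - half_width, 0.5 + half_width))

print(f(0.0))  # 2.0, a density value above 1, so f(x) is not a probability
```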
What is a cumulative distribution function (cdf)?
A function F(x) = P(X ≤ x) that gives the probability that the random variable X is less than or equal to x.
How is the cdf of a discrete random variable related to its pmf?
F(x) is the sum of p(t) over all t ≤ x.
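The running-sum relationship between pmf and cdf is short to sketch. The example assumes one fair die; pmf and cdf are illustrative names.

```python
from fractions import Fraction

# Assumed model: one fair six-sided die, p(t) = 1/6 for t = 1..6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): sum the pmf over all values t <= x."""
    return sum((p for t, p in pmf.items() if t <= x), Fraction(0))

print(cdf(0))    # 0
print(cdf(3))    # 1/2
print(cdf(3.5))  # 1/2
print(cdf(6))    # 1
```

The cdf is a step function here: it stays flat between possible values (F(3) = F(3.5)) and jumps by p(t) at each value t.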