Concept index
This page collects definitions and tools that are used repeatedly in the main text but are not restated every time. Each entry keeps only the most common checks and formulas, so it can be used as a quick reference.
Basic modeling and distribution functions
A probability space is a triple $(\Omega, \mathcal{F}, P)$. Here $\Omega$ is the sample space, $\mathcal{F}$ is the event space, and $P$ satisfies $P(\Omega) = 1$ and countable additivity. In problems, first identify what the outcomes are, what the events are, and how probabilities are assigned.
A collection $\mathcal{F}$ is a $\sigma$-algebra on $\Omega$ if $\Omega \in \mathcal{F}$, and if it is closed under complements and countable unions. By De Morgan's laws, it is also closed under countable intersections. It tells us which sets are allowed to have probabilities.
For a finite or countable model, write it in three steps:
- Sample space $\Omega$: list all possible outcomes.
- Probability $P$: state whether outcomes are equally likely, or give the weights.
- Random variable $X$: map each outcome to a number.
This avoids mixing up the outcome of the random experiment with the value of a random variable.
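The three steps can be sketched in Python. This is a minimal illustration, not part of the original text: the two-dice experiment, the equally likely weights, and the names `omega`, `prob`, `total`, and `distribution` are all assumptions chosen for the example.

```python
from fractions import Fraction
from itertools import product

# Step 1: sample space, all ordered outcomes of two dice (assumed example).
omega = list(product(range(1, 7), repeat=2))

# Step 2: probability, here every outcome is equally likely (an assumption).
prob = {w: Fraction(1, len(omega)) for w in omega}

# Step 3: random variable, a function from outcomes to numbers.
def total(w):
    return w[0] + w[1]

def distribution(rv, prob):
    """Push the outcome probabilities forward to the values of rv."""
    dist = {}
    for w, p in prob.items():
        dist[rv(w)] = dist.get(rv(w), Fraction(0)) + p
    return dist

dist_total = distribution(total, prob)
expected_total = sum(v * p for v, p in dist_total.items())
```

Keeping `omega` (outcomes) and `total` (a function of outcomes) as separate objects mirrors the distinction the list is meant to enforce.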
A random variable is a measurable function $X : \Omega \to \mathbb{R}$ from the sample space to the real line. Many random variables can be defined on the same probability space. In many problems, the clean order is to write $\Omega$ and $P$ first, then define $X$.
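A family of events $\{A_i\}_{i \in I}$ is mutually independent if for every finite set of distinct indices $i_1, \dots, i_k$,
$$P(A_{i_1} \cap \cdots \cap A_{i_k}) = \prod_{j=1}^{k} P(A_{i_j}).$$
Pairwise independence only checks the case $k = 2$, and is strictly weaker than mutual independence.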
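The distribution function of a random variable $X$ is $F(x) = P(X \le x)$. It is nondecreasing, right-continuous, and satisfies
$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to +\infty} F(x) = 1.$$
Point masses are jumps:
$$P(X = x) = F(x) - F(x-), \qquad F(x-) = \lim_{y \uparrow x} F(y).$$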
To check that a function $F$ is a distribution function, usually verify:
- $F$ is nondecreasing.
- $F$ is right-continuous.
- $\lim_{x \to -\infty} F(x) = 0$.
- $\lim_{x \to +\infty} F(x) = 1$.
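If $F_1, \dots, F_n$ are distribution functions and $\lambda_1, \dots, \lambda_n \ge 0$ with $\sum_{i=1}^{n} \lambda_i = 1$, then
$$F = \sum_{i=1}^{n} \lambda_i F_i$$
is also a distribution function.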
If $U \sim \mathrm{Uniform}(0, 1)$, the inverse transform construction
$$X = F^{-1}(U), \qquad F^{-1}(u) = \inf\{x : F(x) \ge u\},$$
gives a random variable with distribution function $F$.
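A minimal Python sketch of the inverse transform for a discrete distribution, using the generalized inverse (the smallest $x$ with $F(x) \ge u$). The support `values`, the weights `masses`, and the deterministic grid standing in for a uniform variable are assumptions made for this illustration only.

```python
from bisect import bisect_left
from fractions import Fraction
from itertools import accumulate

values = [0, 1, 3]                                    # support (hypothetical)
masses = [Fraction(1, 5), Fraction(1, 2), Fraction(3, 10)]
cdf_at_values = list(accumulate(masses))              # F(0), F(1), F(3)

def quantile(u):
    """Generalized inverse: the smallest support point x with F(x) >= u."""
    return values[bisect_left(cdf_at_values, u)]

# Deterministic stand-in for U ~ Uniform(0, 1): midpoints of 1000 equal cells.
grid = [Fraction(2 * k + 1, 2000) for k in range(1000)]
counts = {v: 0 for v in values}
for u in grid:
    counts[quantile(u)] += 1
freqs = {v: Fraction(c, len(grid)) for v, c in counts.items()}
```

On this grid the frequencies in `freqs` reproduce the point masses exactly, which is the discrete version of the statement that $F^{-1}(U)$ has distribution function $F$. With real random input one would replace `grid` by draws from `random.random()`.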
Conditional expectation, indicators, and second moments
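If $X$ is a nonnegative integer-valued random variable, then
$$E[X] = \sum_{n=1}^{\infty} P(X \ge n).$$
If $X$ is a general nonnegative random variable, then
$$E[X] = \int_0^{\infty} P(X > t) \, dt$$
in the extended sense, allowing the value $+\infty$.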
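For a mixture distribution or a multi-stage experiment, first choose a variable $Y$ that simplifies the structure. Then use
$$P(A) = \sum_{y} P(A \mid Y = y) \, P(Y = y), \qquad E[X] = \sum_{y} E[X \mid Y = y] \, P(Y = y).$$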
In the continuous case, replace sums by integrals.
$E[X \mid Y]$ is the average prediction of $X$ after the information $Y$ is given. In the discrete case, one can think of $Y$ as dividing the sample space into conditional blocks; the conditional expectation is the average on each block. A common formula is the tower property
$$E\big[E[X \mid Y]\big] = E[X].$$
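Counting random variables are often written as
$$N = \sum_{i=1}^{n} 1_{A_i}.$$
Then
$$E[N] = \sum_{i=1}^{n} P(A_i),$$
and the variance can be computed by
$$\mathrm{Var}(N) = \sum_{i=1}^{n} P(A_i)\big(1 - P(A_i)\big) + \sum_{i \ne j} \big( P(A_i \cap A_j) - P(A_i) P(A_j) \big).$$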
This is useful for counting adjacency relations, local structures, and numbers of appearances.
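As a hedged illustration of the indicator method, the sketch below counts fixed points of a uniformly random permutation, an example chosen here and not taken from the original text. It compares a brute-force computation over all permutations with the indicator formulas for the mean and variance; the names `fixed_points`, `p1`, and `p2` are illustrative.

```python
from fractions import Fraction
from itertools import permutations

n = 5
perms = list(permutations(range(n)))

def fixed_points(perm):
    """N = sum over i of the indicator 1{perm[i] == i}."""
    return sum(1 for i, v in enumerate(perm) if v == i)

# Brute force over the uniform sample space of all n! permutations.
mean_brute = Fraction(sum(fixed_points(p) for p in perms), len(perms))
second_moment = Fraction(sum(fixed_points(p) ** 2 for p in perms), len(perms))
var_brute = second_moment - mean_brute ** 2

# Indicator route: P(A_i) = 1/n, and P(A_i and A_j) = 1/(n(n-1)) for i != j.
p1 = Fraction(1, n)
p2 = Fraction(1, n * (n - 1))
mean_ind = n * p1
var_ind = n * p1 * (1 - p1) + n * (n - 1) * (p2 - p1 * p1)
```

Both routes give mean 1 and variance 1 here; the indicator route needs only the single and pairwise probabilities, which is why it scales to problems where enumeration is impossible.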
Covariance is linear in each argument. For example,
$$\mathrm{Cov}(aX + bY, Z) = a \, \mathrm{Cov}(X, Z) + b \, \mathrm{Cov}(Y, Z).$$
If $X, Y$ are independent and have finite second moments, then
$$\mathrm{Cov}(X, Y) = 0, \qquad \mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y).$$
For sample means, centered variables, and projection residuals, covariance linearity often gives the answer in one line.
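The $2m$-th moment method rewrites a tail event in terms of a higher even power. If $m \ge 1$ is an integer and $\varepsilon > 0$, Markov's inequality gives
$$P(|X| \ge \varepsilon) = P(X^{2m} \ge \varepsilon^{2m}) \le \frac{E[X^{2m}]}{\varepsilon^{2m}}.$$
When $m = 1$ and $X = Y - E[Y]$, this becomes Chebyshev's inequality:
$$P(|Y - E[Y]| \ge \varepsilon) \le \frac{\mathrm{Var}(Y)}{\varepsilon^2}.$$
This is often used to prove convergence in probability. If
$$E\big[(X_n - X)^{2m}\big] \to 0,$$
then
$$X_n \xrightarrow{P} X.$$
The usual move is to write the target difference as $D_n = X_n - X$ and bound an even moment. If the second moment is too weak, try the fourth, sixth, or another higher even moment.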
Characteristic functions and independence
The characteristic function of a random variable $X$ is
$$\varphi_X(t) = E\big[e^{itX}\big], \qquad t \in \mathbb{R}.$$
It always exists, and $|\varphi_X(t)| \le 1$. A distribution is uniquely determined by its characteristic function, so this tool is well suited to independent sums and limiting distributions.
If $X, Y$ are independent, then
$$\varphi_{X+Y}(t) = \varphi_X(t) \, \varphi_Y(t).$$
More generally, a sum of independent random variables corresponds to a product of characteristic functions. For limits of independent sums, first write the characteristic function of each term, then study the product.
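The product rule for independent sums can be checked numerically on small discrete distributions. The sketch below is illustrative only: the two probability mass functions and the helper names `char_fn` and `independent_sum` are assumptions, and the check is done at a handful of arbitrary values of $t$.

```python
import cmath
from itertools import product

def char_fn(pmf, t):
    """E[exp(i t X)] for a finite pmf given as {value: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in pmf.items())

def independent_sum(pmf_x, pmf_y):
    """pmf of X + Y when X and Y are independent (discrete convolution)."""
    out = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        out[x + y] = out.get(x + y, 0.0) + px * py
    return out

pmf_x = {0: 0.3, 1: 0.7}                 # hypothetical example distributions
pmf_y = {-1: 0.25, 0: 0.5, 2: 0.25}
pmf_sum = independent_sum(pmf_x, pmf_y)

# Compare the characteristic function of the sum with the product.
max_err = max(
    abs(char_fn(pmf_sum, t) - char_fn(pmf_x, t) * char_fn(pmf_y, t))
    for t in (-2.0, -0.5, 0.0, 1.0, 3.0)
)
```

Up to floating-point rounding, `max_err` is zero: convolving the distributions and multiplying the characteristic functions describe the same operation.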
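The joint characteristic function is
$$\varphi_{X,Y}(s, t) = E\big[e^{i(sX + tY)}\big].$$
If
$$\varphi_{X,Y}(s, t) = \varphi_X(s) \, \varphi_Y(t) \quad \text{for all } s, t \in \mathbb{R},$$
then $X$ and $Y$ are independent. Knowing only
$$\varphi_{X+Y}(t) = \varphi_X(t) \, \varphi_Y(t) \quad \text{for all } t$$
does not usually imply independence of $X, Y$, because it only checks the diagonal $s = t$ of the joint characteristic function.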
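If
$$\varphi_{X_n}(t) \to \varphi(t) \quad \text{for every } t \in \mathbb{R},$$
and $\varphi$ is the characteristic function of a random variable $X$ and is continuous at $0$, then
$$X_n \xrightarrow{d} X.$$
In particular, if the limit is
$$\varphi(t) = e^{-t^2/2},$$
then the limiting distribution is $N(0, 1)$.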
Convergence in distribution and test functions
$X_n \xrightarrow{d} X$ is equivalent to
$$F_{X_n}(x) \to F_X(x)$$
at every continuity point $x$ of the distribution function of $X$. It is also equivalent to
$$E[g(X_n)] \to E[g(X)]$$
for every bounded continuous function $g$. When using distribution functions, take limits directly only at continuity points.
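If $X_n \xrightarrow{d} X$, then under suitable conditions one can construct copies with the same distributions,
$$Y_n \overset{d}{=} X_n, \qquad Y \overset{d}{=} X,$$
such that
$$Y_n \to Y \quad \text{a.s.}$$
This can turn a weak convergence problem into an almost sure convergence problem. But it is an existence theorem about the constructed copies; it does not mean the original sequence $X_n$ converges almost surely.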
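If $X_n \to X$ a.s., $Y_n \to Y$ a.s., and $X_n, Y_n$ are independent for each $n$, then $X, Y$ are independent. One proof uses bounded continuous test functions. For any bounded continuous $f, g$,
$$E[f(X_n) \, g(Y_n)] = E[f(X_n)] \, E[g(Y_n)],$$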
and then the dominated convergence theorem passes to the limit.
Limit theorem toolbox
If $X_1, X_2, \dots$ are i.i.d. and $E|X_1| < \infty$, then
$$\frac{1}{n} \sum_{i=1}^{n} X_i \to E[X_1] \quad \text{a.s.}$$
Use this to replace a sample average by the theoretical mean. Before applying it, check independence, identical distribution, and the first moment condition.
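If $X_1, X_2, \dots$ are i.i.d., $E[X_1] = 0$, and $\mathrm{Var}(X_1) = 1$, then
$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i \xrightarrow{d} N(0, 1).$$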
In the general case, center first and divide by the standard deviation. Before applying it, check the mean, variance, and i.i.d. assumptions.
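To see the centering and scaling at work, the sketch below compares the exact distribution function of a standardized Binomial$(n, p)$ sum with the standard normal distribution function. The choice of Bernoulli summands, the values of `n`, and the function names are assumptions for illustration; the gap is measured only at the atoms of the binomial distribution.

```python
from math import comb, erf, sqrt

def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def max_cdf_gap(n, p=0.5):
    """Largest gap, at the atoms, between the CDF of the standardized
    Binomial(n, p) sum and the standard normal CDF."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    cdf, gap = 0.0, 0.0
    for k in range(n + 1):
        cdf += comb(n, k) * p**k * (1 - p) ** (n - k)
        # Center by the mean and divide by the standard deviation.
        gap = max(gap, abs(cdf - normal_cdf((k - mean) / sd)))
    return gap

gaps = {n: max_cdf_gap(n) for n in (25, 100, 400)}
```

The gaps shrink as `n` grows, roughly like $1/\sqrt{n}$, which is consistent with the central limit theorem but is a numerical observation for this one example, not a proof.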
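If
$$X_n \xrightarrow{d} X, \qquad Y_n \xrightarrow{P} c \quad (c \text{ a constant}),$$
then
$$X_n + Y_n \xrightarrow{d} X + c, \qquad X_n Y_n \xrightarrow{d} cX.$$
In particular, if the denominator $Y_n$ converges in probability to $1$, then
$$\frac{X_n}{Y_n} \xrightarrow{d} X.$$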
This is commonly used for random normalizations and negligible error terms.
If all moments converge to the moments of a distribution that is uniquely determined by its moments, then convergence in distribution follows. For a standard normal variable $Z$, odd moments are $0$, and
$$E[Z^{2k}] = (2k - 1)!! = \frac{(2k)!}{2^k \, k!}.$$
When using this method, say why the target distribution is determined by its moments. It is not enough to write only "the moments converge."
Triangular arrays
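For sums whose entries change with each row,
$$X_{n,1}, X_{n,2}, \dots, X_{n,k_n}, \qquad n = 1, 2, \dots,$$
we often write
$$S_n = \sum_{k=1}^{k_n} X_{n,k}, \qquad s_n^2 = \mathrm{Var}(S_n).$$
The normalized object is
$$\frac{S_n - E[S_n]}{s_n}.$$
First compute $s_n^2$, then check the relevant central limit theorem condition.
For every $\varepsilon > 0$, if
$$\frac{1}{s_n^2} \sum_{k=1}^{k_n} E\Big[ X_{n,k}^2 \, 1\{ |X_{n,k}| > \varepsilon s_n \} \Big] \to 0$$
(written here for centered entries, $E[X_{n,k}] = 0$), then, under the usual surrounding assumptions such as independence within each row, a central limit theorem holds. The steps are:
- First compute $s_n^2$.
- Then write the Lindeberg term.
- Control it using tail integrability or a stronger moment condition.
If
$$\frac{1}{s_n^3} \sum_{k=1}^{k_n} E|X_{n,k}|^3 \to 0,$$
then the Lindeberg condition holds. Indeed, on $\{ |X_{n,k}| > \varepsilon s_n \}$,
$$X_{n,k}^2 \le \frac{|X_{n,k}|^3}{\varepsilon s_n}.$$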
This is a common quick check in textbooks.
Advanced tools: tail bounds and concentration
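If $X \ge 0$ with $0 < E[X^2] < \infty$ and $0 \le \theta \le 1$, then
$$P(X > \theta E[X]) \ge (1 - \theta)^2 \, \frac{(E[X])^2}{E[X^2]}.$$
Letting $\theta = 0$ gives the second moment method:
$$P(X > 0) \ge \frac{(E[X])^2}{E[X^2]}.$$
This is useful when you want to prove that some structure appears at least once. Usually $X$ is the number of appearances; compute $E[X]$ and control $E[X^2]$.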
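Let
$$X = \sum_{i} 1_{A_i}.$$
If only the dependent pairs are included in
$$\Delta = \sum_{i \sim j} P(A_i \cap A_j), \qquad i \sim j \text{ meaning } i \ne j \text{ and } A_i, A_j \text{ are not independent},$$
then in many counting problems, $E[X] \to \infty$ and $\Delta = o\big((E[X])^2\big)$ imply $P(X > 0) \to 1$. This is a common template in random graphs and random structures.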
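A small numerical check of the second moment lower bound $P(X > 0) \ge (E[X])^2 / E[X^2]$ for a nonnegative count. The binomial count is an assumed toy case in which every quantity is available in closed form; the function names are illustrative.

```python
def second_moment_lower_bound(mean, second_moment):
    """Lower bound (E X)^2 / E[X^2] on P(X > 0) for a nonnegative X."""
    return mean * mean / second_moment

def binomial_check(n, p):
    """Return (lower bound, exact P(X > 0)) for X ~ Binomial(n, p)."""
    mean = n * p
    second = n * p * (1 - p) + mean * mean    # Var(X) + (E X)^2
    exact = 1 - (1 - p) ** n
    return second_moment_lower_bound(mean, second), exact

bound, exact = binomial_check(50, 0.1)
```

Here the bound is valid but not tight. In applications the value of the method is that it needs only $E[X]$ and $E[X^2]$, not the exact distribution of the count.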
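If the moment generating function
$$M_X(\lambda) = E\big[e^{\lambda X}\big]$$
is finite in the relevant range, write
$$\psi_X(\lambda) = \log M_X(\lambda).$$
For $\lambda > 0$, exponential Markov gives
$$P(X \ge t) \le e^{-\lambda t} M_X(\lambda) = \exp\big( -(\lambda t - \psi_X(\lambda)) \big).$$
Thus one usually writes
$$P(X \ge t) \le \exp\Big( -\sup_{\lambda > 0} \big( \lambda t - \psi_X(\lambda) \big) \Big).$$
This is the starting point of many exponential tail bounds: write the moment generating function, then optimize over $\lambda$.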
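The "write the moment generating function, then optimize" recipe can be sketched numerically. The example below assumes a Binomial$(n, p)$ variable, whose log moment generating function is $n \log(1 - p + p e^{\lambda})$, and replaces the exact optimization by a crude grid search; the grid size and `lam_max` are arbitrary choices made for illustration.

```python
from math import comb, exp, log

def log_mgf_binomial(lam, n, p):
    """psi(lam) = log E[exp(lam * X)] for X ~ Binomial(n, p)."""
    return n * log(1 - p + p * exp(lam))

def chernoff_bound(t, n, p, grid_size=2000, lam_max=5.0):
    """Minimize exp(-(lam * t - psi(lam))) over a grid of lam in (0, lam_max]."""
    best = 1.0
    for j in range(1, grid_size + 1):
        lam = lam_max * j / grid_size
        best = min(best, exp(-(lam * t - log_mgf_binomial(lam, n, p))))
    return best

def exact_upper_tail(t, n, p):
    """Exact P(X >= t) for X ~ Binomial(n, p) and integer t."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(t, n + 1))

n, p, t = 100, 0.5, 65
bound = chernoff_bound(t, n, p)
exact = exact_upper_tail(t, n, p)
```

Every grid point gives a valid upper bound, so the grid minimum is still a correct bound even though it may miss the exact optimizer slightly.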
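Let $\mu = E[X]$. If there is a $\sigma > 0$ such that for all $\lambda \in \mathbb{R}$,
$$E\big[e^{\lambda (X - \mu)}\big] \le e^{\sigma^2 \lambda^2 / 2},$$
then $X$ is called sub-Gaussian with parameter $\sigma$. A typical tail bound is
$$P(|X - \mu| \ge t) \le 2 \exp\Big( -\frac{t^2}{2 \sigma^2} \Big), \qquad t \ge 0.$$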
Bounded variables, normal variables, and many independent sums have this square-exponential tail behavior.
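Suppose $X_1, \dots, X_n$ are independent and $a_i \le X_i \le b_i$. Let
$$S_n = \sum_{i=1}^{n} \big( X_i - E[X_i] \big).$$
Then $S_n$ is again sub-Gaussian-type, and
$$P(|S_n| \ge t) \le 2 \exp\Big( -\frac{2 t^2}{\sum_{i=1}^{n} (b_i - a_i)^2} \Big), \qquad t \ge 0.$$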
Use this for independent weighted sums, random signs, and deviations of empirical averages. The main step is computing the variance proxy correctly.
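A minimal sketch comparing a Hoeffding-type bound with an exact tail, assuming the classical form $2 \exp\big(-2t^2 / \sum_i (b_i - a_i)^2\big)$ for independent variables bounded in $[a_i, b_i]$. Fair coin flips are used as the assumed example because their exact tail is easy to compute; the function names are illustrative.

```python
from math import comb, exp

def hoeffding_two_sided(t, widths):
    """2 * exp(-2 t^2 / sum (b_i - a_i)^2) for independent X_i in [a_i, b_i]."""
    return 2.0 * exp(-2.0 * t * t / sum(w * w for w in widths))

def exact_two_sided_fair_coins(t, n):
    """P(|S - n/2| >= t) for S ~ Binomial(n, 1/2)."""
    return sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= t) / 2**n

n, t = 100, 15
bound = hoeffding_two_sided(t, [1.0] * n)   # each flip lies in [0, 1], width 1
exact = exact_two_sided_fair_coins(t, n)
```

The only input to the bound is the list of interval widths, so the same function covers weighted sums: a term $c_i X_i$ with $X_i \in [0, 1]$ contributes width $|c_i|$.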
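Let $\mu = E[X]$. If there are $\nu, b > 0$ such that for $|\lambda| < 1/b$,
$$E\big[e^{\lambda (X - \mu)}\big] \le e^{\nu^2 \lambda^2 / 2},$$
then $X$ is called sub-exponential with parameters $(\nu, b)$. Its one-sided tail bound is
$$P(X - \mu \ge t) \le \begin{cases} \exp\big( -\tfrac{t^2}{2 \nu^2} \big), & 0 \le t \le \nu^2 / b, \\ \exp\big( -\tfrac{t}{2b} \big), & t > \nu^2 / b. \end{cases}$$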
Small deviations look sub-Gaussian; large deviations become exponential.
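Let $X_1, \dots, X_n$ be independent, with $E[X_i] = 0$, $|X_i| \le M$, and
$$\sigma^2 = \sum_{i=1}^{n} \mathrm{Var}(X_i).$$
For $S_n = \sum_{i=1}^{n} X_i$ and $t > 0$, a common one-sided Bernstein-type bound is
$$P(S_n \ge t) \le \exp\Big( -\frac{t^2}{2 (\sigma^2 + M t / 3)} \Big).$$
For a two-sided bound, apply the same inequality to $-S_n$ as well. This is often much sharper than Chebyshev for sums of independent bounded variables.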
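The comparison with Chebyshev can be made concrete. The sketch below assumes the one-sided Bernstein form $\exp\big(-t^2 / (2(\sigma^2 + Mt/3))\big)$ and applies it to centered Bernoulli$(p)$ variables with small $p$, a hypothetical example where the variance is much smaller than the range; all names and parameter values are illustrative.

```python
from math import comb, exp

def bernstein_one_sided(t, sigma2, M):
    """exp(-t^2 / (2 (sigma^2 + M t / 3))) for centered terms with |X_i| <= M."""
    return exp(-t * t / (2.0 * (sigma2 + M * t / 3.0)))

n, p, t = 200, 0.05, 10
sigma2 = n * p * (1 - p)     # sum of the variances of the centered terms
M = 1.0                      # |B_i - p| <= max(p, 1 - p) <= 1

bernstein = bernstein_one_sided(t, sigma2, M)
chebyshev = sigma2 / t**2    # bounds P(|S_n| >= t), hence also P(S_n >= t)

# Exact P(S_n >= t): the uncentered count must be at least n * p + t = 20.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(20, n + 1))
```

For these values the Bernstein bound is several times smaller than the Chebyshev bound, and both sit above the exact tail, as they must.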
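If
$$\sum_{n=1}^{\infty} P(A_n) < \infty,$$
then
$$P(A_n \text{ infinitely often}) = 0.$$
This is often used to prove that a bad event happens only finitely many times, which then gives an eventual almost sure bound. If the events are independent and $\sum_{n=1}^{\infty} P(A_n) = \infty$, the second Borel-Cantelli lemma gives $P(A_n \text{ infinitely often}) = 1$.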
Quick reference
- To prove convergence in probability: first try a higher even-moment method or Chebyshev.
- To prove convergence in distribution to a normal law: first try CLT plus Slutsky.
- For triangular arrays: check Lindeberg or the third-moment criterion.
- For independent sums: consider characteristic functions.
- To prove a nonnegative count is positive: try Paley-Zygmund or a second moment lower bound.
- For exponential tail bounds: write the moment generating function and try Chernoff-Cramer.
- For independent weighted sums: check whether a Hoeffding-type bound applies.
- For sums of independent bounded variables: consider a Bernstein-type bound.
- For maximum probabilities: first try a union bound, then combine it with Chernoff-Cramer, Hoeffding, or Bernstein.
- For eventual almost sure statements: consider Borel-Cantelli.
- For counting problems: write the count as a sum of indicator variables.
- For limits of distribution functions: take limits directly only at continuity points.
- For limits of expectations when you only have convergence in distribution: consider Skorohod representation or uniform integrability.
In probability, many uses of "obvious" rely on countable additivity, monotone convergence, independence, moment conditions, or the assumptions of a limit theorem. When reading a proof, it is better to mark these conditions step by step.