Random pairs and random vectors

In words...

A random pair is a pair of random variables related by a joint probability distribution.

For instance, in an experiment where an individual is drawn at random from a population, we can model the height and weight of the drawn individual with a random pair. In this case, the two variables are typically not independent and their relationship is encoded in the joint probability distribution. This distribution can give, for instance, the probability of drawing an individual with height between 170 cm and 180 cm and weight above 80 kg.
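To make this concrete, here is a minimal sketch in Python that estimates such a joint probability by sampling; the bivariate Gaussian model of height and weight and its parameters are illustrative assumptions, not values taken from the example above.

```python
# Estimate P(170 <= height <= 180, weight > 80) by sampling from an assumed
# joint distribution of (height, weight). The bivariate Gaussian and its
# parameters are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
mean = [172.0, 75.0]                 # assumed mean height (cm) and weight (kg)
cov = [[80.0, 60.0],                 # assumed covariance: height and weight
       [60.0, 120.0]]                # are positively correlated

height, weight = rng.multivariate_normal(mean, cov, size=1_000_000).T

# Joint probability of the event {170 <= height <= 180} and {weight > 80}.
p_event = np.mean((height >= 170) & (height <= 180) & (weight > 80))
print(p_event)
```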

A joint probability distribution can be decomposed as the product of a conditional distribution and a marginal distribution.

A random vector generalizes the concept of a random pair to more than two variables.

The variables involved in a random pair or a random vector can be independent or not.

In picture...



In maths...

A random pair $(X,Y)$ is a pair of random variables $X\in\X$ and $Y\in\Y$. Their joint probability distribution is denoted $$ P_{X,Y} : \Sigma_X \times \Sigma_Y \rightarrow [0,1], $$ where $\Sigma_X$ and $\Sigma_Y$ are the event spaces ($\sigma$-algebras) associated with $\X$ and $\Y$.

Discrete random pairs

For discrete random variables, the joint probability distribution can be expressed as $$ P_{X,Y}(x,y) = P(X=x,\ Y=y) $$ and, for any $y$ such that $P_Y(y)>0$, decomposed as $$ P_{X,Y}(x,y) = P(X=x,\ Y=y) = P(X=x \ |\ Y=y) P(Y=y) = P_{X|Y=y}(x) P_Y(y) $$ where $P_{X|Y=y}$ is the conditional probability distribution of $X$ given $Y=y$ and $P_Y$ is the marginal distribution of $Y$.
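A minimal numerical sketch of this decomposition in Python, using a small made-up joint probability table:

```python
# Discrete decomposition P_{X,Y}(x,y) = P_{X|Y=y}(x) * P_Y(y), checked on an
# illustrative joint probability table (the numbers are made up).
import numpy as np

# Joint distribution P_{X,Y}(x, y) for X in {0, 1, 2} (rows) and Y in {0, 1} (columns).
P_XY = np.array([[0.10, 0.20],
                 [0.25, 0.05],
                 [0.15, 0.25]])
assert np.isclose(P_XY.sum(), 1.0)   # a valid joint distribution sums to 1

# Marginal of Y: sum the joint table over all values of X.
P_Y = P_XY.sum(axis=0)                # shape (2,)

# Conditional of X given Y=y: renormalize each column of the joint table.
P_X_given_Y = P_XY / P_Y              # P_{X|Y=y}(x) = P_{X,Y}(x,y) / P_Y(y)

# Check the decomposition.
assert np.allclose(P_XY, P_X_given_Y * P_Y)
print(P_Y)            # marginal distribution of Y
print(P_X_given_Y)    # one conditional distribution of X per value of Y (per column)
```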

Continuous random pairs

For continuous random variables, the joint probability distribution is often left implicit and we rather work with the joint probability density function $$ p_{X,Y}\ : \X\times\Y\rightarrow \R^+,\quad \mbox{such that }\quad \forall A\in\Sigma_X,\ B\in\Sigma_Y,\quad P_{X,Y}(A,B) = \int_A\int_B p_{X,Y}(x,y)\, dy\, dx , $$ which decomposes as $$ p_{X,Y}(x,y) = p_{X|Y}(x \ |\ y)\, p_Y(y) $$ where $p_{X|Y}$ is the conditional probability density function of $X$ given $Y$ and $p_Y$ is the marginal probability density function of $Y$.
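A minimal numerical sketch of this decomposition, using a bivariate Gaussian, for which both the marginal and the conditional densities are again Gaussian; the parameters are arbitrary illustrative choices.

```python
# Continuous decomposition p_{X,Y}(x,y) = p_{X|Y}(x|y) * p_Y(y), checked
# numerically on a bivariate Gaussian with illustrative parameters.
import numpy as np
from scipy.stats import multivariate_normal, norm

mu_x, mu_y = 0.0, 1.0
sig_x, sig_y, rho = 2.0, 1.5, 0.6
cov = np.array([[sig_x**2,        rho*sig_x*sig_y],
                [rho*sig_x*sig_y, sig_y**2       ]])

joint = multivariate_normal(mean=[mu_x, mu_y], cov=cov)

x, y = 1.0, 2.0

# Marginal density p_Y(y): a Gaussian N(mu_y, sig_y^2).
p_y = norm(mu_y, sig_y).pdf(y)

# Conditional density p_{X|Y}(x|y): for a bivariate Gaussian it is Gaussian with
# mean mu_x + rho*(sig_x/sig_y)*(y - mu_y) and variance sig_x^2*(1 - rho^2).
cond_mean = mu_x + rho * (sig_x / sig_y) * (y - mu_y)
cond_std = sig_x * np.sqrt(1 - rho**2)
p_x_given_y = norm(cond_mean, cond_std).pdf(x)

# The product of conditional and marginal matches the joint density at (x, y).
print(joint.pdf([x, y]), p_x_given_y * p_y)
assert np.isclose(joint.pdf([x, y]), p_x_given_y * p_y)
```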

Mixed random pairs

...

Random vectors

All the above can be generalized to a collection $\g X=(X_1,\dots,X_n)$ of $n>2$ variables, which we call a random vector. For instance, in the continuous case where all the variables are independent, the joint density factorizes as $$ p_{\g X} (\g x) = \prod_{i=1}^n p_{X_i}(x_i) . $$
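A minimal numerical sketch of this factorization, checked on a vector of $n=3$ independent Gaussian variables with arbitrary illustrative parameters:

```python
# For a random vector with independent components, the joint density is the
# product of the marginal densities. Checked on 3 independent Gaussians.
import numpy as np
from scipy.stats import norm, multivariate_normal

means = np.array([0.0, 1.0, -2.0])    # illustrative means of X_1, X_2, X_3
stds = np.array([1.0, 0.5, 2.0])      # illustrative standard deviations

x = np.array([0.3, 1.2, -1.0])        # a point at which to evaluate the densities

# Product of the marginal densities p_{X_i}(x_i).
product_of_marginals = np.prod([norm(m, s).pdf(xi) for m, s, xi in zip(means, stds, x)])

# Joint density of the vector: a multivariate Gaussian with diagonal covariance,
# since independence here means the components are uncorrelated.
joint = multivariate_normal(mean=means, cov=np.diag(stds**2))
print(joint.pdf(x), product_of_marginals)
assert np.isclose(joint.pdf(x), product_of_marginals)
```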