Independence

In words...

Two events are independent if the probability of observing both of them together equals the product of their respective probabilities.

Independence of two events implies that their conditional probabilities equal their probabilities, i.e., that the probability of observing an event is not influenced by the occurrence of the other.

Two random variables are said to be independent if their joint probability distribution can be written as the product of their marginal distributions.

In picture...

In maths...

Independent events

Given a probability space $(\Omega,\Sigma,P)$, two events $A\in\Sigma$ and $B\in\Sigma$ are independent if and only if $$ P(A,\ B) = P(A\cap B) = P(A) P(B). $$

Independence of $A$ and $B$ implies the following for the conditional probabilities, provided that $P(B) > 0$: $$ P(A\ |\ B) = \frac{P(A,\ B)}{P(B)} = \frac{P(A) P(B)}{P(B)} = P(A) $$ (a similar result holds for $P(B\ |\ A)$ when $P(A) > 0$).
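As a minimal sketch of the definition above, the following Python snippet checks the product rule exactly on a toy example that is not part of the original text: one roll of a fair six-sided die. The event names `A`, `B`, `C` and the helper functions `prob` and `independent` are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed toy sample space: one roll of a fair six-sided die.
omega = frozenset(range(1, 7))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """Check the product rule P(A and B) == P(A) * P(B) with exact arithmetic."""
    return prob(a & b) == prob(a) * prob(b)

A = frozenset({2, 4, 6})     # "the roll is even"
B = frozenset({1, 2, 3, 4})  # "the roll is at most 4"
C = frozenset({1, 2, 3})     # "the roll is at most 3"

print(independent(A, B))                 # True:  1/3 == 1/2 * 2/3
print(independent(A, C))                 # False: 1/6 != 1/2 * 1/2
print(prob(A & B) / prob(B) == prob(A))  # True:  P(A | B) == P(A)
```

Exact fractions are used instead of floats so that the equality test is not affected by rounding.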

Independence of discrete random variables

Two discrete random variables $X\in\X$ and $Y\in\Y$ are said to be independent when the events $X=x$ and $Y=y$ are independent for all pairs $(x,y)\in\X\times\Y$, i.e., if and only if $$ \forall (x,y)\in \X\times\Y,\quad P(X=x, Y=y) = P(X=x)P(Y=y) $$ (which can be written as $P_{X,Y}(x,y) = P_X(x) P_Y(y)$).

Independence of two random variables implies that the conditional probability distributions equal the marginal distributions: for any $y$ with $P(Y=y) > 0$, $$ P_{X|Y=y} ( x ) = P(X=x\ |\ Y=y) = \frac{P(X=x,\ Y=y)}{P(Y=y)} = \frac{P(X=x)P(Y=y)}{P(Y=y)} = P(X=x) = P_X(x). $$
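The factorization test for discrete variables can be sketched as follows, assuming the joint distribution is given as a dictionary mapping pairs $(x,y)$ to probabilities. The function names `marginals` and `are_independent` and the two example distributions are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Return (P_X, P_Y) as dicts computed from a joint pmf {(x, y): probability}."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
        p_y[y] = p_y.get(y, 0) + p
    return p_x, p_y

def are_independent(joint):
    """Check P_{X,Y}(x, y) == P_X(x) * P_Y(y) for every pair (x, y)."""
    p_x, p_y = marginals(joint)
    return all(
        joint.get((x, y), 0) == p_x[x] * p_y[y]
        for x, y in product(p_x, p_y)
    )

# Joint pmf built as a product of marginals: independent by construction.
px = {0: Fraction(1, 4), 1: Fraction(3, 4)}
py = {"a": Fraction(1, 3), "b": Fraction(2, 3)}
joint_indep = {(x, y): px[x] * py[y] for x in px for y in py}

# X and Y always take the same value: the joint pmf does not factorize.
joint_dep = {(0, 0): Fraction(1, 2), (1, 1): Fraction(1, 2)}

print(are_independent(joint_indep))  # True
print(are_independent(joint_dep))    # False
```

Pairs missing from the dictionary are treated as having probability zero, which is why the second example fails the test at $(0,1)$ and $(1,0)$ as well as on the diagonal.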

Independence of continuous random variables

Two continuous random variables $X\in\X$ and $Y\in\Y$ with joint probability density function $p_{X,Y}$ and marginal density functions $p_X$ and $p_Y$ are independent if and only if $$ \forall (x,y)\in \X\times\Y,\quad p_{X,Y}(x,y) = p_X(x) p_Y(y), $$ which implies $$ p_{X|Y} ( x\ |\ y ) = \frac{p_{X,Y}(x, y)}{p_Y(y)} = \frac{ p_X(x) p_Y(y)}{p_Y(y)} = p_X(x). $$
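For the continuous case, a numerical illustration (not a proof) is sketched below. It assumes a standard bivariate normal pair with correlation `rho`, whose marginals are both standard normal, and measures how far the joint density is from the product of the marginals on a finite grid. The example distribution, the grid, and the function names are all assumptions made for illustration.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def bivariate_normal_pdf(x, y, rho):
    """Density of a standard bivariate normal with correlation rho (|rho| < 1)."""
    norm = 2.0 * math.pi * math.sqrt(1.0 - rho * rho)
    quad = (x * x - 2.0 * rho * x * y + y * y) / (1.0 - rho * rho)
    return math.exp(-0.5 * quad) / norm

def max_factorization_gap(rho, grid):
    """Largest |p_{X,Y}(x, y) - p_X(x) * p_Y(y)| over a finite grid of points."""
    return max(
        abs(bivariate_normal_pdf(x, y, rho) - normal_pdf(x) * normal_pdf(y))
        for x in grid
        for y in grid
    )

grid = [i / 2.0 for i in range(-6, 7)]       # points from -3 to 3 in steps of 0.5
gap_indep = max_factorization_gap(0.0, grid)  # rho = 0: the density factorizes
gap_dep = max_factorization_gap(0.8, grid)    # rho = 0.8: it does not

print(gap_indep)  # zero up to floating-point rounding
print(gap_dep)    # clearly positive
```

A small gap on a grid only suggests factorization at the sampled points; here the case `rho = 0` factorizes exactly because the cross term in the exponent vanishes.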

Notes

The conditional probability $P(A|B)$ is undefined when $P(B)=0$, since its definition divides by $P(B)$. This case can nonetheless be handled by setting $P(A|B)$ to an arbitrary value. Indeed, the chosen value does not influence the computations involving $P(A|B)$, since it corresponds to the probability of observing $A$ given that an event with zero probability occurred.
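One computation where this can be seen is the law of total probability, $P(A) = P(A|B)P(B) + P(A|\bar{B})P(\bar{B})$, used here as an assumed example of a "computation involving $P(A|B)$". In the sketch below, the numerical values and the function name `total_probability` are hypothetical; the point is only that every arbitrary choice for $P(A|B)$ yields the same $P(A)$ when $P(B)=0$.

```python
from fractions import Fraction

def total_probability(p_a_given_b, p_b, p_a_given_not_b):
    """Law of total probability: P(A) = P(A|B) P(B) + P(A|not B) P(not B)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)

p_b = Fraction(0)                 # B is an event with zero probability
p_a_given_not_b = Fraction(2, 5)  # assumed value of P(A | not B)

# Try eleven arbitrary values for P(A|B), from 0 to 1 in steps of 1/10.
results = {
    total_probability(Fraction(v, 10), p_b, p_a_given_not_b)
    for v in range(11)
}

print(results)  # a single value, 2/5, whatever was chosen for P(A|B)
```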