A reproducing kernel Hilbert space (RKHS) is a Hilbert space of functions associated with a reproducing kernel, which underlies the definition of both the inner product and the evaluation functionals on this space.
In machine learning, a real-valued positive definite function is called a kernel function. Common kernel functions are the linear kernel, the polynomial kernel and the Gaussian kernel. The Moore-Aronszajn theorem states that for any kernel function, there is one and only one RKHS with this function as its reproducing kernel.
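For illustration, here is a minimal Python sketch of these three kernels; the NumPy dependency, the function names and the default parameter values are arbitrary choices for this example, not prescribed by the text.
```python
import numpy as np

def linear_kernel(x, xp):
    """Linear kernel K(x, x') = <x, x'>."""
    return float(np.dot(x, xp))

def polynomial_kernel(x, xp, degree=3, c=1.0):
    """Polynomial kernel K(x, x') = (<x, x'> + c)^degree."""
    return float((np.dot(x, xp) + c) ** degree)

def gaussian_kernel(x, xp, sigma=1.0):
    """Gaussian (RBF) kernel K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    x, xp = np.asarray(x, dtype=float), np.asarray(xp, dtype=float)
    return float(np.exp(-np.sum((x - xp) ** 2) / (2.0 * sigma ** 2)))
```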
We call a kernel function any real-valued positive definite function in the sense of the following definition.
A real-valued function $K$ on $\X^2$ is called a positive definite function if it is symmetric and
\begin{align*}
\forall n\in\mathbb{N},\ \forall \{\g x_i\}_{i=1}^n \in \X^n,\ \forall \{a_i\}_{i=1}^n \in \R^n, \quad
\sum_{i=1}^n\sum_{j=1}^n a_i a_j K(\g x_i, \g x_j) \geq 0 .
\end{align*}
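The definition can be checked numerically on any finite set of points: the resulting Gram matrix must be positive semidefinite. The sketch below does this for the Gaussian kernel; the points, bandwidth and tolerance are arbitrary choices for the illustration.
```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))          # n = 5 points x_i in R^2
sigma = 1.0

# Gram matrix K_ij = K(x_i, x_j) for the Gaussian kernel
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-sq_dists / (2.0 * sigma ** 2))

# Positive definiteness of K (as a function) means this matrix is positive
# semidefinite: all eigenvalues are >= 0 (up to numerical tolerance), so
# a^T K a >= 0 for every coefficient vector a.
print(np.linalg.eigvalsh(K).min() >= -1e-10)   # True
```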
Let $\left(\H, \inner{\cdot}{\cdot}_{\H}\right)$ be a Hilbert space of real-valued functions on $\X$. A real-valued function $K$ on $\X^2$ is a reproducing kernel of $\H$ if and only if (i) $K(\g x, \cdot) \in \H$ for all $\g x\in\X$, and (ii) for all $\g x\in\X$ and all $f\in\H$, $\inner{f}{K(\g x, \cdot)}_{\H} = f(\g x)$ (the reproducing property).
A Hilbert space of real-valued functions which possesses a reproducing kernel is called a
reproducing kernel Hilbert space (RKHS).
Note that the reproducing property of $K$ implies in particular that
$$
\inner{K(\g x, \cdot)}{K(\g x^\prime, \cdot)}_{\H} = K(\g x, \g x^\prime) .
$$
The Moore-Aronszajn theorem states that for any kernel function $K$, there is one and only one RKHS with $K$ as its reproducing kernel.
A constructive argument for the existence part is sketched below, where the function class, inner product and norm of this RKHS are built from $K$; the uniqueness part of the proof is not reproduced here.
Properties of kernel functions
Besides the reproducing property, kernel functions can be regarded as generalized inner products; in particular, they satisfy the Cauchy-Schwarz inequality
$$
K(\g x, \g x^\prime)^2 \leq K(\g x, \g x) K(\g x^\prime, \g x^\prime) .
$$
Proof:
By the definition of a positive definite kernel $K$, the Gram matrix for the set of two points $\{\g x, \g x^\prime\}$,
$$
\g K = \begin{bmatrix} K(\g x, \g x) & K(\g x, \g x^\prime) \\
K(\g x^\prime, \g x) & K(\g x^\prime, \g x^\prime)
\end{bmatrix},
$$
is positive-semidefinite. As such, it has two nonnegative eigenvalues $\lambda_1$ and $\lambda_2$, and, using the symmetry of $K$, we have
$$
0\leq \lambda_1\lambda_2 = \det(\g K ) = K(\g x, \g x)K(\g x^\prime, \g x^\prime) - K(\g x^\prime, \g x) K(\g x, \g x^\prime) = K(\g x, \g x)K(\g x^\prime, \g x^\prime) - K(\g x, \g x^\prime)^2 .
$$
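This inequality is easy to verify numerically. The following sketch checks it for the Gaussian kernel at two arbitrary points (the points and the bandwidth are illustrative choices only).
```python
import numpy as np

def gaussian_kernel(x, xp, sigma=1.0):
    """Gaussian kernel K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    return float(np.exp(-np.sum((x - xp) ** 2) / (2.0 * sigma ** 2)))

x  = np.array([0.3, -1.2])
xp = np.array([1.0,  0.5])

# Kernel Cauchy-Schwarz: K(x, x')^2 <= K(x, x) K(x', x')
lhs = gaussian_kernel(x, xp) ** 2
rhs = gaussian_kernel(x, x) * gaussian_kernel(xp, xp)
print(lhs <= rhs)   # True
```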
Function class, inner product and norm
Let $K$ be a kernel function and $\left(\H, \inner{\cdot}{\cdot}_{\H}\right)$ the associated RKHS, i.e., $K$ is the reproducing kernel of $\H$.
Then, the class of functions $\H$ can be written as
$$
\H = \left\{ f\in\R^{\X} \ :\ f = \sum_{i=1}^\infty \beta_i K(\g x_i, \cdot) ,\ \beta_i \in\R,\ \g x_i\in\X,\ \|f\|_{\H} <+\infty \right\} .
$$
The RKHS norm $\|\cdot\|_{\H}$ appearing in this definition is the one induced as
$$
\left\|f \right\|_{\mathcal{H}} = \sqrt{\inner{f}{f}_{\H}}
$$
by the inner product defined for two functions, $f = \sum_{i=1}^{m} \alpha_i K(\g x_i, \cdot)$ and $g = \sum_{i=1}^{m^\prime} \beta_i K(\g x_i^\prime, \cdot)$, as
$$
\inner{f}{g}_{\H} = \sum_{i=1}^{m}\sum_{j=1}^{m^\prime} \alpha_i \beta_j K(\g x_i, \g x_j^\prime) .
$$
Positive-semidefiniteness is a direct consequence of the definition of a positive definite kernel function $K$:
$$
\inner{f}{f}_{\H} = \sum_{i=1}^{m}\sum_{j=1}^{m} \alpha_i \alpha_j K(\g x_i, \g x_j) \geq 0
$$
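As a concrete sketch of this inner product, the code below evaluates $\inner{f}{g}_{\H}$ for two finite kernel expansions and checks that $\inner{f}{f}_{\H} \geq 0$; the Gaussian kernel, the points and the coefficients are arbitrary choices for the illustration.
```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Cross Gram matrix G_ij = K(a_i, b_j) for the Gaussian kernel."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def rkhs_inner(alpha, X, beta, Xp, sigma=1.0):
    """<f, g>_H = sum_{i,j} alpha_i beta_j K(x_i, x'_j)."""
    return float(alpha @ gaussian_gram(X, Xp, sigma) @ beta)

rng = np.random.default_rng(0)
X,  alpha = rng.normal(size=(4, 2)), rng.normal(size=4)   # f = sum_i alpha_i K(x_i, .)
Xp, beta  = rng.normal(size=(3, 2)), rng.normal(size=3)   # g = sum_j beta_j K(x'_j, .)

print(rkhs_inner(alpha, X, beta, Xp))            # <f, g>_H
print(rkhs_inner(alpha, X, alpha, X) >= 0.0)     # <f, f>_H >= 0: True
```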
Positive-definiteness ($\inner{f}{f}_{\H} = 0 \Rightarrow f \overset{?}{=} 0$): First, we have
$$
\forall \g x\in\X,\quad \inner{f}{K(\g x, \cdot)}_{\H} = \sum_{i=1}^{m}\alpha_i K(\g x_i, \g x) = f(\g x)
$$
which shows that $K$ is a reproducing kernel of $\H$, and that
$$
f(\g x)^2 = \inner{f}{K(\g x, \cdot)}_{\H}^2
$$
Then, note that by the linearity of $\inner{\cdot}{\cdot}_{\H}$ and its positive semidefiniteness proved just above, $\inner{\cdot}{\cdot}_{\H}$ is a positive definite kernel function defined over $\H^2$:
\begin{align}
\forall n\in\mathbb{N},\ \forall \{f_i\}_{i=1}^n \in \H^n,\ \forall \{a_i\}_{i=1}^n \in \R^n, \quad
\sum_{i=1}^n\sum_{j=1}^n a_i a_j \inner{f_i}{f_j}_{\H} &= \inner{\sum_{i=1}^n a_i f_i}{\sum_{j=1}^n a_j f_j}_{\H} \\
&= \inner{h}{h}_{\H} \quad \text{with } h = \sum_{i=1}^n a_i f_i \in \H \\
&\geq 0.
\end{align}
Therefore, it satisfies the Cauchy-Schwarz inequality, $\inner{f}{g}_{\H}^2 \leq \inner{f}{f}_{\H}\inner{g}{g}_{\H}$, and we obtain
$$
f(\g x)^2 = \inner{f}{K(\g x, \cdot)}_{\H}^2 \leq \inner{K(\g x, \cdot)}{K(\g x, \cdot)}_{\H}\inner{f}{f}_{\H} = K(\g x, \g x)\inner{f}{f}_{\H}
$$
Thus,
$$
\inner{f}{f}_{\H} = 0\ \Rightarrow \ f(\g x)^2 = 0 \ \Rightarrow \ f(\g x) = 0 \quad \forall \g x\in\X,
$$
which implies that $f$ is the zero function on $\X$: $f=0$.
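The key bound $f(\g x)^2 \leq K(\g x, \g x)\inner{f}{f}_{\H}$ used in this argument can also be checked numerically for a finite expansion. The sketch below does so with the Gaussian kernel; the points, coefficients and evaluation point are arbitrary choices for the illustration.
```python
import numpy as np

def gaussian_gram(A, B, sigma=1.0):
    """Cross Gram matrix G_ij = K(a_i, b_j) for the Gaussian kernel."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

rng = np.random.default_rng(1)
X, alpha = rng.normal(size=(5, 2)), rng.normal(size=5)   # f = sum_i alpha_i K(x_i, .)
x = rng.normal(size=2)                                   # evaluation point in R^2

k_Xx    = gaussian_gram(X, x[None, :])[:, 0]             # vector of K(x_i, x)
f_x     = float(alpha @ k_Xx)                            # f(x) = sum_i alpha_i K(x_i, x)
norm_sq = float(alpha @ gaussian_gram(X, X) @ alpha)     # <f, f>_H
K_xx    = 1.0                                            # K(x, x) = 1 for the Gaussian kernel

print(f_x ** 2 <= K_xx * norm_sq)   # True
```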
Notes
The definition of positive definite kernel functions given here might be confusing, since it relies on the positive-semidefiniteness of the resulting Gram matrices rather than their positive-definiteness. However, this terminology is the most common in the machine learning literature.