Home Projects and Code

Gaussian Random Field Theory

Job Feldbrugge

Gaussian random fields occur on many branches of physics but are in particular crucial to cosmology. Current theories for the early universe predict a nearly homogeneous universe with tiny energy fluctuations

δ(q)=ρ(q)ρ¯1

with the energy density ρ and the mean density ρ¯, observable in the cosmic microwave background radiation field. After the epoch of recombination, these fluctuations gravitationally collapse to form the cosmic structure we observe today in cosmological redshift surveys. The seed for the structure in our universe turns out to be very close to a realization of a Gaussian random field with a nearly scale invariant power spectrum. I will here summarize a number of key techniques and results in Gaussian random field theory.

Gaussian random fields

Theories for origin of our universe, describe the density perturbation δ:RdR on the space Rd in terms of random fields. Instead of predicting a particular distribution of the energy in the early universe, the theories describe its statistical properties. We will here restrict attention to initial density fluctuations considered as a realization of a Gaussian random field.

The definition

First, note that the density perturbation has zero mean δ=ρ(q)/ρ¯1=0. When δ is a realization of a stationary Gaussian random field, the density perturbation at a point δ(q) is normally distributed. The spatial correlations are fully charaterized by the two-point correlation function ξ(q1,q2)=δ(q1)δ(q2). As it turns out, a random field with a Gaussian PDF with zero mean, which is fully charaterized by its two-point correlation function ξ(q1,q2), is uniquely described by the Gaussian distribution p(δ)=e12δ(q1)K(q1,q2)δ(q2)dq1dq2, with the kernel K defined as the inverse of the two-point correlation function K(q1,q)ξ(q,q2)dq=δD(d)(q1q2), with δD(d) the d-dimensional Dirac delta function. Given a set of fields S, the probability that δS is given by the functional integral P[δS]=Se12δ(q1)K(q1,q2)δ(q2)dq1dq2Dδ with Dδ the path integral measure. The expectation value of some functional Q of δ is given by the integral Q[δ]=Q[δ]e12δ(q1)K(q1,q2)δ(q2)dq1dq2Dδ.

The random field at a finite set of points

When we set S restricts the density perturbation at a finite number of points qi for i=1,2,,n, we can write the distribution as a multi-dimensional Gaussian distribution p(δ(q1),,δ(qn))=exp[12i,jnδ(qi)(M1)ijδ(qj)](2π)ndetM with the covariance matrix Mij=ξ(qi,qj). Linear statistics of the random field density perturbation Y=(Y1,,Yn) follow an analogous disbribution p(Y)=exp[12(YY¯)TMY1(YY¯)](2π)ndetMY with the mean Y¯=Y and the covariance matrix MY=YTY.

The cosmological principle

It follows from the cosmological principle that the initial conditions of our universe are statistically homogeneous (all points are statistically equivalent) and statistically isotripic (all directions are statistically equivalent). By the homogeneity, the two-point correlation function to only depend on the distance between the points, ξ(q1,q2)=ξ(q1q2). By the isotropy, the two-point correlation function is in addition only sensitive to the norm of the distrance, ξ(q1,q2)=ξ(q1q2). In these notes, I will always assume statistical homogeneity and isotropy.

Gaussian random fields in Fourier space

Gaussian random fields are conveniently described in terms of Fourier space. Given the Fourier transform δ^(k)=δ(q)eikqdq, and the inverse transformation δ(q)=δ^(k)eikqdk(2π)d, the two-point correlation function of the Fourier modes is diagonal δ^(k1)δ^(k2)=ei(k1q1k2q2)δ(q1)δ(q2)dq1dq2=ei(k1q1k2q2)ξ(q1q2)dq1dq2=ei(k1(q1q2)+(k1k2)q2)ξ(q1q2)dq1dq2=ei(k1k2)qdqeik1rξ(r)dr=(2π)dδD(d)(k1k2)eik1rξ(r)dr=(2π)dδD(d)(k1k2)P(k1), with the change of coordinates r=q1q2 and q=q2 and where the power spectrum P(k) describes the amplitude corresponding to Fourier modes. Note that we use identity

eikqdq=(2π)dδD(d)(k).

The power spectrum is the Fourier transform of the correlation function. For two-dimensional random fields P(k)=eikrξ(r)dr=0[02πeikrcosθdθ]ξ(r)rdr=2π0rJ0(kr)ξ(r)dr, with J0 the Bessel function of the first kind and k=k, using the polar coordinate measure dr=rdrdθ. For three-dimensional random fields P(k)=eikrξ(r)dr=0[02πeikrcosφdφ0πsinθdθ]ξ(r)r2dr=4π0r2J0(kr)ξ(r)dr, using the spherical coordiante measure dr=r2sinθdrdθdφ.

Not only are the two-point correlations of the Fourier modes diagonal, the Fourier modes are independent. Since the Fourier modes fully determine the real-space fields, we obtain the distribution of the Fourier modes p(δ^)=exp[12δ(q1)K(q1q2)δ(q2)dq1dq2]=exp[12[δ^(k1)eiq1k1dk1(2π)d]K(q1q2)[δ^(k2)eiq2k2dk2(2π)d]dq1dq2]=exp[12δ^(k1)[eiq1k1+iq2k2K(q1q2)dq1dq2]δ^(k2)dk1(2π)ddk2(2π)d]=exp[12δ^(k1)[ei(q1q2)k1+iq2(k2k1)K(q1q2)dq1dq2]δ^(k2)dk1(2π)ddk2(2π)d]=exp[12δ^(k1)[(2π)dδD(d)(k1k2)K(r)eirk1dr]δ^(k2)dk1(2π)ddk2(2π)d]=exp[12|δ^(k)|2K^(k)dk(2π)d].

Using the definition of the kernel can be written in terms of Fourier space using the convolution theorem δD(d)(q1q2)=K(q1q)ξ(qq2)dq=K(q1q)P(k)eik(qq2)dk(2π)ddq=[K(q1q)eik(q1q)dq]P(k)eik(q1q2)dk(2π)d=K^(k)P(k)eik(q1q2)dk(2π)d, yielding K^(k)=1/P(k) for all k. Substituting the Fourier transform of the kernel into the distribution, we obtain the simple expression p(δ^)=exp[|δ^(k)|22P(k)dk(2π)d]. This demonstrates that the amplitudes of the Fourier modes |δ^| are normally distributed with the standard deviation P(k) and the arguments arg(δ^) are uniformly distributed. Furthermore note that the Fourier modes are independent modulo the reality condition δ^(k)=[δ(q)eikqdq]=δ(q)eikqdq=δ^(k) with δ(q)=δ(q).

Generating realizations

In the analysis above, we observe that Gaussian random fields are most easily expressed in terms of Fourier space as the exponent of the distribution is diagonal in δ^. Note that the Fourier transform of a real-valued function δ:RdR satisfies the condition δ^(k)=δ^(k).

Generating a realization of a Gaussian random field on a regular lattice {qi1,,id}, with the values δi1,,id=δ(qi1,,id), can thus be reduced to drawing the normally distributed Fourier modes δ^i1,,id with the reality condition δ^i1,,id=δ^i1,,id. The realization {δi1,,id} are can be efficiently evaluated with the inverse fast Fourier transform.

These observations allow for an efficient algorithm for the generation of unconstrained Gaussian random fields:

  1. White noise: Generate the set of identically distributed random numbers ni1,,idN(0,1).
  2. Fourier transform the noise: Fast Fourier transform the realization of the white noise {n^i1,,id}=FFT[{ni1,,id}]. The Fourier transform consists of complex random numbers with a Gaussian amplitude N(0,1), a uniformly distributed argument, satisfying the reality condition n^i1,,id=n^i1,,id.
  3. Scale the Fourier modes: Define the Fourier modes δ^i1,,id=P(ki1,,id)n^i1,,id. The modes δ^i1,,id are independent complex random numbers of which the amplitudes are normally distributed |δ^i1,,id|N(0,P(ki1,,id)) satisfying the reality condition δ^i1,,id=δ^i1,,id.
  4. Transform to real-space: Use the inverse Fourier transform to generate the Gaussian random field {δi1,,id}=FFT1({δ^i1,,id}).

Generating constrained realizations

When setting up initial conditions for N-body simulations, it often suffices to construct an unconstrained Gaussian random fields. However, when we want to study the formation of a specific geometric feature it is most efficient to construct a constrained realization which satisfies a set of linear conditions. We will here follow the Hoffman-Ribak method following the notation of Weygaert and Bertschinger (1996).

We first define the set of M linear constraints Γ={CiCi[δ;qi=ci|i=1,,M}, at the points qi and values ciR for i=1,,M. Note that the linear constraints are either the function value C[δ;qi]=δ(qi), a derivative C[δ;qi]=δqi(qi), or a convolution with some kernel g, C[δ;qi]=g(qiq)δ(q)dq.

Now using Bayes formula, we write the conditional distribution p(δ|Γ)=p(δ,Γ)p(Γ)=p(δ)p(Γ), where p(δ,Γ)=p(δ) since Γ is a linear functional of δ. Since the constraints Ci are linear in terms of the Gaussian random field, the distribution p(Γ) is again Gaussian, p(Γ)=exp[12CTQ1C](2π)MdetQ, with the vector C=(C1,,CM) and the covairance matrix Q=CTC. The constrained distribution thus takes the form p(δ|Γ)=(2π)MdetQ e12(δ(q1)K(q1q2)δ(q2)dq1dq2CQ1C).

It is worthwile to simplify this expression a bit further. We can write the constraint as Ci(qi)=δD(d)(q2)Ci(qiq2)dq2=ξ(q1)K(q1q2)Ci(qiq2)dq1dq2=C^i(k)P(k)K^(k)eikqidk(2π)d, with the power spectrum P(k), the Fourier transforms K^ and C^i of the kernel K and the constraint Ci. If we now define the function P^i(k) by (2π)dP^i(k1)δD(d)(k1k2)=C^i(k1)δ^(k2) we can obtain a relation between δ^ and C^i, C^i(k)=P^i(k)P(k)δ^(k). Substituting this equation in the constraint Ci, we obtain Ci(qi)=δ^(k)P^i(k)K^(k)eikqidk(2π)d=δ^(k1)K^(k1)C^i(k1)δ^(k2)eik1qidk1(2π)ddk2(2π)d=δ(q1)K(q1q2)ξi(q2)dq1dq2, where we define ξi as the Fourier transform of P^i, ξi(q)=δ(q)Ci(qi).

Let ξij be the (ij)th element of Q. The identity for the constraint can be used to write Ciξij1Cj=δ(q1)K(q1q2)δ¯(q2)dq1dq2, where we have the mean field δ¯(q)=ξi(q)ξij1cj. We can use this to write the distribution as
p(δ|Γ)=(2π)MdetQe12F(q1)K(q1q2)F(q2)dq1dq2, where F is the deviation of δ with respect to the constrained mean field δ¯(q)=δ(q|Γ. As a consequence, the residue F is again a Gaussian random field.

Generating realizations of a constrained Gaussian random field has thus been reduced to constructing a realization of the residue. At first sight, this seems difficult since the residue vanishes at the constraints. However, Hoffmann and Ribak (1991) showed that the residue has an additional property which can be used to generate the residue out of an unconstrained realization. As it turns out, the statistical properties of the residue are independent of the values ci and only depend on the constraints Ci. This leads to the following method:

  1. Generate an unconstrained Gaussian random field δ~ with the given power spectrum.
  2. Evaluate the constriants on the realization Ci[δ~]=c~i.
  3. Construct the mean field corresponding to the values c~i and evaluate the residue F=δ~ξi(q)ξij1c~j.
  4. Add the residue F to the mean field δ¯ to obtain the constrained realization δ=F+ξi(q)ξij1cj.

Geometric statistics

In 1936 Stephen Rice, while working at Bell-Labs, developed a framework to calculate the number densities of points in one-dimensional random fields. In the subsequent years, the results were extended to multi-dimensional random fields.

Rice’s formula: Consider set of points Λ(λ1,,λd)={qRd|fi[δ](q)=λi} with the d smooth conditions fi[δ]:RdR of the function δ:RdR and λiR, with i=1,,d. For convenience we drop the the dependence on δ from the notation, i.e., fi=fi[δ] . When δ is a homogeneous random field and the set Λ(λ1,,λd) amost always consists of only isolated points, the number density N(λ1,,λd) of points in Λ(λ1,,λd) is equal to the expectation value N(λ1,,λd)=|det[fjqi]i,j=1,,d|i=1dδD(d)(fi(q)λi).

Proof: Given a continuous and differentiable realization of a the random field, construct the generalized function n(q)=rΛ(λ1,,λd)δD(d)(qr). The number of points in Λ(λ1,,λd) for this realization is given by the integral n(q)dq. Now, since the points in Λ(λ1,,λd) are isolated, every point rΛ(λ1,,λd) has a open set Ur containing only this point. We can write the number of points in Λ(λ1,,λd) as n(q)dq=rΛ(λ1,,λd)δD(d)(qr)dq=rΛ(λ1,,λd)UrδD(d)(qr)dq=rΛ(λ1,,λd)Ur|det[ifj(q)]i,j|i=1dδD(1)(fi(q)λi)dq using the transformation of the Dirac delta function UδD(d)(qf1(λ))dq=UδD(d)(f(q)λ)|detf|dq, for a smooth vector-valued function f:RdRd and λRd assuming f(q)=λ for only a single point qU. As the integrand has only support for the point r we can write the sum of integrals over Ur as a single integral over Rd. Using these identities, we obtain the number density of points in Λ(λ1,,λd) as the expectation value N(λ1,,λd)=Ln(q)dqLdq=L|det[ifj(q)]i,j|i=1dδD(1)(fi(q)λi)dqLdq=|det[fjqi]i,j=1,,d|i=1dδD(d)(fi(q)λi) in a regulating box LRd in the limit LRd, since the expectation value is independent of the position by the statistical homogeneity.

To illustrate Rice’s formula, consider a few examples:

Example 1: The number density of level crossings δ=ν in a one-dimensional random field δ correspond to the condition f1[δ]=δ,λ1=ν. Using Rice’s formula we obtain N(ν)=|δ(q)|δD(1)(δ(q)ν)=|δ(q)|p(δ=ν,δ) dδ. We can select for up- and down-crossings by restricting the integral over δ to [0,) and (,0].

Example 2: The number density of critical points in a one-dimensional random field δ at function value νR. At a critical point, the first order derivative vanishes, i.e., consider the conditions f1[δ]=δ and λ1=0. The number density thus takes the form N(ν)=|δ(q)|δD(1)(δ(q)ν)δD(1)(δ(q))=|δ|p(δ=ν,δ=0,δ) dδ. We can refine to the number density of maxima and minima by restricting the integral over δ to the intervals (,0] and [0,).

Example 3: The number density of critical points in a two-dimensional random field δ at function value νR. At the critical point, the first order derivatives vanish, i.e., consider the conditions f1[δ]=1δ,f2[δ]=2δ and λ1=λ2=0. The number density thus takes the form N(ν)=|det(12δ12δ12δ22δ)|δD(1)(δν)δD(1)(1δ)δD(1)(2δ)=|12δ 22δ(12δ)2|p(δ=ν,δ=0,12δ,12δ,22δ) d(12δ) d(12δ) d(22δ). We can refine to the number density of maxima, minima, and saddle points by restricting the integration domain to the second order derivatives for which the Hessian H as either two negative eigenvalues, two positive eigenvalues, or one negative and one positive eigenvalues, with the eigenvalues λ±=12[12δ+22δ±4(12δ)2+(12δ22δ)2].

Distribution of eigenvalue fields

Given a Gaussian random field δ:RdR, we evaluate the distribution of the eigenvalues of the Hessian for d=2 and d=3 dimensions. This is a famous result known as the Doroshkevich formula.

The Hessian of a random field is fully determined by its second order derivatives Tij=2δqiqj. In the two-dimensional case, the Hessian of the δ is determined by three degrees of freedom Y=(T11,T22,T12). The eigenvalues with λ1λ2 take the form λ1=12(T11+T22+4T122+(T11T22)2),λ2=12(T11+T224T122+(T11T22)2), while the eigenvector matrix can be parametrized by the rotation matrix V=(cosωsinωsinωcosω) for the orientation ω(π/2,π/2].

The derivatives are linear statistics of the random field. The statistic Y is thus normally distributed. The covariance matrix is of the form M=σ228(310130001) with the moment σ22=k4P(k)dk.

The distribution of the statistic thus takes the form p(T11,T22,T12)=22π3/2σ23e3k128k22σ22 with the rotationally invariant combinations k1Tr(H)=T11+T22=λ1+λ2,k2det(H)=T11T22T122=λ1λ2. The invariance of the exponent under rotations follows from the isotropy of the random field.

To obtain the distribution for the eigenvalue fields, we express the linear statistic Y in terms of the eigenvalues and the orientation. The equation (T11T12T12T22)=V(λ100λ2)VT leads to the correspondence (T11,T22,T12)=(λ1cos2ω+λ2sin2ω,λ1sin2ω+λ2cos2ω,(λ2λ1)cosωsinω).

The measure transforms as dT11dT22dT12=(λ1λ2)dλ1dλ2dω, leading to the distribution

p(λ1,λ2)=22π1/2σ23e3k128k22σ22(λ1λ2).

The distribution of the ordered eigenvalues (λ1λ2λ3) of a three-dimensional Gaussian random field, first derived in 1970 by Doroshkevich, follow a similar distribution p(λ1,λ2,λ3)=153165π3e3(2k125k2)2σ22(λ1λ2)(λ1λ3)(λ2λ3), with the rotationally invariant forms k1λ1+λ2+λ3,k2λ1λ2+λ1λ3+λ2λ3.