
2. Why Do We Need the Renormalization Group?

Introduction - Multi-Scale Fluctuations, the Universality Paradox, and the Core Challenge

Lecture 1 briefly introduced what the renormalization group (RG) is, its historical development, and modern applications. Before we dive into the mathematical machinery of the renormalization group, we need to answer a more fundamental question:

Why is renormalization unnecessary for most physics problems, yet indispensable at the critical point?

In the development of statistical physics and quantum field theory, the proposal of the renormalization group marked a fundamental shift in how physics understands the concept of "scale." Its introduction was not merely a mathematical trick for handling divergences, but the only logically consistent framework for solving the "multi-scale strong coupling" problem in many-body systems. When traditional mean field theory fails at the critical point, the renormalization group systematically reduces degrees of freedom and builds a bridge connecting microscopic interactions to macroscopic universal behavior.

The Foundation of Macroscopic Simplicity - The Separation of Scales Assumption

The reason physics can describe the macroscopic world with simple equations is largely due to an implicit but crucial premise—separation of scales. This assumption forms the cornerstone of standard statistical mechanics.

Under normal circumstances, there is a huge orders-of-magnitude gap between the microscopic scale of a physical system (e.g., atomic spacing \(a \sim 10^{-10}\) m) and the macroscopic observation scale (e.g., a laboratory container \(L \sim 1\) m).

  • Microscopic level: Particles undergo violent, random thermal motion, and their mutual influence is usually confined to neighboring particles.

  • Macroscopic level: Thermodynamic quantities (such as pressure, density) are the statistical average of a vast number of microscopic degrees of freedom.

According to the central limit theorem in probability theory, as long as microscopic fluctuations are local and lack long-range correlations, these local random noises cancel out (self-averaging) when averaged over macroscopic scales. Thus, hydrodynamic equations can accurately describe the macroscopic behavior of water flow without explicitly including microscopic thermal motion terms of water molecules. In this physical picture, microscopic details and macroscopic behavior are dynamically "decoupled"—microscopic randomness versus macroscopic determinism.
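
This self-averaging argument can be checked directly. The short sketch below (a generic numerical illustration, not tied to any particular material) averages N independent local fluctuations and shows that the fluctuation of the macroscopic average shrinks like \(1/\sqrt{N}\):

import numpy as np

rng = np.random.default_rng(0)

# Fluctuation of a macroscopic average over N independent microscopic variables:
# by the central limit theorem it shrinks as 1/sqrt(N).
for N in (100, 10_000, 1_000_000):
    means = np.array([rng.normal(size=N).mean() for _ in range(200)])
    print(f"N = {N:>9}: std of the average = {means.std():.2e}, 1/sqrt(N) = {N ** -0.5:.2e}")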

The Disaster at the Critical Point - Divergence of the Correlation Length and All-Scale Coupling

However, when a physical system approaches the critical point of a continuous phase transition (second-order phase transition), such as the Curie temperature \(T_c\) of a ferromagnet, the picture of "separation of scales" completely collapses.

The key parameter describing the fluctuation properties of the system is the correlation length (\(\xi\)), which characterizes the typical distance over which a perturbation at one point can affect surrounding regions. Near the critical point, the correlation length is no longer confined to the microscopic scale but grows sharply as a power law as the temperature approaches the critical temperature, theoretically diverging to infinity:

\[ \xi \propto |T - T_c|^{-\nu} \to \infty \]
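For a rough sense of scale (taking the exactly known two-dimensional Ising value \(\nu = 1\) and a prefactor of order the atomic spacing \(a \sim 10^{-10}\) m; the numbers are purely illustrative):

\[ \xi \sim a\left|\frac{T-T_c}{T_c}\right|^{-1}, \qquad \frac{|T-T_c|}{T_c}=10^{-3} \;\Rightarrow\; \xi \sim 10^{3}a \approx 0.1\ \mu\text{m}, \qquad \frac{|T-T_c|}{T_c}=10^{-6} \;\Rightarrow\; \xi \sim 10^{6}a \approx 0.1\ \text{mm}. \]

Even a tiny relative distance from \(T_c\) thus produces correlated regions that are enormous compared with the atomic scale.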

\(\xi \to \infty\) brings profound physical consequences:

1. Establishment of long-range correlations: Fluctuations at any point in the system are no longer isolated events; instead, they trigger a "domino effect" through the interaction network, affecting regions at macroscopic distances.

2. Entanglement of scales: Fluctuation modes of all possible length scales emerge in the system. Large-scale fluctuation regions contain medium-scale fluctuations inside, which in turn contain small-scale fluctuations nested within. This structure has fractal characteristics, i.e., self-similarity.

At this point, microscopic details are no longer "averaged out," but are directly coupled to macroscopic physical quantities through layer upon layer of amplified long-range correlations. Traditional mean field theory essentially assumes that fluctuations are perturbative and negligible, and therefore completely fails when dealing with such strong correlations spanning all scales. Physics urgently needs a new mathematical language capable of simultaneously handling collective fluctuation behavior on all scales from the microscopic lattice constant \(a\) to the macroscopic correlation length \(\xi\).

1. The Renormalization Group's Response - Stepwise Reduction of Degrees of Freedom

Facing \(10^{23}\) mutually entangled degrees of freedom that cannot be eliminated by simple averaging, the renormalization group adopts a strategy of "divide and conquer, iteratively eliminate." Its core idea is not to solve the entire partition function at once, but to extract the low-energy physical properties of the system by establishing mapping relationships between effective theories at different scales—mathematically, this is essentially mode decomposition in path integrals.

The partition function is the total statistical weight of all possible states of a system. Once we can write it down, we can obtain every statistical-physics property of the system (free energy, entropy, energy, fluctuations, correlations, critical behavior…). The essence of RG, therefore, is to study how the partition function behaves at different scales. The path integral is the analogous object for dynamics: it sums, with appropriate weights, over all possible evolution trajectories, describing every way the system can move through the time dimension.

Thus, the partition function sums over all "static configurations," while the path integral sums over all "dynamic trajectories." The path integral is more general than the partition function; the partition function can be viewed as a special case of the path integral. If we compress time to \(t=0\), the path integral degenerates into an ordinary partition function.

The central task of statistical physics is to compute the partition function \(Z\), which contains the statistical weights of all possible microscopic configurations of the system:

\[ Z = \int \mathcal{D}\phi \, e^{-H[\phi]/k_BT} \]

Here \(\phi(x)\) is the field variable (e.g., local magnetization), and \(H[\phi]\) is the system's Hamiltonian. Since \(\phi\) contains fluctuation modes at all scales, direct integration is like trying to simultaneously track all waves in the ocean from tiny ripples to giant tsunamis—mathematically intractable.

The fundamental reason why the partition function in statistical physics is almost never analytically solvable comes down to three points. First, it requires integration over an infinite-dimensional function space: the variables are not a finite set of coordinates but a continuous field \(\phi(x)\) defined over all space, so the integral sums over every possible fluctuation pattern (every spatial point, every possible value, every fluctuation form, every frequency/wavevector, every combination of modes). Second, the system's Hamiltonian usually contains nonlinear coupling terms such as \(\phi^4\), which strongly couple fluctuation modes at different scales, so the integral cannot be done with linear algebra as in a Gaussian theory. Third, near the critical point the correlation length diverges, fluctuations span all scales, and the system enters the strong coupling regime, where there is no small parameter for a perturbative expansion and conventional series expansions fail completely. Together, these three points make the partition function mathematically intractable, and this is precisely why the renormalization group must enter the scene.
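
To make the combinatorial blow-up concrete, the toy sketch below (an illustration only: it uses a tiny nearest-neighbor Ising lattice instead of a continuous field) computes Z exactly by enumerating every configuration; the cost grows as \(2^N\) and is already hopeless for a few dozen spins, let alone \(10^{23}\):

import itertools
import numpy as np

def exact_partition_function(L, T=2.27, J=1.0):
    """Brute-force partition function of an L x L Ising lattice with periodic boundaries."""
    N = L * L
    Z = 0.0
    for bits in itertools.product((-1, 1), repeat=N):      # 2**N configurations
        s = np.array(bits).reshape(L, L)
        # Nearest-neighbor energy; each bond counted once via down/right shifts.
        E = -J * np.sum(s * np.roll(s, 1, axis=0) + s * np.roll(s, 1, axis=1))
        Z += np.exp(-E / T)
    return Z

for L in (3, 4):
    print(f"L={L}: {2**(L*L):>6} configurations, Z = {exact_partition_function(L):.4e}")
# Already at L=6 the sum would contain 2**36 (about 7e10) terms; a macroscopic sample has ~10**23 spins.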

The reason the renormalization group transformation can handle the intractability of the original partition function is that it breaks the problem into a series of steps that are easier to analyze. Although the three steps below are stated with some mathematical formality, their physical meaning can be understood quite intuitively.

Step 1 - Decomposition

The field variable \(\phi(x)\) contains fluctuations of various scales in space: both fine and rapidly varying structures, and broad and slowly undulating patterns. To handle contributions from different scales separately, we can decompose \(\phi\) in Fourier space according to momentum (the inverse of wavelength) into two parts:

  • Fast modes \(\phi_>\): Correspond to high-momentum, short-wavelength components. These components reflect the rapid jittering and detailed changes of the system at microscopic scales.

  • Slow modes \(\phi_<\): Correspond to low-momentum, long-wavelength components, describing smoother, more slowly varying structures at large scales.

This decomposition can be expressed as:

\[ \phi(x) = \phi_<(x) + \phi_>(x) \]

Through this step, we can distinguish the physical behavior at different scales, allowing subsequent operations to focus on their respective contributions.
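
A minimal numerical sketch of this decomposition (a synthetic one-dimensional field and a sharp cutoff at half the maximum wavevector; both choices are purely illustrative):

import numpy as np

rng = np.random.default_rng(1)

# A synthetic field: a smooth long-wavelength profile plus short-wavelength noise.
N = 256
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
phi = np.sin(2 * x) + 0.3 * rng.normal(size=N)

# Decompose in Fourier (momentum) space with a sharp cutoff k_cut.
phi_k = np.fft.fft(phi)
k = np.fft.fftfreq(N, d=x[1] - x[0]) * 2 * np.pi    # wavevectors
k_cut = 0.5 * np.abs(k).max()

slow_k = np.where(np.abs(k) <= k_cut, phi_k, 0)     # slow modes phi_<  (|k| <= k_cut)
fast_k = np.where(np.abs(k) >  k_cut, phi_k, 0)     # fast modes phi_>  (|k| >  k_cut)

phi_slow = np.fft.ifft(slow_k).real
phi_fast = np.fft.ifft(fast_k).real

# The decomposition is exact: phi = phi_< + phi_>
print("max |phi - (phi_< + phi_>)| =", np.max(np.abs(phi - (phi_slow + phi_fast))))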

Step 2 - Coarse-Graining

After completing the mode decomposition, we keep the slow modes \(\phi_<\) unchanged and integrate over the fast modes \(\phi_>\) (i.e., average over these microscopic modes). This integration does not describe the time path of particle motion, but is a statistical "weighted sum over all possible microscopic fluctuations."

The integration process (for brevity, the factor \(1/k_BT\) is absorbed into \(H\) from here on):

\[ Z = \int \mathcal{D}\phi_< \left[ \int \mathcal{D}\phi_> \, e^{-H[\phi_< + \phi_>]} \right] = \int \mathcal{D}\phi_< \, e^{-H'[\phi_<]} \]

The physical meaning can be understood as:

  • Microscopic short-wavelength fluctuations, while affecting details, usually do not play a decisive role in large-scale properties;

  • By "averaging out" these short-wavelength fluctuations, we obtain a new energy functional \(H'[\phi_<]\);

  • \(H'\) no longer depends on microscopic details, but still retains their influence on macroscopic behavior.

In other words, coarse-graining simplifies a complex multi-scale system into an effective theory that only concerns large-scale structure, while the parameters implicitly record the contributions from the removed scales.
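
To see concretely why the fast-mode integral is tractable in the simplest case, and where the difficulty comes from, consider the standard Gaussian (free) theory written in Fourier modes (a textbook illustration, with the field cut off at momentum \(\Lambda\)):

\[ H[\phi] = \frac{1}{2}\int_{|k|<\Lambda} \frac{d^dk}{(2\pi)^d}\,\bigl(t + k^2\bigr)\,|\phi_k|^2 . \]

Because modes with different \(k\) do not couple, the integral over the fast shell \(\Lambda/b < |k| < \Lambda\) factorizes into independent Gaussian integrals and contributes only an overall constant:

\[ \int \mathcal{D}\phi_>\, e^{-H[\phi_< + \phi_>]} = \mathrm{const} \times e^{-H[\phi_<]} \quad\Longrightarrow\quad H'[\phi_<] = H[\phi_<] + \mathrm{const}. \]

Once a \(u\phi^4\) term is present, cross terms such as \(\phi_<^2\phi_>^2\) appear, the factorization is lost, and integrating out \(\phi_>\) shifts the couplings \(t\) and \(u\); this is exactly where the nontrivial content of the RG enters.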

Step 3 - Rescaling

After coarse-graining, the minimum resolution scale of the field has grown (effectively from \(a\) to \(ba\)). To make the new theory \(H'\) formally comparable to the original Hamiltonian \(H\), we perform a scaling transformation on the coordinates and field amplitudes:

\[ x' = x/b, \quad \phi' = \zeta \phi \]

The factor \(b > 1\) is the ratio by which short-distance scales were removed during coarse-graining, and the coefficient \(\zeta\) resets the normalization of the field amplitude. Through such rescaling, the new theory is formally pulled back to the original reference frame, so that physical descriptions at different scales can be compared on the same footing.
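
As a standard illustration of how \(\zeta\) is fixed (the Gaussian/Landau case; sign and normalization conventions vary between textbooks), demand that the gradient term keep a unit coefficient after the transformation:

\[ \int d^dx\,(\nabla\phi_<)^2 \;\xrightarrow{\;x = b x'\;}\; b^{\,d-2}\,\zeta^{-2}\int d^dx'\,(\nabla'\phi')^2 \quad\Longrightarrow\quad \zeta = b^{(d-2)/2}. \]

With this choice, the couplings of the \(\phi^2\) and \(\phi^4\) terms rescale as

\[ t' = b^{2}\,t, \qquad u' = b^{4-d}\,u, \]

so \(t\) always grows under the RG (it is relevant), while \(u\) grows below four dimensions and shrinks above; this is the origin of the special role played by \(d = 4\) in critical phenomena.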

These three steps are repeated in cycles, forming the RG flow. The system parameters are continuously updated in this process, ultimately revealing the physical structure at different scales, especially the universal behavior near the critical point.

The core idea of the renormalization group is: the behavior of a system at different observation scales is not entirely the same, and this difference that appears as scale changes can be described by "evolution of parameters." The renormalization group transformation in momentum space not only decomposes and coarse-grains the field variables, but also causes the parameters in the Hamiltonian to update with scale, yielding a set of new effective parameters. By repeatedly iterating this process, the trajectory of parameter changes forms what is called the "RG flow."

Flow in Parameter Space and the Emergence of Universality

After one renormalization group transformation, the effective Hamiltonian \(H'\) usually retains the structural form of the original theory, such as the common Ginzburg–Landau type expression. Although the form remains the same, the coupling parameters within (such as the reduced temperature \(t\), interaction strength \(u\), etc.) are updated:

\[ \{t, u, \dots\} \xrightarrow{\text{RG transformation}} \{t', u', \dots\} \]

In other words, changing the scale does not produce an entirely new physical form, but gradually changes the numerical values of the parameters. Thus we see that scale not only determines how we observe, but also affects the effective range of system parameters.

RG Flow

Viewing all possible parameter combinations in the Hamiltonian as a high-dimensional space, we can call it "theory space." In this space, the renormalization group transformation defines a set of rules for updating parameters as scale changes. The change of parameters is no longer an isolated process, but forms a trajectory that evolves continuously with scale; this trajectory is called the RG flow.

This picture reveals a profound idea: physical description depends on observation scale. As the scale continuously changes, the system's position in theory space also continuously moves. This movement is not random, but is controlled by the renormalization group equations.

Fixed Points and Scale Invariance

During the process of parameter evolution with scale, some special points appear. For these points, even after one renormalization group transformation, the system's parameters remain unchanged. If we denote the transformation as \(\mathcal{R}\), these points satisfy:

\[ H^* = \mathcal{R}H^* \]

Such points are called fixed points.

Physical meaning of fixed points: A fixed point represents a state with scale invariance: even if the system is magnified or reduced by any factor, its statistical characteristics remain completely the same. This feature corresponds precisely to the behavior of the correlation length \(\xi \to \infty\) at the critical point, because in this state there is no specific dominant scale.

The origin of universality: In different actual physical systems, the details of microscopic interactions may differ greatly. For example, liquid–gas systems and ferromagnets have completely different structures at the fundamental particle level. But under the action of the RG flow, their parameters may converge along different paths to the same fixed point. Since the properties of the fixed point are determined only by a few factors such as symmetry and spatial dimension, the macroscopic critical exponents also depend only on these universal features, not on specific microscopic structure. The resulting phenomenon is universality.
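
The flow-to-a-fixed-point picture can be played with numerically. The sketch below iterates a toy version of the one-loop recursion relation for the quartic coupling of a \(\phi^4\) theory in \(d = 4 - \epsilon\) dimensions (the coefficients are schematic and scheme-dependent, and the temperature-like coupling is assumed to be tuned to criticality); the qualitative point is that very different microscopic starting values converge to the same fixed point:

def toy_quartic_flow(u0, eps=0.1, dl=0.01, steps=5000):
    """
    Toy one-loop flow for the quartic coupling u in d = 4 - eps dimensions:
        du/dl = eps * u - 3 * u**2      (coefficients are schematic)
    For any u0 > 0 the flow converges to the same fixed point u* = eps / 3.
    """
    u = u0
    for _ in range(steps):
        u += dl * (eps * u - 3.0 * u ** 2)
    return u

eps = 0.1
print(f"Wilson-Fisher-like fixed point: u* = eps/3 = {eps / 3:.5f}")
for u0 in (0.005, 0.05, 0.5):
    print(f"  starting from u0 = {u0:5.3f}  ->  u(large scale) = {toy_quartic_flow(u0, eps):.5f}")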

By transforming the originally intractable infinite degrees of freedom problem into analyzing the topological structure of RG flow in parameter space, the renormalization group provides an effective framework for dealing with complex systems. It not only solves the computational difficulties of critical phenomena in traditional statistical physics, but also reveals the deep reason why different systems exhibit the same laws at large scales.

2. Code Practice with the Two-Dimensional Ising Model

The two-dimensional Ising model is one of the most classic and representative models in statistical physics, used to study the phase transition behavior of magnetic systems. The model consists of spin variables arranged on lattice sites, each spin taking values \(+1\) or \(-1\), and tending to align with its neighbors through interactions. Despite its extremely simple structure, the model can exhibit rich physical phenomena, including magnetization, fluctuations, critical behavior, and universality. These properties make it not only an ideal platform for studying phase transitions, but also an important foundation for establishing the entire renormalization group theory. Therefore, in our code practice we use the Ising model as an example to further understand the ideas of the renormalization group.

The history of Ising model research dates back to the 1920s. Initially, people thought the model was unsolvable in two dimensions; its exact solution was finally found by Onsager in 1944. The exact solution revealed a key fact: near the critical temperature \(T_c\), the system exhibits diverging correlation length, scale invariance, power-law fluctuations, and other features that cannot be explained by traditional perturbation methods. To understand these seemingly complex yet universally occurring behaviors, Kadanoff proposed the "block spin idea" in the 1960s—merging small-scale degrees of freedom into large-scale variables and observing how parameters change with scale. This idea laid the foundation for Wilson's renormalization group, enabling a unified explanation of universality, critical exponents, and multi-scale behavior.

The following code practice demonstrates through actual simulation how the renormalization group idea is embodied in the Ising model:

First, we use the Metropolis algorithm to generate equilibrium spin configurations near the critical temperature \(T \approx 2.3\) (in units of \(J/k_B\); the exact value is \(T_c = 2/\ln(1+\sqrt{2}) \approx 2.269\)), where the system exhibits significant multi-scale fluctuations. Then we use the "majority rule" proposed by Kadanoff to coarse-grain the lattice (block-spin transformation): each \(2 \times 2\) local region is treated as a single effective spin at the larger scale. After coarse-graining, the lattice distribution is smoother, extracting large-scale structure from the intense microscopic jittering.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors


plt.style.use('dark_background')

class IsingRG:
    """
    2D Ising Model and Renormalization Group Flow Simulator
    """
    def __init__(self, L=64, T=2.27):
        self.L = L
        self.T = T
        # Initialize random state (+1 or -1)
        self.lattice = np.random.choice([-1, 1], size=(L, L))

    def energy_change(self, i, j):
        """
        Calculate energy change from flipping a spin (periodic boundary conditions)
        E = -J * sum(s_i * s_j), with J=1
        """
        top = self.lattice[(i - 1) % self.L, j]
        bottom = self.lattice[(i + 1) % self.L, j]
        left = self.lattice[i, (j - 1) % self.L]
        right = self.lattice[i, (j + 1) % self.L]
        neighbors = top + bottom + left + right
        # dE = E_new - E_old = -(-s) * neighbors - (-s * neighbors) = 2 * s * neighbors
        return 2 * self.lattice[i, j] * neighbors

    def metropolis_step(self):
        """Perform one Metropolis Monte Carlo sweep"""
        # Attempt L*L flips, this is called one MCS (Monte Carlo Sweep)
        for _ in range(self.L * self.L):
            i = np.random.randint(0, self.L)
            j = np.random.randint(0, self.L)
            dE = self.energy_change(i, j)

            # Metropolis criterion: accept if energy decreases, or with Boltzmann probability if increases
            if dE <= 0 or np.random.rand() < np.exp(-dE / self.T):
                self.lattice[i, j] *= -1

    def simulate(self, steps=1000):
        """Thermalize the system"""
        for _ in range(steps):
            self.metropolis_step()

    def coarse_grain(self, block_size=2):
        """
        Perform Kadanoff block spin transformation (majority rule)
        """
        new_L = self.L // block_size
        new_lattice = np.zeros((new_L, new_L))

        for i in range(new_L):
            for j in range(new_L):
                # Extract block_size x block_size block
                block = self.lattice[i*block_size:(i+1)*block_size, 
                                   j*block_size:(j+1)*block_size]
                # Majority rule
                avg_spin = np.sum(block)
                if avg_spin > 0:
                    new_lattice[i, j] = 1
                elif avg_spin < 0:
                    new_lattice[i, j] = -1
                else:
                    # If tied, choose randomly
                    new_lattice[i, j] = np.random.choice([-1, 1])
        return new_lattice

def plot_rg_flow():
    # 2D Ising model critical temperature Tc ≈ 2/ln(1+sqrt(2)) ≈ 2.269
    # We simulate slightly above Tc to observe correlation length
    sim = IsingRG(L=128, T=2.3) 
    print("Equilibrating system near critical point (this may take a few seconds)...")
    sim.simulate(steps=1500) # Ensure proper thermalization

    original = sim.lattice
    # First renormalization step
    rg_1 = sim.coarse_grain(block_size=2)

    # Prepare for second renormalization step by creating a new instance
    # For demonstration simplicity, we directly process rg_1
    # Note: Actually evolution should be under the renormalized Hamiltonian
    # This shows configuration space flow through "snapshots"
    rg_2_dummy = IsingRG(L=64, T=2.3) 
    rg_2_dummy.lattice = rg_1
    rg_final = rg_2_dummy.coarse_grain(block_size=2) # Equivalent to 4x4 blocks in original lattice

    # Visualization
    fig, axes = plt.subplots(1, 3, figsize=(18, 6))
    # Use dark purple/yellow colormap for high contrast and dark theme compatibility
    cmap = colors.ListedColormap(['#440154', '#fde725']) 

    axes[0].imshow(original, cmap=cmap, interpolation='nearest')
    axes[0].set_title(f"Original Lattice ({128}x{128})\nMicroscopic Fluctuations", color='white')
    axes[0].axis('off')

    axes[1].imshow(rg_1, cmap=cmap, interpolation='nearest')
    axes[1].set_title(f"First RG Step (b=2)\n{64}x{64}", color='white')
    axes[1].axis('off')

    axes[2].imshow(rg_final, cmap=cmap, interpolation='nearest')
    axes[2].set_title(f"Second RG Step (b=4)\n{32}x{32}", color='white')
    axes[2].axis('off')

    plt.suptitle("Real-Space Renormalization Group Flow: Emergence of Macroscopic Order", 
                 fontsize=16, color='white', y=1.05)
    plt.tight_layout()
    plt.show()

if __name__ == "__main__":
    plot_rg_flow()

Code Output

The code demonstrates two consecutive coarse-graining processes: from \(128 \times 128\) to \(64 \times 64\), then to \(32 \times 32\). As the scale gradually increases, microscopic details continuously disappear, while the macroscopic picture becomes clearer and more stable.

This example embodies the basic logic of the renormalization group. First, we coarse-grain the system, integrating microscopic modes into effective large-scale variables; then we observe how the coarse-grained configuration evolves, thereby indirectly reflecting the changes of Hamiltonian parameters in scale space.

This change can be understood as the system's "RG flow" in theory space. If the system is close to the critical point, after multiple coarse-graining steps it will gradually exhibit scale invariance, meaning the parameters approach some fixed point. This fixed point does not depend on the microscopic details of the model, but is determined only by symmetry and dimension, so different systems will exhibit the same critical exponents—this is why universality arises. We will analyze this example in more detail in later studies; for now, just keep it in mind.
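
For readers who want to poke at the simulation a little further, here is a minimal sketch (reusing numpy, np, and the IsingRG class defined in the code above; the nearest-neighbor correlation is only a rough proxy for the effective coupling, not a full reconstruction of the renormalized Hamiltonian) that tracks simple observables across two levels of coarse-graining:

def nn_correlation(lattice):
    """Average nearest-neighbor correlation <s_i s_j> over both bond directions (periodic boundaries)."""
    return 0.5 * np.mean(lattice * np.roll(lattice, 1, axis=0)
                         + lattice * np.roll(lattice, 1, axis=1))

sim = IsingRG(L=128, T=2.3)
sim.simulate(steps=1500)

level0 = sim.lattice
level1 = sim.coarse_grain(block_size=2)

helper = IsingRG(L=level1.shape[0], T=2.3)   # container for the second block-spin step
helper.lattice = level1
level2 = helper.coarse_grain(block_size=2)

for name, lat in (("original", level0), ("1 RG step", level1), ("2 RG steps", level2)):
    print(f"{name:>10}: <s_i s_j> = {nn_correlation(lat):+.3f}, |m| = {abs(lat.mean()):.3f}")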

Next, we will briefly introduce the applications of the renormalization group in statistical physics, nonequilibrium systems, and machine learning/complex networks—this is also an application thread running through this tutorial series, in addition to the mathematical exposition.

3. Multi-Scale Emergence in Nonequilibrium and Living Systems

In the real world, many interesting phenomena occur far from equilibrium, such as the collective flight of bird flocks, "active liquids" composed of active particles, and patterns formed by reaction-diffusion in chemistry. These nonequilibrium systems also exhibit multi-scale fluctuations and emergent behavior, requiring the renormalization group perspective to understand.

A typical example is the collective motion of bird flocks or insect swarms. Experimental observations have found that the velocity directions of individuals in large-scale groups are not random, but exhibit significant long-range correlations: the velocity correlation length \(\xi\) within the group is often much larger than the typical distance between individuals, and sometimes even increases with group size, exhibiting approximately infinite correlation length (scale-free). This means the entire bird flock behaves like a critical system: local perturbations (e.g., a few birds changing direction) can propagate and affect birds far away.

This is similar to the situation in a magnet at the critical point where the correlation length between spins diverges. When the correlation scale is large, we also observe dynamic scaling laws, where temporal and spatial fluctuations are intertwined in a power-law relationship: the relaxation time \(\tau\) of the group grows as a power of the spatial correlation length \(\xi\) (called critical slowing down), usually satisfying \(\tau \sim \xi^z\), defining a dynamic critical exponent \(z\)[1]. Recently, studies of natural insect swarms calculated a theoretical value of \(z\) around 1.35 through the \(\epsilon=4-d\) expansion, which agrees strikingly with experimental measurements (around 1.37). This suggests that even in biological systems like bird flocks, there may exist some kind of critical point, the RG fixed point idea still applies, and group behavior can be classified into some universality class.

RG Flow and Fixed Points in Parameter Space.
Shows the RG flow trajectory of active matter systems in parameter space. (a) RG flow on a 2D plane: arrows indicate the evolution direction of effective Hamiltonian parameters (inertia coupling f and activity cv) as the coarse-graining scale increases. The red dot represents a stable fixed point, attracting surrounding flow lines, corresponding to a universality class with dynamic critical exponent z=1.35. Different fixed points (black squares, blue diamonds) represent different physical phases or universality classes. (b) 3D view of RG flow with dissipation parameter introduced. This intuitively reveals how physical systems start from microscopic parameters, "flow" along specific trajectories, and ultimately have their macroscopic behavior determined by fixed point properties. Image source: Cavagna, A., et al. Nature Physics 19, 1043–1049 (2023)

In the broader field of active matter, RG has been proven to be a key tool. The classic Toner–Tu polar flocking theory obtained the long-range coherent motion properties of the ordered phase of groups precisely through RG analysis, for example predicting anomalous fluctuation scaling exponents (giant number fluctuations) in two-dimensional dry polar active systems, which were later confirmed experimentally[1]. In addition, various nonequilibrium phase transitions—such as chemotactic motion in bacterial clusters, motility-induced phase separation (MIPS) of active particles, and active fluctuations of membranes—have all been studied using renormalization group methods[1]. RG helps us classify these seemingly different nonequilibrium systems and find their common properties. For example, whether it is influenza spreading through a population or fire spreading through a forest, the spreading process can be abstracted as a directed percolation model, belonging to the same universality class with the same critical exponents. It is precisely RG analysis that lets us recognize this and provides theoretical predictions for critical thresholds of epidemics or ecological expansion.

It is worth mentioning that the application of RG in nonequilibrium systems continues to produce new surprises. For example, one study found that the collective motion of natural bird flocks is actually close to critical phenomena in "3.99-dimensional" space[1] (a subtle approximation of 4 dimensions), suggesting that biological groups may operate cleverly on the edge of some critical state to achieve extremely sensitive responses to environmental stimuli. These findings are all inseparable from the perspective provided by the renormalization group: viewing biological or nonequilibrium systems as multi-scale coupled wholes, and identifying those effective degrees of freedom that govern large-scale behavior.

4. RG Ideas in Neural Networks and Machine Learning

The core concepts of the renormalization group have found resonance in artificial intelligence and data science in recent years. The structure of deep neural networks has parallels with RG coarse-graining: as the original input data (such as the pixel array of an image) passes through successive network layers, increasingly high-level, abstract features are extracted. This is very similar to Kadanoff's block spin transformation: each layer is "renormalizing" the data, filtering out irrelevant detail noise and retaining the main patterns relevant to the task. In other words, deep learning achieves a progressive simplification of information through hierarchical representation, which closely parallels RG's separation of relevant and irrelevant degrees of freedom in physics.

In statistical physics, we know the maximum entropy principle: given macroscopic constraints (such as average energy), the Gibbs distribution is the most unbiased guess satisfying the constraints, i.e., the distribution with maximum entropy. The same idea has been introduced to machine learning to form the entropy randomization framework. Popkov et al. proposed that instead of finding a single optimal model parameter under uncertain data, we should assign a probability distribution \(P(\mathbf{w})\) to the model parameters and determine the optimal parameter distribution by maximizing entropy. The resulting parameter distribution often takes the form of a Boltzmann distribution (exponential distribution), corresponding to an application of a "generalized Gibbs principle" in machine learning. In other words, let the model parameters themselves form a statistical ensemble, choosing the distribution by the criterion of maximum entropy—this is strikingly similar to how physical systems traverse microstates at equilibrium.

Additionally, dimensionality reduction techniques commonly used in machine learning can also be understood from the RG perspective. For example, principal component analysis (PCA) tries to find the orthogonal directions with the largest variance in the data, equivalent to identifying "relevant variables" (analogous to order parameter directions in physics); random projection uses random matrices to map high-dimensional data to low dimensions while approximately preserving distance relationships, corresponding to the "universality" concept in RG—the specific details of the projection method do not matter, as long as the global structural relationships are preserved. These methods are essentially discarding "irrelevant" dimensions (noise directions) in the data while retaining "relevant" information, thus achieving coarse-graining similar to RG.
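
A minimal sketch of the PCA analogy (synthetic data with one planted "relevant" direction buried in isotropic noise; numpy only, purely illustrative):

import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 1000 samples in 50 dimensions, with one dominant direction
# (the "relevant" variable, analogous to an order parameter) plus isotropic noise.
n_samples, n_dim = 1000, 50
relevant_direction = rng.normal(size=n_dim)
relevant_direction /= np.linalg.norm(relevant_direction)
data = (rng.normal(size=(n_samples, 1)) * 5.0) * relevant_direction + rng.normal(size=(n_samples, n_dim))

# PCA via the eigenvectors of the sample covariance matrix.
data_centered = data - data.mean(axis=0)
cov = data_centered.T @ data_centered / n_samples
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
top_component = eigvecs[:, -1]                  # direction of largest variance

# The leading principal component recovers the planted "relevant" direction;
# the remaining 49 directions carry only noise and can be discarded (coarse-grained away).
print("overlap with the planted direction:", abs(top_component @ relevant_direction))
print("variance captured by 1 of 50 components:", eigvals[-1] / eigvals.sum())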

Schematic of Real-Space Mutual Information Neural Network Renormalization Group (RSMI). (a) RSMI network architecture: The algorithm uses machine learning to automatically find the optimal coarse-graining scheme. The system is divided into visible region V (gray), buffer B (yellow), and environment E (purple). The neural network (red nodes H) extracts "relevant" degrees of freedom by maximizing the mutual information between the inner region V and the outer environment E. The buffer B is introduced to filter out local correlations that only act at short distances, thereby focusing on long-range physics. (b) Algorithm flow: The network learns probability distributions to automatically identify which microscopic degrees of freedom are crucial for describing macroscopic behavior (represented by weights λ), thus "learning" the renormalization group transformation without relying on physicist intuition. Image source: Koch-Janusz, M., & Ringel, Z. Nature Physics 14, 578–582 (2018)

More cutting-edge, researchers are trying to have neural networks learn RG transformations automatically. Koch-Janusz and Ringel (2018) constructed such a scheme, real-space mutual information renormalization (RSMI): a neural network is trained to extract coarse-grained degrees of freedom that retain maximal mutual information with the surrounding environment, thereby adaptively approximating an optimal RG transformation[2]. This approach combines information theory (maximal preservation of mutual information) with the RG framework and can automatically find the most suitable way to reorganize degrees of freedom, so that the coarse-grained system retains as much of the original system's key information as possible. It amounts to using machine learning to discover the kind of RG schemes that physicists used to design by hand, and it has been successfully applied to systems such as the 2D Ising model, reproducing the correct critical point and critical exponents. In a related direction, Li and Wang applied flow-based generative networks to variational RG (the "neural network renormalization group"), achieving automated analysis of critical systems[3].

Overall, RG ideas inspire machine learning in two ways: on one hand, we use the RG perspective to understand why deep learning works (multi-layer networks extract precisely multi-scale structure); on the other hand, we use the power of machine learning to automate RG calculations and explore new patterns in complex systems. The combination of the two heralds a new interdisciplinary field: RG methods borrowed from physics may be able to handle complex pattern recognition in high-dimensional data, while machine learning provides physics with new numerical tools for dealing with systems with huge numbers of degrees of freedom. This fully demonstrates the integration of knowledge: just as RG helps physicists find "relevant" variables in complex systems, information-based machine learning methods also identify key features in vast amounts of data. The two inspire each other and are catalyzing novel theories and technologies.

5. A Universality Perspective on Climate and Ecosystems

Climate systems and ecological networks are typical examples of complex systems: they consist of many interacting components, spanning enormous scales from local to global, from instantaneous to millions of years. The key to understanding these systems lies in grasping the coupling of processes at different scales—exactly what the renormalization group excels at.

Multi-scale coupling and teleconnections: The phenomenon of teleconnections in the atmosphere and ocean illustrates that distant climate subsystems can develop strong correlations. Teleconnection refers to how climate changes in one location can significantly affect the state of distant regions[4]. Typical examples include El Niño events causing rainfall anomalies worldwide, or the North Atlantic Oscillation (NAO) affecting winter cold spells in Eurasia. These remote effects indicate that the climate system has correlation lengths exceeding local scales—a global "correlation network." Just as the correlation function in statistical physics decays with distance but becomes long-ranged at the critical point, climate teleconnections suggest that certain climate processes are approaching criticality: changes in various regions of the system are no longer independent but are coupled as a whole.

Climate tipping points and phase transitions: Scientists have recognized that the Earth system has multiple possible tipping points, such as the melting of the Greenland ice sheet, the transition of tropical rainforests to savanna, and the collapse of ocean circulation. When external conditions change slowly, these subsystems may undergo abrupt transitions at certain threshold points[5]. This is similar to second-order phase transitions in physics: the system initially responds gently to small parameter changes, but once approaching the critical point, the entire state may suddenly reorganize. More interestingly, critical transitions in different climate or ecological systems exhibit some universal precursors, such as increased variance of fluctuations and decreased resilience. This can be directly compared to critical slowing down in critical phenomena: as the system approaches the critical point, its recovery time to the steady state becomes longer and longer (because correlation length and correlation time tend to diverge), manifesting as slower damping of perturbations and enhanced autocorrelation in continuous time series[5].

In climatology, this is used as an "early warning signal": by detecting, for example, rising autocorrelation coefficients or increased variance in temperature records, we can judge whether a climate subsystem is approaching critical collapse. Extensive research shows that from lake eutrophication to glacial-interglacial climate transitions, these indicators often show similar patterns of change before the critical point. This cross-system consistency is precisely the manifestation of universality in the climate-ecology domain: although the specific mechanisms differ, the mathematical structure of critical transitions makes many systems share similar warning signs and scaling exponents.

Construction Process of Climate Networks: From Time Series to Complex Topology. This figure shows the core steps of applying statistical physics methods to study the Earth system: first, the Earth system is discretized into grid points (observation stations), and meteorological time series data (such as temperature or pressure) are extracted from each point. Then, connections between nodes are defined by computing statistical similarities between time series (such as cross-correlation or mutual information). The resulting functional climate network can capture long-range correlations (teleconnections) in the system, providing a mathematical basis for applying complex network theory and renormalization group analysis to climate tipping points. Image source: Fan, J., et al. Physics Reports 896, 1–84 (2021)
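
A toy version of the construction described in the caption above (synthetic temperature-like time series on a handful of grid points; real studies use reanalysis data, lag optimization, and statistical significance tests):

import numpy as np

rng = np.random.default_rng(3)

# Synthetic "grid point" time series: a few sites share a common driver
# (a crude stand-in for a teleconnection pattern), the rest are independent noise.
n_nodes, n_time = 20, 500
common_driver = rng.normal(size=n_time)
series = rng.normal(size=(n_nodes, n_time))
series[:5] += 1.5 * common_driver           # the first five nodes are "teleconnected"

# Step 1: pairwise statistical similarity (here: Pearson cross-correlation at zero lag).
corr = np.corrcoef(series)

# Step 2: threshold the similarity matrix to define network links.
threshold = 0.5
adjacency = (np.abs(corr) > threshold) & ~np.eye(n_nodes, dtype=bool)

degrees = adjacency.sum(axis=1)
print("node degrees:", degrees)
# The strongly correlated (teleconnected) nodes stand out as a densely linked cluster.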

RG thinking as inspiration for climate modeling: Since the climate system spans scales from turbulent cloud clusters (kilometer scale) to planetary circulation (ten thousand kilometer scale), directly solving equations at all scales is neither realistic nor necessary. Scientists have developed various "parameterization" and multi-scale modeling methods, which are essentially coarse-graining: using effective variables at large scales to describe the net effect of small-scale processes. For example, the influence of cloud physics on large-scale thermodynamics can be expressed with a few parameters (such as cloud cover, precipitation rate), without tracking every cloud. This approach is very similar to RG: what we care about is which macroscopic variables can capture the average effect of microscopic processes. In recent years, some researchers have tried to apply renormalization group methods directly to climate models. For example, by progressively eliminating high-frequency weather disturbances, one can observe whether the effective evolution equations for long-term climate variables (such as annual mean temperature fields) have fixed points or simple scaling laws. Such analysis might explain why the probability distribution of climate variables has power-law tails at different spatial scales, or why climate networks exhibit self-similar structures at different resolutions.

In ecology, the RG perspective also provides insights. Natural ecological networks (such as food webs, population cellular automaton models) often have multiple hierarchical structures. For a specific model, it may be difficult to precisely predict which species will go extinct and exactly when, but RG tells us not to obsess over microscopic details: what matters more is identifying the universality class features of the system. For example, forest fire spread can be abstracted as a wildfire model on a lattice; when tree density reaches a certain threshold, infinite cascading fires occur, and its behavior belongs to the percolation universality class, similar to disease transmission and material fracture. This explains why the size distribution of forest fires in various regions often follows a power law—the system is in a near-critical self-organized critical state. As another example, species extinction in ecological environments may not be a linear random process, but rather there exist critical community collapse points where inter-population interactions cause cascading extinctions. All of these can be described in the language of RG: some "relevant" interactions (such as the role of keystone species) determine the fate of the ecosystem at large scales, while many weak interactions fade away in the evolutionary process.

In summary, introducing the renormalization group and universality perspective to climate and ecology helps us extract simple rules from intricate interactions. From identifying teleconnection patterns to monitoring signs of critical slowing down, RG thinking guides us to focus on scale-invariant patterns and key slow variables. This also provides inspiration for formulating climate policy or ecological interventions: rather than trying to precisely control every microscopic detail, it is better to grasp the global control parameters and use the universal behavior of the system to assess risks and tipping points. For example, by observing certain universal indicators of large ecosystems (such as fluctuations in species diversity), or the correlation structure of global climate networks, we may be able to predict potential critical transitions and take timely action.

6. Renormalization Ideas in Complex Networks and Graph Structures

Complex networks are an abstract and universal system representation—from social relationship networks to the Internet to brain neuron connections can all be abstractly represented as networks. These networks often contain multiple structural scales: both local clusters and communities (corresponding to short-range structure) and overall degree distributions and hierarchical modules (corresponding to long-range structure). To understand the organizational rules of networks at different scales, we can borrow from the renormalization group idea and perform rescaling or coarse-graining on networks.

However, applying RG to general networks is not easy. Classical RG operates on regular lattices, where "blocks" can naturally be defined uniformly. In contrast, complex networks are usually heterogeneous and irregular, with "small-world effects" (short average distances between any two points) and high clustering properties, making simple block reduction complicated[6]. Past research has tried various methods, such as spectral coarse-graining (based on network Laplacian eigenmodes) or box-covering (finding self-similar box structures on networks). These methods have revealed fractal scaling relationships in some cases, indicating that some networks (such as certain social or biological networks) have self-similar topological structures over a certain range. But overall, there is no unified RG framework like that for physical lattices that can be directly applied to general networks. This is considered an open problem in the statistical physics of complex networks[6].

Kadanoff Supernodes and the Renormalization Process in Complex Networks. To generalize RG to irregular complex networks (such as scale-free networks), researchers introduced the concept of "Kadanoff supernodes." (a) Definition of supernodes: By analyzing diffusion processes on the network (based on the Laplacian operator), nodes in the original network (lower layer) with similar dynamical distances are clustered into supernodes (different colors shown in the upper layer), similar to geometric "block spins" in lattice models. (b) Network coarse-graining: Each supernode is merged into a single renormalized node, and new connections are established based on original edge relationships. This process progressively reduces the network size while keeping the macroscopic topological properties of the network unchanged. Image source: Villegas, P., et al. Nature Physics 19, 445–450 (2023)

One of the latest advances is the proposal of the Laplacian Renormalization Group (LRG) method[6]. This method takes the network's Laplacian matrix as a starting point and views diffusion dynamics on the network as analogous to a "free field" in field theory. The basic idea is to define a diffusion scale on the network: regarding the fastest-diffusing modes as small-scale structure and progressively integrating them out to obtain a coarse-grained network. This is equivalent to implementing RG in "momentum space": high-frequency (fast diffusion) modes correspond to small-scale connections in the network and can be integrated out; low-frequency modes retain large-scale structure.

To connect with the intuition of real space, researchers also introduced the concept of Kadanoff supernodes: by examining the clustering relationships of network nodes at multiple scales, a set of closely connected points are merged into a "supernode." This method cleverly bypasses the problem that small-world networks lack obvious geometric blocks: supernodes are not fixed-size local neighborhoods, but allow dynamically selecting node sets across different scales[6]. By iteratively forming supernodes and merging edges, combined with mode truncation in momentum space, Laplacian RG can progressively simplify the original network while keeping the global properties of the network essentially unchanged.
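
The following is not the full LRG algorithm of Villegas et al., only a minimal numpy illustration of its starting point: the graph Laplacian, its diffusion modes, and the separation into fast (large-eigenvalue) and slow (small-eigenvalue) modes that the coarse-graining acts on:

import numpy as np

rng = np.random.default_rng(4)

# A small random network (Erdos-Renyi-style adjacency), built with numpy only.
n = 30
A = (rng.random((n, n)) < 0.15).astype(float)
A = np.triu(A, 1)
A = A + A.T                                   # symmetric, no self-loops

# Graph Laplacian L = D - A, the generator of diffusion on the network.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)          # eigenvalues in ascending order

# Diffusion modes: small eigenvalues = slow, large-scale structure;
# large eigenvalues = fast, small-scale detail (the modes that get integrated out).
tau = 1.0                                     # diffusion time sets the observation scale
weights = np.exp(-tau * eigvals)              # heat-kernel weight of each mode
kept = weights / weights.sum()
print("fraction of diffusion weight carried by the 5 slowest modes:",
      kept[:5].sum().round(3))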

Applications show that LRG can achieve good results on several real-world networks. For example, the neural connectome of the human brain, after coarse-graining, exhibits a degree of self-similarity: the degree distribution, clustering coefficient, etc. of the rescaled network are similar to the original network, suggesting that brain connectivity may have a power-law structure of hierarchical organization[6]. As another example, analyzing social networks at different resolutions may reveal recursively nested community structures like "communities of communities." RG provides a systematic tool to "rescale" networks and see the important connection patterns at each scale. It is worth noting that unlike the strict geometric scaling of coordinate scales in physical systems, in networks we care more about changes in topological scale (such as changes in average path length, changes in connectivity patterns). Laplacian RG defines scale precisely through the dynamical distance (diffusion distance) on the network, laying the foundation for introducing RG to complex networks[6].

In addition to Laplacian RG, scholars are also exploring other network renormalization approaches. For example, using information-theoretic methods to determine node merging criteria based on the information loss rate of the network, or defining network RG transformations through the equivalence of random walk processes. Regardless of the specific implementation, the goal is the same: describe the network at a coarser granularity while preserving key properties. This is similar to how in physics we use effective theories to describe low-energy long-range behavior without caring about high-energy short-range details. In network science, such RG concepts can help us identify the multiple scales of networks. For example, a power grid or supply chain network may have many small circles locally, but globally has backbone hubs; through RG analysis, we can quantify the contributions of these different levels to overall robustness and find "relevant structure" (such as key hub connections) and "irrelevant structure" (such as local redundant connections).

Summary - A Unified Perspective on the Multi-Scale World

The renormalization group was born from puzzles about divergent integrals and critical phenomena, but over half a century it has developed into a general language for understanding nature. From fundamental particle interactions to swallows spiraling in the sky, from the layer-by-layer weights of neural networks to the fluctuations of Earth's climate, RG thinking runs through the study of multi-scale complex systems. Macroscopic laws are not simple superpositions of microscopic laws, but emergent products through scale transformations. Through RG, we can see why vastly different systems share the same behavioral rules, and how to extract the factors governing the whole from myriad details.

The core idea of the renormalization group: focus on the big picture, let go of the small details, and proceed step by step. It embodies a philosophy for dealing with complexity: do not get lost in infinite details, but seek patterns that remain unchanged or similar across different scales. With this perspective, we can discover commonalities in various fields: critical points in physics, collective behavior of living groups, dimensionality reduction of data—all are essentially "variations on the same theme." As a proverb says, "One cannot see the true face of Mount Lu because one is within the mountain"; RG lets us step out of our original scale to examine problems, thereby seeing the universal contours of the landscape.

In quantum field theory, RG remains central to understanding the mechanisms of fundamental forces; in statistical physics, RG continues to expand into new areas of nonequilibrium and quantum materials; in interdisciplinary applications, the multi-scale perspective of RG will help us address major complexity challenges such as climate change, financial risk, and epidemic spread. For learners, sometimes the key to solving a problem lies not in stronger computational power, but in finding a way to reformulate the problem.

References

[1] Cavagna A., Di Carlo L., Giardina I., Grigera T. S., Melillo S., Parisi L., Pisegna G. & Scandolo M. (2023). Natural swarms in 3.99 dimensions. Nature Physics, 19(7), 1043–1049. doi:10.1038/s41567-023-02028-0

[2] Koch-Janusz M. & Ringel Z. (2018). Mutual information, neural networks and the renormalization group. Nature Physics, 14(6), 578–582. doi:10.1038/s41567-018-0081-4

[3] Li S.-H. & Wang L. (2018). Neural Network Renormalization Group. Physical Review Letters, 121, 260601. doi:10.1103/PhysRevLett.121.260601

[4] Fan J., Meng J., Ludescher J., Chen X., Ashkenazy Y., Kurths J., Havlin S. & Schellnhuber H.-J. (2021). Statistical physics approaches to the complex Earth system. Physics Reports, 896, 1–84. doi:10.1016/j.physrep.2020.11.004

[5] Dakos V., Boulton C. A., Buxton J. E., Abrams J. F., Arellano-Nava B., McKay D. I. A., Bathiany S., Blaschke L., Boers N., Dylewsky D., López-Martínez C., Parry I., Ritchie P., van der Bolt B., van der Laan L., Weinans E. & Kéfi S. (2024). Tipping point detection and early warnings in climate, ecological, and human systems. Earth System Dynamics, 15, 1117–1135. doi:10.5194/esd-15-1117-2024

[6] Villegas P., Gili T., Caldarelli G. & Gabrielli A. (2023). Laplacian renormalization group for heterogeneous networks. Nature Physics, 19, 445–450. doi:10.1038/s41567-022-01866-8