4. Critical Exponents: Extracting a "Minimal Parameter Set" from Divergent Behavior¶

Section 3 of Part I revealed a profound fact: near the critical point, physical quantities no longer exhibit normal behavior—response functions \(\chi\) tend to infinity, correlation length \(\xi\) diverges, and correlation functions transform from exponential decay to power-law decay. These divergences and power-law behaviors are not mutually independent coincidences, but originate from the same physical root: the system loses its intrinsic length scale and enters the scale-free world.

However, merely knowing about divergence is far from enough. The question physicists need to answer is: how fast are these divergences? Taking a ferromagnet as an example, when the temperature approaches the Curie temperature \(770°\text{C}\) from \(800°\text{C}\), by how many times will the susceptibility \(\chi\) increase? What if the temperature gets even closer? Similarly, for the power network on the eve of the 2003 North American blackout, how fast did the range of load correlations between nodes expand? For disease transmission, how does the connected scale of the infection network grow as the infection rate approaches the critical threshold?

The answers are hidden in a set of numbers called critical exponents. These exponents precisely characterize the power-law behavior of various physical quantities near the critical point—they tell us the speed of divergence, the speed of vanishing, and the rate of decay. More surprisingly, although there are dozens of physical quantities near the critical point, the number of independent exponents describing their behavior is only two. All seemingly different divergence behaviors can ultimately be derived from these two independent parameters. Behind this data compression lies a profound mathematical structure—the scaling hypothesis and homogeneity.

This section will systematically define these critical exponents, clarify the physical meaning of each exponent, and show how to extract them from experimental or simulation data. These concepts will lay the foundation for the derivation of scaling laws in the next section.

4.1 Reduced Temperature: A Unified Measure of the Critical Region¶

Before discussing critical behavior, we need a unified way to measure how close the system is to the critical point. The critical temperatures \(T_c\) of different materials can differ by several orders of magnitude—the Curie temperature of iron is \(1043 \, \text{K}\), while the critical temperature of the superconductor YBCO is only \(92 \, \text{K}\)—but the physical behavior near the critical point has universality. To compare the critical behavior of different systems, we introduce the reduced temperature:

\[ t = \frac{T - T_c}{T_c} \]

This is a dimensionless quantity that defines the critical point as \(t = 0\), with the high-temperature phase corresponding to \(t > 0\) and the low-temperature phase corresponding to \(t < 0\). The absolute value \(|t|\) of the reduced temperature measures the degree to which the system deviates from the critical point.

Why define it this way? Using \(T - T_c\) directly as the control parameter is certainly possible, but that would retain the dimension of temperature. After normalizing by \(T_c\), \(t\) becomes a purely relative deviation, making comparison between different systems possible. For example, when we say "iron at \(|t| = 0.01\)" and "water at \(|t| = 0.01\)", although their absolute temperature differences are completely different, their relative distances from their respective critical points are the same—this means they are at the same physical position within the critical region.

The introduction of reduced temperature also brings an important simplification. Near the critical point, the behavior of most physical quantities can be written in power-law form of \(|t|\). For example, correlation length \(\xi \sim |t|^{-\nu}\), susceptibility \(\chi \sim |t|^{-\gamma}\). This unified power-law form allows us to describe critical behavior with a few concise exponents, rather than establishing complex empirical formulas for each material separately.

A detail to note: Many formulas have different amplitudes on the high-temperature side (\(t > 0\)) and low-temperature side (\(t < 0\)), i.e., the coefficients in front of the power law. For example, the complete form of susceptibility can be written as:

\[ \chi = \begin{cases} \Gamma^+ |t|^{-\gamma} & \text{if } t > 0 \\ \Gamma^- |t|^{-\gamma} & \text{if } t < 0 \end{cases} \]

where \(\Gamma^+\) and \(\Gamma^-\) are usually not equal and depend on the microscopic details of the system. However, the power-law exponent \(\gamma\) is the same on both sides—this is one manifestation of universality. The amplitude ratio \(\Gamma^+/\Gamma^-\) is itself also a universal quantity independent of the specific material, but this belongs to a deeper discussion that we will skip for now.

4.2 Six Standard Critical Exponents: A Complete Dictionary of Critical Behavior¶

Near the critical point, various physical quantities of the system—specific heat, order parameter, response functions, correlation length, correlation functions—all exhibit power-law behavior. Physicists have defined a set of standard critical exponents to describe these behaviors. The following uses magnetic systems as examples, but these definitions can be directly generalized to any system with continuous phase transitions (liquid-gas transitions, superfluid transitions, percolation transitions, etc.), simply by replacing the order parameter and conjugate field with the corresponding physical quantities.

4.2.1 Specific Heat Exponent α: Anomalous Energy Absorption of the System¶

Definition:

\[ C \sim |t|^{-\alpha} \qquad (h = 0) \]

Physical meaning: Specific heat \(C\) measures how much heat the system needs to absorb for each degree of temperature increase. Far from the critical point, specific heat is a finite and smoothly varying quantity—you need a certain amount of heat to raise the system's temperature. But near the critical point, specific heat can become abnormally large, meaning the system becomes a "greedy heat reservoir": tiny heat input can hardly change the temperature, because energy is all used to drive critical fluctuations.

Three cases of \(\alpha\) values:

\(\alpha > 0\): Specific heat truly diverges, \(C \to \infty\). This means the system's "appetite" for heat tends to infinity at the critical point.
\(\alpha = 0\): Specific heat does not diverge, but shows logarithmic divergence \(C \sim \ln|t|\). This is a boundary case, growing slower than any power law, but still tending to infinity. The 2D Ising model is a typical representative of this case.
\(\alpha < 0\): Specific heat remains finite, but has a cusp at the critical point—i.e., a discontinuous derivative. The 3D percolation model (\(\alpha \approx -0.62\)) belongs to this case.

Connection to Section 2: Recall the fluctuation-dissipation theorem \(C_V \sim \text{Var}(E) / T^2\). The divergence of specific heat is equivalent to the divergence of energy fluctuations \(\text{Var}(E)\). Near the critical point, energy fluctuations become enormous, which is precisely the manifestation of the system entering a "hypersensitive" state.

4.2.2 Order Parameter Exponent β: How Order Emerges from Disorder¶

Definition:

\[ M \sim (-t)^\beta \qquad (t < 0, h = 0) \]

Physical meaning: This exponent describes how the order parameter continuously grows from zero in the ordered phase on the low-temperature side. Note the \((-t)\) in the definition—because \(t < 0\) corresponds to the low-temperature phase, so \(-t > 0\). When the temperature drops from the high-temperature side to below \(T_c\), the order parameter \(M\) begins to grow continuously from zero, with the growth rate controlled by \(\beta\).

Physical meaning of \(\beta\) values:

Smaller \(\beta\): The order parameter grows more steeply near \(T_c\). Imagine two curves starting from the origin: \(M \sim |t|^{0.1}\) grows faster near \(t \to 0\) than \(M \sim |t|^{0.5}\). Therefore, small \(\beta\) means that once the temperature drops below \(T_c\), the ordered state is "quickly" established.
Larger \(\beta\): The order parameter grows more slowly, and the ordered state requires a larger temperature deviation to fully develop.

An intuitive analogy: Imagine a flock of birds originally flying randomly (high-temperature disordered phase). When some parameter (such as bird flock density) exceeds a critical value, the flock begins to form collective motion (low-temperature ordered phase). What \(\beta\) describes is how the "intensity" of this collective motion (analogous to the order parameter) grows as the parameter changes. Small \(\beta\) means that once past the critical point, collective motion quickly strengthens; large \(\beta\) means collective motion slowly strengthens.

Typical values: Mean-field theory gives \(\beta = 1/2\), the exact solution of the 2D Ising model gives \(\beta = 1/8 = 0.125\), and numerical results for the 3D Ising model give approximately \(\beta \approx 0.326\).

4.2.3 Susceptibility Exponent γ: Divergence of Response¶

Definition:

\[ \chi \sim |t|^{-\gamma} \qquad (h = 0) \]

Physical meaning: Susceptibility \(\chi = \partial M / \partial h\) measures the response intensity of the system to external fields. This exponent directly corresponds to the "response function divergence" discussed in Section 2: when \(\gamma > 0\) (which is usually the case), susceptibility tends to infinity at the critical point, meaning the system becomes extremely sensitive to weak perturbations.

Deep connection to Section 3: Section 3 established the connection between response and fluctuations through the fluctuation-dissipation theorem \(k_B T \chi = \int d^d r \, G(r)\), and through the relationship between the integral of the correlation function and the correlation length \(\xi\), derived \(\chi \sim \xi^{2-\eta}\). Combined with \(\xi \sim |t|^{-\nu}\), we get \(\chi \sim |t|^{-\nu(2-\eta)}\). This means \(\gamma = \nu(2-\eta)\)—this is precisely the content of the Fisher scaling law. We will systematically derive these relationships in Section 5, but now you can see: there are intrinsic connections between different critical exponents; they are not independent parameters.

Returning to the North American blackout example: In the power network, "susceptibility" is analogous to the propagation coefficient of local failure impact on the entire network. When the network operates under high load (critical state), this propagation coefficient increases sharply—a tiny failure at one node can trigger a cascade response across the entire network. \(\gamma\) describes the speed of this "hypersensitivity" growth.

4.2.4 Critical Isotherm Exponent δ: Extreme Nonlinear Response¶

Definition:

\[ M(t=0) \sim h^{1/\delta} \]

Physical meaning: This exponent describes a special case—exactly at the critical temperature \(T = T_c\) (i.e., \(t = 0\)), the response of the order parameter to external field. In this case, the relationship between \(M\) and \(h\) is no longer linear (\(M \propto h\)), but exhibits strong nonlinear behavior \(M \propto h^{1/\delta}\).

Why nonlinear? Far from the critical point, the curvature of the free energy with respect to \(M\) (i.e., the second derivative) is finite, so small external fields cause small changes in order parameter, and the response is linear. But at the critical point, the free energy landscape becomes extremely flat, and the curvature approaches zero—this is precisely the "flat landscape" discussed in Section 1. In this case, linear response fails, replaced by power-law nonlinear response.

Physical meaning of \(\delta\) values:

Larger \(\delta\): The critical isotherm is more "flat", meaning a very large external field \(h\) is needed to produce appreciable order parameter \(M\). This corresponds to the case where the free energy landscape is extremely flat near the critical point.
Typical values: mean-field theory \(\delta = 3\), 2D Ising model \(\delta = 15\), 3D Ising model \(\delta \approx 4.79\).

Measurement method: In experiments, precisely adjust the temperature to \(T_c\), then measure the relationship between order parameter and external field. On a log-log plot (\(\log M\) vs \(\log h\)), the data should form a straight line with slope \(1/\delta\).

4.2.5 Correlation Length Exponent ν: The "Protagonist" of Scaling Theory¶

Definition:

\[ \xi \sim |t|^{-\nu} \]

Physical meaning: Correlation length \(\xi\) is the most core length scale in critical phenomena, defining the typical size of fluctuation clusters. \(\nu\) describes how \(\xi\) diverges as the system approaches the critical point.

Why is \(\nu\) the "most core" exponent? Section 3 already revealed that the divergence of correlation length is the fundamental characteristic of critical phenomena—it is precisely \(\xi \to \infty\) that causes the system to enter the "scale-free world" and power-law behavior to appear. In some sense, \(\nu\) controls "how fast the system enters the critical region". Once \(\nu\) is known, many other exponents can be derived through scaling laws.

Relationship between \(\nu\) and spatial dimension: The value of \(\nu\) is closely related to spatial dimension \(d\). In low-dimensional systems, fluctuations are stronger, and \(\nu\) is usually larger. For example, 2D Ising model \(\nu = 1\), while 3D Ising model \(\nu \approx 0.63\). Mean-field theory gives \(\nu = 1/2\), but this is only correct in high dimensions (\(d > 4\)).

Measurement method: Fit \(\xi\) from the non-critical correlation function \(G(r) \sim e^{-r/\xi}\), then plot \(\log \xi\) vs \(\log |t|\); the negative of the slope is \(\nu\).

4.2.6 Correlation Function Exponent η: The "Anomalous Dimension" of Fluctuations¶

Definition:

\[ G(r, t=0) \sim r^{-(d-2+\eta)} \]

Physical meaning: At the critical point (\(t = 0\)), the correlation function exhibits power-law decay. Without corrections from fluctuations, simple mean-field theory (Ornstein-Zernike theory) predicts \(G(r) \sim r^{-(d-2)}\), i.e., \(\eta = 0\). However, in real systems, the presence of fluctuations modifies this result, making the decay exponent \(d - 2 + \eta\).

Why is it called "anomalous dimension"? In the language of field theory, \(\eta\) is called the anomalous dimension. The origin of this name is related to the renormalization group. In free field theory, the field operator \(\phi\) has a definite scaling dimension given by dimensional analysis. But when interactions are present, fluctuations "correct" this scaling dimension, making it deviate from the free field value—this deviation is the source of "anomalous", and the amount of deviation is \(\eta\). We will discuss this in detail in subsequent lectures on field-theoretic renormalization group.

Typical values of \(\eta\): \(\eta\) is usually a small quantity. 2D Ising model \(\eta = 1/4 = 0.25\), 3D Ising model \(\eta \approx 0.036\), mean-field theory \(\eta = 0\). Although \(\eta\) is numerically small, its existence is a direct signature of fluctuation effects and is a key parameter for distinguishing mean-field from non-mean-field behavior.

Measurement method: Measure the correlation function \(G(r)\) at the critical point (\(t = 0\)), fit a straight line on a log-log plot (\(\log G\) vs \(\log r\)), the slope is \(-(d-2+\eta)\).

4.3 From Data to Exponents: Log-Log Plots and Extraction of Critical Exponents¶

Section 3 discussed in detail why scale-free systems must exhibit power-law distributions, and why log-log plots are the standard tool for studying critical phenomena. Now we apply this knowledge to the practical measurement of critical exponents.

Core idea: If a physical quantity \(A\) and control parameter \(x\) satisfy a power-law relationship \(A \sim x^p\), then taking logarithms:

\[ \log A = p \log x + \text{const} \]

This is a linear relationship: the plot of \(\log A\) versus \(\log x\) is a straight line, and the slope is the power-law exponent \(p\).

The following table summarizes the measurement methods for each critical exponent:

Table 1: Measurement Methods for Critical Exponents and Log-Log Slopes | Exponent | Corresponding Physical Quantity | Experimental/Simulation Conditions | Slope on Log-Log Plot | | :---: | :--- | :--- | :--- | | \(\beta\) | Order parameter \(M\) vs reduced temperature \(t < 0\) | Zero external field \(h = 0\), low-temperature side | \(\log M\) vs \(\log(-t)\): slope \(= +\beta\) | | \(\gamma\) | Susceptibility \(\chi\) vs reduced temperature \(t\) | Zero field limit \(h \to 0\) | \(\log \chi\) vs \(\log \|t\|\): slope \(= -\gamma\) | | \(\alpha\) | Specific heat \(C\) vs reduced temperature \(t\) | Zero external field \(h = 0\) | \(\log C\) vs \(\log \|t\|\): slope \(= -\alpha\) | | \(\nu\) | Correlation length \(\xi\) vs reduced temperature \(t\) | Fit \(\xi\) from \(G(r) \sim e^{-r/\xi}\) | \(\log \xi\) vs \(\log \|t\|\): slope \(= -\nu\) | | \(\eta\) | Correlation function \(G(r)\) at \(t = 0\) | Measure at critical point | \(\log G\) vs \(\log r\): slope \(= -(d-2+\eta)\) | | \(\delta\) | Order parameter \(M\) vs external field \(h\) at \(t = 0\) | Critical isotherm | \(\log M\) vs \(\log h\): slope \(= +1/\delta\) |

Note the sign of the slope: The sign of the slopes in the table needs careful understanding. Taking \(\gamma\) as an example: \(\chi \sim |t|^{-\gamma}\) means \(\chi\) increases when \(|t|\) decreases, so the slope of \(\log \chi\) vs \(\log |t|\) is negative \(-\gamma\). The situation for \(\beta\) is different: \(M \sim (-t)^\beta\) means \(M\) increases when \((-t)\) increases (on the low-temperature side \(-t > 0\)), the slope is positive \(+\beta\).

Challenges in practical measurements:

Although the above methods are very straightforward in principle, there are many difficulties in actual operation:

Finite-Size Effects: True phase transitions only exist in infinitely large systems. Finite systems in experiments and simulations cause the "pseudo-critical point" position to shift, response function peaks to be finite rather than divergent, thus systematically deviating from theoretically predicted power-law behavior.
Width of the critical region: Power laws only hold within the critical region where \(|t| \ll 1\). Far from the critical point, corrections to scaling appear. Therefore, data fitting requires careful selection of the fitting range—too far from the critical point introduces correction terms; too close to the critical point is affected by finite-size effects.
Uncertainty in critical point location: In experiments, the precise value of \(T_c\) is often not known in advance and needs to be determined from data. But errors in \(T_c\) propagate to the calculation of reduced temperature \(t\), thereby affecting the measurement of exponents.
Statistical fluctuations and systematic errors: Experimental noise and finite sampling in simulations both produce scatter on log-log plots, especially when data at small \(|t|\) values is often sparse and noisy.

More reliable methods: Due to the above challenges, simple log-log fitting is often not reliable enough. Modern research widely adopts Finite-Size Scaling (FSS) and Data Collapse methods. These techniques use data from different system sizes and "collapse" them onto the same master curve through appropriate scaling transformations. If the correct critical exponents and critical point location are chosen, the collapse succeeds; otherwise the data scatters. This method is more robust than simple power-law fitting because it simultaneously uses multiple datasets and shape information of the scaling function. Section 7 will demonstrate this method in detail through the example of percolation phase transition.

4.4 Comparison of Critical Exponents Across Different Universality Classes¶

The first three sections have mentioned a key concept multiple times: Universality. Different physical systems—ferromagnets, liquids, superconductors, percolation networks—may exhibit exactly the same critical exponents near the critical point. Systems sharing the same set of exponents are grouped into the same universality class.

The following table lists the critical exponents of several most common universality classes:

Table 2: Critical Exponents of Different Universality Classes

Exponent	Physical Definition	Mean-Field (Landau)	2D Ising (Exact)	3D Ising (Numerical)	3D Percolation (Numerical)
\(\alpha\)	\(C \sim \\|t\\|^{-\alpha}\)	0 (jump)	0 (log)	~0.11	~-0.62
\(\beta\)	\(M \sim (-t)^\beta\)	1/2	1/8 = 0.125	~0.326	~0.41
\(\gamma\)	\(\chi \sim \\|t\\|^{-\gamma}\)	1	7/4 = 1.75	~1.24	~1.80
\(\delta\)	\(M(t=0) \sim h^{1/\delta}\)	3	15	~4.79	~18
\(\nu\)	\(\xi \sim \\|t\\|^{-\nu}\)	1/2	1	~0.63	~0.88
\(\eta\)	\(G(r) \sim r^{-(d-2+\eta)}\)	0	1/4 = 0.25	~0.036	~-0.05

Mean-field theory gives a set of "simple" exponent values (\(\beta = 1/2\), \(\gamma = 1\), etc.). These values are correct in high-dimensional space (\(d > 4\)), but are corrected by fluctuation effects in low-dimensional space. Mean-field's \(\alpha = 0\) corresponds to a jump in specific heat (not divergence, but a finite step discontinuity), which is different from the logarithmic divergence of 2D Ising.
The critical exponents of the 2D Ising model can be calculated exactly (Onsager solution) and are an important benchmark for theoretical verification. Note that \(\eta = 1/4\) and \(\delta = 15\) both significantly deviate from mean-field values, showing the strong effect of two-dimensional fluctuations.
The 3D Ising model has no exact solution; exponent values come from high-precision numerical simulations and field-theoretic calculations (\(\epsilon\)-expansion). The 3D Ising universality class includes uniaxial ferromagnets, liquid-gas transition critical points, order-disorder transitions in binary alloys, etc.
The 3D percolation model describes phase transitions of geometric connectivity (such as the entangled granular network discussed in Section 7 of this lecture). Its exponents differ from the Ising universality class, reflecting the "non-thermodynamic" nature of percolation transitions. Note particularly that \(\alpha < 0\), meaning specific heat does not diverge at the critical point, but has a cusp.

Deeper meaning of universality classes: Different columns in the table represent different universality classes; systems in the same column share the same critical exponents. The deep reason for this fact is: what determines critical exponents is only the spatial dimension \(d\), the symmetry of the order parameter, and the range of interactions, independent of the microscopic details of the system (such as lattice structure, interaction strength, atomic mass). In the language of the renormalization group, systems in the same universality class converge to the same fixed point in the RG flow, and the critical exponents are determined by the properties of the fixed point. Section 6 will discuss the concept of universality classes in detail.

4.5 The Essence of Critical Exponents: "Fingerprints" of Fixed Points¶

Before ending this section, we need to clarify a key conceptual question: What exactly are critical exponents?

A common but incorrect understanding is: critical exponents are parameters fitted from experimental data, similar to determining coefficients when fitting a parabola. If so, different experiments might give different exponent values, and these values would have no special physical meaning.

However, the true nature of critical exponents is completely different. Critical exponents are "fingerprints" of fixed points—they are not fitting parameters, but mathematical constants uniquely determined by the universality class of the system.

This viewpoint can be understood from two perspectives:

From a phenomenological perspective: Section 3 already revealed that near the critical point the system enters a "scale-free world", losing its intrinsic length scale. In this case, microscopic details (such as lattice constant, interaction strength) are "forgotten", and only universal properties (such as symmetry, dimension) survive. This means critical exponents cannot depend on microscopic parameters—they must be universal.

From the renormalization group perspective: Lectures 1 and 2 introduced the core picture of RG: coarse-graining processes cause the system to "flow" from microscopic to macroscopic scales. Systems from different microscopic starting points may converge in this flow to the same fixed point—a special point that no longer changes under RG transformations. Critical exponents are determined by the linearized behavior near the fixed point: when the system approaches the fixed point, the behavior of the RG flow is completely controlled by the properties of the fixed point, independent of initial conditions.

This picture can be understood with a simple analogy. Imagine a valley (fixed point) and rivers flowing toward the valley from all directions (RG flows). Different rivers start from different origins (different microscopic systems), but they all eventually converge into the same valley. Once near the valley, the behavior of all rivers is determined by the valley's terrain, independent of their origins. Critical exponents are parameters describing the "valley terrain"—they are intrinsic properties of the fixed point, independent of specific "rivers" (microscopic systems).

Critical exponents are "fingerprints" of fixed points: looking at exponents is like identifying a person by fingerprints. This analogy also explains why precise measurement of critical exponents is so important. Fingerprints can be used to identify identity—similarly, critical exponents can be used to determine which universality class a system belongs to. If measured exponents match the 3D Ising universality class, then the system is "certified" as a 3D Ising class member, whether it is a ferromagnet, liquid, or alloy.

In this lecture, we will not use the complete mathematical machinery of the renormalization group to derive the values of critical exponents—that is the main content of subsequent lectures. We will adopt a more direct approach: the scaling hypothesis and homogeneity. The next section will show that starting from just one concise mathematical assumption, one can derive the scaling laws that critical exponents must satisfy, compressing six seemingly independent exponents into only two independent parameters. The success of this data compression is itself powerful evidence for the correctness of the scaling hypothesis.

5. Scaling Hypothesis and Homogeneity: Why Are Only Two of Six Exponents Independent?¶

Section 4 defined six critical exponents \((\alpha, \beta, \gamma, \delta, \nu, \eta)\), which respectively describe the power-law behavior of specific heat, order parameter, susceptibility, critical isotherm, correlation length, and correlation function near the critical point. If these six exponents were completely independent fitting parameters, then describing a critical system would require six degrees of freedom—meaning that almost no universal connections could exist between different systems.

However, experiments and theory both reveal an amazing fact: these six exponents are not independent. There exist strict mathematical relationships between them, called scaling laws. For example, the Rushbrooke scaling law states \(\alpha + 2\beta + \gamma = 2\); the Widom scaling law states \(\gamma = \beta(\delta - 1)\). These relationships compress six exponents into only two independent parameters—once any two of them (say \(\nu\) and \(\eta\)) are known, the other four are completely determined.

This data compression is not coincidental, but originates from the deep structure of physics near the critical point. Section 3 already revealed that at the critical point the system enters the scale-free world, losing its intrinsic length scale. This scale invariance imposes strong constraints on the mathematical form of the free energy—the singular part of the free energy must be a special functional form called a generalized homogeneous function. Starting from this concise mathematical assumption, all scaling laws can be derived as inevitable corollaries.

5.1 Why Can Only Power Laws Appear Near the Critical Point?¶

Before delving into the mathematical form of the scaling hypothesis, let us first answer a more fundamental question: Why must physical quantities near the critical point exhibit power-law behavior, rather than other forms (such as exponential, logarithmic, polynomial)?

The answer is hidden in the core fact revealed in Section 3: At the critical point, the system has only one divergent length scale—the correlation length \(\xi\). When \(\xi \to \infty\), the system loses its intrinsic length scale and enters the scale-free world. In this case, the behavior of physical quantities can only be determined by \(\xi\).

Consider the free energy density \(f\) (free energy per unit volume). From dimensional analysis, the dimension of \(f\) is \([\text{energy}]/[\text{volume}] = [\text{energy}] \cdot [\text{length}]^{-d}\). Near the critical point, the only available length scale is \(\xi\), so the singular part of the free energy density (the part that causes critical behavior) can only be:

\[ f_s \sim \xi^{-d} \]

This is a dimensional argument: if no other length scale is available, \(f_s\) must be proportional to \(\xi^{-d}\), because this is the only combination that gives the correct dimension.

Now, since \(\xi \sim |t|^{-\nu}\), we immediately get:

\[ f_s \sim \xi^{-d} \sim |t|^{d\nu} \]

This is the origin of power-law behavior: Near the critical point there is only one divergent length scale, and all physical quantities must be constructed from this scale, with the only way of "construction" being power combinations.

Returning to the North American blackout example in the introduction: when the power network operates in a critical state, the range of load correlations between nodes (analogous to \(\xi\)) extends to the entire network. In this case, the influence range of local failures is no longer determined by any intrinsic scale of the network (such as inter-node distance), but by this divergent correlation length. The system's response to perturbations (analogous to \(\chi\)) therefore exhibits power-law divergence, rather than remaining finite or growing exponentially.

5.2 Generalized Homogeneous Functions: Mathematical Expression of the Scaling Hypothesis¶

The dimensional argument in the previous subsection, while intuitive, is not rigorous—it assumed \(f_s \sim \xi^{-d}\), which is actually an additional assumption (called the hyperscaling hypothesis). We need a more general mathematical framework to describe critical behavior, and this is the scaling hypothesis.

5.2.1 What Are Homogeneous Functions?¶

Before introducing the scaling hypothesis, we need to understand a mathematical concept: homogeneous functions.

A function \(f(x, y)\) is called a homogeneous function of degree \(p\) if it satisfies the following property:

\[ f(\lambda x, \lambda y) = \lambda^p f(x, y) \]

for any positive number \(\lambda\) and some fixed exponent \(p\).

Physical intuition: Homogeneous functions describe a kind of scale invariance—if you simultaneously scale all independent variables by a factor of \(\lambda\), the function value also responds in a definite way (scaling by \(\lambda^p\)).

A simple example: The area formula of a circle \(A = \pi r^2\) is a homogeneous function of degree 2 in \(r\). If the radius is doubled, the area quadruples. This is the scaling relationship in geometry.

However, the scaling behavior in critical phenomena is more complex: reduced temperature \(t\) and external field \(h\) have different degrees of influence on physical quantities. We need a more flexible concept—generalized homogeneous functions:

\[ f(\lambda^a x, \lambda^b y) = \lambda f(x, y) \]

Here \(a\) and \(b\) are two different exponents, describing the different "weights" of \(x\) and \(y\) under scaling transformations.

5.2.2 Widom Scaling Hypothesis¶

In 1965, Benjamin Widom proposed a bold hypothesis: the singular part of the free energy is a generalized homogeneous function of reduced temperature \(t\) and external field \(h\).

First, we need to divide the free energy into two parts:

\[ f(t, h) = f_{\text{reg}}(t, h) + f_s(t, h) \]

The first part \(f_{\text{reg}}\) is the regular part, which is smooth near the critical point and can be expanded as a Taylor series in reduced temperature \(t\) and external field \(h\), corresponding to "normal" thermodynamic behavior. The second part \(f_s\) is the singular part, which is non-analytic at the critical point \((t=0, h=0)\)—it is precisely this non-analyticity that causes response function divergence and all critical phenomena. The scaling hypothesis focuses on this singular part.

The Widom scaling hypothesis states that the singular part satisfies a special functional equation:

\[ \boxed{f_s(t, h) = b^{-d} f_s(b^{y_t} t, b^{y_h} h)} \]

The physical meaning of this formula needs careful interpretation. Here \(b > 1\) is an arbitrary scaling factor, which can be understood as "coarse-graining the system by a factor of \(b\)"—imagine merging \(b^d\) original lattice sites into one new "super-site". In this coarse-graining process, the free energy density (free energy per unit volume) changes by a factor of \(b^{-d}\), because the unit volume has become \(b^d\) times larger. At the same time, the reduced temperature \(t\) and external field \(h\) also change with coarse-graining: \(t\) is multiplied by a factor \(b^{y_t}\), and \(h\) is multiplied by a factor \(b^{y_h}\). Here \(y_t\) and \(y_h\) are called the scaling dimensions of \(t\) and \(h\), describing the rate at which these two parameters grow or shrink during the coarse-graining process.

From a physical intuition perspective, this formula captures the self-similarity of systems near the critical point: if you "step back" to observe the system (i.e., coarse-grain), the statistical behavior of the system has a definite mathematical relationship with the original behavior. Reduced temperature and external field change at specific rates, while free energy density changes by a volume factor. This is precisely the precise mathematical expression of the scale invariance discussed in Section 3.

In the language of the renormalization group, \(y_t\) and \(y_h\) are precisely the characteristic exponents of the linearized RG transformation near the fixed point. Their sign determines whether the corresponding parameter is relevant (relevant, \(y > 0\), moves away from the fixed point) or irrelevant (irrelevant, \(y < 0\), flows toward the fixed point) in the RG flow. For critical phenomena, both \(t\) and \(h\) are relevant parameters (\(y_t, y_h > 0\)), which explains why controlling only these two parameters is sufficient to drive the system through the phase transition. This RG picture will be fully developed in subsequent lectures.

5.3 Deriving Relationships Between Critical Exponents from the Scaling Hypothesis¶

The power of the scaling hypothesis lies in: starting from this one formula, all relationships between critical exponents can be derived. Below we demonstrate this derivation process.

5.3.1 Eliminating the Scaling Factor b¶

The \(b\) in the scaling hypothesis is arbitrary, and we can choose a specific value to simplify calculations. The classic trick is: choose \(b\) such that \(b^{y_t} |t| = 1\).

This means \(b = |t|^{-1/y_t}\). Substituting into the scaling hypothesis:

\[ f_s(t, h) = |t|^{d/y_t} f_s\left(\pm 1, \frac{h}{|t|^{y_h/y_t}}\right) \]

where \(\pm 1\) corresponds to \(t > 0\) (high-temperature phase) and \(t < 0\) (low-temperature phase).

Define two important quantities:

\[ \nu \equiv \frac{1}{y_t}, \qquad \Delta \equiv \frac{y_h}{y_t} \]

Here \(\nu\) is precisely the correlation length exponent (because in RG language, \(\xi \sim |t|^{-1/y_t} = |t|^{-\nu}\)), and \(\Delta\) is called the gap exponent, describing the scaling weight of external field \(h\) relative to temperature \(t\).

Thus, the singular part of the free energy can be written as:

\[ \boxed{f_s(t, h) = |t|^{d\nu} \Phi_\pm\left(\frac{h}{|t|^\Delta}\right)} \]

where \(\Phi_\pm\) are two scaling functions, corresponding to the high-temperature and low-temperature phases respectively. This form tells us: the singular part of the free energy is not an arbitrary function of \(t\) and \(h\), but a specific combination form—\(|t|^{d\nu}\) multiplied by a function that depends only on \(h/|t|^\Delta\).

5.3.2 Deriving Each Critical Exponent¶

Now we can derive the relationships between each critical exponent and \(\nu\) and \(\Delta\) starting from the scaling form of the free energy.

Order Parameter \(M\) and Exponent \(\beta\)

The order parameter is the derivative of free energy with respect to external field:

\[ M(t, h) = -\frac{\partial f_s}{\partial h} \]

Differentiating the scaling form:

\[ M(t, h) = -\frac{\partial}{\partial h}\left[|t|^{d\nu} \Phi_\pm\left(\frac{h}{|t|^\Delta}\right)\right] = |t|^{d\nu - \Delta} \mathcal{M}_\pm\left(\frac{h}{|t|^\Delta}\right) \]

where \(\mathcal{M}_\pm(x) = -\Phi'_\pm(x)\) is another scaling function.

In the low-temperature phase (\(t < 0\)) at zero external field (\(h = 0\)), the order parameter grows as \(M \sim (-t)^\beta\). Comparing both sides, we get:

\[ \boxed{\beta = d\nu - \Delta} \]

Susceptibility \(\chi\) and Exponent \(\gamma\)

Susceptibility is the derivative of order parameter with respect to external field:

\[ \chi(t) = \left.\frac{\partial M}{\partial h}\right|_{h=0} \]

Differentiating the scaling form of \(M\) once more:

\[ \chi(t) \sim |t|^{d\nu - 2\Delta} \]

Since \(\chi \sim |t|^{-\gamma}\), comparing both sides:

\[ \boxed{\gamma = 2\Delta - d\nu} \]

Specific Heat \(C\) and Exponent \(\alpha\)

Specific heat is the second derivative of free energy with respect to temperature:

\[ C \sim -\frac{\partial^2 f}{\partial T^2} \sim \frac{\partial^2 f_s}{\partial t^2} \]

At \(h = 0\), \(f_s(t, 0) \sim |t|^{d\nu}\). Differentiating with respect to \(t\) twice:

\[ C \sim |t|^{d\nu - 2} \]

Since \(C \sim |t|^{-\alpha}\), comparing both sides:

\[ \boxed{2 - \alpha = d\nu} \]

This is the famous Josephson hyperscaling relation. It connects the specific heat exponent \(\alpha\), correlation length exponent \(\nu\), and spatial dimension \(d\).

Critical Isotherm Exponent \(\delta\)

At exactly \(t = 0\), we need to choose \(b\) in another way. Let \(b = h^{-1/y_h}\), substituting into the scaling hypothesis:

\[ f_s(0, h) = h^{d/y_h} f_s(0, 1) \]

Differentiating with respect to \(h\) gives the order parameter:

\[ M(0, h) \sim h^{d/y_h - 1} = h^{(d - y_h)/y_h} \]

Since \(M(0, h) \sim h^{1/\delta}\), comparing both sides:

\[ \boxed{\delta = \frac{y_h}{d - y_h}} \]

This can be re-expressed using \(\nu\) and \(\Delta\) as \(\delta = \Delta / (d\nu - \Delta) = \Delta / \beta\), giving \(\Delta = \beta \delta\).

5.3.3 Summary: All Exponents Determined by Two Parameters¶

Let us organize the relationships we have derived:

\[ \begin{aligned} \beta &= d\nu - \Delta \\ \gamma &= 2\Delta - d\nu \\ 2 - \alpha &= d\nu \\ \delta &= \frac{\Delta}{\beta} = \frac{\Delta}{d\nu - \Delta} \end{aligned} \]

These four equations contain four exponents \((\alpha, \beta, \gamma, \delta)\) and two fundamental parameters \((\nu, \Delta)\). This means: once \(\nu\) and \(\Delta\) (or equivalently, \(\nu\) and \(\eta\)) are known, all other exponents are completely determined.

Six critical exponents, but only two are independent—this is precisely the core prediction of the scaling hypothesis.

5.4 Four Major Scaling Laws: Identities Between Exponents¶

By eliminating \(\nu\) and \(\Delta\), we can obtain identities that directly connect critical exponents. These are the famous scaling laws.

5.4.1 Rushbrooke Scaling Law¶

\[ \boxed{\alpha + 2\beta + \gamma = 2} \]

Derivation: Substituting \(\alpha = 2 - d\nu\), \(\beta = d\nu - \Delta\), \(\gamma = 2\Delta - d\nu\) into the left side:

\[ (2 - d\nu) + 2(d\nu - \Delta) + (2\Delta - d\nu) = 2 - d\nu + 2d\nu - 2\Delta + 2\Delta - d\nu = 2 \]

The identity holds!

Physical meaning: The Rushbrooke scaling law connects the three most easily measured thermodynamic response functions—specific heat (\(\alpha\)), order parameter (\(\beta\)), and susceptibility (\(\gamma\)). It was one of the earliest predictions of scaling theory to be experimentally verified.

Numerical verification: Taking the 3D Ising model as an example, \(\alpha \approx 0.11\), \(\beta \approx 0.326\), \(\gamma \approx 1.24\). Substituting: \(0.11 + 2 \times 0.326 + 1.24 = 2.002 \approx 2\). The agreement is very good!

5.4.2 Widom Scaling Law¶

\[ \boxed{\gamma = \beta(\delta - 1)} \]

Derivation: From \(\gamma = 2\Delta - d\nu\) and \(\beta = d\nu - \Delta\), we have \(\gamma + \beta = \Delta\). Also from \(\delta = \Delta / \beta\), we get \(\Delta = \beta \delta\). Therefore:

\[ \gamma = \Delta - \beta = \beta \delta - \beta = \beta(\delta - 1) \]

Physical meaning: The Widom scaling law connects the divergence of susceptibility (\(\gamma\)) with the nonlinearity of the critical isotherm (\(\delta\)) and the growth of the order parameter (\(\beta\)). It reflects the homogeneity structure of the equation of state \(M = M(t, h)\).

5.4.3 Fisher Scaling Law¶

\[ \boxed{\gamma = (2 - \eta)\nu} \]

Derivation: Deriving this scaling law requires using the fluctuation-dissipation theorem \(k_B T \chi = \int d^d r \, G(r)\) established in Section 3.

Near the critical point, the complete form of the correlation function is (Ornstein-Zernike form plus power-law correction):

\[ G(r) \sim \frac{1}{r^{d-2+\eta}} e^{-r/\xi} \]

Integrating the correlation function in \(d\)-dimensional space:

\[ \chi \sim \int_0^\infty dr \, r^{d-1} \cdot \frac{1}{r^{d-2+\eta}} e^{-r/\xi} = \int_0^\infty dr \, r^{1-\eta} e^{-r/\xi} \]

This integral can be calculated using the substitution \(u = r/\xi\):

\[ \chi \sim \xi^{2-\eta} \int_0^\infty du \, u^{1-\eta} e^{-u} \sim \xi^{2-\eta} \]

Substituting \(\xi \sim |t|^{-\nu}\):

\[ \chi \sim |t|^{-\nu(2-\eta)} \]

Comparing with \(\chi \sim |t|^{-\gamma}\), we get \(\gamma = (2-\eta)\nu\).

Physical meaning: The Fisher scaling law connects thermodynamic response (susceptibility \(\chi\)) with spatial geometric structure (correlation function decay exponent \(\eta\)). Its profundity lies in: the macroscopic "hypersensitive" response can be completely explained by the microscopic correlation function's "long tail". This is precisely the mathematical manifestation of the "correlation length divergence \(\Leftrightarrow\) power-law correlation \(\Leftrightarrow\) response divergence" equivalence chain established in Section 3.

5.4.4 Josephson Hyperscaling Law¶

\[ \boxed{d\nu = 2 - \alpha} \]

This relation is also called hyperscaling because it includes the spatial dimension \(d\).

Physical meaning: The hyperscaling law comes from the dimensional argument \(f_s \sim \xi^{-d}\) discussed in Section 5.1. It connects the specific heat exponent \(\alpha\), correlation length exponent \(\nu\), and spatial dimension \(d\).

Unlike the other three scaling laws, the hyperscaling law fails in high-dimensional space. When the spatial dimension \(d\) exceeds a certain critical value \(d_c\) (called the upper critical dimension, for the Ising model \(d_c = 4\)), fluctuations become relatively weak, and the system's behavior is dominated by mean-field theory. In this case, the intuition \(f_s \sim \xi^{-d}\) no longer holds, and the hyperscaling law is modified.

We will discuss this in detail in subsequent lectures on Landau theory and the Ginzburg criterion. For now, just remember: when \(d > d_c\), hyperscaling fails, but the other three scaling laws still hold.

5.4.5 Summary of Scaling Laws¶

Table: Four Major Scaling Laws and Their Properties

Name	Formula	Exponents Involved	Applicability Conditions
Rushbrooke	\(\alpha + 2\beta + \gamma = 2\)	\(\alpha, \beta, \gamma\)	Exact when scaling hypothesis holds
Widom	\(\gamma = \beta(\delta - 1)\)	\(\beta, \gamma, \delta\)	Exact when scaling hypothesis holds
Fisher	\(\gamma = (2 - \eta)\nu\)	\(\gamma, \eta, \nu\)	Exact when scaling hypothesis holds
Josephson	\(d\nu = 2 - \alpha\)	\(\alpha, \nu, d\)	Only holds when \(d < d_c\)

5.5 Data Collapse: Experimental Verification of the Scaling Hypothesis¶

The scaling hypothesis not only gives relationships between exponents, but also provides a powerful experimental verification method—data collapse.

5.5.1 From Scaling Form to Data Collapse¶

Returning to the scaling form of the order parameter:

\[ M(t, h) = |t|^\beta \mathcal{M}_\pm\left(\frac{h}{|t|^\Delta}\right) \]

This formula tells us: although magnetization curves at different temperatures \(t\) and external fields \(h\) look different, if the coordinate axes are rescaled in the correct way, all curves should collapse onto the same master curve.

The specific operation method is as follows: first measure the magnetization \(M\) as a function of external field \(h\) at several different temperatures \(t_1, t_2, \ldots\), obtaining a family of curves that look chaotic. Then perform scaling transformation on each dataset—change the horizontal axis to \(x = h / |t|^\Delta\), and the vertical axis to \(y = M / |t|^\beta\). If the scaling hypothesis is correct, after this transformation, all data points should fall on the same curve \(y = \mathcal{M}_\pm(x)\).

The success or failure of the collapse provides key diagnostic information. If the data indeed collapses onto a master curve, it means you chose the correct \(\beta\) and \(\Delta\), and the scaling hypothesis is verified. Conversely, if collapse fails (data points scattered randomly), it means either the parameter values are wrong, or the system does not satisfy the scaling hypothesis at all—possibly because measurements are not in the true critical region, or there are corrections to scaling behavior.

5.5.2 The Power of Data Collapse¶

The data collapse method has significant advantages over traditional single-point power-law fitting. First, it utilizes all measurement results from the entire critical region, rather than just a few data points near the critical point, so statistical precision is higher and it is less sensitive to systematic errors. Second, if exponent values obtained from other independent measurements (such as specific heat measurements) are used to attempt collapse, the success of collapse provides an independent test of the scaling hypothesis—this is a powerful self-consistency check.

Additionally, by adjusting the value of \(T_c\) to optimize collapse quality, the critical temperature can be determined very precisely—this is often more accurate than directly searching for response function peaks. Finally, data collapse can be used to identify universality class: using theoretical exponent values from different universality classes to attempt collapse respectively, the set of exponents that successfully collapses reveals the universality class membership of the system.

Using the city data in the figure below as an example, city street networks, population numbers, and residential CO₂ emissions at different spatial resolutions are systematically coarse-grained. By gradually increasing the box size, cities are viewed as statistical systems composed of multi-scale units, providing the foundation for subsequent scale analysis and distribution rescaling. This process reveals the structural consistency of urban metabolism variables with spatial scale changes. For related paper interpretation, see: Cities Also Have "Physical Phase Transitions": Discovering Universal "Collapse" Laws of Cities in Fluctuations

In different cities (cross-continental samples), urban metabolism variables such as population (N), emissions (M), and building scale (B) show obvious differences at the original scale (left column). When variables are appropriately rescaled by power laws according to their means (right column), the distribution functions of all cities collapse onto the same universal curve, indicating that city data are not independent from each other, but are governed by the same scale-invariant structure. This data collapse phenomenon shows: Urban systems can be viewed as belonging to the same universality class in a statistical sense, with their spatial and social heterogeneity effectively absorbed into a single scaling function.

5.5.3 Finite-Size Scaling¶

In practical numerical simulations, systems are always of finite size (such as an \(L \times L \times L\) lattice). Finite systems cannot exhibit true phase transitions—response functions do not diverge, only showing finite peaks. How can we extract the critical behavior of infinite systems from finite system data?

The answer is Finite-Size Scaling (FSS). The core idea is: in finite systems, when the correlation length \(\xi\) grows to approach the system size \(L\), the system "feels" the presence of boundaries, and critical behavior is truncated.

FSS predicts: near the critical point, physical quantities depend on both \(t\) and \(L\), satisfying scaling forms:

\[ M(t, L) = L^{-\beta/\nu} \widetilde{\mathcal{M}}\left(t \cdot L^{1/\nu}\right) \]

\[ \chi(t, L) = L^{\gamma/\nu} \widetilde{\chi}\left(t \cdot L^{1/\nu}\right) \]

These formulas lead to specific schemes for finite-size data collapse: set the horizontal axis as \(x = t \cdot L^{1/\nu}\) (for percolation it is \((p - p_c) \cdot L^{1/\nu}\)), and the vertical axis as \(M \cdot L^{\beta/\nu}\) or \(\chi \cdot L^{-\gamma/\nu}\). If the exponents are chosen correctly, data from different system sizes \(L\) will collapse onto the same master curve. The subtlety of this method is: it turns finite-size effects from an "obstacle" into a "tool"—by systematically varying \(L\), we actually gain more data available for collapse. Section 7's percolation phase transition practice will demonstrate in detail how to use these techniques to extract critical exponents from simulation data.

5.6 When Does Hyperscaling Fail? Upper Critical Dimension¶

Before ending this section, we need to discuss an important boundary condition: under what circumstances does hyperscaling fail?

5.6.1 Upper Critical Dimension¶

Section 5.4 already mentioned that the hyperscaling law \(d\nu = 2 - \alpha\) includes spatial dimension \(d\), which is different from other scaling laws. The deeper reason is: this relation depends on the assumption \(f_s \sim \xi^{-d}\), i.e., that free energy density is completely determined by correlation length.

However, when spatial dimension is high enough, fluctuation effects become relatively weak. In high-dimensional space, particles have more directions to "avoid" the influence of each other's fluctuations, so mean-field approximation (theory that ignores fluctuations) becomes increasingly accurate.

There exists a critical dimension \(d_c\), called the upper critical dimension, which divides spatial dimension into three regions. When \(d < d_c\), fluctuation effects dominate, critical exponents are determined by collective fluctuation behavior, and hyperscaling holds. When \(d > d_c\), fluctuations are relatively unimportant, the system exhibits mean-field behavior, critical exponents take the "classical values" given by Landau theory, and hyperscaling fails. When exactly \(d = d_c\), the situation is at the boundary, and usually logarithmic corrections appear—power-law behavior is multiplied by logarithmic factors.

The specific value of the upper critical dimension depends on the universality class of the system:

Universality Class	Upper Critical Dimension \(d_c\)
Ising / Liquid-Gas / Binary Mixtures	4
XY / Superfluid Transition	4
Heisenberg / Ferromagnetic Transition	4
Percolation	6

5.6.2 Mean-Field Exponents¶

When \(d > d_c\), critical exponents take mean-field values (also called Landau exponents):

\[ \alpha = 0, \quad \beta = \frac{1}{2}, \quad \gamma = 1, \quad \delta = 3, \quad \nu = \frac{1}{2}, \quad \eta = 0 \]

It can be verified that these values satisfy the Rushbrooke, Widom, and Fisher scaling laws, but do not satisfy the hyperscaling law (unless exactly \(d = 4\)).

Taking the \(d = 5\) Ising model as an example, we can clearly see the failure of hyperscaling. If hyperscaling \(d\nu = 2 - \alpha\) holds, substituting \(d = 5\) and mean-field value \(\nu = 1/2\) would predict \(\alpha = 2 - 5 \times 0.5 = -0.5\). But in fact, the specific heat exponent of the \(d = 5\) Ising model is \(\alpha = 0\) (mean-field value), contradicting the hyperscaling prediction. This contradiction is clear evidence of hyperscaling failure in high dimensions.

The fundamental reason for failure involves so-called dangerous irrelevant variables. In RG language, these variables correspond to directions flowing toward the fixed point (hence called "irrelevant"), but they still influence the scaling behavior of certain physical quantities in subtle ways. This deep mechanism will be discussed in detail in subsequent lectures on Landau theory and Wilson renormalization group.

5.6.3 Why Is This Important?¶

Understanding when hyperscaling holds and when it fails is crucial for correctly applying scaling theory. In practical research, for three-dimensional systems (\(d = 3\)), since \(3 < 4 = d_c\), hyperscaling usually holds and can be safely used. But for high-dimensional models or when using mean-field approximation, caution is needed—directly applying hyperscaling will give wrong results. For percolation problems, the upper critical dimension is \(d_c = 6\), so three-dimensional percolation (\(d = 3 < 6\)) still satisfies hyperscaling, which will be verified in the practice of Section 7.

Remember a bridge sentence: The reason RG is necessary is not just to calculate the values of critical exponents, but more importantly to tell us which scaling relations hold or fail under what conditions.

6. Universality Classes: Microscopically Vastly Different, Yet Critical Behavior "Converges"¶

The end of Section 4 raised a profound question: critical exponents are "fingerprints of fixed points", and different systems in the same universality class share the same exponents. But this leads to a more fundamental question—what determines which universality class a system belongs to?

Recall the examples in the introduction: the Curie transition of ferromagnets, the liquid-gas critical point of water, and the order-disorder transition of binary alloys—the microscopic physics of these three systems are completely different. Ferromagnets involve exchange interactions of electron spins, liquid-gas transitions involve van der Waals forces between molecules, and alloy transitions involve atomic occupation on lattices. However, their critical exponents are nearly identical (all belonging to the 3D Ising universality class)!

This universality is one of the most amazing characteristics of critical phenomena. It means: microscopic details are "forgotten" near the critical point, and only a few macroscopic features determine the critical behavior of the system.

This section will reveal what these "macroscopic features" are, and why they can determine universality class.

6.1 Physical Origin of Universality: Microscopic Details Drowned by Fluctuations¶

To understand universality, we need to return to the core conclusion of Section 3: At the critical point, correlation length \(\xi \to \infty\).

Far from the critical point, the system's behavior is determined by the details of microscopic interactions. Different lattice structures, different interaction strengths, different atomic masses all lead to different thermodynamic properties. It is like viewing a mosaic painting—standing too close, you see the color and shape of each small tile.

But near the critical point, the situation fundamentally changes. When \(\xi\) is much larger than microscopic scales (such as lattice constant), the system's behavior is dominated by collective fluctuations spanning the scale of \(\xi\). It is like standing far enough to view a mosaic painting—you see the overall pattern and no longer care about the details of individual tiles.

Mathematically, this "forgetting" can be understood through RG flow: systems from different microscopic starting points flow toward the same fixed point during the coarse-graining process. Once near the fixed point, the system's behavior is completely determined by the properties of the fixed point, independent of initial conditions. Just like different rivers eventually converge into the same valley—once in the valley, the behavior of rivers is determined by the valley's terrain, independent of their sources.

Physically, this means critical behavior depends only on the few macroscopic features of the system, not on microscopic details. These macroscopic features are precisely the factors that determine universality class.

6.2 Three Key Factors Determining Universality Class¶

After decades of theoretical and experimental research, physicists have found that the factors determining universality class can be reduced to three:

6.2.1 Spatial Dimension d¶

Spatial dimension is one of the most important factors determining universality class. The same model can have completely different critical behavior in different dimensions. The physical reason is that dimension determines the "strength" of fluctuations: in low-dimensional systems, particles have fewer directions to "avoid" the influence of fluctuations, so fluctuation effects are stronger.

The behavior of the Ising model in different dimensions vividly demonstrates this. In one dimension, the Ising model has no finite-temperature phase transition at all (\(T_c = 0\)), because thermal fluctuations in one-dimensional systems are sufficient to destroy any ordered state—only a few "domain walls" are enough to completely destroy long-range order.

In two dimensions, a qualitative change occurs: the Ising model has a true phase transition, and Onsager gave an exact analytical solution in 1944, with critical exponents taking special values \(\beta = 1/8\), \(\gamma = 7/4\), \(\nu = 1\). The three-dimensional Ising model also has a phase transition, but critical exponents need to be obtained through numerical calculation or field-theoretic methods, with values \(\beta \approx 0.326\), \(\gamma \approx 1.24\), \(\nu \approx 0.63\)—completely different from the two-dimensional case.

When dimension reaches four or higher, fluctuations become weak enough, and the system enters the mean-field regime, with critical exponents taking classical values \(\beta = 1/2\), \(\gamma = 1\), \(\nu = 1/2\). This evolutionary sequence from no phase transition to nontrivial phase transition to mean-field phase transition clearly demonstrates the decisive influence of dimension on critical behavior.

6.2.2 Symmetry of Order Parameter¶

The mathematical structure of the order parameter—whether it is a scalar, vector, or tensor, and what symmetry it has—is another key factor determining universality class. The physical reason is that different symmetries correspond to different low-energy excitation modes (Goldstone modes), leading to different critical fluctuation structures.

The simplest case is the Ising class (\(n = 1\)), where the order parameter is a scalar that can only take positive or negative values, with discrete symmetry group \(\mathbb{Z}_2 = \{+1, -1\}\). This two-state symmetry is extremely common in physics: spontaneous magnetization in ferromagnets can point up or down, in liquid-gas transitions the system is in liquid or gas phase, in binary alloys A atoms are enriched or B atoms are enriched. All these systems belong to the Ising universality class, despite their completely different microscopic physics.

When the order parameter has continuous rotational symmetry, the situation becomes richer. The XY class (\(n = 2\)) has an order parameter that is a two-dimensional vector, which can rotate arbitrarily in the plane, with continuous symmetry group \(O(2)\). Physical realizations include the complex order parameter of superfluid helium-4 \(\psi = |\psi|e^{i\phi}\) (phase \(\phi\) can take any value), Cooper pair condensation in thin-film superconductors, and planar spin systems. It is worth mentioning that the two-dimensional XY model has a special phase transition—the BKT transition (Berezinskii-Kosterlitz-Thouless), which has no true long-range order, but is a topological phase transition driven by unbinding of vortex-antivortex pairs.

The Heisenberg class (\(n = 3\)) has an order parameter that is a three-dimensional vector, which can rotate arbitrarily in three-dimensional space, with symmetry group \(O(3)\). Typical physical realizations are isotropic ferromagnets (such as Fe, Ni, Co), where the magnetization direction can point in any spatial direction, as well as Néel vectors of antiferromagnets.

There is a general rule between these different universality classes: the larger the number of components \(n\) of the order parameter, the stronger the fluctuations (because there are more "directions" to fluctuate), and the greater the deviation of critical exponents from mean-field values. This explains why the critical exponents of the Heisenberg class deviate more from mean-field values than those of the Ising class.

6.2.3 Range of Interactions¶

The rate at which interactions decay with distance also affects universality class.

Short-range interactions (decay exponentially with distance or are truncated):

Most interactions in solids belong to this category
Different short-range interaction forms (nearest-neighbor, next-nearest-neighbor, finite-range) belong to the same universality class

Long-range interactions (power-law decay \(J(r) \sim r^{-\sigma}\)):

When \(\sigma\) is small enough, interactions are "too strong", fluctuations are suppressed, and the system exhibits mean-field behavior
When \(\sigma\) increases, the system gradually transitions to short-range universality class behavior
There exists a critical value \(\sigma_c\); when \(\sigma > \sigma_c\), the system belongs to the short-range universality class

Examples:

Gravitational interaction (\(\sigma = 1\)) is a typical long-range interaction, causing gravitational systems not to satisfy the usual thermodynamic limit.
Dipole interaction (\(\sigma = 3\)) may lead to special critical behavior in low-dimensional systems.

6.3 A "Map" of Universality Classes¶

Combining the above factors, we can draw a "map" of universality classes:

Table: Major Universality Classes and Their Characteristics

Universality Class	Dimension	Order Parameter Symmetry	Typical Physical Systems	Critical Exponent Characteristics
2D Ising	\(d=2\)	\(\mathbb{Z}_2\)	Gas-liquid transition of adsorption layers, single-layer magnetic films	\(\beta=1/8\), \(\gamma=7/4\), \(\nu=1\) (exact)
3D Ising	\(d=3\)	\(\mathbb{Z}_2\)	Ferromagnet Curie point, liquid-gas critical point, binary alloys	\(\beta\approx0.326\), \(\gamma\approx1.24\), \(\nu\approx0.63\)
3D XY	\(d=3\)	\(O(2)\)	Superfluid helium \(\lambda\) transition, superconducting transition	\(\beta\approx0.35\), \(\gamma\approx1.32\), \(\nu\approx0.67\)
3D Heisenberg	\(d=3\)	\(O(3)\)	Isotropic ferromagnets (Fe, Ni, Co)	\(\beta\approx0.37\), \(\gamma\approx1.40\), \(\nu\approx0.71\)
3D Percolation	\(d=3\)	Connectivity	Porous media permeation, network connectivity	\(\beta\approx0.41\), \(\gamma\approx1.80\), \(\nu\approx0.88\)
Mean-Field	\(d>d_c\)	Any	High-dimensional systems, long-range interactions	\(\beta=1/2\), \(\gamma=1\), \(\nu=1/2\)

Note the special nature of percolation: Percolation phase transitions do not involve temperature, but are driven by occupation probability \(p\). Its "order parameter" is the relative size of the largest connected cluster, and its "susceptibility" analog is the fluctuation of cluster size. Despite completely different physical mechanisms, percolation still satisfies scaling laws and forms its own universality class.

6.4 Verification of Universality: Experiments and Numerics¶

A powerful prediction of the universality hypothesis is: completely different physical systems, as long as they belong to the same universality class, should have the same critical exponents.

This prediction has been verified by numerous experiments and numerical simulations. For example:

Verification of 3D Ising Universality Class:

System	\(\beta\)	\(\gamma\)	\(\nu\)	Method
Iron Curie transition	0.34 ± 0.02	1.25 ± 0.05	0.64 ± 0.03	Neutron scattering
CO₂ liquid-gas critical point	0.32 ± 0.01	1.24 ± 0.02	0.63 ± 0.02	Light scattering
Binary mixture phase separation	0.33 ± 0.02	1.23 ± 0.03	0.62 ± 0.02	Light scattering
3D Ising model (MC)	0.3265 ± 0.0003	1.237 ± 0.002	0.6301 ± 0.0004	Monte Carlo

These measurement results are highly consistent within experimental error, strongly supporting the universality hypothesis.

6.5 Universality Classes and Renormalization Group Fixed Points¶

This section has been discussing universality classes from a phenomenological perspective, but the deep mechanism of universality needs to be understood through the renormalization group. Let us preview the picture provided by RG.

RG theory reveals the profound connection between universality classes and fixed points:

\[ \boxed{\text{Universality Class} \longleftrightarrow \text{Basin of Attraction of RG Fixed Point}} \]

\[ \boxed{\text{Critical Exponents} \longleftrightarrow \text{Characteristic Exponents of RG Transformation Near Fixed Point}} \]

Different microscopic systems correspond to different starting points in parameter space. When we perform RG transformations (i.e., coarse-graining), the system "flows" in parameter space—from microscopic to macroscopic scales. Systems belonging to the same universality class, no matter how different their starting points, eventually converge to the same fixed point. Once the system approaches the fixed point, its behavior is completely determined by the properties of the fixed point, independent of initial conditions.

Critical exponents are determined by the linearized behavior (eigenvalues) of RG transformations near the fixed point. This is why systems with completely different microscopic physics can have the same critical behavior—they converge to the same fixed point in the RG flow, just like different rivers eventually converge into the same valley, and the valley's terrain (properties of the fixed point) determines the final behavior of rivers.

This picture also explains why only spatial dimension \(d\), order parameter symmetry, and interaction range determine universality class. In RG language, these are "relevant parameters", whose values determine which fixed point the system flows toward. In contrast, other microscopic details (such as lattice structure, specific numerical values of interaction strength) are "irrelevant parameters"; they gradually disappear in the RG flow and do not affect which fixed point the system eventually reaches. This complete RG picture will be rigorously developed and proven in subsequent lectures on Landau theory, block-spin RG, and Wilson field-theoretic RG.

6.6 From Critical Exponents to Universality Classes¶

The core message of this section can be summarized as follows. Universality is the most amazing characteristic of critical phenomena—systems with completely different microscopic physics can exhibit the same critical behavior and share the same critical exponents. This universality is not coincidental, but originates from the physical mechanism of fluctuation dominance near the critical point: when correlation length diverges, microscopic details are "drowned" by collective behavior, and only a few macroscopic features—spatial dimension \(d\), symmetry of the order parameter (\(\mathbb{Z}_2\), \(O(2)\), \(O(3)\), etc.), and range of interactions—determine the system's critical behavior.

From the renormalization group perspective, universality has a deeper geometric interpretation: systems in the same universality class converge to the same fixed point in the RG flow in parameter space, and critical exponents are precisely the "fingerprints" of this fixed point. By precisely measuring critical exponents, we can determine which universality class a system belongs to, just like identifying a person through fingerprints.

This "all roads lead to Rome" universality is one of the most profound simplification principles in physics. It tells us: near the critical point, nature hides a deep order—microscopic complexity is drowned by fluctuations, leaving only a few universal numbers to characterize the entire critical behavior.

7. Practice: From Entangled Granular Networks to Universality Verification of Percolation Phase Transition¶

The previous six sections established the complete theoretical framework of critical phenomena: from the classification of phase transitions and definition of order parameters, to the divergence of correlation length and physical meaning of critical exponents, to the mathematical structure of scaling hypothesis and classification principles of universality classes. Although these concepts are elegant, they remain mere paper talk if they cannot connect with real physical systems. Therefore, this section will demonstrate how to apply this theoretical framework to frontier research through a concrete case—a study of entangled granular networks published in Nature Communications in 2025—and personally verify the power of the universality hypothesis through numerical simulation.

7.1 Paper Interpretation: Percolation Phase Transition in Entangled Granular Networks¶

The paper "Percolation transition in entangled granular networks" by Seongmin Kim, Daihui Wu, and Yilong Han, published in Nature Communications in 2025 (DOI: 10.1038/s41467-025-66228-3), studied a very special granular system. Unlike traditional spherical particles, this paper studies C-shaped particles—shapes similar to staples or open rings. When these C-shaped particles vibrate in a container, they "hook" onto each other through geometric entanglement, forming a connected network structure.

The uniqueness of this system lies in the completely different nature of its "interactions" from traditional thermodynamic systems. In ferromagnets, spins are coupled through exchange interactions; in liquid-gas systems, molecules attract through van der Waals forces. But in entangled granular networks, the "bonding" between particles is purely topological—two C-shaped particles can pass through each other like chain links, forming so-called Hopf link structures. This entanglement is purely geometric, involving no energy potential wells, so the system is non-thermodynamic, purely geometry-driven.

The core question posed by the paper is: as particle density gradually increases, how does the entangled network evolve from isolated small clusters to a macroscopic network spanning the entire system? Can this emergence of connectivity be described in the language of phase transitions? If so, which universality class does it belong to?

The paper directly uses the critical phenomena language we defined in Section 4. The relative size of the largest connected cluster \(S_1 = s_1/N\) plays the role of order parameter, analogous to the magnetization \(M\) of ferromagnets; the second moment of cluster sizes excluding the largest cluster \(\chi = \sum_{i \neq 1} s_i^2 / N\) plays the role of susceptibility, measuring the system's response to perturbations; reduced density \(\eta = n \cdot v_{\text{ex}}\) (number density times excluded volume) plays the role of reduced temperature \(t\), controlling how far the system is from the critical point.

Secondly, the paper adopts the Finite-Size Scaling (FSS) methods and data collapse techniques introduced in Section 5 to determine critical exponents and verify universality class membership. Since systems in experiments and simulations are always of finite size, true phase transitions (divergence of response functions) cannot be observed in finite systems.

The paper's conclusion also directly verifies the universality principles discussed in Section 6. Despite the completely different microscopic physics of the entangled granular system (geometric entanglement, friction, non-convex shapes) from traditional percolation models (lattice occupation), the paper finds they belong to the same universality class—3D percolation universality class. This means microscopic details (such as friction coefficient, particle opening angle) only affect the non-universal critical point location \(\eta_c\), but do not affect critical exponents. This is precisely the core prediction of the universality hypothesis.

7.2 Paper Reproduction¶

Directly reproducing the complete DEM simulation or RCP model of the paper requires substantial computational resources and specialized granular dynamics code, but the core analysis methods of the paper—finite-size scaling and data collapse—can be completely demonstrated on a simpler system: three-dimensional cubic lattice percolation (3D site percolation).

3D site percolation is a classic model in percolation theory: on a three-dimensional cubic lattice, each site is "occupied" with probability \(p\), adjacent occupied sites are considered "connected", forming connected clusters. When \(p\) is below the critical probability \(p_c \approx 0.3116\), the system has only isolated small clusters; when \(p > p_c\), a macroscopic cluster spanning the entire system (the "percolating cluster") emerges. This model belongs to the same universality class as the entangled granular network in the paper, thus having the same critical exponents.

By implementing the complete analysis pipeline on this simple model, we can: verify whether the definitions of order parameter and susceptibility are correct; test whether finite-size scaling formulas hold; demonstrate the practical operation of data collapse techniques; confirm the critical exponent values of the 3D percolation universality class.

Before entering the code, we need to clarify the physical quantities to be calculated.

Order parameter \(S_1\) is defined as the number of lattice sites \(s_1\) in the largest connected cluster divided by the total number of system lattice sites \(N = L^3\). When \(p < p_c\), \(S_1 \to 0\) (no macroscopic cluster); when \(p > p_c\), \(S_1 > 0\) (percolating cluster emerges). Near the critical point, \(S_1 \sim (p - p_c)^\beta\), where \(\beta \approx 0.41\) is the order parameter exponent for 3D percolation.

Susceptibility \(\chi\) is defined as the second moment of all cluster sizes excluding the largest cluster: \(\chi = \sum_{i \neq 1} s_i^2 / N\). This quantity measures the fluctuation degree of "typical cluster size". At the critical point, \(\chi\) peaks and diverges with system size: \(\chi_{\max} \sim L^{\gamma/\nu}\), where \(\gamma \approx 1.80\) is the susceptibility exponent and \(\nu \approx 0.88\) is the correlation length exponent.

The following code implements the complete 3D percolation phase transition analysis pipeline, showing changes in cluster structure during the phase transition process.

"""
Lecture 4 3D Percolation Phase Transition Analysis
1. Monte Carlo simulation of 3D site percolation
2. Order parameter and susceptibility calculation with finite-size scaling analysis
3. Data collapse to verify universality class membership
4. Generate 3D visualization of percolation cluster evolution
"""
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from collections import defaultdict
import matplotlib.animation as animation
import os

# Set plotting style
plt.style.use('dark_background')
plt.rcParams['font.family'] = 'DejaVu Sans'
plt.rcParams['mathtext.fontset'] = 'dejavusans'

# ============================================================
# Part 1: Union-Find Data Structure
# ============================================================
# Union-Find is an efficient data structure for handling connectivity problems.
# In percolation problems, we need to quickly determine whether two sites
# belong to the same cluster and merge clusters when adding new connections.
# With path compression and union by rank optimization, each operation has
# amortized time complexity close to O(1).

class UnionFind:
    """
    Union-Find data structure for efficient connected component computation.
    """
    def __init__(self, n):
        # parent[i] stores the parent node of node i
        # Initially each node is its own parent (i.e., each node forms a separate cluster)
        self.parent = list(range(n))
        # size[i] stores the size of the tree rooted at node i
        self.size = [1] * n
        # Track the current number of connected components
        self.n_components = n

    def find(self, x):
        """
        Find the root node of the cluster containing node x.
        Uses path compression optimization: directly connect all visited nodes to the root.
        """
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, a, b):
        """
        Merge the clusters containing nodes a and b.
        Uses union by rank optimization: attach smaller tree to larger tree for balance.
        """
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # Already in the same cluster
        # Attach smaller tree to larger tree
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.n_components -= 1
        return True

    def get_cluster_sizes(self):
        """Get all cluster sizes, sorted in descending order"""
        cnt = defaultdict(int)
        for i in range(len(self.parent)):
            root = self.find(i)
            cnt[root] += 1
        sizes = sorted(cnt.values(), reverse=True)
        return sizes

    def get_cluster_labels(self):
        """
        Return the cluster label for each node.
        Labels are sorted by cluster size: largest cluster has label 0, second largest has label 1, etc.
        """
        cluster_map = defaultdict(list)
        for i in range(len(self.parent)):
            root = self.find(i)
            cluster_map[root].append(i)

        # Sort clusters by size in descending order
        sorted_clusters = sorted(cluster_map.values(), key=len, reverse=True)
        labels = np.zeros(len(self.parent), dtype=int)
        for label, cluster in enumerate(sorted_clusters):
            for node in cluster:
                labels[node] = label
        return labels


# ============================================================
# Part 2: 3D Site Percolation Core Simulation Functions
# ============================================================
# 3D site percolation model: on an L x L x L cubic lattice, each site is
# "occupied" with probability p. If two adjacent sites are both occupied,
# they belong to the same connected cluster. When p exceeds the critical
# probability p_c ≈ 0.3116, a macroscopic cluster spanning the entire
# system (percolating cluster) emerges.

def generate_percolation_config(L, p):
    """
    Generate a 3D site percolation configuration and compute connected clusters.

    Parameters:
        L: Linear size of the cubic lattice
        p: Occupation probability

    Returns:
        occupied: Boolean array indicating whether each site is occupied
        cluster_labels: Cluster label for each site (unoccupied sites have label -1)
        S1: Order parameter (relative size of largest cluster)
        chi: Susceptibility (second moment excluding largest cluster)
        sizes: List of all cluster sizes (descending order)
    """
    N = L * L * L
    # Randomly decide whether each site is occupied
    occupied = np.random.random(N) < p
    n_occupied = np.sum(occupied)

    # If no sites are occupied, return zero values
    if n_occupied == 0:
        return occupied, np.full(N, -1, dtype=int), 0, 0, []

    # Build index mapping for occupied sites
    # index_map: original index -> compressed index
    # reverse_map: compressed index -> original index
    occupied_indices = np.where(occupied)[0]
    index_map = {old: new for new, old in enumerate(occupied_indices)}
    reverse_map = {new: old for old, new in index_map.items()}

    # Build edge list: connect adjacent occupied sites
    edges = []
    for idx in occupied_indices:
        # Convert 1D index to 3D coordinates
        x, y, z = idx // (L*L), (idx // L) % L, idx % L
        # Check three positive direction neighbors
        for dx, dy, dz in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
            nx, ny, nz = x + dx, y + dy, z + dz
            if nx < L and ny < L and nz < L:
                nidx = nx * L * L + ny * L + nz
                if occupied[nidx]:
                    edges.append((index_map[idx], index_map[nidx]))

    # Use Union-Find to compute connected components
    uf = UnionFind(n_occupied)
    for a, b in edges:
        uf.union(a, b)

    sizes = uf.get_cluster_sizes()
    cluster_labels_occupied = uf.get_cluster_labels()

    # Map labels back to original index space
    cluster_labels = np.full(N, -1, dtype=int)
    for new_idx, old_idx in reverse_map.items():
        cluster_labels[old_idx] = cluster_labels_occupied[new_idx]

    # Compute physical quantities
    # Order parameter S1: fraction of total sites in largest cluster
    s1 = sizes[0] if sizes else 0
    S1 = s1 / N

    # Susceptibility chi: second moment excluding largest cluster
    # This measures fluctuations in "typical cluster size"
    chi = sum(s**2 for s in sizes[1:]) / N if len(sizes) > 1 else 0

    return occupied, cluster_labels, S1, chi, sizes


def compute_observables(L, p, n_samples=50):
    """
    Perform Monte Carlo sampling for given parameters, return mean and standard error of observables.

    Parameters:
        L: System linear size
        p: Occupation probability
        n_samples: Number of Monte Carlo samples

    Returns:
        S1_mean, chi_mean: Mean values of observables
        S1_err, chi_err: Standard errors
    """
    S1_list, chi_list = [], []

    for _ in range(n_samples):
        _, _, S1, chi, _ = generate_percolation_config(L, p)
        S1_list.append(S1)
        chi_list.append(chi)

    return (np.mean(S1_list), np.mean(chi_list),
            np.std(S1_list)/np.sqrt(n_samples),
            np.std(chi_list)/np.sqrt(n_samples))


# ============================================================
# Part 3: 3D Visualization and GIF Generation
# ============================================================
# Intuitively demonstrate the percolation phase transition through animation:
# as occupation probability p increases from low to high, observe how cluster
# structure evolves from isolated small dots to a large network spanning the system.

def create_percolation_gif(L=15, p_values=None, output_path='percolation_3d.gif'):
    """
    Generate a GIF animation of 3D percolation cluster evolution.

    The animation shows how cluster structure evolves as occupation probability p
    increases from low to high. The largest cluster is shown in red, other clusters in blue.

    Parameters:
        L: System linear size (recommend 10-20, larger is slow)
        p_values: Sequence of occupation probabilities to display
        output_path: Output GIF file path
    """
    if p_values is None:
        # Gradual transition from subcritical to supercritical
        p_values = np.linspace(0.15, 0.45, 30)

    # Use fixed random seed for animation continuity
    # This ensures consistency in site occupation across frames
    np.random.seed(42)
    base_random = np.random.random(L * L * L)

    fig = plt.figure(figsize=(12, 10), facecolor='black')

    def update(frame):
        """Update plot for each frame"""
        fig.clear()
        ax = fig.add_subplot(111, projection='3d', facecolor='black')

        p = p_values[frame]
        # Determine occupation state based on base random numbers and current probability
        occupied = base_random < p

        # Compute cluster structure
        N = L * L * L
        n_occupied = np.sum(occupied)

        if n_occupied > 0:
            # Reuse logic from generate_percolation_config
            occupied_indices = np.where(occupied)[0]
            index_map = {old: new for new, old in enumerate(occupied_indices)}
            reverse_map = {new: old for old, new in index_map.items()}

            edges = []
            for idx in occupied_indices:
                x, y, z = idx // (L*L), (idx // L) % L, idx % L
                for dx, dy, dz in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
                    nx, ny, nz = x + dx, y + dy, z + dz
                    if nx < L and ny < L and nz < L:
                        nidx = nx * L * L + ny * L + nz
                        if occupied[nidx]:
                            edges.append((index_map[idx], index_map[nidx]))

            uf = UnionFind(n_occupied)
            for a, b in edges:
                uf.union(a, b)

            sizes = uf.get_cluster_sizes()
            cluster_labels_occupied = uf.get_cluster_labels()

            cluster_labels = np.full(N, -1, dtype=int)
            for new_idx, old_idx in reverse_map.items():
                cluster_labels[old_idx] = cluster_labels_occupied[new_idx]

            S1 = sizes[0] / N if sizes else 0
            chi = sum(s**2 for s in sizes[1:]) / N if len(sizes) > 1 else 0
        else:
            cluster_labels = np.full(N, -1, dtype=int)
            S1, chi = 0, 0

        # Plot occupied sites
        occupied_mask = occupied.reshape(L, L, L)
        labels_3d = cluster_labels.reshape(L, L, L)

        x, y, z = np.where(occupied_mask)

        if len(x) > 0:
            labels_flat = labels_3d[occupied_mask]

            # Color mapping: largest cluster in red, others in blue gradient
            colors = np.zeros((len(x), 4))
            for i, label in enumerate(labels_flat):
                if label == 0:  # Largest cluster
                    colors[i] = [1, 0.2, 0.2, 0.9]  # Red, high opacity
                else:
                    # Other clusters in blue tones, smaller clusters are darker
                    intensity = max(0.3, 1 - label * 0.03)
                    colors[i] = [0.2, 0.5, intensity, 0.6]

            ax.scatter(x, y, z, c=colors, s=60, depthshade=True)

        # Set axes
        ax.set_xlim(0, L)
        ax.set_ylim(0, L)
        ax.set_zlim(0, L)
        ax.set_xlabel('X', color='white', fontsize=12)
        ax.set_ylabel('Y', color='white', fontsize=12)
        ax.set_zlabel('Z', color='white', fontsize=12)

        # Set transparent background
        ax.xaxis.pane.fill = False
        ax.yaxis.pane.fill = False
        ax.zaxis.pane.fill = False
        ax.xaxis.pane.set_edgecolor('gray')
        ax.yaxis.pane.set_edgecolor('gray')
        ax.zaxis.pane.set_edgecolor('gray')
        ax.tick_params(colors='white')

        # Determine current phase
        p_c = 0.3116
        if p < p_c - 0.03:
            phase = "Subcritical"
            phase_color = '#66CCFF'
        elif abs(p - p_c) <= 0.03:
            phase = "Critical"
            phase_color = '#FFFF66'
        else:
            phase = "Supercritical"
            phase_color = '#FF6666'

        # Title shows current parameters and observables
        title = f'3D Site Percolation (L={L})\n'
        title += f'p = {p:.3f}  |  $p_c$ ≈ 0.3116  |  '
        ax.set_title(title, color='white', fontsize=14, pad=20)

        # Add observable information on the plot
        info_text = f'$S_1$ = {S1:.3f}\n$\\chi$ = {chi:.1f}\n{phase}'
        ax.text2D(0.02, 0.95, info_text, transform=ax.transAxes,
                  fontsize=12, color=phase_color, verticalalignment='top',
                  bbox=dict(boxstyle='round', facecolor='black', alpha=0.7))

        # Rotate view angle for dynamic effect
        ax.view_init(elev=20, azim=frame * 4)

        return ax,

    print(f"Generating GIF animation ({len(p_values)} frames)...")
    print(f"  System size: L = {L}")
    print(f"  Probability range: p in [{p_values[0]:.2f}, {p_values[-1]:.2f}]")

    anim = animation.FuncAnimation(fig, update, frames=len(p_values), 
                                   interval=250, blit=False)
    anim.save(output_path, writer='pillow', fps=4, dpi=100)
    plt.close()
    print(f"GIF saved: {output_path}")


# ============================================================
# Part 4: Complete FSS Analysis Pipeline
# ============================================================

def run_fss_analysis(output_dir='.'):
    """
    Run complete finite-size scaling analysis and generate analysis plots.
    """
    print("=" * 70)
    print("3D Site Percolation Finite-Size Scaling Analysis")
    print("=" * 70)

    # Parameter settings
    p_c = 0.3116  # Critical probability for 3D site percolation
    p_values = np.linspace(0.20, 0.42, 25)
    L_values = [8, 12, 16, 20]
    n_samples = 50

    # 3D percolation critical exponents (literature values)
    beta = 0.41    # Order parameter exponent
    gamma = 1.80   # Susceptibility exponent
    nu = 0.88      # Correlation length exponent
    nu_bar = 3 * nu  # d * nu

    # Monte Carlo sampling
    print("\nRunning Monte Carlo simulation...")
    results = {L: {'p': [], 'S1': [], 'chi': [], 'S1_err': [], 'chi_err': []} 
               for L in L_values}

    for L in L_values:
        print(f"  L = {L}:", end=" ")
        for i, p in enumerate(p_values):
            S1, chi, S1_err, chi_err = compute_observables(L, p, n_samples)
            results[L]['p'].append(p)
            results[L]['S1'].append(S1)
            results[L]['chi'].append(chi)
            results[L]['S1_err'].append(S1_err)
            results[L]['chi_err'].append(chi_err)
            if (i + 1) % 10 == 0:
                print(f"{i+1}", end=" ")
        print("Done")

    # Generate analysis plots
    print("\nGenerating analysis plots...")
    fig, axes = plt.subplots(2, 2, figsize=(14, 12))
    colors = plt.cm.viridis(np.linspace(0.2, 0.8, len(L_values)))

    # (a) Order parameter vs p
    ax1 = axes[0, 0]
    for L, color in zip(L_values, colors):
        ax1.errorbar(results[L]['p'], results[L]['S1'], 
                     yerr=results[L]['S1_err'],
                     label=f'L = {L}', color=color, marker='o', 
                     markersize=4, capsize=2, linewidth=1.5)
    ax1.axvline(p_c, color='red', linestyle='--', label=f'$p_c$ = {p_c}', alpha=0.7)
    ax1.set_xlabel('Occupation probability $p$', fontsize=12)
    ax1.set_ylabel('Order parameter $S_1$', fontsize=12)
    ax1.set_title('(a) Order Parameter vs Occupation Probability', fontsize=14)
    ax1.legend(loc='upper left')
    ax1.grid(alpha=0.3)

    # (b) Susceptibility vs p
    ax2 = axes[0, 1]
    for L, color in zip(L_values, colors):
        ax2.errorbar(results[L]['p'], results[L]['chi'], 
                     yerr=results[L]['chi_err'],
                     label=f'L = {L}', color=color, marker='s', 
                     markersize=4, capsize=2, linewidth=1.5)
    ax2.axvline(p_c, color='red', linestyle='--', label=f'$p_c$ = {p_c}', alpha=0.7)
    ax2.set_xlabel('Occupation probability $p$', fontsize=12)
    ax2.set_ylabel('Susceptibility $\\chi$', fontsize=12)
    ax2.set_title('(b) Susceptibility vs Occupation Probability', fontsize=14)
    ax2.legend(loc='upper right')
    ax2.grid(alpha=0.3)

    # (c) S1 data collapse
    ax3 = axes[1, 0]
    for L, color in zip(L_values, colors):
        N = L ** 3
        p_arr = np.array(results[L]['p'])
        S1_arr = np.array(results[L]['S1'])
        x_scaled = (p_arr - p_c) * N ** (1/nu_bar)
        y_scaled = S1_arr * N ** (beta/nu_bar)
        ax3.scatter(x_scaled, y_scaled, label=f'L = {L}', color=color, s=40, alpha=0.8)
    ax3.axvline(0, color='white', linestyle=':', alpha=0.5)
    ax3.set_xlabel('$(p - p_c) N^{1/\\bar{\\nu}}$', fontsize=12)
    ax3.set_ylabel('$S_1 \\cdot N^{\\beta/\\bar{\\nu}}$', fontsize=12)
    ax3.set_title(f'(c) Data Collapse for $S_1$ ($\\beta$={beta}, $\\bar{{\\nu}}$={nu_bar:.2f})', fontsize=14)
    ax3.legend()
    ax3.grid(alpha=0.3)

    # (d) chi data collapse
    ax4 = axes[1, 1]
    for L, color in zip(L_values, colors):
        N = L ** 3
        p_arr = np.array(results[L]['p'])
        chi_arr = np.array(results[L]['chi'])
        x_scaled = (p_arr - p_c) * N ** (1/nu_bar)
        y_scaled = chi_arr * N ** (-gamma/nu_bar)
        ax4.scatter(x_scaled, y_scaled, label=f'L = {L}', color=color, s=40, alpha=0.8)
    ax4.axvline(0, color='white', linestyle=':', alpha=0.5)
    ax4.set_xlabel('$(p - p_c) N^{1/\\bar{\\nu}}$', fontsize=12)
    ax4.set_ylabel('$\\chi \\cdot N^{-\\gamma/\\bar{\\nu}}$', fontsize=12)
    ax4.set_title(f'(d) Data Collapse for $\\chi$ ($\\gamma$={gamma}, $\\bar{{\\nu}}$={nu_bar:.2f})', fontsize=14)
    ax4.legend()
    ax4.grid(alpha=0.3)

    plt.tight_layout()
    plt.suptitle('3D Site Percolation: Finite-Size Scaling Analysis', fontsize=16, y=1.02)

    fig_path = os.path.join(output_dir, 'percolation_fss_analysis.png')
    plt.savefig(fig_path, dpi=300, bbox_inches='tight', facecolor='black')
    plt.close()
    print(f"FSS analysis plot saved: {fig_path}")

    # Output scaling law verification
    print("\n" + "=" * 70)
    print("Scaling Law Verification")
    print("=" * 70)
    alpha_perc = -0.62
    print(f"\n3D Percolation Critical Exponents (Literature Values):")
    print(f"  beta = {beta}, gamma = {gamma}, nu = {nu}, alpha = {alpha_perc}")

    rushbrooke = alpha_perc + 2*beta + gamma
    print(f"\nRushbrooke Scaling Law: alpha + 2*beta + gamma = {rushbrooke:.2f} (Theoretical value: 2)")
    print(f"Hyperscaling: d*nu = {3*nu:.2f}, 2-alpha = {2-alpha_perc:.2f}")

    return results


def main():
    """Main function: run complete analysis pipeline"""
    print("\n" + "=" * 70)
    print("3D Percolation Phase Transition Analysis and Visualization")
    print("=" * 70)

    # Set output directory
    output_dir = os.path.dirname(os.path.abspath(__file__))

    # Run FSS analysis
    print("\n[1/2] Running finite-size scaling analysis...")
    run_fss_analysis(output_dir)

    # Generate GIF animation
    print("\n[2/2] Generating 3D visualization animation...")
    gif_path = os.path.join(output_dir, 'percolation_3d.gif')
    create_percolation_gif(L=15, output_path=gif_path)

    print("\n" + "=" * 70)
    print("Analysis complete!")
    print("=" * 70)

if __name__ == "__main__":
    main()

Figure (a) shows the order parameter \(S_1\) as a function of occupation probability \(p\). When \(p < p_c\), the system has only isolated small clusters, the fraction of the largest cluster is very small, \(S_1 \approx 0\). As \(p\) increases and crosses the critical point \(p_c \approx 0.3116\), a macroscopic cluster spanning the system emerges, and \(S_1\) begins to grow rapidly. Curves for different system sizes \(L\) cross near the critical point, which is a typical characteristic of finite-size effects.

Figure (b) shows the behavior of susceptibility \(\chi\). Far from the critical point, \(\chi\) is small because the cluster size distribution is relatively uniform. When approaching the critical point, clusters of various sizes appear, size fluctuations reach maximum, and \(\chi\) peaks. The peak height increases with system size \(L\), which is precisely the finite-size manifestation of "response divergence"—if \(L \to \infty\), the peak height would tend to infinity.

Figures (c) and (d) are the core results of this section. Through FSS transformation, we map data from different system sizes to unified scaling coordinates. If the critical exponents used are correct, data from different \(L\) should collapse onto the same master curve.

From the figures, using 3D percolation universality class exponents \(\beta = 0.41\), \(\gamma = 1.80\), \(\bar{\nu} = 2.64\), the data indeed collapses well together. The success of this collapse has dual significance: it verifies that our simulation correctly implements the 3D percolation model; it demonstrates how to determine universality class membership through data collapse.

The animation shows how cluster structure evolves from sparse isolated points (subcritical phase) to an interconnected network (supercritical phase) as occupation probability \(p\) increases from low to high. The largest cluster is shown in red, other clusters in blue. Near the critical point, clusters of various sizes coexist, and the system exhibits self-similar fractal structure—this is precisely the visual manifestation of "scale-free" behavior.

The paper performed exactly the same analysis on entangled granular networks as we did: defining order parameter \(S_1\) and susceptibility \(\chi\), measuring how these quantities change with the control parameter (reduced density \(\eta\)) at different system sizes, then using 3D percolation critical exponents for FSS data collapse.

The key finding of the paper is: despite the completely different microscopic physics of the entangled granular system from lattice percolation—the former involves real three-dimensional particles, friction, geometric entanglement, and Hopf link topology, while the latter is just random occupation on lattices—they belong to the same universality class and have the same critical exponents.

This result confirms the universality principle discussed in Section 6: critical exponents depend only on spatial dimension, order parameter symmetry, and interaction range, not on microscopic details. Both entangled granular systems and lattice percolation are three-dimensional systems, both have "connectivity" type order parameters, and both have short-range interactions (entanglement only occurs between adjacent particles), so they belong to the same universality class. Microscopic complexity—friction coefficients, particle shapes, vibration parameters—only affects the non-universal critical point location \(\eta_c\), not the universal exponents describing critical behavior.

This is why this lecture so strongly emphasizes the concept of universality: once you understand it, you gain a powerful analytical tool. Facing a new phase transition system, you don't need to calculate its critical behavior from scratch; you only need to determine which universality class it belongs to, and then directly use the critical exponents of that universality class. This "knowledge reuse" is precisely the charm of statistical physics.

Summary¶

This lecture started from experimental facts of critical phenomena and established a complete quantitative framework for describing phase transitions. First, the divergence and vanishing behavior of physical quantities near the critical point was compressed and encoded into six critical exponents \((\alpha, \beta, \gamma, \delta, \nu, \eta)\), which respectively characterize the power-law behavior of specific heat, order parameter, susceptibility, critical isotherm, correlation length, and correlation function. This "minimal parameter set" reduces infinitely many possible critical behaviors to a few universal numbers.

Subsequently, a fact was revealed: these six exponents are not independent. The scaling hypothesis states that the singular part of the free energy is a generalized homogeneous function of reduced temperature and external field. Starting from this concise mathematical assumption, scaling laws such as Rushbrooke, Widom, and Fisher can be systematically derived, compressing six exponents into only two independent parameters. The existence of scaling laws means critical exponents are not arbitrary fitting parameters, but are strongly constrained by the deep mathematical structure of physics near the critical point.

Finally, the origin of universality was discussed. Near the critical point, the divergence of correlation length causes the system to enter a "scale-free world", with microscopic details drowned by collective fluctuations, leaving only a few macroscopic features—spatial dimension, order parameter symmetry, interaction range—to determine the system's critical behavior. From the renormalization group perspective, if different microscopic systems flow toward the same fixed point during the coarse-graining process, they share the same set of critical exponents and belong to the same universality class. Critical exponents are precisely "fingerprints" of fixed points.

The next lecture will enter the "classical framework" of phase transition theory—Landau mean-field theory. Landau theory gives a set of "very easy to calculate" critical exponents (\(\beta = 1/2\), \(\gamma = 1\), \(\delta = 3\), and other mean-field values), while the Ginzburg criterion tells us when fluctuation effects are sufficient to break Landau's predictions. The boundaries of these "successes and failures" will ultimately be unified by the renormalization group: Why does the upper critical dimension exist? Why does hyperscaling fail in certain dimensions? From there, we will formally step into the mathematical machinery of the renormalization group.