Introduction: From Passive Response to Active Driving - Opening the Door to the Nonequilibrium World
In the previous lectures we focused on near-equilibrium systems. In particular, Lecture 32 built the powerful Janssen-De Dominicis (JDD) path-integral formalism and derived from it the cornerstone of near-equilibrium statistical physics—the fluctuation-dissipation theorem (FDT). FDT reveals a deep insight: the passive linear response to a small external perturbation is fully determined by spontaneous equilibrium fluctuations. This relation relies crucially on time-reversal symmetry at equilibrium.
However, the very assumptions that make the JDD formalism and the FDT so powerful also limit their domain. Once we leave idealized near-equilibrium conditions (the continuous operation of molecular motors in cells, colloids rapidly dragged by optical tweezers, or any system whose control parameters change significantly in finite time), we face processes that are far from equilibrium and inherently irreversible. For these processes, classical thermodynamics provides only incomplete descriptions, typically in the form of inequalities, such as the statement that the average nonequilibrium work is greater than or equal to the equilibrium free energy change (\(\langle W \rangle \geq \Delta F\)).
This lecture marks another crucial perspective shift in the course: from passive response to small perturbations, to analyzing nonequilibrium dynamics under active and arbitrary external protocols. The core question is: for these far-from-equilibrium, inherently stochastic processes, do there exist universal, exact equality relationships that can transcend classical inequalities and provide fundamental constraints for these processes?
The answer is yes, and the key breakthrough comes from changing the level of observation. Instead of focusing only on the averages of thermodynamic quantities (like work or entropy production), we turn to studying the complete probability distributions of these quantities generated by countless stochastic trajectories of the system. By analyzing the full statistics of fluctuations, we can precisely "distill" hidden equilibrium information from seemingly chaotic nonequilibrium processes.
This lecture focuses on deriving and understanding two central fluctuation theorems (FTs): the Jarzynski equality (also called Jarzynski's work relation) and the Crooks fluctuation theorem. These theorems are among the most important advances in nonequilibrium statistical physics in recent decades, and they form the theoretical foundation of the rapidly growing field of stochastic thermodynamics.
1. Review: Fluctuation-Dissipation Theorem (FDT)
To better understand the upcoming nonequilibrium fluctuation theorems, we first review the core field-theory tool for describing near-equilibrium dynamics, the Janssen-De Dominicis action (abbreviated J-D below, JDD above), and how the fluctuation-dissipation theorem is systematically derived from it.
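Within the framework of statistical field theory, the statistical weight of every possible trajectory of a stochastic process can be described by an action functional. The J-D action is such an action, constructed for stochastic dynamical systems with additive Gaussian noise:
$$ S[\tilde{\phi}, \phi] = \int dt \int d^dx \left[ \tilde{\phi}_\alpha \left( \dot{\phi}_\alpha - A_\alpha[\phi] \right) - \frac{1}{2} \tilde{\phi}_\alpha N_{\alpha\beta} \tilde{\phi}_\beta \right] $$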
Here \(\phi(\vec{x},t)\) is the physical field describing the system state (e.g., particle concentration field), while \(\tilde{\phi}(\vec{x},t)\) is an auxiliary response field. This action elegantly encodes all the dynamical information of the system:
- Dynamical constraint term \(\tilde{\phi}_\alpha (\dot{\phi}_\alpha - A_\alpha[\phi])\): This term arises from enforcing the Langevin equation. In the path integral, the response field \(\tilde{\phi}\) acts as a Lagrange multiplier: integrating over it restricts the paths to those satisfying the equation of motion, which reduces to \(\dot{\phi}_\alpha = A_\alpha[\phi]\) in the absence of noise.
- Noise contribution term \(-\frac{1}{2} \tilde{\phi}_\alpha N_{\alpha\beta} \tilde{\phi}_\beta\): This term arises from averaging over Gaussian white noise. \(N_{\alpha\beta}\) is the noise strength matrix defined by \(\langle \xi_\alpha(x,t)\xi_\beta(x',t') \rangle = N_{\alpha\beta}\delta(x-x')\delta(t-t')\), and this quadratic form "imprints" the noise statistics into the effective action.
By performing path integration over this action functional, any observable can be calculated. Among them, two core physical quantities are the response function and correlation function.
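- Response function \(\chi_{\alpha\beta}(x,t;x',t')\): It describes how the field component \(\langle \phi_\alpha \rangle\) at \((x,t)\) responds to a small external perturbation field \(h_\beta\) applied at another spacetime point \((x',t')\). In the J-D formalism, it is expressed as the correlation function between the physical field and the response field (written here for a perturbation that adds directly to the drift; other couplings contribute a constant prefactor): $$ \chi_{\alpha\beta}(x,t;x',t') = \left. \frac{\delta \langle \phi_\alpha(x,t) \rangle}{\delta h_\beta(x',t')} \right|_{h=0} = \langle \phi_\alpha(x,t)\, \tilde{\phi}_\beta(x',t') \rangle $$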
- Correlation function \(C_{\alpha\beta}(x,t;x',t')\): It describes the statistical correlation between spontaneous fluctuations at different spacetime points in the absence of any external perturbation: $$ C_{\alpha\beta}(x,t;x',t') = \langle \phi_\alpha(x,t) \phi_\beta(x',t') \rangle $$
The key to deriving the FDT lies in the time-reversal symmetry satisfied by the system in thermal equilibrium. As demonstrated in Lectures 31 and 32, this fundamental symmetry imposes strong constraints on the J-D action, requiring that the noise intensity \(N\) and the dissipation coefficient \(L\) satisfy the Einstein-Onsager relation (\(N=2LT\)). This built-in symmetry leads to an exact relationship between the response function and the correlation function:
$$ \chi_{\alpha\beta}(x,x';\tau) = -\frac{1}{T}\, \Theta(\tau)\, \frac{\partial}{\partial \tau} C_{\alpha\beta}(x,x';\tau) $$
where \(\tau = t-t'\) and \(\Theta(\tau)\) is the Heaviside step function, which embodies causality (the response cannot precede the perturbation). This relationship is the fluctuation-dissipation theorem. It shows that a system's ability to dissipate energy in response to external perturbations (embodied by \(\chi\)) and its spontaneous, thermally driven fluctuations (embodied by \(C\)) are two aspects of the same microscopic physics.
This derivation process not only provides a rigorous field-theoretic proof of FDT, but more importantly, it demonstrates the central role of time-reversal symmetry in connecting system "response" and "fluctuations". It is precisely this idea that will be extended to far-from-equilibrium systems, becoming the cornerstone for deriving nonequilibrium fluctuation theorems.
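As a quick sanity check of the relation \(\chi(\tau) = -\beta\,\Theta(\tau)\,\partial_\tau C(\tau)\), the sketch below compares both sides for a toy model that is an assumption of this example, not part of the lecture: a single overdamped particle in a static harmonic well, \(\gamma \dot{x} = -kx + \eta\), for which \(C\) and \(\chi\) are known in closed form. The function names and parameter values are illustrative only.

```python
import numpy as np

# Assumed toy model: gamma * dx/dt = -k * x + eta, <eta eta> = 2 gamma T delta(t - t')
k, gamma, T = 2.0, 1.0, 1.0
beta = 1.0 / T

def correlation(tau):
    """Equilibrium autocorrelation C(tau) = <x(tau) x(0)> = (T/k) exp(-k |tau| / gamma)."""
    return (T / k) * np.exp(-k * np.abs(tau) / gamma)

def response(tau):
    """Causal response to a small force kick: Theta(tau) * exp(-k tau / gamma) / gamma."""
    return np.where(tau > 0, np.exp(-k * tau / gamma) / gamma, 0.0)

tau = np.linspace(0.05, 3.0, 60)   # strictly positive time lags
d_tau = 1e-6
# Central finite difference for dC/dtau
dC_dtau = (correlation(tau + d_tau) - correlation(tau - d_tau)) / (2 * d_tau)
fdt_rhs = -beta * dC_dtau          # right-hand side of the FDT for tau > 0

max_deviation = np.max(np.abs(response(tau) - fdt_rhs))
print(f"max |chi - (-beta dC/dtau)| = {max_deviation:.2e}")
```

The deviation is limited only by the finite-difference derivative. For a driven system, the same comparison would generally fail, which is the motivation for the exact nonequilibrium relations derived later.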
2. Nonequilibrium Driving and External Protocols
The fluctuation-dissipation relationship (FDT) reviewed in the previous section describes the passive response of a system to small perturbations. Now, we turn our attention from this near-equilibrium linear behavior to a system that is actively and arbitrarily driven away from equilibrium. The core transformation here is that the external parameters are no longer small probes, but rather a time-dependent, significantly changing, predetermined control program.
2.1 System Description: Free-Energy Functional
Consider a system described by field variables \(\vec{\phi}\), whose equilibrium properties under a given external field \(\vec{h}\) are given by a free energy functional \(F[\vec{\phi}]\). Its form is consistent with the Ginzburg-Landau theory discussed in Lecture 25:
$$ F[\vec{\phi}] = \int d^dx \left[ f(\vec{\phi}) + \frac{c}{2} (\vec{\nabla} \vec{\phi})^2 - \vec{h} \cdot \vec{\phi} \right] $$
Each term in this functional has a clear physical meaning:
- Local free energy density \(f(\vec{\phi})\): Contains local entropy contributions and mean-field interactions between particles, determining the system's basic phase behavior.
- Gradient energy term \(\frac{c}{2} (\vec{\nabla} \vec{\phi})^2\): Imposes an energy penalty on spatial inhomogeneity, serving as the microscopic source of interface tension.
- Coupling term \(-\vec{h} \cdot \vec{\phi}\): This is the "handle" for interaction between the external world and the system. It is precisely through changing the external field \(\vec{h}\) that one can influence the system and do work.
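To make the three terms concrete, here is a minimal sketch that evaluates a discretized version of such a functional on a periodic one-dimensional grid. Several choices are assumptions of this example and not taken from the lecture: a single-component field, the quartic local density \(f(\phi) = \frac{r}{2}\phi^2 + \frac{u}{4}\phi^4\), a forward-difference gradient, and the name `free_energy`.

```python
import numpy as np

def free_energy(phi, h, dx, c=1.0, r=-1.0, u=1.0):
    """Discretized F = sum_x dx * [ f(phi) + (c/2) (dphi/dx)^2 - h * phi ].

    phi, h : 1D arrays on a periodic grid with spacing dx.
    f(phi) = (r/2) phi^2 + (u/4) phi^4 is an assumed (hypothetical) local density.
    """
    local = 0.5 * r * phi**2 + 0.25 * u * phi**4
    grad = (np.roll(phi, -1) - phi) / dx        # forward difference, periodic boundary
    gradient_energy = 0.5 * c * grad**2
    coupling = -h * phi                          # the "handle" for the external field
    return float(np.sum(local + gradient_energy + coupling) * dx)

n_sites, dx = 64, 0.5
x = np.arange(n_sites) * dx
h = np.full(n_sites, 0.2)

phi_uniform = np.ones(n_sites)
phi_wavy = 1.0 + 0.3 * np.sin(2 * np.pi * x / (n_sites * dx))

F_uniform = free_energy(phi_uniform, h, dx)
F_wavy = free_energy(phi_wavy, h, dx)
print(f"F[uniform] = {F_uniform:.4f}, F[wavy] = {F_wavy:.4f}")
```

Setting `c=0.0` removes the gradient penalty, which is a simple way to see how much of the free energy of the modulated configuration comes from spatial inhomogeneity.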
2.2 Protocol Setup
A typical nonequilibrium driving experiment or simulation is set up as follows:
- Initial state: At time \(t=0\), the system is in complete thermal equilibrium with an initial external field \(\vec{h}(0)\). This means that the probability distribution of the initial system configuration \(\vec{\phi}_0\) is given by the Boltzmann factor:
$$ P_0[\vec{\phi}_0] = \frac{1}{Z_0(T)} \exp \left[ -\beta F_0[\vec{\phi}_0] \right] $$
where \(\beta = 1/T\) (setting the Boltzmann constant \(k_B=1\)), \(F_0\) is the free energy functional corresponding to the initial external field \(\vec{h}(0)\), and \(Z_0(T)\) is the corresponding partition function. Starting from a precisely defined equilibrium state is the logical starting point for all fluctuation theorems.
- Driving process: During the time interval from \(t=0\) to \(t=t_f\), the external field \(\vec{h}(t)\) evolves according to a predetermined, deterministic time function. This function \(\vec{h}(t)\) is called the external protocol. As shown in the figure below, the protocol drives the system from an initial equilibrium state, through a series of nonequilibrium states, to a final state that is typically also nonequilibrium.
2.3 Origin of Irreversibility: Finite Time and Relaxation
The irreversibility of this process has its root in the fact that the protocol is completed in finite time \(t_f\). There are two fundamentally different limits:
- Quasistatic limit (\(t_f \to \infty\)): If the protocol changes infinitely slowly, the system has sufficient time at each instant to adapt to small changes in the external field through internal relaxation. Therefore, the system always remains in the equilibrium state corresponding to the instantaneous external field \(\vec{h}(t)\). Such a process is reversible, and the work done by the external world exactly equals the free energy difference \(\Delta F\) between the initial and final equilibrium states. This is the ideal realm of classical thermodynamics.
- Finite-time process (\(t_f < \infty\)): For any finite driving time, the system's internal relaxation (determined by dynamical coefficients \(L\), etc.) requires time. Therefore, the evolution of the system state will always lag behind changes in the external field. The system's probability distribution \(P(\vec{\phi},t)\) will no longer be the equilibrium distribution corresponding to the instantaneous external field \(\vec{h}(t)\). It is precisely this lag that leads to energy dissipation, usually as heat released to the environment.
Therefore, the average work done by external forces will be strictly greater than the free energy change, i.e., \(\langle W \rangle > \Delta F\). This difference, \(\langle W_{diss} \rangle = \langle W \rangle - \Delta F\), is the average dissipated work, which quantifies the degree of irreversibility of the process.
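The lag argument can be illustrated with a minimal sketch under stated assumptions: instead of a field, take a single overdamped particle in a harmonic trap \(V = \frac{1}{2}k(x-\lambda)^2\) whose center \(\lambda(t)\) is dragged at constant speed. Because the force is linear, the mean position obeys \(\gamma \dot{m} = -k(m - \lambda(t))\) exactly, and \(\Delta F = 0\) for a pure translation, so the mean work is entirely dissipated. The function name, parameter values, and the simple Euler integration are choices of this example.

```python
def mean_work_and_lag(t_f, k=2.0, gamma=1.0, distance=5.0, n_steps=20000):
    """Integrate gamma * dm/dt = -k (m - lambda(t)) with Euler steps and
    accumulate <W> = -k * int (m - lambda) * (dlambda/dt) dt.

    Returns (mean work, final lag of the mean position behind the trap center).
    """
    dt = t_f / n_steps
    v = distance / t_f                 # constant protocol speed
    m, work = 0.0, 0.0                 # mean position starts at the trap center
    for step in range(n_steps):
        lam = v * step * dt
        work += -k * (m - lam) * v * dt
        m += -(k / gamma) * (m - lam) * dt
    return work, distance - m

results = {t_f: mean_work_and_lag(t_f) for t_f in (5.0, 50.0, 500.0)}
for t_f, (work, lag) in results.items():
    print(f"t_f = {t_f:6.1f}: <W> = {work:.4f}, final lag = {lag:.4f}")
```

In this toy model both the lag and the mean dissipated work shrink roughly as \(1/t_f\), approaching the reversible limit \(\langle W \rangle \to \Delta F = 0\) as the protocol slows down.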
The essence of fluctuation theorems lies in the fact that they not only focus on this average behavior, but precisely characterize the complete probability distribution of work \(P(W)\) generated by countless random trajectories. The next task is to give a precise definition for the random nonequilibrium work done on a single trajectory.
3. Nonequilibrium Work: Definition and Path-Integral Computation
3.1 Definition of Jarzynski Work
To establish fluctuation theorems, we must first give a precise physical definition for "work" in stochastic, nonequilibrium processes. Unlike macroscopic thermodynamics where work is a deterministic value, in stochastic thermodynamics, work is defined on a single random trajectory as a random variable that depends on the path.
This definition originates from a basic idea of classical mechanics and thermodynamics: the work done on a system equals the change in the system's energy due to changes in external control parameters. For a system driven by an external field \(\vec{h}(t)\), the free energy functional \(F[\vec{\phi}]\) depends explicitly on time. In a small time interval \(dt\), the infinitesimal work \(dW_J\) done by the external agent on the system is the change in \(F[\vec{\phi}]\) due to the change in the external parameter \(\vec{h}\). This is analogous to \(dW = (\partial E / \partial \lambda) d\lambda\) in classical thermodynamics. In the language of field theory, this reads:
$$ dW_J = \frac{\partial F}{\partial t}\, dt = \int d^dx \, \frac{\delta F}{\delta \vec{h}(\vec{x},t)} \cdot \frac{\partial \vec{h}(\vec{x},t)}{\partial t}\, dt $$
Here the partial derivative \(\partial/\partial t\) acts only on the explicitly time-dependent external field \(\vec{h}(t)\) in the free energy functional, while keeping the instantaneous field configuration \(\vec{\phi}(t)\) fixed. Since the only \(\vec{h}\)-dependent term of the free energy functional given in Section 2 is \(-\int d^dx \, \vec{h} \cdot \vec{\phi}\), we obtain \(\delta F / \delta \vec{h} = -\vec{\phi}\). Therefore, the total work done on the system during the entire protocol (from \(t=0\) to \(t=t_f\)), i.e., the Jarzynski work \(W_J[\vec{\phi}]\), is obtained by integrating the infinitesimal work along a specific trajectory \(\vec{\phi}(t)\):
$$ W_J[\vec{\phi}] = \int_0^{t_f} dt \, \frac{\partial F}{\partial t} = -\int_0^{t_f} dt \int d^dx \, \vec{\phi}(\vec{x},t) \cdot \frac{\partial \vec{h}(\vec{x},t)}{\partial t} $$
This definition profoundly reveals the physical essence of nonequilibrium work:
- Work is random: \(W_J\) is a functional whose value depends on the specific microscopic path \(\vec{\phi}(\vec{x},t)\) that the system experiences. Since the system's evolution is influenced by thermal noise and is random, different trajectories will produce different work values. Therefore, work itself is a random variable with a probability distribution \(P(W_J)\).
- Source of work: Work is the direct result of the coupling between the system's state field \(\vec{\phi}(t)\) and the rate of change of the external protocol \(\partial \vec{h} / \partial t\). If the protocol does not change with time (\(\partial \vec{h}/\partial t = 0\)), then no work is done on the system by the external world.
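The work functional \(W_J = -\int dt \int d^dx \, \vec{\phi} \cdot \partial_t \vec{h}\) can be sketched numerically for a trajectory stored on a space-time grid. The array layout `(n_times, n_sites)`, the single-component field, the midpoint rule in time, and the name `jarzynski_work` are assumptions of this example rather than anything prescribed by the lecture.

```python
import numpy as np

def jarzynski_work(phi, h, dx):
    """W_J ~ - sum_n sum_x dx * phi_mid(x, n) * [h(x, n+1) - h(x, n)].

    phi, h : arrays of shape (n_times, n_sites); phi_mid is the average of phi
    at the two ends of each time step (midpoint rule).
    """
    phi_mid = 0.5 * (phi[1:] + phi[:-1])
    dh = h[1:] - h[:-1]                 # protocol increment per time step
    return float(-np.sum(phi_mid * dh) * dx)

rng = np.random.default_rng(0)
n_times, n_sites, dx = 101, 32, 0.5

# Case 1: uniform field ramped from 0 to 1 while phi stays at 0.7 everywhere
h_ramp = np.linspace(0.0, 1.0, n_times)[:, None] * np.ones((1, n_sites))
phi_const = np.full((n_times, n_sites), 0.7)
W_ramp = jarzynski_work(phi_const, h_ramp, dx)

# Case 2: static protocol, arbitrary fluctuating trajectory: no work is done
h_static = np.full((n_times, n_sites), 0.3)
phi_noisy = rng.normal(size=(n_times, n_sites))
W_static = jarzynski_work(phi_noisy, h_static, dx)

print(f"W (ramp, constant phi) = {W_ramp:.3f}, W (static protocol) = {W_static:.3f}")
```

The first case reproduces \(-\phi\,\Delta h\) times the system size, and the second confirms that a time-independent protocol does no work regardless of how strongly the field fluctuates.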
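3.2 Ensemble Average in the Path Integral
To calculate the ensemble average of any observable \(O[\vec{\phi}]\) (such as the work \(W_J\) itself or its exponential \(e^{-\beta W_J}\)), we need to average over the entire path ensemble of the nonequilibrium driving experiment defined in Section 2. In the path integral formalism, this average decomposes into three levels of integration:
$$ \langle O[\vec{\phi}] \rangle = \int \mathcal{D}[\vec{\phi}_f] \int \mathcal{D}[\vec{\phi}_0] \, P_0[\vec{\phi}_0] \int_{\vec{\phi}(0)=\vec{\phi}_0}^{\vec{\phi}(t_f)=\vec{\phi}_f} \mathcal{D}[\vec{\phi}, \tilde{\phi}] \; O[\vec{\phi}] \, e^{-S[\tilde{\phi}, \vec{\phi}]} $$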
Note: To maintain consistency with the lecture board, the response field integration measure here is \(\mathcal{D}[\tilde{\phi}]\), corresponding to the real action after Wick rotation.
This seemingly complex expression is actually a precise, step-by-step mathematical simulation of the nonequilibrium process:
- Initial state averaging: The integral \(\int \mathcal{D}[\vec{\phi}_0] P_0[\vec{\phi}_0] \dots\) corresponds to the first step of the experiment: randomly "sampling" an initial configuration \(\vec{\phi}_0\) from the initial equilibrium ensemble according to the Boltzmann probability \(P_0[\vec{\phi}_0] = \frac{1}{Z_0} e^{-\beta F_0[\vec{\phi}_0]}\).
- Path averaging: For each given initial configuration \(\vec{\phi}_0\) and final configuration \(\vec{\phi}_f\), the inner integral \(\int \mathcal{D}[\vec{\phi}, \tilde{\phi}] \exp(-S) \dots\) sums over all possible paths connecting these two configurations under the stochastic dynamics. The statistical weight of each path is given by the J-D action through \(e^{-S[\tilde{\phi}, \vec{\phi}]}\), which encodes both the dynamical constraint of the Langevin equation and the noise statistics.
- Summing over final states: The outermost integral \(\int \mathcal{D}[\vec{\phi}_f] \dots\) sums over all possible final configurations, because in nonequilibrium processes, even starting from the same initial state, different random trajectories will reach different final states.
This expression is the solid mathematical foundation for deriving all fluctuation theorems. It precisely decomposes the macroscopic statistical average of a nonequilibrium process into sampling from the initial equilibrium state and summing over stochastic dynamical paths. The next task is to reveal the universal laws hidden within by performing a series of clever symmetry transformations on this path integral expression.
4. Path-Integral Derivation of Fluctuation Theorems
Step 1: Response-Field Change of Variables - Isolating Thermodynamic Terms
This section is the mathematical core of the lecture, which will demonstrate in detail how to derive the core relationship connecting nonequilibrium work and equilibrium free energy through a series of ingenious transformations of the J-D path integral action. The cornerstone of the entire derivation process is the intrinsic symmetry of the action under time reversal, which is precisely the profound manifestation of microscopic reversibility at the path level.
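To keep the derivation transparent, consider a system with purely relaxational (gradient-descent) dynamics such as Model A, with drift term \(A_\alpha = -L (\delta F / \delta \phi_\alpha)\). According to Lecture 32, its J-D action is:
$$ S[\tilde{\phi}, \vec{\phi}] = \int_0^{t_f} dt \int d^dx \left[ \tilde{\phi}_\alpha \left( \partial_t \phi_\alpha + L \frac{\delta F}{\delta \phi_\alpha} \right) - \frac{1}{2} \tilde{\phi}_\alpha N_{\alpha\beta} \tilde{\phi}_\beta \right] $$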
According to the requirements of the fluctuation-dissipation theorem, in equilibrium the noise intensity and dissipation coefficient satisfy the Einstein-Onsager relation \(N_{\alpha\beta} = 2LT \delta_{\alpha\beta}\).
Physical motivation: The first step is a purely mathematical technique, but its physical motivation is very clear: reorganize the action to explicitly separate quantities directly related to thermodynamics (free energy, work) from complex dynamical terms. This transformation is designed precisely to allow the final result to be presented in a clear thermodynamic form.
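Define new response field variables \(\bar{\phi}_\alpha\):
$$ \bar{\phi}_\alpha(t) = -\tilde{\phi}_\alpha(t) + \frac{1}{T} \frac{\delta F}{\delta \phi_\alpha(t)} $$
This transformation is a sign flip combined with a shift that does not depend on \(\tilde{\phi}\), so its Jacobian determinant has unit magnitude and the path integral measure remains unchanged. Substituting \(\tilde{\phi}_\alpha(t) = -\bar{\phi}_\alpha(t) + \frac{1}{T} \frac{\delta F}{\delta \phi_\alpha(t)}\) into the action \(S\) and using \(N=2LT\), the terms quadratic in \(\delta F/\delta\phi\) cancel and the action decomposes exactly as:
$$ S = \int_0^{t_f} dt \int d^dx \left[ \bar{\phi}_\alpha \left( -\partial_t \phi_\alpha + L \frac{\delta F}{\delta \phi_\alpha} \right) - L T \, \bar{\phi}_\alpha \bar{\phi}_\alpha + \frac{1}{T} \frac{\delta F}{\delta \phi_\alpha} \partial_t \phi_\alpha \right] $$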
The last term \(\frac{1}{T}\frac{\delta F}{\delta \phi_{\alpha}} \partial_t \phi_{\alpha}\) in this form is the key to separating thermodynamic quantities. Using the chain rule for functionals, the total time derivative of the free energy \(F[\vec{\phi}(t)]\) along a trajectory is:
$$ \frac{dF}{dt} = \int d^dx \, \frac{\delta F}{\delta \phi_\alpha} \partial_t \phi_\alpha + \frac{\partial F}{\partial t} $$
Here:
- \(\frac{dF}{dt}\) is the total rate of change of free energy along the specific trajectory \(\vec{\phi}(t)\).
- The first term \(\int d^dx \dots\) is the free energy change due to the evolution of the system's internal configuration \(\vec{\phi}\) itself.
- The second term \(\frac{\partial F}{\partial t}\) is the explicit time dependence of free energy due to the external protocol \(\vec{h}(t)\) changing the energy landscape itself. According to the definition in Section 3, the integral of this term is precisely the Jarzynski work \(W_J[\vec{\phi}]\).
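Therefore, integrating over the duration of the protocol, the last term can be written exactly as:
$$ \frac{1}{T} \int_0^{t_f} dt \int d^dx \, \frac{\delta F}{\delta \phi_\alpha} \partial_t \phi_\alpha = \beta \left( F_f[\vec{\phi}_f] - F_0[\vec{\phi}_0] - W_J[\vec{\phi}] \right) $$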
Physical meaning: The profound result of this algebraic transformation is that it converts the original path weight factor \(e^{-S}\) into a new form containing the core thermodynamic part \(e^{-\beta (F_f - F_0 - W_J)}\). We have successfully "pre-extracted" the physical quantities needed in the final result at the action level.
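The algebra behind Step 1 can be checked pointwise with random numbers. The sketch below assumes the Model A integrand \(\tilde{\phi}(\partial_t\phi + L\,\delta F/\delta\phi) - LT\tilde{\phi}^2\) (that is, \(N = 2LT\)) and the substitution \(\tilde{\phi} = -\bar{\phi} + \frac{1}{T}\delta F/\delta\phi\). Real random values stand in for the fields at sample space-time points, which is enough to test a polynomial identity; the variable names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
L_coef, T = 0.7, 1.3       # arbitrary test values for the mobility L and temperature T
n = 1000

phi_dot = rng.normal(size=n)    # stands in for d(phi)/dt at sample points
dF_dphi = rng.normal(size=n)    # stands in for delta F / delta phi
phi_bar = rng.normal(size=n)    # new response field

# Change of variables from Step 1
phi_tilde = -phi_bar + dF_dphi / T

# Original integrand with N = 2 L T
original = phi_tilde * (phi_dot + L_coef * dF_dphi) - L_coef * T * phi_tilde**2

# Transformed integrand: dynamics in phi_bar plus the thermodynamic term
transformed = (phi_bar * (-phi_dot + L_coef * dF_dphi)
               - L_coef * T * phi_bar**2
               + dF_dphi * phi_dot / T)

max_mismatch = np.max(np.abs(original - transformed))
print(f"max |original - transformed| = {max_mismatch:.2e}")
```

Changing the noise strength away from \(2LT\) in `original` leaves a residual proportional to \((\delta F/\delta\phi)^2\), which shows where the Einstein-Onsager relation enters.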
Step 2: Time-Reversal Transformation - Employ Core Symmetry
Physical motivation: The second step is the physical core of the entire derivation. By performing time reversal operations on the path integral variables, we employ the fundamental symmetry of microscopic reversibility. The idea is to compare the probabilities of "forward-playing movies" and "rewind-playing movies".
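Define the time-reversed path variables as:
$$ \vec{\phi}^R(\vec{x},t) = \vec{\phi}(\vec{x}, t_f - t), \qquad \bar{\phi}^R(\vec{x},t) = \bar{\phi}(\vec{x}, t_f - t) $$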
Simultaneously, the external protocol also undergoes corresponding time reversal, i.e., the reversed protocol: \(\vec{h}^R(t) = \vec{h}(t_f - t)\). Now examine the behavior of key terms in the action under this transformation:
- Time derivative term: As shown in the lecture board, the term \(\int dt \, \bar{\phi}_{\alpha} \partial_t \phi_{\alpha}\) changes sign under time reversal. It represents the part of the action directly related to the direction of time evolution.
- Work term: We need to calculate the work done along the reversed path \(\vec{\phi}^R(t)\) under the reversed protocol \(\vec{h}^R(t)\), denoted as \(W_J^R\). Through the variable substitution \(\tau = t_f - t\), one finds:
$$ W_J^R[\vec{\phi}^R] = -\int_0^{t_f} dt \int d^dx \, \vec{\phi}^R(\vec{x},t) \cdot \frac{\partial \vec{h}^R(\vec{x},t)}{\partial t} = - W_J[\vec{\phi}] $$
Physical meaning: The work done along the reversed path under the reversed protocol is exactly the negative of the work done along the forward path under the forward protocol.
After these transformations, the original action \(S\) becomes a new action \(S^R\), which has a similar form to the original action but describes the dynamics under the reversed protocol.
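The antisymmetry \(W_J^R = -W_J\) can be verified on a discretized trajectory. This sketch assumes a single-component field on a grid of shape `(n_times, n_sites)`, a midpoint rule in time for the work, and an arbitrary smooth protocol; all names are illustrative. Time reversal is implemented by flipping the time axis of both the trajectory and the protocol.

```python
import numpy as np

def work(phi, h, dx=1.0):
    """Midpoint-rule work: - sum_n sum_x dx * phi_mid * (h_{n+1} - h_n)."""
    phi_mid = 0.5 * (phi[1:] + phi[:-1])
    return float(-np.sum(phi_mid * (h[1:] - h[:-1])) * dx)

rng = np.random.default_rng(2)
n_times, n_sites = 200, 16

# Arbitrary "trajectory" and a smooth, spatially uniform protocol h(t)
phi = rng.normal(size=(n_times, n_sites))
t = np.linspace(0.0, 1.0, n_times)[:, None]
h = np.sin(3.0 * t) * np.ones((1, n_sites))

# Time reversal: phi^R(t) = phi(t_f - t), h^R(t) = h(t_f - t)
phi_rev = phi[::-1]
h_rev = h[::-1]

W_forward = work(phi, h)
W_reverse = work(phi_rev, h_rev)
print(f"W_forward = {W_forward:.6f}, W_reverse = {W_reverse:.6f}")
```

With the midpoint rule the two values cancel to rounding error for any trajectory. A left-endpoint rule would satisfy the relation only up to discretization error, which is one reason to prefer a time-symmetric discretization when comparing forward and reverse work.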
Step 3: Assemble Results - Derive Universal Relation
Logical integration: Now insert the transformation results from the previous two steps into the path integral expression for the observable \(\langle O[\vec{\phi}] \rangle\). Using the change of variables and the invariance of the path integral measure (the Jacobian determinant is 1), one finds the duality relationship:
$$ \langle O[\vec{\phi}] \rangle = \frac{Z_f(T)}{Z_0(T)}\, \Big\langle \hat{O}[\vec{\phi}]\, e^{-\beta W_J^R[\vec{\phi}]} \Big\rangle_R $$
where:
- \(\langle \dots \rangle_R\) denotes ensemble averaging under the reversed protocol, starting from the equilibrium ensemble corresponding to the final external field \(\vec{h}(t_f)\).
- \(\hat{O}[\vec{\phi}]\) is the observable evaluated on the time-reversed path.
- \(Z_f\) and \(Z_0\) are the equilibrium partition functions corresponding to the final external field \(\vec{h}(t_f)\) and the initial external field \(\vec{h}(0)\), respectively.
Finally, using the definition of free energy \(\Delta F = F_f - F_0 = -T \ln(Z_f / Z_0)\) and the relationship \(W_J^R = -W_J[\vec{\phi}]\) derived in Step 2, we obtain a universal fluctuation relationship:
$$ \langle O[\vec{\phi}] \rangle = e^{-\beta \Delta F}\, \Big\langle \hat{O}[\vec{\phi}]\, e^{\beta W_J[\vec{\phi}]} \Big\rangle_R $$
Physical meaning: The average of a physical quantity measured under the forward protocol (LHS) equals, up to the factor \(e^{-\beta \Delta F}\), the average under the reversed protocol of the time-reversed observable weighted by the exponential work factor \(e^{\beta W_J}\) (RHS).
This duality relationship is a direct manifestation of microscopic reversibility in nonequilibrium processes. The observed macroscopic irreversibility (such as \(\langle W \rangle > \Delta F\)) does not arise from a lack of time symmetry in the physical laws themselves, but from the asymmetric boundary conditions of the process: preparing the system in a specific equilibrium state and letting time evolve forward. Fluctuation theorems quantify exactly the statistical consequences of this asymmetry in boundary conditions.
5. Jarzynski Work Relation
The universal fluctuation relationship derived in the previous section is a universal bridge connecting forward and reverse nonequilibrium processes. Now, through an extremely concise and ingenious choice, we will extract the first milestone achievement from this bridge—the Jarzynski equality (Jarzynski's Work Relation).
The Jarzynski equality was proposed by physicist Christopher Jarzynski in 1997. Its profound physical essence lies in establishing a precise bridge between nonequilibrium statistical physics and equilibrium thermodynamics. The equality states that in any nonequilibrium process that drives a system from one equilibrium state to another, the ensemble average of the exponential of the work done \(W\) is exactly equal to the Boltzmann factor corresponding to the free energy difference \(\Delta F\) between the initial and final equilibrium states, i.e., \(\langle e^{-\beta W} \rangle = e^{-\beta \Delta F}\). This theorem is a remarkable strengthening of the traditional second law inequality (\(\langle W \rangle \ge \Delta F\)), because it precisely connects a nonequilibrium average that depends on specific dynamical paths with a pure thermodynamic quantity that depends only on equilibrium states.
In specific applications, this equality has revolutionary significance, allowing scientists to precisely calculate equilibrium free energy differences through fast, irreversible computer simulations or single-molecule experiments (such as stretching RNA hairpins or proteins with optical tweezers), which is particularly important in biophysics and materials science where traditional quasistatic methods are difficult to implement.
5.1 From the Universal Relation to an Exact Equality
Recall the final result from the previous section, the universal duality relationship:
$$ \langle O[\vec{\phi}] \rangle = e^{-\beta \Delta F} \langle \hat{O}[\vec{\phi}] e^{\beta W_J[\vec{\phi}]} \rangle_R $$
This relationship holds for any observable \(O[\vec{\phi}]\). To obtain the most concise relationship, we choose the simplest observable, namely the constant \(O[\vec{\phi}] = 1\).
Substituting this choice into the universal relationship:
- On the left-hand side, probability normalization gives \(\langle 1 \rangle = 1\).
- On the right-hand side, time reversal does not change a constant, so \(\hat{1} = 1\).
Thus, the universal relationship immediately simplifies to:
$$ 1 = e^{-\beta \Delta F} \, \langle e^{\beta W_J[\vec{\phi}]} \rangle_R $$
The left side of this expression is the average of the forward process (implicit), while the right side is the ensemble average under the reversed protocol \(\langle \dots \rangle_R\). However, the standard form of the Jarzynski equality is about the forward process. To obtain it, we use an equivalent derivation: in the universal relationship, we choose a different observable \(O'[\phi] = O[\phi] e^{-\beta W_J[\phi]}\). After algebraic manipulation, we can obtain another equivalent universal relationship: \(\langle O[\phi] e^{-\beta W_J[\phi]} \rangle = e^{-\beta \Delta F} \langle \hat{O}[\phi] \rangle_R\).
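Now, choosing \(O[\vec{\phi}] = 1\) again in this new relationship:
$$ \langle e^{-\beta W_J[\vec{\phi}]} \rangle = e^{-\beta \Delta F} \, \langle \hat{1} \rangle_R $$
Since \(\langle \hat{1} \rangle_R = 1\), we directly obtain the final Jarzynski equality:
$$ \langle e^{-\beta W_J} \rangle = e^{-\beta \Delta F} $$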
Physical meaning: the nonequilibrium exponential average of work equals the equilibrium Boltzmann factor of the free-energy difference. This remarkable exact equality links a path-dependent nonequilibrium average to a purely equilibrium thermodynamic quantity. Practically, it enables computing \(\Delta F\) from fast, highly irreversible simulations or single-molecule experiments by accumulating an exponential average that emphasizes rare, low-work trajectories.
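In practice the equality is used as an estimator, \(\Delta F \approx -T \ln \frac{1}{N}\sum_i e^{-\beta W_i}\). The sketch below is one possible implementation, with a log-sum-exp shift for numerical stability. The function name and the synthetic Gaussian work samples are assumptions of this example; the samples are constructed so that their mean satisfies \(\langle W \rangle = \Delta F + \beta\sigma^2/2\), which is the condition under which a Gaussian work distribution obeys the Jarzynski equality.

```python
import numpy as np

def jarzynski_free_energy(work_samples, T=1.0):
    """Estimate Delta F = -T * log( mean( exp(-W / T) ) ) using a log-sum-exp shift."""
    a = -np.asarray(work_samples, dtype=float) / T
    a_max = np.max(a)                              # shift to avoid overflow/underflow
    log_mean = a_max + np.log(np.mean(np.exp(a - a_max)))
    return float(-T * log_mean)

rng = np.random.default_rng(3)
T = 1.0
true_delta_F, sigma = 1.0, 1.0
mean_work = true_delta_F + sigma**2 / (2 * T)      # Gaussian work consistent with Jarzynski
samples = rng.normal(mean_work, sigma, size=200_000)

estimate = jarzynski_free_energy(samples, T)
print(f"<W> = {samples.mean():.3f}, Jarzynski estimate of Delta F = {estimate:.3f}")
```

The plain average of the work overestimates \(\Delta F\) by the dissipated work, while the exponential average recovers it. The price is statistical: the estimator is dominated by rare low-work samples, so its convergence degrades quickly as \(\beta\sigma\) grows.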
5.2 Refining the Second Law
The Jarzynski equality is not only compatible with the second law of thermodynamics, but also refines it. Jensen's inequality states that for any convex function \(f(x)\) (such as \(f(x)=e^x\)), we have \(\langle f(x) \rangle \geq f(\langle x \rangle)\). Applying this with \(x = -\beta W_J\):
$$ \langle e^{-\beta W_J} \rangle \geq e^{-\beta \langle W_J \rangle} $$
Combining this inequality with the Jarzynski equality itself:
$$ e^{-\beta \Delta F} \geq e^{-\beta \langle W_J \rangle} $$
Taking the logarithm of both sides and multiplying by \(-T\) (which reverses the inequality sign), we recover the familiar statement of the second law of thermodynamics:
$$ \langle W_J \rangle \geq \Delta F $$
Physical meaning: This derivation clearly shows that the Jarzynski equality is a stronger statement than the second law.
- The second law only constrains the average of the work. It allows the work \(W_J\) in a single realization to be less than \(\Delta F\) (the so-called transient "violation" of the second law), but says nothing about the probability of such events.
- The Jarzynski equality imposes an exact integral constraint on the entire probability distribution \(P(W_J)\) of the work. Trajectories with \(W_J < \Delta F\) do occur, but their probability is exponentially suppressed, while dissipative trajectories with \(W_J > \Delta F\) are more probable. In the nonlinear average of \(e^{-\beta W_J}\), the rare low-work trajectories carry exponentially large weights, and all contributions combine to give exactly the Boltzmann factor of the equilibrium free energy difference.
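To put numbers on these statements, consider the special case of a Gaussian work distribution, which is an assumption of this sketch (general work distributions are not Gaussian). The Jarzynski equality then fixes the mean dissipated work to \(\langle W \rangle - \Delta F = \beta\sigma^2/2\), and the probability of a trajectory with \(W < \Delta F\) follows from the normal cumulative distribution function. The function name is illustrative.

```python
import math

def gaussian_work_summary(sigma, T=1.0, delta_F=0.0):
    """For Gaussian work with standard deviation sigma that obeys the Jarzynski
    equality, return (mean work, probability that W < delta_F)."""
    beta = 1.0 / T
    mean_W = delta_F + 0.5 * beta * sigma**2           # second law: mean_W >= delta_F
    z = (delta_F - mean_W) / sigma                     # equals -beta * sigma / 2
    p_violation = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return mean_W, p_violation

for sigma in (0.5, 1.0, 3.0):
    mean_W, p = gaussian_work_summary(sigma)
    print(f"sigma = {sigma:.1f}: <W> - Delta F = {mean_W:.3f}, P(W < Delta F) = {p:.4f}")
```

Weakly dissipative processes (small \(\sigma\)) produce \(W < \Delta F\) in a large fraction of runs, while strongly dissipative ones make such events rare without ever forbidding them.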
6. Code Practice I: Verification of the Jarzynski Equality
Below we will use Python code to simulate a classical nonequilibrium physical process—dragging an overdamped Brownian particle in a harmonic potential well—to numerically verify the Jarzynski equality.
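Physical Model
Consider a one-dimensional overdamped Brownian particle whose dynamics is described by the following Langevin equation:
$$ \gamma \frac{dx}{dt} = -\frac{\partial V(x,t)}{\partial x} + \eta(t) = -k \left( x - \lambda(t) \right) + \eta(t) $$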
where:
- \(x(t)\) is the particle position.
- \(\gamma\) is the friction coefficient.
- \(V(x,t) = \frac{1}{2} k (x - \lambda(t))^2\) is a harmonic potential well whose center position \(\lambda(t)\) is controlled by an external protocol. \(k\) is the stiffness coefficient of the potential well.
- \(\eta(t)\) is Gaussian white noise representing random collisions from the thermal bath, with statistical properties \(\langle \eta(t) \rangle = 0\) and \(\langle \eta(t) \eta(t') \rangle = 2 \gamma T \delta(t - t')\), where \(T\) is the temperature.
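Nonequilibrium protocol: The potential well center \(\lambda(t)\) moves linearly from initial position \(\lambda_0\) to final position \(\lambda_f\) during time \(t=0\) to \(t=t_f\):
$$ \lambda(t) = \lambda_0 + (\lambda_f - \lambda_0) \frac{t}{t_f} $$
Nonequilibrium work: According to the previous definition, the work done along a trajectory \(x(t)\) is:
$$ W_J = \int_0^{t_f} dt \, \frac{\partial V}{\partial \lambda} \dot{\lambda}(t) = -k \int_0^{t_f} dt \, \left( x(t) - \lambda(t) \right) \dot{\lambda}(t) $$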
Free energy change: Since the shape of the potential well (determined by \(k\)) has not changed, the equilibrium free energy difference between initial and final states is \(\Delta F = 0\). Therefore, the Jarzynski equality predicts \(\langle e^{-\beta W_J} \rangle = e^0 = 1\).
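For this linear model the work distribution is Gaussian, so its mean and variance can be written in closed form and used as a reference for a simulation. The sketch below is an independent cross-check added here, not part of the lecture: the mean follows from the lag of \(\langle x \rangle\) behind \(\lambda(t)\), and the variance from the stationary position covariance \((T/k)e^{-k|t-s|/\gamma}\). The default parameter values (\(k=2\), \(\gamma=1\), \(T=1\), distance 5, \(t_f=5\)) are assumed for illustration.

```python
import math

T = 1.0

def dragged_trap_work_stats(k=2.0, gamma=1.0, T=1.0, distance=5.0, t_f=5.0):
    """Closed-form mean and variance of the work for a harmonic trap dragged at
    constant speed, starting from equilibrium in the initial trap."""
    v = distance / t_f
    a = k / gamma                                    # relaxation rate
    relax = t_f - (1.0 - math.exp(-a * t_f)) / a     # t_f minus the transient correction
    mean_W = gamma * v**2 * relax                    # from the lag of <x> behind lambda
    # Double time integral of the covariance (T/k) exp(-a |t - s|), times (k v)^2
    var_W = (k * v)**2 * (T / k) * 2.0 * relax / a
    return mean_W, var_W

mean_W, var_W = dragged_trap_work_stats(T=T)
# Gaussian work: log <exp(-beta W)> = -beta <W> + beta^2 Var(W) / 2
log_jarzynski = -mean_W / T + 0.5 * var_W / T**2
print(f"<W> = {mean_W:.4f}, Var(W) = {var_W:.4f}, log<exp(-beta W)> = {log_jarzynski:.2e}")
```

The two moments are computed from different ingredients, yet they combine to \(\ln\langle e^{-\beta W}\rangle = 0\), consistent with \(\Delta F = 0\). A simulated work histogram for the same parameters should be centered near the printed mean with the printed variance.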
The code below uses the simple Euler-Maruyama method to numerically integrate the Langevin equation and calculate the work for multiple trajectories, finally verifying the Jarzynski equality.
Python Implementation
```python
import numpy as np
import matplotlib.pyplot as plt

# --- 1. Parameter settings (same as before) ---
k = 2.0; gamma = 1.0; T = 1.0; beta = 1.0 / T
dt = 0.01; t_f = 5.0; num_steps = int(t_f / dt)
num_trajectories = 20000  # Increase trajectory count for smoother distribution

# --- 2. Forward protocol simulation ---
lambda_0_fwd = 0.0; lambda_f_fwd = 5.0
v_lambda_fwd = (lambda_f_fwd - lambda_0_fwd) / t_f
work_forward = np.zeros(num_trajectories)
print("Simulating forward protocol...")
for i in range(num_trajectories):
    # Initial position drawn from equilibrium in the initial trap
    x = np.random.normal(loc=lambda_0_fwd, scale=np.sqrt(T / k))
    total_work = 0.0
    for step in range(num_steps):
        t = step * dt
        lambda_t = lambda_0_fwd + v_lambda_fwd * t
        force_lambda = k * (x - lambda_t)
        # Work increment: dW = (dV/dlambda) * dlambda = -k (x - lambda) * v * dt
        dW = -force_lambda * (v_lambda_fwd * dt)
        total_work += dW
        # Euler-Maruyama update of the overdamped Langevin equation
        force_x = -k * (x - lambda_t)
        noise_term = np.sqrt(2 * gamma * T * dt) * np.random.randn()
        x += (force_x / gamma) * dt + noise_term / gamma
    work_forward[i] = total_work

# --- 3. Reverse protocol simulation ---
lambda_0_rev = lambda_f_fwd; lambda_f_rev = lambda_0_fwd
v_lambda_rev = (lambda_f_rev - lambda_0_rev) / t_f  # Velocity is negative
work_reverse = np.zeros(num_trajectories)
print("Simulating reverse protocol...")
for i in range(num_trajectories):
    # Initial position drawn from equilibrium in the final trap of the forward protocol
    x = np.random.normal(loc=lambda_0_rev, scale=np.sqrt(T / k))
    total_work = 0.0
    for step in range(num_steps):
        t = step * dt
        lambda_t = lambda_0_rev + v_lambda_rev * t
        force_lambda = k * (x - lambda_t)
        # Work increment: dW = (dV/dlambda) * dlambda = -k (x - lambda) * v * dt
        dW = -force_lambda * (v_lambda_rev * dt)
        total_work += dW
        # Euler-Maruyama update of the overdamped Langevin equation
        force_x = -k * (x - lambda_t)
        noise_term = np.sqrt(2 * gamma * T * dt) * np.random.randn()
        x += (force_x / gamma) * dt + noise_term / gamma
    work_reverse[i] = total_work

# --- 4. Visualizing Crooks Theorem ---
plt.style.use('dark_background')
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 7))
fig.suptitle('Numerical Verification of Crooks Fluctuation Theorem', fontsize=18)

# --- Figure 1: Intersection point of work distributions ---
delta_F = 0.0
bins = np.linspace(-15, 40, 75)
ax1.hist(work_forward, bins=bins, density=True, alpha=0.7, label=r'Forward work distribution $P(W)$')
# Plot distribution of negative reverse work P_R(-W)
ax1.hist(-work_reverse, bins=bins, density=True, alpha=0.7, label=r'Reverse work distribution $P_R(-W)$')
ax1.axvline(delta_F, color='red', linestyle='--', linewidth=2, label=f'Delta F = {delta_F:.1f}')
ax1.set_xlabel('Work W', fontsize=12)
ax1.set_ylabel('Probability Density', fontsize=12)
ax1.set_title('Verification Method 1: Work Distributions Intersect at Delta F', fontsize=14)
ax1.legend()

# --- Figure 2: Linear relationship of logarithmic probability ratio ---
# Calculate histogram data
hist_fwd, bin_edges = np.histogram(work_forward, bins=bins, density=True)
hist_rev, _ = np.histogram(-work_reverse, bins=bins, density=True)
bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2
# Avoid division by zero, only select regions where both distributions are non-zero
mask = (hist_fwd > 1e-4) & (hist_rev > 1e-4)
log_ratio = np.log(hist_fwd[mask] / hist_rev[mask])
work_vals = bin_centers[mask]
# Plot scatter diagram
ax2.scatter(work_vals, log_ratio, alpha=0.8, label='Simulation data (ln[P(W)/P_R(-W)])')
# Plot theoretical prediction line
w_theory = np.linspace(np.min(work_vals), np.max(work_vals), 100)
log_ratio_theory = beta * (w_theory - delta_F)
ax2.plot(w_theory, log_ratio_theory, 'r-', lw=3, label=f'Theoretical prediction: Slope beta={beta:.1f}')
ax2.set_xlabel('Work W', fontsize=12)
ax2.set_ylabel(r'$\ln[P(W) / P_R(-W)]$', fontsize=14)
ax2.set_title('Verification Method 2: Linear Relationship of Log Probability Ratio', fontsize=14)
ax2.legend()

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()
```
Physical process: In the left panel, the particle trajectory (red solid line) always lags behind the moving potential well center. The blue dashed lines mark the initial and final positions of the well; the moving center itself is not drawn. This lag is precisely the physical source of dissipative work.
Second law of thermodynamics holds: The work distribution plot in the upper right shows that the average work \(\langle W \rangle\) is much greater than the free energy difference \(\Delta F = 0\). This is consistent with the second law of thermodynamics, \(\langle W \rangle \geq \Delta F\).
Convergence of Jarzynski average: One of the most important characteristics of the Jarzynski equality is that its convergence is very slow and is dominated by rare events. The Jarzynski average \(\langle e^{-\beta W} \rangle\) is a nonlinear exponential average, so it is extremely sensitive to trajectories with very small or even negative work \(W\) (the events at the far left of the histogram in the upper right plot). These are "lucky" trajectories in which thermal fluctuations happen to help the particle keep up with the moving potential well, reducing dissipation. Such trajectories are very rare. With a limited number of simulated trajectories (such as 1480), these rare events are likely to be undersampled, so the computed average is statistically biased and usually falls below 1. The dramatic fluctuations of the purple curve in the plot reflect exactly this sensitivity to rare events. If we increase the number of simulated trajectories to 10,000 or more, the value converges more stably toward 1.
The animation shows sample trajectories and the running exponential average \(\langle e^{-\beta W}\rangle\) converging to \(e^{-\beta \Delta F}\), validating Jarzynski's equality.
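The running exponential average described above can be reproduced with a short script. The following is a minimal sketch, not the lecture's original code: it assumes the same overdamped Langevin dynamics and parameter values as the listing above, vectorizes the Euler-Maruyama update over trajectories, and uses a hypothetical helper name `simulate_work` and a fixed random seed chosen here for reproducibility.

```python
import numpy as np

def simulate_work(n_traj, k=2.0, gamma=1.0, T=1.0, lam0=0.0, lam_f=5.0,
                  t_f=5.0, dt=0.01, seed=0):
    """Work along n_traj trajectories of a particle in a dragged harmonic trap.

    Euler-Maruyama integration of gamma * dx/dt = -k (x - lambda(t)) + eta(t),
    vectorized over trajectories (hypothetical helper, not from the lecture).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(t_f / dt))
    v = (lam_f - lam0) / t_f
    # Equilibrium initial condition in the initial trap
    x = rng.normal(lam0, np.sqrt(T / k), size=n_traj)
    work = np.zeros(n_traj)
    for step in range(n_steps):
        lam = lam0 + v * step * dt
        # Work increment: (dV/dlambda) * dlambda = -k (x - lambda) * v * dt
        work += -k * (x - lam) * v * dt
        noise = np.sqrt(2.0 * T * dt / gamma) * rng.standard_normal(n_traj)
        x += -(k / gamma) * (x - lam) * dt + noise
    return work

beta = 1.0
work = simulate_work(n_traj=20000)
weights = np.exp(-beta * work)

# Running estimate of <exp(-beta W)> after 1, 2, ..., N trajectories
running_avg = np.cumsum(weights) / np.arange(1, work.size + 1)
mean_work = work.mean()
jarzynski_avg = weights.mean()
delta_F_est = -np.log(jarzynski_avg) / beta

print(f"<W> = {mean_work:.3f}")
print(f"<exp(-beta W)> = {jarzynski_avg:.3f}")
print(f"Delta F estimate = {delta_F_est:.3f}")
```

Plotting `running_avg` against the trajectory index shows the jumps caused by individual low-work trajectories; changing `seed` or `n_traj` gives a feel for how strongly the estimate depends on them.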
7. Crooks Fluctuation Theorem: A More Refined Symmetry¶
7.1 Derivation and Form¶
The Jarzynski equality is an integral constraint on nonequilibrium work, acting on the average of the exponential of work. However, it originates from a more universal and refined relationship, namely the Crooks fluctuation theorem. This theorem does not constrain averages, but directly relates the entire work distribution functions of forward and reverse processes.
The Crooks fluctuation theorem was proposed by Gavin E. Crooks in 1999 as a generalization and refinement of the Jarzynski equality. It states that the ratio of the probability density \(P(W)\) of performing work \(W\) in the forward process to the probability density \(P_R(-W)\) of performing work \(-W\) under the time-reversed protocol is a simple exponential: \(P(W)/P_R(-W) = e^{\beta (W - \Delta F)}\). This relation quantifies how rare transient "violations" of the second law (i.e., trajectories with \(W < \Delta F\)) are, and reveals the time-reversal symmetry hidden in nonequilibrium fluctuations.
In specific applications, the Crooks theorem provides a more numerically robust and reliable free energy calculation method than the Jarzynski equality, because it allows direct determination of \(\Delta F\) by finding the intersection point of forward and reverse work distribution curves. This "intersection method" has become a standard tool for analyzing single-molecule force spectroscopy experiments (such as RNA folding/unfolding) and computational simulation data.
In formulas, for the forward-protocol work distribution \(P(W)\) and the reversed-protocol distribution of the opposite work \(P_R(-W)\), Crooks' theorem reads: $$ \frac{P(W)}{P_R(-W)} = e^{\beta (W-\Delta F)}. $$ Setting the ratio to 1 shows that \(P(W)\) and \(P_R(-W)\) cross at \(W=\Delta F\).
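The intersection method can be sketched in a few lines. The example below is an illustration under stated assumptions rather than a standard implementation: it fits a Gaussian to each work sample (reasonable for the harmonic trap, where the work is Gaussian, but a simplification in general) and solves for the crossing of the two fitted densities. The function name `crossing_from_gaussian_fits` and the synthetic test data are hypothetical.

```python
import numpy as np

def crossing_from_gaussian_fits(w_fwd, w_rev):
    """Estimate Delta F as the crossing point of Gaussian fits to P(W) and P_R(-W)."""
    m1, s1 = np.mean(w_fwd), np.std(w_fwd, ddof=1)
    m2, s2 = np.mean(-np.asarray(w_rev)), np.std(w_rev, ddof=1)
    # ln N(w; m1, s1) - ln N(w; m2, s2) = a w^2 + b w + c; the crossing is a root
    a = 0.5 / s2**2 - 0.5 / s1**2
    b = m1 / s1**2 - m2 / s2**2
    c = 0.5 * m2**2 / s2**2 - 0.5 * m1**2 / s1**2 + np.log(s2 / s1)
    roots = np.roots([a, b, c])
    roots = roots[np.isreal(roots)].real
    if roots.size == 0:
        raise ValueError("fitted Gaussians do not cross")
    # Keep the root closest to the midpoint between the two peaks
    midpoint = 0.5 * (m1 + m2)
    return roots[np.argmin(np.abs(roots - midpoint))]

# Synthetic Gaussian work samples constructed to satisfy Crooks' theorem:
# forward mean = dF + beta*sigma^2/2, reverse mean = -dF + beta*sigma^2/2
rng = np.random.default_rng(1)
beta, sigma_w, dF_true = 1.0, 3.0, 1.5
w_fwd = rng.normal(dF_true + 0.5 * beta * sigma_w**2, sigma_w, size=20000)
w_rev = rng.normal(-dF_true + 0.5 * beta * sigma_w**2, sigma_w, size=20000)

dF_est = crossing_from_gaussian_fits(w_fwd, w_rev)
print(f"Delta F estimate from crossing: {dF_est:.3f} (true value {dF_true})")
```

For non-Gaussian work distributions the fit would have to be replaced by a histogram or kernel density estimate, at the cost of more statistical noise near the crossing.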
7.2 Jarzynski from Crooks¶
The Crooks theorem is a stronger conclusion than the Jarzynski equality. We can rederive the Jarzynski equality by integrating the Crooks theorem, proving this point:
- Starting from the Crooks theorem: \(P(W_J) = P_R(-W_J)\, e^{\beta (W_J - \Delta F)}\)
- Calculate the Jarzynski average: $$ \langle e^{-\beta W_J} \rangle \equiv \int dW_J \, P(W_J)\, e^{-\beta W_J} $$
- Substitute the expression for \(P(W_J)\): $$ \langle e^{-\beta W_J} \rangle = \int dW_J \, P_R(-W_J)\, e^{\beta (W_J - \Delta F)}\, e^{-\beta W_J} $$
- Simplify the exponential terms: $$ \langle e^{-\beta W_J} \rangle = \int dW_J \, P_R(-W_J)\, e^{-\beta \Delta F} = e^{-\beta \Delta F} \int dW_J \, P_R(-W_J) $$
- Since \(P_R\) is a normalized probability distribution, its integral is 1: $$ \int dW_J \, P_R(-W_J) = 1 $$
- Finally obtain the Jarzynski equality: $$ \langle e^{-\beta W_J} \rangle = e^{-\beta \Delta F} $$
This derivation clearly shows that the Jarzynski equality is a direct corollary of the Crooks theorem at the integral level.
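A worked special case makes the relation concrete. Assume, as an illustration, that the forward work is Gaussian with mean \(\langle W_J \rangle\) and variance \(\sigma_W^2\) (the symbol \(\sigma_W\) is introduced here; this assumption holds for the dragged harmonic trap, where the work is a linear functional of a Gaussian process). Completing the square in the exponent gives

$$ \langle e^{-\beta W_J} \rangle = \int dW_J \, \frac{1}{\sqrt{2\pi\sigma_W^2}} \exp\left[ -\frac{(W_J - \langle W_J \rangle)^2}{2\sigma_W^2} - \beta W_J \right] = \exp\left[ -\beta \langle W_J \rangle + \tfrac{1}{2} \beta^2 \sigma_W^2 \right]. $$

Comparing with the Jarzynski equality yields

$$ \Delta F = \langle W_J \rangle - \frac{\beta}{2}\, \sigma_W^2 , $$

so in the Gaussian case the mean dissipated work \(\langle W_J \rangle - \Delta F\) equals \(\beta \sigma_W^2 / 2\): a broader work distribution means more dissipation. For the moving trap with \(\Delta F = 0\) this reduces to \(\sigma_W^2 = 2T \langle W_J \rangle\).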
8. Code Practice II: Visualizing Crooks' Theorem¶
The plots discussed in this section are produced by the script listed under "Python Implementation" above, which simulates the forward and the reverse protocol and then plots both work distributions together with their log ratio.
The left plot shows the forward work distribution \(P(W)\) (blue) and the distribution of the negative of the reverse work, \(P_R(-W)\) (orange). The forward process is on average dissipative, so \(P(W)\) peaks in the positive work region. The reverse process is also on average dissipative (\(\langle W_R \rangle > \Delta F_{rev} = -\Delta F = 0\)), so \(-W_R\) is predominantly negative and \(P_R(-W)\) peaks in the negative work region. The most crucial feature is the intersection point of the two distributions. According to the Crooks theorem, this intersection lies at \(W = \Delta F\). In the simulation \(\Delta F = 0\), so the two histograms cross at the red dashed line at \(W = 0\).
The right plot provides a more rigorous and quantitative verification. It directly tests the linear relationship: $$ \ln\left[\frac{P(W)}{P_R(-W)}\right] = \beta (W - \Delta F) $$ The simulation data points (blue scatter) fall on a straight line that agrees closely with the theoretical prediction (red line). Within statistical error, the slope of this line is \(\beta = 1/T = 1.0\) and its x-intercept lies at \(\Delta F = 0\). The scatter grows toward both ends of the range, where one of the two histograms contains only a few counts.
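The slope and intercept of this line can also be extracted numerically instead of being read off the plot. The sketch below is one possible approach, not the lecture's method: it performs a weighted least-squares fit of the log ratio with `np.polyfit`, weighting each bin by its Poisson counting error. The function name `crooks_line_fit`, the `min_count` threshold, and the synthetic Gaussian samples standing in for simulation output are assumptions made for this example.

```python
import numpy as np

def crooks_line_fit(w_fwd, w_rev, bins, min_count=20):
    """Fit ln[P(W)/P_R(-W)] = slope * W + intercept; return (slope, Delta F estimate)."""
    n_fwd, edges = np.histogram(w_fwd, bins=bins)
    n_rev, _ = np.histogram(-np.asarray(w_rev), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Use only bins that are well populated in both histograms
    mask = (n_fwd >= min_count) & (n_rev >= min_count)
    # Ratio of normalized densities; the common bin width cancels
    log_ratio = (np.log(n_fwd[mask] / len(w_fwd))
                 - np.log(n_rev[mask] / len(w_rev)))
    # Approximate counting error of the log ratio in each bin
    sigma = np.sqrt(1.0 / n_fwd[mask] + 1.0 / n_rev[mask])
    slope, intercept = np.polyfit(centers[mask], log_ratio, 1, w=1.0 / sigma)
    # The fitted line crosses zero at W = Delta F
    return slope, -intercept / slope

# Synthetic Gaussian work samples consistent with Crooks' theorem (Delta F = 0)
rng = np.random.default_rng(2)
beta, sigma_w, dF_true = 1.0, 3.0, 0.0
w_fwd = rng.normal(dF_true + 0.5 * beta * sigma_w**2, sigma_w, size=20000)
w_rev = rng.normal(-dF_true + 0.5 * beta * sigma_w**2, sigma_w, size=20000)

beta_est, dF_est = crooks_line_fit(w_fwd, w_rev, bins=np.linspace(-8, 8, 17))
print(f"fitted slope (estimate of beta): {beta_est:.3f}")
print(f"fitted x-intercept (estimate of Delta F): {dF_est:.3f}")
```

With real simulation output, `w_fwd` and `w_rev` would be replaced by the `work_forward` and `work_reverse` arrays. The weights reduce the influence of sparsely populated tail bins, which otherwise dominate the scatter.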
Conclusion¶
At the end of this lecture, Professor Erwin Frey recommends an authoritative review article in this field:
Seifert, U. Stochastic thermodynamics, fluctuation theorems and molecular machines. Reports on Progress in Physics 75(12), 126001 (2012).
This review systematically constructs the theoretical framework of stochastic thermodynamics. Its core idea is to extend the macroscopic concepts of classical thermodynamics, such as work, heat, and entropy production, to the level of single microscopic random trajectories. For a nonequilibrium system coupled to a constant-temperature heat bath (whether a continuous system described by Langevin equations or a discrete system described by master equations), it shows how to establish first-law-like energy balance relations along single fluctuating trajectories. The article demonstrates how the Jarzynski equality, the Crooks theorem, and various other integral and detailed fluctuation theorems (IFT and DFT) can be derived in a unified way from a "master theorem". It covers a large number of paradigmatic systems, including colloidal particles such as the one simulated in this lecture, biopolymers, molecular motors, and small biochemical networks. As Professor Frey said, this is a "...beautiful review of all the different aspects of stochastic thermodynamics...".
This lecture marks a crucial turning point in this series of courses from near-equilibrium to nonequilibrium fields. Starting from reviewing the fluctuation-dissipation relationship of near-equilibrium systems, we have expanded the vision of statistical physics to stochastic processes driven by arbitrary external protocols that are far from equilibrium. Through the use of powerful JDD path-integral tools, we ultimately proved that even when systems undergo highly irreversible processes, universal and exact equality relationships beyond classical thermodynamic inequalities can still be found.
The core content can be summarized into the following four points:
- Nonequilibrium work is a random variable: At the microscopic scale, thermal fluctuations make the work done on the system depend on the specific trajectory the system follows, so it is no longer a deterministic value. It must therefore be described by its complete probability distribution.
- Jarzynski equality: The exponential average of the work measured in a nonequilibrium process is exactly equal to the Boltzmann factor of the free energy difference between the initial and final equilibrium states, i.e., \(\langle e^{-\beta W_J} \rangle = e^{-\beta \Delta F}\). This theorem provides a solid theoretical foundation for extracting equilibrium information from irreversible processes.
- Crooks fluctuation theorem: This is a more refined and powerful "detailed" symmetry relationship. Through the formula \(\frac{P(W_J)}{P_R(-W_J)} = e^{\beta (W_J - \Delta F)}\), it relates the entire work distributions of the forward and reverse processes, thereby quantifying how rare transient "violations" of the second law are.
- Central role of time-reversal symmetry: The fundamental source of all these nonequilibrium theorems is the time reversibility of the underlying microscopic dynamics. The observed macroscopic irreversibility arises from asymmetric boundary conditions (i.e., starting from a specific equilibrium state and letting time evolve forward), and fluctuation theorems are the exact description of the statistical consequences of this asymmetry.
In this lecture we considered transient processes that drive a system between equilibrium states in finite time. Next, we turn to nonequilibrium steady states (NESS), with directed percolation (DP) as a paradigmatic model of nonequilibrium phase transitions. Unlike equilibrium transitions, DP is inherently dynamical and irreversible (it has a preferred direction) and describes critical behavior such as spreading into absorbing states. To analyze long-time behavior and spectra (relaxation rates, fluctuation spectra) in NESS, spectral methods for the evolution operator (e.g., the Fokker-Planck operator) will be central, shifting our lens from path integrals to operator theory.


