Time Evolution In Macroscopic Systems. III: Selected Applications
... Department of Physics & Astronomy, University of Wyoming 
... Laramie, Wyoming 82071 
Abstract. The results of two recent articles expanding the Gibbs variational principle to encompass all of
statistical mechanics, in which the role of external sources is made explicit, are utilized to further explicate the
theory. Representative applications to nonequilibrium thermodynamics and hydrodynamics are presented, describing several
fundamental processes, including hydrodynamic fluctuations. A coherent description of macroscopic relaxation dynamics
is provided, along with an exemplary demonstration of the approach to equilibrium in a simple fluid.
1. Introduction
In his classic exposition of the Elementary Principles of Statistical Mechanics (1902) Willard Gibbs introduced
two seminal ideas into the analysis of many-body systems: the notion of ensembles of identical systems in phase
space, and his variational principle minimizing an `average index of probability.' With respect to the first, he
noted that it was an artifice that "may serve to give precision to notions of probability," and was not necessary.
It now seems clear that this is indeed the case, and our understanding of probability theory has progressed to the
point that one need focus only on the single system actually under study, as logic requires.
Gibbs never revealed the reasoning behind his variational principle, and it took more than fifty years to
understand the underlying logic. This advance was initiated by Shannon (1948) and developed explicitly by
Jaynes (1957a), who recognized the principle as a Principle of Maximum Entropy (PME) and of fundamental importance
to probability theory. That is, the entropy of a probability distribution over an exhaustive set of mutually exclusive
propositions {A_{i}},
S_{I} ≡ −k Σ_{i} P_{i} ln P_{i} , k > 0 , (1)

is to be maximized over all possible distributions subject to constraints expressed as expectation values.
The crucial notion here is that all probabilities are logically based on some kind of
prior information, so that P_{i} = P(A_{i}|I), and that information here is just those constraints along with any other
background information relevant to the problem at hand. It is this maximum of S_{I} that has been identified in the
past with the physical entropy of equilibrium thermodynamics when the constraints involve only constants of the motion.
It may be of some value to stress the uniqueness of the information measure (1) in this respect. In recent years a
number of other measures have been proposed, generally on an ad hoc basis, and often to address a very specific type
of problem. While these may or may not be of value for the intended purpose, their relation to probable inference is
highly dubious. That issue was settled years ago when Shore and Johnson (1980) demonstrated that any function different
from (1) is inconsistent with accepted methods of inference unless the two have identical maxima under the same constraints;
the argument does not rely on their interpretation as information measures.
There is nothing in the principle as formulated by either Gibbs or Jaynes, other than the constraints, to restrict
it to time-independent probabilities; indeed, this had already been noted by Boltzmann with respect to some of his ideas on
entropy and expressed by Planck in the form S_{B} = k log W. In turn, the only way probabilities can evolve in time is for I
itself to be time dependent, thus freeing the {A_{i}} from restriction to constants of the motion. In two recent
articles (Grandy, 2004a,b) this notion has been exploited to extend the variational principle to the entire range of
statistical physics. ^{1} It was emphasized there how important it is to take into account
explicitly any external sources, for they are the only means by which the constraints can change and provide new information
about the system. The results of these steps appear to form a sound basis in probability theory for the derivation of
macroscopic equations of motion based on the underlying microscopic dynamics.
The subtitle Gibbs chose for his work was, Developed with Especial Reference to The Rational Foundation of Thermodynamics,
the continuation of which motivates the present essay. The intention here is not to develop detailed applications, but
only to further explicate their logical foundations. To keep the discussion somewhat self-contained we begin with a brief
review and summary of the previous results, including some additional details of the equilibrium scenario that provide a
framework for the nonequilibrium case. At each stage it is the nature of the information, or the constraints on the PME,
that establishes the mathematical structure of the theory, demonstrating how the maximum entropy functional presides
over all of statistical mechanics in much the same way as the Lagrangian governs all of mechanics.
The Basic Equilibrium Scenario
A brief description of the elementary structure of the PME was given in I, as well as in a number of other places,
but we belabor it a bit here to serve as a general guide
for its extension; the quantum-mechanical context in terms of the density matrix ρ is adopted for the sake of brevity.
Information is specified in terms of expectation values of a number of Hermitian operators {f_{r}}, r = 1,…,m < n, and
the von Neumann information entropy

S_{I} = −k Tr(ρ ln ρ) (2)

is maximized subject to the constraints

Tr ρ = 1 , ⟨f_{r}⟩ = Tr(ρ f_{r}) . (3)

Lagrange multipliers {λ_{r}} are introduced for each constraint and the result of the variational calculation is
the canonical form

ρ = (1/Z) e^{−λ·f} , Z(λ_{1},…,λ_{m}) = Tr e^{−λ·f} , (4)

in terms of the convenient scalar-product notation λ·f ≡ λ_{1}f_{1} + ⋯ + λ_{m}f_{m}. The multipliers
and the partition function Z are identified by substitution into (3):

⟨f_{r}⟩ = −(∂/∂λ_{r}) ln Z , r = 1,…,m , (5)

a set of m coupled differential equations. Equation (4) expresses the initial intent of the PME: to construct a prior
probability distribution, or initial state, from raw data.
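As a concrete numerical illustration (not part of the original papers), the canonical form (4) and the constraint equation (5) can be solved for a small discrete problem; the values f_i, the datum ⟨f⟩, and the choice k = 1 below are all arbitrary.

```python
import numpy as np

# Hypothetical discrete problem: possible values f_i of a single constrained
# quantity, and an assumed datum <f> (both choices are arbitrary).
f = np.array([0.0, 1.0, 2.0, 3.0])
target_mean = 1.2

def mean_f(lam):
    """<f> under the canonical distribution (4), i.e. -d lnZ/d lambda."""
    P = np.exp(-lam * f)
    P /= P.sum()
    return (P * f).sum()

# Solve the single constraint equation (5) for lambda by bisection;
# mean_f is monotonically decreasing in lambda.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if mean_f(mid) > target_mean else (lo, mid)
lam = 0.5 * (lo + hi)
P = np.exp(-lam * f) / np.exp(-lam * f).sum()

def S(p):
    """Information entropy (1) with k = 1."""
    return -(p * np.log(p)).sum()

# The solution reproduces the input datum ...
assert abs((P * f).sum() - target_mean) < 1e-9
# ... and any other distribution with the same normalization and mean has
# lower entropy; v is a direction preserving both constraints
# (sum(v) = 0 and sum(v*f) = 0).
v = np.array([1.0, -1.0, -1.0, 1.0])
assert S(P) > S(P + 1e-3 * v)
```

The final assertions confirm the two defining properties of the maximum-entropy solution: it reproduces the datum, and any nearby distribution satisfying the same constraints has strictly lower entropy.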
The maximum entropy, which is no longer a function of the probabilities but of the input data only, is found by substituting (4)
into (2): ^{2}

(1/k) S = ln Z + λ·⟨f⟩ . (6)
A structure for the statistical theory follows from the analytical properties of S and Z, beginning with the
total differential dS = k λ·d⟨f⟩, as follows from (5). Hence,

∂S/∂λ_{r} = 0 , ∂S/∂⟨f_{r}⟩ = kλ_{r} . (7)
The operators f_{r} can also depend on one or more `external' variables a, say, so that

f_{r} = f_{r}(a) , (8)

and because ln Z = ln Z(λ_{1},…,λ_{m},a) we have the reciprocity relation

(∂S/∂a)_{{⟨f_{r}⟩}} = k (∂ln Z/∂a)_{{λ_{r}}} , (9)

indicating which variables are to be held constant under differentiation. When such external variables are
present the total differential becomes

(1/k) dS = Σ_{r} λ_{r} dQ_{r} , (10)

where

dQ_{r} ≡ d⟨f_{r}⟩ − ⟨df_{r}⟩ , ⟨df_{r}⟩ ≡ ⟨∂f_{r}/∂a⟩ da . (11)
As in I, changes in entropy are always related to a source of some kind, here denoted by dQ_{r}.
With (7) the maximum entropy (6) can be rewritten in the form

(1/k) S = (1/k)(∂S/∂⟨f⟩)·⟨f⟩ + a (ln Z/a) . (12)
If (ln Z/a) is independent of a it can be replaced by ∂ln Z/∂a, and from (9) we can write

S = ⟨f⟩·(∂S/∂⟨f⟩) + a (∂S/∂a) , (13)

which is just Euler's theorem exhibiting S as a homogeneous function of degree 1.
Thus, under the stated condition the maximum entropy is an extensive function of the input data.
As always, the sharpness of a probability distribution, and therefore a measure of its predictive power,
is provided by the variances and covariances of the fundamental variables. One readily verifies that
⟨f_{m}f_{n}⟩ − ⟨f_{m}⟩⟨f_{n}⟩ = −∂⟨f_{m}⟩/∂λ_{n} = −∂⟨f_{n}⟩/∂λ_{m} , (14)
defining the covariance functions whose generalizations appear throughout the theory; they represent the
correlation of fluctuations.
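The reciprocity in (14) is easy to verify numerically in the classical (commuting) setting; the two functions and the multiplier values below are arbitrary illustrative choices.

```python
import numpy as np

# Two hypothetical constrained quantities on a 5-state space
f1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
f2 = np.array([1.0, -1.0, 2.0, 0.0, 3.0])

def probs(l1, l2):
    """Canonical distribution (4) for two constraints."""
    w = np.exp(-l1 * f1 - l2 * f2)
    return w / w.sum()

def mean(g, l1, l2):
    return (probs(l1, l2) * g).sum()

l1, l2 = 0.3, 0.7          # arbitrary multiplier values
P = probs(l1, l2)

# Covariance <f1 f2> - <f1><f2> ...
cov = (P * f1 * f2).sum() - mean(f1, l1, l2) * mean(f2, l1, l2)

# ... equals -d<f1>/d lambda_2 = -d<f2>/d lambda_1, eq. (14),
# approximated here by central finite differences.
h = 1e-6
d12 = (mean(f1, l1, l2 + h) - mean(f1, l1, l2 - h)) / (2 * h)
d21 = (mean(f2, l1 + h, l2) - mean(f2, l1 - h, l2)) / (2 * h)
assert abs(cov + d12) < 1e-6
assert abs(cov + d21) < 1e-6
```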
In addition to the maximum property of S_{I} with respect to variations
in the probability distribution, the maximum entropy itself possesses a variational property of
some importance. If we vary the entropy in (6) with respect to all parameters in the problem, including
{λ_{r}} and {f_{r}}, we obtain an alternative to the derivation of (10):

(1/k) dS = Σ_{r} λ_{r} dQ_{r} = Σ_{r} λ_{r} Tr(f_{r} dρ) , (15)

where dQ_{r} is defined in (11). Hence, S is stationary with respect to small changes
in the entire problem if the distribution itself is held constant. The difference between the two types of
variational result is meaningful, as is readily seen by examining the second variations. For the case
of S we compute d²S from (13) and retain only first-order variations of the variables. If S is to be a maximum with respect
to variation of those constraints, then the desired stability or concavity condition is
(1/k) d²S ≈ dλ·d⟨f⟩ + da·d(∂ln Z/∂a) < 0 . (16)

We return to this presently, but it is precisely the condition employed by Gibbs (1875) to establish all his
stability conditions in thermodynamics.
So far there has been no mention of physics, but the foregoing expressions pertain to fixed constraints,
and therefore are immediately applicable to the case of thermal equilibrium and constants of the motion.
In the simplest case only a single operator is considered, f_{1} = H, the system Hamiltonian, and (4) becomes
the canonical distribution

ρ_{0} = (1/Z_{0}) e^{−βH} , Z_{0}(β) = Tr e^{−βH} , (17)

where β = (kT)^{−1}. When a is taken as the system volume V, (11) identifies the internal energy
U = ⟨H⟩, elements of heat dQ and work dW = ⟨dH⟩, and the pressure. Because classically
the Kelvin temperature is defined as an integrating factor for heat, T must be the absolute temperature and k
is Boltzmann's constant. The first term in (11) also expresses the first law of thermodynamics; while this cannot be derived from more
primitive dynamical laws, the relation arises here as a result of probable inference. If a second operator, f_{2} = N, the total
number operator, had been included, the grand canonical distribution would have been obtained in place of (17).
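For a finite spectrum the canonical distribution (17) is easy to explore numerically; the level energies and the value of β below are arbitrary, and k is set to 1.

```python
import numpy as np

E = np.array([0.0, 0.5, 1.3, 2.0])   # hypothetical energy levels
beta = 2.0                            # an arbitrary value of 1/kT (k = 1)

def lnZ0(b):
    """ln of the canonical partition function (17)."""
    return np.log(np.exp(-b * E).sum())

# Canonical probabilities and internal energy U = <H> ...
P = np.exp(-beta * E) / np.exp(-beta * E).sum()
U = (P * E).sum()

# ... agree with the identification U = -d lnZ0/d beta, cf. (5)
h = 1e-6
assert abs(U + (lnZ0(beta + h) - lnZ0(beta - h)) / (2 * h)) < 1e-8

# The covariance relation (14) here reads <H^2> - <H>^2 = -dU/dbeta,
# the familiar energy-fluctuation formula.
var_H = (P * E * E).sum() - U * U

def U_of(b):
    p = np.exp(-b * E) / np.exp(-b * E).sum()
    return (p * E).sum()

assert abs(var_H + (U_of(beta + h) - U_of(beta - h)) / (2 * h)) < 1e-6
```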
In this application to classical thermodynamics Eq. (13) takes on special significance, for it expresses the
entropy as an extensive function of extensive variables, provided the required condition is fulfilled. In all but the very
simplest models direct calculation of ln Z is not practicable, so one must pursue an indirect course. There may be
various ways to establish this condition, but with a = V the standard procedure is to demonstrate it in the infinite-volume
limit, where it is found to hold for many common Hamiltonians. But in some situations and for some systems it may
not be possible to verify the condition; then the theory is describing something other than classical thermodynamics.
The one remaining step needed to complete
the derivation of elementary equilibrium thermodynamics is to show that theoretical expectation values are equal
to measured values; the necessary conditions are discussed in the Appendix.
Nonequilibrium States
Although Gibbs was silent on exactly why he chose this variational principle, his intent was quite clear: to define and
construct a description of the equilibrium state. That is, the PME provides a criterion for that state. To continue in this
vein, then, we should seek to extend the principle to construction of an arbitrary nonequilibrium state. The procedure for
doing this, and its rationale, were outlined in II, where we noted that the main task in this respect is to gather
information that varies in both space and time and incorporate it into a density matrix describing a nonequilibrium state.
To illustrate the method of information gathering, consider a system with a fixed time-independent
Hamiltonian and suppose the data to be given over a
spacetime region R(x,t) in the form of an expectation value of a Heisenberg operator
F(x,t), which could, for example, be a density or a current.
We are reminded that the full equation of motion for such operators, if they
are also explicitly time varying, is

iℏ Ḟ = [F,H] + iℏ ∂_{t}F , (18)

and the superposed dot will always denote a total time derivative.
When the input data vary continuously over R
their sum becomes an integral and there is a distinct Lagrange multiplier for each spacetime point.
Maximization of the entropy subject to the constraint provided by that information leads to a
density matrix describing this macrostate:
ρ = (1/Z) exp[ −∫_{R} λ(x,t) F(x,t) d³x dt ] , (19a)

where

Z[λ(x,t)] = Tr exp[ −∫_{R} λ(x,t) F(x,t) d³x dt ] (19b)

is now the partition functional. The Lagrange multiplier function is identified as the solution
of the functional differential equation
⟨F(x,t)⟩ ≡ Tr[ρ F(x,t)] = −δ ln Z/δλ(x,t) , (x,t) ∈ R , (20)

and is defined only in the region R.
Note carefully that the data set denoted by ⟨F(x,t)⟩
is a numerical quantity that has been equated to an expectation value to incorporate it into a
density matrix. Any other operator J(x,t), including J = F, is determined at any other spacetime
point (x,t) as usual by
⟨J(x,t)⟩ = Tr[ρ J(x,t)] = Tr[ρ(t) J(x)] . (21)

That is, the system with fixed H still evolves unitarily from the initial nonequilibrium state (19); although
r surely will no longer commute with H, its eigenvalues nevertheless remain unchanged.
Inclusion of a number of operators F_{k}, each with its own information-gathering region R_{k} and
its own Lagrange multiplier function λ_{k}, is straightforward, and if the data are time independent
ρ can describe an inhomogeneous equilibrium system as discussed in connection with (II-10). The question
is sometimes raised concerning exactly which functions or operators should be included in the description of a macroscopic state, and
the short answer is: include all the relevant information available, for the PME will automatically eliminate that
which is redundant or contradictory. A slightly longer answer was provided by Jaynes (1957b)
in his second paper introducing information-theoretic
ideas into statistical mechanics. He defined a density matrix providing a definite probability
assignment for each possible outcome of an experiment as sufficient for that experiment. A density matrix
that is sufficient for all conceivable experiments on a system is called complete for that system. Both
sufficiency and completeness are defined relative to the initial information, and the existence of complete
density matrices presumes that all measurable quantities can be represented by Hermitian operators and that all
experimental measurements can be expressed in terms of expectation values. But even if one could in principle
employ a complete density matrix, it would be extremely awkward and inconvenient to do so in practice, for that
would require a much larger function space than necessary. If the system is nonmagnetic and there are no magnetic
fields present, then there is no point to including those coordinates in a description of the processes of
immediate interest, but only those that are sufficient in the present context. The great selfcorrecting feature of the PME
is that if subsequent predictions are not confirmed by experiment, then this is an indication that some relevant
constraints have been overlooked or, even better, that new physics has been uncovered.
The form (19) illustrates how r naturally incorporates memory effects while placing no restrictions on
spatial or temporal scales. But this density matrix is definitely not a function of
space and time; it merely provides an initial nonequilibrium distribution corresponding to data ⟨F(x,t)⟩, (x,t) ∈ R. Lack of any other information outside R (in the
future, say) may tend to render ρ less and less reliable, and the quality of predictions may
deteriorate. Barring any further knowledge of system behavior this deterioration represents a fading memory,
which becomes quite important if the system is actually allowed to relax from this state, for an experimenter
carrying out a measurement on an equilibrium sample cannot possibly know the history of everything that has been done
to it, so it is generally presumed that it has no memory. Relaxation processes will be discussed in Section 4 below.
Steady State Processes
With an understanding of how to construct a nonequilibrium state it is now possible to move on to the next stage:
steady-state systems, in which there may be steady currents but all variables remain time independent.
This is perhaps the best-understood nonequilibrium scenario, primarily because it shares with equilibrium the
property that it is stationary. But in equilibrium the Hamiltonian commutes with the density matrix, [H,ρ] = 0, which
implies that ρ also commutes with the time evolution operator. In the steady state, though, it is almost certain
that H will not commute with the operators in the exponential defining ρ, even if the expectation values
defining the state are time independent. While this time independence is a necessary condition, an additional criterion
is needed to guarantee stationarity, and that is provided by requiring that [H,ρ] = 0. In II it was shown that
this leads to the additional constraint that only the diagonal parts of the specified operators appear in the
steady-state density matrix, providing a theoretical definition of the steady state. By `diagonal part' of an
operator we mean that part that is diagonal in the energy representation, so that it commutes with H.
For present purposes the most useful expression of the diagonal part of an operator is
F^{d} = F + lim_{ε→0⁺} ∫_{0}^{∞} e^{−εt} ∂_{t}F(x,t) dt , ε > 0 , (22)

where the time dependence of F is unitary: F(t) = e^{itH/ℏ} F e^{−itH/ℏ}.
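In the energy representation the limit in (22) simply removes the matrix elements of F between states of unequal energy, leaving the part that commutes with H. A small numerical check of that characterization (the spectrum and operator below are random illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical Hamiltonian with a degenerate spectrum (energy representation)
E = np.array([0.0, 0.0, 1.0, 2.0, 2.0])
H = np.diag(E)

# A random Hermitian operator F
A = rng.normal(size=(5, 5))
F = A + A.T

# Diagonal part: keep only elements between states of equal energy,
# i.e. the part of F that commutes with H (cf. the limit in (22)).
mask = np.isclose(E[:, None], E[None, :])
Fd = np.where(mask, F, 0.0)

# F^d commutes with H, while F itself does not
assert np.allclose(H @ Fd - Fd @ H, 0.0)
assert not np.allclose(H @ F - F @ H, 0.0)
# and F^d preserves all diagonal (equilibrium) expectation values
assert np.allclose(np.diag(Fd), np.diag(F))
```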
The resulting steady-state density matrix is given by (II-12) and (II-13) and is found by a simple modification of
(19) above: remove all time dependence, including that in R, and replace F(x,t) by F^{d}(x); in addition,
a term βH is included in the exponentials to characterize an earlier equilibrium reference state.
Substitution of the resulting ρ_{ss} into (2) provides the maximum entropy of the stationary state:

(1/k) S_{ss} = ln Z_{ss}[β,λ(x)] + β⟨H⟩_{ss} + ∫ λ(x) ⟨F^{d}(x)⟩_{ss} d³x , (23)

and F is often specified to be a current.
This is the time-independent entropy of the steady-state process as it was established and has nothing to do
with the entropy being passed from source to sink; entropy generation takes place only at the boundaries and not internally.
Some applications of ρ_{ss} were given in II and others will be made below. No mention appears in II, however, of
conditions for stability of the steady state, so a brief discussion is in order here.
Schlögl (1971) has studied stability conditions for the steady state in some depth through consideration of the quantum
version of the information gain in changing from a steady-state density matrix ρ′ to another ρ,

L(ρ,ρ′) = Tr[ ρ (ln ρ − ln ρ′) ] , (24)

which is effectively the entropy produced in going from ρ′ to ρ.
He notes that L is a Liapunov function in that it is positive definite, vanishes only if ρ = ρ′, and has
a positive second-order variation. Pfaffelhuber (1977) demonstrates that the symmetrized version, which is more
convenient here,
L^{*}(ρ,ρ′) = (1/2) Tr[ (ρ − ρ′)(ln ρ − ln ρ′) ] , (25)

is an equivalent Liapunov function, and that its firstorder variation is given by
δL^{*} = (1/2) δ( Δλ Δ⟨F^{d}⟩ ) . (26)

The notation is that Δλ = λ − λ′, for example. If δ is taken as a time variation,
then Liapunov's theorem immediately provides a stability condition for the steady state,

(d/dt)( Δλ Δ⟨F^{d}⟩ ) ≤ 0 . (27)

Remarkably, (27) closely resembles the Gibbs condition (16) for stability of the equilibrium state, but in terms of
the differences Δλ and Δ⟨F^{d}⟩, as well as the Glansdorff-Prigogine criterion of phenomenological thermodynamics. But the merit of the present approach
is that L^{*} does not depend directly on the entropy and therefore encounters no ambiguities in defining a
nonequilibrium entropy.
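Both (24) and (25) are nonnegative and vanish only for ρ = ρ′, which is easy to check numerically for random density matrices (an illustrative sketch, not part of the formalism):

```python
import numpy as np

rng = np.random.default_rng(2)

def random_rho(n):
    """A random n x n density matrix: Hermitian, positive definite, unit trace."""
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    r = A @ A.conj().T + 0.1 * np.eye(n)   # shift keeps eigenvalues away from 0
    return r / np.trace(r).real

def logm_h(r):
    """Matrix logarithm of a positive-definite Hermitian matrix."""
    p, U = np.linalg.eigh(r)
    return U @ np.diag(np.log(p)) @ U.conj().T

def L(rho, rhop):
    """Information gain (24): Tr[rho (ln rho - ln rho')]."""
    return np.trace(rho @ (logm_h(rho) - logm_h(rhop))).real

def Lstar(rho, rhop):
    """Symmetrized version (25); equals (L(rho,rho') + L(rho',rho))/2."""
    return 0.5 * np.trace((rho - rhop) @ (logm_h(rho) - logm_h(rhop))).real

rho, rhop = random_rho(4), random_rho(4)
assert L(rho, rhop) > 0 and Lstar(rho, rhop) > 0     # positive for distinct states
assert abs(L(rho, rho)) < 1e-10 and abs(Lstar(rho, rho)) < 1e-10  # vanish at rho = rho'
```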
Thermal Driving
A variable, and therefore the system itself, is said to be thermally driven if no new variables other than
those constrained experimentally are needed to characterize the resulting state, and if the Lagrange
multipliers corresponding to variables other than those specified remain constant. ^{3}
As discussed in I, a major
difference from purely dynamic driving is that the thermally-driven density matrix is not constrained to evolve
by unitary transformation alone.
It was argued in I and II that a general theory of nonequilibrium must necessarily account explicitly for external sources,
since it is only through them that the macroscopic constraints on the system can change. With that discussion as background,
let us suppose that a system is in thermal equilibrium with time-independent
Hamiltonian in the past, and that at t = 0 a
source is turned on smoothly and specified to run continuously, as described by its effect on the expectation value
⟨F(t)⟩. That is, ⟨F(t)⟩ is given throughout the changing interval [0,t] and is
specified to continue to change in a known way until further notice.
We omit spatial dependence here in the interest of clarity, noting that the following equations are
generalized to arguments (x,t) in (II-41)-(II-49).
For convenience we consider only a single driven operator; multiple operators, both driven and otherwise,
are readily included. Based on the probability model of I, the PME then provides the density matrix for thermal
driving:

ρ_{t} = (1/Z_{t}) exp[ −βH − ∫_{0}^{t} λ(t′)F(t′) dt′ ] , Z_{t} = Tr exp[ −βH − ∫_{0}^{t} λ(t′)F(t′) dt′ ] . (28)
The theoretical maximum entropy is obtained explicitly by substitution of (28) into (2),

(1/k) S_{t} = ln Z_{t} + β⟨H⟩_{t} + ∫_{0}^{t} λ(t′)⟨F(t′)⟩_{t} dt′ ; (29)

it is the continuously re-maximized information entropy.
Equation (29) indicates explicitly that ⟨H⟩_{t} changes only as a result of changes in, and correlation with, F.
The expectation value of another operator at time t is ⟨C⟩_{t} = Tr[ρ_{t} C], and
direct differentiation yields

(d/dt)⟨C(t)⟩_{t} = Tr[ C(t) ∂_{t}ρ_{t} + ρ_{t} Ċ(t) ] = ⟨Ċ(t)⟩_{t} − λ(t) K_{CF}^{t}(t,t) , (30)

where the superposed dot denotes a total time derivative.
We have here introduced the covariance function

K_{CF}^{t}(t′,t) ≡ ⟨F̄(t′) C(t)⟩_{t} − ⟨F(t′)⟩_{t}⟨C(t)⟩_{t} = −δ⟨C(t)⟩_{t}/δλ(t′) , (31)

where the overline denotes a generalized Kubo transform with respect to the operator ln ρ_{t}:

F̄(t) ≡ ∫_{0}^{1} e^{u ln ρ_{t}} F(t) e^{−u ln ρ_{t}} du , (32)

which arises from the possible noncommutativity of F(t) with itself at different times. The
superscript t in K_{CF}^{t} implies that the density matrix ρ_{t} is employed everywhere on the
right-hand side of the definition, including the Kubo transform; this is necessary to distinguish it from
several approximations.
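In the eigenbasis of ρ the integrand of (32) is elementwise F_{mn}(p_{m}/p_{n})^{u}, so the transform can be evaluated by simple quadrature. The sketch below (random ρ and F, purely illustrative) verifies two basic properties: the Kubo transform preserves expectation values, ⟨F̄⟩ = ⟨F⟩, and acts as the identity on operators commuting with ρ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A random density matrix rho (positive definite, unit trace) and Hermitian F
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
rho = A @ A.conj().T
rho /= np.trace(rho).real
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
F = B + B.conj().T

# Kubo transform (32): Fbar = integral_0^1 rho^u F rho^(-u) du,
# evaluated by trapezoidal quadrature in the eigenbasis of rho.
p, U = np.linalg.eigh(rho)
def rho_pow(u):
    return U @ np.diag(p**u) @ U.conj().T

us = np.linspace(0.0, 1.0, 2001)
du = us[1] - us[0]
vals = np.array([rho_pow(u) @ F @ rho_pow(-u) for u in us])
Fbar = du * (0.5 * (vals[0] + vals[-1]) + vals[1:-1].sum(axis=0))

# The transform leaves expectation values unchanged: Tr(rho Fbar) = Tr(rho F)
assert abs(np.trace(rho @ Fbar) - np.trace(rho @ F)) < 1e-6

# and reduces to the identity on anything commuting with rho
G = rho_pow(0.5)   # a function of rho, so [G, rho] = 0
Gvals = np.array([rho_pow(u) @ G @ rho_pow(-u) for u in us])
Gbar = du * (0.5 * (Gvals[0] + Gvals[-1]) + Gvals[1:-1].sum(axis=0))
assert np.allclose(Gbar, G)
```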
In II we introduced a new notation into (30), which at first appears to be only a convenience:

σ_{C}(t) ≡ (d/dt)⟨C(t)⟩_{t} − ⟨Ċ(t)⟩_{t} = −λ(t) K_{CF}^{t}(t,t) . (33)

For C = F this becomes

σ_{F}(t) ≡ (d/dt)⟨F(t)⟩_{t} − ⟨Ḟ(t)⟩_{t} = −λ(t) K_{FF}^{t}(t,t) , (34)
which was seen to have the following interpretation: σ_{F}(t) is the rate at which F is
driven or transferred by the external source, d⟨F(t)⟩_{t}/dt is the total time rate-of-change
of ⟨F(t)⟩_{t} in the system at time t, and ⟨Ḟ(t)⟩_{t}
is the rate of change produced by internal relaxation. Thus, we can turn the scenario around and take the source as
given and predict ⟨F(t)⟩_{t}, which is the more likely experimental arrangement. This reversal of viewpoint
is much like that associated with (4), suggesting that one could as well consider λ the independent
variable, rather than ⟨f⟩, as was discussed in connection with (II-10); in fact, this is usually what is done in
practice in applications of (17), where the temperature is specified.
If the source strength is given, then the second line of (34) provides a nonlinear transcendental equation determining
the Lagrange multiplier function λ(t).
An important reason for eventually including spatial dependence is that we can now derive the
macroscopic equations of motion. For example, if F(t) is
one of the conserved densities e(x,t) in a simple fluid, such as those in (II-66), and J(x,t)
is the corresponding current density, then the local microscopic continuity equation

ė(x,t) + ∇·J(x,t) = 0 (35)

is satisfied irrespective of the state of the system. When this is substituted into (34) we obtain
the macroscopic conservation law

(d/dt)⟨e(x,t)⟩_{t} + ∇·⟨J(x,t)⟩_{t} = σ_{e}(x,t) . (36)

Specification of sources automatically provides the thermokinetic equations of motion, and in II it was shown how all
these expressions reduce to those of the steady state when the driving rate is constant.
Everything to this point is nonlinear, but in many applications some sort of approximation becomes necessary, and
often sufficient, for extracting the desired physical properties of a particular model. The most common procedure is
to linearize the density matrix in terms of the departure from equilibrium, which means that the integrals in (28),
for example, are in some sense small. The formalism for this was discussed briefly in II and a systematic exposition can be
found elsewhere (Heims and Jaynes, 1962; Grandy, 1988). In linear approximation the expectation value of any operator
C(x,t) is given by

⟨C(x,t)⟩ − ⟨C⟩_{0} = −∫ K_{CF}(x,t;x′,t′) λ(x′,t′) d³x′ dt′ , (37)

K_{CF}(x,t;x′,t′) = ⟨F̄(x′,t′) C(x,t)⟩_{0} − ⟨F(x′)⟩_{0}⟨C(x)⟩_{0} , (38)

where we have reinserted the spatial dependence.
where we have reinserted the spatial dependence. The integration limits in (37) have been omitted deliberately
so that the general form applies to any of the preceding scenarios.
The subscripts 0 indicate that all expectation values on the right-hand sides of (37) and (38)
are to be taken with the equilibrium distribution (17), including the linear covariance function K_{CF} = K^{0}_{CF}
and the Kubo transform (32); K_{CF} is independent of λ. It may be useful to note that, rather than
linearize about equilibrium, the same procedure can also be used to linearize about the steady state.
The spacetime transformation properties of the linear covariance function (38) are of some importance in later
applications, so it is worth a moment to examine them. We generally presume time and space translation invariance
in the initial homogeneous equilibrium system, such that the total energy and number operators of (II-67), as well
as the total momentum operator P, commute with one another. In this system these translations are generated,
respectively, by the unitary operators
U(t) = e^{−itH/ℏ} , U(x) = e^{−ix·P/ℏ} , (39)

and F(x,t) = U^{†}(x) U^{†}(t) F U(t) U(x). Translation invariance, along with (32) and
cyclic invariance of the trace, provides two further simplifications: the single-operator expectation
values are independent of x and t in an initially homogeneous system, and the arguments of K_{CF} can now be
taken as r = x − x′, τ = t − t′.
Generally, the operators encountered in covariance functions possess definite transformation
properties under space inversion (parity) and time reversal. Under the former A(r,τ)
becomes P_{A}A(−r,τ), P_{A} = ±1, and under the latter T_{A}A(r,−τ), T_{A} = ±1.
Under these operations the covariance function (38) behaves as follows:

K_{CF}(r,τ) = K_{FC}(−r,−τ) = P_{C}P_{F} K_{FC}(r,−τ) = T_{C}T_{F} K_{FC}(−r,τ) = P_{C}P_{F}T_{C}T_{F} K_{FC}(r,τ) , (40)

where the first equality again follows from cyclic invariance.
For many operators, including those describing a simple fluid, PT = +1 and the full reciprocity
relation holds:

K_{CF}(r,τ) = K_{FC}(r,τ) . (41)

In fact, by changing integration variables in (32) it is easy to show that the nonlinear covariance function (31) also
satisfies a reciprocity relation: K^{t}_{CF}(x′,t′;x,t) = K^{t}_{FC}(x,t;x′,t′).
One further property of linear covariance functions will be found useful. Consider the spatial Fourier transform,
in which we examine the limit k = |k| → 0:

lim_{k→0} K_{ab}(k,τ) = lim_{k→0} ∫ e^{ik·r} K_{ab}(r,τ) d³r = ∫ K_{ab}(r,τ) d³r . (42)

That is, taking the limit is equivalent to integrating over the entire volume. But this is also the
long-wavelength limit, in which the wavelengths of slowly-decaying modes span the entire volume.
Suppose now that a is a locally-conserved density, such as those describing a simple fluid, whose volume integral A
is then a conserved quantity commuting with the Hamiltonian in the equilibrium system. In this event (39) implies that
the left-hand side of (42) reduces to K_{Ab}(0,0), independent of
space, time, and Kubo transform; the covariance function has become a constant correlation function, as in (14), and is
just a thermodynamic quantity.
2. Nonequilibrium Thermodynamics
In equilibrium thermodynamics everything starts with the entropy, as in (10), and the same is true here. The instantaneous
maximum entropy of thermal driving is exhibited in (29),
and with S_{t} now a function of time one can compute its total time derivative as

(1/k) dS_{t}/dt = (∂ln Z_{t}/∂a) ȧ + β d⟨H⟩_{t}/dt − λ(t) ∫_{0}^{t} λ(t′) K^{t}_{FF}(t,t′) dt′ , (43)

the spatial variables again being omitted temporarily. Because H is not explicitly driven, its Lagrange multiplier
remains the equilibrium parameter β.
With a = V, the system volume, the equilibrium expressions (8) and (11) identified the term in ln Z as a work
term. In complete analogy, the first term on the right-hand side of (43) is seen to be a power term when the volume
is changing; one identifies the time-varying pressure by writing this derivative as ∂ln Z_{t}/∂V = βP(t).
Commonly the volume is held constant and the term containing the Hamiltonian is written out explicitly, so that (43) becomes

(1/k) dS_{t}/dt = −βλ(t) K^{t}_{HF}(t,0) − λ(t) ∫_{0}^{t} λ(t′) K^{t}_{FF}(t,t′) dt′ = γ_{F}(t) σ_{F}(t) , (44)
where we have employed (34) and defined a new parameter

γ_{F}(t) ≡ β K^{t}_{HF}(t,0)/K^{t}_{FF}(t,t) + ∫_{0}^{t} λ(t′) [K^{t}_{FF}(t,t′)/K^{t}_{FF}(t,t)] dt′ = (1/k) (dS_{t}/d⟨F(t)⟩_{t})_{thermal driving} , (45)
as discussed in II.
The subscript `thermal driving' reminds us that this derivative is evaluated somewhat differently than in the equilibrium
formalism because the expectation values of H and F are not independent here.
When the source strength σ_{F}(t) is specified, the Lagrange multiplier itself is determined from (34) and
γ_{F} is interpreted as a transfer potential governing the transfer of F to or from the system.
If two systems can exchange quantities F_{i} under thermal driving, then the conditions for
migrational equilibrium at time t are

γ_{F_i}(t)|_{1} = γ_{F_i}(t)|_{2} . (46)

In II it was noted that S_{t} refers only to the information
encoded in the distribution of (28) and cannot refer to the internal entropy of the system. In equilibrium
the maximum of the information entropy is the same as the experimental entropy, but that is not necessarily
the case here.
For example, if the driving is removed at time t = t_{1}, then S_{t_1} in (29) can only provide the entropy
of that nonequilibrium state at t = t_{1}; its value will remain the same during subsequent relaxation, owing to unitary
time evolution. Although the maximum information (or theoretical) entropy provides a complete description of the system
based on all known physical constraints on that system, it cannot describe the ensuing relaxation, for it contains
no new information about that process. We return to this in Section 4 below.
Combination of (34) and the second line of (43) strongly suggests the natural expression

(1/k) Ṡ_{t} = γ_{F}(t) [ (d/dt)⟨F(t)⟩_{t} − ⟨Ḟ(t)⟩_{t} ] , (47)

in which the first term on the right-hand side represents the total time rate-of-change of entropy
arising from the thermal driving of F(t), whereas the second term is the rate-of-change of internal entropy
owing to relaxation. Thus, the total rate of entropy production in the system can be written

Ṡ_{tot}(t) = Ṡ_{t} + Ṡ_{int}(t) , (48)

where the entropy production of transfer owing to the external source, Ṡ_{t}, is given by the first line of (44).
This latter
quantity is a function only of the driven variable F(t), whereas the internal entropy depends on all variables, driven or not,
necessary to describe the nonequilibrium state and is determined by the various relaxation
processes taking place in the system. If spatial variation is included the righthand side of (47) is integrated over the
system volume.
It is important to understand very clearly the meaning of Eq.(48), so we restate more carefully the interpretation of each term.
From (44),
is the rate of change of the entropy of the macroscopic state of the system due to the
source alone; it involves the maximum of the information entropy and is associated entirely with the source. The term
is the contribution to the rate at which the entropy of that state is changing due to relaxation mechanisms within the
system itself. Thus,
is the total rate of change of the entropy of the macroscopic state. When the
source is removed
, and
is the rate at which the entropy of the macroscopic state
changes due to internal relaxation processes. Thus, entropy is always associated with a macroscopic state and its
rate of change under various processes; the entropy of the surroundings does not enter into this discussion. Equation (48)
does not apply to steadystate processes, in which no entropy is generated internally.
Linear Heating
As a specific example it is useful to make contact with classical thermodynamics and choose the driven variable to be the total-energy function for the system, E(t).^{4} In the presence of external forces this quantity is not necessarily the Hamiltonian, but can be defined in the same way as H in the isolated system, (II67), in terms of the energy density operator:
$$E(t) \equiv \int_V h(x,t)\,d^3x\,, \qquad (49)$$

and the time evolution is no longer unitary. The point is that H does not change in time, only its expectation value. In the case of pure heating the driven variable is E(t) itself, and (43) and (44) become, respectively,
$$\frac{1}{k}\,\dot S_t = g_E(t)\,s_E(t)\,, \qquad (50a)$$
$$g_E(t) = \beta\,\frac{K^t_{HE}(t,0)}{K^t_{EE}(t,t)} + \int_0^t \lambda(t')\,\frac{K^t_{EE}(t,t')}{K^t_{EE}(t,t)}\,dt'\,. \qquad (50b)$$

The dimension of $g_E(t)$ is (energy)$^{-1}$, so it is reasonable to interpret this transfer parameter as a time-dependent `inverse temperature' $\beta(t)=[kT(t)]^{-1}$; the temperature must change continuously as heat is added to or removed from the system, though it is difficult to define a measurable quantity like this globally. Hence, in analogy with the equilibrium form $dS = dQ/T$, Eq.(10), the content of (50a) is that
$$\dot S_t = k\,g_E(t)\,\dot Q\,, \qquad (51)$$
because the rate of external driving is just $s_E(t)=\dot Q$ here. A further analogy, this time with (11), follows from the first line of (34), which extends the First Law to
$$\frac{d}{dt}\langle E(t)\rangle_t = \dot Q(t)\,, \qquad (52)$$
because any work done in this scenario would change only the internal energy. These last two expressions are remarkably like those advocated by Truesdell (1984) in his development of Rational Thermodynamics, and are what one might expect from naïve time differentiation of the corresponding equilibrium expressions. Indeed, such an extrapolation may provide a useful guide to nonequilibrium relations, but in the end only direct derivation from a coherent theory should be trusted.
In (48) the term $\dot S_{\rm int}$ is positive semidefinite, for it corresponds to the increasing internal entropy of relaxation; this is demonstrated explicitly in Section 4.
Combination with (51) then allows one to rewrite (48) as an inequality:
$$\dot S_{\rm tot} = k\,g_E\,\dot Q + \dot S_{\rm int} \ \ge\ k\,g_E\,\dot Q\,. \qquad (53)$$
One hesitates to refer to this expression as an extension of the Second Law, for such a designation is fraught with ambiguity; the latter remains a statement about the entropies of two equilibrium states. Instead, it may be more prudent to follow Truesdell in referring to (53) as the Clausius-Planck inequality.
A linear approximation in (50b), as described by (37) and (38), leads to considerable simplification, after which that expression becomes
$$\beta(t) \simeq \beta + \int_0^t \lambda(t')\,\frac{K_{EE}(t-t')}{K_{HH}}\,dt'\,, \qquad (54)$$
while recalling that $\beta=\beta(0)$ defines the equilibrium temperature. The static covariance function is now just an
equilibrium thermodynamic function proportional to the energy fluctuations (and hence to the heat capacity).
In this approximation the expectation value of the driven energy function is
$$\langle E(t)\rangle_t \simeq \langle E\rangle_0 - \int_0^t \lambda(t')\,K_{EE}(t-t')\,dt'\,, \qquad (55)$$
so that if, for example, energy is being transferred into the system ($s_E > 0$), then the integral must be negative. We can then write
$$\beta(t) \simeq \beta - \left|\,\int_0^t \lambda(t')\,\frac{K_{EE}(t-t')}{K_{HH}}\,dt'\,\right|, \qquad (56)$$
and $\beta(t)$ decreases from the equilibrium value. The physical content of (56) therefore is that $T(t) \simeq T(0)+\Delta T(t)$, as expected. Although it is natural to interpret T(t) as a `temperature', we are cautioned that only at t=0 is that interpretation unambiguous.
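The linear-heating relation (54) is a simple convolution and can be explored numerically. In the sketch below the exponential memory kernel and the identification $s_E(t)=-\lambda(t)K_{HH}$ of source and multiplier are model assumptions, not results of the paper; the point is only that a positive (heating) source drives $\beta(t)$ down, i.e., $T(t)$ up, as (56) states.

```python
import numpy as np

# Sketch of Eq. (54): beta(t) ~ beta0 + int_0^t lambda(t') K_EE(t-t')/K_HH dt'.
# Model assumptions (not from the paper): s_E(t) = -lambda(t) K_HH, so a
# positive heating source gives a negative multiplier, and an exponential
# memory kernel K_EE(tau) = K_HH * exp(-tau/tau_c).

def beta_of_t(s_E, beta0, K_HH, tau_c, t_grid):
    """Evaluate the convolution in (54) on a uniform grid (trapezoid rule)."""
    dt = t_grid[1] - t_grid[0]
    lam = -s_E(t_grid) / K_HH          # Lagrange multiplier from the source
    beta = np.empty_like(t_grid)
    for i, t in enumerate(t_grid):
        kern = np.exp(-(t - t_grid[:i + 1]) / tau_c)   # K_EE(t-t')/K_HH
        beta[i] = beta0 + np.trapz(lam[:i + 1] * kern, dx=dt)
    return beta

t = np.linspace(0.0, 5.0, 501)
beta = beta_of_t(lambda tt: np.full_like(tt, 0.2), beta0=1.0, K_HH=1.0,
                 tau_c=1.0, t_grid=t)
# Heating (s_E > 0) must lower beta(t), i.e. raise the temperature.
print(beta[0], beta[-1] < beta[0])
```

For a constant source the integral can also be done in closed form, which provides a direct check on the quadrature.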
A complementary calculation is also of interest, in which spatial variation is included and the homogeneous system is driven from equilibrium by a source coupled to the energy density h(x,t). In linear approximation the process is described by
$$\langle h(x,t)\rangle_t - \langle h\rangle_0 = -\int_V d^3x' \int_0^t dt'\,\lambda(x',t')\,K_{hh}(x-x',t-t')\,, \qquad (57a)$$
$$s_h(x,t) = -\int_V \lambda(x',t)\,K_{hh}(x-x',t=0)\,d^3x'\,. \qquad (57b)$$

After a well-defined period of driving the source is removed, the system is again isolated, and we expect it to relax to equilibrium (see Section 4); the upper limit of integration in (57a) is now a constant, say $t_1$. Presumably this is a reproducible process.
For convenience we take the system volume to be all space and integrate Eqs.(57) over the entire volume, thereby converting the densities into total Hamiltonians. Owing to spatial uniformity and cyclic invariance the covariance function in (57a) is then independent of the time (the evolution being unitary after removal of the source), and we shall denote the volume integral of $s_h(x,t)$ by $s_h(t)$.
Combination of the two equations for $t > t_1$ yields
$$\langle H\rangle - \langle H\rangle_0 = \int_0^{t_1} s_h(t')\,dt'\,, \qquad t > t_1\,, \qquad (58)$$
which is independent of time and identical to (55) at $t=t_1$. The total energy of the new equilibrium state is now known, and the density matrix describing that state can be constructed via the PME.
The last few paragraphs provide a formal description of slowly heating a pot of water on a stove, but in reality much
more is going on in that pot than simply increasing the temperature. Experience tells us that the number density is
also varying, though N/V is constant (if we ignore evaporation), and a proper treatment ought to include both densities.
But thermal driving of
h(x,t) requires that n(x,t) is explicitly not driven, changing only as a result of changes in h,
through correlations. The proper density matrix describing this model ^{5} is
$$\rho_t = \frac{1}{Z_t}\,\exp\left[-\beta H - \int d^3x' \int_0^t \Big[\lambda_h(x',t')\,h(x',t') + \lambda_n(x',t')\,n(x',t')\Big]\,dt'\right], \qquad (59)$$
and the new constraint is expressed by the generalization of (34) to the set of equations
$$s_h(x,t) = -\int \lambda_h(x',t)\,K_{hh}(x-x';0)\,d^3x' - \int \lambda_n(x',t)\,K_{hn}(x-x';0)\,d^3x'\,,$$
$$0 = -\int \lambda_h(x',t)\,K_{nh}(x-x';0)\,d^3x' - \int \lambda_n(x',t)\,K_{nn}(x-x';0)\,d^3x'\,, \qquad (60)$$
asserting explicitly that $s_n \equiv 0$.
In this linear approximation $\lambda_n$ is determined by $\lambda_h$, and we can now carry out the spatial Fourier transformations in (60). The source strength driving the heating is thus
$$s_h(k,t) = -\lambda_h(k,t)\,K_{hh}(k,0)\left[1 - \frac{K_{nh}(k,0)^2}{K_{hh}(k,0)\,K_{nn}(k,0)}\right], \qquad (61)$$
where the t=0 values in the covariance functions refer to equal times. For Hermitian operators the covariance functions satisfy a Schwarz inequality, so that the ratio in square brackets in this last expression is always less than or equal to unity; hence the driving strength is generally reduced by the no-driving constraint on n(x,t).
The expression (61) is somewhat awkward as it stands, so it is convenient to introduce a new variable, or operator,
$$h'(k,t) \equiv h(k,t) - \frac{K_{nh}(k,0)}{K_{nn}(k,0)}\,n(k,t)\,. \qquad (62)$$
Some algebra then yields in place of (61)
$$s_h(k,t) = -\lambda_h(k,t)\,K_{h'h'}(k,0)\,. \qquad (63)$$
In the linear case, at least, it is actually $h'$ that is the driven variable under the constraint that n is not driven, and the source term has been renormalized.
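Both claims above — the Schwarz bound on the bracket in (61) and the equivalence of (61) and (63) through the variable (62) — are easy to confirm numerically, with sample covariances standing in for the equal-time covariance functions (an illustrative stand-in, not the paper's operators):

```python
import numpy as np

# Numerical check of Eqs. (61)-(63): the bracket 1 - K_nh^2/(K_hh K_nn)
# lies in [0, 1] (Schwarz inequality), and it equals K_h'h'/K_hh for the
# shifted variable h' = h - (K_nh/K_nn) n of Eq. (62).
# Correlated Gaussian samples stand in for the equal-time covariances.

rng = np.random.default_rng(1)
n = rng.normal(size=10_000)
h = 0.7 * n + rng.normal(size=10_000)      # correlated 'energy' samples

K_nn = np.cov(n, n)[0, 1]
K_hh = np.cov(h, h)[0, 1]
K_nh = np.cov(n, h)[0, 1]

bracket = 1.0 - K_nh**2 / (K_hh * K_nn)    # reduction factor in (61)

h_prime = h - (K_nh / K_nn) * n            # the variable defined in (62)
K_hphp = np.cov(h_prime, h_prime)[0, 1]

print(0.0 <= bracket <= 1.0, np.isclose(K_hphp, K_hh * bracket))
```

The identity $K_{h'h'} = K_{hh} - K_{nh}^2/K_{nn}$ holds exactly for sample covariances, so the second check is algebraic rather than statistical.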
With (63) the two expectation values of interest are
$$\langle \Delta h'(k,t)\rangle_t = \int_0^t s_h(k,t')\,\frac{K_{h'h'}(k,t-t')}{K_{h'h'}(k,0)}\,dt'\,, \qquad (64)$$
$$\langle \Delta n(k,t)\rangle_t = \int_0^t s_h(k,t')\,\frac{K_{nh'}(k,t-t')}{K_{h'h'}(k,0)}\,dt'\,. \qquad (65)$$
Thus, the number density changes only as a consequence of changes in the energy density. The reader will have no difficulty finding explicit expressions for the new covariance functions in these equations, as well as showing that total particle number is conserved in the process, as expected.
This Section provides explicit procedures for carrying out some calculations in nonequilibrium thermodynamics. Undoubtedly one can develop many more applications of this kind along the lines suggested by Truesdell (1984), and in discussions of so-called extended irreversible thermodynamics (e.g., Jou et al., 2001).
3. Transport Processes and Hydrodynamics
Linear transport processes in a simple fluid were discussed briefly in II in terms of the locally conserved number density n(x,t), energy density h(x,t), and momentum density mj(x,t), where m is the particle mass and j the current in the fluid that was initially homogeneous; the associated local equations of motion (continuity equations) of the type (35) are given in (II66). System response to local excesses of these densities was studied in the long-wavelength limit that essentially defines hydrodynamics, and a generic expression for the steady-state expectation values in the perturbed system, in linear approximation, was presented in (II71). This expression represents the leading term in an expansion in powers of the Lagrange multiplier $\lambda$, which in this scenario eventually becomes a gradient expansion. In the cases of the densities n and h the procedure led to Fick's law of diffusion and Fourier's law of heat conduction, respectively:

$$\langle j(x)\rangle_{ss} = -\left[\frac{\displaystyle\int_0^\infty e^{-\epsilon t}\,dt \int_v K_{jj}(x-x',t)\,d^3x'}{\displaystyle\int_v K_{nn}(x-x')\,d^3x'}\right]\cdot\nabla\langle n(x)\rangle_{ss} \equiv -D\cdot\nabla\langle n(x)\rangle_{ss}\,, \qquad (66)$$
$$\langle q(x)\rangle_{ss} \simeq -\nabla T\cdot\int_v d^3x' \int_0^\infty e^{-\epsilon t}\,\frac{K_{qq}(x-x',t)}{kT^2(x')}\,dt \equiv -\kappa\cdot\nabla T\,, \qquad (67)$$
where j is a particle current, q a heat current, and v is the volume outside of which the covariance function vanishes.
The diffusion tensor D and the thermal conductivity tensor $\kappa$ are in many cases considered scalar constants, but both are easily extended to time-dependent coefficients by relaxing the steady-state constraint. Correlation-function expressions of the type developed here have been found by a number of authors over the years and are often known as Green-Kubo coefficients. Although they are usually obtained by contrived methods, rather than as a straightforward result of probability theory, those results clearly exhibited the right instincts for how the dissipative parameters in the statistical theory should be related to the microscopic physics.
If the spatial correlations in (66) are long range, then $v\to V$ and the discussion following (42) implies that $K_{jj}$ becomes independent of time. In this event the time integral diverges and D does not exist. A similar divergence arises in both (66) and (67) if the time correlations decay sufficiently slowly, possibly indicating the onset of new phenomena such as anomalous diffusion. Of course, one is not obliged to make the long-wavelength approximation, and in these cases it may not be prudent to do so. These observations remain valid even when the processes are not stationary.
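A Green-Kubo coefficient of this type can be estimated by integrating a modeled current autocorrelation function with the convergence factor $e^{-\epsilon t}$. The exponential kernel below is our model assumption, for which the $\epsilon\to 0$ value is $K_0\tau$; a non-decaying kernel would instead make the integral diverge as $1/\epsilon$, illustrating the failure discussed above.

```python
import numpy as np

# Sketch of a Green-Kubo estimate: a transport coefficient is the time
# integral of a current autocorrelation function, with the convergence
# factor exp(-eps*t) of Eqs. (66)-(67).  The exponentially correlated
# kernel K(t) = K0*exp(-t/tau) is a model assumption; its eps -> 0
# integral is K0*tau.

def green_kubo(K, t_grid, eps=1e-3):
    """Integrate exp(-eps t) K(t) over the grid (trapezoid rule)."""
    return np.trapz(np.exp(-eps * t_grid) * K(t_grid), t_grid)

t = np.linspace(0.0, 60.0, 6001)
K0, tau = 2.0, 1.5
D = green_kubo(lambda tt: K0 * np.exp(-tt / tau), t)
print(D)   # approaches K0*tau for small eps
```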
The same procedure is readily generalized to more complicated processes such as thermal diffusion, thermoelectricity,
and so on. For example, in a system of particles each with electric charge e, and in the presence of a static external electric field derived from a potential per unit charge $\phi(x)$, the appropriate Lagrange multiplier is no longer the chemical potential $\mu$, but the electrochemical potential
$$\psi(x,t) = \mu(x,t) + e\,\phi(x,t)\,. \qquad (68)$$

For a steady-state process in the long-wavelength limit and linear approximation one finds for the electric current j and the heat current q the set of coupled equations
$$\langle j(x)\rangle_{ss} = -\left[\frac{1}{ekT}\,\nabla\psi\cdot L_{jj}(x) + \frac{1}{kT^2}\,\nabla T\cdot L_{jq}(x)\right],$$
$$\langle q(x)\rangle_{ss} = -\left[\frac{1}{ekT}\,\nabla\psi\cdot L_{qj}(x) + \frac{1}{kT^2}\,\nabla T\cdot L_{qq}(x)\right], \qquad (69)$$
where the second-rank tensors can be written generically as
$$L_{AB}(x) = \lim_{\epsilon\to 0^+}\,\int_0^\infty e^{-\epsilon t}\,dt \int_v \big\langle\,\overline{\Delta B}(x')\,\Delta A(x,t)\,\big\rangle_0\,d^3x'\,. \qquad (70)$$

These are the thermoelectric coefficients: $L_{jj}$ is proportional to D and $L_{qq}$ is proportional to $\kappa$, whereas $L_{jq}$ is the Seebeck coefficient and $L_{qj}$ the Peltier coefficient. In an isotropic medium the symmetry properties of the covariance functions imply the Onsager-like reciprocity relation $L_{jq}=L_{qj}$, but this clearly depends strongly on the space-time behavior of the operators involved, as well as the specific system under study. An in-depth critique of Onsager reciprocity has been provided by Truesdell (1984).
It remains to examine the momentum density in the simple fluid; although the analysis of transport coefficients is readily applied to other systems, the fluid presents a particularly clear model for this exposition. Within the steady-state context, the linear approximation, and the long-wavelength limit, (II71) for a perturbation in mj leads to the following expression for the expectation of another operator C(x):
$$\langle \Delta C(x)\rangle_{ss} = -\,m\int_V \lambda_i(x')\,K_{Cj_i}(x-x')\,d^3x' + \big(\nabla_k\lambda_i\big)\int_v d^3x' \int_0^\infty e^{-\epsilon t}\,K_{CT_{ik}}(x-x',t)\,dt\,, \qquad (71)$$
where here and subsequently we adopt the notation $\Delta C(x)=C(x)-\langle C\rangle_0$, and the limit $\epsilon\to 0^+$ is understood. In addition, sums over repeated indices are implied.
First consider C as the total momentum operator P. Then, in the given scenario, we know that $\big\langle\,\overline{j_j}(x')\,P_i(x)\,\big\rangle_0 = (n_0/\beta)\,\delta_{ij}$, where $n_0$ and $\beta$ are equilibrium values, and $K_{PT_{ik}}(x-x',t)=0$ from the symmetry properties (40). Hence (71) reduces to

$$\int_V \langle mj_i(x)\rangle_{ss}\,d^3x = -\,\frac{n_0\,m}{\beta}\int_V \lambda_i(x)\,d^3x\,. \qquad (72)$$
A convenient notation emerges by defining a fluid velocity $v_i(x)$ by writing $\langle mj_i(x)\rangle_{ss}=mn_0v_i(x)$, so that the Lagrange multiplier is identified as $\lambda=-\beta v$.
Now take C in (71) to be a component of the energy-momentum tensor. By the usual symmetry arguments,
$$\langle T^{ki}(x)\rangle_{ss} = \langle T^{ki}\rangle_0 + \nabla_m\lambda_n \int_0^\infty e^{-\epsilon t}\,dt \int_v K_{ki,mn}(x-x',t)\,d^3x'\,, \qquad (73)$$
where the covariance function is that of $T^{mn}$ and $T^{ki}$, and the equilibrium expectation value $\langle T^{ki}\rangle_0=P_0\,\delta^{ki}$ is given by the hydrostatic pressure. With the identification of $\lambda$, (73) is the general equation defining a fluid; in fact, we can now show that it is a Newtonian fluid.
There are numerous properties of $T^{mn}$ and the covariance functions involving it that are needed at this point; they will simply be stated here and can be verified elsewhere (e.g., Puff and Gillis, 1968). As expected, $T^{mn}$ is symmetric and the space integral in (71) can be carried out immediately; the space-integrated covariance function has only a limited number of components in the isotropic equilibrium medium, in that nonzero values can only have indices equal in pairs, such as $\langle T_{12}T_{12}\rangle_0$ and $\langle T_{22}T_{33}\rangle_0$; in addition, the space-integrated tensor itself has the property
$$T_{11}=T_{22}=T_{33}=\tfrac{1}{3}\,T\,, \qquad (74)$$
where T is the tensor trace. If we adopt the notation $v_{i,k} \equiv \partial v_i/\partial x_k$ and the convention that sums are performed over repeated indices, then (73) can be rewritten as
$$\langle T_{ki}(x)\rangle_{ss} = P_0\,\delta_{ki} - \eta\left(v_{k,i}+v_{i,k}-\tfrac{2}{3}\,v^l_{\ ,l}\,\delta_{ki}\right) - \zeta\,v^l_{\ ,l}\,\delta_{ki}\,, \qquad (75)$$

where we have identified the shear viscosity
$$\eta \equiv \frac{\beta}{V}\int_0^\infty e^{-\epsilon t}\,K_{mn,mn}(t)\,dt\,, \qquad m\neq n\,, \qquad (76a)$$
and the bulk viscosity
$$\zeta \equiv \frac{\beta}{9V}\int_0^\infty e^{-\epsilon t}\,K_{TT}(t)\,dt\,. \qquad (76b)$$
In (76a) any values of $m\neq n$ can be used, but are generally dictated by the initial current; the indices are not summed. Note that both $\eta$ and $\zeta$ are independent of space-time coordinates.
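The Newtonian stress model of (75) can be evaluated directly for a given velocity-gradient field. The sketch below, with arbitrary illustrative inputs, confirms that a divergence-free (incompressible) gradient leaves the mean pressure $\tfrac13\,{\rm Tr}\langle T\rangle = P_0$ unchanged, as the traceless shear structure of (75) requires.

```python
import numpy as np

# Minimal sketch of the Newtonian stress identified in Eq. (75):
# <T_ki> = P0 d_ki - eta*(v_{k,i} + v_{i,k} - (2/3) div(v) d_ki)
#          - zeta * div(v) d_ki.
# The velocity-gradient matrix is an arbitrary illustrative input.

def mean_stress(grad_v, P0, eta, zeta):
    """grad_v[i, k] = dv_i/dx_k; returns the 3x3 tensor <T_ki> of (75)."""
    div_v = np.trace(grad_v)
    shear = grad_v + grad_v.T - (2.0 / 3.0) * div_v * np.eye(3)
    return P0 * np.eye(3) - eta * shear - zeta * div_v * np.eye(3)

grad_v = np.array([[0.1, 0.3, 0.0],
                   [0.0, -0.1, 0.0],
                   [0.0, 0.0, 0.0]])     # div(v) = 0: pure shear
T = mean_stress(grad_v, P0=2.0, eta=0.5, zeta=0.2)
# For incompressible flow the trace reduces to 3*P0: shear does not
# change the mean pressure.
print(np.trace(T))
```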
Equations of Motion
Local conservation laws of the form (35) for a simple fluid were displayed in (II66), and their conversion into macroscopic equations of motion for expectation values of densities and currents takes the form (36) under thermal driving. For simplicity we shall only consider the momentum density to be driven here, although most generally all three densities could be driven. With mj(x,t) as the driven variable, the exact macroscopic equations of motion are
$$\frac{d}{dt}\langle n(x,t)\rangle_t + \nabla_i\langle j^i(x,t)\rangle = 0\,, \qquad (77a)$$
$$m\,\frac{d}{dt}\langle j^i(x,t)\rangle_t + \nabla_k\langle T^{ki}(x,t)\rangle = s^i_j(x,t)\,, \qquad (77b)$$
$$\frac{d}{dt}\langle h(x,t)\rangle_t + \nabla_i\langle q^i(x,t)\rangle = 0\,. \qquad (77c)$$

The rate $s_j$ at which the source drives the momentum density can equally be written as a force density nF. As for other driving possibilities, an example would be to drive the number density in an ionized gas, rather than the momentum, and study electron transport. The source in this case would be $s_e=n_e\mu_eE$, where E is a static field and $\mu_e$ is the dc electron mobility.
The nonlinear equations of motion (77) are independent of the long-wavelength approximation and are valid arbitrarily far from equilibrium. They represent five equations for the five densities, but it is necessary to employ specific models for $\langle T^{ki}(x,t)\rangle$ and $\langle q^i(x,t)\rangle$, and these are most often taken as the linear approximations to the heat and momentum currents, as in (67) and (75), respectively. For example, under thermal driving the linear approximation (73) is replaced by
$$\langle T^{ki}(x,t)\rangle = \langle T^{ki}\rangle_0 + \int_v d^3x' \int_0^t \nabla_m\lambda_n\,K_{ki,mn}(x-x',t-t')\,dt'\,. \qquad (78)$$

The long-wavelength limit is still appropriate and the pressure remains hydrostatic. While the Lagrange multiplier is now given by (34), it is basically still a velocity, so that one can proceed in the same manner as above and find for the dynamic viscosities
$$\eta(t) = \frac{\beta}{V}\int_0^t K_{mn,mn}(t-t')\,dt'\,, \qquad m\neq n\,, \qquad (79a)$$
$$\zeta(t) = \frac{\beta}{9V}\int_0^t K_{TT}(t-t')\,dt'\,. \qquad (79b)$$

In linear approximation the solutions to the equations of motion (77) are just the linear predictions of the type (37). With the notation $\Delta n(x,t)=n(x,t)-\langle n\rangle_0$, etc., for the deviations, these are
$$\langle \Delta n(x,t)\rangle_t = -\int_v \int_0^t \lambda(x',t')\cdot K_{nj}(x-x',t-t')\,d^3x'\,dt'\,, \qquad (80a)$$
$$\langle \Delta h(x,t)\rangle_t = -\int_v \int_0^t \lambda(x',t')\cdot K_{hj}(x-x',t-t')\,d^3x'\,dt'\,, \qquad (80b)$$
$$\langle \Delta T_{mn}(x,t)\rangle_t = -\int_v \int_0^t \lambda(x',t')\cdot K_{T_{mn}j}(x-x',t-t')\,d^3x'\,dt'\,, \qquad (80c)$$
$$\langle \Delta j(x,t)\rangle_t = -\int_v \int_0^t \lambda(x',t')\cdot K_{jj}(x-x',t-t')\,d^3x'\,dt'\,, \qquad (80d)$$
$$\langle \Delta q(x,t)\rangle_t = -\int_v \int_0^t \lambda(x',t')\cdot K_{qj}(x-x',t-t')\,d^3x'\,dt'\,. \qquad (80e)$$
In these equations $\lambda$ is obtained from (34),
$$\lambda(x,t) = -\,\frac{s_j(x,t)}{K_{jj}(x-x')}\,, \qquad (81)$$

where the denominator is an equal-time covariance. To verify that these are solutions, first take the time derivative of (80a),
$$\frac{d}{dt}\langle \Delta n(x,t)\rangle_t = -\int_v \lambda(x',t)\cdot K_{nj}(x-x',0)\,d^3x' - \int_v \int_0^t \lambda(x',t')\cdot K_{\dot nj}(x-x',t-t')\,d^3x'\,dt'\,. \qquad (82)$$
Employ the microscopic conservation law $\dot n = -\nabla\cdot j$ to write $K_{\dot nj}=-\nabla\cdot K_{jj}$ and note that the first term on the right-hand side of (82) vanishes by the symmetry properties (40); the result is just what one finds from taking the divergence of (80d), thereby verifying (77a). A similar calculation verifies (77c), but verification of (77b) contains a different twist. In the time derivative of (80d) the term analogous to the first term on the right-hand side of (82) is, from (II44),
$$-\int_v \lambda(x',t)\cdot K_{jj}(x-x',0)\,d^3x' = s_j(x,t)\,. \qquad (83)$$
Thus, at least in linear approximation, the statistical predictions (80) are completely consistent with, and provide the first-order solutions to, the deterministic equations (77).
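The verification above rests on differentiating a memory integral by the Leibniz rule, $\frac{d}{dt}\int_0^t\lambda(t')K(t-t')\,dt' = \lambda(t)K(0)+\int_0^t\lambda(t')\dot K(t-t')\,dt'$. A quick numerical check with smooth model functions (arbitrary choices, not the paper's covariances) is:

```python
import numpy as np

# Check of the differentiation step behind Eqs. (82)-(83):
# d/dt int_0^t lam(t') K(t-t') dt' = lam(t) K(0) + int_0^t lam(t') K'(t-t') dt'.
# lam and K are arbitrary smooth model functions.

lam = lambda t: np.sin(t)
K = lambda t: np.exp(-2.0 * t)
Kdot = lambda t: -2.0 * np.exp(-2.0 * t)

def conv(t, n=4001):
    s = np.linspace(0.0, t, n)
    return np.trapz(lam(s) * K(t - s), s)

t0, h = 1.3, 1e-4
lhs = (conv(t0 + h) - conv(t0 - h)) / (2 * h)        # centered derivative
s = np.linspace(0.0, t0, 4001)
rhs = lam(t0) * K(0.0) + np.trapz(lam(s) * Kdot(t0 - s), s)
print(abs(lhs - rhs))
```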
Fluctuations
The statistical fluctuations are determined, as always, by the correlation of deviations, or covariance functions. In general these are the nonlinear covariances (31), or possibly those defined by the steady-state distribution. In any case, they are usually quite difficult to evaluate other than in very approximate models. For the moment, then, attention will be focused on the linear covariance functions, not only because they are readily evaluated in the long-wavelength approximation, but also because that and the linear approximation are well controlled. Thus, in linear hydrodynamics a first approach to fluctuations is a study of $K_{nn}$, $K_{jj}$, $K_{hh}$, $K_{qq}$, and $K_{ki,mn}$, the last referring to the energy-momentum tensors. One should note that these only describe statistical fluctuations; whether they can be equated with possible physical fluctuations is another matter and is discussed in the Appendix.
It is well known (e.g., Grandy, 1988) that the Fourier-transformed quantity $K_{AB}(k,\omega)$ is the dissipative part of the physical response of the system, whereas the ordinary (i.e., no Kubo transform) correlation function $C_{AB} \equiv \langle \Delta A\,\Delta B\rangle_0$, in terms of the deviations defined above, describes the actual fluctuations. The two are related by a fluctuation-dissipation theorem,
$$K_{AB}(k,\omega) = \frac{1-e^{-\beta\hbar\omega}}{\beta\hbar\omega}\,C_{AB}(k,\omega)\ \xrightarrow{\ \beta\hbar\omega\ \ll\ 1\ }\ C_{AB}(k,\omega)\,. \qquad (84)$$
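The prefactor in (84) is easy to tabulate; the helper below (an illustrative sketch, not code from the paper) evaluates $(1-e^{-x})/x$ with $x=\beta\hbar\omega$ stably near the classical limit and shows its monotone decrease from unity.

```python
import numpy as np

# The fluctuation-dissipation factor of Eq. (84), (1 - e^{-x})/x with
# x = beta*hbar*omega, tends to 1 as x -> 0, so K_AB -> C_AB in the
# classical limit, and suppresses K_AB relative to C_AB for large x.

def fdt_factor(x):
    """(1 - e^{-x})/x, evaluated stably via expm1; returns 1 at x = 0."""
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    nz = x != 0
    out[nz] = -np.expm1(-x[nz]) / x[nz]
    return out

print(fdt_factor(np.array([1e-8, 0.1, 1.0, 10.0])))
```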

The covariances of the densities n, h, and j are obtained in a straightforward manner in the long-wavelength limit and are simple (equilibrium) thermodynamic functions (e.g., Puff and Gillis, 1967; Grandy, 1987); for the energy density, for example,
$$K_{hh} = \frac{kT^2C_V}{V} + \frac{\kappa_T}{\beta}\left[h_0 + P_0 - \frac{\alpha T}{\kappa_T}\right]^2\,, \qquad (85)$$
where $\kappa_T$ is the isothermal compressibility, $\alpha$ the coefficient of thermal expansion, $C_V$ the constant-volume heat capacity, and subscripts 0 refer to equilibrium quantities. Expressions on the right-hand sides of (85) are most readily obtained from familiar thermodynamic derivatives, as suggested in (14), but correlations for the dissipative currents require a little more work.
Consider first
$$K_{q_lq_m}(x',t';x,t) = \big\langle\,\overline{q_m}(x',t')\,q_l(x,t)\,\big\rangle_0\,,$$
which will be abbreviated $K_{lm}(r,t)$ temporarily, with $r=x-x'$. The space-time Fourier transform is
$$K_{lm}(k,\omega) = \int d^3r\,e^{-ik\cdot r}\int_{-\infty}^{\infty} dt\,e^{i\omega t}\,K_{lm}(r,t)\,. \qquad (86)$$
Introduce a bit of dispersion into the integral, to ensure convergence, by replacing $\omega$ with $\omega\pm i\epsilon$, $\epsilon > 0$. The properties (40) imply that this covariance function is invariant under time reversal, so that in the limits $k\to 0$, $\omega\to 0$ (86) becomes^{6}

$$\lim_{k\to 0,\ \omega\to 0} K_{lm}(k,\omega) = 2\lim_{\epsilon\to 0}\int d^3r \int_0^\infty dt\,e^{-\epsilon t}\,K_{lm}(r,t)\,. \qquad (87)$$
But this last expression is just the thermal conductivity $\kappa$ in (67), if we note that the factor $T(x') \sim T(x)$ can be extracted from the integral in that expression, as is usual. Symmetry again implies that only the diagonal components contribute here, so the inverse transformation of (87) yields in the long-wavelength limit
$$K_{q_lq_m}(r,t) \simeq 2\kappa kT^2\,\delta(r)\,\delta(t)\,\delta_{lm}\,. \qquad (88)$$

Similarly, fluctuations in the energy-momentum tensor are described by the covariance function $K_{kl,mn}(r,t) = \big\langle\,\overline{T_{mn}}\,T_{kl}(r,t)\,\big\rangle_0 - P_0^2\,\delta_{kl}\,\delta_{mn}$. With the definitions (76) the above procedure leads to
$$K_{kl,mn}(r,t) \simeq 2kT\Big[\eta\big(\delta_{km}\delta_{ln}+\delta_{kn}\delta_{lm}\big) + \big(\zeta-\tfrac{2}{3}\eta\big)\,\delta_{kl}\,\delta_{mn}\Big]\,\delta(r)\,\delta(t)\,. \qquad (89)$$

The expressions (88) and (89) for fluctuations in the dissipative currents are precisely those found by Landau and Lifshitz
(1957) in their study of hydrodynamic fluctuations. Here, however, there is no need to introduce fictitious `random'
forces, additional dissipative stresses, or extraneous averaging processes, for these are just the straightforward
results expected from probability theory. Hydrodynamic fluctuations apparently have been observed in convection waves
in an isotropic binary mixture of H_{2}O and ethanol (Quentin and Rehberg, 1995).
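The delta-function amplitude in (89) is a fourth-rank tensor whose index symmetries can be checked directly. The sketch below (a small exercise of ours, with arbitrary parameter values) builds it numerically and verifies that the fully contracted part carries the bulk viscosity alone, $\sum_{k,m}A_{kk,mm}=18\,kT\zeta$.

```python
import numpy as np

# Amplitude tensor multiplying delta(r)delta(t) in Eq. (89):
# A_{kl,mn} = 2kT[ eta (d_km d_ln + d_kn d_lm) + (zeta - 2/3 eta) d_kl d_mn ].
# Contracting both index pairs isolates the bulk viscosity:
# sum_{k,m} A_{kk,mm} = 18 kT zeta.

def ll_amplitude(kT, eta, zeta):
    d = np.eye(3)
    A = np.zeros((3, 3, 3, 3))
    for k in range(3):
        for l in range(3):
            for m in range(3):
                for n in range(3):
                    A[k, l, m, n] = 2 * kT * (
                        eta * (d[k, m] * d[l, n] + d[k, n] * d[l, m])
                        + (zeta - 2.0 / 3.0 * eta) * d[k, l] * d[m, n])
    return A

A = ll_amplitude(kT=1.0, eta=0.5, zeta=0.3)
trace_part = sum(A[k, k, m, m] for k in range(3) for m in range(3))
print(trace_part)   # 18*kT*zeta: the trace part carries zeta only
```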
When a system is farther from equilibrium, but in a steady state, the deviations from that state can be studied in much the same way. In accordance with the stationary constraint, when the current is driven at a constant rate the density matrix $\rho_{ss}$ depends only on the diagonal part of the current operator, $j^d(x)$. By definition this operator commutes with the Hamiltonian, as does the total-momentum operator, which again leads to some simplifications in the long-wavelength limit. But the fact remains that expectation values and covariance functions are very difficult to evaluate with a density matrix $\rho_{ss}$, let alone in an arbitrary nonequilibrium state, which has led to alternative means for estimating fluctuations about these states.
In fluid mechanics it is customary to reformulate the equations of motion (77) in terms of the fluid velocity v
by introducing a notation áj(x,t)ñ = án(x,t)ñv(x,t); this is motivated in part by
Galilean invariance. Transformation of the densities and currents can be performed by a unitary operator
U(G)=exp[iv·G/ ℏ], where G is the generator of Galilean transformations and v is
a drift velocity. The results for the expectation values are, in addition to that for j,

$$\langle T_{ij}(x,t)\rangle = \langle T_{ij}(x,t)\rangle_0 + mn\,v_iv_j\,,$$
$$\langle h(x,t)\rangle = \langle h(x,t)\rangle_0 + \tfrac{1}{2}mnv^2\,,$$
$$\langle q_i(x,t)\rangle = \big(\langle h(x,t)\rangle_0 + \tfrac{1}{2}mnv^2\big)\,v_i + v_k\,\langle T_{ki}(x,t)\rangle_0\,, \qquad (90)$$
where here the subscript 0 denotes values in the rest frame.
This, of course, is merely a redescription of the system in the laboratory frame and contains no dynamics; dissipation
cannot be introduced into the system by a unitary transformation! But one now sees the view from the laboratory frame.
With the additional definition of the mass density $\rho = m\langle n(x,t)\rangle$, not to be confused with the density matrix, (77) can be rewritten as
$$\partial_t\rho + \partial_i\big(\rho v^i\big) = 0\,,$$
$$\rho\big(\partial_t v^j + v\cdot\nabla v^j\big) + \partial_i T^{ij} = nF^j\,,$$
$$\partial_t(\rho h) + \partial_i\big(\rho h v^i + q^i\big) = 0\,, \qquad (91)$$
where every quantity with the exception of $F^j$ is actually an expectation value in the state described by $\rho_t$. For large macroscopic systems we expect these values to represent sharp predictions, but small deviations from them remain possible. Such a deviation can excite the mass density $\rho$ to a temporary new value $\rho'=\rho+\Delta\rho$, for example.
By direct substitution (91) can be converted into a set of equations for the deviations of all the variables, and by retaining only terms linear in those deviations they become linear equations that have a good chance of being solved in specific scenarios. The statistical fluctuations for the dissipative currents have the same forms as those of (88) and (89), but $T$, $\eta$, and $\zeta$ are now space and time dependent^{7} (e.g., Morozov, 1984); in addition, the stress-tensor model of (75) now contains deviations in the fluid velocity as well. These equations will not be developed here since they are given in detail elsewhere (e.g., Fox, 1984); an application to the Rayleigh-Bénard problem has been given by Schmitz and Cohen (1985).
Ultrasonic Propagation
Two special cases of thermal driving arise when the rate is either zero or constant, describing equilibrium or a steady-state process, respectively. Another occurs when $s_F$ can usefully be replaced by a time-dependent boundary condition. For example, a common experimental arrangement in the study of acoustical phenomena is to drive a quartz plate piezoelectrically so that it generates sound waves along a plane. Thus it is quite realistic to characterize the external source by specifying the particle current on the boundary plane at z=0. The system excited by the sound wave is then described by the density matrix
$$\rho = \frac{1}{Z}\exp\left\{-\beta H + \int dx' \int dy' \int dt'\,\lambda(x',y',t')\cdot J(x',y',0;t')\right\}, \qquad (92)$$
and Z is the trace of the exponential.
To keep the model simple the volume is taken to be the entire half-space z > 0, and we presume the z-component of current to be specified over the entire xy-plane for all time. Although there are no currents in the equilibrium system, current components at any time in the perturbed system in the right half-space are given by $\langle J_a(x,y,z;t)\rangle = {\rm Tr}\big[\rho\,J_a(x,y,z;t)\big]$. Restriction to small-amplitude disturbances, corresponding to small departures from equilibrium, implies the linear approximation to be adequate:
$$\langle J_a(x,y,z;t)\rangle = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dt'\,\lambda(x',y',t')\,K_{J_aJ_z}(x',y',0,t';x,y,z,t)\,, \qquad (93)$$

and consistency requires this expression to reproduce the boundary condition at z=0:
$$\langle J_z(x,y,0;t)\rangle = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dt'\,\lambda(x',y',t')\,K_{J_zJ_z}(x-x',y-y',0;t-t')\,. \qquad (94)$$

We are thus considering low intensities but arbitrary frequencies.
Linearity suggests it is sufficient to consider the disturbance at the boundary to be a monochromatic plane wave. Thus,
$$\langle J_z(x,y,0;t)\rangle = J\,e^{-i\omega t}\,, \qquad (95)$$
where J is a constant amplitude. Substitution of this boundary value into (94) allows one to solve the integral equation for $\lambda(x',y',t')$ immediately by Fourier transformation, and the Lagrange-multiplier function is determined directly by means of the driving term, as expected. We find that
$$\lambda(x,y,t) = \lambda_\omega\,e^{-i\omega t}\,, \qquad (96)$$
with
$$\lambda_\omega^{-1} \equiv J^{-1}\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\,K_{J_zJ_z}(x,y,0;\omega)\,, \qquad (97)$$
so that $\lambda$ is independent of the spatial variables.
Given the form of the covariance function in (94), the current in the right half-space will also be independent of x and y:
$$\langle J_a(x,y,z;t)\rangle = \lambda_\omega\,e^{-i\omega t}\int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dt'\,e^{i\omega t'}\,K_{J_aJ_z}(x',y',z;t')\,. \qquad (98)$$

Define a function
$$K_{J_aJ_z}(z,\omega) \equiv K_{J_aJ_z}(0,0,z;\omega) = \int_{-\infty}^{\infty}\frac{dk_z}{2\pi}\,e^{ik_zz}\,K_{J_aJ_z}(k_z,\omega)\,, \qquad (99)$$
where $K_{J_aJ_z}(k_z,\omega) \equiv K_{J_aJ_z}(0,0,k_z;\omega)$. Then (93) for the current in the perturbed system can be rewritten as
$$J_a(z,t) \equiv \langle J_a(0,0,z;t)\rangle = J_a(z)\,e^{-i\omega t}\,, \qquad (100)$$
with
$$J_a(z) \equiv \lambda_\omega\,K_{J_aJ_z}(z,\omega)\,. \qquad (101)$$
In the same notation, $\lambda_\omega^{-1}=J^{-1}K_{J_zJ_z}(0,\omega)$. Thus, the amplitude of the sound wave relative to that of the initial disturbance is
$$\frac{J_a(z)}{J} = \frac{K_{J_aJ_z}(z,\omega)}{K_{J_zJ_z}(0,\omega)}\,. \qquad (102)$$

So, application of a monochromatic plane wave at the boundary results in a disturbance that propagates through the system harmonically, but with an apparent attenuation along the positive z-axis given by $J_a(z)$. Analysis of the spatial decay depends on the detailed structure of the current-current covariance function, and only on that; this remains true if we synthesize a general wave form.
As an example, suppose that
$$K_{J_zJ_z}(k_z,\omega) = 2\pi\,g(\omega)\,\delta(k_z-k_0)\,. \qquad (103a)$$
From (102) the z-component of current is then
$$J_z(z) = J\,e^{ik_0z}\,, \qquad (103b)$$
and the initial plane wave propagates with no attenuation.
More interesting is a covariance function of Lorentzian form, such as
$$K_{J_zJ_z}(k_z,\omega) = \frac{a\,f(\omega)}{a^2+(k_z-k_0)^2}\,. \qquad (104a)$$
A similar calculation yields
$$J_z(z) = J\,e^{ik_0z}\,e^{-az}\,, \qquad (104b)$$
which exhibits the classical exponential attenuation. Although the Lorentzian form provides at least a sufficient condition for exponential decay of the sound wave, there clearly is no obvious requirement for the attenuation to be exponential in general.
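The Lorentzian example can be verified numerically: inverting (104a) through the transform (99) and forming the ratio (102) reproduces the attenuation factor $e^{-az}$ for $z>0$. Grid and parameter values below are arbitrary choices for the check.

```python
import numpy as np

# Numerical check of Eq. (104): inverting the Lorentzian k_z-spectrum of
# (104a) through (99) should give |J_z(z)/J| = e^{-a z} for z > 0,
# the classical exponential attenuation.

a, k0, f = 0.5, 3.0, 1.0
kz = np.linspace(-200.0, 200.0, 400_001)
spec = a * f / (a**2 + (kz - k0)**2)          # Eq. (104a) at fixed omega

def K_of_z(z):
    """Inverse transform of Eq. (99): (1/2pi) int dk_z e^{i k_z z} K(k_z)."""
    return np.trapz(np.exp(1j * kz * z) * spec, kz) / (2.0 * np.pi)

z = 2.0
ratio = K_of_z(z) / K_of_z(0.0)               # amplitude ratio of Eq. (102)
print(abs(ratio), np.exp(-a * z))             # the two should agree
```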
Note that the number density itself could have been predicted in the above discussion, merely by replacing $J_a$ with n. In a similar manner we find that
$$n(z,t) \equiv \langle n(0,0,z;t)\rangle - \langle n\rangle_0 = \lambda_\omega\,K_{nJ_z}(z,\omega)\,e^{-i\omega t}\,. \qquad (105)$$
But the covariance function $K_{nJ_z}$ is directly proportional to the density-density covariance function $K_{nn}$, and therefore
$$K_{nJ_z}(z,\omega) = \frac{\omega}{2\pi}\int_{-\infty}^{\infty}\frac{dk_z}{k_z}\,e^{ik_zz}\,K_{nn}(k_z,\omega)\,. \qquad (106)$$

The variation in density, n(z,t), is directly related to the
correlated propagation of density fluctuations, and it is
precisely this correlation of fluctuations that makes
intelligible speech possible.
The preceding model has been adapted by Snow (1967) to an extensive study of free-particle systems. Although there is essentially no propagation in the classical domain, the quantum systems do exhibit interesting behavior, such as second sound.
4. Relaxation and The Approach To Equilibrium
In the thermally driven system the instantaneous nonequilibrium state is described by the density matrix (28), with associated entropy (29). If the driving source is removed at time $t=t_1$ the macroscopic nonequilibrium state $\rho_{t_1}$ at that time has maximum information entropy
$$\frac{1}{k}\,S_{t_1} = \ln Z_{t_1} + \beta\langle H\rangle_{t_1} + \int_0^{t_1} \lambda(t')\,\langle F(t')\rangle_{t_1}\,dt'\,, \qquad (107)$$
which is fixed in time. From (48) we note that the total rate of entropy production is now $\dot S_{\rm tot}(t)=\dot S_{\rm int}(t)$, and we expect the system to relax to equilibrium; that is, we want to identify and study the relaxation entropy $S_{\rm int}(t)$. In the discussion following (46) it seemed that not much could be said about this quantity in general
because there was no information available at that point to describe the relaxation. But now it appears that there
are indeed two cogent pieces of information that change the situation greatly. The first new piece is that the source
has been removed, so that the system is now isolated from further external influences; the second is the set of
equations of motion (77), in which the source term is zero. Although S_{t1} cannot evolve to the canonical entropy
of equilibrium, we can now construct an S_{int}(t) that does, because the relaxation is deterministic.
The absence of external driving assures us that the microscopic time evolution now takes place through unitary
transformation, and that the density matrix develops by means of the equation of motion i\hbar\, \partial_t \rho = [H, \rho].
Unquestionably, if the system returns to equilibrium at some time t in the future, the density matrix \rho(t)
evolved in this way will correctly predict equilibrium expectation values. As discussed at length in I, however, there
is virtually no possibility of carrying out such a calculation with the complete system Hamiltonian H; and even if that
could be done it is not possible for \rho(t) to evolve into the canonical equilibrium distribution, because the
eigenvalues of \rho_{t_1} remain unchanged under unitary transformation. In addition, the information entropy is also
invariant under this transformation, simply because there is no new macroscopic information being supplied to it. But this
last observation is the key point: it is not the microscopic behavior that is relevant, for we have no access to that in
any event; it is the macroscopic behavior of expectation values that should be the proper focus.
Some very general comments regarding relaxation were made in II, primarily in connection with (II88), but now it is
possible to be much more specific in terms of a definite model.
In the following the simple fluid is chosen as a relaxation model because it is both familiar and elementary in structure.
Nevertheless, the procedure should apply to any system with well-defined equations of motion analogous to (77).
At t_1 total system quantities such as E, N, V, etc., are fixed and define the eventual equilibrium state; indeed,
if we knew their values (and surely they could be predicted quite accurately with \rho_{t_1})
that state could be constructed immediately by means of the PME. But varying densities and currents
continue to exist in the still-inhomogeneous system for t \ge t_1, and will have to smooth out or vanish on the way to
equilibrium. As an example, the number density satisfies (77a) and, for t > t_1, there are no external forces, so the
relaxing particle current is found from Fick's law: \langle j(x,t) \rangle = -D\, \nabla \langle n(x,t) \rangle,
in which it is presumed that D is independent of the space-time coordinates.
In both these expressions one can always replace \langle n \rangle by \delta n(x,t) \equiv \langle n(x,t) \rangle - n_0,
where n_0 is the density of the eventual equilibrium state, given in principle by the value N/V above.
Combination of the two yields the well-known diffusion equation
\partial_t\, \delta n(x,t) = D\, \nabla^2 \delta n(x,t) , \qquad (108)

which is to be solved subject to knowing the initial value \delta n(x,t_1); this value can be taken as that
predicted by \rho_{t_1}.
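Equation (108) is readily integrated numerically. The following sketch is illustrative only: the Gaussian initial profile is an arbitrary stand-in for the value predicted by \rho_{t_1}, and the diffusion constant and grid are hypothetical. It shows the density smoothing out toward a constant while the total particle number is conserved.

```python
# Explicit finite-difference sketch of (108), d(dn)/dt = D Laplacian(dn),
# in one dimension on a periodic box.  The Gaussian initial profile is an
# arbitrary stand-in for dn(x, t1); D and the grid are illustrative.
import numpy as np

D, L, N = 1.0, 10.0, 200
dx = L / N
dt = 0.4 * dx**2 / D                       # below the explicit stability limit
x = np.linspace(0.0, L, N, endpoint=False)
dn = np.exp(-(x - L/2)**2)                 # initial inhomogeneity dn(x, t1)

total0 = dn.sum() * dx                     # "particle number" in the box
for _ in range(20_000):
    lap = (np.roll(dn, 1) - 2.0*dn + np.roll(dn, -1)) / dx**2
    dn = dn + D * dt * lap

# the density has smoothed out to a constant, conserving the total
print(dn.max() - dn.min(), dn.sum() * dx - total0)
```

The residual variation across the box decays toward zero, exactly the smoothing-out of the inhomogeneities described above.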
The next step is to construct the relaxation density matrix \rho_r at t = t_1 + \epsilon in terms of the densities
\langle n(x,t) \rangle and \langle h(x,t) \rangle in the simple fluid. Both \langle n \rangle
and \langle h \rangle are now solutions of deterministic equations^{8} and do not depend on any previous values except
those at the last instant; only initial values are required. The counterpart of this in probability theory is a Markov
process, which is exactly what \rho_r will describe. The relaxation density matrix is then
\rho_r(t) = \frac{1}{Z_r(t)} \exp\!\left[ -\int_V \beta(x,t)\, h(x,t)\, d^3x + \int_V \lambda(x,t)\, n(x,t)\, d^3x \right] , \qquad (109)

and the Lagrange multipliers are now formally determined from

\langle h(x,t) \rangle = -\frac{\delta}{\delta \beta(x,t)} \ln Z_r(\beta,\lambda) , \qquad
\langle n(x,t) \rangle = \frac{\delta}{\delta \lambda(x,t)} \ln Z_r(\beta,\lambda) . \qquad (110)

The positive sign in front of the second integral in (109) is taken because in the Boltzmann region the chemical potential
is negative.
The momentum density could have been included here as well, but these two operators are sufficient for the present
discussion.
The form of (109) requires further comment to avoid confusion. Contrary to the belief expressed in some works, it is
not a sensible procedure to construct a density matrix from data taken at a single point in time and expect the result
to describe nonequilibrium processes adequately. Not only is the entire history of what has been done to the system
disregarded, but no information is provided to \rho about if, why, or how the input data are actually varying in time;
the resulting probability distribution cannot possibly have anything to say about time-varying quantities. But this is
not what has been done in (109); \rho_r can be constructed accurately at any instant for t > t_1 because the input
information is available at any instant as a solution of (108) and its counterpart for \delta h. As a consequence one
can consider \rho_r(t) to evolve deterministically as well.
When the source is removed the total-system maximum entropy for t > t_1 is the relaxation entropy

\frac{1}{k}\, S_{\rm int}(t) = \ln Z_r(t) + \int_V \beta(x,t) \langle h(x,t) \rangle_r\, d^3x - \int_V \lambda(x,t) \langle n(x,t) \rangle_r\, d^3x . \qquad (111)
This is now the physical internal entropy of the system, for this is one of the few instances when the
entropy of the system can be disentangled from S_{t}.
Formal solutions to (108) and the corresponding heat equation are well known; we write the generic solution as
u(x,t) and express the initial value as u(x,0)=f(x). For a large volume, which we may as well take
as all space, a short calculation via Fourier analysis yields two equivalent forms for the solution:

u(x,t) = \frac{1}{(4\pi D t)^{3/2}} \int f(x')\, e^{-(x-x')^2/4Dt}\, d^3x'
       = \int \frac{d^3k}{(2\pi)^3}\, e^{i k \cdot x}\, f(k)\, e^{-D k^2 t} , \qquad (112)
both of which vanish in the limit t \to \infty. The first form has the merit of demonstrating that the solution also
vanishes as x \to \infty, indicating that the densities smooth out at the distant boundaries at any time;
in addition, it readily reproduces the initial condition as t \to 0. The second form, however, is a bit more transparent
and, depending on the spectral properties of f(k), reveals that the dominant wavelengths \lambda will
determine the relaxation time \tau \sim \lambda^2/D. Of course, \tau will be slightly different for \delta n and
\delta h, but they should be of the same order of magnitude.
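Rough numbers make the estimate \tau \sim \lambda^2/D concrete. The diffusion constant below is a typical liquid-phase value assumed purely for illustration, not a number from the text; it shows that diffusive relaxation is fast on micron scales but very slow on macroscopic ones.

```python
# Order-of-magnitude sketch of the relaxation-time estimate tau ~ lambda^2/D.
# The diffusion constant is a typical liquid value assumed for illustration.
D = 2e-9                                  # m^2/s, rough liquid-phase value
for lam in (1e-6, 1e-4, 1e-2):            # dominant wavelength, meters
    tau = lam**2 / D
    print(f"lambda = {lam:.0e} m  ->  tau ~ {tau:.0e} s")
```

A one-centimeter inhomogeneity thus takes on the order of a day to diffuse away, while a micron-scale one relaxes in a fraction of a millisecond.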
These results demonstrate that, after the source is removed at t = t_1, the macroscopic densities in the system evolve
to constants:

\langle n(x,t) \rangle \xrightarrow[\,t\to\infty\,]{} N/V , \qquad \langle h(x,t) \rangle \xrightarrow[\,t\to\infty\,]{} E/V . \qquad (113)

In turn, this implies that in (111) those constants can be extracted from the integrals to yield the additional constants
\beta \equiv \int \beta(x,\infty)\, d^3x , \; \beta\mu \equiv \int \lambda(x,\infty)\, d^3x .^{9}
Because \rho_r can be reconstructed at any time, it must be that h(x,t) \to H/V, n(x,t) \to N/V, as well,
where H and N are the Hamiltonian and number operator, respectively. Our conclusion is that
\rho_r(t) \xrightarrow[\,t\to\infty\,]{} \rho_{\rm eq} = \frac{1}{Z_G}\, e^{-\beta(H - \mu N)} , \qquad (114)

and the relaxation entropy increases monotonically to the equilibrium value:

\frac{1}{k}\, S_{\rm int}(t) \xrightarrow[\,t\to\infty\,]{} \frac{1}{k}\, S_{\rm eq} = \ln Z_G + \beta \langle H \rangle_0 - \beta\mu \langle N \rangle_0 , \qquad (115)

where \beta and the chemical potential \mu are the equilibrium parameters of the grand canonical distribution.
To see that this is actually an increase, integrate both quantities in (113) over all space, which
provides upper bounds N and E for the respective integrals. But for any two integrable functions
f_1, f_2 \ge 0, with f_1 \le C an upper bound, it is a theorem that

\left| \int_a^b f_1(x)\, f_2(x)\, dx \right| \le C \int_a^b f_2(x)\, dx . \qquad (116)

Hence, S_{int}(t) in (111) can never be larger than S_{eq} in (115), and we have demonstrated the monotonically
increasing approach to equilibrium. Note that this represents a complete equilibrium in that the system retains no
memory of anything done to it in the past; this memory loss obviously comes about because the evolution has been deterministic.
We have only a theoretical prediction, of course, and only experiment can confirm that it actually is an equilibrium state.
While (115) is in agreement with the Second Law, it is certainly not a
statement of that law, for we started from a nonequilibrium state.
The crucial point in this demonstration must be emphasized. What makes the procedure possible is the ability to construct a
density matrix
at any time using the information available at that time. It is the context based on probability theory and the PME that
allows introduction of r_{t1} and r_{r} above, thereby providing a cohesive description of relaxation.
5. Some Final Comments
A major aim of this exposition has been to demonstrate the broad applicability of Gibbs' variational principle
in governing all of statistical mechanics, once the notion of thermal driving by external sources is brought into the
picture. But what may tend to get lost here is the fundamental role of probability in illuminating the way
to a coherent theory, as was discussed at length in I and II. When the entropy concept is understood to originate in a
rule for constructing prior probability distributions, its physical manifestation becomes much clearer, in that it is
now seen, not as a property of the physical system, but of the macroscopic state of that system, or of a process
that is occurring in it. The singular nature of the equilibrium state has obscured this feature heretofore because these
different views coalesce there. This leads us to comment on a point often missed in phenomenological theories of
nonequilibrium thermodynamics, where there is a tendency to consider entropy as just another field variable and posit
the existence of an `entropy density' as if it were a conserved dynamical variable. While macroscopic dynamical
variables such as energy, particle number, angular momentum, etc., indeed emerge as average values of their microscopic
counterparts, entropy is only a macroscopic quantity that connects the two domains through probability theory.
Were it otherwise, the implication would be that when external sources are removed the total entropy is fixed as
that of the nonequilibrium state at that time; as we have seen, that cannot be so. This
doesn't mean that entropy is nonphysical when it stands still long enough to be measured in an equilibrium state, or when
characterizing the rate of relaxation to that state, but it does mean that it is not a dynamical variable; it is, however,
a functional of dynamical variables and changes in time only because they do. Boltzmann,
Gibbs, and Planck understood this point long ago, but somehow early in the 20th century it seems to have gotten lost,
and continues to be misconstrued.
With a straightforward demonstration of the approach to equilibrium in hand, it remains to address the question of
irreversibility. In the model above the relaxation is driven by density gradients in the system, whose decays are
described by the diffusion and heat equations; similar equations govern the relaxation in other models. Clearly
these equations are not timereversal invariant, and in equilibrium there are no gradients or external forces to
drive the system in reverse  the state is stable under small fluctuations. From a macroscopic point of view, then,
one can see why the relaxation process is irreversible, but this does nothing to explain why the microscopic
equations of motion, which are invariant under time reversal, cannot conspire to bring the system back to the
original nonequilibrium state it was in at t = t_1. After all, Poincaré's recurrence theorem is certainly true, and
a many-body system left to itself will return arbitrarily closely to its initial microscopic state at some time in the
future; that time turns out to be something like 10^{10^{23}} years, but the point remains.
It has long been understood,
though apparently not widely appreciated, that the necessary microscopic initial conditions for the reversed
microscopic motions
have probability close to zero of being realized in an equilibrium system with N >> 1 degrees of freedom, which does
fit in nicely with that huge recurrence time. The tool for proving this was provided long ago by Boltzmann in what may have
been the first connection between entropy and information. In the form given it by Planck, the maximum entropy is
written as S_{B}=klnW, where W is the measure of a volume in phase space or of a manifold in Hilbert space; it
measures the size of the set of Nparticle microstates compatible with the macroscopic constraints on the system. As
common sense would dictate, the greater the number of possibilities, the less certain we are of which microstate the
system might occupy; conversely, more constraints narrow the choices and reduce that uncertainty. Subject to those
constraints, a system will occupy the macrostate that can be realized in the greatest number of ways,
the state of maximum entropy; barring external intervention, microscopic dynamics will keep it there. In addition,
Boltzmann noted that there was nothing in this relation restricting it to equilibrium states.
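Boltzmann's relation is easy to make concrete with a toy count (entirely illustrative, not from the text): take N particles distributed between the two halves of a box, with the macrostate labeled by the occupation n of the left half, so that W(n) = C(N, n) microstates realize it.

```python
# Toy illustration of S = k ln W: N particles in a box, macrostate = number n
# in the left half, W(n) = C(N, n).  All numbers are illustrative.
from math import comb, log

N = 1000
W_balanced = comb(N, N // 2)      # most numerous macrostate (maximum entropy)
W_skewed   = comb(N, N // 4)      # strongly unbalanced macrostate

dS_over_k = log(W_balanced) - log(W_skewed)
print(dS_over_k)                  # -> roughly 131
```

Even for a mere thousand particles the balanced macrostate is realized in about e^{131} more ways than the skewed one; for thermodynamic N the disproportion becomes the astronomical ratio of (117).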
Boltzmann's beautiful qualitative insight is readily quantified. Consider a system in a macrostate A_{1} with
entropy S_{1}=klnW_{1}, where W_{1} is the size of the set of states C_{1} compatible with the constraints
defining A_{1}. Now expose it to a positivesource process that carries it into a new macrostate A_{2} with
entropy S_{2}=klnW_{2} and set of compatible states C_{2}; by this is meant a source that adds thermal energy
or particle number, or perhaps increases
the volume. Although unnecessary, to keep the argument simple we consider these to be initial and final equilibrium states.
If this is a reproducible experiment  in the sense that A_{1} can be reproduced, but certainly not
any particular microstate in C_{1}  then it is surely necessary that W_{2} ³ W_{1}, which is already an elementary
statement of the Second Law. But just how much larger is the set C_{2}? From Boltzmann's
expression the ratio of phase volumes can be written

\frac{W_1}{W_2} = \exp\!\left( -\frac{S_2 - S_1}{k} \right) . \qquad (117)

If the difference in entropies is merely a nanocalorie at room temperature this number is of order e^{-10^{12}},
and the number of microstates compatible with A_1 is vanishingly small compared with those compatible with A_2.
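The arithmetic behind the nanocalorie remark is worth displaying; the reading of "a nanocalorie at room temperature" as 10^{-9} cal of heat transferred at roughly 300 K is our assumption.

```python
# The "nanocalorie" estimate in (117): an entropy change of order
# (1 ncal)/(300 K), measured in units of Boltzmann's constant.
k_B = 1.380649e-23          # J/K
cal = 4.184                 # J per calorie
dS  = 1e-9 * cal / 300.0    # J/K: one nanocalorie of heat at room temperature
print(dS / k_B)             # ~ 1e12, so W1/W2 ~ exp(-1e12)
```

An entropy difference far below anything measurable thus already makes the ratio W_1/W_2 unimaginably small.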
Therefore, to have any
reasonable probability of being realized the initial microstates required to regain the macrostate A_{1}
would have to be contained in the highprobability manifold C_{2}, whose microstates are all compatible
with the constraints defining the macrostate A_{2}, and not A_{1}; these requisite initial microstates must
therefore lie in the complement of C_{2} and have
very low probability. Moreover, this manifold of requisite initial microstates must have dimension about the same
as that of C_{1}, but from (117) this dimension is immeasurably
smaller than that of C_{2}, so that it is even less probable that these initial states could be
realized. As understood so clearly by Gibbs and Boltzmann, the
timereversed evolution is not impossible, it's just extremely improbable. At this point it is difficult to see what
more there is to say about the origin of thermodynamic irreversibility in macroscopic systems.
Appendix
The derivation of thermodynamics from probability theory cannot be considered
complete without establishing the relationship of expectation values to physically measurable values of
the associated macroscopic variables. To address this point, consider any
timedependent classical variable f(t), where it suffices to
suppress any other independent variables in this discussion, and consider just the equilibrium system.
Given any equilibrium
probability distribution for f(t), the best prediction we can make for the
variable, in the sense of minimum expected square of the error, is the
expectation value

\langle f(t) \rangle = \langle f \rangle , \qquad (A.1)

independent of time. The reliability
of this prediction is determined, as always, by the expected square of
the deviation of f from the value (A.1), or the variance

(\Delta f)^2 = \langle f^2 \rangle - \langle f \rangle^2 , \qquad (A.2)

again independent of time. Only if \Delta f / \langle f \rangle << 1 is the
distribution making a sharp prediction, which is to be expected for N >> 1 degrees of freedom.
Now (A.1) just reflects the value of f predicted by the probability
distribution, and is not necessarily the same as the value actually measured
for the single physical system being studied. Similarly, (A.2) is only a
measure of how well the distribution is predicting that expectation value;
it represents the statistical fluctuations, and may or may not correspond to
possible fluctuations of the physical quantity. Certainly knowledge that the
value of f is known only to ±1% does not imply that the physical value
actually fluctuates by ±1%. This is a point stressed repeatedly by E.T. Jaynes
in several different contexts, and we follow his arguments here (e.g., Jaynes, 1979).
Nevertheless, the reality of physical fluctuations
is not in doubt, as evidenced by the phenomena of Brownian motion, critical opalescence,
and spontaneous voltage fluctuations in resistors at constant temperature, so that we
might expect a relationship between the two types.
To uncover possible connections of this kind we note that the value measured
in the laboratory is not an expectation value, but a time average:

\bar f \equiv \frac{1}{T} \int_0^T f(t)\, dt , \qquad (A.3)

where the averaging time T will be left unspecified for the moment.
The best prediction we can make for this measured value, then, is

\langle \bar f \rangle = \left\langle \frac{1}{T} \int_0^T f(t)\, dt \right\rangle = \frac{1}{T} \int_0^T \langle f \rangle\, dt , \qquad (A.4)

or, in equilibrium,

\langle \bar f \rangle = \langle f \rangle . \qquad (A.5)

This is a rather general rule of probability theory: an expectation value
\langle f \rangle is not equivalent to a time average \bar f, but it
is equal to the expectation value of that average.
Thus it seems that the predictions of statistical mechanics are clearly related
to measured physical values, if the prediction (A.5) is reliable. This again
is determined by the variance,

(\Delta \bar f)^2 = \frac{1}{T^2} \int_0^T \! dt \int_0^T \! dt' \left[ \langle f(t) f(t') \rangle - \langle f(t) \rangle \langle f(t') \rangle \right] . \qquad (A.6)

In equilibrium with a time-independent Hamiltonian the expectation values
depend only on t' - t. With some judicious changes of variable the
last expression can then be reduced to a single integral:
(\Delta \bar f)^2 = \frac{2}{T^2} \int_0^T (T - \tau)\, K_{ff}(\tau)\, d\tau , \qquad (A.7)

in terms of the covariance function

K_{ff}(\tau) \equiv \langle f(0) f(\tau) \rangle - \langle f \rangle^2 . \qquad (A.8)

If the integral in (A.7) converges as T \to \infty, then \Delta \bar f dies off as 1/\sqrt{T} and hence vanishes
in the limit.
When this is true we can assert with some confidence that the expected measurable
value equals the expectation value;
otherwise, there is no sharp relation between expectation values and time
averages. Note, however, that there is no guarantee that the measured value will
actually be the same as that in (A.5); only experiment can verify that.
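The 1/\sqrt{T} decay of \Delta \bar f can be checked by simulation. The sketch below uses a first-order autoregressive sequence as a stand-in for f(t) (an assumption; any stationary process with integrable covariance K_{ff} behaves the same way) and confirms that T (\Delta \bar f)^2 approaches a constant.

```python
# Monte Carlo check of (A.6)-(A.7): for a stationary process with short-range
# correlations, the variance of the time average \bar f falls off as 1/T.
# An AR(1) sequence stands in for f(t); all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
phi, trials, Tmax = 0.9, 2000, 2000
x = np.empty((trials, Tmax))
x[:, 0] = rng.normal(scale=1.0/np.sqrt(1.0 - phi**2), size=trials)  # stationary start
noise = rng.normal(size=(trials, Tmax))
for t in range(1, Tmax):
    x[:, t] = phi * x[:, t-1] + noise[:, t]

for T in (500, 1000, 2000):
    var_bar = x[:, :T].mean(axis=1).var()   # (Delta fbar)^2 across trials
    print(T, T * var_bar)                   # roughly constant
```

The near-constancy of T (\Delta \bar f)^2 is just the statement that the correlation time, not T, sets the scale of the integral in (A.7).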
In the same vein, we can ask how the measurable mean-square fluctuations are related
to the statistical fluctuations. The time average of the deviation from the measured
mean is

(\delta f)^2 \equiv \frac{1}{T} \int_0^T \left[ f(t) - \bar f\, \right]^2 dt , \qquad (A.9)
and the expectation value is found to be
\langle (\delta f)^2 \rangle = (\Delta f)^2 - (\Delta \bar f)^2 . \qquad (A.10)

Hence, measurable fluctuations can be the same as the statistical fluctuations only if the distribution
is such, and the averaging time so long, that (\Delta \bar f)^2 << (\Delta f)^2. Invariably this is
presumed to be the case.
Reliability of the prediction (A.10) is determined, as usual, by the variance
\langle (\delta f)^4 \rangle - \langle (\delta f)^2 \rangle^2, which reduces to a four-fold time average of a
four-point correlation function. Clearly, verification of any statement equating physical
fluctuations with statistical fluctuations will involve some decidedly nontrivial
calculations.
 Fox, R.F. (1978), ``Hydrodynamic fluctuation theories," J. Math. Phys. 19, 1993.
 Gibbs, J.W. (1875-78), ``On the Equilibrium of Heterogeneous Substances,'' Trans. Conn. Acad. Sci. III
108, 343. [Reprinted in The Scientific Papers of J. Willard Gibbs, Vol.1, Dover, NY, 1961.]
 (1902), Elementary Principles in Statistical Mechanics, Yale University Press, New Haven, Conn.
 Grandy, W.T., Jr.(1987), Foundations of Statistical Mechanics, Vol.I: Equilibrium Theory,
Reidel, Dordrecht.
 (1988), Foundations of Statistical Mechanics, Vol.II:
Nonequilibrium Phenomena, Reidel, Dordrecht.
 (2004a), ``Time Evolution in Macroscopic Systems. I: Equations of Motion," Found. Phys. 34, 1.
 (2004b), ``Time Evolution in Macroscopic Systems. II: The Entropy," Found. Phys. 34, 16.
 Heims, S.P. and E.T. Jaynes (1962), ``Theory of Gyromagnetic Effects and
Some Related Magnetic Phenomena," Rev. Mod. Phys. 34, 143.
 Jaynes, E.T. (1957a), ``Information Theory and Statistical Mechanics,''
Phys. Rev. 106, 620.
 (1957b), ``Information Theory and Statistical Mechanics.II," Phys. Rev.
108, 171.
 (1979), ``Where Do We Stand On Maximum Entropy?," in R.D.Levine and M.
Tribus (eds.), The Maximum Entropy Formalism, M.I.T. Press, Cambridge,
MA.
 Jou, D., J. Casas-Vázquez, and G. Lebon (2001), Extended Irreversible Thermodynamics, Springer,
Berlin.
 Landau, L.D. and E.M. Lifshitz (1957), ``Hydrodynamic Fluctuations," Sov. Phys. JETP 5, 512.
[Zh. Eksp. Teor. Fiz. 32, 618 (1957).]
See, also, Fluid Mechanics, Pergamon, New York, 1959.
 Mitchell, W.C. (1967), ``Statistical Mechanics of Thermally Driven
Systems," Ph.D. thesis, Washington University, St. Louis, MO (unpublished).
 Morozov, V.G. (1984), ``On the Langevin Formalism for Nonlinear and Nonequilibrium Hydrodynamic Fluctuations,"
Physica A 126, 443.
 Pfaffelhuber, E. (1977), ``InformationTheoretic Stability and Evolution Criteria in Irreversible Thermodynamics,"
J. Stat. Phys. 16, 69.
 Puff, R.D. and N.S. Gillis (1968), ``Fluctuations and Transport Properties
of ManyParticle Systems," Ann. Phys. (N.Y.) 6, 364.
 Quentin, G. and I. Rehberg (1995), ``Direct Measurement of Hydrodynamic Fluctuations in a Binary Mixture,"
Phys. Rev. Letters 74, 1578.
 Schmitz, R. (1988), ``Fluctuations in Nonequilibrium Fluids," Phys. Repts. 171, 1.
 Schmitz, R. and E.G.D. Cohen (1985), ``Fluctuations in a Fluid under a Stationary Heat Flux. I. General Theory,"
J. Stat. Phys. 38, 285.
 Schlögl, F. (1971), ``Produced Entropy in Quantum Statistics," Z. Physik 249, 1.
 Shannon, C. (1948), ``A Mathematical Theory of Communication," Bell System Tech. J. 27, 379, 623.
 Shore, J.E. and R.W. Johnson (1980), ``Axiomatic Derivation of the Principle of Maximum Entropy and the
Principle of Minimum Cross-Entropy," IEEE Trans. Inf. Th. IT-26, 26.
 Snow, J.A. (1967) ``Sound Absorption in Model Quantum
Systems," Ph.D. thesis, Washington University, St. Louis, MO (unpublished).
 Truesdell, C. (1984), Rational Thermodynamics, Springer, New York.
Footnotes:
^{1}These references will be denoted as I and II, respectively, in what follows, the corresponding
equations then referred to as (In) and (IIn).
^{2}In this subsection we denote the maximumentropy function by S, without subscripts.
^{3}This characterization of thermal driving
was first introduced by Mitchell (1967).
^{4}The energy density is
usually what is driven; this will be considered presently when spatial variation is included.
^{5}This model was presented years ago by Mitchell (1967) as an early example of mode-mode coupling.
^{6}The limit \omega \to 0 eliminates the three-point-correlation term contributions that are regular in \omega.
^{7}These quantities can be obtained by employing expressions such as (37) and (38), rather than adapting
the steady-state scenario.
^{8}This is not entirely accurate; while (108) is
mathematically deterministic, it has a strong inferential component in Fick's law and its variables are (sharp)
expectation values. A better choice might be `quasi-deterministic'.
^{9}More directly, the Lagrange
multiplier functions are evolving in parallel with the extensive variables and at the same rate.