Time Evolution In Macroscopic Systems.

II: The Entropy

W.T. Grandy, Jr.

Department of Physics & Astronomy, University of Wyoming

Laramie, Wyoming 82071

Abstract. The concept of entropy in nonequilibrium macroscopic systems is investigated in the light of an extended equation of motion for the density matrix obtained in a previous study. It is found that a time-dependent information entropy can be defined unambiguously, but it is the time derivative or entropy production that governs ongoing processes in these systems. The differences in physical interpretation and thermodynamic role of entropy in equilibrium and nonequilibrium systems are emphasized and the observable aspects of entropy production are noted. A basis for nonequilibrium thermodynamics is also outlined.

1. Introduction

The empirical statement of the Second Law of thermodynamics by Clausius (1865) is

$$ S(\text{initial}) \le S(\text{final}) , \tag{1} $$

where S is the total entropy of everything taking part in the process under consideration, and the entropy for a single closed system is defined to within an additive constant by

$$ S(2)-S(1) = \int_1^2 \frac{dQ}{T} = \int_1^2 \frac{C(T)\,dT}{T} , \tag{2} $$

where C(T) is a heat capacity. The integral in (2) is to be taken over a reversible path, a locus of thermal equilibrium states connecting the macroscopic states 1 and 2, which is necessary because the absolute temperature T is not defined for other than equilibrium states; dQ represents the net thermal energy added to or taken from the system in the process. As a consequence, entropy is defined in classical thermodynamics only for states of thermal equilibrium. Equation (1) states that in the change from one state of thermal equilibrium to another along a reversible path the total entropy of all bodies involved cannot decrease; if it increases, the process is irreversible. That is, the integral provides a lower bound on the change in entropy. This phenomenological entropy is to be found from experimental measurements with calorimeters and thermometers, so that by construction it is a function only of the macroscopic parameters defining the macroscopic state of a system, S(V,T,N), say, where V and N are the system volume and particle number, respectively. It makes no reference to microscopic variables or probabilities, nor can any explicit time dependence be justified in the context of classical thermodynamics. Equation (1) is a statement of macroscopic phenomenology that cannot be proved true solely as a consequence of the microscopic dynamical laws of physics, as appreciated already by Boltzmann (1895): "The Second Law can never be proved mathematically by means of the equations of dynamics alone." (Nor, for that matter, can the First Law!)

Theoretical definitions of entropy were first given in the context of statistical mechanics by Boltzmann and Gibbs, and these efforts culminated in the formal definition of equilibrium ultimately given by Gibbs (1902) in terms of his variational principle. In Part I (Grandy, 2003, preceding paper ¹) we observed that the latter is a special case of a more general principle of maximum information entropy (PME), and in equilibrium it is that maximum subject to macroscopic constraints that is identified with the experimental entropy of (2). One of the dominant concerns in statistical mechanics and thermodynamics has long been that of extending these notions unambiguously to nonequilibrium phenomena and irreversible processes, and it is that issue we shall address in this work.

How is one to define a time-dependent S(t) for nonequilibrium states? Do there even exist sensible physical definitions of experimental and theoretical 'entropy' analogous to those describing equilibrium states? Other than S(t) and the density matrix ρ(t), what other key parameters might be essential to a complete description of nonequilibrium? These and other questions have been debated, sometimes heatedly, for over a century without any broad consensus having been reached; perhaps a first step toward clarifying the issue should be to understand the source of the differences of opinion that lead to so many different points of view. One problem, of course, is a lack of experimental guidance in determining those features of the phenomena that are really of fundamental importance, and associated with this has been the necessary restriction of theories to linear departures from equilibrium, owing to enormous calculational difficulties with the appropriate nonlinear forms. What happens, then, is that many theoretical descriptions of nonequilibrium systems tend to predict similar results in the linear domain and there is little to distinguish any fundamental differences that get to deeper matters.

In view of these obstacles it may be useful to look at the problem from a different perspective. Such sharp disagreements would seem to arise from different hidden premises in various approaches to nonequilibrium statistical mechanics, and we suggest here that these have much to do with differing views of the underlying probability theory and its precise role. We initiated an examination of this point in I, which culminated in the expression (I-40) as the appropriate form of the equation of motion for the density matrix. Our purpose here is to apply the implications of that result to a further study of time varying macroscopic systems.

As a preliminary step it might be helpful to note some overall features of entropy and nonequilibrium processes that have to be considered in any approach to the problem of generalizing S. Suppose we prepare a system in a nonequilibrium state by applying an external force of some kind that substantially perturbs the equilibrium system, possibly increasing its energy and adding matter to it, for example. This state is defined by removing the external source at time t=t₀, and at that instant it is described by a density matrix ρ(t₀). Whether or not we can define a physical entropy at that time, we can certainly compute the information entropy of that nonequilibrium state as S_I(t₀) = -k Tr[ρ(t₀) ln ρ(t₀)]. Because the system is now isolated, ρ(t) can only evolve from ρ(t₀) by unitary transformation and S_I remains constant into the future. ² What happens next?

At the cutoff t=t₀ the entropy S_I(t₀) refers only to the nonequilibrium state at that time. In the absence of any other external influences we expect the system to relax into a new equilibrium state, for no other reason than it is the one that can be realized in the overwhelmingly greatest number of ways subject to the appropriate macroscopic constraints. The interesting thing is that often those constraints are already fixed once the external sources are removed, so that the total energy, particle number, volume, etc. at t₀ are now determined for t > t₀. Density inhomogeneities may remain, of course, which will relax to uniformity over the relaxation period, or to equilibrium distributions in a static external field. The entropy of the final equilibrium state is definitely not S_I(t₀), but it is in principle known well before equilibrium is reached: it is the maximum of the information entropy subject to constraints provided by the values of those thermodynamic variables at t=t₀. We may or may not know these values, of course, although a proper theory might predict them; but once the final macrostate is established they can be measured and a new ρ_f calculated by means of the PME, and hence a new entropy predicted for comparison with the experimental form of Clausius; indeed, in equilibrium the Clausius entropy (2) is an upper bound for S_I. Thus, in this relaxation mode we may not see a nice, continuous, monotonically increasing entropy function that can be followed into the equilibrium state; but that's not too surprising, given that we know ρ(t₀) cannot evolve unitarily into ρ_f. (More about relaxation later.) There remains a significant dynamical evolution during this relaxation period, but it is primarily on a microscopic level; its macroscopic manifestation is to be found in the relaxation time, and the possible observation of decaying currents. One thing we might compute and measure in this mode is that relaxation time, which does not necessarily have an immediate connection with entropy. (There may, however, exist a 'relaxation entropy' associated with the redistribution of energy, say, during the relaxation period.)

Ironically, the equilibrium state described so well by classical thermodynamics is essentially a dead end; it is a singular limit in the sense discussed by Berry (2002). Equilibrium is actually a very special, ideal state, for most systems are not usually in equilibrium, at least not completely. As external influences, and therefore time variations, become smaller, the system still remains in a nonequilibrium state evolving in time. In the limit there is a discontinuous qualitative change in the macroscopic system and its description. That is, there is no longer either an `arrow of time' or a past history ³, and the main role of the theory is to compare neighboring states of thermal equilibrium without regard for how those states might have been prepared.

It has long been understood, though not widely, that entropy is not a property of a physical system per se, but of the thermodynamic system describing it, and the latter is defined by the macroscopic constraints imposed. The above remarks, however, lead us to view entropy more as a property of the macrostate, or of the processes taking place in a system. In the equilibrium state these distinctions are blurred, because the thermodynamic system and the macrostate appear to be one and the same thing and there are no time-dependent macroscopic processes. It will be our goal in the following paragraphs to clarify these comments, as well as to provide an unambiguous definition of entropy in nonequilibrium systems, and to understand the possibly very different roles that the entropy concept plays in the two states.

2. Some Preliminary Extensions of the Equilibrium Theory

In I we briefly outlined the variational method of constructing an initial density matrix that could then evolve in time via the appropriate equations of motion. A principal application of that construction is to equilibrium systems, in which case the quantum form of (I-14) becomes, as in (I-10),

$$ \rho = \frac{1}{Z}\, e^{-\beta H} , \qquad Z(\beta) = \mathrm{Tr}\, e^{-\beta H} , \tag{3} $$

where H is the system Hamiltonian and β = (kT)⁻¹. But, if there is no restriction to constants of the motion, the resulting state described by (3) could just as well be one of nonequilibrium based on information at some particular time. Data given only at a single point in space and time, however, can hardly serve to characterize a system whose properties are varying over a space-time region, so a first generalization of the technique is to information available over such regions. Thus, the main task in this scenario is to gather information that varies in both space and time and incorporate it into a density matrix describing a nonequilibrium state. Given an arbitrary but definite thermokinetic history, we can look for what general behavior of the system can be deduced from only this. The essential aspects of this approach were first expounded by Jaynes (1963, 1967, 1979).

To illustrate the method of information gathering, consider a system with a fixed time-independent Hamiltonian and suppose the data to be given over a space-time region R(x,t) in the form of an expectation value of a Heisenberg operator F(x,t). We are reminded that the full equation of motion for such operators, if they are also explicitly time varying, is

$$ i\hbar\,\dot{F} = [F,H] + \partial_t F . \tag{4} $$

When the data vary continuously over R their sum becomes an integral and there is a distinct Lagrange multiplier for each space-time point. Maximization of the entropy subject to the constraint provided by that information leads to a density matrix describing this macrostate:

$$ \rho = \frac{1}{Z} \exp\left[ -\int_R \lambda(x,t)\,F(x,t)\, d^3x\, dt \right] , \tag{5} $$

where

$$ Z[\lambda(x,t)] = \mathrm{Tr} \exp\left[ -\int_R \lambda(x,t)\,F(x,t)\, d^3x\, dt \right] \tag{6} $$

is now the partition functional. The Lagrange multiplier function is identified as the solution of the functional differential equation

$$ \langle F(x,t)\rangle \equiv \mathrm{Tr}[\rho F(x,t)] = -\frac{\delta}{\delta\lambda(x,t)} \ln Z , \qquad (x,t) \in R , \tag{7} $$

and is defined only in the region R. Note carefully that the data set denoted by ⟨F(x,t)⟩ is a numerical quantity that has been equated to an expectation value to incorporate it into a density matrix. Any other operator J(x,t), including J=F, is determined at any other space-time point (x,t) as usual by

$$ \langle J(x,t)\rangle = \mathrm{Tr}\left[ \rho\, J(x,t) \right] = \mathrm{Tr}\left[ \rho(t)\, J(x) \right] . \tag{8} $$

That is, the system with fixed H still evolves unitarily from the initial nonequilibrium state (5); although ρ surely will no longer commute with H, its eigenvalues nevertheless remain unchanged.
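The invariance of the eigenvalues, and hence of the information entropy, under unitary evolution can be checked numerically. The following is a minimal sketch, assuming a small random Hermitian matrix as a stand-in Hamiltonian, a random positive unit-trace matrix as the initial state, and units with ℏ = k = 1; the names `information_entropy` and `evolve` are illustrative and not from the paper.

```python
import numpy as np

def information_entropy(rho, k=1.0):
    """S_I = -k Tr[rho ln rho], evaluated from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                      # convention: 0 ln 0 = 0
    return float(-k * np.sum(p * np.log(p)))

def evolve(rho, H, t, hbar=1.0):
    """Unitary evolution rho(t) = U rho U^dagger with U = exp(-i H t / hbar)."""
    E, V = np.linalg.eigh(H)
    U = V @ np.diag(np.exp(-1j * E * t / hbar)) @ V.conj().T
    return U @ rho @ U.conj().T

rng = np.random.default_rng(0)
dim = 4

# Hypothetical Hamiltonian: a random Hermitian matrix (illustration only).
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
H = (M + M.conj().T) / 2

# Hypothetical initial nonequilibrium state: positive, unit trace.
G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
rho0 = G @ G.conj().T
rho0 /= np.trace(rho0).real

S0 = information_entropy(rho0)
S_later = [information_entropy(evolve(rho0, H, t)) for t in (0.5, 2.0, 10.0)]
```

In this sketch the values in `S_later` agree with `S0` to rounding error, although `rho0` does not commute with `H` and the matrix elements of the state change in time.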

Inclusion of a number of operators F_k, each with its own information-gathering region R_k and its own Lagrange multiplier function λ_k, is straightforward. If F is actually time independent an equilibrium distribution of the form (3) results, and a further removal of spatial dependence brings us back to the canonical distribution of the original PME. But the full form (5) illustrates how ρ naturally incorporates memory effects while placing no restrictions on spatial or temporal scales. Nor are there any issues of retardation, for example, since the procedure is a matter of inference, not dynamics (at this point).

Some further discussion is required here. The density matrix ρ in (5) is not a function of space and time; it merely provides an initial nonequilibrium distribution corresponding to data ⟨F(x,t)⟩ with (x,t) ∈ R. Lack of any other information outside R - in the future, say - may tend to render ρ less and less reliable, and the quality of predictions may deteriorate (fading memory). The maximum entropy itself is a functional of the initial values ⟨F_k(x,t)⟩ with (x,t) ∈ R_k and follows from substitution of (5) into the information entropy:

$$ S_{\rm noneq}[\{\langle F_k\rangle\}] \equiv k \ln Z[\{\lambda_k\}] + k \sum_k \int_{R_k} \lambda_k(x,t)\,\langle F_k(x,t)\rangle\, d^3x\, dt . \tag{9} $$

Although there is no obvious connection of S_noneq with the thermodynamic entropy, it does provide a measure of the number of microscopic states consistent with the history of a system over the R_k(x,t); it might thus be interpreted as the physical entropy of the initial nonequilibrium state (5). If we visualize the evolution of a microstate as a path in `phase space-time', then S_noneq is the cross section of a tube formed by all paths by which the given history could have been realized, a natural extension of Boltzmann's S_B=klnW, where W is a measure of the set of microscopic states compatible with the macroscopic constraints on the system. In this sense S_noneq governs the theory of irreversible processes in much the same way as the Lagrangian governs mechanical processes. The role of entropy is thus greatly expanded to describe not only the present nonequilibrium state, but also the recent thermokinetic history leading to that state. We begin to see that here, unlike the equilibrium situation, entropy is intimately related to processes.
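A discrete, classical analogue may help make the structure of (9) concrete. The sketch below assumes a finite set of microstates, two observables with randomly generated values, and k = 1; the integrals over R_k become plain sums over two constraints. It checks that the entropy of the maximum-entropy distribution equals ln Z + Σ λ_k⟨F_k⟩, and that a perturbed distribution with the same normalization and the same expectation values has lower entropy. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40                                   # number of microstates (assumed)
F1 = rng.normal(size=n)                  # hypothetical observable values
F2 = rng.normal(size=n)
lam = np.array([0.7, -0.4])              # hypothetical Lagrange multipliers

# Maximum-entropy distribution p_i = exp(-lam1*F1_i - lam2*F2_i) / Z
w = np.exp(-(lam[0] * F1 + lam[1] * F2))
Z = w.sum()
p = w / Z
means = np.array([p @ F1, p @ F2])

S_direct = -np.sum(p * np.log(p))        # -sum p ln p
S_formula = np.log(Z) + lam @ means      # discrete analogue of (9), k = 1

# Perturb p while preserving normalization and both expectation values:
# remove from a random vector its components along (1, F1, F2).
B = np.column_stack([np.ones(n), F1, F2])
v = rng.normal(size=n)
coef, *_ = np.linalg.lstsq(B, v, rcond=None)
v = v - B @ coef
v /= np.abs(v).max()
q = p + 0.5 * p.min() * v                # still strictly positive
S_q = -np.sum(q * np.log(q))
```

Here `S_direct` and `S_formula` coincide, and `S_q` is smaller than both, which is the sense in which (9) is a maximum subject to the data.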

If the information-gathering region R is simply a time interval we arrive at the initial state ρ(t₀) considered in the previous section. Restriction of R to only a spatial region leads to a description of inhomogeneous systems. For example, specifying the particle number density ⟨n(r)⟩ throughout the system, in addition to H, constitutes a separate piece of data at each point in the volume, and hence requires a corresponding Lagrange multiplier at each point. The distribution (3) is then replaced by

$$ \rho = \frac{1}{Z} \exp\left[ -\beta H + \int \lambda(\mathbf{r}')\, n(\mathbf{r}')\, d^3r' \right] , \tag{10} $$

which reduces to the grand canonical distribution if n(r) is in fact spatially constant throughout V, or if only the volume integral of n(r) is specified. A similar expression is obtained if, rather than specifying or measuring ⟨n(r)⟩, the inhomogeneity is introduced by means of an external field coupled to n(r). In that case λ(r) is given as a field strength and ⟨n(r)⟩ is to be determined; that is, λ is taken as an independent variable. Extensive application of (10) to inhomogeneous systems is given in the review by Evans (1979).

To this point there has been no mention of dynamic time evolution; we have only described how to construct a single, though arbitrary, nonequilibrium macrostate based on data ranging over a space-time region. A first step away from this restriction is to consider steady-state systems, in which there may be currents, but all variables are time independent. The main dynamical features of the equilibrium state are that it deals only with constants of the motion, among which is the density matrix itself: [H,ρ]=0. These constraints characterize the time-invariant state of an isolated (or closed) system, because the vanishing of the commutator implies that ρ commutes with the time-evolution operator, so that all expectation values are constant in time. The time-invariant state of an open system is also stationary, but H almost certainly will not commute with the operators {F_k} defining that state, and hence not with ρ. Nevertheless, we can add that constraint explicitly as the definition of a steady-state probability distribution, and the requirement that [ρ,H]=0 leads to the result that only that part of F_k that is diagonal in the energy representation is to be included in ρ. ⁴ It is reasonably straightforward to show (e.g., Grandy, 1988) that a representation for the diagonal part of an operator is given by

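$$
\begin{aligned}
F^d &= F - \lim_{\epsilon\to 0^+} \int_{-\infty}^{0} e^{\epsilon t}\, \partial_t F(x,t)\, dt \\
&= \lim_{\epsilon\to 0^+} \epsilon \int_{-\infty}^{0} e^{\epsilon t}\, F(x,t)\, dt \\
&= \lim_{\tau\to\infty} \frac{1}{\tau} \int_{-\tau}^{0} F(x,t)\, dt ,
\end{aligned}
\tag{11}
$$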
where the time dependence of F is determined by (4), and ε > 0. The second line follows from an integration by parts; the third is essentially Abel's theorem and equates the diagonal part with a time average over the past, which is what we might expect for a stationary process. That is, F^d is that part of F that remains constant under a unitary transformation generated by H.
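The equivalence of the last two lines of (11) with the energy-diagonal part of an operator can be illustrated on a finite matrix. The sketch below assumes ℏ = 1, a random real symmetric H with a nondegenerate spectrum, an operator F with no explicit time dependence, and the Heisenberg form F_mn(t) = F_mn exp[i(E_m - E_n)t] in the energy basis; the time integrals are then done in closed form for each matrix element. The function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 5

# Hypothetical Hamiltonian and observable (random symmetric matrices).
M = rng.normal(size=(dim, dim))
H = (M + M.T) / 2
A = rng.normal(size=(dim, dim))
F = (A + A.T) / 2

E, V = np.linalg.eigh(H)
F_energy = V.T @ F @ V                           # F in the energy basis
F_diag = V @ np.diag(np.diag(F_energy)) @ V.T    # energy-diagonal part of F
omega = E[:, None] - E[None, :]                  # E_m - E_n

def abel_average(eps):
    """eps * int_{-inf}^0 e^{eps t} F(t) dt; each element gets eps/(eps + i*omega)."""
    weights = eps / (eps + 1j * omega)
    return V @ (weights * F_energy) @ V.T

def time_average(tau):
    """(1/tau) * int_{-tau}^0 F(t) dt, evaluated element by element."""
    safe = np.where(omega == 0, 1.0, omega)      # avoid 0/0 on the diagonal
    weights = np.where(omega == 0, 1.0,
                       (1 - np.exp(-1j * safe * tau)) / (1j * safe * tau))
    return V @ (weights * F_energy) @ V.T

F_abel = abel_average(1e-6)
F_tavg = time_average(1e6)
```

For small `eps` and large `tau` both averages approach `F_diag`, which commutes with `H`; the off-diagonal elements are suppressed by factors of order eps/|ω| and 1/(|ω|τ) respectively.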

Consider a number of operators F_k(x) defining a steady-state process. Then the steady-state distribution ρ_ss is simply a modification of that described by (5) and (6):

$$ \rho_{ss} = \frac{1}{Z_{ss}} \exp\left[ -\sum_k \int_{R_k} \lambda_k(x)\, F^d_k(x)\, d^3x \right] , \tag{12} $$

where

$$ Z_{ss}[\lambda(x)] = \mathrm{Tr} \exp\left[ -\sum_k \int_{R_k} \lambda_k(x)\, F^d_k(x)\, d^3x \right] . \tag{13} $$

We illustrate some applications of these expressions further on, but note that in their full nonlinear form they present formidable difficulties in calculations.

This last caveat suggests that we first examine small departures from equilibrium, much in the spirit of Eqs.(I-17)-(I-19). Suppose the equilibrium distribution to be based on expectation values of two variables, ⟨f⟩ and ⟨g⟩, with corresponding Lagrange multipliers λ_f, λ_g. We also suppose that no generalized work is being done on the system, so that only 'heat-like' sources may operate. A small change from the equilibrium distribution can be characterized by small changes in the Lagrange multipliers, which in turn will induce small variations in the expectation values. Thus,

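$$ \delta\langle f\rangle = \frac{\partial\langle f\rangle}{\partial\lambda_f}\,\delta\lambda_f + \frac{\partial\langle f\rangle}{\partial\lambda_g}\,\delta\lambda_g , \tag{14a} $$
$$ \delta\langle g\rangle = \frac{\partial\langle g\rangle}{\partial\lambda_f}\,\delta\lambda_f + \frac{\partial\langle g\rangle}{\partial\lambda_g}\,\delta\lambda_g . \tag{14b} $$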

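But from (I-13b) and (I-14) the negatives of these derivatives are just the covariances of f and g,

$$ K_{fg} = K_{gf} \equiv -\frac{\partial\langle f\rangle}{\partial\lambda_g} = -\frac{\partial\langle g\rangle}{\partial\lambda_f} = \langle fg\rangle - \langle f\rangle\langle g\rangle , \tag{15} $$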
so (14) reduce to the matrix equation

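$$ \begin{pmatrix} \delta\langle f\rangle \\ \delta\langle g\rangle \end{pmatrix} = - \begin{pmatrix} K_{ff} & K_{fg} \\ K_{gf} & K_{gg} \end{pmatrix} \begin{pmatrix} \delta\lambda_f \\ \delta\lambda_g \end{pmatrix} . \tag{16} $$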

In references on irreversible thermodynamics (e.g., de Groot and Mazur, 1962) the quantities on the left-hand side of (16) are called fluxes, and the (-δλ)'s are thought of as the forces that drive the system back to equilibrium. We can thus think of the λ's as potentials that produce such forces. Linear homogeneous relations such as (16) were presumed by Onsager (1931), but here they arise quite naturally, and in (15) we observe the celebrated Onsager reciprocity relations.
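The linear relations (14)-(16) can be checked on a simple classical model in which f and g are commuting variables defined on a finite set of microstates. The following sketch assumes randomly generated values for f and g, arbitrary multipliers, and a small multiplier change; it compares the actual change of the expectation values with the prediction -K δλ of (16). Names such as `mean_and_cov` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
f = rng.normal(size=n)                    # hypothetical variable f on n microstates
g = 0.6 * f + rng.normal(size=n)          # g is built to be correlated with f
X = np.vstack([f, g])                     # rows: f, g

def mean_and_cov(lam):
    """Expectation values and covariance matrix for p_i ~ exp(-lam_f f_i - lam_g g_i)."""
    w = np.exp(-lam @ X)
    p = w / w.sum()
    mean = X @ p
    K = (X * p) @ X.T - np.outer(mean, mean)   # K_ab = <ab> - <a><b>, as in (15)
    return mean, K

lam = np.array([0.3, -0.2])               # (lambda_f, lambda_g), assumed values
mean0, K = mean_and_cov(lam)

d_lam = np.array([1e-4, -2e-4])           # small change in the multipliers
mean1, _ = mean_and_cov(lam + d_lam)

delta_mean = mean1 - mean0                # actual change of (<f>, <g>)
linear_prediction = -K @ d_lam            # right-hand side of (16)
```

The two vectors agree up to terms of second order in `d_lam`, and `K` is symmetric, which is the reciprocity noted in (15).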

Suppose now that we add another constraint to the maximum-entropy construction by letting f be coupled to a weak thermal source. In addition, we shall specify that g is explicitly not driven, so that any internal changes in it can only be inferred from the changes in f. We thus set δλ_g=0 in (16) and those equations reduce to

$$ \delta\langle f\rangle = -K_{ff}\,\delta\lambda_f = \delta Q_f , \qquad \delta\langle g\rangle = -K_{gf}\,\delta\lambda_f . \tag{17} $$
So for small variations the change in the coupled variable is essentially the source strength itself; the internal change in g is also proportional to that source strength, but modulated by the extent to which g and f are correlated:

$$ \delta\langle g\rangle = \frac{K_{gf}}{K_{ff}}\,\delta Q_f , \tag{18} $$

exhibiting what is sometimes referred to as mode-mode coupling. These expressions are precisely what one expects from a re-maximization of the entropy subject to a small change δ⟨f⟩. For example, if δQ_f > 0 and f and g are positively correlated, K_gf > 0, then we expect increases in the expectation values of both quantities, as well as a corresponding increase in the maximum entropy.

Although this discussion of small departures from equilibrium is only a first step, it reinforces, and serves as a guide to, the important role of sources in any deeper theory. It also exhibits the structure of the first approximation, or linearization of such a theory, which is often a necessary consideration. We return to the essential aspects of that approximation a bit later.

3. Sources and Thermal Driving

We seek a description of macroscopic nonequilibrium behavior that is generated by an arbitrary source whose precise details may be unknown. One should be able to infer the presence of such a source from the data, and both the strength and rate of driving of that source should be all that are required for predicting reproducible effects. Given data - expectation values, say - that vary continuously in time, we infer a source at work and expect ρ to be a definite function of time, possibly evolving principally by external means. In I we argued that, because all probabilities are conditional on some kind of given information or hypothesis, P(A_i|I) can change in time only if the information I is changing in time, while the propositions {A_i} are taken as fixed. This then served as the basis for an abstract model of time-dependent probabilities. With this insight we can see how the Gibbs algorithm might be extended to time-varying macroscopic systems in a straightforward manner.

As in I, information gathered in one time interval can certainly be followed by collection in another a short time later, and can continue to be collected in a series of such intervals, the entropy being re-maximized subject to all previous data after each interval. Now let those intervals become shorter and the intervals between them closer together, so that by an obvious limiting procedure they all blend into one continuous interval whose upper endpoint is always the current moment. Thus, there is nothing to prevent us from imagining a situation in which our information or data are continually changing in time. A rationale for envisioning re-maximization to occur at every moment, rather than all at once, can be found by again appealing to Boltzmann's expression for the entropy: S_B = ln W. At any moment W is a measure of the phase volume of all those microstates compatible with the macroscopic constraints - and ln W is the maximum of the information entropy at that instant. As Boltzmann realized, this is a valid representation of the maximized entropy even for a nonstationary state. It is essential to understand that W is a number representing the multiplicity of a macrostate that changes only as a result of changing external constraints. It is not a descriptor of which microscopic arrangements are being realized by the system at the moment - there is no way we can ascertain that - but only a measure of how many such states may be compatible with the macrostate defined by those constraints. In principle we could always compute a W for a set of values of the macroscopic constraints without ever carrying out an experiment. Thus, we begin to see how an evolving entropy can possibly be related to the time-dependent process.

There may seem to be a problem here for someone who thinks of probabilities as real physical entities, since it might be argued that the system cannot possibly respond fast enough for W to readjust its content instantaneously. But it is not the response of the system that is at issue here; only the set of possible microstates compatible with the present macroscopic constraints readjusts. Those potentialities always exist and need no physical signal to be realized. A retardation problem might exist if we were trying to follow the system's changing occupation of microstates, but we are not, because we cannot. The multiplicity W does not change just because the microstate occupied by the system changes; in equilibrium those changes go on continuously, but W remains essentially constant. Only variations in the macroscopic constraints can change W, and those are instantaneous and lead to immediate change in the maximum information entropy S_B.

To introduce the notion of a general source let us consider a generic system described by a density matrix

$$ \rho = \frac{1}{Z}\, e^{\alpha A + \beta B + \gamma C} , \tag{19} $$

and a process that drives the variable B such that an amount ΔB is transferred into the system. That is, B is driven by some means other than dynamically, with no obvious effective Hamiltonian. In addition, the variable A is explicitly not driven, but can change only as a result of changes in B if A and B are correlated. Since there is no new information regarding A, even though it is free to readjust when B is changed, the Lagrange multiplier α must remain unchanged. We also add the further constraint on the process that C is to remain unchanged under transfer of ΔB. This is a generalization of the scenario described by (16), and can be summarized as follows:

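$$
\begin{aligned}
\delta\alpha &= 0 , & \langle A\rangle &\to \langle A\rangle' , \\
\delta\beta &\ne 0 , & \langle B\rangle &\to \langle B\rangle + \Delta B , \\
\delta\gamma &= -\frac{K_{CB}}{K_{CC}}\,\delta\beta , & \langle C\rangle &\to \langle C\rangle .
\end{aligned}
\tag{20}
$$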
This is the most general form of a constrained driven process, except for inclusion of a number of variables of each kind. Any such driving not tied to a specific dynamic term in a Hamiltonian will be referred to as thermal driving. A variable, and therefore the system itself, is said to be thermally driven if no new variables other than those constrained experimentally are needed to characterize the resulting state, and if the Lagrange multipliers corresponding to variables other than those specified remain constant. As discussed in I, a major difference with purely dynamic driving is that the thermally-driven density matrix is not constrained to evolve by unitary transformation alone.
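The bookkeeping in (20) can be illustrated with a classical three-variable model. The sketch below assumes commuting variables A, B, C with randomly generated values on a finite set of microstates and the sign convention of (19). It applies a small δβ with δα = 0 and δγ chosen as in (20), then checks that ⟨C⟩ is unchanged to first order. The predicted shift of ⟨A⟩, (K_AB - K_AC K_CB / K_CC) δβ, is obtained here by linearizing around (19); it is not a formula quoted from the text.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
A = rng.normal(size=n)                   # hypothetical microstate values
B = 0.5 * A + rng.normal(size=n)         # B correlated with A
C = 0.4 * B + rng.normal(size=n)         # C correlated with B
X = np.vstack([A, B, C])                 # rows: A, B, C

def state(mult):
    """Means and covariances for p_i ~ exp(alpha*A_i + beta*B_i + gamma*C_i), as in (19)."""
    w = np.exp(mult @ X)
    p = w / w.sum()
    mean = X @ p
    K = (X * p) @ X.T - np.outer(mean, mean)
    return mean, K

mult = np.array([0.2, -0.3, 0.1])        # (alpha, beta, gamma), assumed values
mean0, K = state(mult)

d_beta = 1e-4                                   # B is driven
d_gamma = -K[2, 1] / K[2, 2] * d_beta           # (20): keeps <C> fixed
mean1, _ = state(mult + np.array([0.0, d_beta, d_gamma]))   # delta alpha = 0

delta = mean1 - mean0                           # changes in (<A>, <B>, <C>)
predicted_dA = (K[0, 1] - K[0, 2] * K[2, 1] / K[2, 2]) * d_beta
```

In this sketch `delta[2]` vanishes to first order in `d_beta`, `delta[1]` plays the role of ΔB, and `delta[0]` shows how the undriven variable A readjusts purely through its correlations.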

Let us suppose that the system is in thermal equilibrium with time-independent Hamiltonian in the past, and then at t=0 a source is turned on smoothly and specified to run continuously, as described by its effect on the expectation value ⟨F(t)⟩. That is, F(t) is given throughout the changing interval [0,t] and is specified to continue to change in a known way until further notice. ⁵ Although any complete theory of nonequilibrium must be a continuum field theory, we shall omit spatial dependence explicitly here in the interest of clarity and return to address that point later. For convenience we consider only a single driven operator; multiple operators, both driven and constrained, are readily included. Based on the probability model of I, the PME then provides the density matrix for thermal driving:

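$$
\begin{aligned}
\rho_t &= \frac{1}{Z_t} \exp\left[ -\beta H - \int_0^t \lambda(t')\,F(t')\, dt' \right] , \\
Z_t[\beta,\lambda(t)] &= \mathrm{Tr} \exp\left[ -\beta H - \int_0^t \lambda(t')\,F(t')\, dt' \right] ,
\end{aligned}
\tag{21}
$$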
and the Lagrange-multiplier function is formally obtained from

$$ \langle F(t)\rangle_t = -\frac{\delta}{\delta\lambda(t)} \ln Z_t , \tag{22} $$

for t in the driving interval. Reference to the equilibrium state is made explicit not only because it provides a measure of how far the system is removed from equilibrium, but also because it removes all uncertainty as to the previous history of the system prior to introduction of the external source; clearly, these are not essential features of the construction.

Since ρ_t can now be considered an explicit function of t, we can employ the operator identity
$$ \partial_x e^{A(x)} = e^{A(x)}\, \overline{\partial_x A} $$
to compute the time derivative:

$$ \partial_t \rho_t = \rho_t\, \lambda(t) \left[ \langle F(t)\rangle_t - \overline{F(t)} \right] , \tag{23} $$

where the overline denotes a generalized Kubo transform with respect to the operator ln ρ_t:

$$ \overline{F(t)} \equiv \int_0^1 e^{-u \ln\rho_t}\, F(t)\, e^{u \ln\rho_t}\, du , \tag{24} $$

which arises here from the possible noncommutativity of F(t) with itself at different times.
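The operator identity used to obtain (23) can be verified numerically for finite matrices. The sketch below assumes a one-parameter family A(x) = A0 + xB of real symmetric matrices, so that ∂_x A = B, and evaluates the Kubo-type integral ∫₀¹ e^{-uA} B e^{uA} du in closed form in the eigenbasis of A, where each matrix element is multiplied by (e^{a_j - a_i} - 1)/(a_j - a_i). The helper names `expm_sym` and `kubo_transform` are illustrative.

```python
import numpy as np

def expm_sym(A):
    """Matrix exponential of a real symmetric (or Hermitian) matrix via eigh."""
    a, U = np.linalg.eigh(A)
    return U @ np.diag(np.exp(a)) @ U.conj().T

def kubo_transform(A, B):
    """int_0^1 e^{-uA} B e^{uA} du for Hermitian A, computed in the eigenbasis of A."""
    a, U = np.linalg.eigh(A)
    B_eig = U.conj().T @ B @ U
    diff = a[None, :] - a[:, None]                 # a_j - a_i at position [i, j]
    near_zero = np.abs(diff) < 1e-12
    safe = np.where(near_zero, 1.0, diff)
    weights = np.where(near_zero, 1.0, np.expm1(safe) / safe)
    return U @ (weights * B_eig) @ U.conj().T

rng = np.random.default_rng(6)
dim = 4
M0 = rng.normal(size=(dim, dim))
A0 = (M0 + M0.T) / 2                               # hypothetical A(0)
M1 = rng.normal(size=(dim, dim))
B = (M1 + M1.T) / 2                                # hypothetical dA/dx, not commuting with A0

h = 1e-5
finite_difference = (expm_sym(A0 + h * B) - expm_sym(A0 - h * B)) / (2 * h)
identity_rhs = expm_sym(A0) @ kubo_transform(A0, B)
```

The two matrices agree to the accuracy of the central difference, whereas the naive product `expm_sym(A0) @ B` does not, which is why the transform (24) is needed when the operators fail to commute.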

The expression (23) has the form of what is often called a 'master equation', but it has an entirely different origin and is exact; it is, in fact, the ∂_t ρ term in the equation of motion (I-40). Because λ(t) is defined only on the information-gathering interval [0,t], Eq.(23) just specifies the rate at which ρ_t is changing in that interval. Although ρ_t does not evolve by unitary transformation under time-independent H in the Heisenberg picture, in this case it does evolve explicitly, and in the Schrödinger picture this time variation will be in addition to the canonical time evolution. In turn, an unambiguous time dependence for the entropy is implied, as follows.

The theoretical maximum entropy S_t = -k Tr[ρ_t ln ρ_t] is obtained explicitly by substitution from (21),

$$ \frac{1}{k} S_t = \ln Z_t + \beta \langle H\rangle_t + \int_0^t \lambda(t')\,\langle F(t')\rangle_t\, dt' ; \tag{25} $$

it is the continuously re-maximized information entropy. Equation (25) indicates explicitly that ⟨H⟩_t changes only as a result of changes in, and correlation with, F. The constraint that H is explicitly not driven implies that ⟨H⟩_t and ⟨F(t′)⟩_t are no longer independent, and that means that λ(t) cannot be determined directly from S_t by functional differentiation in (25); this has important consequences.
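The substitution leading to (25) is short enough to write out. The following sketch uses only the definition of S_t and the form (21) of the density matrix, together with the normalization Tr ρ_t = 1:

```latex
\begin{aligned}
\ln\rho_t &= -\ln Z_t - \beta H - \int_0^t \lambda(t')\,F(t')\,dt' , \\
\frac{1}{k}S_t = -\mathrm{Tr}\left[\rho_t \ln\rho_t\right]
 &= \ln Z_t\,\mathrm{Tr}\,\rho_t + \beta\,\mathrm{Tr}\left[\rho_t H\right]
    + \int_0^t \lambda(t')\,\mathrm{Tr}\left[\rho_t F(t')\right] dt' \\
 &= \ln Z_t + \beta\langle H\rangle_t + \int_0^t \lambda(t')\,\langle F(t')\rangle_t\,dt' .
\end{aligned}
```

Note that every expectation value in the last line is taken with the same ρ_t, including those of F(t′) at earlier times t′ < t.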

The expectation value of another operator at time t is ⟨C⟩_t = Tr[ρ_t C], and direct differentiation yields

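$$
\begin{aligned}
\frac{d}{dt}\langle C(t)\rangle_t &= \mathrm{Tr}\left[ C(t)\,\partial_t\rho_t + \rho_t\,\dot{C}(t) \right] \\
&= \langle \dot{C}(t)\rangle_t - \lambda(t)\, K^t_{CF}(t,t) ,
\end{aligned}
\tag{26}
$$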
where the superposed dot denotes a total time derivative. We have here introduced the covariance function

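$$ K^t_{CF}(t',t) \equiv \langle \overline{F(t')}\, C(t)\rangle_t - \langle F(t')\rangle_t \langle C(t)\rangle_t = -\frac{\delta\langle C(t)\rangle_t}{\delta\lambda(t')} , \tag{27} $$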

which is a quantum mechanical generalization of the static covariance (15). Note that all of the preceding entities are completely nonlinear, in that expectation values, Kubo transforms, and covariance functions are all written in terms of the density matrix ρ_t, which is the meaning of the superscript t on K^t_CF. Although time-translation invariance is not a property of the general nonequilibrium system, it is not difficult to show that the reciprocity relation K^t_CF(t′,t) = K^t_FC(t,t′) is valid.

Let us introduce a new notation into (26), which at first appears to be only a convenience:

$$ \sigma_C(t) \equiv \frac{d}{dt}\langle C(t)\rangle_t - \langle \dot{C}(t)\rangle_t = -\lambda(t)\, K^t_{CF}(t,t) . \tag{28} $$

For a number of choices of C and F the equal-time covariance function vanishes, but if C=F an illuminating interpretation first noticed by Mitchell (1967) emerges:

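$$
\begin{aligned}
\sigma_F(t) &\equiv \frac{d}{dt}\langle F(t)\rangle_t - \langle \dot{F}(t)\rangle_t \\
&= -\lambda(t)\, K^t_{FF}(t,t) .
\end{aligned}
\tag{29}
$$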
Owing to the specification of thermal driving, d⟨F(t)⟩_t/dt is the total time rate-of-change of ⟨F(t)⟩_t in the system at time t, whereas ⟨Ḟ(t)⟩_t is the rate of change produced by internal relaxation. Hence, σ_F(t) must be the rate at which F is driven or transferred by the external source, and is often what is measured or controlled experimentally. One need know nothing else about the details of the source, because its total effect on the system is expressed by the second equality in (29), which is similar to the first line of (17). If the source strength is given, then (29) is a nonlinear transcendental equation determining the Lagrange multiplier function λ(t).

An important reason for eventually including spatial dependence is that we can now derive the macroscopic equations of motion. For example, if F(t) is one of the conserved densities e(x,t) in a simple fluid and J(x,t) the corresponding current density, then the local microscopic continuity equation

$$\dot e(x,t) + \nabla\cdot J(x,t) = 0 \qquad (30)$$

is satisfied irrespective of the state of the system. When this is substituted into (29) we obtain the macroscopic conservation law

$$\frac{d}{dt}\langle e(x,t)\rangle_t + \nabla\cdot\langle J(x,t)\rangle_t = \sigma_e(x,t) , \qquad (31)$$

which is completely nonlinear. Specification of sources therefore provides automatically the thermokinetic equations of motion; for example, if e is the momentum density mj(x,t), so that J is the stress tensor T_ik, then a series of transformations turns (31) into the Navier-Stokes equations of fluid dynamics.

Nonequilibrium Thermodynamics

The notion of thermal driving provides a basis for nonequilibrium thermodynamics, which can be developed in much the same way as is done for the equilibrium theory (e.g., Grandy, 1987). As with that case, the operator $F$ can also depend on an external variable $\alpha$, so that at time $t$ the entropy is $S_t=S_t[\langle H\rangle_t, \langle F(t)\rangle_t; \alpha]$; of course, we could also include a number of other measured variables $\{F_i\}$, though only $H$ and $F$ will be employed here. But now $S_t$ is also a function of time and, from (25), its total time derivative is

$$\frac{1}{k}\frac{dS_t}{dt} = \left(\frac{\partial \ln Z_t}{\partial\alpha}\right)\dot\alpha + \beta\frac{d\langle H\rangle_t}{dt} - \lambda(t)\int_0^t \lambda(t')K^t_{FF}(t,t')\,dt' . \qquad (32)$$

Although $\partial_t Z_t$ contributes to $\dot S_t$, its contribution is cancelled because

$$\partial_t \ln Z_t = -\lambda(t)\langle F(t)\rangle_t , \qquad (33)$$

which also provides a novel representation for $Z_t$ upon integration. In principle, then, one can follow the increase (or decrease) of entropy in the presence of external sources (or sinks).

The most common type of external variable $\alpha$ is the system volume $V$, so that in the equilibrium theory $(\partial\langle H\rangle/\partial V)\,dV=-P\,dV$ is an element of work. This suggests a general interpretation of the first term on the right-hand side of (32). As an example, in the present scenario consider the simple process of an adiabatic free expansion of a gas, wherein only the work term is involved in (32). We can now model this by specifying a form for $\alpha = V$; for example, $V(t)=V_0\left(2-e^{-bt}\right)$ would, for $b$ very large, rapidly inflate the volume to double its size over an interval from $t=0$ to some later time $t$. The coefficient of $\dot\alpha$ in (32) is proportional to the pressure, so that one also needs an equation of state for the gas; but usually the pressure is proportional to $V^{-1}$ and therefore decreases exponentially as well. In the case of an ideal gas, integration of this form for $\dot S_t$ over $(0,t)$ yields the expected change $S_t-S_0=kN\ln 2$. This result is almost independent of the model as long as $V(t) \simeq 2V_0$.
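
The free-expansion estimate can be checked numerically. The sketch below assumes an ideal gas, for which $\partial\ln Z_t/\partial V = N/V$, so that the work term of (32) reads $(1/k)\,\dot S_t = N\dot V/V$; the rate constant `b`, the step count, and the function names are illustrative choices only.

```python
import math

def volume(t, v0=1.0, b=50.0):
    """Model volume V(t) = V0 * (2 - exp(-b t))."""
    return v0 * (2.0 - math.exp(-b * t))

def volume_rate(t, v0=1.0, b=50.0):
    """Time derivative of the model volume."""
    return v0 * b * math.exp(-b * t)

def entropy_change_per_k(n_particles, t_final, steps=200_000):
    """Trapezoidal integral of N * Vdot / V over (0, t_final), i.e. (S_t - S_0)/k."""
    h = t_final / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * n_particles * volume_rate(t) / volume(t)
    return total * h

# With b * t_final = 50 the volume has doubled to within about e^{-50}.
delta_s = entropy_change_per_k(n_particles=1.0, t_final=1.0)
```

The integrand is an exact derivative of $N\ln V(t)$, so the result depends only on the end-point volumes, which is why the value is insensitive to the detailed form chosen for $V(t)$.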

Ordinarily $\dot\alpha=0$. In this case we can also explicitly evaluate the term containing the Hamiltonian and rewrite (32) as

$$\frac{1}{k}\frac{dS_t}{dt} = -\beta\lambda(t)K^t_{HF}(t,0) - \lambda(t)\int_0^t \lambda(t')K^t_{FF}(t,t')\,dt' = \gamma_F(t)\,\sigma_F(t) , \qquad (34)$$

where we have employed (29) and defined a new parameter

$$\gamma_F(t) \equiv \beta\,\frac{K^t_{HF}(t,0)}{K^t_{FF}(t,t)} + \int_0^t \lambda(t')\,\frac{K^t_{FF}(t,t')}{K^t_{FF}(t,t)}\,dt' . \qquad (35)$$

Although this expression for $\gamma$ at first glance seems only a bookkeeping convenience, it is actually of some physical significance, as suggested by (20). As noted above, the thermal driving constraint on $H$ prevents $\langle H\rangle_t$ and $\langle F(t)\rangle_t$ from being completely independent; indeed, neither of them is independent of $\langle F(t')\rangle_t$. In turn, and unlike the equilibrium case, $\partial\langle f_m\rangle/\partial\lambda_n$ and $\partial\lambda_n/\partial\langle f_m\rangle$ are no longer the respective elements of a pair of mutually inverse matrices. Thus, $\delta S_t/\delta\langle F(t)\rangle_t$ does not determine $\lambda(t)$; rather, from (25),

$$\frac{\delta S_t}{\delta\langle F(t)\rangle_t} = \frac{\delta\langle H\rangle_t}{\delta\lambda(t)}\frac{\delta\lambda(t)}{\delta\langle F(t)\rangle_t} + \int_0^t \lambda(t')\,\frac{\delta\langle F(t')\rangle_t}{\delta\lambda(t)}\frac{\delta\lambda(t)}{\delta\langle F(t)\rangle_t}\,dt' . \qquad (36)$$

Owing to interdependencies we can now write $\delta\lambda(t)/\delta\langle F(t)\rangle_t=1/K^t_{FF}(t,t)$, and hence the right-hand side of (36) is just $\gamma_F(t)$, which now has the general definition

$$\gamma_F(t) \equiv \left(\frac{\delta S_t}{\delta\langle F(t)\rangle_t}\right)_{\rm thermal\ driving} . \qquad (37)$$

The subscript "thermal driving" reminds us that this derivative is evaluated somewhat differently than in the equilibrium formalism. When the source strength $\sigma_F(t)$ is specified the Lagrange multiplier itself is determined from (29).

Physically, $\gamma_F$ is a transfer potential in the same sense that the $\lambda$s in Eq.(16) are thought of as potentials. Just as products of potentials and expectation values appear in the structure of the equilibrium entropy, in thermal driving the entropy production (34) is always a sum of products of transfer potentials and source terms measuring the rate of transfer. So, the entropy production is not in general given by products of `fluxes' and `forces', and $S_t$ and $\dot S_t$ are not simple generalizations of equilibrium quantities. But the ordinary potentials also play another role in equilibrium: if two systems in contact can exchange energy and particles, then they are in equilibrium if the temperatures and chemical potentials of the two are equal. Similarly, if two systems can exchange quantities $F_i$ under thermal driving, then the conditions for migrational equilibrium at time $t$ are

$$\gamma_{F_i}(t)_1 = \gamma_{F_i}(t)_2 . \qquad (38)$$

Migrational equilibrium in stationary processes is discussed, for example, by Tykodi (1967).

What is the physical interpretation to be given to $S_t$? Clearly it refers only to the information encoded in the distribution of (21) and cannot refer to the internal entropy of the system. In equilibrium the maximum of this information entropy is the same as the experimental entropy, but that is not necessarily the case here. For example, if the driving is removed at time $t=t_1$, then $S_{t_1}$ in (25) can only provide the entropy of that nonequilibrium state at $t=t_1$; its value will remain the same during subsequent relaxation, owing to unitary time evolution. Although the maximum information (or theoretical) entropy provides a complete description of the system based on all known physical constraints on that system, it cannot describe the ensuing relaxation, for it contains no new information about that process. Nevertheless, $S_t$ does have a definite physical interpretation.

The form of $\sigma_F$ in (29) suggests a natural separation of the entropy if that expression is substituted into the second line of (34):

$$\frac{1}{k}\dot S_t = \gamma_F(t)\left(\frac{d}{dt}\langle F(t)\rangle_t - \langle \dot F(t)\rangle_t\right) . \qquad (39)$$

Thus, $\dot S_t$ has the qualitative form $\dot Q/T$, as intuition might have suggested. The first term on the right-hand side of (39) must represent the total time rate-of-change of entropy $\dot S_{\rm tot}$ arising from the thermal driving of $F(t)$, whereas the second term is the rate-of-change of internal entropy $\dot S_{\rm int}$ owing to relaxation. Thus, the total rate of entropy production can be written

$$\dot S_{\rm tot}(t) = \dot S_t + \dot S_{\rm int}(t) , \qquad (40)$$

where the entropy production of transfer owing to the external source, $\dot S_t$, is given by (34). This latter quantity is a function only of the driven variable $F(t)$, whereas the internal entropy depends on all variables, driven or not, necessary to describe the nonequilibrium state and is determined by the various relaxation processes taking place in the system. Calculation of $\dot S_{\rm int}$, of course, depends on a rather detailed model of the system; we'll have more to say on this below. ⁶

In an equilibrium system the major role of $S$ is associated with the Second Law, and this law in its traditional form has little to say about nonequilibrium processes. In these latter processes, however, it is $\dot S_t$, rather than $S_t$ itself, that plays the major role, as is seen in (34)-(37). That is, $\dot S_t$ governs the transfer process in terms of the rate of driving and the transfer potential, in much the same way that $S$ governs the direction of changes between equilibrium states through $dQ/T$. In nonequilibrium processes $\dot S_t$ also governs the rate; this is true even in the steady state when one takes into account sources and sinks.

The distinction between theoretical entropy in equilibrium scenarios and in nonequilibrium processes cannot be emphasized enough. If external forces are removed, it is a mathematical theorem that neither $\rho_t$ nor $S_t$ can evolve into their equilibrium counterparts. This is a singular limit, as discussed earlier, and unless these distinctions are clearly recognized few real advances can be made in nonequilibrium statistical mechanics.

Constant Driving Rate and Spatial Variation

To complete the general development, logical consistency requires an examination of thermal driving at a constant rate. For this purpose it will first be useful to record the generalizations of the primary equations of thermal driving to include spatial coordinates:

$$\partial_t\rho_t = \rho_t\int \lambda(x',t)\left[\langle F(x',t)\rangle_t - \overline{F(x',t)}\right] d^3x' , \qquad (41)$$

$$\frac{1}{k}S_t = \ln Z_t + \beta\langle H\rangle_t + \int d^3x'\int_0^t dt'\,\lambda(x',t')\langle F(x',t')\rangle_t , \qquad (42)$$

$$\frac{1}{k}\dot S_t = -\beta\int \lambda(x',t)K_{HF}^t(x',t)\,d^3x' - \int d^3x''\,\lambda(x'',t)\int d^3x'\int_0^t dt'\,\lambda(x'',t')K_{FF}^t(x'',t;x',t') , \qquad (43)$$

$$\sigma_F(x,t) = -\int \lambda(x',t)K_{FF}^t(x',t;x,t)\,d^3x' . \qquad (44)$$

This last expression can be inverted by introducing an inverse integral operator:

$$\lambda(x,t) = -\int \left[K^t_{FF}(x',t;x,t)\right]^{-1}\sigma_F(x',t)\,d^3x' , \qquad (45)$$

which is a nonlinear integral equation for $\lambda(x,t)$. Thus, the right-hand side of (45) is really only a shorthand notation for the iterated solution. Upon substitution of (45) into (43) we find that

$$\frac{1}{k}\dot S_t = \int \gamma_F(x,t)\,\sigma_F(x,t)\,d^3x , \qquad (46)$$

where

$$\gamma_F(x,t) \equiv \beta\int d^3x'\left[K^t_{FF}(x,t;x',t)\right]^{-1}K^t_{HF}(x',t) + \int d^3x'\int d^3x''\int_0^t dt'\,\lambda(x'',t)\left[K^t_{FF}(x,t;x',t)\right]^{-1}K_{FF}^t(x',t;x'',t') . \qquad (47)$$

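We can verify this expression for $\gamma_F$ from the more general definition

$$\gamma_F(x,t) \equiv \left(\frac{\delta S_t}{\delta\langle F(x,t)\rangle_t}\right)_{\rm thermal\ driving} , \qquad (48)$$

if we note two properties of functional differentiation. First, the ordinary chain rule for partial differentiation of $F[x(s),y(s)]$ with respect to $s$,

$$\frac{\partial F}{\partial s} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial s} ,$$

generalizes to

$$\frac{\delta\langle G(x,t)\rangle}{\delta\langle F(x,t)\rangle} = \int \frac{\delta\langle G(x,t)\rangle}{\delta\lambda_F(x',t)}\frac{\delta\lambda_F(x',t)}{\delta\langle F(x,t)\rangle}\,d^3x' = \int K^t_{GF}(x',t;x,t)\left[K^t_{FF}(x,t;x',t)\right]^{-1} d^3x' , \qquad (49)$$

for example. Second, in Eq.(42) for $S_t$ the upper limit $t$ on the time integral, and the subscript on $\langle H\rangle_t$, prevent the functional derivative from yielding merely $\lambda(x,t)$, which is determined by $\sigma_F(x,t)$ at any rate. Rather, we obtain (47) for $\gamma_F(x,t)$.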

Specification of constant driving means that $\sigma_F$ is constant in time, and from (29) or (44) this in turn implies that $\lambda(x,t)$ must actually be independent of time in the steady state. This last assertion follows because the covariance function in these equations is time independent, owing to the re-emergence of unitary time evolution in the absence of internal time variation. That is, the integrals in (21), generalized to include spatial variables, can now be rewritten in the form

$$\int d^3x'\,\lambda(x')\int_0^t F(x',t')\,dt' . \qquad (50)$$

But now the form of the time integral no longer makes sense in the context of time-independent driving.

If a constant rate of driving is specified as a constraint on the initial probability distribution we take this to mean that the initial data were constant in the distant past, and at least up to the time of observation. In requiring this one faces the possibility of a divergent integral, so that it is necessary to regularize the integral, along the lines of methods often employed in quantum field theory. In the present case we rewrite the time integral in (50) as a time average over the past:

$$\lim_{\tau\to\infty}\frac{1}{\tau}\int_{-\tau}^{0} F(x',t')\,dt' . \qquad (51)$$

This, however, is just the diagonal part of the operator $F(x')$ as given by Eq.(11), and hence constant driving corresponds with our definition of the steady state. In this scenario we can then replace all the time integrations over operators by the diagonal parts of those operators and omit all time dependence. We see that, in the sense of this procedure, the steady state is also a singular limit of the general nonequilibrium state, in that the latter does not reduce in a completely straightforward mathematical way to the former.
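
Why a time average over the past projects out a diagonal part can be seen from a single matrix element. The sketch below assumes, for illustration only, that in the energy representation an off-diagonal element of a Heisenberg operator oscillates as $F_{mn}e^{i\omega t}$ with $\omega=(E_m-E_n)/\hbar$; the function name and the sample values are hypothetical.

```python
import cmath

def averaged_element(f_mn, omega, tau):
    """(1/tau) * integral over [-tau, 0] of f_mn * exp(i*omega*t) dt, in closed form."""
    if omega == 0.0:
        return complex(f_mn)           # diagonal (or degenerate) elements are untouched
    return f_mn * (1.0 - cmath.exp(-1j * omega * tau)) / (1j * omega * tau)

# Off-diagonal element: its average is bounded by 2*|f_mn|/(omega*tau) and so vanishes
# as tau grows; a diagonal element keeps its value for every tau.
off_diagonal = [abs(averaged_element(1.0, 2.0, tau)) for tau in (1e1, 1e3, 1e6)]
diagonal = averaged_element(1.0, 0.0, 1e6)
```

Only the elements with $\omega=0$ survive the limit $\tau\to\infty$, which is the sense in which the regularized integral retains the part of the operator that commutes with the Hamiltonian.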

In the steady state we expect time derivatives of all expectation values to vanish; hence from (29) we have the further implication that the constant rate of driving is exactly balanced by the rate of internal relaxation. This is how the system responds to steady currents.

Although there exist stationary currents within the system, the steady driving takes place in the terminal parts, or boundaries of the system, and such currents imply irreversible dissipation. There must then be an overall rate of dissipation or entropy production generated by the external sources. This rate is provided by Eq.(46), now rewritten in the form

$$\frac{1}{k}\dot S_t = \int \gamma_F(x)\,\sigma_F(x)\,d^3x . \qquad (52)$$

The general definition of $\gamma_F(x)$ still applies, but the explicit form is now

$$\gamma_F(x) \equiv \beta\int d^3x'\left[K^{ss}_{F^dF^d}(x;x')\right]^{-1}K^{ss}_{HF^d}(x') + \int d^3x'\int d^3x''\,\lambda(x'')\left[K^{ss}_{F^dF^d}(x;x')\right]^{-1}K^{ss}_{F^dF^d}(x';x'') . \qquad (53)$$

4. The Linear Approximation

Much, though not all, of the work on macroscopic nonequilibrium phenomena has of necessity centered on small departures from equilibrium, or the linear approximation, so it is of some value to outline that reduction of the present theory and discuss briefly some applications. We envision situations in which the system has been in thermal equilibrium in the remote past and later found to produce data of the form considered above. By considering both classes of data we obtain a measure of the departure from equilibrium. In describing the general method of linearization the character of the perturbing term and the scenario under consideration are immaterial; hence, we can take the distribution (21) with integration limits replaced by the space-time region R as our generic model and, for brevity, temporarily omit space dependences. ⁷ Thus, we consider the model

$$\rho = \frac{1}{Z}\exp\left\{-\beta H - \int_R \lambda(t)F(t)\,dt\right\} , \qquad (54)$$

$$Z[\beta,\lambda(t)] = {\rm Tr}\exp\left\{-\beta H - \int_R \lambda(t)F(t)\,dt\right\} , \qquad (55)$$

where $\beta$ refers to the temperature of the previous equilibrium state; no other value of $\beta$ makes sense until the system returns to equilibrium.

By linear approximation we mean "linear in the departure from equilibrium." In the present case that means that the entire integral in (54) and (55) is in some sense small. An expansion of the exponential operator follows from repeated application of the identity

$$e^{A+B} = e^A\left[1 + \int_0^1 e^{-xA}\,B\,e^{x(A+B)}\,dx\right] , \qquad (56)$$

where $B$ is the small perturbation. The first-order, or linear approximation to the expectation value of another operator $C$ is (Heims and Jaynes, 1962; Jaynes, 1979; Grandy, 1988)

$$\langle C\rangle \simeq \langle C\rangle_0 - \int_0^1 \langle e^{-xA}Be^{xA}\,C\rangle_0\,dx + \langle B\rangle_0\langle C\rangle_0 , \qquad (57)$$

where $\langle B\rangle_0={\rm Tr}\left(e^A B\right)$. In (57) we again encounter the Kubo transform of the operator $B$ with respect to $A$, the nonlinear form of which was introduced in (24).
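
The operator identity (56) can be checked numerically for small non-commuting matrices. The following is a minimal sketch using only the standard library: a truncated Taylor series for the matrix exponential and Simpson's rule for the $x$ integral. The matrices `A` and `B`, the helper names, and the truncation parameters are arbitrary illustrative choices.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(a, s):
    return [[s * x for x in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(a, terms=40):
    """Truncated Taylor series for exp(a); adequate for the small-norm matrices used here."""
    result = identity(len(a))
    term = identity(len(a))
    for k in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / k)
        result = mat_add(result, term)
    return result

def integrand(a, b, x):
    """e^{-xA} B e^{x(A+B)} at a given x."""
    left = mat_exp(mat_scale(a, -x))
    right = mat_exp(mat_scale(mat_add(a, b), x))
    return mat_mul(mat_mul(left, b), right)

def rhs_of_identity(a, b, panels=200):
    """e^A [1 + int_0^1 e^{-xA} B e^{x(A+B)} dx] with composite Simpson quadrature."""
    h = 1.0 / panels
    total = [[0.0] * len(a) for _ in a]
    for i in range(panels + 1):
        w = 1.0 if i in (0, panels) else (4.0 if i % 2 else 2.0)
        total = mat_add(total, mat_scale(integrand(a, b, i * h), w))
    integral = mat_scale(total, h / 3.0)
    return mat_mul(mat_exp(a), mat_add(identity(len(a)), integral))

A = [[-0.5, 0.2], [0.2, -1.0]]
B = [[0.05, -0.03], [0.04, 0.02]]     # small perturbation that does not commute with A
lhs = mat_exp(mat_add(A, B))
rhs = rhs_of_identity(A, B)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

Replacing $e^{x(A+B)}$ inside the integral by $e^{xA}$ gives the first-order term, which is where the Kubo-transformed operator $\int_0^1 e^{-xA}Be^{xA}\,dx$ of (57) comes from.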

Application of this approximation scheme to (54) and (55) reveals that the leading-order departure of the expectation value of C at time t from its equilibrium value is

$$\langle C(t)\rangle - \langle C\rangle_0 = -\int_R K_{CF}(t,t')\,\lambda(t')\,dt' , \qquad (58)$$

where $K_{CF} \equiv K^0_{CF}$ is the linearized version of the covariance function defined in (27):

$$K_{CF}(t,t') \equiv \langle \overline{F(t')}\,C(t)\rangle_0 - \langle F\rangle_0\langle C\rangle_0 = -\frac{\delta\langle C(t)\rangle}{\delta\lambda(t)} , \qquad (59)$$

and $\langle\cdots\rangle_0$ is an expectation value in terms of the equilibrium distribution $\rho_0$. Time independence of the Hamiltonian confers the same property upon the single-operator expectations, and also guarantees time-translation invariance: $K_{CF}(t,t')=K_{CF}(t-t')$. One verifies the reciprocity relation

$$K_{CF}(t-t') = K_{FC}(t'-t) \qquad (60)$$

from a change of variables and cyclic invariance of the trace. Note that it is always the second variable that carries the Kubo transform. If $C$ and $F$ are Hermitian, $K_{CF}$ is real and $K_{FF} \ge 0$. In this case $K_{CF}$ has all the properties of a scalar product on a linear vector space, and thus satisfies the Schwarz inequality: $K_{CC}K_{FF}-K_{CF}^2 \ge 0$, with equality if and only if $C=cF$, with $c$ a real constant.

The covariance function (59) clearly depends only on equilibrium properties of the system. Quite generally, then, small departures from equilibrium caused by anything are described principally by equilibrium fluctuations. While this provides some useful physical insight, the other side of the coin is that covariance functions are exceedingly difficult to calculate for interacting particles, other than in some kind of perturbation theory. The linear approximation represents considerable progress, but formidable mathematical barriers remain. In practice, however, it is usually the relations among these and other quantities that interest us; after all, we seldom evaluate from first principles the derivatives in the Maxwell relations, yet they provide us with important insights. Linear hydrodynamics provides one area in which various approximation schemes for correlation functions have proved fruitful.

In the absence of external driving the Lagrange multiplier function $\lambda(t)$ is determined formally by (7), but one suspects that if we set $C=F$ and restrict $t$ to the region $R$, then (58) becomes a Fredholm integral equation determining $\lambda(t)$ in the only interval in which it is defined. This indeed turns out to be the case, though the demonstration that the two procedures are equivalent requires a little effort (Grandy, 1988). This is, in fact, a very rich result, and to discuss it in slightly more detail it will be convenient to specify $R$ more definitely, as $[-\tau,0]$, say. Thus, the expression

$$\langle F(t)\rangle - \langle F\rangle_0 = -\int_{-\tau}^{0} K_{FF}(t-t')\,\lambda(t')\,dt' \qquad (61)$$

is now seen to have several interpretations as $t$ ranges over $(-\infty,\infty)$. When $t > 0$ it gives the predicted future of $F(t)$; with $-\tau \le t \le 0$ it provides a linear integral equation determining $\lambda(t)$; and when $t < -\tau$ it yields the retrodicted past of $F(t)$. This last observation underscores the facts that $K_{FF}(t)$ is not necessarily a causal function unless required to be so, and that these expressions are based on probable inference; in physical applications the dynamics enters into computation of the covariance function, but does not dictate its interpretation in various time domains. Although physical influences must propagate forward in time, logical inferences about the present can affect our knowledge of the past as well as the future. Retrodiction, of course, is at the heart of fields such as archeology, cosmology, geology, and paleontology.

When the perturbed system is spatially nonuniform we find that (58) and (59) are replaced by

$$\langle C(x,t)\rangle - \langle C(x)\rangle_0 = -\int_R K_{CF}(x,t;x',t')\,\lambda(x',t')\,d^3x'\,dt' , \qquad (62)$$

$$K_{CF}(x,t;x',t') = \langle \overline{F(x',t')}\,C(x,t)\rangle_0 - \langle F(x')\rangle_0\langle C(x)\rangle_0 , \qquad (63)$$

so that in its causal domain $K_{CF}(x,t;x',t')$ takes the form of a Green function. Note that the single-operator expectation values are also independent of $x$ in an initially homogeneous system, and that the generalization to include a number of operators $F_k(x,t)$ is straightforward.

If the equilibrium system is also space-translation invariant it is useful to employ the notation $r \equiv x-x'$. Generally, the operators encountered in covariance functions possess definite transformation properties under space inversion (parity) and time reversal. Under the former $A(r,t)$ becomes $P_AA(-r,t)$, $P_A=\pm 1$, and under the latter $T_AA(r,-t)$, $T_A=\pm 1$. For operators describing a simple fluid, say, $PT=+1$ and one verifies that the full reciprocity relation holds:

$$K_{CF}(r,t) = K_{FC}(r,t) . \qquad (64)$$

The efficacy of these equations of the linear approximation will become apparent as we present some sample applications.

Linear Transport Processes

The generic model for a macroscopic fluid is most readily described as a continuum in terms of various densities, and representations in terms of quantum-mechanical operators are defined in terms of field operators in a Fock representation (e.g., Fetter and Walecka, 1971). The three basic density operators in the fluid are the number density n, momentum density mj, and energy density h, where j is the particle current-density operator. Unless so specified, these generally have no explicit time dependence, so that their equations of motion in the Heisenberg picture are

$$\dot n(x,t) = \frac{i}{\hbar}[H,n(x,t)] , \qquad (65a)$$

$$m\,\dot j(x,t) = \frac{i}{\hbar}[H,mj(x,t)] , \qquad (65b)$$

$$\dot h(x,t) = \frac{i}{\hbar}[H,h(x,t)] . \qquad (65c)$$

But the left-hand sides of these equations are also involved in statements of the local microscopic conservation laws in the continuum, which usually relate time derivatives of densities to divergences of the corresponding currents. The differential conservation laws are thus obtained by evaluating the commutators on the right-hand sides in the forms

$$\dot n(x,t) = -\nabla\cdot j(x,t) , \qquad (66a)$$

$$m\,\dot j(x,t) = -\nabla\cdot T(x,t) , \qquad (66b)$$

$$\dot h(x,t) = -\nabla\cdot q(x,t) . \qquad (66c)$$

The superposed dot in these equations indicates a total time derivative. In the absence of external forces and sources (65) are equivalent to unitary transformations, and the Hamiltonian and total-number operator, respectively, are given by

$$H = \int h(x,t)\,d^3x , \qquad N = \int n(x,t)\,d^3x , \qquad (67)$$

both independent of time.

The current density j is just the usual quantum-mechanical probability current density, so that (66a) is easily verified. Identification of the energy current density q and stress tensor T, however, is far from straightforward; in fact, they may not be uniquely defined for arbitrary particle-particle interactions. But if the Hamiltonian is rotationally invariant we can restrict the discussion to spherically-symmetric two-body potentials. Two further symmetry properties arise from time independence and spatial uniformity in the equilibrium system: time-translation and space-translation invariance, respectively. These latter two invariances are expressed in terms of volume-integrated, or total energy, number, and momentum operators, so that the commutators [H,P], [H,N], [P,N] all vanish. Specification of these symmetry properties defines a simple fluid, and the operators q and T can be identified uniquely by evaluation of the commutators in Eqs.(65b,c). The algebra is tedious and the results are given, for example, by Puff and Gillis (1968), and Grandy (1988). Thus, the five local microscopic conservation laws (66) completely characterize the simple fluid and lead to five long-lived hydrodynamic modes. Local disturbances of these quantities cannot be dissipated locally, but must spread out over the entire system.

As a first application of the linear theory we return to the steady-state scenario of Eqs.(12) and (13) and also incorporate a term $-\beta H$ in the exponentials to characterize an earlier equilibrium reference state. Denoting the deviation from equilibrium as $\Delta F(x) = F(x)-\langle F(x)\rangle_0$, we find that in linear approximation another operator $C$ will have expectation value

$$\langle \Delta C(x)\rangle_{ss} = -\int_R \lambda(x')K_{CF}(x-x')\,d^3x' + \lim_{\epsilon\to 0^+}\int_R d^3x'\int_{-\infty}^{0} e^{\epsilon t}\,\lambda(x')K_{C\dot F}(x-x',t)\,dt , \qquad (68)$$

where we have employed the expression (11) for the diagonal part of an operator, and subscripts $ss$ refer to the steady-state distribution. Specify $F(x)$ to be one of the fluid densities $d(x)$, so that the continuity equations (66) lead to the identity

$$\frac{d}{dt}K_{dB}(x,t) = -\nabla\cdot K_{jB}(x,t) , \qquad (69)$$

and thus $K_{C\dot d}$ in (68) can be replaced by $-\nabla'\cdot K_{CJ}$. Let $R(x)$ be the system volume $V$, and presume $K_{CJ}$ to vanish at large distances. An integration by parts then reduces (68) to

$$\langle \Delta C(x)\rangle_{ss} = -\int_V \lambda(x')K_{Cd}(x-x')\,d^3x' + \lim_{\epsilon\to 0^+}\int_V d^3x'\int_{-\infty}^{0} e^{\epsilon t}\,\nabla'\lambda(x')\cdot K_{CJ}(x-x',t)\,dt , \qquad (70)$$

in which we have dropped the surface term.

Classical hydrodynamics corresponds to a long-wavelength approximation by presuming that $\nabla'\lambda$ varies so slowly that it is effectively constant over the range for which $K_{CJ}$ is appreciable. ⁸ With this in mind we can extract the gradient from the integral and write

$$\langle \Delta C(x)\rangle_{ss} \simeq -\int_V \lambda(x')K_{Cd}(x-x')\,d^3x' + \nabla\lambda\cdot\lim_{\epsilon\to 0^+}\int_v d^3x'\int_{-\infty}^{0} e^{\epsilon t}K_{CJ}(x-x',t)\,dt , \qquad (71)$$

which is the fundamental equation describing linear transport processes in the steady state. The integration region $v$ is the correlation volume, outside of which the correlations vanish; it is introduced here simply as a reminder that the spatial correlations are presumed to be of short range.

As an example, let $d$ be the number density $n$ with gradient characterized by the deviation $\Delta n(x)=n(x)-\langle n\rangle_0$. The specified density gradient and the predicted current density, respectively, are then

$$\langle \Delta n(x)\rangle_{ss} = -\int_V \lambda(x')K_{nn}(x-x')\,d^3x' + \nabla\lambda\cdot\int_v d^3x'\int_{-\infty}^{0} e^{\epsilon t}K_{nj}(x-x',t)\,dt = -\int_V \lambda(x')K_{nn}(x-x')\,d^3x' , \qquad (72)$$

$$\langle j(x)\rangle_{ss} = -\int_V \lambda(x')K_{jn}(x-x')\,d^3x' + \nabla\lambda\cdot\int_v d^3x'\int_{-\infty}^{0} e^{\epsilon t}K_{jj}(x-x',t)\,dt = \nabla\lambda\cdot\int_v d^3x'\int_{-\infty}^{0} e^{\epsilon t}K_{jj}(x-x',t)\,dt , \qquad (73)$$

where the limit $\epsilon\to 0^+$ is understood. We have noted that the second term of the first line in (72) and the first term of the first line in (73) vanish by symmetry.

Now take the gradient in (72), make the long-wavelength approximation, and eliminate $\nabla\lambda$ between this result and (73), which leads to the relation

$$\langle j(x)\rangle_{ss} = -\frac{\int_0^\infty e^{-\epsilon t}\,dt\int_v K_{jj}(x-x',t)\,d^3x'}{\int_v K_{nn}(x-x')\,d^3x'}\cdot\nabla\langle n(x)\rangle_{ss} \equiv -D(x)\cdot\nabla\langle n(x)\rangle_{ss} , \qquad (74)$$

with the proviso that $\epsilon\to 0^+$. This is Fick's law of diffusion, in which we have identified the diffusion tensor $D$ that can now be calculated in principle from microscopic dynamics; owing to spatial uniformity in the equilibrium system $D(x)$ is actually independent of $x$. For more general nonequilibrium states the same type of calculation produces a quantity $D(x,t)$ having the same form as that in (74), and the long-wavelength approximation also involves one of short memory. (By `short memory' we mean that recent information is the most relevant, not that the system somehow forgets.)

It is remarkable that linear constitutive equations such as Fick's law arise from almost nothing more than having some kind of data available over a space-time region. These relations have long been characterized as phenomenological, since they are not derived from dynamical laws. We now see why this is so, for the derivation here shows that they are actually laws of inference. Indeed, what we usually mean by `phenomenological' is `inferred from experience', a notion here put on a sound footing through probability theory. When they are coupled with the corresponding conservation laws, however, one does obtain macroscopic dynamical laws, such as the diffusion equation.

Because it involves a slightly different procedure, and will provide a further example below, let us consider thermal conductivity (which need not be restricted to fluids). A steady gradient in energy density is specified in the form of a deviation $\Delta h(x)=h(x)-\langle h\rangle_0$. By a calculation similar to the above we find for the expected steady-state heat current

$$\langle q(x)\rangle_{ss} = \int_v d^3x'\int_0^\infty e^{-\epsilon t}\,\nabla\lambda(x')\cdot K_{qq}(x-x',t)\,dt , \qquad (75)$$

where the limit $\epsilon\to 0^+$ is understood, and we have not yet invoked the long-wavelength limit. In this case we do not eliminate $\nabla\lambda$, for it contains the gradient of interest. Both dimensionally, and as dictated by the physical scenario, $\lambda$ must be $\beta(x)=[kT(x)]^{-1}$, a space-dependent temperature function. Although such a quantity may be difficult to measure in general, it is well-defined in the steady state. With this substitution the long-wavelength approximation of constant temperature gradient in (75) yields

$$\langle q(x)\rangle_{ss} \simeq -\nabla T\cdot\int_v d^3x'\int_0^\infty e^{-\epsilon t}\,\frac{K_{qq}(x-x',t)}{kT^2(x')}\,dt \equiv -\kappa\cdot\nabla T(x) , \qquad (76)$$

in which we identify the thermal conductivity tensor $\kappa$, which again is independent of $x$. This is Fourier's law of thermal conductivity; it applies to solids as well as fluids, but calculation of the covariance function remains a challenge. It is left to the reader to verify that $\kappa$, as well as $D$ in (74), are positive.

A common model employing (76) is that of a uniform conducting rod of length $L$ and thermal conductivity $\kappa$. We can calculate the constant rate of transfer of entropy from the source to the sink by means of (52), in which the transfer potential $\gamma(x)$ is simply the spatial temperature distribution $\beta(x)$, and $\sigma(x)$ is the (constant) rate of driving on the end boundaries of the rod. In this case the driving rate is given by the heat current $\langle q\rangle_{ss}$ itself, inserting thermal energy at one end and taking it out at the other. Hence,

$$\frac{1}{k}\dot S_t = \int_0^L \frac{1}{kT(x)}\,(-\kappa\nabla T)\left[\delta_{x,0}-\delta_{x,L}\right] dx = \frac{\kappa}{L}\,\frac{\left(T_H-T_C\right)^2}{T_H T_C} , \qquad (77)$$

which is identical to the more intuitively obtained result (e.g., Palffy-Muhoray, 2001). Although (52) itself is completely nonlinear, one notes that we have employed the linear form of Fourier's law (76) for the current. This calculation illustrates the importance of boundary conditions in describing stationary processes; Tykodi (1967) has also emphasized the role of terminal parts in describing the steady state. Indeed, the entropy generated in this process is entirely that of the external world. It is also of some interest to note that $\dot S_t$ is by no means a minimum in this state (Palffy-Muhoray, 2001).
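
The "more intuitively obtained result" can be reproduced with elementary bookkeeping: a steady heat current leaves the hot reservoir and enters the cold one, and the two reservoir entropy changes are added. The sketch below assumes a uniform temperature gradient along the rod and a current per unit cross-sectional area; the function name and the numerical values are hypothetical.

```python
import math

def rod_entropy_production(kappa, length, t_hot, t_cold):
    """Return the reservoir-bookkeeping rate and the closed form appearing in (77)."""
    heat_current = kappa * (t_hot - t_cold) / length          # Fourier's law, uniform gradient
    via_reservoirs = heat_current * (1.0 / t_cold - 1.0 / t_hot)
    closed_form = (kappa / length) * (t_hot - t_cold) ** 2 / (t_hot * t_cold)
    return via_reservoirs, closed_form

via_reservoirs, closed_form = rod_entropy_production(
    kappa=400.0, length=1.0, t_hot=400.0, t_cold=300.0
)
```

Both expressions are quadratic in the temperature difference and positive whenever the end temperatures differ, consistent with the statement that the entropy is generated in the external world at the terminals.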

Linear Response Theory

An important feature of the thermal driving mechanism is that the actual details of the thermal driving source are irrelevant, and only the rates and strengths at which system variables are driven enter the equations. It should make no difference in many situations whether the driving is thermal or mechanical; we examine the latter context here.

The theory of dynamical response was described very briefly in Eqs.(I-5)-(I-8), and the linear version follows as described there. The underlying scenario is that a well-defined external field is imposed on a system that has been in thermal equilibrium in the remote past, as described by the Hamiltonian $H_0$. It is then presumed that the response to this disturbance can be derived by adding a time-dependent term to the Hamiltonian, so that effectively $H=H_0-Fv(t)$, $t > 0$, where $v(t)$ describes the external field and $F$ is a system operator to which it couples. Some of the difficulties with this approach were sketched in I, including the observation that $\rho(t)$ can only evolve unitarily. We now see that these problems can be resolved by noting that dynamical response is just a special case of thermal driving.

For eventual comparison with the results of linear response theory we shall need an identity for the time derivative of the covariance function. Direct calculation in the definition (59) yields

$$\frac{d}{dt}K_{CF}(t) = \frac{i}{\beta\hbar}\langle [C,F(t)]\rangle_0 = -\beta^{-1}\phi_{CF}(t) , \qquad (78)$$

where $\phi_{CF}$ is the linear response function. Clearly, the covariance function contains a good deal more information than does the dynamic response function.

The derivation of the generic maximum-entropy distribution in (I-14) disguises a subtle point regarding that procedure. We note from (I-16) that the Lagrange multiplier $\lambda$ can also be determined from the maximum entropy:

$$\lambda = \frac{1}{k}\frac{\partial S}{\partial\langle f\rangle} . \qquad (79)$$

Together with (I-15) this reveals a reciprocity implying that the probability distribution can be obtained by specifying either $\langle f\rangle$ or $\lambda$. An example of this choice is illustrated in the canonical distribution (I-10), which could be obtained by specifying either the energy or the temperature; this option was also exercised in the model of spatial inhomogeneity of Eq.(10). Thus, we return to Eqs.(21), replacing $H$ with $H_0$, and let $\lambda(t')$ be the independent variable. In linear approximation (29) expresses $\lambda(t)$ directly in terms of the source strength, or driving rate, and dimensional considerations suggest that we write this variable in the form

$$\lambda(t') = \beta\frac{d}{dt'}\left[\theta(t-t')v(t')\right] = \beta\left[-\delta(t-t')v(t') + \theta(t-t')\frac{d}{dt'}v(t')\right] , \qquad (80)$$

with the condition that $v(0)=0$. The step-function $\theta(t-t')$ is included in (80) because $\lambda$ is defined only on the interval $[0,t]$.

Substitution of (80) into (21) yields the distribution relevant to a well-defined external field,

r_t

= 1
Z_t
exp é
ë -bH₀+b ó
õ t

0
é
ë d(t-t¢)-q(t-t¢) d
dt¢
ù
û v(t¢) F(t¢) dt¢ ù
û

= 1
Z_t
exp é
ë -bH₀ +b ó
õ t

0
v(t¢)
×

F

(t¢) dt¢ ù
û ,
(81)
and Z_t, as usual, is the trace of the numerator. Although the exponential contains what appears to be an effective Hamiltonian, we do not assert that
ò₀^t v(t¢)
×

F

(t¢) dt¢

is an addition to the equilibrium Hamiltonian H₀; there is no rationalé of any kind for such an assertion. The Lagrange multiplier function l(t) is a macroscopic quantity, as is its expression as an independent variable in (80). The linear approximation (58), along with the identity (78), yields the departure from equilibrium of the expected value of another operator C at any future time t under driving by the external field:

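\[
\begin{aligned}
\langle C(t)\rangle - \langle C\rangle_0
 &= \beta\int_0^t v(t')\,K_{C\dot{F}}(t-t')\,dt' \\
 &= \beta\int_0^t v(t')\,\frac{d}{dt'}K_{CF}(t-t')\,dt' \\
 &= \int_0^t v(t')\,\phi_{CF}(t-t')\,dt' ,
\end{aligned} \tag{82}
\]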
which is precisely the result obtained in linear response theory. But now we also have the time-evolved probability distribution (81) from which we can develop the associated thermodynamics. Equation (82) confirms that, at least linearly, both ρ_t and a unitarily evolved ρ(t) will predict the same expectation values. But, as suggested following (78), ρ(t) contains no more macroscopic information than it had to begin with.
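The equivalent forms in (82) can be illustrated numerically. The sketch below is not from the original and rests on assumed model inputs: an exponentially decaying covariance K_CF(s) = K₀ exp(-s/τ_c), a driving function v(t′) = sin t′ satisfying v(0)=0, and the identification φ_CF(t-t′) = β (d/dt′) K_CF(t-t′) read off from the last equality of (82). It compares the response-function integral with the form obtained by integrating by parts, which contains an instantaneous term and a term in the rate of change of v.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical model parameters, for illustration only.
beta, K0, tau_c = 1.0, 2.0, 0.5

def K(s):
    """Assumed model covariance K_CF(s) for s >= 0."""
    return K0 * np.exp(-s / tau_c)

def phi(s):
    """Response function implied by phi(t - t') = beta * d/dt' K(t - t')."""
    return beta * (K0 / tau_c) * np.exp(-s / tau_c)

def v(tp):
    """Assumed driving function, with v(0) = 0."""
    return np.sin(tp)

def dv(tp):
    """Derivative of the driving function."""
    return np.cos(tp)

t = 3.0
tp = np.linspace(0.0, t, 20001)

# Last line of (82): integral of v(t') phi(t - t').
response_form = trapezoid(v(tp) * phi(t - tp), tp)

# Same quantity after integration by parts: instantaneous term minus rate term.
by_parts_form = beta * (v(t) * K(0.0) - trapezoid(dv(tp) * K(t - tp), tp))

print(f"response form : {response_form:.8f}")
print(f"by-parts form : {by_parts_form:.8f}")
```

The two numbers agree to within the discretization error of the quadrature. Only the combination of v and its rate of change is fixed by the response function; the covariance function appears separately in each term of the second form.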

As an example of an external source producing a time-varying field, suppose a component of electric polarization M_i(t) is specified, leading to the density matrix

\[
\rho_t = \frac{1}{Z_t}\exp\left[-\beta H_0 + \int_0^t \lambda_i(t')\,M_i(t')\,dt'\right] . \tag{83}
\]

We presume no spontaneous polarization, so that in linear approximation the expectation of another component at time t is

\[
\langle M_j(t)\rangle = \int_0^t \lambda(t')\,\langle\overline{M_i(t')}\,M_j(t)\rangle_0\,dt' . \tag{84}
\]

Now, with the additional knowledge that (84) is the result of turning on an external field one might be led to think that the Lagrange multiplier is simply a field component, say E_i(t). But (80) shows that, even when the effect is to add a time-dependent term to the Hamiltonian, the actual source term is somewhat more complicated; only the δ-function term in (80) corresponds to that possibility, and the actual source term also describes the rate of change of the field. This again illustrates the earlier observation that the covariance function contains much more information than the dynamic response function.

With (80) we can rewrite (84) explicitly as

\[
\langle M_j(t)\rangle = \beta\,\langle\overline{M_i(t)}\,M_j(t)\rangle_0\,E_i(t)
 - \beta\int_0^t \langle\overline{M_i(t')}\,M_j(t)\rangle_0\,\frac{dE_i(t')}{dt'}\,dt' , \tag{85}
\]
which is just the result obtained from the theory of dynamic response. But we've uncovered much more, because now one can do thermodynamics. In the present scenario we have specified thermal driving of the polarization and incorporated that into a density matrix; additionally, the Lagrange multiplier has been chosen to be the independent variable corresponding to an external field, which allows us to identify the source strength. Thus, from (25) we have a definite expression for the time-dependent entropy of the ensuing nonequilibrium state:

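\[
\begin{aligned}
\frac{1}{k}S_t &= \ln Z_t + \beta\langle H\rangle_t - \int_0^t \lambda(t')\,\langle M(t')\rangle_t\,dt' \\
 &\simeq \frac{1}{k}S_0 + \beta\int_0^t \lambda(t')\,K_{H_0M}(t')\,dt' + O(\lambda^2) ,
\end{aligned} \tag{86}
\]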
where the second line is the linear approximation and we have identified the entropy of the equilibrium system as S₀ = k ln Z₀ + kβ⟨H₀⟩₀. In the case of dynamic response, if one makes the linear approximation to ρ(t) in (I-6) and computes the entropy similarly, it is found that S(t)-S₀ vanishes identically, as expected. With (78), (80), and (86), however, the entropy difference can also be written in terms of the linear response function:

\[
\frac{1}{k}\left(S_t - S_0\right) \simeq \beta\int_0^t v(t')\,\phi_{H_0M}(t')\,dt' , \tag{87}
\]

once again exhibiting the canonical form ΔQ(t)/T. These remarks strongly suggest that the proper theory of response to a dynamical perturbation is to be found as a special case of thermal driving.
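The linear form in the second line of (86) can be traced term by term. What follows is only a sketch under stated assumptions: the density matrix is taken in the form (83) with Hamiltonian H₀, each quantity is expanded to first order in λ, and K_{H₀M}(t′) is assumed to denote the equilibrium covariance of H₀ with M(t′), as in the linear approximation (58).

```latex
\begin{align*}
\ln Z_t &\simeq \ln Z_0 + \int_0^t \lambda(t')\,\langle M(t')\rangle_0\,dt' , \\
\beta\langle H_0\rangle_t &\simeq \beta\langle H_0\rangle_0
   + \beta\int_0^t \lambda(t')\,K_{H_0M}(t')\,dt' , \\
\int_0^t \lambda(t')\,\langle M(t')\rangle_t\,dt'
   &\simeq \int_0^t \lambda(t')\,\langle M(t')\rangle_0\,dt' ,
\end{align*}
```

each with corrections of second order in λ. When these are inserted into the first line of (86), the two terms containing ⟨M(t′)⟩₀ cancel, leaving S₀/k = ln Z₀ + β⟨H₀⟩₀ plus the single covariance integral.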

Relaxation

When external sources are removed we expect the system to relax to a (possibly new) state of thermal equilibrium. If the driving ceases at time t=t₁, say, then from that point on the system is described by (21) with the replacement t → t₁ everywhere, barring any further external influence. These equations define the nonequilibrium state at t=t₁, from which the subsequent behavior can be predicted.

As discussed earlier, S_t₁ as given by (25) cannot evolve to the entropy of some equilibrium state, for the same reason that ρ_t₁ cannot evolve to a canonical equilibrium distribution; both evolve from t=t₁ under unitary transformation. It should be sufficient, however, to show that the macrovariables describing the thermodynamic system may relax to a set of equilibrium values. Then, with those predicted values, we can construct a new canonical density matrix via entropy maximization that will describe the new equilibrium state. The value S_t₁ remains the entropy of the nonequilibrium state at the time the driving was removed.

If we add energy and matter, say, to the system, then the total energy and particle number of the ensuing equilibrium state are fixed at the cutoff t=t₁. The energy and number densities, however, continue to evolve to uniform values over the relaxation period, and these processes define that period. Often the densities themselves are the driven variables; for example, the pot of water is heated over the area of the bottom of the pot. But in the example of electric polarization the total moment is not fixed at cutoff, and its decay in zero field defines the relaxation process. In view of these various possibilities, we shall consider a generic variable f(t) whose expectation value describes the relaxation.

Calculation of the exact expectation values is essentially intractable, of course, so we again employ the linear approximation. For example, at time t ≥ t₁ the expectation of interest is

\[
\Delta f(t) \equiv \langle f(t)\rangle_{t_1} - \langle f\rangle_0
 \simeq \int_0^{t_1} \lambda(t')\,K_{ff}(t-t')\,dt'
 = \int_0^{t_1} \sigma_f(t')\,\frac{K_{ff}(t-t')}{K_{ff}(0)}\,dt' , \tag{88}
\]
where we've utilized (29). Although everything on the right-hand side of (88) is presumably known, the actual details of the relaxation process depend crucially on the behavior of K_ff(t-t′) for t > t₁. But if f is the driven energy operator, say, then Δf will be independent of time and (88) provides the new total energy of the equilibrium system.

In the discussion following Eq.(60) it was noted that the covariance functions satisfy the Schwarz inequality. From this we can see that the ratio r(t-t′)=|K_ff(t-t′)/K_ff(0)| in (88), and therefore the integrand, reach their maxima at t′=t where r(0)=1. Further, r(t-t′) is less than unity for t′ < t, and again for all t > t₁; the exact magnitude of r depends on the decay properties of K_ff(t-t′). In any event, the major contribution to the integral arises from the region around the cutoff t=t₁. The relaxation time τ can be estimated by studying the asymptotic properties of Δf(t), which in turn requires an examination of K_ff(t-t′). We seek a time t₂ for which K_ff no longer contributes appreciably to the integral:

\[
|K_{ff}(t_2-t')| \ll 1 , \tag{89}
\]

by some criterion. Then, τ ≃ t₂-t₁.
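A simple model makes the estimate of τ concrete. The sketch below is illustrative only and is not part of the original analysis: it assumes a constant source strength σ_f during the driving interval, an exponentially decaying ratio K_ff(s)/K_ff(0) = exp(-|s|/τ_c) with a hypothetical correlation time τ_c, and, as one possible criterion for (89), that Δf(t) has fallen to 1% of its value at the cutoff.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule for samples y on the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical model parameters, for illustration only.
tau_c = 2.0   # correlation time of the model covariance
sigma = 1.0   # constant source strength during driving
t1 = 5.0      # time at which the driving is removed

def ratio(s):
    """Assumed model for K_ff(s) / K_ff(0)."""
    return np.exp(-np.abs(s) / tau_c)

def delta_f(t, n=4001):
    """Right-hand side of (88) evaluated numerically for t >= t1."""
    tp = np.linspace(0.0, t1, n)
    return trapezoid(sigma * ratio(t - tp), tp)

def delta_f_exact(t):
    """Closed form of the same integral for this model, valid for t >= t1."""
    return sigma * tau_c * np.exp(-t / tau_c) * (np.exp(t1 / tau_c) - 1.0)

# Follow the relaxation from the cutoff onward.
times = np.linspace(t1, t1 + 20.0 * tau_c, 4001)
values = np.array([delta_f(t) for t in times])

# One possible criterion: Delta f has dropped to a fraction eps of its cutoff value.
eps = 0.01
below = np.nonzero(values <= eps * values[0])[0]
t2 = times[below[0]]
tau = t2 - t1

print(f"t2 = {t2:.2f}, estimated relaxation time tau = {tau:.2f}")
print(f"model expectation tau_c * ln(1/eps) = {tau_c * np.log(1.0 / eps):.2f}")
```

For this model the decay after the cutoff is purely exponential, so the estimated τ is τ_c ln(1/ε) up to the grid spacing. A covariance function that tends to a nonzero constant would instead leave Δf at a finite value, and a different criterion would be needed.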

For many covariance functions the ratio r(t-t′) in (88) will tend to some constant value as t/t₁ becomes large, while others may tend to zero. For example, if we turn off the burner under a pot of water at t=t₁, the total energy of the equilibrium system will be ⟨E(t₁)⟩_t₁, so that K_EE would be expected to reach its nonzero asymptotic form very quickly. But in the polarization example of (84) we expect the correlations to decay to zero as the system relaxes back to the unpolarized state; this may, or may not, be rapid. One can only uncover the particular behavior from a detailed study of the covariance functions as determined by the relaxation mechanisms specific to the system, which are generally governed by particle interactions.

There is also a connection here with the rate of change of internal entropy, Ṡ_int. When the source is removed the entropy production of transfer immediately vanishes for t > t₁. Equation (40) then implies that the total rate of entropy production is entirely that of relaxation, and the compelling conclusion is that Ṡ_int is actually the relaxation rate itself and may be observable. This interpretation is reinforced by comparing the time derivative of ⟨f(t)⟩_t₁ in (88) with the definition of Ṡ_int in (39).

Finally, the preceding discussion suggests that one can actually construct an explicit expression for the relaxation entropy S_int, and hence for Ṡ_int. Equation (88) provides a continuous set of data ⟨f(t)⟩_t₁ from the cutoff at t=t₁ to the effective equilibrium point at t=t₂. These predicted values are as good as any other data for constructing a `relaxation distribution' via the PME, and hence a maximum entropy of relaxation. Note carefully, though, that this entropy production should disappear at equilibrium while S_t possibly approaches the thermodynamic entropy of that state; the latter is determined by the values of the variables that characterize the equilibrium state, which most often have already been set at t=t₁. In a simple fluid, say, the approach to equilibrium is described by taking f(t) in (88) to be a density that may approach a constant value denoting a homogeneous state. Further details of this construction will be discussed elsewhere.

5. Summary

The aim of this discussion has been to expand the concept of theoretical entropy in equilibrium thermodynamics to encompass macroscopic systems evolving in time. In doing so we find that the maximum information entropy S_t, while providing a complete description of the nonequilibrium state at any instant, does not assume the dominant role it does in an equilibrium context. Rather, the rates and directions of processes are the most important features of nonequilibrium systems, and the rate of entropy production Ṡ_t takes the form of a transfer potential times a rate of transfer, or a generalized intuitive form of Q̇/T. This suggests that in nonequilibrium thermodynamics it is Ṡ_t that governs the ongoing macroscopic processes and can be expressed as a measurable quantity via Eq.(34). In the absence of external sources (or sinks) the rate of entropy production simply describes the relaxation rate; the theoretical maximum entropy itself characterizes only the nonequilibrium state from which the system is relaxing to the singular equilibrium limit. Further thought leads us to conclude that these interpretations can also be applied to ongoing processes.

Many writers have expressed a belief in the existence of an additional variational principle characterizing nonequilibrium states and processes in much the same way that the Gibbs algorithm governs equilibrium states. Although various candidates have been put forth in special contexts, none have achieved that same lofty position. This is perhaps not surprising in light of the foregoing discussion, for it would seem that the Gibbs variational principle has all along been the rule governing all thermodynamic states, possibly because it has its roots in a fundamental rule of probability theory: the principle of maximum information entropy. The difficulty has not been our inability to find a compelling definition of physical entropy for nonequilibrium states; rather, it was a failure to understand the specific role of entropy over the entire spectrum of thermodynamic states. It's only the nature of the constraints that changes, from constants of the motion to steady state to time dependent, while the principle remains the same. This is a satisfying result in that it provides a certain economy of principles.

The formalism presented here applies to macroscopic systems arbitrarily far from equilibrium, although the nonlinear equations provide formidable mathematical barriers to any detailed calculations. While the linear approximation is the most fruitful approach, even here the covariance functions remain somewhat complicated and resistant to exact computation; various attacks have produced some progress, however, in the context of linear hydrodynamics. At present it is the formal relations containing covariance functions that can prove most useful, in which carefully chosen models of these nonequilibrium correlations can play a role similar to that of potential models in equilibrium statistical mechanics. Although the present results may lay some groundwork for a complete theory of nonequilibrium thermodynamics, there is a great deal of room for expansion and further development. Perhaps a third paper in this series will describe additional steps along these lines.

REFERENCES

Berry, M. (2002), ``Singular Limits," Physics Today 55 (5), 10.

Boltzmann, L. (1895), ``On certain questions of the theory of gases," Nature 51, 413.

Clausius, R. (1865), ``Über verschiedene für die Anwendung bequeme Formen der Hauptgleichungen der mechanischen Wärmetheorie," Ann. d. Phys.[2] 125, 390.

de Groot, S.R. and P. Mazur (1962), Nonequilibrium Thermodynamics, North-Holland, Amsterdam.

Evans, R. (1979), ``The nature of the liquid-vapour interface and other topics in the statistical mechanics of non-uniform, classical fluids," Adv. Phys. 28, 143.

Fano, U. (1957), ``Description of States in Quantum Mechanics by Density Matrix and Operator Techniques," Rev. Mod. Phys. 29, 74.

Fetter, A.L. and J.D. Walecka (1971), Quantum Theory of Many-Particle Systems, McGraw-Hill, New York.

Gibbs, J.W. (1902), Elementary Principles in Statistical Mechanics, Yale University Press, New Haven, Conn.

Grandy, W.T., Jr. (1987), Foundations of Statistical Mechanics, Vol.I: Equilibrium Theory, Reidel, Dordrecht.

Grandy, W.T., Jr. (1988), Foundations of Statistical Mechanics, Vol.II: Nonequilibrium Phenomena, Reidel, Dordrecht.

Grandy, W.T., Jr. (2004), ``Time Evolution in Macroscopic Systems. I: Equations of Motion," Found. Phys. 34, 1.

Jaynes, E.T. (1963), ``Information Theory and Statistical Mechanics," in K.W. Ford (ed.), Statistical Physics, Benjamin, New York.

Jaynes, E.T. (1967), ``Foundations of Probability Theory and Statistical Mechanics," in M. Bunge (ed.), Delaware Seminar in the Foundations of Physics, Springer-Verlag, New York.

Jaynes, E.T. (1979), ``Where Do We Stand On Maximum Entropy?," in R.D. Levine and M. Tribus (eds.), The Maximum Entropy Formalism, M.I.T. Press, Cambridge, MA.

Kubo, R., M. Toda, and N. Hashitsume (1985), Statistical Physics II, Springer-Verlag, Berlin.

Mitchell, W.C. (1967), ``Statistical Mechanics of Thermally Driven Systems," Ph.D. thesis, Washington University, St. Louis (unpublished).

Nakajima, S. (1958), ``On Quantum Theory of Transport Phenomena," Prog. Theor. Phys. 20, 948.

Onsager, L. (1931), ``Reciprocal Relations in Irreversible Processes. I," Phys. Rev. 37, 405.

Palffy-Muhoray, P. (2001), ``Comment on `A check of Prigogine's theorem of minimum entropy production in a rod in a nonequilibrium stationary state' by Irena Danielewicz-Ferchmin and A. Ryszard Ferchmin [Am. J. Phys. 68 (10), 962-965 (2000)]," Am. J. Phys. 69, 825.

Puff, R.D. and N.S. Gillis (1968), ``Fluctuations and Transport Properties of Many-Particle Systems," Ann. Phys. (N.Y.) 46, 364.

Tykodi, R.J. (1967), Thermodynamics of Steady States, Macmillan, New York.

Footnotes:

¹Equation (n) of that paper will be denoted here by (I-n).

²Because there are numerous `entropies' defined in different contexts, we shall denote the experimental equilibrium entropy of Clausius as S without further embellishments, such as subscripts.

³Requiring the equilibrium system to have no `memory' of its past precludes `mysterious' effects such as those caused by spin echoes.

⁴This prescription for stationarity was advocated earlier by Fano (1957), and has also been employed by Nakajima (1958) and by Kubo et al. (1985).

⁵The lower limit of the driving interval is chosen as 0 only for convenience.

⁶Equation (40) is reminiscent of, but not equivalent to, similar expressions for entropy changes, such as dS = dS_ext + dS_int, that can be found in works on phenomenological nonequilibrium thermodynamics (e.g., de Groot and Mazur, 1962).

⁷The equilibrium distribution is taken as canonical only for convenience; for example, one could just as well use the grand canonical form, as well as include different types of particle.

⁸We presume that the fluctuations are not correlated over the entire volume.
