Large Deviations
09 Nov 2005 17:39
The limit theorems of probability theory --- the weak and strong laws of large numbers, the central limit theorem, etc. --- basically say that averages taken over large samples converge on expectation values. (The strong law of large numbers asserts almost-sure convergence, the central limit theorem asserts a kind of convergence in distribution, etc.) These results say little or nothing about the rate of convergence, however, which is often important for many applications of probability theory, e.g., statistical mechanics. One way to address this is the theory of large deviations. (I believe the terminology goes back to Varadhan in the 1970s, but that's just an impression, rather than research.)
Let me say things sloppily first, so the idea comes through, and then more
precisely, so people who know the subject won't get too upset.
Suppose X is a random variable with expected value EX, and we
consider
, the sample mean
of n samples of
X.
"obeys a large deviations principle" if there is a
non-negative function r, called the rate function, such
that
. (The rate function has to
obey some sensible but technical continuity conditions.) This is
a large deviation result, because the difference between the empirical
mean and the expectation is remaining constant as n grows --- there
has to be a larger and large conspiracy, as it were, among the samples to keep
deviating from the expectation in the same way. Now, one reason what I've
stated isn't really enough to satisfy a mathematician is that the right-hand
side converges on zero, so the functional form of the probability could be
anything which also converges on zero and that'd be satisfied, but we want to
pick out exponential convergence. The usual way is to look at the
limiting growth rate of the probability. Also, we want the probability that
the difference between the empirical mean and the expectation falls into any
arbitrary set. So one usually sees the LDP asserted in some form like, for any
reasonable set A,
(Actually,
to be completely honest, I really shouldn't be assuming that there is
a limit to those probabilities. Instead I should connect the lim inf of that
expression to the infimum of the rate function over the interior of A,
and the lim sup to the rate function over the closure of A.)
Similar large deviation principles can be stated for the empirical distribution, the empirical process, functionals of sample paths, etc., rather than just the empirical mean. There are tricks for relating LDPs on higher-level objects, like the empirical distribution over trajectories, to LDPs on lower-level objects, like empirical means. (These go under names like "the contraction principle".)
Since ergodic theory extends the probabilistic limit laws to stochastic processes, rather than just sequences of independent variables, it shouldn;t be surprising that large deviation principles also hold for some stochastic processes. I am particularly interested in LDPs for Markov processes, and their applications. There are also important connections to information theory, since in an awful lot of situations, the large deviations rate function is the Kullback-Leibler divergence, a.k.a. the relative entropy.
A related, but strictly speaking distinct, topic is the host of inequalities which give bounds on deviations of averages from expectations for various types of processes or sample functionals --- Bernstein, Chernoff, Hoeffding, etc. (For instance, Hoeffding's inequality gives deviation probabilities for averages of strictly bounded functionals.) Inequalities of this type are generally just upper bounds, but ones which apply even at small sample sizes, rather than just asymptotically, whereas the LDP is about asymptotic bounds from both directions. Since they've got very similar uses, however, I include them in here too.
- Recommended:
- James Bucklew, Large Deviation Techniques in Decision, Simulation, and Estimation
- Thomas Cover and Joy Thomas, Elements of Information Theory [Very nice chapter on large deviations for IID sequences]
- Amir Dembo and Ofer Zeitouni, Large Deviations Techniques and Applications [Chapters 2, 4 and 5, and parts of chapter 6, are available in postscript format via Prof. Dembo's page for his course on large deviations]
- Frank den Hollander, Large Deviations [Nice introductory text for people with an applied probability background. Short.]
- Richard S. Ellis
- "The Theory of Large Deviations: from Boltzmann's 1877 Calculation to Equilibrium Macrostates in 2D Turbulence", Physica D 133 (1999): 106--136
- Entropy, Large Deviations, and Statistical Mechanics
- M. I. Friedlin and A. D. Wentzell, Random Perturbations of Dynamical Systems
- S. R. S. Varadhan, "Large Deviations", Annals of Probability 36 (2008): 397--419 [Copy via Prof. Varadhan. Wald Lecture for 2005.]
- Recommended, more specialized:
- R. R. Bahadur, Some Limit Theorems in Statistics [1971. The notation is now much more transparent, and the proofs of many basic theorems considerably simplified. But if there's a better source for statistical applications than this little book, I've yet to find it.]
- Julien Barré, Freddy Bouchet, Thierry Dauxois and Stefano Ruffo, "Large deviation techniques applied to systems with long-range interactions", cond-mat/0406358 = Journal of Statistics Physics 119 (2005): 677--713
- J.-R. Chazottes and D. Gabrielli, "Large deviations for empirical entropies of Gibbsian sources", math.PR/0406083 = Nonlinearity 18 (2005): 2545--2563 [This is a very cool result which shows that block entropies, and entropy rates estimated from those blocks, obey the large deviation principle even as one lets the length of the blocks grow with the amount of data, provided the block-length doesn't grow too quickly (only logarithmically). I wish I could write papers like this.]
- W. De Roeck, Christian Maes and Karel Netocny, "H-Theorems from Autonomous Equations", cond-mat/0508089 [this basically derives the H-theorem of statistical mechanics as a large deviations result, assuming a certain reasonable Markovian form for the macroscopic dynamics. In fact, we have a separate argument that you don't have that Markovian form, you're just not trying hard enough; see here]
- Gregory L. Eyink
- "Action principle in nonequilbrium statistical dynamics," Physical Review E 54 (1996): 3419--3435 [Least action as a consequence of Markovian LDP]
- "A Variational Formulation of Optimal Nonlinear Estimation," physics/0011049 [Nice connections between optimal state estimation (assuming a known form for the underlying stochastic process), nonequilibrium statistical mechanics, and large deviations theory, leading to tractable-looking numerical schemes for estimation.]
- S. Orey and S. Peliken, "Large deviations principles for stationary processes", Annals of Probability 16 (1988): 1481--1496
- To read:
- Paul H. Algoet and Brian H. Marcus, "Large Deviation Theorems for Empirical Types of Markov Chains Constrained to Thin Sets," IEEE Trans. Info. Theory 38 (1992): 1276--1291
- Alexei Andreanov, Giulio Biroli, Jean-Philippe Bouchaud, and Alexandre Lefèvre, "Field theories and exact stochastic equations for interacting particle systems", Physical Review E 74 (2006): 030101 = cond-mat/0602307
- Igor Bjelakovic, Jean-Dominique Deuschel, Tyll Krueger, Ruedi
Seiler, Rainer Siegmund-Schultze and Arleta Szkola
- "A quantum version of Sanov's theorem", quant-ph/0412157 = Communications in Mathematical Physics 260 (2005): 659--671 [Quantum large deviations!]
- "Typical support and Sanov large deviations of correlated states", math.PR/0703772 = Communications in Mathematical Physics 279 (2008): 559--584
- Patrick Cattiaux and Arnaud Guillin, "Deviation bounds for additive functionals of Markov process", math.PR/0603021 [Non-asymptotic bounds for the probability that time averages deviate from expectations with respect to the invariant measure, when the process is stationary and ergodic and the invariant measure is reasonably regular.]
- J.-R. Chazottes, P. Collet, C. Kuelske, and F. Redig, "Concentration inequalities for random fields via coupling", math.PR/0503483
- Po-Ning Chen, "Generalization of Gartner-Ellis theorem", IEEE Transactions on Information Theory 46 (2000): 2752--2760
- Zhiyi Chi
- "Large deviations for template matching between point processes", Annals of Applied Probability 15 (2005): 153--174 = math.PR/0503463
- "On the asymptotic of likelihood ratios for self-normalized large deviations", arxiv:0709.1506
- A. de Acosta, "A general nonconvex large deviation result II", Annals of Probability 32 (2004): 1873--1901 = math.PR/0410101
- Zach Deitz and Sunder Sethuraman, "Large deviations for a class of nonhomgeneous Markov chains", math.PR/0404230
- B. Derrida, "Non equilibrium steady states: fluctuations and large deviations of the density and of the current", cond-mat/0703762
- B. Derrida, Joel L. Lebowitz and Eugene R. Speer, "Exact Large Deviation Functional for the Density Profile in a Stationary Nonequilibrium Open System," cond-mat/0105110
- Paul Dupuis and Richard S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations [PDF]
- Vlad Elgart and Alex Kamenev, "Rare Events Statistics in Reaction--Diffusion Systems", cond-mat/0404241 [i.e., large deviations]
- Andreas Engel, Remi Monasson and Alexander K. Hartmann, "On Large Deviation Properties of Erdos-Renyi Random Graphs", Journal of Statistical Physics 117 (2004): 387--426
- Jin Feng and Thomas G. Kurtz, Large Deviations for Stochastic Processes [Online]
- Cristian Giardina', Jorge Kurchan, Luca Peliti, "Direct evaluation of large-deviation functions", cond-mat/0511248 ["We introduce a numerical procedure to evaluate directly the probabilities of large deviations of physical quantities, such as current or density, that are local in time. The large-deviation functions are given in terms of the typical properties of a modified dynamics, and since they no longer involve rare events, can be evaluated efficiently and over a wider ranges of values."]
- Nathael Gozlan and Christian Léonard, "A large deviation approach to some transportation cost inequalities", math.PR/0510601
- Alice Guionnet, "Large deviations and stochastic calculus for large random matrices", Probability Surveys 1 (2004): 72--172 [Open access]
- O. V. Gulinskii and R. S. Liptser, "Example of Large Deviations for Stationary Processes", Theory of Probability and Applications 44 (1999): 211--225 [PDF]
- Marian Grendar Jr. and Marian Grendar, "Chernoff's bound forms," math.PR/0306326
- Te Sun Han
- "Hypothesis Testing with the General Source", IEEE Transactions on Information Theory 46 (2000): 2415--2427 = math.PR/0004121 ["The asymptotically optimal hypothesis testing problem with the general sources as the null and alternative hypotheses is studied.... Our fundamental philosophy in doing so is first to convert all of the hypothesis testing problems completely to the pertinent computation problems in the large deviation-probability theory. ... [This] enables us to establish quite compact general formulas of the optimal exponents of the second kind of error and correct testing probabbilities for the general sources including all nonstationary and/or nonergodic sources with arbitrary abstract alphabet (countable or uncountable). Such general formulas are presented from the information-spectrum point of view."]
- "An information-spectrum approach to large deviation theorems", cs.IT/0606104
- Svante Janson, "Large deviations for sums of partly dependent random variables", Random Structures and Algorithms 24 (2004): 234--248 ["We use and extend a method by Hoeffding to obtain strong large deviation bounds for sums of dependent random variables with suitable dependency structure. The method is based on breaking up the sum into sums of independent variables. Applications are given to U-statistics, random strings and random graphs." Applied here only to Erdos-Renyi (IID) random graphs, but might be extendable to Markov random graphs...? PDF preprint]
- Vladislav Kargin, "A Large Deviation Inequality for Vector Functions on Finite Reversible Markov Chains", math.PR/0508538
- Michael Keyl, "Quantum state estimation and large deviations", quant-ph/0412053
- F. Klebaner and R. Liptser, "Large Deviations for Past-Dependent Recursions", math.PR/0603407 [Corrected version of Problems of Information Transmission 32 (1996): 23--34]
- Leonid Kontorovich [I am really not sure why Leo bothered to take
my class...]
- "Measure Concentration of Hidden Markov Processes", math.PR/0608064
- "Measure concentration of Markov Tree Processes", math.PR/0608511
- Ioannis Kontoyiannis and S. P. Meyn
- "Large deviations asymptotics and the spectral theory of multiplicatively regular Markov processes", math.PR/0509310 = Electronic Journal of Probability 10 (2005): 61--123
- "Spectral Theory and Limit Theorems for Geometrically Ergodic Markov Processes", math.PR/0209200 = Annals of Applied Probability 13 (2003): 304--362
- Vivien Lecomte, Cécile Appert-Rolland, and
Frédéric van Wijland
- "Thermodynamic formalism for systems with Markov dynamics", cond-mat/0606211
- "Thermodynamic formalism and large deviation functions in continuous time Markov dynamics", cond-mat/0703435
- Vivien Lecomte and Julien Tailleur, "A numerical approach to large deviations in continuous time", Journal of Statistical Mechanics: Theory and Experiment 2007: P03004
- Carlos A. Leon and Francois Perron, "Optimal Hoeffding bounds for discrete reversible Markov chains", math.PR/0405296
- Robert Sh. Liptser and Anatolii A. Pukhalskii, "Limit theorems on large deviations for semimartingales", math.PR/0510028 [But published in a journal in 1992]
- P. Major
- "On a multivariate version of Bernstein's inequality", math.PR/0411287
- "A multivariate generalization of Hoeffding's ineqality", math.PR/0411288
- Satya N. Majumdar and Alan J. Bray, "Large-Deviation Functions for Nonlinear Functionals of a Gaussian Stationary Markov Process", cond-mat/0202138 = Physical Review E 65 (2002): 051112
- K. Marton, "Bounding
-distance by informational
divergence: a method to prove measure
concentration", Annals
of Probability 24 (1996): 857--866 [Thanks to Leo
Kontorovich for the pointer]
- David McAllester, "A Statistical Mechanics Approach to Large Devations Theorems" [E-print available via CiteSeer --- published?]
- Abdelkader Mokkadem, Mariane Pelletier and Baba Thiam, "Large and moderate deviations principles for kernel estimators of the multivariate regression", math.ST/0703341
- K. Netocny and F. Redig, "Large deviations for quantum spin systems", math-ph/0404018 = Journal of Statistical Physics 117 (2004): 521--547
- Enzo Olivieri and Maria Eulalia Vares, Large Deviations and Metastability [Blurb]
- Y. Oono, "Large Deviation and Statistical Physics", Progress in Theoretical Physics 99 (1989): 165--205 [Should that be "Deviations"?]
- Huyen Pham, "Some applications and methods of large deviations in finance and insurance",math.PR/0702473
- Anatoly Puhalskii, Large Deviations and Idempotent Probability
- Anatolii A. Puhalskii, "Stochastic processes in random graphs", math.PR/0402183 [Large deviations for Erdos-Renyi graphs. Memo to self: how much work would it be to extend this to Markovian graphs?]
- Hong Qian, "Relative Entropy: Free Energy Associated with Equilibrium Fluctuations and Nonequilibrium Deviations", math-ph/0007010 = Physical Review E 63 (2001): 042103
- Olivier Rivoire, "The cavity method for large deviations", cond-mat/0506164 = Journal of Statistical Mechanics: Theory and Experiment (2005): P07004 ["A method is introduced for studying large deviations in the context of statistical physics of disordered systems. The approach, based on an extension of the cavity method to atypical realizations of the quenched disorder, allows us to compute exponentially small probabilities (rate functions) over different classes of random graphs."]
- David Ruelle, Thermodynamic Formalism
- L. Saulis and V. A. Statulevicius, Limit Theorems for Large Deviations
- Adam Shwartz, Large Deviations in Performance Modeling
- Youngchul Sung, Lang Tong and H. Vincent Poor, "A Large Deviations Apoproach to Sensor Scheduling for Detection of Correlated Random Fields", cs.IT/0501056
- Ted Theodosopoulos, "A Reversion of the Chernoff Bound", math.PR/0501360
- To write:
- "Large Deviations in Exponential Families of Stochastic Automata/Hidden Markov Models"
